Language and Computers
Markus Dickinson, Chris Brew and Detmar Meurers

Brief Contents  7
Contents  9
What This Book Is About  13
Overview for Instructors  15
Acknowledgments  19

1 Prologue: Encoding Language on Computers  21
  1.1 Where do we start?  21
    1.1.1 Encoding language  22
  1.2 Writing systems used for human languages  22
    1.2.1 Alphabetic systems  23
    1.2.2 Syllabic systems  26
    1.2.3 Logographic writing systems  28
    1.2.4 Systems with unusual realization  31
    1.2.5 Relation to language  31
  1.3 Encoding written language  32
    1.3.1 Storing information on a computer  32
    1.3.2 Using bytes to store characters  34
  1.4 Encoding spoken language  37
    1.4.1 The nature of speech  37
    1.4.2 Articulatory properties  38
    1.4.3 Acoustic properties  38
    1.4.4 Measuring speech  40
    Under the Hood 1: Reading a spectrogram  44
    1.4.5 Relating written and spoken language  44
    Under the Hood 2: Language modeling for automatic speech recognition  46

2 Writers’ Aids  53
  2.1 Introduction  53
  2.2 Kinds of spelling errors  54
    2.2.1 Nonword errors  55
    2.2.2 Real-word errors  57
  2.3 Spell checkers  58
    2.3.1 Nonword error detection  59
    2.3.2 Isolated-word spelling correction  61
    Under the Hood 3: Dynamic programming  64
  2.4 Word correction in context  69
    2.4.1 What is grammar?  70
    Under the Hood 4: Complexity of languages  76
    2.4.2 Techniques for correcting words in context  78
    Under the Hood 5: Spell checking for web queries  82
  2.5 Style checkers  84

3 Language Tutoring Systems  89
  3.1 Learning a language  89
  3.2 Computer-assisted language learning  91
  3.3 Why make CALL tools aware of language?  93
  3.4 What is involved in adding linguistic analysis?  96
    3.4.1 Tokenization  96
    3.4.2 Part-of-speech tagging  98
    3.4.3 Beyond words  100
  3.5 An example ICALL system: TAGARELA  101
  3.6 Modeling the learner  103

4 Searching  111
  4.1 Introduction  111
  4.2 Searching through structured data  113
  4.3 Searching through unstructured data  115
    4.3.1 Information need  115
    4.3.2 Evaluating search results  116
    4.3.3 Example: Searching the web  117
    4.3.4 How search engines work  120
    Under the Hood 6: A brief tour of HTML  123
  4.4 Searching semi-structured data with regular expressions  127
    4.4.1 Syntax of regular expressions  128
    4.4.2 Grep: An example of using regular expressions  130
    Under the Hood 7: Finite-state automata  132
  4.5 Searching text corpora  135
    4.5.1 Why corpora?  136
    4.5.2 Annotated language corpora  137
    Under the Hood 8: Searching for linguistic patterns on the web  138

5 Classifying Documents: From Junk Mail Detection to Sentiment Classification  147
  5.1 Automatic document classification  147
  5.2 How computers “learn”  149
    5.2.1 Supervised learning  150
    5.2.2 Unsupervised learning  151
  5.3 Features and evidence  151
  5.4 Application: Spam filtering  153
    5.4.1 Base rates  155
    5.4.2 Payoffs  159
    5.4.3 Back to documents  159
  5.5 Some types of document classifiers  160
    5.5.1 The Naive Bayes classifier  160
    Under the Hood 9: Naive Bayes  162
    5.5.2 The perceptron  165
    5.5.3 Which classifier to use  168
  5.6 From classification algorithms to context of use  169

6 Dialog Systems  173
  6.1 Computers that “converse”?  173
  6.2 Why dialogs happen  175
  6.3 Automating dialog  176
    6.3.1 Getting started  176
    6.3.2 Establishing a goal  177
    6.3.3 Accepting the user’s goal  177
    6.3.4 The caller plays her role  178
    6.3.5 Giving the answer  178
    6.3.6 Negotiating the end of the conversation  179
  6.4 Conventions and framing expectations  179
    6.4.1 Some framing expectations for games and sports  180
    6.4.2 The framing expectations for dialogs  180
  6.5 Properties of dialog  181
    6.5.1 Dialog moves  181
    6.5.2 Speech acts  182
    6.5.3 Conversational maxims  184
  6.6 Dialog systems and their tasks  186
  6.7 Eliza  187
    Under the Hood 10: How Eliza works  192
  6.8 Spoken dialogs  194
  6.9 How to evaluate a dialog system  195
  6.10 Why is dialog important?  196

7 Machine Translation Systems  201
  7.1 Computers that “translate”?  201
  7.2 Applications of translation  203
    7.2.1 Translation needs  203
    7.2.2 What is machine translation really for?  204
  7.3 Translating Shakespeare  205
  7.4 The translation triangle  208
  7.5 Translation and meaning  211
  7.6 Words and meanings  213
    7.6.1 Words and other languages  213
    7.6.2 Synonyms and translation equivalents  214
  7.7 Word alignment  214
  7.8 IBM Model 1  218
    Under the Hood 11: The noisy channel model  220
    Under the Hood 12: Phrase-based statistical translation  224
  7.9 Commercial automatic translation  225
    7.9.1 Translating weather reports  225
    7.9.2 Translation in the European Union  227
    7.9.3 Prospects for translators  228

8 Epilogue: Impact of Language Technology  235

References  241

Concept Index  247