The draft of the third edition of Speech and Language Processing: An Introduction to Natural Language Processing, Computational Linguistics, and Speech Recognition is available for free online. It is being written by Daniel Jurafsky (Stanford University) and James H. Martin (University of Colorado at Boulder). It is a work in progress: some sections remain incomplete, and others may still change.
The second edition of Speech and Language Processing is available on Amazon (rather than freely online), which makes me wonder whether the third edition will remain free only while it is a work in progress. The 3rd edition is clearly going to be organized differently, as its draft table of contents differs quite a bit from that of the 2nd. The 2nd edition can be purchased in hardcover, softcover, and eText formats, or rented from Amazon. Used copies may also be available.
Like so many topics in computer science, speech and natural language processing is constantly developing new techniques and approaches. In such an evolving field, it is hard for any static text to provide all of the current information. That said, according to many reviewers of the 2nd edition, this text does an excellent job of providing the foundational knowledge students and professionals need. Several professionals mentioned that the 2nd edition was a permanent part of their reference library and that they still referred to it in their own work.
Speech and Language Processing combines “deep linguistic analysis with robust statistical methods.” The book covers “regular expressions, information retrieval, context free grammars, unification, first-order predicate calculus, hidden Markov and other probabilistic models, rhetorical structure theory and others.”
This textbook was written for advanced undergraduate or graduate-level students.
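To give a flavor of the foundational material, here is a minimal sketch (my own illustration, not code from the book) of minimum edit distance, one of the Chapter 2 topics, computed with the standard dynamic-programming recurrence. This version uses unit costs for insertion, deletion, and substitution (Levenshtein distance); the book's worked example charges 2 for substitutions, so its numbers differ.

```python
def min_edit_distance(source: str, target: str) -> int:
    """Levenshtein distance via dynamic programming:
    cost 1 each for insertion, deletion, and substitution."""
    n, m = len(source), len(target)
    # D[i][j] = edit distance between source[:i] and target[:j]
    D = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        D[i][0] = i          # delete all of source[:i]
    for j in range(1, m + 1):
        D[0][j] = j          # insert all of target[:j]
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = 0 if source[i - 1] == target[j - 1] else 1
            D[i][j] = min(D[i - 1][j] + 1,        # deletion
                          D[i][j - 1] + 1,        # insertion
                          D[i - 1][j - 1] + sub)  # substitution
    return D[n][m]

print(min_edit_distance("intention", "execution"))  # → 5
```

The "intention"/"execution" pair is the textbook's classic example; with unit costs the distance is 5.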
Speech and Language Processing Table of Contents
1 Introduction
2 Regular Expressions, Text Normalization, Edit Distance
2.1 Regular Expressions
2.2 Words
2.3 Corpora
2.4 Text Normalization
2.5 Minimum Edit Distance
2.6 Summary
Bibliographical and Historical Notes
Exercises
3 N-gram Language Models
3.1 N-Grams
3.2 Evaluating Language Models
3.3 Generalization and Zeros
3.4 Smoothing
3.5 Kneser-Ney Smoothing
3.6 The Web and Stupid Backoff
3.7 Advanced: Perplexity’s Relation to Entropy
3.8 Summary
Bibliographical and Historical Notes
Exercises
4 Naive Bayes and Sentiment Classification
4.1 Naive Bayes Classifiers
4.2 Training the Naive Bayes Classifier
4.3 Worked example
4.4 Optimizing for Sentiment Analysis
4.5 Naive Bayes for other text classification tasks
4.6 Naive Bayes as a Language Model
4.7 Evaluation: Precision, Recall, F-measure
4.8 Test sets and Cross-validation
4.9 Statistical Significance Testing
4.10 Advanced: Feature Selection
4.11 Summary
Bibliographical and Historical Notes
Exercises
5 Logistic Regression
5.1 Classification: the sigmoid
5.2 Learning in Logistic Regression
5.3 The cross-entropy loss function
5.4 Gradient Descent
5.5 Regularization
5.6 Multinomial logistic regression
5.7 Interpreting models
5.8 Advanced: Deriving the Gradient Equation
5.9 Summary
Bibliographical and Historical Notes
Exercises
6 Vector Semantics
6.1 Lexical Semantics
6.2 Vector Semantics
6.3 Words and Vectors
6.4 Cosine for measuring similarity
6.5 TF-IDF: Weighing terms in the vector
6.6 Applications of the tf-idf vector model
6.7 Optional: Pointwise Mutual Information (PMI)
6.8 Word2vec
6.9 Visualizing Embeddings
6.10 Semantic properties of embeddings
6.11 Bias and Embeddings
6.12 Evaluating Vector Models
6.13 Summary
Bibliographical and Historical Notes
Exercises
7 Neural Networks and Neural Language Models
7.1 Units
7.2 The XOR problem
7.3 Feed-Forward Neural Networks
7.4 Training Neural Nets
7.5 Neural Language Models
7.6 Summary
Bibliographical and Historical Notes
8 Part-of-Speech Tagging
8.1 (Mostly) English Word Classes
8.2 The Penn Treebank Part-of-Speech Tagset
8.3 Part-of-Speech Tagging
8.4 HMM Part-of-Speech Tagging
8.5 Maximum Entropy Markov Models
8.6 Bidirectionality
8.7 Part-of-Speech Tagging for Other Languages
8.8 Summary
Bibliographical and Historical Notes
Exercises
9 Sequence Processing with Recurrent Networks
9.1 Simple Recurrent Networks
9.2 Applications of RNNs
9.3 Deep Networks: Stacked and Bidirectional RNNs
9.4 Managing Context in RNNs: LSTMs and GRUs
9.5 Words, Characters and Byte-Pairs
9.6 Summary
10 Formal Grammars of English
10.1 Constituency
10.2 Context-Free Grammars
10.3 Some Grammar Rules for English
10.4 Treebanks
10.5 Grammar Equivalence and Normal Form
10.6 Lexicalized Grammars
10.7 Summary
Bibliographical and Historical Notes
Exercises
11 Syntactic Parsing
11.1 Ambiguity
11.2 CKY Parsing: A Dynamic Programming Approach
11.3 Partial Parsing
11.4 Summary
Bibliographical and Historical Notes
Exercises
12 Statistical Parsing
12.1 Probabilistic Context-Free Grammars
12.2 Probabilistic CKY Parsing of PCFGs
12.3 Ways to Learn PCFG Rule Probabilities
12.4 Problems with PCFGs
12.5 Improving PCFGs by Splitting Non-Terminals
12.6 Probabilistic Lexicalized CFGs
12.7 Probabilistic CCG Parsing
12.8 Evaluating Parsers
12.9 Human Parsing
12.10 Summary
Bibliographical and Historical Notes
Exercises
13 Dependency Parsing
13.1 Dependency Relations
13.2 Dependency Formalisms
13.3 Dependency Treebanks
13.4 Transition-Based Dependency Parsing
13.5 Graph-Based Dependency Parsing
13.6 Evaluation
13.7 Summary
Bibliographical and Historical Notes
Exercises
14 The Representation of Sentence Meaning
14.1 Computational Desiderata for Representations
14.2 Model-Theoretic Semantics
14.3 First-Order Logic
14.4 Event and State Representations
14.5 Description Logics
14.6 Summary
Bibliographical and Historical Notes
Exercises
15 Computational Semantics
16 Semantic Parsing
17 Information Extraction
17.1 Named Entity Recognition
17.2 Relation Extraction
17.3 Extracting Times
17.4 Extracting Events and their Times
17.5 Template Filling
17.6 Summary
Bibliographical and Historical Notes
Exercises
18 Semantic Role Labeling
18.1 Semantic Roles
18.2 Diathesis Alternations
18.3 Semantic Roles: Problems with Thematic Roles
18.4 The Proposition Bank
18.5 FrameNet
18.6 Semantic Role Labeling
18.7 Selectional Restrictions
18.8 Primitive Decomposition of Predicates
18.9 Summary
Bibliographical and Historical Notes
Exercises
19 Lexicons for Sentiment, Affect, and Connotation
19.1 Defining Emotion
19.2 Available Sentiment and Affect Lexicons
19.3 Creating affect lexicons by human labeling
19.4 Semi-supervised induction of affect lexicons
19.5 Supervised learning of word sentiment
19.6 Using Lexicons for Sentiment Recognition
19.7 Other tasks: Personality
19.8 Affect Recognition
19.9 Connotation Frames
19.10 Summary
Bibliographical and Historical Notes
20 Coreference Resolution and Entity Linking
21 Discourse Coherence
22 Machine Translation
23 Question Answering
23.1 IR-based Factoid Question Answering
23.2 Knowledge-based Question Answering
23.3 Using multiple information sources: IBM’s Watson
23.4 Evaluation of Factoid Answers
Bibliographical and Historical Notes
Exercises
24 Dialog Systems and Chatbots
24.1 Chatbots
24.2 Frame Based Dialog Agents
24.3 VoiceXML
24.4 Evaluating Dialog Systems
24.5 Dialog System Design
24.6 Summary
Bibliographical and Historical Notes
Exercises
25 Advanced Dialog Systems
25.1 Dialog Acts
25.2 Dialog State: Interpreting Dialog Acts
25.3 Dialog Policy
25.4 A simple policy based on local context
25.5 Natural language generation in the dialog-state model
25.6 Deep Reinforcement Learning for Dialog
25.7 Summary
Bibliographical and Historical Notes
26 Speech Recognition and Synthesis
Appendix A Hidden Markov Models
A.1 Markov Chains
A.2 The Hidden Markov Model
A.3 Likelihood Computation: The Forward Algorithm
A.4 Decoding: The Viterbi Algorithm
A.5 HMM Training: The Forward-Backward Algorithm
A.6 Summary
Bibliographical and Historical Notes
Appendix B Spelling Correction and the Noisy Channel
B.1 The Noisy Channel Model
B.2 Real-word spelling errors
B.3 Noisy Channel Model: The State of the Art
Bibliographical and Historical Notes
Exercises
Appendix C WordNet: Word Relations, Senses, and Disambiguation
C.1 Word Senses
C.2 WordNet: A Database of Lexical Relations
C.3 Word Similarity: Thesaurus Methods
C.4 Word Sense Disambiguation: Overview
C.5 Supervised Word Sense Disambiguation
C.6 WSD: Dictionary and Thesaurus Methods
C.7 Semi-Supervised WSD: Bootstrapping
C.8 Unsupervised Word Sense Induction
C.9 Summary
Bibliographical and Historical Notes
Exercises
View this Free Online Material at the source:
Speech and Language Processing