Speech and Language Processing

Written by: Daniel Jurafsky (Stanford University) and James H. Martin (University of Colorado at Boulder)

The draft of the third edition of Speech and Language Processing: An Introduction to Natural Language Processing, Computational Linguistics, and Speech Recognition is available for free online. It is being written by Daniel Jurafsky (Stanford University) and James H. Martin (University of Colorado at Boulder). It is a work in progress: some sections remain incomplete, and others may change.

The second edition of Speech and Language Processing is available on Amazon rather than freely online, which makes me wonder whether the third edition will remain free only while it is a work in progress. The third edition is clearly going to be organized differently: its draft table of contents differs considerably from that of the second. The second edition can be purchased in hardcover, softcover, or eText format, or rented from Amazon. Used copies may also be available.


Like so many topics in computer science, speech and natural language processing is constantly developing new techniques and approaches. In such an evolving field, it's hard for any static text to cover everything current. That said, according to many reviewers of the second edition, this text does an excellent job of providing the foundational knowledge students and professionals need. Several professionals said the second edition was a permanent part of their reference library and that they still referred to it in their own work.

Speech and Language Processing combines “deep linguistic analysis with robust statistical methods.” The book covers “regular expressions, information retrieval, context free grammars, unification, first-order predicate calculus, hidden Markov and other probabilistic models, rhetorical structure theory and others.”
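As a small taste of the kind of material the book teaches, here is a minimal sketch (my own, not taken from the book) of minimum edit distance, one of the early topics in the table of contents below. It computes the Levenshtein distance between two strings by dynamic programming, with unit costs for insertion, deletion, and substitution (the book also discusses variants with other cost schemes):

```python
def min_edit_distance(source: str, target: str) -> int:
    """Levenshtein distance via dynamic programming.

    Insertions, deletions, and substitutions each cost 1.
    """
    n, m = len(source), len(target)
    # dp[i][j] = edit distance between source[:i] and target[:j]
    dp = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(n + 1):
        dp[i][0] = i  # delete all i characters of source[:i]
    for j in range(m + 1):
        dp[0][j] = j  # insert all j characters of target[:j]
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub_cost = 0 if source[i - 1] == target[j - 1] else 1
            dp[i][j] = min(
                dp[i - 1][j] + 1,             # deletion
                dp[i][j - 1] + 1,             # insertion
                dp[i - 1][j - 1] + sub_cost,  # substitution (or match)
            )
    return dp[n][m]

print(min_edit_distance("intention", "execution"))  # prints 5
```

The `"intention"`/`"execution"` pair is a classic teaching example for this algorithm; with unit substitution cost the distance is 5.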

This textbook was written for advanced undergraduate or graduate-level students.

Speech and Language Processing Table of Contents

1 Introduction

2 Regular Expressions, Text Normalization, Edit Distance

2.1 Regular Expressions
2.2 Words
2.3 Corpora
2.4 Text Normalization
2.5 Minimum Edit Distance
2.6 Summary
Bibliographical and Historical Notes
Exercises

3 N-gram Language Models

3.1 N-Grams
3.2 Evaluating Language Models
3.3 Generalization and Zeros
3.4 Smoothing
3.5 Kneser-Ney Smoothing
3.6 The Web and Stupid Backoff
3.7 Advanced: Perplexity’s Relation to Entropy
3.8 Summary
Bibliographical and Historical Notes
Exercises

4 Naive Bayes and Sentiment Classification

4.1 Naive Bayes Classifiers
4.2 Training the Naive Bayes Classifier
4.3 Worked example
4.4 Optimizing for Sentiment Analysis
4.5 Naive Bayes for other text classification tasks
4.6 Naive Bayes as a Language Model
4.7 Evaluation: Precision, Recall, F-measure
4.8 Test sets and Cross-validation
4.9 Statistical Significance Testing
4.10 Advanced: Feature Selection
4.11 Summary
Bibliographical and Historical Notes
Exercises

5 Logistic Regression

5.1 Classification: the sigmoid
5.2 Learning in Logistic Regression
5.3 The cross-entropy loss function
5.4 Gradient Descent
5.5 Regularization
5.6 Multinomial logistic regression
5.7 Interpreting models
5.8 Advanced: Deriving the Gradient Equation
5.9 Summary
Bibliographical and Historical Notes
Exercises

6 Vector Semantics

6.1 Lexical Semantics
6.2 Vector Semantics
6.3 Words and Vectors
6.4 Cosine for measuring similarity
6.5 TF-IDF: Weighing terms in the vector
6.6 Applications of the tf-idf vector model
6.7 Optional: Pointwise Mutual Information (PMI)
6.8 Word2vec
6.9 Visualizing Embeddings
6.10 Semantic properties of embeddings
6.11 Bias and Embeddings
6.12 Evaluating Vector Models
6.13 Summary
Bibliographical and Historical Notes
Exercises

7 Neural Networks and Neural Language Models

7.1 Units
7.2 The XOR problem
7.3 Feed-Forward Neural Networks
7.4 Training Neural Nets
7.5 Neural Language Models
7.6 Summary
Bibliographical and Historical Notes

8 Part-of-Speech Tagging

8.1 (Mostly) English Word Classes
8.2 The Penn Treebank Part-of-Speech Tagset
8.3 Part-of-Speech Tagging
8.4 HMM Part-of-Speech Tagging
8.5 Maximum Entropy Markov Models
8.6 Bidirectionality
8.7 Part-of-Speech Tagging for Other Languages
8.8 Summary
Bibliographical and Historical Notes
Exercises

9 Sequence Processing with Recurrent Networks

9.1 Simple Recurrent Networks
9.2 Applications of RNNs
9.3 Deep Networks: Stacked and Bidirectional RNNs
9.4 Managing Context in RNNs: LSTMs and GRUs
9.5 Words, Characters and Byte-Pairs
9.6 Summary

10 Formal Grammars of English

10.1 Constituency
10.2 Context-Free Grammars
10.3 Some Grammar Rules for English
10.4 Treebanks
10.5 Grammar Equivalence and Normal Form
10.6 Lexicalized Grammars
10.7 Summary
Bibliographical and Historical Notes
Exercises

11 Syntactic Parsing

11.1 Ambiguity
11.2 CKY Parsing: A Dynamic Programming Approach
11.3 Partial Parsing
11.4 Summary
Bibliographical and Historical Notes
Exercises

12 Statistical Parsing

12.1 Probabilistic Context-Free Grammars
12.2 Probabilistic CKY Parsing of PCFGs
12.3 Ways to Learn PCFG Rule Probabilities
12.4 Problems with PCFGs
12.5 Improving PCFGs by Splitting Non-Terminals
12.6 Probabilistic Lexicalized CFGs
12.7 Probabilistic CCG Parsing
12.8 Evaluating Parsers
12.9 Human Parsing
12.10 Summary
Bibliographical and Historical Notes
Exercises

13 Dependency Parsing

13.1 Dependency Relations
13.2 Dependency Formalisms
13.3 Dependency Treebanks
13.4 Transition-Based Dependency Parsing
13.5 Graph-Based Dependency Parsing
13.6 Evaluation
13.7 Summary
Bibliographical and Historical Notes
Exercises

14 The Representation of Sentence Meaning

14.1 Computational Desiderata for Representations
14.2 Model-Theoretic Semantics
14.3 First-Order Logic
14.4 Event and State Representations
14.5 Description Logics
14.6 Summary
Bibliographical and Historical Notes
Exercises

15 Computational Semantics

16 Semantic Parsing

17 Information Extraction

17.1 Named Entity Recognition
17.2 Relation Extraction
17.3 Extracting Times
17.4 Extracting Events and their Times
17.5 Template Filling
17.6 Summary
Bibliographical and Historical Notes
Exercises

18 Semantic Role Labeling

18.1 Semantic Roles
18.2 Diathesis Alternations
18.3 Semantic Roles: Problems with Thematic Roles
18.4 The Proposition Bank
18.5 FrameNet
18.6 Semantic Role Labeling
18.7 Selectional Restrictions
18.8 Primitive Decomposition of Predicates
18.9 Summary
Bibliographical and Historical Notes
Exercises

19 Lexicons for Sentiment, Affect, and Connotation

19.1 Defining Emotion
19.2 Available Sentiment and Affect Lexicons
19.3 Creating affect lexicons by human labeling
19.4 Semi-supervised induction of affect lexicons
19.5 Supervised learning of word sentiment
19.6 Using Lexicons for Sentiment Recognition
19.7 Other tasks: Personality
19.8 Affect Recognition
19.9 Connotation Frames
19.10 Summary
Bibliographical and Historical Notes

20 Coreference Resolution and Entity Linking

21 Discourse Coherence

22 Machine Translation

23 Question Answering

23.1 IR-based Factoid Question Answering
23.2 Knowledge-based Question Answering
23.3 Using multiple information sources: IBM’s Watson
23.4 Evaluation of Factoid Answers
Bibliographical and Historical Notes
Exercises

24 Dialog Systems and Chatbots

24.1 Chatbots
24.2 Frame Based Dialog Agents
24.3 VoiceXML
24.4 Evaluating Dialog Systems
24.5 Dialog System Design
24.6 Summary
Bibliographical and Historical Notes
Exercises

25 Advanced Dialog Systems

25.1 Dialog Acts
25.2 Dialog State: Interpreting Dialog Acts
25.3 Dialog Policy
25.4 A simple policy based on local context
25.5 Natural language generation in the dialog-state model
25.6 Deep Reinforcement Learning for Dialog
25.7 Summary
Bibliographical and Historical Notes

26 Speech Recognition and Synthesis

Appendix A Hidden Markov Models

A.1 Markov Chains
A.2 The Hidden Markov Model
A.3 Likelihood Computation: The Forward Algorithm
A.4 Decoding: The Viterbi Algorithm
A.5 HMM Training: The Forward-Backward Algorithm
A.6 Summary
Bibliographical and Historical Notes

Appendix B Spelling Correction and the Noisy Channel

B.1 The Noisy Channel Model
B.2 Real-word spelling errors
B.3 Noisy Channel Model: The State of the Art
Bibliographical and Historical Notes
Exercises

Appendix C WordNet: Word Relations, Senses, and Disambiguation

C.1 Word Senses
C.2 WordNet: A Database of Lexical Relations
C.3 Word Similarity: Thesaurus Methods
C.4 Word Sense Disambiguation: Overview
C.5 Supervised Word Sense Disambiguation
C.6 WSD: Dictionary and Thesaurus Methods
C.7 Semi-Supervised WSD: Bootstrapping
C.8 Unsupervised Word Sense Induction
C.9 Summary
Bibliographical and Historical Notes
Exercises
   

View this Free Online Material at the source:
 
Speech and Language Processing
