|
Cover |
1 |
|
|
Title Page |
5 |
|
|
Copyright Page |
6 |
|
|
Contents |
9 |
|
|
Preface |
15 |
|
|
1 Introduction: Distributions and Inference for Categorical Data |
19 |
|
|
1.1 Categorical Response Data |
19 |
|
|
1.1.1 Response–Explanatory Variable Distinction |
20 |
|
|
1.1.2 Binary–Nominal–Ordinal Scale Distinction |
20 |
|
|
1.1.3 Discrete–Continuous Variable Distinction |
21 |
|
|
1.1.4 Quantitative–Qualitative Variable Distinction |
21 |
|
|
1.1.5 Organization of Book and Online Computing Appendix |
22 |
|
|
1.2 Distributions for Categorical Data |
23 |
|
|
1.2.1 Binomial Distribution |
23 |
|
|
1.2.2 Multinomial Distribution |
24 |
|
|
1.2.3 Poisson Distribution |
24 |
|
|
1.2.4 Overdispersion |
25 |
|
|
1.2.5 Connection Between Poisson and Multinomial Distributions |
25 |
|
|
1.2.6 The Chi-Squared Distribution |
26 |
|
|
1.3 Statistical Inference for Categorical Data |
26 |
|
|
1.3.1 Likelihood Functions and Maximum Likelihood Estimation |
27 |
|
|
1.3.2 Likelihood Function and ML Estimate for Binomial Parameter |
27 |
|
|
1.3.3 Wald–Likelihood Ratio–Score Test Triad |
28 |
|
|
1.3.4 Constructing Confidence Intervals by Inverting Tests |
30 |
|
|
1.4 Statistical Inference for Binomial Parameters |
31 |
|
|
1.4.1 Tests About a Binomial Parameter |
31 |
|
|
1.4.2 Confidence Intervals for a Binomial Parameter |
32 |
|
|
1.4.3 Example: Estimating the Proportion of Vegetarians |
33 |
|
|
1.4.4 Exact Small-Sample Inference and the Mid P-Value |
34 |
|
|
1.5 Statistical Inference for Multinomial Parameters |
35 |
|
|
1.5.1 Estimation of Multinomial Parameters |
35 |
|
|
1.5.2 Pearson Chi-Squared Test of a Specified Multinomial |
36 |
|
|
1.5.3 Likelihood-Ratio Chi-Squared Test of a Specified Multinomial |
36 |
|
|
1.5.4 Example: Testing Mendel's Theories |
37 |
|
|
1.5.5 Testing with Estimated Expected Frequencies |
38 |
|
|
1.5.6 Example: Pneumonia Infections in Calves |
38 |
|
|
1.5.7 Chi-Squared Theoretical Justification |
40 |
|
|
1.6 Bayesian Inference for Binomial and Multinomial Parameters |
40 |
|
|
1.6.1 The Bayesian Approach to Statistical Inference |
40 |
|
|
1.6.2 Binomial Estimation: Beta and Logit-Normal Prior Distributions |
42 |
|
|
1.6.3 Multinomial Estimation: Dirichlet Prior Distributions |
43 |
|
|
1.6.4 Example: Estimating Vegetarianism Revisited |
44 |
|
|
1.6.5 Binomial and Multinomial Estimation: Improper Priors |
44 |
|
|
Notes |
45 |
|
|
Exercises |
46 |
|
|
2 Describing Contingency Tables |
55 |
|
|
2.1 Probability Structure for Contingency Tables |
55 |
|
|
2.1.1 Contingency Tables |
55 |
|
|
2.1.2 Joint/Marginal/Conditional Distributions for Contingency Tables |
56 |
|
|
2.1.3 Example: Sensitivity and Specificity for Medical Diagnoses |
57 |
|
|
2.1.4 Independence of Categorical Variables |
58 |
|
|
2.1.5 Poisson, Binomial, and Multinomial Sampling |
58 |
|
|
2.1.6 Example: Seat Belts and Auto Accident Injuries |
59 |
|
|
2.1.7 Example: Case–Control Study of Cancer and Smoking |
60 |
|
|
2.1.8 Types of Studies: Observational Versus Experimental |
61 |
|
|
2.2 Comparing Two Proportions |
61 |
|
|
2.2.1 Difference of Proportions |
62 |
|
|
2.2.2 Relative Risk |
62 |
|
|
2.2.3 Odds Ratio |
62 |
|
|
2.2.4 Properties of the Odds Ratio |
63 |
|
|
2.2.5 Example: Association Between Heart Attacks and Aspirin Use |
64 |
|
|
2.2.6 Case–Control Studies and the Odds Ratio |
64 |
|
|
2.2.7 Relationship Between Odds Ratio and Relative Risk |
65 |
|
|
2.3 Conditional Association in Stratified 2 × 2 Tables |
65 |
|
|
2.3.1 Partial Tables |
66 |
|
|
2.3.2 Example: Racial Characteristics and the Death Penalty |
66 |
|
|
2.3.3 Conditional and Marginal Odds Ratios |
68 |
|
|
2.3.4 Marginal Independence Versus Conditional Independence |
69 |
|
|
2.3.5 Homogeneous Association |
71 |
|
|
2.3.6 Collapsibility: Identical Conditional and Marginal Associations |
71 |
|
|
2.4 Measuring Association in I × J Tables |
72 |
|
|
2.4.1 Odds Ratios in I × J Tables |
72 |
|
|
2.4.2 Association Factors |
73 |
|
|
2.4.3 Summary Measures of Association |
74 |
|
|
2.4.4 Ordinal Trends: Concordant and Discordant Pairs |
74 |
|
|
2.4.5 Ordinal Measure of Association: Gamma |
75 |
|
|
2.4.6 Probabilistic Comparisons of Two Ordinal Distributions |
76 |
|
|
2.4.7 Example: Comparing Pain Ratings After Surgery |
77 |
|
|
2.4.8 Correlation for Underlying Normality |
77 |
|
|
Exercises |
78 |
|
|
Notes |
78 |
|
|
3 Inference for Two-Way Contingency Tables |
87 |
|
|
3.1 Confidence Intervals for Association Parameters |
87 |
|
|
3.1.1 Interval Estimation of the Odds Ratio |
87 |
|
|
3.1.2 Example: Seat-Belt Use and Traffic Deaths |
88 |
|
|
3.1.3 Interval Estimation of Difference of Proportions and Relative Risk |
89 |
|
|
3.1.4 Example: Aspirin and Heart Attacks Revisited |
89 |
|
|
3.1.5 Deriving Standard Errors with the Delta Method |
90 |
|
|
3.1.6 Delta Method Applied to the Sample Logit |
91 |
|
|
3.1.7 Delta Method for the Log Odds Ratio |
91 |
|
|
3.1.8 Simultaneous Confidence Intervals for Multiple Comparisons |
93 |
|
|
3.2 Testing Independence in Two-way Contingency Tables |
93 |
|
|
3.2.1 Pearson and Likelihood-Ratio Chi-Squared Tests |
93 |
|
|
3.2.2 Example: Education and Belief in God |
95 |
|
|
3.2.3 Adequacy of Chi-Squared Approximations |
95 |
|
|
3.2.4 Chi-Squared and Comparing Proportions in 2 × 2 Tables |
96 |
|
|
3.2.5 Score Confidence Intervals Comparing Proportions |
96 |
|
|
3.2.6 Profile Likelihood Confidence Intervals |
97 |
|
|
3.3 Following-up Chi-Squared Tests |
98 |
|
|
3.3.1 Pearson Residuals and Standardized Residuals |
98 |
|
|
3.3.2 Example: Education and Belief in God Revisited |
99 |
|
|
3.3.3 Partitioning Chi-Squared |
99 |
|
|
3.3.4 Example: Origin of Schizophrenia |
101 |
|
|
3.3.5 Rules for Partitioning |
102 |
|
|
3.3.6 Summarizing the Association |
102 |
|
|
3.3.7 Limitations of Chi-Squared Tests |
102 |
|
|
3.3.8 Why Consider Independence If It's Unlikely to Be True? |
103 |
|
|
3.4 Two-Way Tables with Ordered Classifications |
104 |
|
|
3.4.1 Linear Trend Alternative to Independence |
104 |
|
|
3.4.2 Example: Is Happiness Associated with Political Ideology? |
105 |
|
|
3.4.3 Monotone Trend Alternatives to Independence |
105 |
|
|
3.4.4 Extra Power with Ordinal Tests |
106 |
|
|
3.4.5 Sensitivity to Choice of Scores |
106 |
|
|
3.4.6 Example: Infant Birth Defects by Maternal Alcohol Consumption |
107 |
|
|
3.4.7 Trend Tests for I × 2 and 2 × J Tables |
108 |
|
|
3.4.8 Nominal-Ordinal Tables |
108 |
|
|
3.5 Small-Sample Inference for Contingency Tables |
108 |
|
|
3.5.1 Fisher's Exact Test for 2 × 2 Tables |
108 |
|
|
3.5.2 Example: Fisher's Tea Drinker |
109 |
|
|
3.5.3 Two-Sided P-Values for Fisher's Exact Test |
110 |
|
|
3.5.4 Confidence Intervals Based on Conditional Likelihood |
110 |
|
|
3.5.5 Discreteness and Conservatism Issues |
111 |
|
|
3.5.6 Small-Sample Unconditional Tests of Independence |
111 |
|
|
3.5.7 Conditional Versus Unconditional Tests |
112 |
|
|
3.6 Bayesian Inference for Two-way Contingency Tables |
114 |
|
|
3.6.1 Prior Distributions for Comparing Proportions in 2 × 2 Tables |
114 |
|
|
3.6.2 Posterior Probabilities Comparing Proportions |
115 |
|
|
3.6.3 Posterior Intervals for Association Parameters |
115 |
|
|
3.6.4 Example: Urn Sampling Gives Highly Unbalanced Treatment Allocation |
116 |
|
|
3.6.5 Highest Posterior Density Intervals |
116 |
|
|
3.6.6 Testing Independence |
117 |
|
|
3.6.7 Empirical Bayes and Hierarchical Bayesian Approaches |
118 |
|
|
3.7 Extensions for Multiway Tables and Nontabulated Responses |
118 |
|
|
3.7.1 Categorical Data Need Not Be Contingency Tables |
118 |
|
|
Notes |
119 |
|
|
Exercises |
121 |
|
|
4 Introduction to Generalized Linear Models |
131 |
|
|
4.1 The Generalized Linear Model |
131 |
|
|
4.1.1 Components of Generalized Linear Models |
132 |
|
|
4.1.2 Binomial Logit Models for Binary Data |
132 |
|
|
4.1.3 Poisson Loglinear Models for Count Data |
133 |
|
|
4.1.4 Generalized Linear Models for Continuous Responses |
133 |
|
|
4.1.5 Deviance of a GLM |
133 |
|
|
4.1.6 Advantages of GLMs Versus Transforming the Data |
134 |
|
|
4.2 Generalized Linear Models for Binary Data |
135 |
|
|
4.2.1 Linear Probability Model |
135 |
|
|
4.2.2 Example: Snoring and Heart Disease |
136 |
|
|
4.2.3 Logistic Regression Model |
137 |
|
|
4.2.4 Binomial GLM for 2 × 2 Contingency Tables |
138 |
|
|
4.2.5 Probit and Inverse cdf Link Functions |
139 |
|
|
4.2.6 Latent Tolerance Motivation for Binary Response Models |
140 |
|
|
4.3 Generalized Linear Models for Counts and Rates |
140 |
|
|
4.3.1 Poisson Loglinear Models |
141 |
|
|
4.3.2 Example: Horseshoe Crab Mating |
141 |
|
|
4.3.3 Overdispersion for Poisson GLMs |
144 |
|
|
4.3.4 Negative Binomial GLMs |
145 |
|
|
4.3.5 Poisson Regression for Rates Using Offsets |
146 |
|
|
4.3.6 Example: Modeling Death Rates for Heart Valve Operations |
146 |
|
|
4.3.7 Poisson GLM of Independence in Two-Way Contingency Tables |
148 |
|
|
4.4 Moments and Likelihood for Generalized Linear Models |
148 |
|
|
4.4.1 The Exponential Dispersion Family |
148 |
|
|
4.4.2 Mean and Variance Functions for the Random Component |
149 |
|
|
4.4.3 Mean and Variance Functions for Poisson and Binomial GLMs |
150 |
|
|
4.4.4 Systematic Component and Link Function of a GLM |
150 |
|
|
4.4.5 Likelihood Equations for a GLM |
151 |
|
|
4.4.6 The Key Role of the Mean–Variance Relationship |
152 |
|
|
4.4.7 Likelihood Equations for Binomial GLMs |
152 |
|
|
4.4.8 Asymptotic Covariance Matrix of Model Parameter Estimators |
153 |
|
|
4.4.9 Likelihood Equations and cov(β̂) for Poisson Loglinear Model |
154 |
|
|
4.5 Inference and Model Checking for Generalized Linear Models |
154 |
|
|
4.5.1 Deviance and Goodness of Fit |
154 |
|
|
4.5.2 Deviance for Poisson GLMs |
155 |
|
|
4.5.3 Deviance for Binomial GLMs: Grouped Versus Ungrouped Data |
155 |
|
|
4.5.4 Likelihood-Ratio Model Comparison Using the Deviances |
156 |
|
|
4.5.5 Score Tests for Goodness of Fit and for Model Comparison |
157 |
|
|
4.5.6 Residuals for GLMs |
158 |
|
|
4.5.7 Covariance Matrices for Fitted Values and Residuals |
160 |
|
|
4.5.8 The Bayesian Approach for GLMs |
160 |
|
|
4.6 Fitting Generalized Linear Models |
161 |
|
|
4.6.1 Newton–Raphson Method |
161 |
|
|
4.6.2 Fisher Scoring Method |
162 |
|
|
4.6.3 Newton–Raphson and Fisher Scoring for Binary Data |
163 |
|
|
4.6.4 ML as Iterative Reweighted Least Squares |
164 |
|
|
4.6.5 Simplifications for Canonical Link Functions |
165 |
|
|
4.7 Quasi-Likelihood and Generalized Linear Models |
167 |
|
|
4.7.1 Mean–Variance Relationship Determines Quasi-likelihood Estimates |
167 |
|
|
4.7.2 Overdispersion for Poisson GLMs and Quasi-likelihood |
167 |
|
|
4.7.3 Overdispersion for Binomial GLMs and Quasi-likelihood |
168 |
|
|
4.7.4 Example: Teratology Overdispersion |
169 |
|
|
Notes |
170 |
|
|
Exercises |
171 |
|
|
5 Logistic Regression |
181 |
|
|
5.1 Interpreting Parameters in Logistic Regression |
181 |
|
|
5.1.1 Interpreting β: Odds, Probabilities, and Linear Approximations |
182 |
|
|
5.1.2 Looking at the Data |
183 |
|
|
5.1.3 Example: Horseshoe Crab Mating Revisited |
184 |
|
|
5.1.4 Logistic Regression with Retrospective Studies |
186 |
|
|
5.1.5 Logistic Regression Is Implied by Normal Explanatory Variables |
187 |
|
|
5.2 Inference for Logistic Regression |
187 |
|
|
5.2.1 Inference About Model Parameters and Probabilities |
187 |
|
|
5.2.2 Example: Inference for Horseshoe Crab Mating Data |
188 |
|
|
5.2.3 Checking Goodness of Fit: Grouped and Ungrouped Data |
189 |
|
|
5.2.4 Example: Model Goodness of Fit for Horseshoe Crab Data |
190 |
|
|
5.2.5 Checking Goodness of Fit with Ungrouped Data by Grouping |
190 |
|
|
5.2.6 Wald Inference Can Be Suboptimal |
192 |
|
|
5.3 Logistic Models with Categorical Predictors |
193 |
|
|
5.3.1 ANOVA-Type Representation of Factors |
193 |
|
|
5.3.2 Indicator Variables Represent a Factor |
193 |
|
|
5.3.3 Example: Alcohol and Infant Malformation Revisited |
194 |
|
|
5.3.4 Linear Logit Model for I × 2 Contingency Tables |
195 |
|
|
5.3.5 Cochran–Armitage Trend Test |
196 |
|
|
5.3.6 Example: Alcohol and Infant Malformation Revisited |
197 |
|
|
5.3.7 Using Directed Models Can Improve Inferential Power |
197 |
|
|
5.3.8 Noncentral Chi-Squared Distribution and Power for Narrower Alternatives |
198 |
|
|
5.3.9 Example: Skin Damage and Leprosy |
199 |
|
|
5.3.10 Model Smoothing Improves Precision of Estimation |
200 |
|
|
5.4 Multiple Logistic Regression |
200 |
|
|
5.4.1 Logistic Models for Multiway Contingency Tables |
201 |
|
|
5.4.2 Example: AIDS and AZT Use |
202 |
|
|
5.4.3 Goodness of Fit as a Likelihood-Ratio Test |
204 |
|
|
5.4.4 Model Comparison by Comparing Deviances |
205 |
|
|
5.4.5 Example: Horseshoe Crab Satellites Revisited |
205 |
|
|
5.4.6 Quantitative Treatment of Ordinal Predictor |
207 |
|
|
5.4.7 Probability-Based and Standardized Interpretations |
208 |
|
|
5.4.8 Estimating an Average Causal Effect |
209 |
|
|
5.5 Fitting Logistic Regression Models |
210 |
|
|
5.5.1 Likelihood Equations for Logistic Regression |
210 |
|
|
5.5.2 Asymptotic Covariance Matrix of Parameter Estimators |
211 |
|
|
5.5.3 Distribution of Probability Estimators |
212 |
|
|
5.5.4 Newton–Raphson Method Applied to Logistic Regression |
212 |
|
|
Notes |
213 |
|
|
Exercises |
214 |
|
|
6 Building, Checking, and Applying Logistic Regression Models |
225 |
|
|
6.1 Strategies in Model Selection |
225 |
|
|
6.1.1 How Many Explanatory Variables Can Be in the Model? |
226 |
|
|
6.1.2 Example: Horseshoe Crab Mating Data Revisited |
226 |
|
|
6.1.3 Stepwise Procedures: Forward Selection and Backward Elimination |
227 |
|
|
6.1.4 Example: Backward Elimination for Horseshoe Crab Data |
228 |
|
|
6.1.5 Model Selection and the "Correct" Model |
229 |
|
|
6.1.6 AIC: Minimizing Distance of the Fit from the Truth |
230 |
|
|
6.1.7 Example: Using Causal Hypotheses to Guide Model Building |
231 |
|
|
6.1.8 Alternative Strategies, Including Model Averaging |
233 |
|
|
6.2 Logistic Regression Diagnostics |
233 |
|
|
6.2.1 Residuals: Pearson, Deviance, and Standardized |
233 |
|
|
6.2.2 Example: Heart Disease and Blood Pressure |
234 |
|
|
6.2.3 Example: Admissions to Graduate School at Florida |
236 |
|
|
6.2.4 Influence Diagnostics for Logistic Regression |
238 |
|
|
6.3 Summarizing the Predictive Power of a Model |
239 |
|
|
6.3.1 Summarizing Predictive Power: R and R-Squared Measures |
239 |
|
|
6.3.2 Summarizing Predictive Power: Likelihood and Deviance Measures |
240 |
|
|
6.3.3 Summarizing Predictive Power: Classification Tables |
241 |
|
|
6.3.4 Summarizing Predictive Power: ROC Curves |
242 |
|
|
6.3.5 Example: Evaluating Predictive Power for Horseshoe Crab Data |
242 |
|
|
6.4 Mantel–Haenszel and Related Methods for Multiple 2 × 2 Tables |
243 |
|
|
6.4.1 Using Logistic Models to Test Conditional Independence |
244 |
|
|
6.4.2 Cochran–Mantel–Haenszel Test of Conditional Independence |
245 |
|
|
6.4.3 Example: Multicenter Clinical Trial Revisited |
246 |
|
|
6.4.4 CMH Test Is Advantageous for Sparse Data |
246 |
|
|
6.4.5 Estimation of Common Odds Ratio |
247 |
|
|
6.4.6 Meta-analyses for Summarizing Multiple 2 × 2 Tables |
248 |
|
|
6.4.7 Meta-analyses for Multiple 2 × 2 Tables: Difference of Proportions |
249 |
|
|
6.4.8 Collapsibility and Logistic Models for Contingency Tables |
250 |
|
|
6.4.9 Testing Homogeneity of Odds Ratios |
250 |
|
|
6.4.10 Summarizing Heterogeneity in Odds Ratios |
251 |
|
|
6.4.11 Propensity Scores in Observational Studies |
251 |
|
|
6.5 Detecting and Dealing with Infinite Estimates |
251 |
|
|
6.5.1 Complete or Quasi-complete Separation |
252 |
|
|
6.5.2 Example: Multicenter Clinical Trial with Few Successes |
253 |
|
|
6.5.3 Remedies When at Least One ML Estimate Is Infinite |
254 |
|
|
6.6 Sample Size and Power Considerations |
255 |
|
|
6.6.1 Sample Size and Power for Comparing Two Proportions |
255 |
|
|
6.6.2 Sample Size Determination in Logistic Regression |
256 |
|
|
6.6.3 Sample Size in Multiple Logistic Regression |
257 |
|
|
6.6.4 Power for Chi-Squared Tests in Contingency Tables |
257 |
|
|
6.6.5 Power for Testing Conditional Independence |
258 |
|
|
6.6.6 Effects of Sample Size on Model Selection and Inference |
259 |
|
|
Notes |
259 |
|
|
Exercises |
261 |
|
|
7 Alternative Modeling of Binary Response Data |
269 |
|
|
7.1 Probit and Complementary Log-log Models |
269 |
|
|
7.1.1 Probit Models: Three Latent Variable Motivations |
270 |
|
|
7.1.2 Probit Models: Interpreting Effects |
270 |
|
|
7.1.3 Probit Model Fitting |
271 |
|
|
7.1.4 Example: Modeling Flour Beetle Mortality |
272 |
|
|
7.1.5 Complementary Log-Log Link Models |
273 |
|
|
7.1.6 Example: Beetle Mortality Revisited |
275 |
|
|
7.2 Bayesian Inference for Binary Regression |
275 |
|
|
7.2.1 Prior Specifications for Binary Regression Models |
275 |
|
|
7.2.2 Example: Risk Factors for Endometrial Cancer Grade |
276 |
|
|
7.2.3 Bayesian Logistic Regression for Retrospective Studies |
278 |
|
|
7.2.4 Probability-Based Prior Specifications for Binary Regression Models |
278 |
|
|
7.2.5 Example: Modeling the Probability a Trauma Patient Survives |
279 |
|
|
7.2.6 Bayesian Fitting for Probit Models |
281 |
|
|
7.2.7 Bayesian Model Checking for Binary Regression |
283 |
|
|
7.3 Conditional Logistic Regression |
283 |
|
|
7.3.1 Conditional Likelihood |
283 |
|
|
7.3.2 Small-Sample Inference for a Logistic Regression Parameter |
285 |
|
|
7.3.3 Small-Sample Conditional Inference for 2 × 2 Contingency Tables |
285 |
|
|
7.3.4 Small-Sample Conditional Inference for Linear Logit Model |
286 |
|
|
7.3.5 Small-Sample Tests of Conditional Independence in 2 × 2 × K Tables |
287 |
|
|
7.3.6 Example: Promotion Discrimination |
287 |
|
|
7.3.7 Discreteness Complications of Using Exact Conditional Inference |
288 |
|
|
7.4 Smoothing: Kernels, Penalized Likelihood, Generalized Additive Models |
288 |
|
|
7.4.1 How Much Smoothing? The Variance/Bias Trade-off |
288 |
|
|
7.4.2 Kernel Smoothing |
289 |
|
|
7.4.3 Example: Smoothing to Portray Probability of Kyphosis |
290 |
|
|
7.4.4 Nearest Neighbors Smoothing |
290 |
|
|
7.4.5 Smoothing Using Penalized Likelihood Estimation |
291 |
|
|
7.4.6 Why Shrink Estimates Toward 0? |
293 |
|
|
7.4.7 Firth's Penalized Likelihood for Logistic Regression |
293 |
|
|
7.4.8 Example: Complete Separation but Finite Logistic Estimates |
293 |
|
|
7.4.9 Generalized Additive Models |
294 |
|
|
7.4.10 Example: GAMs for Horseshoe Crab Mating Data |
295 |
|
|
7.4.11 Advantages/Disadvantages of Various Smoothing Methods |
295 |
|
|
7.5 Issues in Analyzing High-Dimensional Categorical Data |
296 |
|
|
7.5.1 Issues in Selecting Explanatory Variables |
296 |
|
|
7.5.2 Adjusting for Multiplicity: The Bonferroni Method |
297 |
|
|
7.5.3 Adjusting for Multiplicity: The False Discovery Rate |
298 |
|
|
7.5.4 Other Variable Selection Methods with High-Dimensional Data |
299 |
|
|
7.5.5 Examples: High-Dimensional Applications in Genomics |
300 |
|
|
7.5.6 Example: Motif Discovery for Protein Sequences |
301 |
|
|
7.5.7 Example: The Netflix Prize |
302 |
|
|
7.5.8 Example: Credit Scoring |
303 |
|
|
Notes |
303 |
|
|
Exercises |
305 |
|
|
8 Models for Multinomial Responses |
311 |
|
|
8.1 Nominal Responses: Baseline-Category Logit Models |
311 |
|
|
8.1.1 Baseline-Category Logits |
311 |
|
|
8.1.2 Example: Alligator Food Choice |
312 |
|
|
8.1.3 Estimating Response Probabilities |
314 |
|
|
8.1.4 Fitting Baseline-Category Logistic Models |
315 |
|
|
8.1.5 Multicategory Logit Model as a Multivariate GLM |
317 |
|
|
8.1.6 Multinomial Probit Models |
317 |
|
|
8.1.7 Example: Effect of Menu Pricing |
318 |
|
|
8.2 Ordinal Responses: Cumulative Logit Models |
319 |
|
|
8.2.1 Cumulative Logits |
319 |
|
|
8.2.2 Proportional Odds Form of Cumulative Logit Model |
319 |
|
|
8.2.3 Latent Variable Motivation for Proportional Odds Structure |
321 |
|
|
8.2.4 Example: Happiness and Traumatic Events |
322 |
|
|
8.2.5 Checking the Proportional Odds Assumption |
324 |
|
|
8.3 Ordinal Responses: Alternative Models |
326 |
|
|
8.3.1 Cumulative Link Models |
326 |
|
|
8.3.2 Cumulative Probit and Log-Log Models |
326 |
|
|
8.3.3 Example: Happiness Revisited with Cumulative Probits |
327 |
|
|
8.3.4 Adjacent-Categories Logit Models |
327 |
|
|
8.3.5 Example: Happiness Revisited |
328 |
|
|
8.3.6 Continuation-Ratio Logit Models |
329 |
|
|
8.3.7 Example: Developmental Toxicity Study with Pregnant Mice |
330 |
|
|
8.3.8 Stochastic Ordering Location Effects Versus Dispersion Effects |
331 |
|
|
8.3.9 Summarizing Predictive Power of Explanatory Variables |
332 |
|
|
8.4 Testing Conditional Independence in I × J × K Tables |
332 |
|
|
8.4.1 Testing Conditional Independence Using Multinomial Models |
333 |
|
|
8.4.2 Example: Homosexual Marriage and Religious Fundamentalism |
334 |
|
|
8.4.3 Generalized Cochran–Mantel–Haenszel Tests for I × J × K Tables |
335 |
|
|
8.4.4 Example: Homosexual Marriage Revisited |
337 |
|
|
8.4.5 Related Score Tests for Multinomial Logit Models |
337 |
|
|
8.5 Discrete-Choice Models |
338 |
|
|
8.5.1 Conditional Logits for Characteristics of the Choices |
338 |
|
|
8.5.2 Multinomial Logit Model Expressed as Discrete-Choice Model |
339 |
|
|
8.5.3 Example: Shopping Destination Choice |
339 |
|
|
8.5.4 Multinomial Probit Discrete-Choice Models |
339 |
|
|
8.5.5 Extensions: Nested Logit and Mixed Logit Models |
340 |
|
|
8.5.6 Extensions: Discrete Choice with Ordered Categories |
340 |
|
|
8.6 Bayesian Modeling of Multinomial Responses |
341 |
|
|
8.6.1 Bayesian Fitting of Cumulative Link Models |
341 |
|
|
8.6.2 Example: Cannabis Use and Mother's Age |
342 |
|
|
8.6.3 Bayesian Fitting of Multinomial Logit and Probit Models |
343 |
|
|
8.6.4 Example: Alligator Food Choice Revisited |
344 |
|
|
Notes |
344 |
|
|
Exercises |
347 |
|
|
9 Loglinear Models for Contingency Tables |
357 |
|
|
9.1 Loglinear Models for Two-way Tables |
357 |
|
|
9.1.1 Independence Model for a Two-Way Table |
357 |
|
|
9.1.2 Interpretation of Loglinear Model Parameters |
358 |
|
|
9.1.3 Saturated Model for a Two-Way Table |
358 |
|
|
9.1.4 Alternative Parameter Constraints |
359 |
|
|
9.1.5 Hierarchical Versus Nonhierarchical Models |
359 |
|
|
9.1.6 Multinomial Models for Cell Probabilities |
360 |
|
|
9.2 Loglinear Models for Independence and Interaction in Three-way Tables |
360 |
|
|
9.2.1 Types of Independence |
360 |
|
|
9.2.2 Homogeneous Association and Three-Factor Interaction |
362 |
|
|
9.2.3 Interpretation of Loglinear Model Parameters |
363 |
|
|
9.2.4 Example: Alcohol, Cigarette, and Marijuana Use |
364 |
|
|
9.3 Inference for Loglinear Models |
366 |
|
|
9.3.1 Chi-Squared Goodness-of-Fit Tests |
366 |
|
|
9.3.2 Inference about Conditional Associations |
366 |
|
|
9.4 Loglinear Models for Higher Dimensions |
368 |
|
|
9.4.1 Models for Four-Way Contingency Tables |
368 |
|
|
9.4.2 Example: Automobile Accidents and Seat-Belt Use |
368 |
|
|
9.4.3 Large Samples and Statistical Versus Practical Significance |
370 |
|
|
9.4.4 Dissimilarity Index |
370 |
|
|
9.5 Loglinear–Logistic Model Connection |
371 |
|
|
9.5.1 Using Logistic Models to Interpret Loglinear Models |
371 |
|
|
9.5.2 Example: Auto Accidents and Seat-Belts Revisited |
372 |
|
|
9.5.3 Equivalent Loglinear and Logistic Models |
372 |
|
|
9.5.4 Example: Detecting Gene–Environment Interactions in Case–Control Studies |
373 |
|
|
9.6 Loglinear Model Fitting: Likelihood Equations and Asymptotic Distributions |
374 |
|
|
9.6.1 Minimal Sufficient Statistics |
374 |
|
|
9.6.2 Likelihood Equations for Loglinear Models |
375 |
|
|
9.6.3 Unique ML Estimates Match Data in Sufficient Marginal Tables |
376 |
|
|
9.6.4 Direct Versus Iterative Calculation of Fitted Values |
376 |
|
|
9.6.5 Decomposable Models |
377 |
|
|
9.6.6 Chi-Squared Goodness-of-Fit Tests |
377 |
|
|
9.6.7 Covariance Matrix of ML Parameter Estimators |
378 |
|
|
9.6.8 Connection Between Multinomial and Poisson Loglinear Models |
379 |
|
|
9.6.9 Distribution of Probability Estimators |
380 |
|
|
9.6.10 Proof of Uniqueness of ML Estimates |
381 |
|
|
9.6.11 Pseudo ML for Complex Sampling Designs |
381 |
|
|
9.7 Loglinear Model Fitting: Iterative Methods and Their Application |
382 |
|
|
9.7.1 Newton–Raphson Method |
382 |
|
|
9.7.2 Iterative Proportional Fitting |
383 |
|
|
9.7.3 Comparison of IPF and Newton–Raphson Iterative Methods |
384 |
|
|
9.7.4 Raking a Table: Contingency Table Standardization |
385 |
|
|
Notes |
386 |
|
|
Exercises |
387 |
|
|
10 Building and Extending Loglinear Models |
395 |
|
|
10.1 Conditional Independence Graphs and Collapsibility |
395 |
|
|
10.1.1 Conditional Independence Graphs |
395 |
|
|
10.1.2 Graphical Loglinear Models |
396 |
|
|
10.1.3 Collapsibility in Three-Way Contingency Tables |
397 |
|
|
10.1.4 Collapsibility for Multiway Tables |
398 |
|
|
10.2 Model Selection and Comparison |
398 |
|
|
10.2.1 Considerations in Model Selection |
398 |
|
|
10.2.2 Example: Model Building for Student Survey |
399 |
|
|
10.2.3 Loglinear Model Comparison Statistics |
401 |
|
|
10.2.4 Partitioning Chi-Squared with Model Comparisons |
402 |
|
|
10.2.5 Identical Marginal and Conditional Tests of Independence |
402 |
|
|
10.3 Residuals for Detecting Cell-Specific Lack of Fit |
403 |
|
|
10.3.1 Residuals for Loglinear Models |
403 |
|
|
10.3.2 Example: Student Survey Revisited |
403 |
|
|
10.3.3 Identical Loglinear and Logistic Standardized Residuals |
404 |
|
|
10.4 Modeling Ordinal Associations |
404 |
|
|
10.4.1 Linear-by-Linear Association Model for Two-Way Tables |
405 |
|
|
10.4.2 Corresponding Logistic Model for Adjacent Responses |
406 |
|
|
10.4.3 Likelihood Equations and Model Fitting |
407 |
|
|
10.4.4 Example: Sex and Birth Control Opinions Revisited |
407 |
|
|
10.4.5 Directed Ordinal Test of Independence |
409 |
|
|
10.4.6 Row Effects and Column Effects Association Models |
409 |
|
|
10.4.7 Example: Estimating Category Scores for Premarital Sex |
410 |
|
|
10.4.8 Ordinal Variables in Models for Multiway Tables |
410 |
|
|
10.5 Generalized Loglinear and Association Models, Correlation Models, and Correspondence Analysis |
411 |
|
|
10.5.1 Generalized Loglinear Model |
411 |
|
|
10.5.2 Multiplicative Row and Column Effects Model |
412 |
|
|
10.5.3 Example: Mental Health and Parents' SES |
413 |
|
|
10.5.4 Correlation Models |
413 |
|
|
10.5.5 Correspondence Analysis |
414 |
|
|
10.5.6 Model Selection and Score Choice for Ordinal Variables |
416 |
|
|
10.6 Empty Cells and Sparseness in Modeling Contingency Tables |
416 |
|
|
10.6.1 Empty Cells: Sampling Versus Structural Zeros |
416 |
|
|
10.6.2 Existence of Estimates in Loglinear Models |
417 |
|
|
10.6.3 Effects of Sparseness on X², G², and Model-Based Tests |
418 |
|
|
10.6.4 Alternative Sparse Data Asymptotics |
419 |
|
|
10.6.5 Adding Constants to Cells of a Contingency Table |
419 |
|
|
10.7 Bayesian Loglinear Modeling |
419 |
|
|
10.7.1 Estimating Loglinear Model Parameters in Two-Way Tables |
420 |
|
|
10.7.2 Example: Polarized Opinions by Political Party |
420 |
|
|
10.7.3 Bayesian Loglinear Modeling of Multidimensional Tables |
421 |
|
|
10.7.4 Graphical Conditional Independence Models |
422 |
|
|
Notes |
422 |
|
|
Exercises |
425 |
|
|
11 Models for Matched Pairs |
431 |
|
|
11.1 Comparing Dependent Proportions |
432 |
|
|
11.1.1 Confidence Intervals Comparing Dependent Proportions |
432 |
|
|
11.1.2 McNemar Test Comparing Dependent Proportions |
433 |
|
|
11.1.3 Example: Changes in Presidential Election Voting |
433 |
|
|
11.1.4 Increased Precision with Dependent Samples |
434 |
|
|
11.1.5 Small-Sample Test Comparing Dependent Proportions |
434 |
|
|
11.1.6 Connection Between McNemar and Cochran–Mantel–Haenszel Tests |
435 |
|
|
11.1.7 Subject-Specific and Population-Averaged (Marginal) Tables |
436 |
|
|
11.2 Conditional Logistic Regression for Binary Matched Pairs |
436 |
|
|
11.2.1 Subject-Specific Versus Marginal Models for Matched Pairs |
436 |
|
|
11.2.2 Logistic Models with Subject-Specific Probabilities |
437 |
|
|
11.2.3 Conditional ML Inference for Binary Matched Pairs |
438 |
|
|
11.2.4 Random Effects in Binary Matched-Pairs Model |
439 |
|
|
11.2.5 Conditional Logistic Regression for Matched Case–Control Studies |
439 |
|
|
11.2.6 Conditional Logistic Regression for Matched Pairs with Multiple Predictors |
440 |
|
|
11.2.7 Marginal Models and Subject-Specific Models: Extensions |
441 |
|
|
11.3 Marginal Models for Square Contingency Tables |
442 |
|
|
11.3.1 Marginal Models for Nominal Classifications |
442 |
|
|
11.3.2 Example: Regional Migration |
443 |
|
|
11.3.3 Marginal Models for Ordinal Classifications |
443 |
|
|
11.3.4 Example: Opinions on Premarital and Extramarital Sex |
444 |
|
|
11.4 Symmetry, Quasi-Symmetry, and Quasi-Independence |
444 |
|
|
11.4.1 Symmetry as Logistic and Loglinear Models |
445 |
|
|
11.4.2 Quasi-symmetry |
445 |
|
|
11.4.3 Marginal Homogeneity and Quasi-symmetry |
447 |
|
|
11.4.4 Quasi-independence |
447 |
|
|
11.4.5 Example: Migration Revisited |
448 |
|
|
11.4.6 Ordinal Quasi-symmetry |
449 |
|
|
11.4.7 Example: Premarital and Extramarital Sex Revisited |
450 |
|
|
11.5 Measuring Agreement Between Observers |
450 |
|
|
11.5.1 Agreement: Departures from Independence |
451 |
|
|
11.5.2 Using Quasi-independence to Analyze Agreement |
451 |
|
|
11.5.3 Quasi-symmetry and Agreement Modeling |
452 |
|
|
11.5.4 Kappa: A Summary Measure of Agreement |
452 |
|
|
11.5.5 Weighted Kappa: Quantifying Disagreement |
453 |
|
|
11.5.6 Extensions to Multiple Observers |
453 |
|
|
11.6 Bradley-Terry Model for Paired Preferences |
454 |
|
|
11.6.1 Bradley-Terry Model |
454 |
|
|
11.6.2 Example: Major League Baseball Rankings |
454 |
|
|
11.6.3 Example: Home Team Advantage in Baseball |
455 |
|
|
11.6.4 Bradley-Terry Model and Quasi-symmetry |
456 |
|
|
11.6.5 Extensions to Ties and Ordinal Pairwise Evaluations |
457 |
|
|
11.7 Marginal Models and Quasi-Symmetry Models for Matched Sets |
457 |
|
|
11.7.1 Marginal Homogeneity, Complete Symmetry, and Quasi-symmetry |
457 |
|
|
11.7.2 Types of Marginal Symmetry |
458 |
|
|
11.7.3 Comparing Binary Marginal Distributions in Multiway Tables |
458 |
|
|
11.7.4 Example: Attitudes Toward Legalized Abortion |
459 |
|
|
11.7.5 Marginal Homogeneity for a Multicategory Response |
460 |
|
|
11.7.6 Wald and Generalized CMH Score Tests of Marginal Homogeneity |
460 |
|
|
Notes |
461 |
|
|
Exercises |
463 |
|
|
12 Clustered Categorical Data: Marginal and Transitional Models |
473 |
|
|
12.1 Marginal Modeling: Maximum Likelihood Approach |
474 |
|
|
12.1.1 Example: Longitudinal Study of Mental Depression |
474 |
|
|
12.1.2 Modeling a Repeated Multinomial Response |
476 |
|
|
12.1.3 Example: Insomnia Clinical Trial |
476 |
|
|
12.1.4 ML Fitting of Marginal Logistic Models: Constraints on Cell Probabilities |
477 |
|
|
12.1.5 ML Fitting of Marginal Logistic Models: Other Methods |
479 |
|
|
12.2 Marginal Modeling: Generalized Estimating Equations (GEEs) Approach |
480 |
|
|
12.2.1 Generalized Estimating Equations Methodology: Basic Ideas |
480 |
|
|
12.2.2 Example: Longitudinal Mental Depression Revisited |
481 |
|
|
12.2.3 Example: Multinomial GEE Approach for Insomnia Trial |
482 |
|
|
12.3 Quasi-Likelihood and Its GEE Multivariate Extension: Details |
483 |
|
|
12.3.1 The Univariate Quasi-likelihood Method |
483 |
|
|
12.3.2 Properties of Quasi-likelihood Estimators |
484 |
|
|
12.3.3 Sandwich Covariance Adjustment for Variance Misspecification |
485 |
|
|
12.3.4 GEE Multivariate Methodology: Technical Details |
486 |
|
|
12.3.5 Working Associations Characterized by Odds Ratios |
488 |
|
|
12.3.6 GEE Approach: Multinomial Responses |
488 |
|
|
12.3.7 Dealing with Missing Data |
489 |
|
|
12.4 Transitional Models: Markov Chain and Time Series Models |
491 |
|
|
12.4.1 Markov Chains |
491 |
|
|
12.4.2 Example: Changes in Evapotranspiration Rates |
492 |
|
|
12.4.3 Transitional Models with Explanatory Variables |
493 |
|
|
12.4.4 Example: Child's Respiratory Illness and Maternal Smoking |
494 |
|
|
12.4.5 Example: Initial Response in Matched Pair as a Covariate |
495 |
|
|
12.4.6 Transitional Models and Loglinear Conditional Models |
496 |
|
|
Notes |
496 |
|
|
Exercises |
497 |
|
|
13 Clustered Categorical Data: Random Effects Models |
507 |
|
|
13.1 Random Effects Modeling of Clustered Categorical Data |
507 |
|
|
13.1.1 Generalized Linear Mixed Model |
508 |
|
|
13.1.2 Logistic GLMM with Random Intercept for Binary Matched Pairs |
509 |
|
|
13.1.3 Example: Changes in Presidential Voting Revisited |
510 |
|
|
13.1.4 Extension: Rasch Model and Item Response Models |
510 |
|
|
13.1.5 Random Effects Versus Conditional ML Approaches |
511 |
|
|
13.2 Binary Responses: Logistic-Normal Model |
512 |
|
|
13.2.1 Shared Random Effect Implies Nonnegative Marginal Correlations |
512 |
|
|
13.2.2 Interpreting Heterogeneity in Logistic-Normal Models |
512 |
|
|
13.2.3 Connections Between Random Effects Models and Marginal Models |
513 |
|
|
13.2.4 Comments About GLMMs Versus Marginal Models |
515 |
|
|
13.3 Examples of Random Effects Models for Binary Data |
516 |
|
|
13.3.1 Example: Small-Area Estimation of Binomial Proportions |
516 |
|
|
13.3.2 Modeling Repeated Binary Responses: Attitudes About Abortion |
518 |
|
|
13.3.3 Example: Longitudinal Mental Depression Study Revisited |
520 |
|
|
13.3.4 Example: Capture–Recapture Prediction of Population Size |
521 |
|
|
13.3.5 Example: Heterogeneity Among Multicenter Clinical Trials |
523 |
|
|
13.3.6 Meta-analysis Using a Random Effects Approach |
525 |
|
|
13.3.7 Alternative Formulations of Random Effects Models |
525 |
|
|
13.3.8 Example: Matched Pairs with a Bivariate Binary Response |
526 |
|
|
13.3.9 Time Series Models Using Autocorrelated Random Effects |
527 |
|
|
13.3.10 Example: Oxford and Cambridge Annual Boat Race |
528 |
|
|
13.4 Random Effects Models for Multinomial Data |
529 |
|
|
13.4.1 Cumulative Logit Model with Random Intercept |
529 |
|
|
13.4.2 Example: Insomnia Study Revisited |
529 |
|
|
13.4.3 Example: Combining Measures on Ordinal Items |
530 |
|
|
13.4.4 Example: Cluster Sampling |
531 |
|
|
13.4.5 Baseline-Category Logit Models with Random Effects |
532 |
|
|
13.4.6 Example: Effectiveness of Housing Program |
532 |
|
|
13.5 Multilevel Modeling |
533 |
|
|
13.5.1 Hierarchical Random Terms: Partitioning Variability |
534 |
|
|
13.5.2 Example: Children's Care for an Unmarried Mother |
534 |
|
|
13.6 GLMM Fitting, Inference, and Prediction |
537 |
|
|
13.6.1 Marginal Likelihood and Maximum Likelihood Fitting |
537 |
|
|
13.6.2 Gauss–Hermite Quadrature Methods for ML Fitting |
538 |
|
|
13.6.3 Monte Carlo and EM Methods for ML Fitting |
538 |
|
|
13.6.4 Laplace and Penalized Quasi-likelihood Approximations to ML |
539 |
|
|
13.6.5 Inference for GLMM Parameters |
540 |
|
|
13.6.6 Prediction Using Random Effects |
540 |
|
|
13.7 Bayesian Multivariate Categorical Modeling |
541 |
|
|
13.7.1 Marginal Homogeneity Analyses for Matched Pairs |
541 |
|
|
13.7.2 Bayesian Approaches to Meta-analysis and Multicenter Trials |
541 |
|
|
13.7.3 Example: Bayesian Analyses for a Multicenter Trial |
542 |
|
|
13.7.4 Bayesian GLMMs and Marginal Models |
542 |
|
|
Notes |
543 |
|
|
Exercises |
545 |
|
|
14 Other Mixture Models for Discrete Data |
553 |
|
|
14.1 Latent Class Models |
553 |
|
|
14.1.1 Independence Given a Latent Categorical Variable |
554 |
|
|
14.1.2 Fitting Latent Class Models |
555 |
|
|
14.1.3 Example: Latent Class Model for Rater Agreement |
556 |
|
|
14.1.4 Example: Latent Class Models for Capture-Recapture |
558 |
|
|
14.1.5 Example: Latent Class Transitional Models |
559 |
|
|
14.2 Nonparametric Random Effects Models |
560 |
|
|
14.2.1 Logistic Models with Unspecified Random Effects Distribution |
560 |
|
|
14.2.2 Example: Attitudes About Legalized Abortion |
560 |
|
|
14.2.3 Example: Nonparametric Mixing of Logistic Regressions |
561 |
|
|
14.2.4 Is Misspecification of Random Effects a Serious Problem? |
561 |
|
|
14.2.5 Rasch Mixture Model |
563 |
|
|
14.2.6 Example: Modeling Rater Agreement Revisited |
563 |
|
|
14.2.7 Nonparametric Mixtures and Quasi-symmetry |
564 |
|
|
14.2.8 Example: Attitudes About Legalized Abortion Revisited |
565 |
|
|
14.3 Beta-Binomial Models |
566 |
|
|
14.3.1 Beta-Binomial Distribution |
566 |
|
|
14.3.2 Models Using the Beta-Binomial Distribution |
567 |
|
|
14.3.3 Quasi-likelihood with Beta-Binomial Type Variance |
567 |
|
|
14.3.4 Example: Teratology Overdispersion Revisited |
568 |
|
|
14.3.5 Conjugate Mixture Models |
570 |
|
|
14.4 Negative Binomial Regression |
570 |
|
|
14.4.1 Gamma Mixture of Poissons Is Negative Binomial |
571 |
|
|
14.4.2 Negative Binomial Regression Modeling |
571 |
|
|
14.4.3 Example: Frequency of Knowing Homicide Victims |
572 |
|
|
14.5 Poisson Regression with Random Effects |
573 |
|
|
14.5.1 A Poisson GLMM |
574 |
|
|
14.5.2 Marginal Model Implied by Poisson GLMM |
574 |
|
|
14.5.3 Example: Homicide Victim Frequency Revisited |
575 |
|
|
14.5.4 Negative Binomial Models versus Poisson GLMMs |
575 |
|
|
Notes |
575 |
|
|
Exercises |
576 |
|
|
15 Non-Model-Based Classification and Clustering |
583 |
|
|
15.1 Classification: Linear Discriminant Analysis |
583 |
|
|
15.1.1 Classification with Normally Distributed Predictors |
584 |
|
|
15.1.2 Example: Horseshoe Crab Satellites Revisited |
585 |
|
|
15.1.3 Multicategory Classification and Other Versions of Discriminant Analysis |
586 |
|
|
15.1.4 Classification Methods for High Dimensions |
587 |
|
|
15.1.5 Discriminant Analysis Versus Logistic Regression |
587 |
|
|
15.2 Classification: Tree-Structured Prediction |
588 |
|
|
15.2.1 Classification Trees |
588 |
|
|
15.2.2 Example: Classification Tree for a Health Care Application |
589 |
|
|
15.2.3 How Does the Classification Tree Grow? |
590 |
|
|
15.2.4 Pruning a Tree and Checking Prediction Accuracy |
591 |
|
|
15.2.5 Classification Trees Versus Logistic Regression |
592 |
|
|
15.2.6 Support Vector Machines for Classification |
593 |
|
|
15.3 Cluster Analysis for Categorical Data |
594 |
|
|
15.3.1 Supervised Versus Unsupervised Learning |
595 |
|
|
15.3.2 Measuring Dissimilarity Between Observations |
595 |
|
|
15.3.3 Clustering Algorithms: Partitions and Hierarchies |
596 |
|
|
15.3.4 Example: Clustering States on Election Results |
597 |
|
|
Notes |
599 |
|
|
Exercises |
600 |
|
|
16 Large- and Small-Sample Theory for Multinomial Models |
605 |
|
|
16.1 Delta Method |
605 |
|
|
16.1.1 O, o Rates of Convergence |
606 |
|
|
16.1.2 Delta Method for a Function of a Random Variable |
606 |
|
|
16.1.3 Delta Method for a Function of a Random Vector |
607 |
|
|
16.1.4 Asymptotic Normality of Functions of Multinomial Counts |
608 |
|
|
16.1.5 Delta Method for a Vector Function of a Random Vector |
609 |
|
|
16.1.6 Joint Asymptotic Normality of Log Odds Ratios |
609 |
|
|
16.2 Asymptotic Distributions of Estimators of Model Parameters and Cell Probabilities |
610 |
|
|
16.2.1 Asymptotic Distribution of Model Parameter Estimator |
610 |
|
|
16.2.2 Asymptotic Distribution of Cell Probability Estimators |
611 |
|
|
16.2.3 Model Smoothing Is Beneficial |
612 |
|
|
16.3 Asymptotic Distributions of Residuals and Goodness-of-fit Statistics |
612 |
|
|
16.3.1 Joint Asymptotic Normality of p and π̂ |
612 |
|
|
16.3.2 Asymptotic Distribution of Pearson and Standardized Residuals |
613 |
|
|
16.3.3 Asymptotic Distribution of Pearson X² Statistic |
614 |
|
|
16.3.4 Asymptotic Distribution of Likelihood-Ratio Statistic |
615 |
|
|
16.3.5 Asymptotic Noncentral Distributions |
616 |
|
|
16.4 Asymptotic Distributions for Logit/Loglinear Models |
617 |
|
|
16.4.1 Asymptotic Covariance Matrices |
617 |
|
|
16.4.2 Connection with Poisson Loglinear Models |
618 |
|
|
16.5 Small-Sample Significance Tests for Contingency Tables |
619 |
|
|
16.5.1 Exact Conditional Distribution for I × J Tables Under Independence |
619 |
|
|
16.5.2 Exact Tests of Independence for I × J Tables |
620 |
|
|
16.5.3 Example: Sexual Orientation and Party ID |
620 |
|
|
16.6 Small-Sample Confidence Intervals for Categorical Data |
621 |
|
|
16.6.1 Small-Sample CIs for a Binomial Parameter |
621 |
|
|
16.6.2 CIs Based on Tests Using the Mid P-Value |
623 |
|
|
16.6.3 Example: Proportion of Vegetarians Revisited |
623 |
|
|
16.6.4 Small-Sample CIs for Odds Ratios |
624 |
|
|
16.6.5 Example: Fisher's Tea Taster Revisited |
625 |
|
|
16.6.6 Small-Sample CIs for Logistic Regression Parameters |
625 |
|
|
16.6.7 Example: Diarrhea and an Antibiotic |
626 |
|
|
16.6.8 Unconditional Small-Sample CIs for Difference of Proportions |
627 |
|
|
16.7 Alternative Estimation Theory for Parametric Models |
628 |
|
|
16.7.1 Weighted Least Squares for Categorical Data |
628 |
|
|
16.7.2 Inference Using the WLS Approach to Model Fitting |
629 |
|
|
16.7.3 Scope of WLS Versus ML Estimation |
630 |
|
|
16.7.4 Minimum Chi-Squared Estimators |
631 |
|
|
16.7.5 Minimum Discrimination Information |
632 |
|
|
Notes |
633 |
|
|
Exercises |
634 |
|
|
17 Historical Tour of Categorical Data Analysis |
641 |
|
|
17.1 Pearson-Yule Association Controversy |
641 |
|
|
17.2 R. A. Fisher's Contributions |
643 |
|
|
17.3 Logistic Regression |
645 |
|
|
17.4 Multiway Contingency Tables and Loglinear Models |
647 |
|
|
17.5 Bayesian Methods for Categorical Data |
651 |
|
|
17.6 A Look Forward, and Backward |
652 |
|
|
Appendix A Statistical Software for Categorical Data Analysis |
655 |
|
|
Appendix B Chi-Squared Distribution Values |
659 |
|
|
References |
661 |
|
|
Author Index |
707 |
|
|
Example Index |
719 |
|
|
Subject Index |
723 |
|