Preface  1

Variability, Information, and Prediction  16
  The Curse of Dimensionality  18
  The Two Extremes  19
  Perspectives on the Curse  20
  Sparsity  21
  Exploding Numbers of Models  23
  Multicollinearity and Concurvity  24
  The Effect of Noise  25
  Coping with the Curse  26
  Selecting Design Points  26
  Local Dimension  27
  Parsimony  32
  Two Techniques  33
  The Bootstrap  33
  Cross-Validation  42
  Optimization and Search  47
  Univariate Search  47
  Multivariate Search  48
  General Searches  49
  Constraint Satisfaction and Combinatorial Search  50
  Notes  53
  Hammersley Points  53
  Edgeworth Expansions for the Mean  54
  Bootstrap Asymptotics for the Studentized Mean  56
  Exercises  58

Local Smoothers  68
  Early Smoothers  70
  Transition to Classical Smoothers  74
  Global Versus Local Approximations  75
  LOESS  79
  Kernel Smoothers  82
  Statistical Function Approximation  83
  The Concept of Kernel Methods and the Discrete Case  88
  Kernels and Stochastic Designs: Density Estimation  93
  Stochastic Designs: Asymptotics for Kernel Smoothers  96
  Convergence Theorems and Rates for Kernel Smoothers  101
  Kernel and Bandwidth Selection  105
  Linear Smoothers  110
  Nearest Neighbors  111
  Applications of Kernel Regression  115
  A Simulated Example  115
  Ethanol Data  117
  Exercises  122

Spline Smoothing  132
  Interpolating Splines  132
  Natural Cubic Splines  138
  Smoothing Splines for Regression  141
  Model Selection for Spline Smoothing  144
  Spline Smoothing Meets Kernel Smoothing  145
  Asymptotic Bias, Variance, and MISE for Spline Smoothers  146
  Ethanol Data Example -- Continued  148
  Splines Redux: Hilbert Space Formulation  151
  Reproducing Kernels  153
  Constructing an RKHS  156
  Direct Sum Construction for Splines  161
  Explicit Forms  164
  Nonparametrics in Data Mining and Machine Learning  167
  Simulated Comparisons  169
  What Happens with Dependent Noise Models?  172
  Higher Dimensions and the Curse of Dimensionality  174
  Notes  178
  Sobolev Spaces: Definition  178
  Exercises  179

New Wave Nonparametrics  186
  Additive Models  187
  The Backfitting Algorithm  188
  Concurvity and Inference  192
  Nonparametric Optimality  195
  Generalized Additive Models  196
  Projection Pursuit Regression  199
  Neural Networks  204
  Backpropagation and Inference  207
  Barron's Result and the Curse  212
  Approximation Properties  213
  Barron's Theorem: Formal Statement  215
  Recursive Partitioning Regression  217
  Growing Trees  219
  Pruning and Selection  222
  Regression  223
  Bayesian Additive Regression Trees: BART  225
  MARS  225
  Sliced Inverse Regression  230
  ACE and AVAS  233
  Notes  235
  Proof of Barron's Theorem  235
  Exercises  239

Supervised Learning: Partition Methods  246
  Multiclass Learning  248
  Discriminant Analysis  250
  Distance-Based Discriminant Analysis  251
  Bayes Rules  256
  Probability-Based Discriminant Analysis  260
  Tree-Based Classifiers  264
  Splitting Rules  264
  Logic Trees  268
  Random Forests  269
  Support Vector Machines  277
  Margins and Distances  277
  Binary Classification and Risk  280
  Prediction Bounds for Function Classes  283
  Constructing SVM Classifiers  286
  SVM Classification for Nonlinearly Separable Populations  294
  SVMs in the General Nonlinear Case  297
  Some Kernels Used in SVM Classification  303
  Kernel Choice, SVMs and Model Selection  304
  Support Vector Regression  305
  Multiclass Support Vector Machines  308
  Neural Networks  309
  Notes  311
  Hoeffding's Inequality  311
  VC Dimension  312
  Exercises  315

Alternative Nonparametrics  322
  Ensemble Methods  323
  Bayes Model Averaging  325
  Bagging  327
  Stacking  331
  Boosting  333
  Other Averaging Methods  341
  Oracle Inequalities  343
  Bayes Nonparametrics  349
  Dirichlet Process Priors  349
  Polya Tree Priors  351
  Gaussian Process Priors  353
  The Relevance Vector Machine  359
  RVM Regression: Formal Description  360
  RVM Classification  364
  Hidden Markov Models -- Sequential Classification  367
  Notes  369
  Proof of Yang's Oracle Inequality  369
  Proof of Lecue's Oracle Inequality  372
  Exercises  374

Computational Comparisons  379
  Computational Results: Classification  380
  Comparison on Fisher's Iris Data  380
  Comparison on Ripley's Data  383
  Computational Results: Regression  390
  Vapnik's sinc Function  391
  Friedman's Function  403
  Conclusions  406
  Systematic Simulation Study  411
  No Free Lunch  414
  Exercises  416

Unsupervised Learning: Clustering  419
  Centroid-Based Clustering  422
  K-Means Clustering  423
  Variants  426
  Hierarchical Clustering  427
  Agglomerative Hierarchical Clustering  428
  Divisive Hierarchical Clustering  436
  Theory for Hierarchical Clustering  440
  Partitional Clustering  444
  Model-Based Clustering  446
  Graph-Theoretic Clustering  461
  Spectral Clustering  466
  Bayesian Clustering  472
  Probabilistic Clustering  472
  Hypothesis Testing  475
  Computed Examples  477
  Ripley's Data  479
  Iris Data  489
  Cluster Validation  494
  Notes  498
  Derivatives of Functions of a Matrix:  498
  Kruskal's Algorithm: Proof  498
  Prim's Algorithm: Proof  499
  Exercises  499

Learning in High Dimensions  506
  Principal Components  508
  Main Theorem  509
  Key Properties  511
  Extensions  513
  Factor Analysis  515
  Finding Λ and Ψ  517
  Finding K  519
  Estimating Factor Scores  520
  Projection Pursuit  521
  Independent Components Analysis  524
  Main Definitions  524
  Key Results  526
  Computational Approach  528
  Nonlinear PCs and ICA  529
  Nonlinear PCs  530
  Nonlinear ICA  531
  Geometric Summarization  531
  Measuring Distances to an Algebraic Shape  532
  Principal Curves and Surfaces  533
  Supervised Dimension Reduction: Partial Least Squares  536
  Simple PLS  536
  PLS Procedures  537
  Properties of PLS  539
  Supervised Dimension Reduction: Sufficient Dimensions in Regression  540
  Visualization I: Basic Plots  544
  Elementary Visualization  547
  Projections  554
  Time Dependence  556
  Visualization II: Transformations  559
  Chernoff Faces  559
  Multidimensional Scaling  560
  Self-Organizing Maps  566
  Exercises  573

Variable Selection  582
  Concepts from Linear Regression  583
  Subset Selection  585
  Variable Ranking  588
  Overview  590
  Traditional Criteria  591
  Akaike Information Criterion (AIC)  593
  Bayesian Information Criterion (BIC)  596
  Choices of Information Criteria  598
  Cross-Validation  600
  Shrinkage Methods  612
  Shrinkage Methods for Linear Models  614
  Grouping in Variable Selection  628
  Least Angle Regression  630
  Shrinkage Methods for Model Classes  633
  Cautionary Notes  644
  Bayes Variable Selection  645
  Prior Specification  648
  Posterior Calculation and Exploration  656
  Evaluating Evidence  660
  Connections Between Bayesian and Frequentist Methods  663
  Computational Comparisons  666
  The n > p Case  666
  When p > n  678
  Notes  680
  Code for Generating Data in Section 10.5  680
  Exercises  684

Multiple Testing  692
  Analyzing the Hypothesis Testing Problem  694
  A Paradigmatic Setting  694
  Counts for Multiple Tests  697
  Measures of Error in Multiple Testing  698
  Aspects of Error Control  700
  Controlling the Familywise Error Rate  703
  One-Step Adjustments  703
  Stepwise p-Value Adjustments  706
  PCER and PFER  708
  Null Domination  709
  Two Procedures  710
  Controlling the Type I Error Rate  715
  Adjusted p-Values for PFER/PCER  719
  Controlling the False Discovery Rate  720
  FDR and Other Measures of Error  722
  The Benjamini-Hochberg Procedure  723
  A BH Theorem for a Dependent Setting  724
  Variations on BH  726
  Controlling the Positive False Discovery Rate  732
  Bayesian Interpretations  732
  Aspects of Implementation  736
  Bayesian Multiple Testing  740
  Fully Bayes: Hierarchical  741
  Fully Bayes: Decision Theory  744
  Notes  749
  Proof of the Benjamini-Hochberg Theorem  749
  Proof of the Benjamini-Yekutieli Theorem  752

References  756

Index  785