Principles and Theory for Data Mining and Machine Learning
by: Bertrand Clarke, Ernest Fokoue, Hao Helen Zhang
Springer-Verlag, 2009
ISBN: 9780387981352
793 pages, download: 8680 KB
 
Format:  PDF
Suitable for: Apple iPad, Android tablet PCs; online reading; PC, Mac, laptop

Type: B (parallel access)

 

 
Table of Contents

  Preface 1  
  Variability, Information, and Prediction 16  
     The Curse of Dimensionality 18  
        The Two Extremes 19  
     Perspectives on the Curse 20  
        Sparsity 21  
        Exploding Numbers of Models 23  
        Multicollinearity and Concurvity 24  
        The Effect of Noise 25  
     Coping with the Curse 26  
        Selecting Design Points 26  
        Local Dimension 27  
        Parsimony 32  
     Two Techniques 33  
        The Bootstrap 33  
        Cross-Validation 42  
     Optimization and Search 47  
        Univariate Search 47  
        Multivariate Search 48  
        General Searches 49  
        Constraint Satisfaction and Combinatorial Search 50  
     Notes 53  
        Hammersley Points 53  
        Edgeworth Expansions for the Mean 54  
        Bootstrap Asymptotics for the Studentized Mean 56  
     Exercises 58  
  Local Smoothers 68  
     Early Smoothers 70  
     Transition to Classical Smoothers 74  
        Global Versus Local Approximations 75  
        LOESS 79  
     Kernel Smoothers 82  
        Statistical Function Approximation 83  
        The Concept of Kernel Methods and the Discrete Case 88  
        Kernels and Stochastic Designs: Density Estimation 93  
        Stochastic Designs: Asymptotics for Kernel Smoothers 96  
        Convergence Theorems and Rates for Kernel Smoothers 101  
        Kernel and Bandwidth Selection 105  
        Linear Smoothers 110  
     Nearest Neighbors 111  
     Applications of Kernel Regression 115  
        A Simulated Example 115  
        Ethanol Data 117  
     Exercises 122  
  Spline Smoothing 132  
     Interpolating Splines 132  
     Natural Cubic Splines 138  
     Smoothing Splines for Regression 141  
        Model Selection for Spline Smoothing 144  
        Spline Smoothing Meets Kernel Smoothing 145  
     Asymptotic Bias, Variance, and MISE for Spline Smoothers 146  
        Ethanol Data Example -- Continued 148  
     Splines Redux: Hilbert Space Formulation 151  
        Reproducing Kernels 153  
        Constructing an RKHS 156  
        Direct Sum Construction for Splines 161  
        Explicit Forms 164  
        Nonparametrics in Data Mining and Machine Learning 167  
     Simulated Comparisons 169  
        What Happens with Dependent Noise Models? 172  
        Higher Dimensions and the Curse of Dimensionality 174  
     Notes 178  
        Sobolev Spaces: Definition 178  
     Exercises 179  
  New Wave Nonparametrics 186  
     Additive Models 187  
        The Backfitting Algorithm 188  
        Concurvity and Inference 192  
        Nonparametric Optimality 195  
     Generalized Additive Models 196  
     Projection Pursuit Regression 199  
     Neural Networks 204  
        Backpropagation and Inference 207  
        Barron's Result and the Curse 212  
        Approximation Properties 213  
        Barron's Theorem: Formal Statement 215  
     Recursive Partitioning Regression 217  
        Growing Trees 219  
        Pruning and Selection 222  
        Regression 223  
        Bayesian Additive Regression Trees: BART 225  
     MARS 225  
     Sliced Inverse Regression 230  
     ACE and AVAS 233  
     Notes 235  
        Proof of Barron's Theorem 235  
     Exercises 239  
  Supervised Learning: Partition Methods 246  
     Multiclass Learning 248  
     Discriminant Analysis 250  
        Distance-Based Discriminant Analysis 251  
        Bayes Rules 256  
        Probability-Based Discriminant Analysis 260  
     Tree-Based Classifiers 264  
        Splitting Rules 264  
        Logic Trees 268  
        Random Forests 269  
     Support Vector Machines 277  
        Margins and Distances 277  
        Binary Classification and Risk 280  
        Prediction Bounds for Function Classes 283  
        Constructing SVM Classifiers 286  
        SVM Classification for Nonlinearly Separable Populations 294  
        SVMs in the General Nonlinear Case 297  
        Some Kernels Used in SVM Classification 303  
        Kernel Choice, SVMs and Model Selection 304  
        Support Vector Regression 305  
        Multiclass Support Vector Machines 308  
     Neural Networks 309  
     Notes 311  
        Hoeffding's Inequality 311  
        VC Dimension 312  
     Exercises 315  
  Alternative Nonparametrics 322  
     Ensemble Methods 323  
        Bayes Model Averaging 325  
        Bagging 327  
        Stacking 331  
        Boosting 333  
        Other Averaging Methods 341  
        Oracle Inequalities 343  
     Bayes Nonparametrics 349  
        Dirichlet Process Priors 349  
        Polya Tree Priors 351  
        Gaussian Process Priors 353  
     The Relevance Vector Machine 359  
        RVM Regression: Formal Description 360  
        RVM Classification 364  
     Hidden Markov Models -- Sequential Classification 367  
     Notes 369  
        Proof of Yang's Oracle Inequality 369  
        Proof of Lecue's Oracle Inequality 372  
     Exercises 374  
  Computational Comparisons 379  
     Computational Results: Classification 380  
        Comparison on Fisher's Iris Data 380  
        Comparison on Ripley's Data 383  
     Computational Results: Regression 390  
        Vapnik's sinc Function 391  
        Friedman's Function 403  
        Conclusions 406  
     Systematic Simulation Study 411  
     No Free Lunch 414  
     Exercises 416  
  Unsupervised Learning: Clustering 419  
     Centroid-Based Clustering 422  
        K-Means Clustering 423  
        Variants 426  
     Hierarchical Clustering 427  
        Agglomerative Hierarchical Clustering 428  
        Divisive Hierarchical Clustering 436  
        Theory for Hierarchical Clustering 440  
     Partitional Clustering 444  
        Model-Based Clustering 446  
        Graph-Theoretic Clustering 461  
        Spectral Clustering 466  
     Bayesian Clustering 472  
        Probabilistic Clustering 472  
        Hypothesis Testing 475  
     Computed Examples 477  
        Ripley's Data 479  
        Iris Data 489  
     Cluster Validation 494  
     Notes 498  
        Derivatives of Functions of a Matrix 498  
        Kruskal's Algorithm: Proof 498  
        Prim's Algorithm: Proof 499  
     Exercises 499  
  Learning in High Dimensions 506  
     Principal Components 508  
        Main Theorem 509  
        Key Properties 511  
        Extensions 513  
     Factor Analysis 515  
        Finding Λ and ψ 517  
        Finding K 519  
        Estimating Factor Scores 520  
     Projection Pursuit 521  
     Independent Components Analysis 524  
        Main Definitions 524  
        Key Results 526  
        Computational Approach 528  
     Nonlinear PCs and ICA 529  
        Nonlinear PCs 530  
        Nonlinear ICA 531  
     Geometric Summarization 531  
        Measuring Distances to an Algebraic Shape 532  
        Principal Curves and Surfaces 533  
     Supervised Dimension Reduction: Partial Least Squares 536  
        Simple PLS 536  
        PLS Procedures 537  
        Properties of PLS 539  
     Supervised Dimension Reduction: Sufficient Dimensions in Regression 540  
     Visualization I: Basic Plots 544  
        Elementary Visualization 547  
        Projections 554  
        Time Dependence 556  
     Visualization II: Transformations 559  
        Chernoff Faces 559  
        Multidimensional Scaling 560  
        Self-Organizing Maps 566  
     Exercises 573  
  Variable Selection 582  
     Concepts from Linear Regression 583  
        Subset Selection 585  
        Variable Ranking 588  
        Overview 590  
     Traditional Criteria 591  
        Akaike Information Criterion (AIC) 593  
        Bayesian Information Criterion (BIC) 596  
        Choices of Information Criteria 598  
        Cross Validation 600  
     Shrinkage Methods 612  
        Shrinkage Methods for Linear Models 614  
        Grouping in Variable Selection 628  
        Least Angle Regression 630  
        Shrinkage Methods for Model Classes 633  
        Cautionary Notes 644  
     Bayes Variable Selection 645  
        Prior Specification 648  
        Posterior Calculation and Exploration 656  
        Evaluating Evidence 660  
        Connections Between Bayesian and Frequentist Methods 663  
     Computational Comparisons 666  
        The n > p Case 666  
        When p > n 678  
     Notes 680  
        Code for Generating Data in Section 10.5 680  
     Exercises 684  
  Multiple Testing 692  
     Analyzing the Hypothesis Testing Problem 694  
        A Paradigmatic Setting 694  
        Counts for Multiple Tests 697  
        Measures of Error in Multiple Testing 698  
        Aspects of Error Control 700  
     Controlling the Familywise Error Rate 703  
        One-Step Adjustments 703  
        Stepwise p-Value Adjustments 706  
     PCER and PFER 708  
        Null Domination 709  
        Two Procedures 710  
        Controlling the Type I Error Rate 715  
        Adjusted p-Values for PFER/PCER 719  
     Controlling the False Discovery Rate 720  
        FDR and other Measures of Error 722  
        The Benjamini-Hochberg Procedure 723  
        A BH Theorem for a Dependent Setting 724  
        Variations on BH 726  
     Controlling the Positive False Discovery Rate 732  
        Bayesian Interpretations 732  
        Aspects of Implementation 736  
     Bayesian Multiple Testing 740  
        Fully Bayes: Hierarchical 741  
        Fully Bayes: Decision theory 744  
     Notes 749  
        Proof of the Benjamini-Hochberg Theorem 749  
        Proof of the Benjamini-Yekutieli Theorem 752  
  References 756  
  Index 785  

