Preface  5

Contents  7

1  Data Mining and Information Systems: Quo Vadis?  14
Robert Stahlbock, Stefan Lessmann, and Sven F. Crone
  1.1  Introduction  14
  1.2  Special Issues in Data Mining  16
    1.2.1  Confirmatory Data Analysis  16
    1.2.2  Knowledge Discovery from Supervised Learning  17
    1.2.3  Classification Analysis  19
    1.2.4  Hybrid Data Mining Procedures  21
    1.2.5  Web Mining  23
    1.2.6  Privacy-Preserving Data Mining  24
  1.3  Conclusion and Outlook  25
  References  26

Part I  Confirmatory Data Analysis  29

2  Response-Based Segmentation Using Finite Mixture Partial Least Squares  30
Christian M. Ringle, Marko Sarstedt, and Erik A. Mooi
  2.1  Introduction  31
    2.1.1  On the Use of PLS Path Modeling  31
    2.1.2  Problem Statement  33
    2.1.3  Objectives and Organization  34
  2.2  Partial Least Squares Path Modeling  35
  2.3  Finite Mixture Partial Least Squares Segmentation  37
    2.3.1  Foundations  37
    2.3.2  Methodology  39
    2.3.3  Systematic Application of FIMIX-PLS  42
  2.4  Application of FIMIX-PLS  45
    2.4.1  On Measuring Customer Satisfaction  45
    2.4.2  Data and Measures  45
    2.4.3  Data Analysis and Results  47
  2.5  Summary and Conclusion  55
  References  56

Part II  Knowledge Discovery from Supervised Learning  61

3  Building Acceptable Classification Models  62
David Martens and Bart Baesens
  3.1  Introduction  63
  3.2  Comprehensibility of Classification Models  64
    3.2.1  Measuring Comprehensibility  66
    3.2.2  Obtaining Comprehensible Classification Models  67
      3.2.2.1  Building Rule-Based Models  67
      3.2.2.2  Combining Output Types  67
      3.2.2.3  Visualization  67
  3.3  Justifiability of Classification Models  68
    3.3.1  Taxonomy of Constraints  69
    3.3.2  Monotonicity Constraint  71
    3.3.3  Measuring Justifiability  72
    3.3.4  Obtaining Justifiable Classification Models  77
  3.4  Conclusion  79
  References  80

4  Mining Interesting Rules Without Support Requirement: A General Universal Existential Upward Closure Property  84
Yannick Le Bras, Philippe Lenca, and Stéphane Lallich
  4.1  Introduction  85
  4.2  State of the Art  86
  4.3  An Algorithmic Property of Confidence  89
    4.3.1  On the UEUC Framework  89
    4.3.2  The UEUC Property  89
    4.3.3  An Efficient Pruning Algorithm  90
    4.3.4  Generalizing the UEUC Property  91
  4.4  A Framework for the Study of Measures  93
    4.4.1  Adapted Functions of Measure  93
      4.4.1.1  Association Rules  93
      4.4.1.2  Contingency Tables  93
      4.4.1.3  Minimal Joint Domain
    4.4.2  Expression of a Set of Measures of Ddconf  96
  4.5  Conditions for GUEUC  99
    4.5.1  A Sufficient Condition  99
    4.5.2  A Necessary Condition  100
    4.5.3  Classification of the Measures  101
  4.6  Conclusion  103
  References  104

5  Classification Techniques and Error Control in Logic Mining  108
Giovanni Felici, Bruno Simeone, and Vincenzo Spinelli
  5.1  Introduction  109
  5.2  Brief Introduction to Box Clustering  111
  5.3  BC-Based Classifier  113
  5.4  Best Choice of a Box System  117
  5.5  Bi-criterion Procedure for BC-Based Classifier  120
  5.6  Examples  121
    5.6.1  The Data Sets  121
    5.6.2  Experimental Results with BC  122
    5.6.3  Comparison with Decision Trees  124
  5.7  Conclusions  126
  References  126

Part III  Classification Analysis  129

6  An Extended Study of the Discriminant Random Forest  130
Tracy D. Lemmond, Barry Y. Chen, Andrew O. Hatch, and William G. Hanley
  6.1  Introduction  130
  6.2  Random Forests  131
  6.3  Discriminant Random Forests  132
    6.3.1  Linear Discriminant Analysis  133
    6.3.2  The Discriminant Random Forest Methodology  134
  6.4  DRF and RF: An Empirical Study  135
    6.4.1  Hidden Signal Detection  136
      6.4.1.1  Training on T1, Testing on J2  137
      6.4.1.2  Prediction Performance for J2 with Cross-Validation  138
    6.4.2  Radiation Detection  139
    6.4.3  Significance of Empirical Results  143
    6.4.4  Small Samples and Early Stopping  144
    6.4.5  Expected Cost  150
  6.5  Conclusions  150
  References  152

7  Prediction with the SVM Using Test Point Margins  154
Süreyya Özögür-Akyüz, Zakria Hussain, and John Shawe-Taylor
  7.1  Introduction  154
  7.2  Methods  158
  7.3  Data Set Description  161
  7.4  Results  161
  7.5  Discussion and Future Work  162
  References  164

8  Effects of Oversampling Versus Cost-Sensitive Learning for Bayesian and SVM Classifiers  166
Alexander Liu, Cheryl Martin, Brian La Cour, and Joydeep Ghosh
  8.1  Introduction  166
  8.2  Resampling  168
    8.2.1  Random Oversampling  168
    8.2.2  Generative Oversampling  168
  8.3  Cost-Sensitive Learning  169
  8.4  Related Work  170
  8.5  A Theoretical Analysis of Oversampling Versus Cost-Sensitive Learning  171
    8.5.1  Bayesian Classification  171
    8.5.2  Resampling Versus Cost-Sensitive Learning in Bayesian Classifiers  172
    8.5.3  Effect of Oversampling on Gaussian Naive Bayes  173
      8.5.3.1  Random Oversampling  174
      8.5.3.2  Generative Oversampling  174
      8.5.3.3  Comparison to Cost-Sensitive Learning  175
    8.5.4  Effects of Oversampling for Multinomial Naive Bayes  175
  8.6  Empirical Comparison of Resampling and Cost-Sensitive Learning  177
    8.6.1  Explaining Empirical Differences Between Resampling and Cost-Sensitive Learning  177
    8.6.2  Naive Bayes Comparisons on Low-Dimensional Gaussian Data  178
      8.6.2.1  Gaussian Naive Bayes on Artificial, Low-Dimensional Data  179
      8.6.2.2  A Note on ROC and AUC  180
      8.6.2.3  Gaussian Naive Bayes on Real, Low-Dimensional Data
    8.6.3  Multinomial Naive Bayes  183
    8.6.4  SVMs  185
    8.6.5  Discussion  188
  8.7  Conclusion  189
  Appendix  190
  References  197

9  The Impact of Small Disjuncts on Classifier Learning  200
Gary M. Weiss
  9.1  Introduction  200
  9.2  An Example: The Vote Data Set  202
  9.3  Description of Experiments  204
  9.4  The Problem with Small Disjuncts  205
  9.5  The Effect of Pruning on Small Disjuncts  209
  9.6  The Effect of Training Set Size on Small Disjuncts  217
  9.7  The Effect of Noise on Small Disjuncts  220
  9.8  The Effect of Class Imbalance on Small Disjuncts  224
  9.9  Related Work  227
  9.10  Conclusion  230
  References  232

Part IV  Hybrid Data Mining Procedures  234

10  Predicting Customer Loyalty Labels in a Large Retail Database: A Case Study in Chile  235
Cristián J. Figueroa
  10.1  Introduction  235
  10.2  Related Work  237
  10.3  Objectives of the Study  239
    10.3.1  Supervised and Unsupervised Learning  240
    10.3.2  Unsupervised Algorithms  240
      10.3.2.1  Self-Organizing Map  240
      10.3.2.2  Sammon Mapping  242
      10.3.2.3  Curvilinear Component Analysis  243
    10.3.3  Variables for Segmentation  244
    10.3.4  Exploratory Data Analysis  245
    10.3.5  Results of the Segmentation  246
  10.4  Results of the Classifier  247
  10.5  Business Validation  250
    10.5.1  In-Store Minutes Charges for Prepaid Cell Phones  251
    10.5.2  Distribution of Products in the Store  252
  10.6  Conclusions and Discussion  254
  Appendix  256
  References  258

11  PCA-Based Time Series Similarity Search  260
Leonidas Karamitopoulos, Georgios Evangelidis, and Dimitris Dervos
  11.1  Introduction  261
  11.2  Background  263
    11.2.1  Review of PCA  263
    11.2.2  Implications of PCA in Similarity Search  264
    11.2.3  Related Work  266
  11.3  Proposed Approach  268
  11.4  Experimental Methodology  270
    11.4.1  Data Sets  270
    11.4.2  Evaluation Methods  271
    11.4.3  Rival Measures  272
  11.5  Results  273
    11.5.1  1-NN Classification  273
    11.5.2  k-NN Similarity Search  276
    11.5.3  Speeding Up the Calculation of APEdist  277
  11.6  Conclusion  279
  References  279

12  Evolutionary Optimization of Least-Squares Support Vector Machines  282
Arjan Gijsberts, Giorgio Metta, and Léon Rothkrantz
  12.1  Introduction  283
  12.2  Kernel Machines  283
    12.2.1  Least-Squares Support Vector Machines  284
    12.2.2  Kernel Functions  285
      12.2.2.1  Conditions for Kernels  285
  12.3  Evolutionary Computation  286
    12.3.1  Genetic Algorithms  286
    12.3.2  Evolution Strategies  287
    12.3.3  Genetic Programming  288
  12.4  Related Work  288
    12.4.1  Hyperparameter Optimization  289
    12.4.2  Combined Kernel Functions  289
  12.5  Evolutionary Optimization of Kernel Machines  291
    12.5.1  Hyperparameter Optimization  291
    12.5.2  Kernel Construction  292
    12.5.3  Objective Function  293
  12.6  Results  294
    12.6.1  Data Sets  294
    12.6.2  Results for Hyperparameter Optimization  295
    12.6.3  Results for EvoKMGP  298
  12.7  Conclusions and Future Work  299
  References  300

13  Genetically Evolved kNN Ensembles  303
Ulf Johansson, Rikard König, and Lars Niklasson
  13.1  Introduction  303
  13.2  Background and Related Work  305
  13.3  Method  306
    13.3.1  Data Sets  309
  13.4  Results  311
  13.5  Conclusions  316
  References  317

Part V  Web Mining  318

14  Behaviorally Founded Recommendation Algorithm for Browsing Assistance Systems  319
Peter Géczy, Noriaki Izumi, Shotaro Akaho, and Kôiti Hasida
  14.1  Introduction  319
    14.1.1  Related Works  320
    14.1.2  Our Contribution and Approach  321
  14.2  Concept Formalization  321
  14.3  System Design  325
    14.3.1  A Priori Knowledge of Human–System Interactions  325
    14.3.2  Strategic Design Factors  325
    14.3.3  Recommendation Algorithm Derivation  327
  14.4  Practical Evaluation  329
    14.4.1  Intranet Portal  330
    14.4.2  System Evaluation  332
    14.4.3  Practical Implications and Limitations  333
  14.5  Conclusions and Future Work  334
  References  335

15  Using Web Text Mining to Predict Future Events: A Test of the Wisdom of Crowds Hypothesis  337
Scott Ryan and Lutz Hamel
  15.1  Introduction  337
  15.2  Method  339
    15.2.1  Hypotheses and Goals  339
    15.2.2  General Methodology  341
    15.2.3  The 2006 Congressional and Gubernatorial Elections  341
    15.2.4  Sporting Events and Reality Television Programs  342
    15.2.5  Movie Box Office Receipts and Music Sales  343
    15.2.6  Replication  344
  15.3  Results and Discussion  345
    15.3.1  The 2006 Congressional and Gubernatorial Elections  345
    15.3.2  Sporting Events and Reality Television Programs  347
    15.3.3  Movie and Music Album Results  349
  15.4  Conclusion  350
  References  351

Part VI  Privacy-Preserving Data Mining  353

16  Avoiding Attribute Disclosure with the (Extended) p-Sensitive k-Anonymity Model  354
Traian Marius Truta and Alina Campan
  16.1  Introduction  354
  16.2  Privacy Models and Algorithms  355
    16.2.1  The p-Sensitive k-Anonymity Model and Its Extension  355
    16.2.2  Algorithms for the p-Sensitive k-Anonymity Model  358
  16.3  Experimental Results  361
    16.3.1  Experiments for p-Sensitive k-Anonymity  361
    16.3.2  Experiments for Extended p-Sensitive k-Anonymity  363
  16.4  New Enhanced Models Based on p-Sensitive k-Anonymity  367
    16.4.1  Constrained p-Sensitive k-Anonymity  367
    16.4.2  p-Sensitive k-Anonymity in Social Networks  371
  16.5  Conclusions and Future Work  373
  References  373

17  Privacy-Preserving Random Kernel Classification of Checkerboard Partitioned Data  375
Olvi L. Mangasarian and Edward W. Wild
  17.1  Introduction  375
  17.2  Privacy-Preserving Linear Classifier for Checkerboard Partitioned Data  379
  17.3  Privacy-Preserving Nonlinear Classifier for Checkerboard Partitioned Data  381
  17.4  Computational Results  382
  17.5  Conclusion and Outlook  384
  References  386