Data Mining - Practical Machine Learning Tools and Techniques
Ian H. Witten, Eibe Frank, Mark A. Hall
Publisher: Elsevier Reference Monographs, 2011
ISBN 9780080890364, 664 pages
3rd edition
Format: PDF, ePUB, OL
Copy protection: DRM
Front cover
Data Mining: Practical Machine Learning Tools and Techniques
Copyright page
Table of contents
List of Figures
List of Tables
Preface
Updated and revised content
Acknowledgments
About the Authors
PART I: Introduction to Data Mining
Chapter 1: What’s It All About?
Data mining and machine learning
Simple examples: the weather and other problems
Fielded applications
Machine learning and statistics
Generalization as search
Data mining and ethics
Further reading
Chapter 2: Input: Concepts, Instances, and Attributes
What’s a concept?
What’s in an example?
What’s in an attribute?
Preparing the input
Further reading
Chapter 3: Output: Knowledge Representation
Tables
Linear models
Trees
Rules
Instance-based representation
Clusters
Further reading
Chapter 4: Algorithms: The Basic Methods
Inferring rudimentary rules
Statistical modeling
Divide-and-conquer: constructing decision trees
Covering algorithms: constructing rules
Mining association rules
Linear models
Instance-based learning
Clustering
Multi-instance learning
Further reading
Weka implementations
Chapter 5: Credibility: Evaluating What’s Been Learned
Training and testing
Predicting performance
Cross-validation
Other estimates
Comparing data mining schemes
Predicting probabilities
Counting the cost
Evaluating numeric prediction
Minimum description length principle
Applying the MDL principle to clustering
Further reading
PART II: Advanced Data Mining
Chapter 6: Implementations: Real Machine Learning Schemes
Decision trees
Classification rules
Association rules
Extending linear models
Instance-based learning
Numeric prediction with local linear models
Bayesian networks
Clustering
Semisupervised learning
Multi-instance learning
Weka implementations
Chapter 7: Data Transformations
Attribute selection
Discretizing numeric attributes
Projections
Sampling
Cleansing
Transforming multiple classes to binary ones
Calibrating class probabilities
Further reading
Weka implementations
Chapter 8: Ensemble Learning
Combining multiple models
Bagging
Randomization
Boosting
Additive regression
Interpretable ensembles
Stacking
Further reading
Weka implementations
Chapter 9: Moving on: Applications and Beyond
Applying data mining
Learning from massive datasets
Data stream learning
Incorporating domain knowledge
Text mining
Web mining
Adversarial situations
Ubiquitous data mining
Further reading
PART III: The Weka Data Mining Workbench
Chapter 10: Introduction to Weka
What’s in Weka?
How do you use it?
What else can you do?
How do you get it?
Chapter 11: The Explorer
Getting started
Exploring the Explorer
Filtering algorithms
Learning algorithms
Metalearning algorithms
Clustering algorithms
Association-rule learners
Attribute selection
Chapter 12: The Knowledge Flow Interface
Getting started
Components
Configuring and connecting the components
Incremental learning
Chapter 13: The Experimenter
Getting started
Simple setup
Advanced setup
The analyze panel
Distributing processing over several machines
Chapter 14: The Command-Line Interface
Getting started
The structure of Weka
Command-line options
Chapter 15: Embedded Machine Learning
A simple data mining application
Chapter 16: Writing New Learning Schemes
An example classifier
Conventions for implementing classifiers
Chapter 17: Tutorial Exercises for the Weka Explorer
Introduction to the Explorer interface
Nearest-neighbor learning and decision trees
Classification boundaries
Preprocessing and parameter tuning
Document classification
Mining association rules
References
Index