Chemometrics with R - Multivariate Data Analysis in the Natural Sciences and Life Sciences
Ron Wehrens
Publisher: Springer-Verlag, 2011
ISBN 9783642178412, 286 pages
Format: PDF, OL
Copy protection: watermark
Chemometrics with R
Preface
Contents
1 Introduction
Part I Preliminaries
2 Data
3 Preprocessing
3.1 Dealing with Noise
3.2 Baseline Removal
3.3 Aligning Peaks – Warping
3.3.1 Parametric Time Warping
3.3.2 Dynamic Time Warping
3.3.3 Practicalities
3.4 Peak Picking
3.5 Scaling
3.6 Missing Data
3.7 Conclusion
Part II Exploratory Analysis
4 Principal Component Analysis
4.1 The Machinery
4.2 Doing It Yourself
4.3 Choosing the Number of PCs
4.3.1 Statistical Tests
4.4 Projections
4.5 R Functions for PCA
4.6 Related Methods
4.6.1 Multidimensional Scaling
4.6.2 Independent Component Analysis and Projection Pursuit
4.6.3 Factor Analysis
4.6.4 Discussion
5 Self-Organizing Maps
5.1 Training SOMs
5.2 Visualization
5.3 Application
5.4 R Packages for SOMs
5.5 Discussion
6 Clustering
6.1 Hierarchical Clustering
6.2 Partitional Clustering
6.2.1 K-Means
6.2.2 K-Medoids
6.3 Probabilistic Clustering
6.4 Comparing Clusterings
6.5 Discussion
Part III Modelling
7 Classification
7.1 Discriminant Analysis
7.1.1 Linear Discriminant Analysis
7.1.2 Crossvalidation
7.1.3 Fisher LDA
7.1.4 Quadratic Discriminant Analysis
7.1.5 Model-Based Discriminant Analysis
7.1.6 Regularized Forms of Discriminant Analysis
Diagonal Discriminant Analysis
Shrunken Centroid Discriminant Analysis
7.2 Nearest-Neighbour Approaches
7.3 Tree-Based Approaches
7.3.1 Recursive Partitioning and Regression Trees
Constructing the Tree
7.3.2 Discussion
7.4 More Complicated Techniques
7.4.1 Support Vector Machines
Extensions to More than Two Classes
Finding the Right Parameters
7.4.2 Artificial Neural Networks
8 Multivariate Regression
8.1 Multiple Regression
8.1.1 Limits of Multiple Regression
8.2 PCR
8.2.1 The Algorithm
8.2.2 Selecting the Optimal Number of Components
8.3 Partial Least Squares (PLS) Regression
8.3.1 The Algorithm(s)
8.3.2 Interpretation
PLS Packages for R
8.4 Ridge Regression
8.5 Continuum Methods
8.6 Some Non-Linear Regression Techniques
8.6.1 SVMs for Regression
8.6.2 ANNs for Regression
8.7 Classification as a Regression Problem
8.7.1 Regression for LDA
8.7.2 Discussion
Part IV Model Inspection
9 Validation
9.1 Representativity and Independence
9.2 Error Measures
9.3 Model Selection
9.4 Crossvalidation Revisited
9.4.1 LOO Crossvalidation
9.4.2 Leave-Multiple-Out Crossvalidation
9.4.3 Double Crossvalidation
9.5 The Jackknife
9.6 The Bootstrap
9.6.1 Error Estimation with the Bootstrap
9.6.2 Confidence Intervals for Regression Coefficients
9.6.3 Other R Packages for Bootstrapping
9.7 Integrated Modelling and Validation
9.7.1 Bagging
9.7.2 Random Forests
9.7.3 Boosting
10 Variable Selection
10.1 Tests for Coefficient Significance
10.1.1 Confidence Intervals for Individual Coefficients
10.1.2 Tests Based on Overall Error Contributions
10.2 Explicit Coefficient Penalization
10.3 Global Optimization Methods
10.3.1 Simulated Annealing
10.3.2 Genetic Algorithms
10.3.3 Discussion
Part V Applications
11 Chemometric Applications
11.1 Outlier Detection with Robust PCA
11.1.1 Robust PCA
11.1.2 Discussion
11.2 Orthogonal Signal Correction and OPLS
11.3 Discrimination with Fat Data Matrices
11.3.1 PCDA
11.3.2 PLSDA
A Word of Warning
11.4 Calibration Transfer
11.5 Multivariate Curve Resolution
11.5.1 Theory
11.5.2 Finding Suitable Initial Estimates
Evolving Factor Analysis
OPA – the Orthogonal Projection Approach
11.5.3 Applying MCR
11.5.4 Constraints
11.5.5 Combining Data Sets
Part VI Appendices
A R Packages Used in this Book
References
Index