Assessing and Improving Prediction and Classification
Title: Assessing and Improving Prediction and Classification: Theory and Algorithms in C++
Author: Timothy Masters
Length: 517 pages
Edition: 1st ed.
Language: English
Publisher: Apress
Publication Date: 2017-12-20
ISBN-10: 1484233352
ISBN-13: 9781484233351

This book is dedicated to Master Hidy Ochiai with the utmost respect, admiration, and gratitude. His incomparable teaching of Washin-Ryu karate has raised my confidence, my physical ability, and my mental acuity far beyond anything I could have imagined. For this I will ever be grateful.

Table of Contents

About the Author
About the Technical Reviewers
Preface

Chapter 1: Assessment of Numeric Predictions
    Notation
    Overview of Performance Measures
    Consistency and Evolutionary Stability
    Selection Bias and the Need for Three Datasets
    Cross Validation and Walk-Forward Testing
    Bias in Cross Validation
    Overlap Considerations
    Assessing Nonstationarity Using Walk-Forward Testing
    Nested Cross Validation Revisited
    Common Performance Measures
    Mean Squared Error
    Mean Absolute Error
    R-Squared
    RMS Error
    Nonparametric Correlation
    Success Ratios
    Alternatives to Common Performance Measures
    Stratification for Consistency
    Confidence Intervals
    The Confidence Set
    Serial Correlation
    Multiplicative Data
    Normally Distributed Errors
    Empirical Quantiles as Confidence Intervals
    Confidence Bounds for Quantiles
    Tolerance Intervals

Chapter 2: Assessment of Class Predictions
    The Confusion Matrix
    Expected Gain
    ROC (Receiver Operating Characteristic) Curves
    Hits, False Alarms, and Related Measures
    Computing the ROC Curve
    Area Under the ROC Curve
    Cost and the ROC Curve
    Optimizing ROC-Based Statistics
    Optimizing the Threshold: Now or Later?
    Maximizing Precision
    Generalized Targets
    Maximizing Total Gain
    Maximizing Mean Gain
    Maximizing the Standardized Mean Gain
    Confidence in Classification Decisions
    Hypothesis Testing
    Confidence in the Confidence
    Bayesian Methods
    Multiple Classes
    Hypothesis Testing vs. Bayes' Method
    Final Thoughts on Hypothesis Testing
    Confidence Intervals for Future Performance

Chapter 3: Resampling for Assessing Parameter Estimates
    Bias and Variance of Statistical Estimators
    Plug-in Estimators and Empirical Distributions
    Bias of an Estimator
    Variance of an Estimator
    Bootstrap Estimation of Bias and Variance
    Code for Bias and Variance Estimation
    Plug-in Estimators Can Provide Better Bootstraps
    A Model Parameter Example
    Confidence Intervals
    Is the Interval Backward?
    Improving the Percentile Method
    Hypothesis Tests for Parameter Values
    Bootstrapping Ratio Statistics
    Jackknife Estimates of Bias and Variance
    Bootstrapping Dependent Data
    Estimating the Extent of Autocorrelation
    The Stationary Bootstrap
    Choosing a Block Size for the Stationary Bootstrap
    The Tapered Block Bootstrap
    Choosing a Block Size for the Tapered Block Bootstrap
    What If the Block Size Is Wrong?

Chapter 4: Resampling for Assessing Prediction and Classification
    Partitioning the Error
    Cross Validation
    Bootstrap Estimation of Population Error
    Efron's E0 Estimate of Population Error
    Efron's E632 Estimate of Population Error
    Comparing the Error Estimators for Prediction
    Comparing the Error Estimators for Classification

Chapter 5: Miscellaneous Resampling Techniques
    Bagging
    A Quasi-theoretical Justification
    The Component Models
    Code for Bagging
    AdaBoost
    Binary AdaBoost for Pure Classification Models
    Probabilistic Sampling for Inflexible Models
    Binary AdaBoost When the Model Provides Confidence
    AdaBoost.MH for More Than Two Classes
    AdaBoost.OC for More Than Two Classes
    Comparing the Boosting Algorithms
    A Binary Classification Problem
    A Multiple-Class Problem
    Final Thoughts on Boosting
    Permutation Training and Testing
    The Permutation Training Algorithm
    Partitioning the Training Performance
    A Demonstration of Permutation Training

Chapter 6: Combining Numeric Predictions
    Simple Average
    Code for Averaging Predictions
    Unconstrained Linear Combinations
    Constrained Linear Combinations
    Constrained Combination of Unbiased Models
    Variance-Weighted Interpolation
    Combination by Kernel Regression Smoothing
    Code for the GRNN
    Comparing the Combination Methods

Chapter 7: Combining Classification Models
    Introduction and Notation
    Reduction vs. Ordering
    The Majority Rule
    Code for the Majority Rule
    The Borda Count
    The Average Rule
    Code for the Average Rule
    The Median Alternative
    The Product Rule
    The MaxMax and MaxMin Rules
    The Intersection Method
    The Union Rule
    Logistic Regression
    Code for the Combined Weight Method
    The Logit Transform and Maximum Likelihood Estimation
    Code for Logistic Regression
    Separate Weight Sets
    Model Selection by Local Accuracy
    Code for Local Accuracy Selection
    Maximizing the Fuzzy Integral
    What Does This Have to Do with Classifier Combination?
    Code for the Fuzzy Integral
    Pairwise Coupling
    Pairwise Threshold Optimization
    A Cautionary Note
    Comparing the Combination Methods
    Small Training Set, Three Models
    Large Training Set, Three Models
    Small Training Set, Three Good Models, One Worthless
    Large Training Set, Three Good Models, One Worthless
    Small Training Set, Worthless and Noisy Models Included
    Large Training Set, Worthless and Noisy Models Included
    Five Classes

Chapter 8: Gating Methods
    Preordained Specialization
    Learned Specialization
    After-the-Fact Specialization
    Code for After-the-Fact Specialization
    Some Experimental Results
    General Regression Gating
    Code for GRNN Gating
    Experiments with GRNN Gating

Chapter 9: Information and Entropy
    Entropy
    Entropy of a Continuous Random Variable
    Partitioning a Continuous Variable for Entropy
    An Example of Improving Entropy
    Joint and Conditional Entropy
    Code for Conditional Entropy
    Mutual Information
    Fano's Bound and Selection of Predictor Variables
    Confusion Matrices and Mutual Information
    Extending Fano's Bound for Upper Limits
    Simple Algorithms for Mutual Information
    The TEST_DIS Program
    Continuous Mutual Information
    The Parzen Window Method
    Adaptive Partitioning
    The TEST_CON Program
    Predictor Selection Using Mutual Information
    Maximizing Relevance While Minimizing Redundancy
    The MI_DISC and MI_CONT Programs
    A Contrived Example of Information Minus Redundancy
    A Superior Selection Algorithm for Binary Variables
    Screening Without Redundancy
    Asymmetric Information Measures
    Uncertainty Reduction
    Transfer Entropy: Schreiber's Information Transfer

References
Index
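To give a flavor of the book's material, the Chapter 3 topic "Bootstrap Estimation of Bias and Variance" can be sketched in a few lines of C++. This is not the book's code; the statistic (the plug-in, divide-by-n sample variance) and all names are illustrative assumptions, chosen because the plug-in variance has a well-known downward bias that the bootstrap should detect.

```cpp
#include <cassert>
#include <cmath>
#include <numeric>
#include <random>
#include <utility>
#include <vector>

// Plug-in estimator: mean squared deviation (divides by n, hence biased low).
double plugin_variance(const std::vector<double>& x) {
    double mean = std::accumulate(x.begin(), x.end(), 0.0) / x.size();
    double ss = 0.0;
    for (double v : x) ss += (v - mean) * (v - mean);
    return ss / x.size();
}

// Draw B bootstrap resamples (with replacement), recompute the statistic on
// each, and return {bias estimate, variance estimate} of the estimator:
//   bias ~ mean(theta*_b) - theta_hat,  var ~ sample variance of theta*_b.
std::pair<double, double> bootstrap_bias_var(const std::vector<double>& x,
                                             int B, unsigned seed) {
    std::mt19937 rng(seed);
    std::uniform_int_distribution<std::size_t> pick(0, x.size() - 1);
    double theta_hat = plugin_variance(x);
    std::vector<double> thetas(B);
    for (int b = 0; b < B; ++b) {
        std::vector<double> resample(x.size());
        for (std::size_t i = 0; i < x.size(); ++i) resample[i] = x[pick(rng)];
        thetas[b] = plugin_variance(resample);
    }
    double mean_theta = std::accumulate(thetas.begin(), thetas.end(), 0.0) / B;
    double bias = mean_theta - theta_hat;
    double var = 0.0;
    for (double t : thetas) var += (t - mean_theta) * (t - mean_theta);
    var /= (B - 1);  // sample variance across bootstrap replications
    return {bias, var};
}
```

For the plug-in variance, theory gives an expected bootstrap bias of about -theta_hat/n, so on a sample of size 8 the returned bias estimate should come out clearly negative, which is the signal a practitioner would use to correct the estimate.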