A sensitivity study of bias and variance of k-fold cross-validation in prediction error estimation

Rodríguez Fernández, Juan Diego; Pérez Martínez, Aritz; Lozano Alonso, José Antonio

dc.contributor.author	Rodríguez Fernández, Juan Diego
dc.contributor.author	Pérez Martínez, Aritz
dc.contributor.author	Lozano Alonso, José Antonio
dc.date.accessioned	2011-11-09T20:21:17Z
dc.date.available	2011-11-09T20:21:17Z
dc.date.issued	2009
dc.identifier.uri	http://hdl.handle.net/10810/4628
dc.description.abstract	In machine learning, the performance of a classifier is usually measured in terms of prediction error. In most real-world problems the error cannot be calculated exactly and must be estimated, so it is important to choose an appropriate estimator of the error. This paper analyzes the statistical properties (bias and variance) of the k-fold cross-validation classification error estimator (k-cv). Our main contribution is a novel theoretical decomposition of the variance of the k-cv estimator into its sources: sensitivity to changes in the training set and sensitivity to changes in the folds. The paper also compares the bias and variance of the estimator for different values of k. The empirical study was performed in artificial domains because they allow exact computation of the quantities involved and rigorous specification of the experimental conditions. It covers two classifiers (naïve Bayes and nearest neighbor), different numbers of folds (2, 5, 10, n) and sample sizes, and training sets drawn from assorted probability distributions.	es
dc.language.iso	eng	es
dc.relation.ispartofseries	EHU-KZAA-TR;2009-00-1
dc.rights	info:eu-repo/semantics/openAccess	es
dc.title	A sensitivity study of bias and variance of k-fold cross-validation in prediction error estimation	es
dc.type	info:eu-repo/semantics/report	es
dc.departamentoes	Ciencia de la computación e inteligencia artificial	es_ES
dc.departamentoeu	Konputazio zientziak eta adimen artifiziala	es_ES
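
As a rough illustration of the estimator named in the abstract, the following minimal sketch computes a k-cv error estimate for a 1-nearest-neighbor classifier and re-partitions a fixed training set several times to expose the sensitivity to the folds. It is not the report's code: the function names, the two-Gaussian artificial domain, the sample size, and the seeds are all assumptions chosen for illustration.

```python
import random

def k_fold_indices(n, k, rng):
    """Shuffle the indices 0..n-1 and split them into k near-equal folds."""
    idx = list(range(n))
    rng.shuffle(idx)
    return [idx[i::k] for i in range(k)]

def nearest_neighbor_predict(train, x):
    """1-NN on a scalar feature: return the label of the closest training point."""
    return min(train, key=lambda p: abs(p[0] - x))[1]

def k_cv_error(data, k, rng):
    """k-fold cross-validation estimate of the classification error."""
    mistakes = 0
    for fold in k_fold_indices(len(data), k, rng):
        held_out = set(fold)
        train = [data[i] for i in range(len(data)) if i not in held_out]
        mistakes += sum(
            nearest_neighbor_predict(train, data[i][0]) != data[i][1]
            for i in fold
        )
    return mistakes / len(data)

def sample(n, rng):
    """Hypothetical artificial domain: two unit-variance Gaussian classes."""
    out = []
    for _ in range(n):
        label = rng.randrange(2)
        out.append((rng.gauss(2.0 * label, 1.0), label))
    return out

# One fixed training set (assumed size 60).
data = sample(60, random.Random(0))

# Same training set, different fold partitions: the spread of these
# estimates reflects sensitivity to changes in the folds only.
estimates = [k_cv_error(data, 5, random.Random(seed)) for seed in range(20)]
print(min(estimates), max(estimates))
```

Drawing a fresh training set with `sample` for each repetition instead of reusing `data` would expose the other source of variance, sensitivity to changes in the training set. With k equal to n (leave-one-out) there is only one possible partition, so in this sketch the estimate no longer depends on the fold seed.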

Files in this item

Name: tr09-00-1.pdf
Size: 1.891Mb
Format: PDF

This item appears in the following Collection(s)

Informes técnicos y Documentos de trabajo
