Refereed full papers (journals, book chapters, international conferences)

1992

John E. Moody, The Effective Number of Parameters: An Analysis of Generalization and Regularization in Nonlinear Learning Systems, Advances in Neural Information Processing Systems, 4, pp. 847-854, 1992.

We present an analysis of how the generalization performance (expected test set error) relates to the expected training set error for nonlinear learning systems, such as multilayer perceptrons and radial basis functions. The principal result is the following relationship (computed to second order) between the expected test set and training set errors:

\begin{equation}
\left\langle \varepsilon_{\mathrm{test}}(\lambda) \right\rangle_{\xi \xi'} \;\approx\; \left\langle \varepsilon_{\mathrm{train}}(\lambda) \right\rangle_{\xi} \;+\; 2\,\sigma^2_{\mathrm{eff}}\,\frac{p_{\mathrm{eff}}(\lambda)}{n}. \tag{1}
\end{equation}

Here, $n$ is the size of the training sample $\xi$, $\sigma^2_{\mathrm{eff}}$ is the effective noise variance in the response variable(s), $\lambda$ is a regularization or weight decay parameter, and $p_{\mathrm{eff}}(\lambda)$ is the effective number of parameters in the nonlinear model. The expectations $\langle \cdot \rangle$ of training set and test set errors are taken over possible training sets $\xi$ and over training and test sets $\xi \xi'$, respectively. The effective number of parameters $p_{\mathrm{eff}}(\lambda)$ usually differs from the true number of model parameters $p$ for nonlinear or regularized models; this theoretical conclusion is supported by Monte Carlo experiments. In addition to the surprising result that $p_{\mathrm{eff}}(\lambda) \neq p$, we propose an estimate of (1) called the generalized prediction error (GPE), which generalizes well-established estimates of prediction risk such as Akaike's FPE and AIC, Mallows' $C_p$, and Barron's PSE to the nonlinear setting.
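To make the GPE estimate concrete, here is a minimal sketch for the special case of ridge (weight-decay) regression, where $p_{\mathrm{eff}}(\lambda)$ reduces to the trace of the hat matrix $H = X (X^\top X + \lambda I)^{-1} X^\top$; the function name, the synthetic data, and the residual-based noise-variance estimator are illustrative assumptions, not taken from the paper, which treats $\sigma^2_{\mathrm{eff}}$ and $p_{\mathrm{eff}}(\lambda)$ more generally for nonlinear models.

```python
import numpy as np

def gpe_ridge(X, y, lam):
    """Sketch of the GPE estimate (1) for ridge regression (hypothetical helper)."""
    n, p = X.shape
    # Ridge solution: w = (X^T X + lambda I)^{-1} X^T y
    A = X.T @ X + lam * np.eye(p)
    w = np.linalg.solve(A, X.T @ y)
    train_err = np.mean((y - X @ w) ** 2)
    # Effective number of parameters: p_eff = tr(H) = tr(A^{-1} X^T X)
    p_eff = np.trace(np.linalg.solve(A, X.T @ X))
    # Effective noise variance estimated from residuals (one common choice;
    # an assumption here, not the paper's general definition)
    sigma2_eff = n * train_err / max(n - p_eff, 1e-12)
    # GPE: expected test error ~ training error + 2 sigma^2_eff p_eff / n
    return train_err + 2.0 * sigma2_eff * p_eff / n

# Usage: pick the weight-decay parameter minimizing GPE on synthetic data.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 10))
y = X @ rng.normal(size=10) + 0.5 * rng.normal(size=100)
best = min([0.01, 0.1, 1.0, 10.0], key=lambda lam: gpe_ridge(X, y, lam))
print("lambda chosen by GPE:", best)
```

Note that as $\lambda \to 0$ the trace approaches $p$ and the estimate recovers the classical FPE/$C_p$ form, while larger $\lambda$ shrinks $p_{\mathrm{eff}}(\lambda)$ below $p$, illustrating the paper's point that the effective and true parameter counts differ for regularized models.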