Page 56 - Compact Numerical Methods For Computers
TABLE 3.1. Index numbers (1940 = 100) for farm money income and agricultural use of nitrogen, phosphate, potash and petroleum in the United States (courtesy Dr S Chin).

Income  Nitrogen  Phosphate  Potash  Petroleum
305     563       262        461     221
342     658       291        473     222
331     676       294        513     221
339     749       302        516     218
354     834       320        540     217
369     973       350        596     218
378     1079      386        650     218
368     1151      401        676     225
405     1324      446        769     228
438     1499      492        870     230
438     1690      510        907     237
451     1735      534        932     235
485     1778      559        956     236
there are (m - k) degrees of freedom and the corrected $R^2$ is
$\bar{R}^2 = 1 - (1 - R^2)(m - 1)/(m - k)$.    (3.52)
$R^2$ and $\bar{R}^2$ provide measures of the goodness of fit of our model which are not
dependent on the scale of the data.
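As an illustration, the two statistics might be computed as in the minimal sketch below. It assumes the usual definition of R² as one minus the ratio of the residual sum of squares to the total sum of squares about the mean, and the standard degrees-of-freedom correction for m observations and k fitted coefficients. The function name `r_squared` and its arguments are hypothetical, chosen only for this example.

```python
def r_squared(b, fitted, k):
    """Return (R2, corrected R2) for observations b, fitted values and k coefficients.

    Assumes the standard definitions; the names here are illustrative only.
    """
    m = len(b)
    mean_b = sum(b) / m
    # Residual sum of squares r^T r, with r = b - fitted.
    rss = sum((bi - fi) ** 2 for bi, fi in zip(b, fitted))
    # Total sum of squares about the mean of b.
    tss = sum((bi - mean_b) ** 2 for bi in b)
    r2 = 1.0 - rss / tss
    # Correction for degrees of freedom: (m - 1) in total, (m - k) in the residuals.
    r2_corrected = 1.0 - (1.0 - r2) * (m - 1) / (m - k)
    return r2, r2_corrected


if __name__ == "__main__":
    b = [1.0, 2.0, 3.0, 4.0]
    fitted = [1.5, 1.5, 3.5, 3.5]
    print(r_squared(b, fitted, k=2))
    # Rescaling the data leaves both statistics unchanged.
    print(r_squared([10 * x for x in b], [10 * x for x in fitted], k=2))
```

Multiplying both the observations and the fitted values by a constant scales the residual and total sums of squares by the same factor, which is why neither statistic depends on the scale of the data.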
Using the last four columns of table 3.1 together with a column of ones for the
matrix A in algorithm 2, with the first column of the table as the dependent
variable b, a Data General NOVA operating in 23-bit binary floating-point
arithmetic computes the singular values:
5298.55, 345.511, 36.1125, 21.4208 and 5.13828E-2.
The ratio of the smallest of these to the largest is only very slightly larger than the
machine precision, $2^{-22}$, and we may therefore expect that a great number of
extremely different models may give very similar degrees of approximation to the
data. Solutions (a), (b), (c) and (d) in table 3.2 therefore present the solutions
corresponding to all, four, three and two principal components, respectively. Note
that these have 8, 9, 10 and 11 degrees of freedom because we estimate the
coefficients of the principal components, then transform these to give solutions in
terms of our original variables. The solution given by only three principal
components is almost as good as that for all components, that is, a conventional
least-squares solution. However, the coefficients in solutions (a), (b) and (c) are
very different.
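The calculation described in this paragraph can be sketched as follows. This is a minimal illustration in Python using NumPy's `numpy.linalg.svd` in double precision rather than algorithm 2 on the NOVA, so the singular values and coefficients need not agree digit for digit with those quoted. Placing the column of ones last in A, and the names `pc_solution` and `results`, are assumptions made for the example.

```python
import numpy as np

# Table 3.1: income, nitrogen, phosphate, potash, petroleum (1940 = 100).
table = np.array([
    [305,  563, 262, 461, 221],
    [342,  658, 291, 473, 222],
    [331,  676, 294, 513, 221],
    [339,  749, 302, 516, 218],
    [354,  834, 320, 540, 217],
    [369,  973, 350, 596, 218],
    [378, 1079, 386, 650, 218],
    [368, 1151, 401, 676, 225],
    [405, 1324, 446, 769, 228],
    [438, 1499, 492, 870, 230],
    [438, 1690, 510, 907, 237],
    [451, 1735, 534, 932, 235],
    [485, 1778, 559, 956, 236],
], dtype=float)

b = table[:, 0]                                        # dependent variable: income
A = np.column_stack([table[:, 1:], np.ones(len(b))])   # last four columns plus a column of ones
m = len(b)

# Singular value decomposition A = U diag(s) Vt, singular values in decreasing order.
U, s, Vt = np.linalg.svd(A, full_matrices=False)
tss = float(np.sum((b - b.mean()) ** 2))               # total sum of squares about the mean


def pc_solution(k):
    """Least-squares coefficients using only the k largest singular values."""
    coeffs = (U[:, :k].T @ b) / s[:k]   # coefficients of the principal components
    return Vt[:k].T @ coeffs            # transformed back to the original variables


results = {}
for k in (5, 4, 3, 2):
    x = pc_solution(k)
    r = b - A @ x                        # residuals for this solution
    rss = float(r @ r)
    results[k] = (x, rss, m - k, 1.0 - rss / tss)

if __name__ == "__main__":
    print("singular values:", s)
    print("smallest/largest:", s[-1] / s[0])
    for k, (x, rss, dof, r2) in results.items():
        print(k, "components:", "dof =", dof, "RSS =", rss, "R2 =", r2)
        print("   coefficients:", x)
```

Each entry of `results` holds the coefficients in terms of the original variables, the residual sum of squares, the degrees of freedom m - k and R², so the four solutions can be compared side by side.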
Neither the algorithms in this book nor those anywhere else can make a clear
and final statement as to which solution is ‘best’. Here questions of statistical
significance will not be addressed, though they would probably enter into
consideration if we were trying to identify and estimate a model intended for use in