Page 291 - Soil and water contamination, 2nd edition

                       $$ t = \frac{0.682 - 0.812}{0.182 / \sqrt{6}} = -1.75 $$

                       The critical t for 5 degrees of freedom is 2.571 at a significance level of 0.05 (two-sided). This
                       means that to be able to reject the null hypothesis that the mean difference between the
                       observed and predicted values is zero, t must be 2.571 or more (or
                       –2.571 or less). In this case, t = –1.75, so we cannot reject the null hypothesis. However,
                       note that a decision not to reject the null hypothesis does not necessarily mean that the
                       null hypothesis is true, only that there is insufficient evidence against the hypothesis.
                       Given the reasonably large R² and the fact that we cannot reject the null hypothesis
                       that the mean difference between the observed and predicted values is zero,
                       we may decide to accept the model. It is possible that the deviation from the
                       1:1 line is a random effect given the small sample size of the example (n = 6). As in all
                       statistical tests, more confidence can be gained by examining a larger sample of lakes.
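
The arithmetic of the test above can be retraced with a few lines of Python. This is only a minimal sketch: the numbers are copied from the worked example, while the variable names and the reading of the three inputs (two means and a standard deviation) are assumptions made for illustration. The critical value is taken from the text rather than computed.

```python
import math

# Numbers copied from the worked example above; the variable names and
# their interpretation are assumptions made for this sketch.
mean_first = 0.682    # first term of the numerator
mean_second = 0.812   # second term of the numerator
std_dev = 0.182       # standard deviation in the denominator
n = 6                 # sample size (number of lakes)
t_critical = 2.571    # two-sided critical t, 5 degrees of freedom, alpha = 0.05

# t = (difference of the two means) / (standard deviation / sqrt(n))
t = (mean_first - mean_second) / (std_dev / math.sqrt(n))

# Two-sided test: reject only if |t| reaches the critical value.
reject_null = abs(t) >= t_critical

print(f"t = {t:.2f}, reject null hypothesis: {reject_null}")
```

Because |t| = 1.75 is smaller than 2.571, `reject_null` is `False`, in line with the conclusion drawn above.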


                    15.3  CONSIDERATIONS AFFECTING MODEL CHOICE

                    The outcomes from model verification, calibration, and validation may justify reconsidering
                    the model’s structure, i.e. the set of mathematical equations, including the mathematical
                    solution technique. If the model parameters have to be calibrated, the calibration data
                    set needs to contain sufficient information to identify these parameters. It has been
                    demonstrated (Van der Perk, 1997) that, for a given calibration data set, the model parameters
                    become less identifiable if the model becomes more complex; this is because both the errors
                    in the parameter estimates and their mutual correlation increase. This, in turn, enhances
                    the uncertainty of the model outcomes unless the correlations between the errors in the
                    model parameters are taken into account. On the other hand, the accuracy of the model
                    outcome generally increases with increasing model complexity, up to a maximum. Beyond
                    this maximum, the effect of increasing uncertainty due to poorer parameter identifiability
                    becomes manifest. This means that more complex models do not necessarily yield better
                    model outcomes and that a single ‘most adequate model’ can be found for a given set of
                    calibration data. This supports the general plea for the development and use of simple and
                    straightforward models that describe the most relevant processes and contain no redundant
                    parameters (process descriptions).
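
The loss of identifiability with growing complexity can be illustrated with a deliberately simple stand-in, which is not the analysis of Van der Perk (1997). The sketch below assumes ordinary least-squares fitting of a straight line to the same data, once with a slope only (y = a·x) and once with a slope and an intercept (y = a·x + b). The function names, the x values, and the error variance are hypothetical. It uses the standard least-squares formulas for the variance of the slope and for the correlation between the slope and intercept estimates.

```python
import math

def slope_only_variance(x, sigma2):
    """Variance of the slope estimate for the one-parameter model y = a*x."""
    return sigma2 / sum(xi * xi for xi in x)

def slope_intercept_stats(x, sigma2):
    """Slope variance and slope-intercept correlation for y = a*x + b."""
    n = len(x)
    sum_x = sum(x)
    sum_xx = sum(xi * xi for xi in x)
    sum_xx_centered = sum_xx - sum_x * sum_x / n   # sum of (x - mean)^2
    var_a = sigma2 / sum_xx_centered
    var_b = sigma2 * sum_xx / (n * sum_xx_centered)
    cov_ab = -sigma2 * (sum_x / n) / sum_xx_centered
    corr_ab = cov_ab / math.sqrt(var_a * var_b)
    return var_a, corr_ab

# Hypothetical calibration design: six observations, unit error variance.
x_values = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
error_variance = 1.0

var_simple = slope_only_variance(x_values, error_variance)
var_complex, corr_complex = slope_intercept_stats(x_values, error_variance)

print(f"slope variance, 1 parameter:  {var_simple:.4f}")
print(f"slope variance, 2 parameters: {var_complex:.4f}")
print(f"slope-intercept correlation:  {corr_complex:.3f}")
```

With the same data, adding the intercept raises the variance of the slope estimate from about 0.011 to about 0.057 and introduces a correlation of about -0.90 between the two estimates. This is the same qualitative pattern as described above, shown here on a toy example only.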
                       Note that the model input variables are also often modelled: for example, through
                    interpolation of measurements – either a simple interpolation using external data such as soil
                    maps, or a data-driven interpolation or simulation (see Burrough and McDonnell, 1998). In
                    some cases the input variables must be simulated using another physically-based model, and
                    so this simulation becomes part of the model structure. This is, for example, the case when
                    time series of rainfall simulated by an atmospheric circulation model are input into a regional
                    groundwater quality model or a catchment-based phosphorus transport model. Whether a
                    spatially distributed model or a ‘lumped’ model is chosen depends on the degree of spatial
                    variability relative to the degree of uncertainty in the end result. If the spatial variability is
                    much larger than the uncertainty in the model result, then spatially distributed modelling
                    makes sense from a predictive perspective.
                       In conclusion, a model structure is mostly chosen based on an assessment of the purpose
                    of the model, prior knowledge from the literature or experience, a heuristic evaluation of the
                    important processes, and the availability of data. However, it is not always possible to choose
                    the desired model structure: for example, when a commercial software package with fixed
                    process descriptions is used. Even when an appropriate model has been chosen, the quality of









