Page 86 - Glucose Monitoring Devices

The state-of-the-art modeling method by Vettoretti et al.




                  Definition of training and test sets
                  Let us suppose we have a dataset containing $n_{tot}$ SMBG measurements, $x_i$,
                  $i = 1, \ldots, n_{tot}$, and $n_{tot}$ reference samples, $r_i$, $i = 1, \ldots, n_{tot}$,
                  collected by laboratory instruments of high precision and accuracy. For each pair
                  $(x_i, r_i)$, the SMBG absolute error, $e_i^{abs}$, and the SMBG relative error,
                  $e_i^{rel}$, are calculated as follows:

                  \[ e_i^{abs} = x_i - r_i, \qquad e_i^{rel} = \frac{x_i - r_i}{r_i} \cdot 100 \tag{5.2} \]
                     Note that here we refer to absolute error and relative error as the signed differ-
                  ence and the signed percent difference between the SMBG measurement and its
                  reference sample, respectively.
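
As a minimal sketch of the signed error definitions, the computation can be written in a few lines of Python. The function name `smbg_errors` and the sample values are hypothetical and chosen only for illustration; they are not taken from the original dataset.

```python
def smbg_errors(x, r):
    """Return signed absolute and relative (percent) errors for paired values.

    x: SMBG measurements (mg/dL); r: matching reference samples (mg/dL).
    """
    if len(x) != len(r):
        raise ValueError("x and r must contain the same number of samples")
    # Signed difference between each measurement and its reference.
    e_abs = [xi - ri for xi, ri in zip(x, r)]
    # Signed percent difference, normalized by the reference value.
    e_rel = [100.0 * (xi - ri) / ri for xi, ri in zip(x, r)]
    return e_abs, e_rel


# Hypothetical values in mg/dL, for illustration only.
x = [110.0, 90.0, 200.0]
r = [100.0, 100.0, 160.0]
e_abs, e_rel = smbg_errors(x, r)
```

Because both errors keep their sign, an overestimate and an underestimate of the same size do not look identical, which matters when the mean of the error distribution is assessed later.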
                     Error data are then divided into two parts. The first part, with cardinality $n_{training}$,
                  is used as the training set to derive the model of the SMBG error PDF in the following
                  steps B and C. The second part, with cardinality $n_{test}$, is used as the test set to
                  validate the model in step D. Absolute and relative errors of the training set can be
                  displayed in a scatter plot versus reference glucose to visually assess whether the
                  characteristics of the error distribution (e.g., mean and dispersion) vary significantly
                  across the reference glucose range.
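
The text does not state how the error data are divided, so the following is only one possible sketch: a seeded random split. The function name `split_errors`, the 70% training fraction, and the synthetic pairs are all assumptions made for illustration.

```python
import random


def split_errors(pairs, train_fraction=0.7, seed=0):
    """Randomly split (reference, error) pairs into training and test sets.

    The random split and the default fraction are assumptions; the source
    only says that the error data are divided into two parts.
    """
    if not 0.0 < train_fraction < 1.0:
        raise ValueError("train_fraction must lie strictly between 0 and 1")
    shuffled = list(pairs)
    # A dedicated, seeded generator keeps the split reproducible.
    random.Random(seed).shuffle(shuffled)
    n_training = round(train_fraction * len(shuffled))
    return shuffled[:n_training], shuffled[n_training:]


# Synthetic (reference glucose, error) pairs, for illustration only.
pairs = [(float(60 + 10 * k), float(k % 3 - 1)) for k in range(10)]
training, test = split_errors(pairs)
```

Keeping each error paired with its reference glucose value through the split is what allows the training errors to be plotted against reference glucose afterwards.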

                  Constant-SD zones identification
                  Changes in the dispersion of absolute and relative errors with reference glucose are
                  quantified in the training set by analyzing the sample SD. In particular, a uniform
                  grid $g_i$, $i = 1, \ldots, n_g$, where $n_g$ is the number of points in the grid, is first
                  defined over the glucose range with step $S$ (e.g., $S = 5$ mg/dL). Then, intervals
                  centered at the points $g_i$, $i = 1, \ldots, n_g$, with half-width $L$ (e.g., $L = 15$ mg/dL)
                  are defined. Finally, the sample SD of absolute and relative errors in each interval
                  $g_i \pm L$ is calculated, which approximates the error SD (absolute or relative) at the
                  glucose point $g_i$. The plot of sample SD values versus the glucose points $g_i$,
                  $i = 1, \ldots, n_g$, makes it possible to visualize how the error SD (absolute or relative)
                  varies across the glucose range and to identify zones of the glucose range in which
                  either the absolute or the relative error presents an approximately constant-SD
                  distribution.
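
The windowed SD computation described above could be sketched as follows. The function name `sample_sd_profile`, the closed interval boundaries, and the choice to return `None` for windows with fewer than two samples are assumptions, not details given in the text; the same function would be applied once to absolute errors and once to relative errors.

```python
import statistics


def sample_sd_profile(reference, errors, g_min, g_max, step=5.0, half_width=15.0):
    """Sample SD of the errors inside [g - L, g + L] for each grid point g.

    reference: reference glucose values (mg/dL) of the training set.
    errors: matching absolute or relative errors.
    step, half_width: the S and L of the text (defaults follow its examples).
    """
    profile = []
    n_g = int((g_max - g_min) // step) + 1  # number of grid points
    for i in range(n_g):
        g = g_min + i * step
        # Errors whose reference glucose falls within the window around g.
        window = [e for r, e in zip(reference, errors) if abs(r - g) <= half_width]
        # The sample SD needs at least two points; otherwise mark as undefined.
        sd = statistics.stdev(window) if len(window) >= 2 else None
        profile.append((g, sd))
    return profile


# Synthetic training data, for illustration only.
reference = [100.0, 100.0, 100.0, 200.0]
errors = [1.0, 2.0, 3.0, 5.0]
profile = sample_sd_profile(reference, errors, g_min=100.0, g_max=110.0)
sparse = sample_sd_profile(reference, errors, g_min=185.0, g_max=185.0)
```

Since consecutive windows overlap whenever the half-width exceeds the step, the resulting SD curve is smoothed, which should make approximately flat regions easier to spot by eye.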


                  Maximum-likelihood fitting
                  In each constant-SD zone, the distribution of the error (absolute or relative) in the
                  training set, here represented by the continuous random variable Y, is fitted by
                  ML using a chosen PDF model. In particular, we recommend first testing the error
                  distribution for normality in each zone (e.g., using the Lilliefors test). Then, if the
                  test cannot reject the hypothesis of normality, the Gaussian PDF model can be
                  adopted, as there is insufficient evidence to justify the use of a more complex
                  model. Conversely, if the test rejects the normality hypothesis, a different
                  PDF model should be used. A convenient choice is the skew-normal PDF model,