Page 86 - Glucose Monitoring Devices
The state-of-the-art modeling method by Vettoretti et al.
Definition of training and test sets
Let us suppose we have a dataset containing $n_{tot}$ SMBG measurements, $x_i$,
$i = 1, \ldots, n_{tot}$, and $n_{tot}$ reference samples, $r_i$, $i = 1, \ldots, n_{tot}$,
collected by high-precision and accuracy laboratory instruments. For each pair
$(x_i, r_i)$, the SMBG absolute error, $e_i^{abs}$, and the SMBG relative error,
$e_i^{rel}$, are calculated as follows:

$$e_i^{abs} = x_i - r_i, \qquad e_i^{rel} = \frac{x_i - r_i}{r_i} \cdot 100 \tag{5.2}$$
Note that here we refer to absolute error and relative error as the signed differ-
ence and the signed percent difference between the SMBG measurement and its
reference sample, respectively.
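
As an illustration of Eq. (5.2), the signed errors can be computed pairwise from the two series. The following is a minimal sketch; the function name `smbg_errors` and the numeric values are hypothetical and are not taken from any dataset.

```python
def smbg_errors(x, r):
    """Signed absolute and relative SMBG errors, following Eq. (5.2).

    x: SMBG measurements (mg/dL); r: matching reference samples (mg/dL).
    """
    if len(x) != len(r):
        raise ValueError("x and r must contain the same number of samples")
    # Signed difference: positive when the meter reads above the reference.
    abs_err = [xi - ri for xi, ri in zip(x, r)]
    # Signed percent difference relative to the reference value.
    rel_err = [100.0 * (xi - ri) / ri for xi, ri in zip(x, r)]
    return abs_err, rel_err


# Hypothetical illustrative values, not real measurements.
x = [110.0, 90.0, 200.0]
r = [100.0, 100.0, 200.0]
abs_err, rel_err = smbg_errors(x, r)
print(abs_err)  # [10.0, -10.0, 0.0]
print(rel_err)  # [10.0, -10.0, 0.0]
```

Because both errors keep their sign, an overestimate and an underestimate of equal size do not look the same, which is what later allows a skewed error distribution to be detected.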
Error data are then divided into two parts. The first part, with cardinality
$n_{training}$, is used as the training set to derive the model of the SMBG error PDF in
the following steps B and C. The second part, with cardinality $n_{test}$, is used as the
test set to validate the model in step D. Absolute and relative errors of the training
set can be displayed in a scatter plot versus reference glucose to assess visually
whether the characteristics of the error distribution (e.g., mean and dispersion)
vary significantly across the reference glucose range.
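
The text does not prescribe how the error data are partitioned. The sketch below assumes a random split with a fixed seed and a 70% training fraction; the helper name `split_errors` and both parameter values are hypothetical choices made for illustration.

```python
import random


def split_errors(pairs, train_fraction=0.7, seed=0):
    """Randomly partition (reference, error) pairs into training and test sets.

    The random split and the 0.7 fraction are assumptions of this sketch.
    """
    rng = random.Random(seed)          # local generator, reproducible split
    indices = list(range(len(pairs)))
    rng.shuffle(indices)
    n_training = round(train_fraction * len(pairs))
    training = [pairs[i] for i in indices[:n_training]]
    test = [pairs[i] for i in indices[n_training:]]
    return training, test


# Hypothetical (reference glucose, absolute error) pairs.
pairs = [(60.0 + 10.0 * k, (-1.0) ** k) for k in range(10)]
training, test = split_errors(pairs)
print(len(training), len(test))  # 7 3
```

Each pair ends up in exactly one of the two sets, so the test set stays untouched while the model is derived.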
Constant-SD zones identification
Changes in the dispersion of absolute and relative errors with reference glucose are
quantified in the training set by analyzing the sample SD. In particular, a uniform
grid $g_i$, $i = 1, \ldots, n_g$, where $n_g$ is the number of points in the grid, is first
defined in the glucose range with step $S$ (e.g., $S = 5$ mg/dL). Then, intervals
centered at points $g_i$, $i = 1, \ldots, n_g$, with half-width $L$ (e.g., $L = 15$ mg/dL)
are defined. Finally, the sample SD of absolute and relative errors in each interval
$g_i \pm L$ is calculated, which approximates the error SD (absolute or relative) at the
glucose point $g_i$. Plotting the sample SD values against the glucose points $g_i$,
$i = 1, \ldots, n_g$, makes it possible to visualize how the error SD (absolute or
relative) varies across the glucose range and to identify zones of the glucose range
in which either the absolute or the relative error presents an approximately
constant-SD distribution.
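
The windowed SD computation described above can be sketched as follows. The function name `windowed_sd`, the closed interval bounds, and the rule of returning `None` for windows with fewer than two samples are assumptions of this sketch; the default step and half-width are the example values given in the text.

```python
import statistics


def windowed_sd(reference, errors, g_min, g_max, step=5.0, half_width=15.0):
    """Sample SD of the errors in each interval g_i +/- half_width.

    reference: reference glucose values (mg/dL) of the training set.
    errors: matching absolute or relative errors.
    Returns the grid points and the SD at each one (None if undefined).
    """
    n_g = int((g_max - g_min) // step) + 1   # number of grid points
    grid, sds = [], []
    for k in range(n_g):
        g = g_min + k * step
        # Errors whose reference value falls inside [g - L, g + L].
        window = [e for r, e in zip(reference, errors)
                  if g - half_width <= r <= g + half_width]
        # The sample SD needs at least two points.
        sd = statistics.stdev(window) if len(window) >= 2 else None
        grid.append(g)
        sds.append(sd)
    return grid, sds


# Hypothetical training data, for illustration only.
reference = [40.0, 50.0, 60.0, 70.0, 80.0]
errors = [1.0, -1.0, 1.0, -1.0, 1.0]
grid, sds = windowed_sd(reference, errors, g_min=50.0, g_max=60.0)
print(grid)  # [50.0, 55.0, 60.0]
```

Since the half-width exceeds the step, consecutive windows overlap, which gives a smoothed SD profile. Zones where that profile is roughly flat are the candidate constant-SD zones.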
Maximum-likelihood fitting
In each constant-SD zone, the distribution of the error (absolute or relative) in the
training set, here represented by the continuous random variable $Y$, is fitted by
ML using a suitable PDF model. In particular, we recommend first testing the error
distribution for normality in each zone (e.g., using the Lilliefors test). Then, if the
test cannot reject the hypothesis of normality, the Gaussian PDF model can be
adopted, as there is insufficient evidence to justify the use of a more complex
model. Conversely, if the normality test rejects the normality hypothesis, a different
PDF model should be used. A convenient choice is the skew-normal PDF model,