A drawback of this approach is that one must keep a list of previous values at least as long as the time span one wishes to average.
A simpler approach is the first-order digital filter, in which a portion of the current sample is combined with a portion of the previous filtered value. Because this composite value contains more than one measured value, it tends to discard transitory information and retain information that has persisted over more than one scan. The formula for a first-order digital filter is described in (c) above, where α is a factor selected by the user. The more one wants the data filtered, the smaller the choice of α; the less one wants filtered, the larger the choice of α. Alpha must be between 0 and 1 inclusive.
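To make the recurrence concrete, here is a minimal sketch in Python; the function and variable names are illustrative, and the recurrence y[n] = α·x[n] + (1 − α)·y[n − 1] is a common form of the filter described in (c) above, with the filter seeded by the first sample:

    def first_order_filter(samples, alpha):
        # y[n] = alpha * x[n] + (1 - alpha) * y[n - 1]
        # A small alpha filters heavily (the output changes slowly);
        # an alpha near 1 passes the samples through almost unchanged.
        if not 0.0 <= alpha <= 1.0:
            raise ValueError("alpha must be between 0 and 1 inclusive")
        filtered = []
        y = None
        for x in samples:
            y = x if y is None else alpha * x + (1.0 - alpha) * y
            filtered.append(y)
        return filtered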
The moving average or digital filter can make important events appear later than they actually occurred in the real world. For moving averages this lag can be mitigated somewhat by including data centered on the point in time of interest in the calculation (see the sketch below). These filters can also be cascaded; that is, the output of one filter can be used as the input to another.
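A sketch of the centering idea, assuming a simple window that shrinks at the ends of the data (one of several reasonable boundary policies):

    def centered_moving_average(samples, half_width):
        # Average each point with half_width neighbors on each side,
        # so the smoothed value is centered on the point of interest
        # rather than lagging behind it.
        out = []
        for i in range(len(samples)):
            lo = max(0, i - half_width)
            hi = min(len(samples), i + half_width + 1)
            window = samples[lo:hi]
            out.append(sum(window) / len(window))
        return out

Note that a centered average can only be computed after half_width later samples have arrived, so it trades latency during collection for reduced lag in the recorded result.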
A danger with any filter is that valuable information might be lost. This is related to the concept of compression, which is covered in the next section. When data are not continuous, and peaks or exceptions are important elements to record, simple filters such as the moving average or digital filter are not adequate.

Some laboratory instruments, such as a gas chromatograph, produce profiles that correspond to certain types of data (a peak may correspond to the presence of an element). The data acquisition system can be trained to look for these profiles through a preexisting set of instructions, or the human operator can indicate which profiles correspond to an element and the data acquisition system will build a set of rules. An example is to record the average of a sample of a set of data but also to record the minimum and maximum (see the sketch below). In situations where moisture or other physical attributes are measured, this is a common practice.
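A minimal sketch of that kind of reduction; the record layout and names are assumptions for illustration, not taken from the handbook:

    def summarize_block(samples):
        # Reduce a block of raw samples to a compact record that keeps
        # the average but also preserves the extremes, so peaks and
        # exceptions survive the reduction.
        return {
            "avg": sum(samples) / len(samples),
            "min": min(samples),
            "max": max(samples),
        }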
Voice recognition systems often operate on a similar set of procedures: the operator speaks some words on request into the computer, and the computer builds an internal profile of how the operator speaks, which is then used on new words.

One area where pattern matching is used is in error-correcting serial data transmission.
When serial data are being transmitted, a common practice to reduce errors is to insert known patterns into the data stream before and after the data. What if there is noise on the line? One can then look for a start-of-message pattern and an end-of-message pattern; any data coming over the line that are not bracketed by these patterns would be ignored or flagged as an extraneous transmission.
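A sketch of the receiving side in Python; the byte values follow the common ASCII control-code convention for start and end of text, but the marker choice and function name are illustrative assumptions:

    STX = b"\x02"  # assumed start-of-message marker (ASCII "start of text")
    ETX = b"\x03"  # assumed end-of-message marker (ASCII "end of text")

    def extract_messages(stream):
        # Keep only the byte runs bracketed by the start and end
        # markers; bytes outside a bracketed run are treated as
        # extraneous transmission (e.g., line noise) and dropped.
        messages = []
        while True:
            start = stream.find(STX)
            end = stream.find(ETX, start + 1)
            if start == -1 or end == -1:
                break
            messages.append(stream[start + 1:end])
            stream = stream[end + 1:]
        return messages

Called on b"noise\x02hello\x03junk\x02world\x03", this returns [b"hello", b"world"]; the unbracketed bytes are discarded.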
4.4 Compression Techniques
For high-speed or long-duration data collection sessions there may be massive amounts of data collected. Deciding how much detail to retain is difficult: the trade-offs are not just in storage space but also in the time required to store and later retrieve the data. Sampling techniques, also covered in other chapters, provide a way of retaining many of the important features of the data while eliminating the less important noise or redundant data.
As an example, 1000 points of data collected each second for 1 year in a database could
easily approach 0.5 terabyte when index files and other overhead are taken into account.
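The arithmetic behind that estimate, as a back-of-the-envelope sketch; the 16 bytes per stored point (value plus timestamp and index overhead) is an assumed figure for illustration:

    SAMPLES_PER_SECOND = 1000
    SECONDS_PER_YEAR = 60 * 60 * 24 * 365   # 31,536,000 s
    BYTES_PER_POINT = 16                    # assumed: value + timestamp + overhead

    total_points = SAMPLES_PER_SECOND * SECONDS_PER_YEAR  # about 3.15e10 points
    total_bytes = total_points * BYTES_PER_POINT          # about 5.0e11 bytes
    print(total_bytes / 1e12, "TB")                       # roughly 0.5 TB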
Even if one has a large disk farm, the time required to get to a specific data element can be
prohibitive. Often, systems are indexed by the time at which each data point was collected. This speeds up retrieval when a specific time frame is desired but is notoriously slow when relationships between data are explored or when events are searched for by value rather than by time.
Approaches to reducing data volume include the following:
• Reduce the volume of data stored by various compression tools, such as discovering
repeating data, storing one copy, and then indicating how many times the data are