A drawback of this approach is that one must keep a list of previous values at least as long as the time span one wishes to average.
A simpler approach is the first-order digital filter, in which a portion of the current sample is combined with a portion of the previous filtered value. Because this composite value contains more than one measured value, it tends to discard transitory information and retain information that has persisted over more than one scan. The formula for a first-order digital filter is described in (c) above, where α is a factor selected by the user. The more one wants the data filtered, the smaller the choice of α; the less one wants filtered, the larger the choice of α. Alpha must be between 0 and 1 inclusive.
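To make the recurrence concrete, here is a minimal sketch in Python; the function and variable names are illustrative, and the recurrence y[n] = α·x[n] + (1 − α)·y[n − 1] is a common form of the filter described in (c) above, with the filter seeded by the first sample:

    def first_order_filter(samples, alpha):
        # y[n] = alpha * x[n] + (1 - alpha) * y[n - 1]
        # A small alpha filters heavily (the output changes slowly);
        # an alpha near 1 passes the samples through almost unchanged.
        if not 0.0 <= alpha <= 1.0:
            raise ValueError("alpha must be between 0 and 1 inclusive")
        filtered = []
        y = None
        for x in samples:
            y = x if y is None else alpha * x + (1.0 - alpha) * y
            filtered.append(y)
        return filtered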
The moving average or digital filter can make important events appear later than they actually occurred in the real world. For moving averages this lag can be mitigated somewhat by including data centered on the point in time of interest in the calculation (see the sketch below). These filters can also be cascaded; that is, the output of one filter can be used as the input to another.
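A sketch of the centering idea, assuming a simple window that shrinks at the ends of the data (one of several reasonable boundary policies):

    def centered_moving_average(samples, half_width):
        # Average each point with half_width neighbors on each side,
        # so the smoothed value is centered on the point of interest
        # rather than lagging behind it.
        out = []
        for i in range(len(samples)):
            lo = max(0, i - half_width)
            hi = min(len(samples), i + half_width + 1)
            window = samples[lo:hi]
            out.append(sum(window) / len(window))
        return out

Note that a centered average can only be computed after half_width later samples have arrived, so it trades latency during collection for reduced lag in the recorded result.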
A danger with any filter is that valuable information might be lost. This is related to the concept of compression, which is covered in the next section. When data are not continuous, and peaks or exceptions are important elements to record, simple filters such as the moving average or digital filter are not adequate.

Some laboratory instruments, such as a gas chromatograph, produce profiles that correspond to certain types of data (a peak may correspond to the presence of an element). The data acquisition system can be trained to look for these profiles through a preexisting set of instructions, or the human operator can indicate which profiles correspond to an element and the data acquisition system will build a set of rules. An example is to record the average of a sample of a set of data but also to record the minimum and maximum (see the sketch below). In situations where moisture or other physical attributes are measured, this is a common practice.
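A minimal sketch of that kind of reduction; the record layout and names are assumptions for illustration, not taken from the handbook:

    def summarize_block(samples):
        # Reduce a block of raw samples to a compact record that keeps
        # the average but also preserves the extremes, so peaks and
        # exceptions survive the reduction.
        return {
            "avg": sum(samples) / len(samples),
            "min": min(samples),
            "max": max(samples),
        }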
Voice recognition systems often operate on a similar set of procedures: the operator speaks some words on request into the computer, and the computer builds an internal profile of how the operator speaks, which is then used on new words.

One area where pattern matching is used is in error-correcting serial data transmission.
When serial data are being transmitted, a common practice to reduce errors is to insert known patterns into the data stream before and after the data. What if there is noise on the line? One can then look for a start-of-message pattern and an end-of-message pattern; any data coming over the line that are not bracketed by these patterns would be ignored or flagged as an extraneous transmission.
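A sketch of the receiving side in Python; the byte values follow the common ASCII control-code convention for start and end of text, but the marker choice and function name are illustrative assumptions:

    STX = b"\x02"  # assumed start-of-message marker (ASCII "start of text")
    ETX = b"\x03"  # assumed end-of-message marker (ASCII "end of text")

    def extract_messages(stream):
        # Keep only the byte runs bracketed by the start and end
        # markers; bytes outside a bracketed run are treated as
        # extraneous transmission (e.g., line noise) and dropped.
        messages = []
        while True:
            start = stream.find(STX)
            end = stream.find(ETX, start + 1)
            if start == -1 or end == -1:
                break
            messages.append(stream[start + 1:end])
            stream = stream[end + 1:]
        return messages

Called on b"noise\x02hello\x03junk\x02world\x03", this returns [b"hello", b"world"]; the unbracketed bytes are discarded.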
4.4 Compression Techniques
For high-speed or long-duration data collection sessions there may be massive amounts of data collected. Deciding how much detail to retain is difficult: the trade-offs are not just in storage space but also in the time required to store and later retrieve the data. Sampling techniques, also covered in other chapters, provide a way of retaining many of the important features of the data while eliminating the less important noise or redundant data.
As an example, 1000 points of data collected each second for 1 year in a database could
easily approach 0.5 terabyte when index files and other overhead are taken into account.
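The arithmetic behind that estimate, as a back-of-the-envelope sketch; the 16 bytes per stored point (value plus timestamp and index overhead) is an assumed figure for illustration:

    SAMPLES_PER_SECOND = 1000
    SECONDS_PER_YEAR = 60 * 60 * 24 * 365   # 31,536,000 s
    BYTES_PER_POINT = 16                    # assumed: value + timestamp + overhead

    total_points = SAMPLES_PER_SECOND * SECONDS_PER_YEAR  # about 3.15e10 points
    total_bytes = total_points * BYTES_PER_POINT          # about 5.0e11 bytes
    print(total_bytes / 1e12, "TB")                       # roughly 0.5 TB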
Even if one has a large disk farm, the time required to get to a specific data element can be
prohibitive. Often, systems are indexed by the time at which each data point was collected. This speeds up retrieval when a specific time frame is desired but is notoriously slow when relationships between data are explored or when events are searched for by value rather than by time.
Approaches to reducing data volume include the following:
• Reduce the volume of data stored by various compression tools, such as discovering
repeating data, storing one copy, and then indicating how many times the data are