
          42    Chapter 2 Getting to Know Your Data          2011/6/1  3:15 Page 42  #4



                         the patient undergoes a medical test that has two possible outcomes. The attribute
                         medical test is binary, where a value of 1 means the result of the test for the patient
                         is positive, while 0 means the result is negative.

                           A binary attribute is symmetric if both of its states are equally valuable and carry
                         the same weight; that is, there is no preference on which outcome should be coded
                         as 0 or 1. One such example could be the attribute gender having the states male and
                         female.
                           A binary attribute is asymmetric if the outcomes of the states are not equally
                         important, such as the positive and negative outcomes of a medical test for HIV. By
                         convention, we code the more important outcome, which is usually the rarer one, by 1
                         (e.g., HIV positive) and the other by 0 (e.g., HIV negative).
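
The coding convention above can be sketched in Python. This is a minimal illustration, not a prescribed method: the function name `encode_asymmetric` and the sample results are hypothetical, and choosing the rarer state by frequency is only a stand-in for deciding which outcome is more important.

```python
from collections import Counter

def encode_asymmetric(values):
    """Map the rarer of two states to 1 and the more common one to 0."""
    counts = Counter(values)
    if len(counts) != 2:
        raise ValueError("expected exactly two distinct states")
    # most_common() sorts by descending count, so the last entry is the rarer state.
    rare_state = counts.most_common()[-1][0]
    return [1 if v == rare_state else 0 for v in values]

# Made-up test results, for illustration only.
results = ["negative", "negative", "positive", "negative", "negative"]
codes = encode_asymmetric(results)
print(codes)  # [0, 0, 1, 0, 0]
```

If both states occur equally often, frequency alone cannot decide the coding, and the important outcome would have to be named explicitly.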


                   2.1.4 Ordinal Attributes
                         An ordinal attribute is an attribute with possible values that have a meaningful order or
                         ranking among them, but the magnitude of the difference between successive values is
                         not known.

            Example 2.3 Ordinal attributes. Suppose that drink size corresponds to the size of drinks available at
                         a fast-food restaurant. This ordinal attribute has three possible values: small, medium,
                         and large. The values have a meaningful sequence (which corresponds to increasing
                         drink size); however, we cannot tell from the values how much bigger, say, a large
                         is than a medium. Other examples of ordinal attributes include grade (e.g., A+, A, A−, B+,
                         and so on) and professional rank. Professional ranks can be enumerated in a sequential
                         order: for example, assistant, associate, and full for professors, and private, private first
                         class, specialist, corporal, and sergeant for army ranks.
                           Ordinal attributes are useful for registering subjective assessments of qualities that
                         cannot be measured objectively; thus ordinal attributes are often used in surveys for
                         ratings. In one survey, participants were asked to rate how satisfied they were as
                         customers. Customer satisfaction had the following ordinal categories: 0: very dissatisfied,
                         1: somewhat dissatisfied, 2: neutral, 3: satisfied, and 4: very satisfied.

                           Ordinal attributes may also be obtained from the discretization of numeric quantities
                         by splitting the value range into a finite number of ordered categories as described in
                         Chapter 3 on data reduction.
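
As a rough sketch of how discretization yields an ordinal attribute, the Python snippet below maps a numeric drink volume to one of three ordered categories. The cut points (12 and 20 fluid ounces) and the names `CUT_POINTS`, `LABELS`, and `discretize` are assumptions made for illustration; they do not come from the text.

```python
from bisect import bisect_right

# Hypothetical cut points (fluid ounces) and labels, chosen only for illustration.
CUT_POINTS = [12, 20]
LABELS = ["small", "medium", "large"]

def discretize(volume_oz):
    """Return the ordered category whose range contains volume_oz.

    Ranges: below 12 is small, 12 up to (not including) 20 is medium,
    20 and above is large.
    """
    return LABELS[bisect_right(CUT_POINTS, volume_oz)]

volumes = [8, 12, 16, 24]
sizes = [discretize(v) for v in volumes]
print(sizes)  # ['small', 'medium', 'medium', 'large']
```

The resulting labels keep their order, but the exact volumes, and therefore the distances between categories, are no longer available.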
                           The central tendency of an ordinal attribute can be represented by its mode and its
                         median (the middle value in an ordered sequence), but the mean cannot be defined.
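
To make this concrete, the following Python sketch computes the mode and median of an ordinal attribute using the satisfaction categories from the survey above. The responses are made up, and the helper names `ordinal_mode` and `ordinal_median` are hypothetical. For an even number of values the sketch takes the lower of the two middle values, one possible convention, so that categories never need to be averaged.

```python
from collections import Counter

# Category order from the satisfaction survey; position in the list is the rank.
LEVELS = [
    "very dissatisfied",
    "somewhat dissatisfied",
    "neutral",
    "satisfied",
    "very satisfied",
]
RANK = {label: i for i, label in enumerate(LEVELS)}

def ordinal_mode(values):
    """Return the most frequent category."""
    return Counter(values).most_common(1)[0][0]

def ordinal_median(values):
    """Return the middle category after sorting by rank (lower middle if even)."""
    ordered = sorted(values, key=RANK.__getitem__)
    return ordered[(len(ordered) - 1) // 2]

# Made-up survey responses, for illustration only.
responses = ["satisfied", "neutral", "very satisfied",
             "satisfied", "somewhat dissatisfied"]
print(ordinal_mode(responses))    # satisfied
print(ordinal_median(responses))  # satisfied
```

Both functions rely only on the ordering of the categories, never on arithmetic with the codes, which is why they remain meaningful where a mean is not.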
                           Note that nominal, binary, and ordinal attributes are qualitative. That is, they describe
                         a feature of an object without giving an actual size or quantity. The values of such
                         qualitative attributes are typically words representing categories. If integers are used,
                         they represent computer codes for the categories, as opposed to measurable quantities
                         (e.g., 0 for small drink size, 1 for medium, and 2 for large). In the following
                         subsection we look at numeric attributes, which provide quantitative measurements of
                         an object.