Page 139 - Data Architecture

Chapter 4.3: Parallel Processing
               Fig. 4.3.1 A lot of data.


           Big data involves so much data that loading, accessing, and manipulating it is a real
           challenge. It is safe to say that no single computer is capable of handling all the data
           that can be accumulated in the big data environment.


           The only possible strategy is to use multiple processors to handle the volume of data
           found in big data. To understand why multiple processors are mandatory,
           consider the (old) story about the farmer who drives his crop to the marketplace in a
           wagon. When the farmer is first starting out, he doesn’t have much of a crop. He uses a
           donkey to pull the wagon. But as the years pass, the farmer raises bigger crops. Soon,
           he needs a bigger wagon, and he needs a horse to pull it. Then, one day, the
           crop that is put in the wagon becomes immense, and the farmer doesn’t just need a horse.
           The farmer needs a large Clydesdale horse.


           Time passes, the farmer prospers even more, and the crop continues to grow. One
           day, even a Clydesdale horse is not large enough to pull the wagon. The day comes when
           multiple horses are required to pull the wagon. Now, the farmer has a whole new set of
           problems. A new rigging is required. A trained driver is required to coordinate the team of
           horses that pull the wagon.


           The same phenomenon occurs when there is a lot of data. Multiple processors are
           required to load and manipulate the volumes of data found in big data.


           A previous chapter discussed the “Roman census” method. The Roman census method is
           one of the ways in which processing can be parallelized to manage large amounts of
           data.
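
           The idea can be illustrated with a minimal sketch. It assumes the Roman census
           method means sending the processing to where the data resides and returning only
           small summaries to a central point. The partitions, the names local_census and
           roman_census, and the use of threads in place of separate machines are all
           assumptions made for illustration, not a description of any particular product.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor

# Hypothetical partitions: each list stands in for data held on a separate node.
PARTITIONS = [
    ["rome", "ostia", "rome"],
    ["capua", "rome"],
    ["ostia", "capua", "capua"],
]


def local_census(partition):
    """Count records where the data lives; only the small tally is returned."""
    return Counter(partition)


def roman_census(partitions, max_workers=3):
    """Run one local count per partition in parallel, then merge the tallies."""
    # Threads stand in for independent processors in this sketch.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        tallies = list(pool.map(local_census, partitions))

    # Central step: combine the per-partition summaries, not the raw records.
    total = Counter()
    for tally in tallies:
        total.update(tally)
    return total


if __name__ == "__main__":
    print(dict(roman_census(PARTITIONS)))
```

           In this sketch, the raw records never leave their partition. Only the per-partition
           counts travel to the central step, which plays the part of the driver coordinating
           the team of horses.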


           Fig. 4.3.2 depicts the parallelization that occurs in the Roman census approach.





















