Page 378 - Data Architecture

Chapter 9.2: Analyzing Repetitive Data

Fig. 9.2.8 Two approaches to accessing repetitive data.

With any index, there is a cost: the cost of initially building the index, the cost of
keeping the index current, and the cost of storing it. In the world of big data, indexes
are typically built by technology called “crawlers.” A crawler constantly searches the big
data, creating new index records. As long as the data remain stable and unchanged, they
need to be indexed only once. But if data are added or deleted, the index must be updated
continually in order to stay current. And whether or not the data change, the index itself
takes up storage.
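
The three costs can be made concrete with a small sketch. The following is only an illustration under stated assumptions, not how any particular crawler works: the class `TinyIndex`, its methods, and the sample records are all hypothetical. The index maps each word to the set of records containing it, counts every index write as a rough stand-in for build and maintenance cost, and reports the number of stored entries as a stand-in for storage cost.

```python
from collections import defaultdict


class TinyIndex:
    """Hypothetical inverted index: word -> set of record ids."""

    def __init__(self):
        self.postings = defaultdict(set)
        self.updates = 0  # index writes so far (proxy for build/maintenance cost)

    def add(self, record_id, text):
        # Indexing a record costs one write per distinct word.
        for word in set(text.lower().split()):
            self.postings[word].add(record_id)
            self.updates += 1

    def remove(self, record_id, text):
        # Deleting a record costs one write per distinct word as well.
        for word in set(text.lower().split()):
            self.postings[word].discard(record_id)
            self.updates += 1
            if not self.postings[word]:
                del self.postings[word]

    def lookup(self, word):
        return sorted(self.postings.get(word.lower(), ()))

    def entry_count(self):
        # Number of (word, record) pairs held: proxy for storage cost.
        return sum(len(ids) for ids in self.postings.values())


# Made-up sample records, for illustration only.
records = {
    1: "call from Dallas to Austin",
    2: "call from Austin to Houston",
}

# Cost 1: building the index the first time.
index = TinyIndex()
for rid, text in records.items():
    index.add(rid, text)
build_cost = index.updates

# Cost 2: keeping the index current as data are added and deleted.
index.add(3, "call from Dallas to Houston")
index.remove(1, records[1])
maintenance_cost = index.updates - build_cost

# Cost 3: storage held by the index itself.
storage_entries = index.entry_count()

print(build_cost, maintenance_cost, storage_entries)
print(index.lookup("dallas"))
```

If the records never changed after the initial build, `maintenance_cost` would stay at zero, which matches the point that stable data need to be indexed only once.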

Fig. 9.2.9 shows the costs of building and maintaining an index.
