Chapter 5 Database Processing
Q5-8 2026?
With ever-cheaper data storage and data communications, we can be sure that the volume of
database data will continue to grow, probably exponentially, through 2026. All that data contains
patterns that can be used to produce information to help businesses and organizations achieve
their strategies. That will make business intelligence, discussed in Chapter 9, even more important.
Furthermore, as databases become bigger and bigger, they’re more attractive as targets for theft or
mischief, as we recently saw at Sony Entertainment. Those risks will make database security even
more important, as we discuss in Chapter 10.
Additionally, the DBMS landscape is changing. While for years relational DBMS products
were the only game in town, the Internet changed that by posing new processing requirements.
As compared to traditional database applications, some Internet applications process many,
many more transactions against much simpler data. A tweet has a much simpler data structure
than the configuration of a Kenworth truck, but there are so many more tweets than truck
configurations!
Also, traditional relational DBMS products devote considerable code and processing power to
support what are termed ACID (atomic, consistent, isolated, durable) transactions. In essence, this
acronym means that either all of a transaction is processed or none of it is (atomic), that every
transaction leaves the database in a valid state (consistent), that a transaction produces the same
result whether processed alone or in the presence of millions of other transactions (isolated), and
that once a transaction is stored it never goes away, even in the presence of failure (durable).
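To make the atomic property concrete, here is a minimal sketch using Python's built-in sqlite3 module. The account table, the transfer function, and the sample balances are all hypothetical, chosen only for illustration; they are not taken from any product discussed in this chapter. The sketch shows a two-step update that is either committed in full or rolled back in full.

```python
import sqlite3


def make_demo_db():
    """Create a small, hypothetical account table for the demonstration."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE account (id TEXT PRIMARY KEY, balance INTEGER NOT NULL)"
    )
    conn.executemany(
        "INSERT INTO account VALUES (?, ?)", [("A", 100), ("B", 50)]
    )
    conn.commit()
    return conn


def balances(conn):
    """Return the current balances as a dict keyed by account id."""
    return dict(conn.execute("SELECT id, balance FROM account"))


def transfer(conn, from_acct, to_acct, amount):
    """Debit one account and credit another as a single atomic unit."""
    try:
        # Used as a context manager, the connection commits if the block
        # finishes normally and rolls back if an exception escapes it.
        with conn:
            conn.execute(
                "UPDATE account SET balance = balance - ? WHERE id = ?",
                (amount, from_acct),
            )
            row = conn.execute(
                "SELECT balance FROM account WHERE id = ?", (from_acct,)
            ).fetchone()
            if row is None or row[0] < 0:
                # The debit has already run; raising here undoes it.
                raise ValueError("unknown account or insufficient funds")
            cur = conn.execute(
                "UPDATE account SET balance = balance + ? WHERE id = ?",
                (amount, to_acct),
            )
            if cur.rowcount != 1:
                raise ValueError("unknown destination account")
        return True
    except ValueError:
        return False
```

Calling transfer(conn, "A", "B", 500) on the demo data fails partway through, after the debit has been applied, yet both balances are left exactly as they were. That is the all-or-nothing behavior the atomic property describes.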
ACID transactions are critical to traditional commercial applications. Even in the presence of
machine failure, Vanguard must process both the sell and the buy sides of a transaction; it cannot
process part of a transaction. Also, what it stores today must be stored tomorrow. But many new
Internet applications don’t need ACID. Who cares if, one time out of 1 million, only half of your
tweet is stored? Or if it’s stored today and disappears tomorrow?
These new requirements have led to three new categories of DBMS:
1. NoSQL DBMS. This acronym is misleading. It really should be NotRelational DBMS. It
refers to new DBMS products that support very high transaction rates processing relatively
simple data structures, replicated on many servers in the cloud, without ACID transaction
support. MongoDB, Cassandra, Bigtable, and Dynamo are NoSQL products.
2. NewSQL DBMS. These DBMS products process very high levels of transactions, like the
NoSQL DBMS, but provide ACID support. They may or may not support the relational
model. Such products are a hotbed of development with new vendors popping up nearly
every day. Leading products are not yet known.
3. In-memory DBMS. This category consists of DBMS products that process databases
in main memory. This technique has become possible because today’s computer memories
can be enormous and can hold an entire database at one time, or at least very large
chunks of it. Usually these products support or extend the relational model. SAP HANA
is a computer with an in-memory DBMS that provides high-volume ACID transaction
support simultaneously with complex relational query processing. Tableau Software’s
reporting products are supported by a proprietary in-memory DBMS using an extension
to SQL.
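As a small-scale illustration of the in-memory idea, the sketch below uses Python's built-in sqlite3 module with the special ":memory:" database name, which keeps the whole database in main memory. SQLite is not one of the products named above, and the sale table and its rows are hypothetical; the point is only that ordinary relational queries can run against data that never touches disk.

```python
import sqlite3


def regional_totals():
    """Build a tiny in-memory database and total the sales by region."""
    # ":memory:" tells SQLite to hold the entire database in RAM;
    # the data disappears when the connection is closed.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE sale (region TEXT, amount INTEGER)")
    conn.executemany(
        "INSERT INTO sale VALUES (?, ?)",
        [("East", 120), ("West", 80), ("East", 30)],
    )
    conn.commit()
    rows = conn.execute(
        "SELECT region, SUM(amount) FROM sale "
        "GROUP BY region ORDER BY region"
    ).fetchall()
    conn.close()
    return rows
```

Because nothing is written to disk, this toy database offers no durability at all. Commercial in-memory products add their own mechanisms to provide it.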
Does the emergence of these new products sound the death knell for relational databases? It
seems unlikely because organizations have created thousands of traditional relational databases
with millions of lines of application code that process SQL statements against relational data
structures. No organization wants to endure the expense and effort of converting those databases
and code to something else. There is also a strong social trend among older technologists to
hang on to the relational model. However, these new products are loosening the stronghold that