Page 14 - Building Big Data Applications
Traditional application processing occurs when an application issues either a read or a write operation against the backend database. The request transits the network, often passing through an edge server to the application server and then to the database, and returns once the operation is complete. Several factors must be considered to improve and sustain performance, including:

- Robust networks that perform without any inhibition on throughput
- Fast edge servers that can manage thousands of users and queries
- Application servers with minimal interface delays and APIs for performing the queries and operations
- Databases tuned for heavy transactional workloads with high throughput
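The request path described above can be sketched as a chain of tiers, each forwarding the operation to the next. This is an illustrative sketch only, not from the book; all class names and the stored data are hypothetical:

```python
# Hypothetical sketch of a read request transiting the tiers described
# in the text: edge server -> application server -> database.

class Database:
    """Backend store; in practice this would be a tuned transactional database."""
    def __init__(self):
        self.store = {"user:42": "Alice"}

    def read(self, key):
        return self.store.get(key)


class ApplicationServer:
    """Translates API calls into database operations."""
    def __init__(self, db):
        self.db = db

    def handle(self, request):
        return self.db.read(request)


class EdgeServer:
    """Entry point; routes thousands of user requests toward the app tier."""
    def __init__(self, app):
        self.app = app

    def route(self, request):
        return self.app.handle(request)


# The request transits edge -> application server -> database and returns.
edge = EdgeServer(ApplicationServer(Database()))
result = edge.route("user:42")
```

Each tier in this chain is a place where throughput can degrade, which is why the factors listed above must all hold at once.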
All of these are well-known issues in application performance and its sustained maintenance. The problem grows more complex when a data warehouse, a large database, or an analytical model must serve these types of operations. The common issues to solve include:

- Reducing data dimensionality to accommodate the most needed and used attributes, which often results in multiple business intelligence projects with a never-ending status
- Managing data relationships, which often becomes a burden or overload on the system
- Delivering key analytics, which takes cycles to execute whether in the database or the analytic model
- Data lineage cannot be traced automatically
- Data auditability is limited
- Data aggregates cannot be drilled down or drilled across for all queries
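The first issue in the list, reducing a wide dimension to its most-used attributes, can be illustrated with a minimal sketch. This is an assumption-laden example, not from the book; the table, column names, and the "needed" set are all hypothetical:

```python
# Hypothetical sketch: a wide customer dimension reduced to only the
# attributes gathered as business requirements. Columns left out of the
# "needed" set are dropped from the schema, and adding one back later
# typically means another BI project cycle.

customer_dim = [
    {"id": 1, "name": "Ann", "segment": "retail", "fax": "n/a", "middle_name": ""},
    {"id": 2, "name": "Bob", "segment": "corp",   "fax": "n/a", "middle_name": ""},
]

# Attributes one business team defined as requirements (hypothetical).
needed = {"id", "name", "segment"}

# Keep only the needed attributes in each dimension row.
reduced = [{col: row[col] for col in needed} for row in customer_dim]
```

The difficulty the text describes follows directly: once the schema is loaded with this fixed attribute set, any attribute outside `needed` is unavailable to queries until the schema and its loads are reworked.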
The issue is not with data alone; the core issue lies beneath the data layer, in the infrastructure. The database is a phenomenal analytic resource, and the schemas defined within it are needed for all the queries and the associated multidimensional analytics. However, to load the schemas we need to define a fixed set of attributes from the dimensions as they exist in the source systems. These attributes are often gathered as business requirements, and this is where we have a major gap: the attributes are defined by one business team, adding more attributes creates issues, and we deliver so many database solutions that it becomes a nightmare. This is where we have created a major change with the