Page 127 - Building Big Data Applications



             data by using the product and geography information. This dataset can be queried
             interactively and can be used for what-if type of causal analysis by different users across
             the organization.
                Another form of big data visualization is delivered through statistical
             software such as R, SAS, and KXEN, where predefined models for different statistical
             functions take the data extracted from the discovery environment and integrate it
             with corporate and other datasets to drive the statistical visualizations. A very
             popular tool that uses R for this type of functionality is RStudio.
                All the goals that we are discussing in visualization can be accomplished in the
             enterprise today through the effective implementation of several algorithms. These
             algorithms are implemented as part of the formulation and transformation of data
             across artificial intelligence, machine learning, and neural networks. They are
             deployed for both unsupervised learning and supervised learning, and visualization
             benefits from both techniques. The algorithms include the following, along with
             several proprietary implementations of similar algorithms within the enterprise.
               Recommender
               Collocations
               Dimensionality reduction
               Expectation maximization
               Bayesian
               Locally weighted linear regression
               Logistic regression
               K-means clustering
               Fuzzy K-means
               Canopy clustering
               Mean shift clustering
               Hierarchical clustering
               Dirichlet process clustering
               Random forests
               Support vector machines
               Pattern mining
               Collaborative filtering
               Spectral clustering
               Stochastic singular value decomposition
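
                To make one entry in the list concrete, the following is a minimal sketch of
             K-means clustering, an unsupervised technique, written in plain Python. It is an
             illustration under stated assumptions and not the implementation used by any
             particular product: the function name kmeans, the choice to seed centroids with the
             first k points, and the sample records are all hypothetical.

```python
# Minimal K-means sketch (Lloyd's algorithm) in plain Python.
# All names and the sample data are hypothetical, for illustration only.
from math import dist  # Euclidean distance, available in Python 3.8+


def kmeans(points, k, iterations=20):
    """Cluster `points` (tuples of floats) into `k` groups."""
    # Assumption: seed centroids with the first k points so runs are repeatable.
    centroids = [tuple(p) for p in points[:k]]
    assignments = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        assignments = [
            min(range(k), key=lambda c: dist(p, centroids[c])) for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        new_centroids = []
        for c in range(k):
            members = [p for p, a in zip(points, assignments) if a == c]
            if members:
                new_centroids.append(
                    tuple(sum(axis) / len(members) for axis in zip(*members))
                )
            else:
                # An empty cluster keeps its previous centroid.
                new_centroids.append(centroids[c])
        if new_centroids == centroids:
            break  # centroids stopped moving, so the clustering has converged
        centroids = new_centroids
    return centroids, assignments


# Hypothetical two-feature records, e.g. (units sold, average basket value).
sales = [(1.0, 1.0), (1.5, 2.0), (1.0, 1.5), (8.0, 8.0), (9.0, 9.5), (8.5, 9.0)]
centroids, labels = kmeans(sales, k=2)
print(labels)
print(centroids)
```

                The cluster label assigned to each record can then be used as a color or grouping
             dimension in a chart, which is one way an unsupervised result feeds a visualization.
             A production deployment would more likely rely on a tested library and a better
             seeding strategy than the fixed seeding assumed here.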

                The teams in the enterprise responsible for this visualization and the associated
             algorithms are the data science teams.