Page 127 - Building Big Data Applications
data by using the product and geography information. This dataset can be queried
interactively and can be used for what-if type of causal analysis by different users across
the organization.
Another form of big data visualization is delivered through statistical software
such as R, SAS, and KXEN, where predefined models for different statistical
functions use the data extracted from the discovery environment and integrate it
with corporate and other datasets to drive the statistical visualizations. RStudio
is a very popular tool that uses R to accomplish this type of functionality.
Everything that we are discussing in visualization can be accomplished in the
enterprise today through the effective implementation of several algorithms. These
algorithms are implemented as part of the formulation and transformation of data
across artificial intelligence, machine learning, and neural networks. The
implementations are deployed for both unsupervised and supervised learning, and
visualization benefits from both techniques. The algorithms include the following,
along with several proprietary implementations of similar algorithms within the
enterprise:
Recommender
Collocations
Dimensionality reduction
Expectation maximization
Bayesian
Locally weighted linear regression
Logistic regression
K-means clustering
Fuzzy K-means
Canopy clustering
Mean shift clustering
Hierarchical clustering
Dirichlet process clustering
Random forests
Support vector machines
Pattern mining
Collaborative filtering
Spectral clustering
Stochastic singular value decomposition
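To make one entry in the list above concrete, the following is a minimal sketch of K-means clustering, an unsupervised technique. It is an illustration only and not the implementation used by any particular product. The function name k_means, the sample points, and the choice to seed the centroids with the first k points are all assumptions made for this example; production libraries typically use randomized or smarter initialization.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, ...]


def squared_distance(a: Point, b: Point) -> float:
    """Squared Euclidean distance between two points of equal dimension."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest(point: Point, centroids: Sequence[Point]) -> int:
    """Index of the centroid closest to the given point."""
    return min(range(len(centroids)),
               key=lambda c: squared_distance(point, centroids[c]))


def k_means(points: Sequence[Point], k: int,
            iterations: int = 20) -> Tuple[List[Point], List[int]]:
    """Cluster points into k groups.

    Assumption for this sketch: the first k points seed the centroids,
    which keeps the example deterministic.
    """
    centroids: List[Point] = list(points[:k])
    for _ in range(iterations):
        # Assignment step: attach every point to its nearest centroid.
        labels = [nearest(p, centroids) for p in points]
        # Update step: move each centroid to the mean of its members.
        updated: List[Point] = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                updated.append(tuple(sum(dim) / len(members)
                                     for dim in zip(*members)))
            else:
                # Keep the old centroid if no point was assigned to it.
                updated.append(centroids[c])
        if updated == centroids:
            break  # Centroids stopped moving, so the clustering is stable.
        centroids = updated
    # Final labels computed against the final centroids.
    labels = [nearest(p, centroids) for p in points]
    return centroids, labels


# Hypothetical two-dimensional sample data with two visible groups.
sample_points: List[Point] = [
    (1.0, 1.0), (8.0, 8.0), (1.2, 0.8),
    (8.2, 7.9), (0.9, 1.1), (7.9, 8.1),
]
centroids, labels = k_means(sample_points, k=2)
print(labels)
print(centroids)
```

The cluster labels produced this way are what a visualization layer would typically use to color or group the plotted points. A supervised technique from the list, such as logistic regression, would instead require labeled training data.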
The teams in the enterprise responsible for this visualization and the associated
algorithms are the data science teams.