Page 61 - Building Big Data Applications
Chapter 2 Infrastructure and technology 55
FIGURE 2.16 Sqoop1 architecture.
Use Hive and HDFS for data processing.
Use Oozie for scheduling and managing jobs.
Installing Sqoop: currently, you can download and install Sqoop from the Apache
Software Foundation homepage or from any Hadoop distribution. The installation is
manual, and every configuration step must be followed without omission (Fig. 2.16).
Sqoop is driven entirely by the client-side installation and depends heavily on
JDBC, since the first release of Sqoop was developed in Java. In the workflow
shown in Fig. 2.17, you can import and export data from any database that provides
a JDBC driver, using simple commands executed from a command line interface (CLI),
for example.
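As a rough illustration of such commands, the following Python sketch assembles
the argument list for a Sqoop 1 import or export. The helper name
build_sqoop_command, the JDBC URL, the table, the user, and the HDFS path are
hypothetical placeholders and do not come from the text. The flags used
(--connect, --username, -P, --table, --target-dir, --export-dir, --num-mappers)
are standard Sqoop 1 options. The sketch only builds and prints the command; it
does not run Sqoop.

```python
import shlex


def build_sqoop_command(tool, jdbc_url, table, hdfs_dir, username, num_mappers=1):
    """Build the argv list for a Sqoop 1 'import' or 'export' invocation.

    All argument values are supplied by the caller; nothing here is
    taken from a real cluster or database.
    """
    if tool not in ("import", "export"):
        raise ValueError("tool must be 'import' or 'export'")

    # Import writes into an HDFS target directory; export reads from one.
    dir_flag = "--target-dir" if tool == "import" else "--export-dir"

    return [
        "sqoop", tool,
        "--connect", jdbc_url,      # JDBC connection string for the database
        "--username", username,
        "-P",                       # prompt for the password on the console
        "--table", table,
        dir_flag, hdfs_dir,
        "--num-mappers", str(num_mappers),
    ]


# Hypothetical example values, for illustration only.
import_cmd = build_sqoop_command(
    "import", "jdbc:mysql://dbhost/sales", "orders", "/data/orders", "etl_user"
)
export_cmd = build_sqoop_command(
    "export", "jdbc:mysql://dbhost/sales", "orders_summary",
    "/data/orders_summary", "etl_user", num_mappers=4,
)

# Print the commands as they would be typed at the CLI.
print(shlex.join(import_cmd))
print(shlex.join(export_cmd))
```

On a machine where the Sqoop client and the matching JDBC driver are installed,
the printed lines could be pasted into a shell, or the list could be passed to
subprocess.run. Building the command as a list rather than a single string
avoids quoting problems when values contain spaces.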
FIGURE 2.17 Hive process flow. Image source: HUG discussions.