Page 62 - Building Big Data Applications



                Import syntax:
                  sqoop import --connect jdbc:mysql://localhost/testdb \
                    --table PERSON --username test --password ****
               This command will generate a series of tasks:
                  Generate SQL code
                  Execute SQL code
                  Generate MapReduce jobs
                  Execute MapReduce jobs
                  Transfer data to local files or HDFS
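
The import command can also be assembled programmatically, which keeps the arguments explicit and easy to review. The following is a minimal Python sketch, not a definitive implementation: the helper name `build_import_args` is hypothetical, the flags are only the ones shown in the command above, and the sketch builds the argument list without running Sqoop.

```python
import shlex


def build_import_args(jdbc_url, table, username, password):
    """Return the argument list for a Sqoop1 import (hypothetical helper)."""
    return [
        "sqoop", "import",
        "--connect", jdbc_url,
        "--table", table,
        "--username", username,
        "--password", password,
    ]


import_args = build_import_args(
    "jdbc:mysql://localhost/testdb", "PERSON", "test", "****"
)

# Print a shell-quoted form for inspection. To execute it, the list could be
# passed to subprocess.run on a machine where sqoop is installed (assumption).
print(shlex.join(import_args))
```

Keeping the command as a list rather than a single string avoids shell-quoting mistakes when a value contains spaces or special characters.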
                Export syntax:
                  sqoop export --connect jdbc:mysql://localhost/testdb \
                    --table CLIENTS_INTG --username test --password **** \
                    --export-dir /user/localadmin/CLIENTS
               This command will generate a series of tasks:
                  Generate MapReduce jobs
                  Execute MapReduce jobs
                  Transfer data from local files or HDFS
                  Compile SQL code
                  Create or insert into CLIENTS_INTG table
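
The export direction differs from the import mainly in the subcommand and in the `--export-dir` argument, which names the HDFS directory to read from. A minimal Python sketch under the same assumptions follows: `build_export_args` is a hypothetical helper, the values are the ones from the export command above, and nothing is executed.

```python
def build_export_args(jdbc_url, table, username, password, export_dir):
    """Return the argument list for a Sqoop1 export (hypothetical helper)."""
    return [
        "sqoop", "export",
        "--connect", jdbc_url,
        "--table", table,            # target table in the relational database
        "--username", username,
        "--password", password,
        "--export-dir", export_dir,  # HDFS directory holding the source files
    ]


export_args = build_export_args(
    "jdbc:mysql://localhost/testdb",
    "CLIENTS_INTG",
    "test",
    "****",
    "/user/localadmin/CLIENTS",
)
print(" ".join(export_args))
```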

                Many features of Sqoop1 are easy to learn and implement. For example, on the
             command line you can specify whether the import goes directly to Hive, HDFS, or
             HBase. There are direct connectors to the most popular databases: Oracle,
             SQL Server, MySQL, Teradata, and PostgreSQL.
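
As an illustration of choosing the import destination on the command line, the sketch below maps each destination to the Sqoop1 import flags that select it. It is a simplified sketch rather than a complete treatment: the function name `destination_flags` and the default column family `cf` are placeholders introduced here, and real jobs usually need further options.

```python
def destination_flags(destination, name, column_family="cf"):
    """Map an import destination to Sqoop1 import flags (hypothetical helper).

    `name` is a target directory for HDFS, or a table name for Hive and HBase.
    """
    if destination == "hdfs":
        return ["--target-dir", name]
    if destination == "hive":
        return ["--hive-import", "--hive-table", name]
    if destination == "hbase":
        # HBase imports also need a column family; "cf" is only a placeholder.
        return ["--hbase-table", name, "--column-family", column_family]
    raise ValueError(f"unknown destination: {destination}")


base = ["sqoop", "import", "--connect", "jdbc:mysql://localhost/testdb",
        "--table", "PERSON", "--username", "test"]

# Build one command line per destination from the same base arguments.
for dest, target in [("hdfs", "/user/localadmin/PERSON"),
                     ("hive", "person"),
                     ("hbase", "person")]:
    print(" ".join(base + destination_flags(dest, target)))
```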
                There are evolving challenges with Sqoop1 including the following:
               Cryptic command line arguments
               Nonsecure connectivity: security risk
               No metadata repository: limited reuse
               Program-driven installation and management
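
One concrete form of the connectivity risk, offered here as an assumption since the text does not name a specific one, is a password passed inline with `--password`, where it can appear in shell history and process listings. Sqoop1 also accepts `--password-file`, which reads the password from a file. The sketch below rewrites an argument list accordingly; the helper name and the file path are hypothetical.

```python
def use_password_file(args, password_file):
    """Replace an inline `--password VALUE` pair with `--password-file PATH`."""
    hardened = []
    skip_next = False
    for arg in args:
        if skip_next:
            # Drop the inline password value that followed --password.
            skip_next = False
            continue
        if arg == "--password":
            hardened += ["--password-file", password_file]
            skip_next = True
        else:
            hardened.append(arg)
    return hardened


original = ["sqoop", "import", "--connect", "jdbc:mysql://localhost/testdb",
            "--table", "PERSON", "--username", "test", "--password", "****"]

# "/user/test/.sqoop-password" is a placeholder path, not taken from the text.
hardened_args = use_password_file(original, "/user/test/.sqoop-password")
print(" ".join(hardened_args))
```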

                Sqoop2 is the next generation of the data transfer architecture, designed to
             address the limitations of Sqoop1 (Fig. 2.18), namely:
               Sqoop2 has a web-enabled UI
               Sqoop2 will be driven by a Sqoop Server architecture
               Sqoop2 will provide greater connector flexibility; apart from JDBC, many native
                connectivity options can be customized by providers
               Sqoop2 will have a REST API interface
               Sqoop2 will have its own metadata store
               Sqoop2 will add credentials management capabilities, which will provide trusted
                connection capabilities

                The proposed architecture of Sqoop2 is shown in Fig. 2.19. For more information on
             Sqoop status and issues, please see the Apache Software Foundation website.