This Hive tutorial is a stepping stone toward becoming proficient in querying, summarizing and analyzing billions or trillions of records with HiveQL, a widely used query language for data stored on the Hadoop Distributed File System (HDFS). It familiarizes you with the features and scope of the language so that you can write better-optimized queries. Because HiveQL is a SQL-like dialect, queries can be written with simple DDL and DML commands that define or alter databases, tables and views and perform operations on them. The tutorial focuses on the types of queries that can be executed in Hive, along with the execution plans of the MapReduce jobs that run them at the back end.
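As a rough illustration of the DDL, DML and execution-plan features mentioned above, here is a minimal HiveQL sketch. The table name `transactions`, its columns and the HDFS path are hypothetical and chosen only for this example; the statements themselves (`CREATE TABLE`, `ALTER TABLE ... ADD COLUMNS`, `LOAD DATA INPATH`, `SELECT ... GROUP BY` and `EXPLAIN`) are standard HiveQL.

```sql
-- Hypothetical table, column names and path, for illustration only.

-- DDL: define a table over comma-delimited text files.
CREATE TABLE IF NOT EXISTS transactions (
  txn_id      BIGINT,
  customer_id BIGINT,
  amount      DOUBLE,
  txn_date    STRING
)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY ','
STORED AS TEXTFILE;

-- DDL: alter the table by appending a new column.
ALTER TABLE transactions ADD COLUMNS (channel STRING);

-- DML: move a file that already sits in HDFS into the table
-- (the path below is a placeholder).
LOAD DATA INPATH '/tmp/transactions.csv' INTO TABLE transactions;

-- Query: summarize transaction count and total spend per customer.
SELECT customer_id,
       COUNT(*)    AS txn_count,
       SUM(amount) AS total_amount
FROM transactions
GROUP BY customer_id;

-- Ask Hive for the execution plan of the same query instead of running it.
EXPLAIN
SELECT customer_id,
       COUNT(*)    AS txn_count,
       SUM(amount) AS total_amount
FROM transactions
GROUP BY customer_id;
```

The exact shape of the `EXPLAIN` output depends on the Hive version and on the configured execution engine, so treat the plan as something to inspect on your own cluster rather than a fixed result.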
Hive also supports a number of data science applications.
To learn HiveQL, a basic knowledge of SQL, Hadoop architecture and Unix/Linux shell commands is helpful. Being able to reason through a problem logically also helps when building queries and ETL jobs.
This HiveQL tutorial is aimed at Big Data professionals, engineers and analysts who analyze petabytes of data in fields such as banking, retail and insurance. It will help Hadoop developers automate ETL jobs that summarize large data sets in the Hadoop ecosystem. Database architects and administrators will also find many useful concepts in it.