EDUCBA

EDUCBA

MENUMENU
  • Free Tutorials
  • Free Courses
  • Certification Courses
  • 360+ Courses All in One Bundle
  • Login
Home Data Science Data Science Tutorials Hadoop Tutorial Impala Shell
Secondary Sidebar
Hadoop Tutorial
  • Advanced
    • What is Yarn in Hadoop
    • Hadoop?Administrator
    • Hadoop DistCp
    • Hadoop Administrator Jobs
    • Hadoop Schedulers
    • Hadoop Distributed File System (HDFS)
    • Hadoop Streaming
    • Apache Hadoop Ecosystem
    • Distributed Cache in Hadoop
    • Hadoop Ecosystem Components
    • Hadoop YARN Architecture
    • HDFS Architecture
    • What is HDFS
    • HDFS Federation
    • Apache HBase
    • HBase Architecture
    • What is Hbase
    • HBase Shell Commands
    • What is MapReduce in Hadoop
    • Mapreduce Combiner
    • MapReduce Architecture
    • MapReduce Word Count
    • Impala Shell
    • HBase Create Table
  • Basics
    • What is Hadoop
    • Career in Hadoop
    • Advantages of Hadoop
    • Uses of Hadoop
    • Hadoop Versions
    • HADOOP Framework
    • Hadoop Architecture
    • Hadoop Configuration
    • Hadoop Components
    • Hadoop WordCount
    • Hadoop Database
    • Hadoop Ecosystem
    • Hadoop Tools
    • Install Hadoop
    • Is Hadoop Open Source
    • What is Hadoop Cluster
    • Hadoop Namenode
    • Hadoop data lake
    • Hadoop fsck
    • HDFS File System
    • Hadoop Distributed File System
  • Commands
    • Hadoop Commands
    • Hadoop fs Commands
    • Hadoop FS Command List
    • HDFS Commands
    • HDFS ls
    • Hadoop Stack
    • HBase Commands
  • Interview Questions
    • Hadoop Admin Interview Questions
    • Hadoop Cluster Interview Questions
    • Hadoop developer interview Questions
    • HBase Interview Questions

Related Courses

Data Science Certification

Online Machine Learning Training

Hadoop Certification

MapReduce Certification Course

Impala Shell

By Priya PedamkarPriya Pedamkar

Impala Shell

Introduction to Impala Shell

Impala is a massively parallel processing SQL like engine based on top of the Hadoop ecosystem. Impala Shell is used to processing a huge volume of data stored in the Hadoop cluster. Impala is developed by Cloudera distribution to overcome the slow processing of hive queries. Impala works better in comparison to a hive when a dataset is not huge. It can be used to share the database of the hive as it can connect hive metastore easily. Also, it can be integrated with HBASE or Amazon S3.

Impala is not map-reduce based, as it stores intermediate results in-memory, unlike map-reduce. It supports various HDFS supported file formats (like parquet, Avro, delimited text, etc) and compression codecs like snappy, gzip, etc. Impala-Shell is a command-line tool used to create a database, tables, and also to run interactive queries to fetch data, run Adhoc queries or execute the script files. Impala shell can be started using an impala-shell command on the terminal. Impala queries can be executed using shell, Perl or python scripts.

Start Your Free Data Science Course

Hadoop, Data Science, Statistics & others

Basic Commands in Impala Shell

Below are the basic commands to execute a command on impala-shell:

All in One Data Science Bundle(360+ Courses, 50+ projects)
Python TutorialMachine LearningAWSArtificial Intelligence
TableauR ProgrammingPowerBIDeep Learning
Price
View Courses
360+ Online Courses | 50+ projects | 1500+ Hours | Verifiable Certificates | Lifetime Access
4.7 (86,171 ratings)

1. Connect: This command is used to connect to the running impala instance. The default port connected is 21000.

Command:

Connect

2. Show databases: This command will list existing databases in impala.

Command:

Show databases

3. Create a database: This command is used to create a database.

Syntax:

create database <database_name>

Command:

create database if not exists <database name>;

4. Create table: This command is used to create tables.

Syntax:

create table <table_name>

Command:

create table if not exists <database_name>.<table_name>(field_name1 data_type, field_name2 data_type);

5. Describe: This command is used to describe the table and see the schema of the table like datatypes, number of columns and constraints, etc.

Command:

desc <database_name>;

6. Drop: This command is used to drop tables, database, functions.

Command:

drop database <database_name>

7. Insert: This command is used to insert records in either overwrite or append record in tables.

Command:

Insert into <database_name>.<table_name> values(field_value1, field_value2);

8. Select: This command is used to fetch data from the table or view.

Command:

Select column1, column2 from <database_name>.<table_name>;

9. Use: This command is used to select the database. Statements will be executed over the tables inside the database.

Command:

Use <database_name>

10. Help: This command is used to list all the commands.

Command:

help;

11. History: Impala stores the history of all the commands used in impala-shell. This command is used to display all the commands used in an impala-shell.

Command:

History;

12. Alter: Alter table statement is used to alter the properties of the table. Also, an alter table can be used to change the structure of the table. Change in the structure like adding or dropping columns, keys, constraints can be achieved using the alter command. Also, it can be used to change the partitions of the table. Also, you can rename the table and can change the properties of a table like an owner, etc. This command is mainly used to alter partition or alter the location where the table is pointing.

Command:

Alter table <database_name>.<table_name> drop partition(partition_name);

13. Invalidate Metadata: This command is used to refresh metadata for the tables. As impala shares metastore with hive so if there are any changes made in metadata of the tables outside of impala invalidate metadata is needed to run. As it will invalidate the existing metadata of the tables. It will reload metadata for tables before querying the table. This command is very expensive when compared to refresh. Invalidate metadata needs to be executed when there is any change in jars, or when some table will not be queried, or when new tables are added.

Command:

Invalidate metadata hive_db_name.table_name;

14. Refresh: This command is used to reload metadata about the table from metastore whenever there is a change in metadata outside of impala. Refresh will remove the inconsistency between hive metastore and impala. After refresh metadata will be broadcasted to all impala coordinators. The command needs to be executed in scenarios like whenever there is any addition or deletion of file from the HDFS directory where the Impala table is pointing. Or if there is any change in partitioning made outside of impala like addition or deletion of partition by the hive.

Command:

Refresh <db_name>.<table_name> partition(key1=value1,key2=val2);

15. Explain: this command is used to derive the execution plan of a statement, like how impala will read data from hdfs, will divide work among hdfs nodes, etc. This command can be used to select or create a table as or insert statements to understand the execution plan of a query. With the help of execution plan query can be changed to perform better like join clauses can be adjusted or where conditions can be adjusted accordingly. The execution plan should be read from bottom to top.

Command:

Explain <query can be either select or insert or CTAS>

16. Compute stats: This command is used to get information about data in a table and will be stored in the metastore database, later will be used by impala to run queries in an optimized way. This query gets information about data distribution or partitioning etc.

Command:

Compute stats <database_name>.<table_name>(column1, column2 etc)

Recommended Articles

This is a guide to Impala Shell. Here we discuss the introduction, 16 different basic commands in Impala Shell along with Syntax. You can also go through our other related articles to learn more–

  1. Insertion Sort in Data Structure
  2. Data Modeling Course
  3. Relational Database
  4. Types of NoSQL Databases
  5. Guid to Impala Database
Popular Course in this category
All in One Data Science Bundle (360+ Courses, 50+ projects)
  360+ Online Courses |  1500+ Hours |  Verifiable Certificates |  Lifetime Access
4.7
Price

View Course

Related Courses

Data Scientist Training (85 Courses, 67+ Projects)4.9
Machine Learning Training (20 Courses, 29+ Projects)4.8
Hadoop Training Program (20 Courses, 14+ Projects, 4 Quizzes)4.7
MapReduce Training (2 Courses, 4+ Projects)4.7
0 Shares
Share
Tweet
Share
Primary Sidebar
Footer
About Us
  • Blog
  • Who is EDUCBA?
  • Sign Up
  • Live Classes
  • Corporate Training
  • Certificate from Top Institutions
  • Contact Us
  • Verifiable Certificate
  • Reviews
  • Terms and Conditions
  • Privacy Policy
  •  
Apps
  • iPhone & iPad
  • Android
Resources
  • Free Courses
  • Database Management
  • Machine Learning
  • All Tutorials
Certification Courses
  • All Courses
  • Data Science Course - All in One Bundle
  • Machine Learning Course
  • Hadoop Certification Training
  • Cloud Computing Training Course
  • R Programming Course
  • AWS Training Course
  • SAS Training Course

ISO 10004:2018 & ISO 9001:2015 Certified

© 2022 - EDUCBA. ALL RIGHTS RESERVED. THE CERTIFICATION NAMES ARE THE TRADEMARKS OF THEIR RESPECTIVE OWNERS.

EDUCBA
Free Data Science Course

SPSS, Data visualization with Python, Matplotlib Library, Seaborn Package

*Please provide your correct email id. Login details for this Free course will be emailed to you

By signing up, you agree to our Terms of Use and Privacy Policy.

EDUCBA Login

Forgot Password?

By signing up, you agree to our Terms of Use and Privacy Policy.

EDUCBA
Free Data Science Course

Hadoop, Data Science, Statistics & others

*Please provide your correct email id. Login details for this Free course will be emailed to you

By signing up, you agree to our Terms of Use and Privacy Policy.

EDUCBA

*Please provide your correct email id. Login details for this Free course will be emailed to you

By signing up, you agree to our Terms of Use and Privacy Policy.

Let’s Get Started

By signing up, you agree to our Terms of Use and Privacy Policy.

This website or its third-party tools use cookies, which are necessary to its functioning and required to achieve the purposes illustrated in the cookie policy. By closing this banner, scrolling this page, clicking a link or continuing to browse otherwise, you agree to our Privacy Policy

Loading . . .
Quiz
Question:

Answer:

Quiz Result
Total QuestionsCorrect AnswersWrong AnswersPercentage

Explore 1000+ varieties of Mock tests View more