EDUCBA Logo

EDUCBA

MENUMENU
  • Explore
    • EDUCBA Pro
    • PRO Bundles
    • Featured Skills
    • New & Trending
    • Fresh Entries
    • Finance
    • Data Science
    • Programming and Dev
    • Excel
    • Marketing
    • HR
    • PDP
    • VFX and Design
    • Project Management
    • Exam Prep
    • All Courses
  • Blog
  • Enterprise
  • Free Courses
  • Log in
  • Sign Up
Home Data Science Data Science Tutorials Hadoop Tutorial Impala Shell
 

Impala Shell

Priya Pedamkar
Article byPriya Pedamkar

Updated March 27, 2023

Impala Shell

 

 

Introduction to Impala Shell

Impala is a massively parallel processing SQL like engine based on top of the Hadoop ecosystem. Impala Shell is used to processing a huge volume of data stored in the Hadoop cluster. Impala is developed by Cloudera distribution to overcome the slow processing of hive queries. Impala works better in comparison to a hive when a dataset is not huge. It can be used to share the database of the hive as it can connect hive metastore easily. Also, it can be integrated with HBASE or Amazon S3.

Watch our Demo Courses and Videos

Valuation, Hadoop, Excel, Mobile Apps, Web Development & many more.

Impala is not map-reduce based, as it stores intermediate results in-memory, unlike map-reduce. It supports various HDFS supported file formats (like parquet, Avro, delimited text, etc) and compression codecs like snappy, gzip, etc. Impala-Shell is a command-line tool used to create a database, tables, and also to run interactive queries to fetch data, run Adhoc queries or execute the script files. Impala shell can be started using an impala-shell command on the terminal. Impala queries can be executed using shell, Perl or python scripts.

Basic Commands in Impala Shell

Below are the basic commands to execute a command on impala-shell:

1. Connect: This command is used to connect to the running impala instance. The default port connected is 21000.

Command:

Connect

2. Show databases: This command will list existing databases in impala.

Command:

Show databases

3. Create a database: This command is used to create a database.

Syntax:

create database <database_name>

Command:

create database if not exists <database name>;

4. Create table: This command is used to create tables.

Syntax:

create table <table_name>

Command:

create table if not exists <database_name>.<table_name>(field_name1 data_type, field_name2 data_type);

5. Describe: This command is used to describe the table and see the schema of the table like datatypes, number of columns and constraints, etc.

Command:

desc <database_name>;

6. Drop: This command is used to drop tables, database, functions.

Command:

drop database <database_name>

7. Insert: This command is used to insert records in either overwrite or append record in tables.

Command:

Insert into <database_name>.<table_name> values(field_value1, field_value2);

8. Select: This command is used to fetch data from the table or view.

Command:

Select column1, column2 from <database_name>.<table_name>;

9. Use: This command is used to select the database. Statements will be executed over the tables inside the database.

Command:

Use <database_name>

10. Help: This command is used to list all the commands.

Command:

help;

11. History: Impala stores the history of all the commands used in impala-shell. This command is used to display all the commands used in an impala-shell.

Command:

History;

12. Alter: Alter table statement is used to alter the properties of the table. Also, an alter table can be used to change the structure of the table. Change in the structure like adding or dropping columns, keys, constraints can be achieved using the alter command. Also, it can be used to change the partitions of the table. Also, you can rename the table and can change the properties of a table like an owner, etc. This command is mainly used to alter partition or alter the location where the table is pointing.

Command:

Alter table <database_name>.<table_name> drop partition(partition_name);

13. Invalidate Metadata: This command is used to refresh metadata for the tables. As impala shares metastore with hive so if there are any changes made in metadata of the tables outside of impala invalidate metadata is needed to run. As it will invalidate the existing metadata of the tables. It will reload metadata for tables before querying the table. This command is very expensive when compared to refresh. Invalidate metadata needs to be executed when there is any change in jars, or when some table will not be queried, or when new tables are added.

Command:

Invalidate metadata hive_db_name.table_name;

14. Refresh: This command is used to reload metadata about the table from metastore whenever there is a change in metadata outside of impala. Refresh will remove the inconsistency between hive metastore and impala. After refresh metadata will be broadcasted to all impala coordinators. The command needs to be executed in scenarios like whenever there is any addition or deletion of file from the HDFS directory where the Impala table is pointing. Or if there is any change in partitioning made outside of impala like addition or deletion of partition by the hive.

Command:

Refresh <db_name>.<table_name> partition(key1=value1,key2=val2);

15. Explain: this command is used to derive the execution plan of a statement, like how impala will read data from hdfs, will divide work among hdfs nodes, etc. This command can be used to select or create a table as or insert statements to understand the execution plan of a query. With the help of execution plan query can be changed to perform better like join clauses can be adjusted or where conditions can be adjusted accordingly. The execution plan should be read from bottom to top.

Command:

Explain <query can be either select or insert or CTAS>

16. Compute stats: This command is used to get information about data in a table and will be stored in the metastore database, later will be used by impala to run queries in an optimized way. This query gets information about data distribution or partitioning etc.

Command:

Compute stats <database_name>.<table_name>(column1, column2 etc)

Recommended Articles

This is a guide to Impala Shell. Here we discuss the introduction, 16 different basic commands in Impala Shell along with Syntax. You can also go through our other related articles to learn more–

  1. Insertion Sort in Data Structure
  2. Data Modeling Course
  3. Relational Database
  4. Types of NoSQL Databases
  5. Guid to Impala Database

Primary Sidebar

Footer

Follow us!
  • EDUCBA FacebookEDUCBA TwitterEDUCBA LinkedINEDUCBA Instagram
  • EDUCBA YoutubeEDUCBA CourseraEDUCBA Udemy
APPS
EDUCBA Android AppEDUCBA iOS App
Blog
  • Blog
  • Free Tutorials
  • About us
  • Contact us
  • Log in
Courses
  • Enterprise Solutions
  • Free Courses
  • Explore Programs
  • All Courses
  • All in One Bundles
  • Sign up
Email
  • [email protected]

ISO 10004:2018 & ISO 9001:2015 Certified

© 2025 - EDUCBA. ALL RIGHTS RESERVED. THE CERTIFICATION NAMES ARE THE TRADEMARKS OF THEIR RESPECTIVE OWNERS.

EDUCBA

*Please provide your correct email id. Login details for this Free course will be emailed to you
Loading . . .
Quiz
Question:

Answer:

Quiz Result
Total QuestionsCorrect AnswersWrong AnswersPercentage

Explore 1000+ varieties of Mock tests View more

EDUCBA

*Please provide your correct email id. Login details for this Free course will be emailed to you
EDUCBA
Free Data Science Course

Hadoop, Data Science, Statistics & others

By continuing above step, you agree to our Terms of Use and Privacy Policy.
*Please provide your correct email id. Login details for this Free course will be emailed to you
EDUCBA

*Please provide your correct email id. Login details for this Free course will be emailed to you

EDUCBA Login

Forgot Password?

🚀 Limited Time Offer! - ENROLL NOW