EDUCBA

EDUCBA

MENUMENU
  • Blog
  • Free Courses
  • All Courses
  • All in One Bundle
  • Login
Home Data Science Data Science Tutorials Teradata Tutorial Collect Stats in Teradata

Collect Stats in Teradata

Priya Pedamkar
Article byPriya Pedamkar

Updated July 4, 2023

Collect Stats in Teradata

Collect Stats in Teradata

Collect Stats in Teradata collect the statistics for one or even multiple columns of the base table specified. These statistics include the hash index, a join index, etc. The primary purpose is to collect statistical profiles of the required columns and store them in a data dictionary.

Start Your Free Data Science Course

Hadoop, Data Science, Statistics & others

What is Collect stats in Teradata?

Collect Stats is used to gather the statistics on various columns as per the requirement. Thereafter Teradata’s optimizer creates an execution strategy that is based on these statistics collected by the COLLECT STATS command.

This COLLECT STATS command gathers data demographics and environment information, which the optimizer utilizes in order to optimize the plan for the SQL based on these tables.

Environment information consists of the below set of info:

  • The amount of memory required
  • The number of nodes
  • AMP’s
  • CPU’s

Data demographics consist of the below set of info:

  • The row size of the table
  • Number of rows in that table
  • The entire range of values present in that table
  • The number of rows available per table
  • Number of Null values in that table

There is a variety of approaches available to collect statistics over a table.

  • Using the sample option: This includes unique index columns, nearly unique indexes, or columns
  • Full statistic collection methodology: This usually includes Non-Indexed columns, Partition for all tables whether permitted or not, and Collection of full stats over relevant columns.
  • Most NUPIs refer to non-unique primary indexed columns, while UPIs stand for unique primary indexed columns.
  • Apart from these, it also includes the single-column join constraints along with NUSI’s
  • The Random AMP Sampling: It involves the USI’s or UPI’s if only used with equality predicates.

How to Collect Stats in Teradata?

Below is the syntax of COLLECT STATS statement in Teradata:

COLLECT [SUMMARY] STATISTICS
INDEX (name_of_the_index)
COLUMN (col_name)
ON <table_name>;

Here the keyword SUMMARY is optional, and the user may skip it if not required.

Let’s take up some examples to understand how to collect stats in Teradata in detail:

COLLECT STATISTICS COLUMN(roll_number) ON Student_Table;
  • This will collect stats on the roll_number column of the student_table
  • When the above query is executed, the below kind of output is produced

Collect Stats in Teradata 1

Let’s take an example to understand the optimization in detail.

Suppose we have two tables table1 and table2.

table1 contains the details of the students like id, name, age, marks, etc

whereas table2 contains the geographic info of the students like address, location along with the primary key as ID

SELECT
a.id,
a.name,
a.age,
a.marks,
b.address,
b.location
From table1 as a
left join table2 as b
on a.id = b.id

Let’s consider two cases wherein the above-mentioned query gets executed.

CASE 1: When we do not have any information regarding the statistics of any columns from table1 and table2. In this case, the execution plan for the above query will be more costly.

CASE2: When we do have the specifically required information regarding the statistics of any columns from table1 and table2. In this case, the execution plan for the above query will be less costly.

The reason is, during the join, which is based on the id column from table1 and table2, it needs to be on the same AMP in order to join the data based on this column from table1 and table2

Suppose table1 contains 100 records having ID from 1 to 100 distributed evenly over 10 AMP in the below fashion.

  • The records having the ID from 1 to 10 in AMP1
  • The records having the ID from 11 to 20 in AMP2
  • The records having the ID from 21 to 30 in AMP3
  • The records having the ID from 31 to 40 in AMP4
  • and so on…
  • The records having the ID from 91 to 100 in AMP10

And table2 has only 80 records having ID from 1 to 100 with 20 missing ID’s

  • The records having the ID from 1 to 8 in AMP1
  • The records having the ID from 9 to 10 and 15 to 20 in AMP2
  • The records having the ID from 21 to 28 in AMP3
  • The records having the ID from 29 to 36 in AMP4
  • and so on…
  • The records having the ID from 92 to 100 in AMP8

Now for the join to happen, the ID’s should be available in the SAME AMP

Generally, the data from the smaller table is redistributed. So here, the redistribution will happen for table2

How can we see the statistics collected for the tables in Teradata?

The collected stats can be seen using the below query:

HELP STATISTICS <table_name>

Let’s see the stats collected on Student_table

HELP STATISTICS student_table

When the above query is executed, The result will be something like this:

Collect Stats in Teradata 2

Conclusion

The COLLECT STATS command is used to gather statistics about tables, which can then be utilized to optimize queries involving those tables. These collected statistics provide valuable information about the distribution of data and can assist in generating efficient query execution plans.

Recommended Articles

We hope that this EDUCBA information on “Collect Stats in Teradata” was beneficial to you. You can view EDUCBA’s recommended articles for more information.

  1. Fastload in Teradata
  2. Insert into Teradata
  3. Teradata Architecture
  4. Teradata CASE Statement
All in One Excel VBA Bundle
500+ Hours of HD Videos
15 Learning Paths
120+ Courses
Verifiable Certificate of Completion
Lifetime Access
Financial Analyst Masters Training Program
2000+ Hours of HD Videos
43 Learning Paths
550+ Courses
Verifiable Certificate of Completion
Lifetime Access
All in One Data Science Bundle
2000+ Hour of HD Videos
80 Learning Paths
400+ Courses
Verifiable Certificate of Completion
Lifetime Access
All in One Software Development Bundle
5000+ Hours of HD Videos
149 Learning Paths
1050+ Courses
Verifiable Certificate of Completion
Lifetime Access
Primary Sidebar
All in One Data Science Bundle2000+ Hour of HD Videos | 80 Learning Paths | 400+ Courses | Verifiable Certificate of Completion | Lifetime Access
Financial Analyst Masters Training Program2000+ Hours of HD Videos | 43 Learning Paths | 550+ Courses | Verifiable Certificate of Completion | Lifetime Access
Footer
About Us
  • Blog
  • Who is EDUCBA?
  • Sign Up
  • Live Classes
  • Certificate from Top Institutions
  • Contact Us
  • Verifiable Certificate
  • Reviews
  • Terms and Conditions
  • Privacy Policy
  •  
Apps
  • iPhone & iPad
  • Android
Resources
  • Free Courses
  • Database Management
  • Machine Learning
  • All Tutorials
Certification Courses
  • All Courses
  • Data Science Course - All in One Bundle
  • Machine Learning Course
  • Hadoop Certification Training
  • Cloud Computing Training Course
  • R Programming Course
  • AWS Training Course
  • SAS Training Course

ISO 10004:2018 & ISO 9001:2015 Certified

© 2023 - EDUCBA. ALL RIGHTS RESERVED. THE CERTIFICATION NAMES ARE THE TRADEMARKS OF THEIR RESPECTIVE OWNERS.

Let’s Get Started

By signing up, you agree to our Terms of Use and Privacy Policy.

EDUCBA

*Please provide your correct email id. Login details for this Free course will be emailed to you

EDUCBA
Free Data Science Course

Hadoop, Data Science, Statistics & others

By continuing above step, you agree to our Terms of Use and Privacy Policy.
*Please provide your correct email id. Login details for this Free course will be emailed to you

EDUCBA

*Please provide your correct email id. Login details for this Free course will be emailed to you
EDUCBA

*Please provide your correct email id. Login details for this Free course will be emailed to you
EDUCBA Login

Forgot Password?

By signing up, you agree to our Terms of Use and Privacy Policy.

This website or its third-party tools use cookies, which are necessary to its functioning and required to achieve the purposes illustrated in the cookie policy. By closing this banner, scrolling this page, clicking a link or continuing to browse otherwise, you agree to our Privacy Policy

Loading . . .
Quiz
Question:

Answer:

Quiz Result
Total QuestionsCorrect AnswersWrong AnswersPercentage

Explore 1000+ varieties of Mock tests View more