EDUCBA Logo

EDUCBA

MENUMENU
  • Explore
    • EDUCBA Pro
    • PRO Bundles
    • Featured Skills
    • New & Trending
    • Fresh Entries
    • Finance
    • Data Science
    • Programming and Dev
    • Excel
    • Marketing
    • HR
    • PDP
    • VFX and Design
    • Project Management
    • Exam Prep
    • All Courses
  • Blog
  • Enterprise
  • Free Courses
  • Log in
  • Sign Up
Home Data Science Data Science Tutorials Teradata Tutorial Collect Stats in Teradata
 

Collect Stats in Teradata

Priya Pedamkar
Article byPriya Pedamkar

Updated July 4, 2023

Collect Stats in Teradata

 

 

Collect Stats in Teradata

Collect Stats in Teradata collect the statistics for one or even multiple columns of the base table specified. These statistics include the hash index, a join index, etc. The primary purpose is to collect statistical profiles of the required columns and store them in a data dictionary.

Watch our Demo Courses and Videos

Valuation, Hadoop, Excel, Mobile Apps, Web Development & many more.

What is Collect stats in Teradata?

Collect Stats is used to gather the statistics on various columns as per the requirement. Thereafter Teradata’s optimizer creates an execution strategy that is based on these statistics collected by the COLLECT STATS command.

This COLLECT STATS command gathers data demographics and environment information, which the optimizer utilizes in order to optimize the plan for the SQL based on these tables.

Environment information consists of the below set of info:

  • The amount of memory required
  • The number of nodes
  • AMP’s
  • CPU’s

Data demographics consist of the below set of info:

  • The row size of the table
  • Number of rows in that table
  • The entire range of values present in that table
  • The number of rows available per table
  • Number of Null values in that table

There is a variety of approaches available to collect statistics over a table.

  • Using the sample option: This includes unique index columns, nearly unique indexes, or columns
  • Full statistic collection methodology: This usually includes Non-Indexed columns, Partition for all tables whether permitted or not, and Collection of full stats over relevant columns.
  • Most NUPIs refer to non-unique primary indexed columns, while UPIs stand for unique primary indexed columns.
  • Apart from these, it also includes the single-column join constraints along with NUSI’s
  • The Random AMP Sampling: It involves the USI’s or UPI’s if only used with equality predicates.

How to Collect Stats in Teradata?

Below is the syntax of COLLECT STATS statement in Teradata:

COLLECT [SUMMARY] STATISTICS
INDEX (name_of_the_index)
COLUMN (col_name)
ON <table_name>;

Here the keyword SUMMARY is optional, and the user may skip it if not required.

Let’s take up some examples to understand how to collect stats in Teradata in detail:

COLLECT STATISTICS COLUMN(roll_number) ON Student_Table;
  • This will collect stats on the roll_number column of the student_table
  • When the above query is executed, the below kind of output is produced

Collect Stats in Teradata 1

Let’s take an example to understand the optimization in detail.

Suppose we have two tables table1 and table2.

table1 contains the details of the students like id, name, age, marks, etc

whereas table2 contains the geographic info of the students like address, location along with the primary key as ID

SELECT
a.id,
a.name,
a.age,
a.marks,
b.address,
b.location
From table1 as a
left join table2 as b
on a.id = b.id

Let’s consider two cases wherein the above-mentioned query gets executed.

CASE 1: When we do not have any information regarding the statistics of any columns from table1 and table2. In this case, the execution plan for the above query will be more costly.

CASE2: When we do have the specifically required information regarding the statistics of any columns from table1 and table2. In this case, the execution plan for the above query will be less costly.

The reason is, during the join, which is based on the id column from table1 and table2, it needs to be on the same AMP in order to join the data based on this column from table1 and table2

Suppose table1 contains 100 records having ID from 1 to 100 distributed evenly over 10 AMP in the below fashion.

  • The records having the ID from 1 to 10 in AMP1
  • The records having the ID from 11 to 20 in AMP2
  • The records having the ID from 21 to 30 in AMP3
  • The records having the ID from 31 to 40 in AMP4
  • and so on…
  • The records having the ID from 91 to 100 in AMP10

And table2 has only 80 records having ID from 1 to 100 with 20 missing ID’s

  • The records having the ID from 1 to 8 in AMP1
  • The records having the ID from 9 to 10 and 15 to 20 in AMP2
  • The records having the ID from 21 to 28 in AMP3
  • The records having the ID from 29 to 36 in AMP4
  • and so on…
  • The records having the ID from 92 to 100 in AMP8

Now for the join to happen, the ID’s should be available in the SAME AMP

Generally, the data from the smaller table is redistributed. So here, the redistribution will happen for table2

How can we see the statistics collected for the tables in Teradata?

The collected stats can be seen using the below query:

HELP STATISTICS <table_name>

Let’s see the stats collected on Student_table

HELP STATISTICS student_table

When the above query is executed, The result will be something like this:

Collect Stats in Teradata 2

Conclusion

The COLLECT STATS command is used to gather statistics about tables, which can then be utilized to optimize queries involving those tables. These collected statistics provide valuable information about the distribution of data and can assist in generating efficient query execution plans.

Recommended Articles

We hope that this EDUCBA information on “Collect Stats in Teradata” was beneficial to you. You can view EDUCBA’s recommended articles for more information.

  1. Fastload in Teradata
  2. Insert into Teradata
  3. Teradata Architecture
  4. Teradata CASE Statement

Primary Sidebar

Footer

Follow us!
  • EDUCBA FacebookEDUCBA TwitterEDUCBA LinkedINEDUCBA Instagram
  • EDUCBA YoutubeEDUCBA CourseraEDUCBA Udemy
APPS
EDUCBA Android AppEDUCBA iOS App
Blog
  • Blog
  • Free Tutorials
  • About us
  • Contact us
  • Log in
Courses
  • Enterprise Solutions
  • Free Courses
  • Explore Programs
  • All Courses
  • All in One Bundles
  • Sign up
Email
  • [email protected]

ISO 10004:2018 & ISO 9001:2015 Certified

© 2025 - EDUCBA. ALL RIGHTS RESERVED. THE CERTIFICATION NAMES ARE THE TRADEMARKS OF THEIR RESPECTIVE OWNERS.

EDUCBA

*Please provide your correct email id. Login details for this Free course will be emailed to you
Loading . . .
Quiz
Question:

Answer:

Quiz Result
Total QuestionsCorrect AnswersWrong AnswersPercentage

Explore 1000+ varieties of Mock tests View more

EDUCBA

*Please provide your correct email id. Login details for this Free course will be emailed to you
EDUCBA
Free Data Science Course

Hadoop, Data Science, Statistics & others

By continuing above step, you agree to our Terms of Use and Privacy Policy.
*Please provide your correct email id. Login details for this Free course will be emailed to you
EDUCBA

*Please provide your correct email id. Login details for this Free course will be emailed to you

EDUCBA Login

Forgot Password?

🚀 Limited Time Offer! - 🎁 ENROLL NOW