EDUCBA

EDUCBA

MENUMENU
  • Free Tutorials
  • Free Courses
  • Certification Courses
  • 360+ Courses All in One Bundle
  • Login
Home Data Science Data Science Tutorials Teradata Tutorial Teradata Partition by
Secondary Sidebar
Teradata Tutorial
  • Basic
    • What is Teradata
    • Career In Teradata
    • Teradata Architecture
    • Teradata data types
    • Teradata ODBC Driver
    • Teradata Vantage
    • Insert into Teradata
    • Teradata CASE Statement
    • Teradata Partition by
    • Teradata Date Formats
    • Teradata Current Date
    • Teradata Substring
    • Teradata BTEQ
    • Teradata Concatenate
    • Teradata REPLACE
    • Teradata Joins
    • Primary Index in Teradata
    • Fastload in Teradata
    • Collect Stats in Teradata
    • Teradata Volatile Table
    • Teradata TPT
    • Teradata Qualify
    • Teradata?coalesce
    • Teradata RENAME TABLE
    • Teradata Viewpoint
    • Teradata Performance Tuning
    • Teradata AMP
    • Teradata Utilities

Teradata Partition by

By Priya PedamkarPriya Pedamkar

Teradata Partition by

Introduction to Teradata Partition by

Partition by Teradata usually specifies that one or more partitioning levels partition the table on which it is applied. There are various partition types with a wide range in the no. of altogether combined partitions. However, there are both positive and negative impacts of defining a partition, which is an improvement in performance over the subsequent queries of that table alongside the disc storage, respectively.

  • A partitioning level can-not be specified if it includes a row-level security constraint column.
  • A character partitioning level can-not is specified for the columns or the constants having Kanji1 or KanjiSJIS server character sets.
  • For Global Temporary and Volatile table types, you can not specify or define the column partitioning.
  • Parting for NOPI or PA table can be row-level or column level.

How to use Partition by in Teradata?

Partition by is defined as a part of the CREATE TABLE statement to specify. How to distribute the columns over the available AMP’s?

Let us take the below examples to understand them.

Example #1

Partition Non-Unique Primary Index by Current date function.

Start Your Free Data Science Course

Hadoop, Data Science, Statistics & others

Code:

CREATE TABLE cust (
customer_name CHARACTER(8),
policy_num INTEGER,
policy_exp_dt DATE FORMAT 'YYYY/MM/DD')
PRIMARY INDEX (customer_name, policy_num)
PARTITION BY CASE_N(policy_exp_dt>= CURRENT_DATE, NO CASE);

Output:

teradata partition by 1

  • Here the partitioning is based on a Non-unique column that is the policy expiration date.
  • The data having the expired policies are separated from the non-expired policies by using the current date for partitioning by the policy expiry date.

Example #2

Partitioning based on a Non-Unique primary index with a unique secondary index defined on the same column with the partitioning by Range_N function.

All in One Data Science Bundle(360+ Courses, 50+ projects)
Python TutorialMachine LearningAWSArtificial Intelligence
TableauR ProgrammingPowerBIDeep Learning
Price
View Courses
360+ Online Courses | 50+ projects | 1500+ Hours | Verifiable Certificates | Lifetime Access
4.7 (86,171 ratings)

Code:

CREATE TABLE order_table (
order_id INTEGER NOT NULL,
cust_id INTEGER,
order_stats CHARACTER(1) CASESPECIFIC,
total_price DECIMAL(13,2) NOT NULL,
order_date DATE FORMAT 'yyyy-mm-dd' NOT NULL,
order_priority CHARACTER(21),
clerk CHARACTER(16),
shipment_priority INTEGER,
remarks VARCHAR(79))
PRIMARY INDEX (order_id)
PARTITION BY RANGE_N(order_date BETWEEN DATE '1992-01-01'
AND DATE '1998-12-31'
EACH INTERVAL '1' MONTH)
UNIQUE INDEX (order_id);

Output:

teradata partition by 2

  • In this example, the Order_id column is the Primary Index of the order_table.
  • RANGE_N function is used for partitioning, using the order_date between a specified range and an interval of 1 month.
  • Order_id column is the unique index as well in this case.

Example #3

Partitioning by a unique primary Index over a narrow range defined by the RANGE_N function.

Code:

CREATE TABLE sales_table (
st_id INTEGER NOT NULL,
prod_id INTEGER NOT NULL,
sales_dt DATE FORMAT 'yyyy-mm-dd' NOT NULL,
tot_revenue DECIMAL(13,2),
tot_sold INTEGER,
note_remarks VARCHAR(256))
UNIQUE PRIMARY INDEX (st_id, prod_id, sales_dt)
PARTITION BY RANGE_N(sales_dt BETWEEN DATE '2001-01-01'
AND DATE '2001-05-31'
EACH INTERVAL '1' DAY);

Output:

primary Index over a narrow range

  • This example creates a unique primary index on the st_id, prod_id and the sales_dt, Whereas the partition is based on the RANGE_N function over sales_dt with a specified date range along within interval specified.
  • The records are distributed over the AMP’s based on the partitioning; that’s why partition by is defined during the CREATE TABLE statement.

Example #4

How partition works in Teradata. Partitioning by NUSI without a USI and partitioning by RANGE_N function.

Code:

CREATE TABLE lineitem_table (
orderkey INTEGER NOT NULL,
partkey INTEGER NOT NULL,
suppkey INTEGER,
linenumber INTEGER,
quantity INTEGER NOT NULL,
extendedprice DECIMAL(13,2) NOT NULL,
discount DECIMAL(13,2),
tax DECIMAL(13,2),
returnflag CHARACTER(1),
linestatus CHARACTER(1),
shipdate DATE FORMAT 'yyyy-mm-dd',
commitdate DATE FORMAT 'yyyy-mm-dd',
receiptdate DATE FORMAT 'yyyy-mm-dd',
shipinstruct VARCHAR(25),
shipmode VARCHAR(10),
comment VARCHAR(44))
PRIMARY INDEX (orderkey)
PARTITION BY RANGE_N(shipdate BETWEEN DATE '1992-01-01'
AND DATE '1998-12-31'
EACH INTERVAL '1' MONTH);

Output:

by NUSI without a USI

  • This example creates a non-unique primary index on the column orderkey, whereas the partition expression is based on the RANGE_N function on shipdate.
  • No secondary index is defined in the column.

Example #5

Partitioning on Unique primary index defined on Extract function.

This example creates a unique primary index based on st_id, prod_id, sales_dt. The partitioning is based on the Extract function by sales_dt. No secondary index is defined.

Code:

CREATE TABLE sales_month (
st_id INTEGER NOT NULL,
prod_id INTEGER NOT NULL,
sales_dt DATE FORMAT 'yyyy-mm-dd' NOT NULL,
tot_revenue DECIMAL(13,2),
to_sold INTEGER,
remarks VARCHAR(256))
UNIQUE PRIMARY INDEX (st_id, prod_id, sales_dt)
PARTITION BY EXTRACT(MONTH FROM sales_dt);

Output:

Unique primary index

  • Here the data of the sales_month table will be distributed by the month of the same date.

For example, the sales had the same month of sales date will be partitioned together.

  • This usually helps in extracting the data having the sales from the same month more efficiently. That’s why partitioning is done to have a better execution plan.
  • The execution plan for any query can be seen by simply modifying the query and putting EXPLAIN in front of it.

EXPLAIN select * from sales_month

  • First of all, the table is locked to read the reserved row hash in a single partition.
  • Thereafter, the all-AMP retrieving step is followed by a residue condition built in the spool space itself.
  • Finally, an END TRANSACTION command is sent out to all AMP’s involved in processing the request.

How can we qualify the partitioning to avoid ambiguity?

Code:

SELECT *
FROM orders_table , lineitem_table WHERE orders.PARTITION = 3
AND lineitem_table.PARTITION = 5
AND orders_table.o_orderkey = lineitem_table.l_orderkey;

SELECT orders_table.*, lineitem_table.*, orders_table.PARTITION
FROM orders_table, lineitem_table
WHERE orders_table.PARTITION = 3
AND lineitem_table.PARTITION = 5
AND orders_table.o_orderkey = lineitem_table.l_orderkey;

Output:

teradata partition by 6

  • Here in partitioning is utilized with the select clause to avoid the ambiguity between the partition values in the order table.

Advantages of Teradata Partition by

Below are the advantages:

  • If partitioning is defined, then full scans of the tables can be avoided improving the query time.
  • Usage of the secondary index can be avoided, which requires additional query time.
  • We can access a part of a subset of the large table quickly if partitions are defined.
  • The data can be dropped quickly, as well.
  • The reason being, we the PE, know where the data resides and eventually, the scan time is reduced.

Conclusion

Partitions are defined during the CREATE TABLE statement Partitioning allows us to distribute the data in a table across various AMP’s based on the selected field and the selected logic so that the retrieval process isn’t time-consuming. The query can be optimized using the partition information by the Parsing engine.

Recommended Articles

This is a guide to Teradata Partition by. Here we discuss the introduction and how to use partition by Teradata. You may also have a look at the following articles to learn more –

  1. What is Teradata?
  2. Career In Teradata
  3. Database Parallelism
  4. Teradata CASE Statement | How to Use?
  5. Teradata Architecture | Components
  6. PARTITION BY in SQL | How to Implement?
Popular Course in this category
Data Visualization Training (15 Courses, 5+ Projects)
  15 Online Courses |  5 Hands-on Projects |  105+ Hours |  Verifiable Certificate of Completion
4.5
Price

View Course
0 Shares
Share
Tweet
Share
Primary Sidebar
Footer
About Us
  • Blog
  • Who is EDUCBA?
  • Sign Up
  • Live Classes
  • Corporate Training
  • Certificate from Top Institutions
  • Contact Us
  • Verifiable Certificate
  • Reviews
  • Terms and Conditions
  • Privacy Policy
  •  
Apps
  • iPhone & iPad
  • Android
Resources
  • Free Courses
  • Database Management
  • Machine Learning
  • All Tutorials
Certification Courses
  • All Courses
  • Data Science Course - All in One Bundle
  • Machine Learning Course
  • Hadoop Certification Training
  • Cloud Computing Training Course
  • R Programming Course
  • AWS Training Course
  • SAS Training Course

ISO 10004:2018 & ISO 9001:2015 Certified

© 2022 - EDUCBA. ALL RIGHTS RESERVED. THE CERTIFICATION NAMES ARE THE TRADEMARKS OF THEIR RESPECTIVE OWNERS.

EDUCBA
Free Data Science Course

SPSS, Data visualization with Python, Matplotlib Library, Seaborn Package

*Please provide your correct email id. Login details for this Free course will be emailed to you

By signing up, you agree to our Terms of Use and Privacy Policy.

EDUCBA Login

Forgot Password?

By signing up, you agree to our Terms of Use and Privacy Policy.

EDUCBA
Free Data Science Course

Hadoop, Data Science, Statistics & others

*Please provide your correct email id. Login details for this Free course will be emailed to you

By signing up, you agree to our Terms of Use and Privacy Policy.

EDUCBA

*Please provide your correct email id. Login details for this Free course will be emailed to you

By signing up, you agree to our Terms of Use and Privacy Policy.

Let’s Get Started

By signing up, you agree to our Terms of Use and Privacy Policy.

This website or its third-party tools use cookies, which are necessary to its functioning and required to achieve the purposes illustrated in the cookie policy. By closing this banner, scrolling this page, clicking a link or continuing to browse otherwise, you agree to our Privacy Policy

Loading . . .
Quiz
Question:

Answer:

Quiz Result
Total QuestionsCorrect AnswersWrong AnswersPercentage

Explore 1000+ varieties of Mock tests View more