Updated June 19, 2023
Introduction to Split Function in Python
A string is a sequence of characters. A character can be any letter, digit, whitespace, or special symbol. Strings in Python are immutable: once you create a string, you cannot alter it in place. However, we can build new strings from it and perform various operations. Python provides a built-in class, “str,” for string handling and manipulation. One such operation is splitting a large string into smaller chunks or substrings. The “str” class provides a built-in function, split(), to facilitate this splitting operation. The split() function acts as the opposite of concatenation: while concatenation combines small strings to form a large string, split() divides a large string into smaller substrings.
str.split(sep=None, maxsplit=-1)
You call the split() function on a string object. It takes two optional parameters and returns a list of substrings as the output.
- Delimiter (sep): Optional parameter that specifies where the splitting takes place. If no delimiter is given, the string is split on runs of whitespace characters (spaces, tabs, newlines). If you provide a separator or delimiter, the string is split at each occurrence of it.
- Max splits (maxsplit): Optional parameter that specifies the maximum number of splits. The default is -1, which means there is no limit on the number of splits.
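The following is a minimal sketch of how the two parameters behave. The sample strings and variable names are made up for illustration and are not part of the examples further down.

```python
# Illustrative input; the strings here are assumptions, not from the article.
text = "  red  green\tblue\n"

# No delimiter: runs of whitespace act as one separator,
# and leading/trailing whitespace is dropped.
default_split = text.split()
print(default_split)  # ['red', 'green', 'blue']

# Explicit single-space delimiter: every space is a split point,
# so two consecutive spaces produce an empty string.
space_split = "a  b".split(" ")
print(space_split)  # ['a', '', 'b']

# maxsplit=2 performs at most two splits; the rest stays in the last element.
limited = "a,b,c,d".split(",", maxsplit=2)
print(limited)  # ['a', 'b', 'c,d']
```

Note the difference between calling split() with no delimiter and calling it with a single space: only the former collapses repeated whitespace.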
Why is Split() Function Useful?
A typical example is reading data from a CSV (comma-separated values) file. If a text file contains data separated by commas or some other delimiter, we can use split() to break each line into its individual fields.
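As a rough sketch of that idea, the snippet below splits some comma-separated text into rows and fields. The data is hypothetical, and it is held in a string rather than read from a file to keep the example self-contained.

```python
# Hypothetical CSV content; in practice the lines would come from a file.
csv_text = "name,city,score\nAsha,Delhi,91\nRavi,Ladakh,78"

# Split the text into lines first, then split each line at the commas.
rows = [line.split(",") for line in csv_text.split("\n")]
header, records = rows[0], rows[1:]

print(header)   # ['name', 'city', 'score']
print(records)  # [['Asha', 'Delhi', '91'], ['Ravi', 'Ladakh', '78']]
```

This works for simple data only. If a field can itself contain a comma inside quotes, Python's standard csv module is the safer choice.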
Examples of Split Function in Python
Given below are examples of the split() function in Python:
Python program to illustrate the functioning of the split() function.
# split() taking different delimiters

# 1 --> no delimiter specified
inp1 = 'A E I O U'
print("string will split at whitespace :", inp1.split())
print("list of vowels :")
for i in inp1.split():
    print(i)
print('\n')

# 2 --> delimiter is comma and a space (, )
inp2 = 'J&K, Puducherry, Delhi, Andaman and Nicobar, Chandigarh, Dadra and Nagar Haveli, Daman and Diu, Lakshadweep, Ladakh'
ut = inp2.split(', ')
print("Number of Union Territories :", len(ut))
print("UT is listed as below :")
for u in range(len(ut)):
    print((u+1), '-->', ut[u])
print('\n')

# 3 --> delimiter is |
inp3 = 'java|python|c++|scala|julia'
print("string will split at '|' :", inp3.split('|'))
print("Different programming languages are :")
for lang in inp3.split('|'):
    print(lang)
print('\n')

# 4 --> splitting string at every 4 character
inp4 = 'fourfivenine'
print([inp4[i: i+4] for i in range(0, len(inp4), 4)])
In the above program, the split() function is called on three different input strings with different delimiters, and the optional max splits parameter takes its default value of -1. A fourth string is divided into fixed-size chunks with slicing instead of split().
1. The first delimiter is the default one, i.e., whitespace.
Input string: A string containing different vowels separated by spaces.
Output: We traverse a list containing vowels using a for loop.
2. The second delimiter is a comma followed by a space, i.e., ‘, ’
Input string: String containing the names of Union Territories of India separated by a comma and a space.
Output: List containing the different UTs. Each UT is then printed using a for loop.
3. The third delimiter is a pipe, i.e., |
Input string: String containing different programming languages separated by delimiter |
Output: List of different languages.
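4. The final string, ‘fourfivenine’, is divided into chunks of four characters using slicing inside a list comprehension. The split() function is not used here because the string contains no delimiter.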
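The fixed-size chunking from the fourth case can be wrapped in a small helper. This is only a sketch; the function name chunk and the second sample string are assumptions made for illustration.

```python
def chunk(text, size):
    """Return consecutive slices of text, each at most size characters long."""
    return [text[i:i + size] for i in range(0, len(text), size)]


print(chunk('fourfivenine', 4))  # ['four', 'five', 'nine']

# When the length is not a multiple of size, the last chunk is shorter.
print(chunk('fourfiveten', 4))   # ['four', 'five', 'ten']
```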
Split() function with optional parameter ‘max splits.’
# split() function with different values for max splits
print("split() with default max split values i.e. -1 :")
inp1 = 'Java@Python@C++@Scala@Julia'
print("input string will split at @ :", inp1.split('@'))
print('\n')

print("number of splits (0):", inp1.split('@', 0))
print("number of splits (1):", inp1.split('@', 1))
print("number of splits (2):", inp1.split('@', 2))
print("number of splits (3):", inp1.split('@', 3))
print("number of splits (4):", inp1.split('@', 4))
print("number of splits (5):", inp1.split('@', 5))
print('\n')

print("split using for loop")
for i in range(6):
    print("number of splits", i, inp1.split('@', i))
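In the above program, split(sep=‘@’, maxsplit=<different values>) is used.
- First, we split the input string at ‘@’ using the default value for max splits, which is -1. This means that the function returns all possible splits.
- We then split the string at ‘@’ using different max splits values: 0, 1, 2, 3, 4, and 5. A value of n produces at most n + 1 substrings, with the unsplit remainder kept in the last one. The values 4 and 5 give the same result because the string contains only four delimiters.
- The last part of the program repeats the same calls more concisely using a for loop.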
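To make the effect of max splits concrete, here is a short sketch showing what a few of those calls return, followed by one common use: separating a key from a value that may itself contain the delimiter. The "key=value" line is a hypothetical input.

```python
inp1 = 'Java@Python@C++@Scala@Julia'

no_split = inp1.split('@', 0)    # zero splits: the whole string in a one-element list
two_splits = inp1.split('@', 2)  # two splits: three elements, remainder kept intact
print(no_split)    # ['Java@Python@C++@Scala@Julia']
print(two_splits)  # ['Java', 'Python', 'C++@Scala@Julia']

# Hypothetical "key=value" line where the value also contains '='.
line = "query=a=1&b=2"
key, value = line.split("=", 1)  # split only at the first '='
print(key)    # query
print(value)  # a=1&b=2
```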
Calculating the sum of marks scored by a student.
# calculating sum of marks of a student

# input string containing marks scored by a student in five different subjects
marks = '95, 92, 82, 92, 98'
total = 0  # named 'total' so the built-in sum() is not shadowed

# input string will split at ', '
list_marks = marks.split(', ')
print("marks scored by student :", list_marks)
print("student marks:")
for mark in list_marks:  # 'mark' avoids overwriting the input string 'marks'
    print(mark)
print('\n')

# calculating total marks; each element is a string, so convert it with int()
for i in range(len(list_marks)):
    total += int(list_marks[i])
print("total marks scored by student :", total)
- The input string contains the marks obtained by the student in five different subjects, separated by the delimiter ‘, ’
- The input string is then split at ‘, ’ using the split() function.
- The output of split() is a list in which each element represents single subject marks.
- We traverse the output list using a for loop.
- We then convert each mark from a string to an integer, add the values together, and print the total as output.
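The same total can be computed more compactly. The sketch below is one possible alternative, not the only way to write it; it combines split() with the built-in sum() and a generator expression.

```python
marks = '95, 92, 82, 92, 98'

# split() returns strings, so each piece is converted with int() before summing.
total = sum(int(mark) for mark in marks.split(', '))
print("total marks scored by student :", total)  # total marks scored by student : 459
```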
The Python split() function splits a given input string into substrings based on a delimiter. The delimiter can be any non-empty string, whether a single character or several characters of text. If no delimiter is provided, whitespace is used as the default delimiter. We can also limit the number of splits with the optional ‘max splits’ parameter.
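A brief sketch of these delimiter rules follows. The sample strings are assumptions chosen for illustration.

```python
# A delimiter can be several characters long.
steps = "alpha -> beta -> gamma".split(" -> ")
print(steps)  # ['alpha', 'beta', 'gamma']

# If the delimiter never occurs, the whole string comes back as one element.
unsplit = "abc".split("|")
print(unsplit)  # ['abc']

# An empty string is not accepted as a delimiter.
empty_sep_rejected = False
try:
    "abc".split("")
except ValueError:
    empty_sep_rejected = True
print(empty_sep_rejected)  # True
```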
This has been a guide to the Split Function in Python. Here we discussed the basic concept, parameters, why the split() function is useful, and examples.