Introduction to Split Function in Python
A string is a sequence of characters, where each character can be a letter, a digit, a whitespace character, or a special symbol. Strings in Python are immutable, which means that once created, they cannot be modified. Operations on a string therefore return a new object instead of changing the original. Python provides the built-in class “str” for string handling and manipulation. One such operation is splitting a large string into smaller chunks, or substrings, and the “str” class provides the built-in function split() for this purpose. Splitting is the opposite of concatenation: concatenation joins small strings to form a larger string, whereas split() breaks a large string into smaller substrings.
string.split(sep, maxsplit)
The split() function is a method of the string object: it is called on a string, takes two optional parameters, and returns a list of substrings as the output.
- Delimiter (sep): Optional parameter that specifies where the string is split. If it is omitted, the string is split on runs of whitespace characters (spaces, tabs, newlines). If a delimiter is provided, the string is split at each occurrence of that delimiter.
- Max splits (maxsplit): Optional parameter that specifies the maximum number of splits. The default is -1, which means there is no limit on the number of splits.
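As a quick illustration of the two parameters, the following sketch compares the default whitespace behaviour with an explicit delimiter and a limited number of splits. The sample strings are made up for this illustration and are not part of the programs further below.

```python
# Illustrative strings, assumed only for this sketch.
text = "  red   green\tblue\n"

# No delimiter: runs of whitespace count as one delimiter,
# and leading/trailing whitespace is ignored.
default_parts = text.split()

# Explicit single-space delimiter: every space is a split point,
# so two consecutive spaces produce an empty string.
space_parts = "a  b".split(" ")

# The second argument limits the number of splits; the remainder stays intact.
limited_parts = "a,b,c,d".split(",", 1)

print(default_parts)  # ['red', 'green', 'blue']
print(space_parts)    # ['a', '', 'b']
print(limited_parts)  # ['a', 'b,c,d']
```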
Why is the split() Function Useful?
A typical example is reading data from a CSV (comma-separated values) file. When a text file contains values that are separated by commas or some other delimiter, split() can be used to break each line into its individual values.
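A minimal sketch of this idea is shown here, assuming a single made-up record with three comma-separated fields. It is a naive approach: fields that contain quoted commas are better handled by the standard library csv module.

```python
# A made-up CSV record, assumed for illustration only.
line = "Asha,23,Pune\n"

# Remove the trailing newline, then split at every comma.
fields = line.strip().split(",")
name, age, city = fields

print(fields)        # ['Asha', '23', 'Pune']
print(int(age) + 1)  # 24
```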
Examples of Split Function in Python
Given below are the examples of Split Function in Python:
Python program to illustrate the functioning of the split() function.
# split() taking different delimiters
# 1 --> no delimiter specified
inp1 = 'A E I O U'
print("String will split at whitespace :", inp1.split())
print("list of vowels :")
for i in inp1.split():
    print(i)
# 2 --> delimiter is comma and a space (, )
inp2 = 'J&K, Puducherry, Delhi, Andaman and Nicobar, Chandigarh, Dadra and Nagar Haveli, Daman and Diu, Lakshadweep, Ladakh'
ut = inp2.split(', ')
print("Number of Union Territories :", len(ut))
print("UT is listed as below :")
for u in ut:
    print(u)
# 3 --> delimiter is |
inp3 = 'java|python|c++|scala|julia'
print("String will split at '|' :", inp3.split('|'))
print("Different programming languages are :")
for lang in inp3.split('|'):
    print(lang)
# 4 --> splitting string at every 4th character (uses slicing, not split())
inp4 = 'fourfivenine'
print([inp4[i: i+4] for i in range(0, len(inp4), 4)])
In the above program, the split() function is called on three input strings with different delimiters, and the optional max splits parameter takes its default value of -1. The fourth string is divided with slicing instead, because split() cannot cut a string at fixed positions.
1. The first delimiter is the default one, i.e. whitespace.
Input string: A string containing different vowels separated by spaces.
Output: List containing the vowels, which are then traversed using a for loop.
2. The second delimiter is a comma followed by a space, i.e. ‘, ’
Input string: String containing the Union Territories of India separated by a comma and a space.
Output: List containing the different UTs. Each UT is then printed using a for loop.
3. The third delimiter is the pipe character, i.e. |
Input string: String containing different programming languages separated by the delimiter |
Output: List of the different languages.
4. The final string ‘fourfivenine’ is split after every fourth character. This split is done using a list comprehension over slices of the string.
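The fixed-width split in case 4 can be wrapped in a small helper. The function name chunks below is hypothetical and chosen only for this sketch; it also shows that the last piece may be shorter when the length is not a multiple of the chunk size.

```python
def chunks(text, size):
    """Return consecutive slices of text, each at most size characters long."""
    return [text[i:i + size] for i in range(0, len(text), size)]

print(chunks("fourfivenine", 4))  # ['four', 'five', 'nine']
print(chunks("abcdefg", 3))       # ['abc', 'def', 'g']
```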
The split() function with the optional parameter ‘max splits’.
# split() function with different values for max splits
print("split() with default max split value, i.e. -1 :")
inp1 = 'Java@Python@C++@Scala@Julia'
print("Input string will split at @ :", inp1.split('@'))
print("number of splits (0):", inp1.split('@', 0))
print("number of splits (1):", inp1.split('@', 1))
print("number of splits (2):", inp1.split('@', 2))
print("number of splits (3):", inp1.split('@', 3))
print("number of splits (4):", inp1.split('@', 4))
print("number of splits (5):", inp1.split('@', 5))
print("Split using for loop")
for i in range(6):
    print("number of splits", i, inp1.split('@', i))
In the above program, split(sep=‘@’, maxsplit=<different values>) is used.
- First, the input string is split at ‘@’ with the default value for max splits, i.e. -1. This means all possible splits are returned.
- The string is then split at ‘@’ with different maxsplit values, i.e. 0, 1, 2, 3, 4, 5. With 0, the whole string is returned as a single-element list; with 4 or 5, the result is the same as the default, because the string contains only four ‘@’ characters.
- The same program can be written in a concise manner using a for loop.
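One common use of a limited split is separating a key from a value that may itself contain the delimiter. The following is a minimal sketch; the configuration line is hypothetical and used only for illustration.

```python
# Hypothetical "key=value" line; the value itself contains '='.
line = "url=https://example.com/?q=split"

# A limit of 1 splits only at the first '=', leaving the rest untouched.
key, value = line.split("=", 1)

print(key)    # url
print(value)  # https://example.com/?q=split

# Without the limit, the value would be broken apart as well.
print(line.split("="))  # ['url', 'https://example.com/?q', 'split']
```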
Calculating the sum of marks scored by a student.
# calculating sum of marks of a student
# input string containing marks scored by a student in five different subjects
marks = '95, 92, 82, 92, 98'
# named total to avoid shadowing the built-in sum()
total = 0
# input string will split at ', '
list_marks = marks.split(', ')
print("Marks scored by student :", list_marks)
for mark in list_marks:
    print(mark)
# calculating total marks
for mark in list_marks:
    total += int(mark)
print("total marks scored by student :", total)
- The input string contains the marks obtained by the student in five different subjects, separated by the delimiter ‘, ’
- The input string is then split at ‘, ’ using the split() function.
- The output of split() is a list in which each element holds the marks for a single subject, still as a string.
- The output list is traversed using a for loop.
- Each element is converted to an integer, and the sum of the marks is then calculated and printed as output.
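The same steps can also be written more compactly. This is an alternative sketch, not the program above: it splits on the comma alone, strips any surrounding spaces, and uses the built-in sum().

```python
marks = '95, 92, 82, 92, 98'

# Split at each comma, strip stray spaces, and convert each piece to int.
scores = [int(m.strip()) for m in marks.split(',')]
total = sum(scores)

print(scores)  # [95, 92, 82, 92, 98]
print(total)   # 459
```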
The Python split() function is used to split a given input string into substrings based on a delimiter. The delimiter can be any string, including a sequence of several characters. If no delimiter is provided, whitespace is used as the default delimiter. The number of splits can also be limited through the optional ‘max splits’ parameter of split().
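As a short illustration of a multi-character delimiter, the sketch below splits a made-up sentence at the text ‘ and ’ and then rebuilds it with join(), which performs the opposite operation.

```python
# Made-up sentence, assumed for illustration only.
sentence = "tea and coffee and juice"

# The delimiter is the whole text ' and ', not a single character.
drinks = sentence.split(" and ")
print(drinks)  # ['tea', 'coffee', 'juice']

# join() concatenates the pieces again with the same delimiter.
rebuilt = " and ".join(drinks)
print(rebuilt == sentence)  # True
```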
This has been a guide to the split() function in Python, covering the basic concept, its parameters, why it is useful, and examples.