Definition of Linux Join
When data is spread across several files, it often becomes necessary to join two files that each contain part of the same data. The join command does exactly this: it combines two files so that the output is more complete and easier to make sense of. The join happens on a key that is present in both files; by default, the first column is taken as the key. For example, suppose there are two files, one holding a list of employees and the other their addresses. Join in Linux comes in handy for exactly this kind of situation!
Syntax:
The basic syntax attached to the join is:
join [OPTION]… FILE1 FILE2
where FILE1 and FILE2 are the files whose contents are to be joined, and OPTION denotes the various options discussed here, which help achieve the desired result.
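As a minimal sketch of the basic form, the employee and address scenario can be tried with two small files. The file names and their contents are hypothetical, made up purely for illustration, and the locale is forced to C so that the ordering is predictable.

```shell
# Hypothetical sample data: employees and addresses, both keyed by an ID
# in the first column and already sorted on that column.
export LC_ALL=C
workdir=$(mktemp -d)
printf '%s\n' '101 Alice' '102 Bob' '103 Carol' > "$workdir/employees.txt"
printf '%s\n' '101 London' '102 Paris' '104 Berlin' > "$workdir/addresses.txt"

# Join on the first field of each file (the default key).
basic_join=$(join "$workdir/employees.txt" "$workdir/addresses.txt")
printf '%s\n' "$basic_join"
# 101 Alice London
# 102 Bob Paris
rm -rf "$workdir"
```

IDs 103 and 104 each appear in only one file, so they are left out of the default output.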
1. -a <FILE NUMBER> option: Also print the unpaired lines from the given file
Syntax:
join <FILE_1> <FILE_2> -a 1
2. -v <FILE NUMBER> option: Print ONLY the unpaired lines from the given file
Syntax:
join <FILE_1> <FILE_2> -v 1
3. Join custom columns from 2 files
Syntax:
join -1 <Column for join in File 1> -2 <Column for join in File 2> <FILE_1> <FILE_2>
4. -i / --ignore-case option: Case-insensitive join
Syntax:
join -i <FILE_1> <FILE_2>
OR
join --ignore-case <FILE_1> <FILE_2>
5. --check-order / --nocheck-order: Check (or skip checking) that all input lines are sorted
Syntax:
join --nocheck-order <FILE_1> <FILE_2>
OR
join --check-order <FILE_1> <FILE_2>
6. --help option: Display the help message
Syntax:
join --help
How does Join work in Linux?
Join in Linux has many uses. This section looks at the most common ones and explains how each of them works.

The first and foremost use is the basic join, where the intent is simply to join 2 files through a common key. The key is also referred to as the index, and it is the common ground on which the contents of the 2 files are matched. Think of a sports tournament: teams can only be compared on some common ground, be it goals scored in soccer or runs scored in cricket. In the same way, the 2 files are compared, and wherever the index matches, the contents corresponding to that index are copied into a single output line, separated by a space. Watch for stray spaces or empty fields in the input, since they affect how each line is split into fields and therefore what ends up in the joined line.
Next, the index may be missing from one of the files. The user may then choose to include the non-paired lines in the join, so that the result is something like a union of the files and contains the "best of both worlds". For this, the user can use "-a <FILE NUMBER>", where the file number identifies the file whose non-paired rows are included in the join. In other situations, the user might need to see only the non-paired rows, and for that the "-v <FILE NUMBER>" option is used. In short, -a adds the non-paired lines to the normal output, while -v prints them instead of the normal output.
By default, the index is always the first field, that is, the first run of characters not broken by a space. In some scenarios, users want to join on a different field, for example the third field instead of the first. In this scenario, -1 <field number> and -2 <field number> select the join field in file 1 and file 2 respectively. If the position of the field is the same in both files, the two options can be replaced by -j <field number>.
Also, one must be aware of the fact that join in Linux is case sensitive. In some scenarios, the user would like to ignore the case of the indexes used for joining. Obviously, if the index is a number, case does not matter. But if the index is alphabetic, lowercase and uppercase letters have different ASCII values, so by default join does not treat them as equal. Hence, the user can use -i to make the indexes case-insensitive during the join.
Another point to keep in mind is that both files need to be sorted on the index, or else one would need to use the option --nocheck-order so that the order is not checked. Finally, the remaining options can be listed using --help, should one feel the need to explore more of Linux join.
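Since join expects both inputs to be sorted on the index, a common approach is to sort each file on that field first. The sketch below assumes GNU sort and join and uses hypothetical file names and contents. The -k 1b,1 key tells sort to order by the first field only, ignoring leading blanks.

```shell
# Hypothetical unsorted inputs, keyed by an ID in the first column.
export LC_ALL=C
workdir=$(mktemp -d)
printf '%s\n' '103 Carol' '101 Alice' '102 Bob' > "$workdir/employees_unsorted.txt"
printf '%s\n' '102 Paris' '103 Rome' '101 London' > "$workdir/addresses_unsorted.txt"

# Sort each file on its first field before joining.
sort -k 1b,1 "$workdir/employees_unsorted.txt" > "$workdir/employees.txt"
sort -k 1b,1 "$workdir/addresses_unsorted.txt" > "$workdir/addresses.txt"

sorted_join=$(join "$workdir/employees.txt" "$workdir/addresses.txt")
printf '%s\n' "$sorted_join"
# 101 Alice London
# 102 Bob Paris
# 103 Carol Rome
rm -rf "$workdir"
```

Sorting and joining under the same locale (here LC_ALL=C) keeps the two tools in agreement about what "sorted" means.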
Examples
Let us discuss some examples of Linux Join.
Example #1
Join with printing all non-paired rows in File 2.
Syntax:
join file1.txt file2.txt -a 2
Join with printing all non-paired rows in File 1:
Syntax:
join file1.txt file2.txt -a 1
Output:
Input files:
Join with printing all non-paired rows in File 2:
Join with printing all non-paired rows in File 1:
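As a minimal sketch of the two -a commands, assume two small hypothetical files in which key 103 exists only in file1.txt and key 104 only in file2.txt. The options are placed before the file names here, matching the basic syntax.

```shell
# Hypothetical inputs: 103 is only in file1.txt, 104 is only in file2.txt.
export LC_ALL=C
workdir=$(mktemp -d)
printf '%s\n' '101 Alice' '102 Bob' '103 Carol' > "$workdir/file1.txt"
printf '%s\n' '101 London' '102 Paris' '104 Berlin' > "$workdir/file2.txt"

# Paired lines plus the non-paired lines of file 2.
with_unpaired_2=$(join -a 2 "$workdir/file1.txt" "$workdir/file2.txt")
printf '%s\n' "$with_unpaired_2"
# 101 Alice London
# 102 Bob Paris
# 104 Berlin

# Paired lines plus the non-paired lines of file 1.
with_unpaired_1=$(join -a 1 "$workdir/file1.txt" "$workdir/file2.txt")
printf '%s\n' "$with_unpaired_1"
# 101 Alice London
# 102 Bob Paris
# 103 Carol
rm -rf "$workdir"
```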
Example #2
Join with printing ONLY non-paired rows in File 2:
Syntax:
join file1.txt file2.txt -v 2
Join with printing ONLY non-paired rows in File 1:
Syntax:
join file1.txt file2.txt -v 1
Output:
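A rough sketch of what the two -v commands produce, assuming hypothetical files in which key 103 exists only in file1.txt and key 104 only in file2.txt:

```shell
# Hypothetical inputs: 103 is only in file1.txt, 104 is only in file2.txt.
export LC_ALL=C
workdir=$(mktemp -d)
printf '%s\n' '101 Alice' '102 Bob' '103 Carol' > "$workdir/file1.txt"
printf '%s\n' '101 London' '102 Paris' '104 Berlin' > "$workdir/file2.txt"

# Only the non-paired lines of file 2; the paired lines are suppressed.
only_unpaired_2=$(join -v 2 "$workdir/file1.txt" "$workdir/file2.txt")
printf '%s\n' "$only_unpaired_2"
# 104 Berlin

# Only the non-paired lines of file 1.
only_unpaired_1=$(join -v 1 "$workdir/file1.txt" "$workdir/file2.txt")
printf '%s\n' "$only_unpaired_1"
# 103 Carol
rm -rf "$workdir"
```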
Example #3
Syntax:
When the join column is at a different position in each file:
join file1.txt file2.txt -1 2 -2 1
When the join column is at the same position in both files:
join file1.txt file2.txt -j 2
Output:
When the join column is at a different position in each file:
When the join column is at the same position in both files:
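The two variants can be sketched with hypothetical files where the ID sits in the second column of file1.txt and file3.txt but in the first column of file2.txt. The file names and contents are assumptions for illustration; -j is shown as supported by GNU join.

```shell
# Hypothetical inputs, each sorted on its own join column.
export LC_ALL=C
workdir=$(mktemp -d)
printf '%s\n' 'Alice 101' 'Bob 102' 'Carol 103' > "$workdir/file1.txt"    # ID in column 2
printf '%s\n' '101 London' '102 Paris' '103 Rome' > "$workdir/file2.txt"  # ID in column 1
printf '%s\n' 'London 101' 'Paris 102' 'Rome 103' > "$workdir/file3.txt"  # ID in column 2

# Different positions: column 2 of file 1 against column 1 of file 2.
different_cols=$(join -1 2 -2 1 "$workdir/file1.txt" "$workdir/file2.txt")
printf '%s\n' "$different_cols"
# 101 Alice London
# 102 Bob Paris
# 103 Carol Rome

# Same position in both files: -j 2 is shorthand for -1 2 -2 2.
same_cols=$(join -j 2 "$workdir/file1.txt" "$workdir/file3.txt")
printf '%s\n' "$same_cols"
# 101 Alice London
# 102 Bob Paris
# 103 Carol Rome
rm -rf "$workdir"
```

In both cases the join column is printed first, followed by the remaining fields of file 1 and then those of file 2.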
Example #4
Syntax:
join -i file1.txt file2.txt
join --ignore-case file1.txt file2.txt
Output:
When the -i option is not used, the join returns nothing, because indexes that differ only in case are not treated as a match.
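A minimal sketch of this behavior, assuming hypothetical files whose keys differ only in case (lowercase in file1.txt, uppercase in file2.txt):

```shell
# Hypothetical inputs: the same names, but in different case.
export LC_ALL=C
workdir=$(mktemp -d)
printf '%s\n' 'alice Engineering' 'bob Sales' > "$workdir/file1.txt"
printf '%s\n' 'ALICE London' 'BOB Paris' > "$workdir/file2.txt"

# Without -i the keys never match, so nothing is printed.
plain_join=$(join "$workdir/file1.txt" "$workdir/file2.txt")

# With -i the keys are compared case-insensitively.
nocase_join=$(join -i "$workdir/file1.txt" "$workdir/file2.txt")
printf '%s\n' "$nocase_join"
# alice Engineering London
# bob Sales Paris
rm -rf "$workdir"
```

The key in the output is taken from file 1, which is why it appears in lowercase here.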
Example #5
Syntax:
Without either order option:
join -i file1.txt file2.txt
Using the --check-order option:
join -i --check-order file1.txt file2.txt
Using the --nocheck-order option:
join -i --nocheck-order file1.txt file2.txt
Output:
With neither option, join reports an error about the unsorted input, whereas with --nocheck-order no such error is reported and the unsorted line is simply omitted from the output. With --check-order, join stops at the point where the index is out of order, and no further joining happens.
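The difference between the two options can be sketched as follows, assuming GNU coreutils join and a hypothetical file2.txt that is deliberately out of order. Only the exit status of the --check-order run is recorded, since the exact wording of the error message may vary between versions.

```shell
# Assumes GNU coreutils join; all file names and contents are hypothetical.
export LC_ALL=C
workdir=$(mktemp -d)
printf '%s\n' '101 Alice' '102 Bob' '103 Carol' > "$workdir/file1.txt"
# file2.txt is deliberately unsorted: key 102 comes after key 103.
printf '%s\n' '101 London' '103 Rome' '102 Paris' > "$workdir/file2.txt"

# --check-order treats the unsorted line as a fatal error (non-zero exit status).
strict_status=0
join --check-order "$workdir/file1.txt" "$workdir/file2.txt" > /dev/null 2>&1 || strict_status=$?

# --nocheck-order skips the check; the out-of-place key 102 is silently left unpaired.
lenient_join=$(join --nocheck-order "$workdir/file1.txt" "$workdir/file2.txt")
printf '%s\n' "$lenient_join"
# 101 Alice London
# 103 Carol Rome
rm -rf "$workdir"
```

Note that the row for key 102 is lost without any warning under --nocheck-order, which is a good reason to sort the inputs rather than to suppress the check.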
Example #6
Syntax:
join --help
Output:
Conclusion
With this set of examples and the explanation of how join works in Linux, you should now be comfortable using it, which will enable you to experiment with the other options of Linux join and learn more!