Introduction to Linux uniq
The Linux uniq command-line utility filters out repeated or duplicate records in a file. In other words, it detects adjacent duplicate lines and omits them from its output; the input data or file itself is not modified.
This utility was written by Richard M. Stallman and David MacKenzie.
uniq [OPTION]... [INPUT [OUTPUT]]
The uniq syntax is simple. In the syntax above, “INPUT” refers to the input data or input file, which contains the repeated lines that need to be filtered out. If “INPUT” is not specified, “uniq” reads from standard input. “OUTPUT” refers to the output file in which the filtered data generated by the “uniq” command is stored. If “OUTPUT” is not specified, the result is written to standard output.
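The two ways of supplying input can be sketched as follows. This is a minimal illustration, not taken from the original screenshots: the file names colors.txt and filtered.txt and their contents are made-up placeholders.

```shell
# Hypothetical sample data; file names and contents are placeholders.
workdir=$(mktemp -d)
printf 'red\nred\nblue\n' > "$workdir/colors.txt"

# INPUT and OUTPUT both given: read colors.txt, write the result to filtered.txt.
uniq "$workdir/colors.txt" "$workdir/filtered.txt"
cat "$workdir/filtered.txt"

# No INPUT given: uniq reads standard input and writes to standard output.
from_stdin=$(printf 'red\nred\nblue\n' | uniq)
echo "$from_stdin"
```

In both cases the two adjacent “red” lines collapse into one, so the result contains “red” followed by “blue”.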
How does Linux uniq Command work?
The Linux uniq command is a filter program and is usually run after sort. It reads input data or an input file that contains repeated lines. Using the different options available in uniq, we can filter the adjacent duplicate lines out of the input and write the result to the output file.
Note: uniq filters out adjacent duplicate lines only. Duplicate records are removed only when they appear one directly after another; such a run is treated as a single record. If the same record appears again elsewhere in the input file (not directly after its previous occurrence), it is printed again in the output. Lines must match each other exactly to count as duplicates, i.e. the comparison is case sensitive.
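The adjacency rule is easy to see with a small experiment. The input below is invented for illustration; LC_ALL=C is set only to make the sort order predictable across systems.

```shell
# Hypothetical input: "apple" appears twice but not on adjacent lines,
# and "Apple" differs from "apple" only by case.
input=$(printf 'apple\nbanana\napple\nApple\n')

# Plain uniq removes nothing here: no two adjacent lines are identical.
unsorted=$(printf '%s\n' "$input" | uniq)

# Sorting first brings identical lines together so uniq can collapse them.
sorted=$(printf '%s\n' "$input" | LC_ALL=C sort | uniq)

printf 'without sort:\n%s\n\nwith sort:\n%s\n' "$unsorted" "$sorted"
```

Without sort, all four lines come back unchanged. With sort, the two “apple” lines become adjacent and are merged, while “Apple” is kept because the comparison is case sensitive.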
Examples to Implement Linux uniq Command
Examples of the Linux uniq command are given below:
1. Uniq Command
The uniq command removes adjacent duplicate records from the input file.
uniq contents.txt
Explanation: We have a sample text file “contents.txt” that contains some repeated data (refer to screenshot 1). We use the uniq command to filter out the duplicate data (refer to screenshot 1).
2. Uniq Count Command
With the uniq count option, each output line is prefixed with the number of times that line was repeated in the input file or data. To count the occurrences of each record in the input file, use the “-c” option with the uniq command.
uniq -c contents.txt
Explanation: The contents.txt file has a total of 8 records, of which three are distinct. With the “-c” option, each distinct record is printed together with a count of how many times it occurred.
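As a rough sketch of what this looks like, the snippet below builds a stand-in contents.txt with 8 lines in 3 groups. The actual file contents are not shown in the text, so the words used here are assumptions.

```shell
# Hypothetical stand-in for contents.txt: 8 lines forming 3 adjacent groups.
workdir=$(mktemp -d)
printf 'hello\nhello\nhello\nwelcome\nwelcome\nthanks\nthanks\nthanks\n' \
    > "$workdir/contents.txt"

# -c prefixes each output line with its number of occurrences.
counts=$(uniq -c "$workdir/contents.txt")
echo "$counts"
```

The output has one line per group, with the counts 3, 2 and 3 in front of “hello”, “welcome” and “thanks”. GNU uniq pads the count with leading spaces, so the exact column width may differ between implementations.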
3. Uniq Repeated Command
It prints only the records that are repeated in the input file, one line for each group of adjacent duplicates. Records that occur only once are not displayed. For repeated records, we can use the “-d” option with the uniq command.
uniq -d contents.txt
Explanation: From the input file (contents.txt), only the repeated records are identified, and each of them is printed once.
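A small sketch of the “-d” behaviour, using made-up file contents in place of the original contents.txt:

```shell
# Hypothetical stand-in for contents.txt.
workdir=$(mktemp -d)
printf 'hello\nhello\nwelcome\nthanks\nthanks\nthanks\n' > "$workdir/contents.txt"

# -d prints one copy of each line that has adjacent duplicates.
repeated=$(uniq -d "$workdir/contents.txt")
echo "$repeated"
```

Here “hello” and “thanks” are printed once each, and “welcome” is left out because it occurs only once.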
4. Uniq all-repeated
With the all-repeated option, we are able to get every line that belongs to a group of adjacent duplicates in the input file; unlike “-d”, each copy is printed. For all-repeated records, we can use the “-D” option with the uniq command.
uniq -D contents.txt
Explanation: From the input file “contents.txt”, we print every occurrence of the repeated records with the help of the all-repeated functionality in uniq. Records that occur only once are not printed.
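The difference from “-d” can be sketched with the same kind of made-up data. Note that “-D” is a GNU extension, so this assumes GNU coreutils, as found on most Linux systems.

```shell
# Hypothetical stand-in for contents.txt.
workdir=$(mktemp -d)
printf 'hello\nhello\nwelcome\nthanks\nthanks\nthanks\n' > "$workdir/contents.txt"

# -D (GNU extension) prints every line that is part of a duplicate group.
all_repeated=$(uniq -D "$workdir/contents.txt")
echo "$all_repeated"
```

Both “hello” lines and all three “thanks” lines are printed, while the single “welcome” line is dropped.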
5. Uniq Unique Command
The unique option identifies the records that are not repeated in the input file, i.e. lines that have no adjacent duplicate. For unique records, we can use the “-u” option with the uniq command.
uniq -u cont.txt
Explanation: In the input file “cont.txt”, there is only one unique word, i.e. “thanks”. With the “-u” option, we are able to identify it from the input data or file.
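A minimal sketch of this case follows. Only the word “thanks” comes from the explanation above; the other lines in this stand-in cont.txt are assumed.

```shell
# Hypothetical stand-in for cont.txt; only "thanks" is known from the text.
workdir=$(mktemp -d)
printf 'hello\nhello\nthanks\nwelcome\nwelcome\n' > "$workdir/cont.txt"

# -u prints only lines that have no adjacent duplicate.
unique_only=$(uniq -u "$workdir/cont.txt")
echo "$unique_only"
```

Only “thanks” is printed, because every other line has an identical neighbour.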
6. Uniq Using -f N Command
This option makes uniq skip the first “N” fields of each line when comparing lines for uniqueness. Fields are separated by blanks (spaces or tabs).
uniq -f 3 cont.txt
Explanation: The cont.txt input file contains a number of input records (refer to screenshot 3 (a)). With “-f 3”, uniq skips the first 3 fields of each line before comparing, so adjacent lines that differ only in those fields are treated as duplicates. The “N” value depends on the requirement.
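Field skipping can be illustrated with log-style lines. The records below are invented for this sketch and are not the contents of the original cont.txt.

```shell
# Hypothetical records: date, time and host come first, then a message.
workdir=$(mktemp -d)
printf '%s\n' \
    '2024-01-01 10:00 host1 disk full' \
    '2024-01-01 10:05 host2 disk full' \
    '2024-01-02 09:00 host1 login ok' > "$workdir/cont.txt"

# -f 3 ignores the first 3 fields, so only the message part is compared.
by_message=$(uniq -f 3 "$workdir/cont.txt")
echo "$by_message"
```

The first two lines share the message “disk full”, so only the first of them is printed, followed by the “login ok” line. The skipped fields still appear in the output; they are ignored only during comparison.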
7. Uniq Using -s N Command
The -s N option is similar to the -f N option, but it skips the first “N” characters of each line rather than the first “N” fields.
uniq -s 3 cont.txt
Explanation: The cont.txt input file has multiple lines that contain special characters and numbers (refer to screenshot 1). The “-s” option skips the first “N” characters of each line during comparison, whatever they are, so a leading prefix of special characters or numbers can be ignored by choosing “N” to match its length. The lines are still printed in full; only the comparison ignores the skipped characters.
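As an illustrative sketch, the made-up lines below each start with a 3-character prefix made of digits and a special character:

```shell
# Hypothetical stand-in for cont.txt: a 3-character prefix, then a word.
workdir=$(mktemp -d)
printf '01-apple\n02-apple\n03-banana\n' > "$workdir/cont.txt"

# -s 3 skips the first 3 characters of each line when comparing.
by_suffix=$(uniq -s 3 "$workdir/cont.txt")
echo "$by_suffix"
```

With the prefixes ignored, the first two lines both compare as “apple”, so “01-apple” is kept and “02-apple” is dropped; “03-banana” is printed as well.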
8. Uniq Using -w N Command
The -w N option is the counterpart of -s N: instead of skipping characters, it limits the comparison to the first “N” characters of each line.
uniq -w 3 input.txt
Explanation: The input.txt file has 3 records that start with the same 3-letter word, i.e. “How” (refer to Screenshot 1). With the “-w” option and an “N” value of 3, uniq compares only the first 3 characters of each line, so adjacent records that share these 3 characters are treated as duplicates and only the first of them is printed (refer to Screenshot 2).
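A minimal sketch of this behaviour is shown below. Only the leading word “How” comes from the explanation above; the rest of each line is assumed. Like “-D”, the “-w” option is a GNU extension.

```shell
# Hypothetical stand-in for input.txt: three lines starting with "How".
workdir=$(mktemp -d)
printf 'How are you\nHow is it going\nHow do you do\nThanks\n' > "$workdir/input.txt"

# -w 3 (GNU extension) compares no more than the first 3 characters.
by_prefix=$(uniq -w 3 "$workdir/input.txt")
echo "$by_prefix"
```

The three “How” lines count as duplicates of each other, so only “How are you” is printed, followed by “Thanks”.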
We have covered the Linux uniq command with examples, explanations, and the output of its different options. The uniq command is a filter program: it filters the data according to the user's requirements. The filtered output can then be used in shell jobs and other application development tasks.