comm command in Linux with examples
The 'comm' command in Linux is a powerful utility that allows you to compare two sorted files line by line, identifying the lines that are unique to each file and those that are common to both. This command is particularly useful when you have lists, logs, or data sets that need to be compared efficiently. Here, we will explore the syntax, usage, options, and examples of the 'comm' command.
What is the 'comm' Command?
The 'comm' command is used for line-by-line comparison of two sorted files. It reads two files as input and generates a three-column output by default:
- Column 1: Lines unique to the first file.
- Column 2: Lines unique to the second file.
- Column 3: Lines common to both files.
Syntax:
$comm [OPTION]... FILE1 FILE2
- 'FILE1' and 'FILE2': The sorted files to compare.
- '[OPTION]': Flags to modify the command’s output.
Example of the 'comm' Command
Suppose there are two sorted files, file1.txt and file2.txt. We will use the comm command to compare them.
Displaying contents of file1
$cat file1.txt
Apaar
Ayush Rajput
Deepak
Hemant
Displaying contents of file2
$cat file2.txt
Apaar
Hemant
Lucky
Pranjal Thakral
Now, run the comm command:
$comm file1.txt file2.txt
Output:
		Apaar
Ayush Rajput
Deepak
		Hemant
	Lucky
	Pranjal Thakral
Explanation:
- Column 1: Lines unique to 'file1.txt' ('Ayush Rajput', 'Deepak').
- Column 2: Lines unique to 'file2.txt' ('Lucky', 'Pranjal Thakral').
- Column 3: Lines common to both files ('Apaar', 'Hemant').
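The columns are separated by tab characters, which are easy to miss on screen. The following is a minimal sketch that recreates the two sample files in a scratch directory and replaces each tab with '>' so the column of every line is visible. The variable names and the use of 'mktemp' and 'tr' are illustrative choices, not part of the original example.

```shell
# Scratch directory so that no existing files are touched.
workdir=$(mktemp -d)

# Recreate the two sorted sample files from the example above.
printf '%s\n' 'Apaar' 'Ayush Rajput' 'Deepak' 'Hemant' > "$workdir/file1.txt"
printf '%s\n' 'Apaar' 'Hemant' 'Lucky' 'Pranjal Thakral' > "$workdir/file2.txt"

# LC_ALL=C makes comm compare lines bytewise, independent of the locale.
three_columns=$(LC_ALL=C comm "$workdir/file1.txt" "$workdir/file2.txt")

# Show each tab as '>': no marker = column 1, '>' = column 2, '>>' = column 3.
printf '%s\n' "$three_columns" | tr '\t' '>'

rm -rf "$workdir"
```

A line with no marker belongs to column 1, a single '>' marks column 2, and '>>' marks column 3.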
Key Options for the 'comm' command
The 'comm' command offers several options to customize its output. Here are the most useful ones:
1. '-1' (Suppress First Column)
Suppresses lines that are unique to the first file, displaying only lines unique to the second file and lines common to both.
2. '-2' (Suppress Second Column)
Suppresses lines that are unique to the second file, displaying only lines unique to the first file and lines common to both.
3. '-3' (Suppress Third Column)
Suppresses lines that are common to both files.
4. '--check-order'
Check that the input is correctly sorted, even if all input lines are pairable.
5. '--nocheck-order'
Ignores whether the input files are sorted. This can be useful when comparing unsorted files but may result in unpredictable outputs.
6. '--output-delimiter=STR' (Custom Delimiter)
Changes the default column delimiter from a tab to a specified string. This option is useful for formatting output to meet specific needs.
7. '--help'
Display a help message, and exit.
8. '--version'
Output version information, and exit.
Note: Options 4 to 8 are rarely needed, whereas options 1 to 3 are the ones you will use most often to select the output you want.
Using comm with options
1. Using the '-1', '-2' and '-3' options
These three options are easiest to explain with an example:
//suppress first column using -1//
$comm -1 file1.txt file2.txt
	Apaar
	Hemant
Lucky
Pranjal Thakral

//suppress second column using -2//
$comm -2 file1.txt file2.txt
	Apaar
Ayush Rajput
Deepak
	Hemant

//suppress third column using -3//
$comm -3 file1.txt file2.txt
Ayush Rajput
Deepak
	Lucky
	Pranjal Thakral
Note that you can also suppress multiple columns by combining these options:
//...suppressing multiple columns...//
$comm -12 file1.txt file2.txt
Apaar
Hemant
/* using -12 together suppressed both first and second columns */
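Combining two of the three options leaves exactly one column, which gives the three set-style views of the sample files. This is a small sketch under the same assumptions as the article's example. The variable names are illustrative, and 'LC_ALL=C' is added only to make the comparison independent of the locale.

```shell
# Scratch directory with the two sorted sample files.
workdir=$(mktemp -d)
printf '%s\n' 'Apaar' 'Ayush Rajput' 'Deepak' 'Hemant' > "$workdir/file1.txt"
printf '%s\n' 'Apaar' 'Hemant' 'Lucky' 'Pranjal Thakral' > "$workdir/file2.txt"

# -12 hides columns 1 and 2: only lines common to both files remain.
common=$(LC_ALL=C comm -12 "$workdir/file1.txt" "$workdir/file2.txt")

# -23 hides columns 2 and 3: only lines unique to file1.txt remain.
only_first=$(LC_ALL=C comm -23 "$workdir/file1.txt" "$workdir/file2.txt")

# -13 hides columns 1 and 3: only lines unique to file2.txt remain.
only_second=$(LC_ALL=C comm -13 "$workdir/file1.txt" "$workdir/file2.txt")

printf 'In both files:\n%s\n\n' "$common"
printf 'Only in file1.txt:\n%s\n\n' "$only_first"
printf 'Only in file2.txt:\n%s\n' "$only_second"

rm -rf "$workdir"
```

Because only one column is left in each case, the output lines carry no leading tabs and can be passed directly to other tools.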
2. Using '--check-order' option
This option checks whether the input files are sorted. If either of the two files is wrongly ordered, the comm command fails with an error message.
$comm --check-order f1.txt f2.txt
The above command produces the normal output if both f1.txt and f2.txt are sorted. If either file is not sorted, comm stops with an error message and a non-zero exit status as soon as it detects an out-of-order line.
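Since a failed order check sets a non-zero exit status, a script can test that status before trusting the result. The sketch below assumes GNU comm, which provides '--check-order'. The helper name 'check_pair' and the sample files are hypothetical, made up for this illustration.

```shell
# Scratch directory with one unsorted and one sorted file.
workdir=$(mktemp -d)
printf '%s\n' 'Pranjal' 'Kartik' > "$workdir/f1.txt"   # not sorted
printf '%s\n' 'Apaar' 'Kartik' > "$workdir/f2.txt"     # sorted

# check_pair (hypothetical helper): report whether comm accepts both
# files as sorted. Only the exit status matters, so output is discarded.
check_pair() {
    if LC_ALL=C comm --check-order "$1" "$2" > /dev/null 2>&1; then
        echo 'sorted'
    else
        echo 'not sorted'
    fi
}

unsorted_result=$(check_pair "$workdir/f1.txt" "$workdir/f2.txt")
sorted_result=$(check_pair "$workdir/f2.txt" "$workdir/f2.txt")

echo "f1.txt vs f2.txt: $unsorted_result"
echo "f2.txt vs f2.txt: $sorted_result"

rm -rf "$workdir"
```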
3. Using '--nocheck-order' option
If you do not want comm to check whether the input files are sorted, use this option. Here is an example:
//displaying contents of unsorted f1.txt//
$cat f1.txt
Pranjal
Kartik
//displaying contents of sorted file f2.txt//
$cat f2.txt
Apaar
Kartik
//now use --nocheck-order option with comm//
$comm --nocheck-order f1.txt f2.txt
	Apaar
	Kartik
Pranjal
Kartik
/*as this option forced comm not to check the sorted order, the output is not in sorted order either, and Kartik is reported as unique to each file instead of as a common line*/
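When the result has to be reliable, a common alternative to '--nocheck-order' is to sort both inputs first and compare the sorted copies. The sketch below does this for the two small files from the example above. The '.sorted' file names and the scratch directory are illustrative assumptions.

```shell
# Scratch directory with the unsorted and sorted sample files.
workdir=$(mktemp -d)
printf '%s\n' 'Pranjal' 'Kartik' > "$workdir/f1.txt"   # not sorted
printf '%s\n' 'Apaar' 'Kartik' > "$workdir/f2.txt"     # sorted

# Sort both inputs with the same collation that comm will use.
LC_ALL=C sort "$workdir/f1.txt" > "$workdir/f1.sorted"
LC_ALL=C sort "$workdir/f2.txt" > "$workdir/f2.sorted"

# Compare the sorted copies; Kartik is now paired as a common line.
sorted_output=$(LC_ALL=C comm "$workdir/f1.sorted" "$workdir/f2.sorted")

# Show each tab as '>' so the columns are visible.
printf '%s\n' "$sorted_output" | tr '\t' '>'

rm -rf "$workdir"
```

It is important that 'sort' and 'comm' use the same locale, because they must agree on what "sorted" means.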
4. '--output-delimiter=STR' option
By default, the columns in the 'comm' command output are separated by tabs, as explained above. If you prefer, you can use a string of your choice as the separator instead. This is done with the '--output-delimiter' option, which requires you to specify the string to use as the separator.
Syntax:
$comm --output-delimiter=STR FILE1 FILE2
Example:
//...comm command with --output-delimiter=STR option...//
$comm --output-delimiter=+ file1.txt file2.txt
++Apaar
Ayush Rajput
Deepak
++Hemant
+Lucky
+Pranjal Thakral
/*+ before content indicates content of second column and ++ before content indicates content of third column*/
Conclusion
In conclusion, the comm command provides a straightforward solution for text comparison tasks: identifying unique entries, finding common lines, and customizing the output format. This can greatly simplify file comparison workflows in your daily tasks.