uniq is short for "unique". It finds unique or duplicated lines in its input. It is usually paired with the
sort command because it does not detect repeated lines unless they are adjacent.
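Here is a minimal sketch of why the sort step matters. The three-line input is made up for illustration and is not taken from the files used later in this post:

```shell
# Hypothetical input: "b" appears twice, but the two copies are not adjacent.
without_sort=$(printf 'b\na\nb\n' | uniq)
with_sort=$(printf 'b\na\nb\n' | sort | uniq)

echo "$without_sort"   # all three lines survive: b, a, b
echo "$with_sort"      # the duplicate is collapsed: a, b
```

Without sort, the two b lines are separated by a, so uniq treats them as different runs and keeps both.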
Some important options are listed below:
-i: Do case-insensitive comparisons.
-c: Count the occurrences of lines.
-u: Print only unique lines; -d: print only duplicated ones.
-w N: Limit the comparison to the first N characters of each line.
-s N: Skip the first N characters when comparing.
-f N: Skip the first N fields when comparing.
More on Google.
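The comparison options can be tried on tiny inputs. This is only a sketch: every input line below is hypothetical, and it assumes GNU uniq, since -w is a GNU extension and may be missing from other implementations:

```shell
# All inputs are hypothetical and already ordered so equal keys are adjacent.

# -i: "Red" and "red" compare equal; the first line of each group is printed.
ignore_case=$(printf 'Red\nred\nblue\n' | uniq -i)

# -w 3 (GNU extension): compare only the first 3 characters.
first_chars=$(printf 'red apple\nred rose\nblue sky\n' | uniq -w 3)

# -s 2: skip the first 2 characters before comparing.
skip_chars=$(printf '1-red\n2-red\n3-blue\n' | uniq -s 2)

# -f 1: skip the first blank-separated field before comparing.
skip_fields=$(printf 'x red\ny red\nz blue\n' | uniq -f 1)

# Print each result followed by a separator line.
printf '%s\n---\n' "$ignore_case" "$first_chars" "$skip_chars" "$skip_fields"
```

In every case uniq prints the first line of each group in full; the options change only what is compared, not what is printed.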
A simple example shows how it works. Say I have a sorted text file, colors.txt:
Green is the favorite of David.
To count the occurrences of each color, you can try this:
uniq -c -w 5 colors.txt
Here is the console output:
2 Green is the favorite of David.
The number before each line is how many times it occurred.
When a file has many lines, the uniq command can pull out useful summary statistics this way.
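One common way to turn those counts into a ranking is to sort the output of uniq -c numerically. The data below is invented for illustration:

```shell
# Hypothetical data: red x3, blue x2, green x1, in arbitrary order.
ranking=$(printf 'red\nblue\nred\ngreen\nblue\nred\n' | sort | uniq -c | sort -rn)

echo "$ranking"   # most frequent first: red, then blue, then green
```

The first sort groups equal lines so uniq -c can count them; the second sort -rn orders the counted lines from highest to lowest.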
Note that the part of a line skipped during comparison is not removed from the output, but you can use other tools to strip it.
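As a sketch of that idea, cut can drop the skipped prefix after uniq has done its work. The numbered lines here are hypothetical:

```shell
# Hypothetical input: a 2-character prefix ("1-", "2-", ...) before each color.
kept=$(printf '1-red\n2-red\n3-blue\n' | uniq -s 2)
stripped=$(printf '1-red\n2-red\n3-blue\n' | uniq -s 2 | cut -c 3-)

echo "$kept"       # prefix still present: 1-red, 3-blue
echo "$stripped"   # prefix removed by cut: red, blue
```

cut -c 3- keeps everything from the third character onward, which matches the two characters skipped by -s 2.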
And to get unique lines or duplicated ones, use -u and -d respectively. Here are the results:
$ uniq -d -w5 colors.txt
In the example above, -w cannot be dropped when focusing on colors. In daily use, though, uniq usually works on better preprocessed lines, where -w is not so necessary.
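For instance, if the color is always the first word, one possible preprocessing step is to extract it with cut before calling uniq, so no -w is needed. The sentences below are hypothetical stand-ins, not the contents of colors.txt:

```shell
# Hypothetical sentences; the color is the first space-separated word.
colors=$(printf 'Green leaves.\nBlue sky.\nGreen grass.\n' | cut -d ' ' -f 1 | sort)

counts=$(printf '%s\n' "$colors" | uniq -c)      # Blue once, Green twice
only_once=$(printf '%s\n' "$colors" | uniq -u)   # Blue
repeated=$(printf '%s\n' "$colors" | uniq -d)    # Green

printf '%s\n' "$counts" "$only_once" "$repeated"
```

Because each line now holds only the color name, plain uniq compares whole lines and the results no longer depend on how long each color name is.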
Can you count the colors appearing in more_colors.txt, the text file below? Try