Text files that contain independent records of data are often candidates for sorting. A predictable record order makes life easier for human users: book indexes, dictionaries, and telephone directories have little value if they are unordered.

Like awk, cut, and join, sort views its input as a stream of records made up of variable-width fields, with records delimited by newline characters and fields delimited by whitespace or a user-specified single character.

Sorting by Lines

In the simplest case, with no options, complete records are sorted according to the collation order defined by the current locale. In the C (POSIX) locale, that is plain ASCII byte order.

$ cat /etc/group | sort
adm:x:4:root,adm,daemon
atcan:x:42000:
atcpu:x:43000:
atlas:x:1307:
atprd:x:41000:
atsgm:x:30000:
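
Because the default order depends on the locale, the same input can sort differently from one system to another. The following is a minimal sketch, using invented sample words rather than a real file, that forces the C locale to get byte order:

```shell
# Sample words are invented for illustration.
# LC_ALL=C forces byte (ASCII) order, where uppercase letters sort before lowercase.
c_order=$(printf 'banana\nApple\ncherry\nBlueberry\n' | LC_ALL=C sort)
printf '%s\n' "$c_order"
```

In the C locale the two capitalized words come first. In a locale such as en_US.UTF-8, the same input typically sorts without regard to case.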

Several options control how sort works:

-b ignore leading whitespace
-d dictionary order: only alphanumerics and whitespace are significant
-g compare as general numeric values, including floating point
-f fold lowercase letters to uppercase, so that the comparison ignores case
-i ignore nonprintable characters
-k define the sort key field
-m merge already sorted input files into a sorted output stream
-n compare fields as numbers
-o outfile write the output to outfile instead of standard output
-r reverse the sort order to descending, rather than the default ascending
-t char use the single character char as the field separator, instead of the default of whitespace
-u unique records only: discard all but the first record in a group with equal keys
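
The difference between the default comparison and -n is easiest to see on a few numbers. This is a small sketch with invented input; LC_ALL=C is set only to keep the lexical result predictable:

```shell
# Invented sample numbers, one per line.
nums=$(printf '10\n9\n100\n')

# Default: compared character by character, so "100" sorts before "9".
lexical=$(printf '%s\n' "$nums" | LC_ALL=C sort)

# -n: compared by numeric value.
numeric=$(printf '%s\n' "$nums" | LC_ALL=C sort -n)

# -n and -r combined: numeric, descending.
reversed=$(printf '%s\n' "$nums" | LC_ALL=C sort -nr)

printf 'lexical:\n%s\nnumeric:\n%s\nreversed:\n%s\n' "$lexical" "$numeric" "$reversed"
```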

Sorting by Fields

The -k option lets you specify the field to sort on, and the -t option lets you choose the field delimiter.

For example, to sort /etc/passwd by UID:

$ sort -t: -k3 -n /etc/passwd
root:x:0:0:root:/root:/bin/bash
bin:x:1:1:bin:/bin:/sbin/nologin
daemon:x:2:2:daemon:/sbin:/sbin/nologin
adm:x:3:4:adm:/var/adm:/sbin/nologin
lp:x:4:7:lp:/var/spool/lpd:/sbin/nologin
sync:x:5:0:sync:/sbin:/bin/sync
shutdown:x:6:0:shutdown:/sbin:/sbin/shutdown
halt:x:7:0:halt:/sbin:/sbin/halt

Here -t: sets the field delimiter to a colon, -k3 starts the sort key at the third field, and -n compares the key numerically.
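
A key written as -k3 runs from field 3 to the end of the line. Writing -k3,3 limits it to field 3 alone, which is usually what is intended. A minimal sketch on invented records in /etc/passwd layout (the names and IDs are made up):

```shell
# Invented sample records: name:password:UID:GID
sample='carol:x:1002:100
root:x:0:0
alice:x:1000:100
daemon:x:2:2'

# -t: sets the delimiter; -k3,3 restricts the key to field 3; the n suffix compares it numerically.
by_uid=$(printf '%s\n' "$sample" | sort -t: -k3,3n)
printf '%s\n' "$by_uid"
```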

The following options are useful for field sorting. The ordering letters (b, d, g, f, i, n, r) can also be appended to a -k key definition, as in -k3n, so that they apply to that key only.

-b ignore leading whitespace
-d dictionary order: only alphanumerics and whitespace are significant
-g compare as general numeric values, including floating point
-f fold lowercase letters to uppercase, so that the comparison ignores case
-i ignore nonprintable characters
-k define the sort key field
-n compare fields as numbers
-r reverse the sort order to descending, rather than the default ascending
-t char use the single character char as the field separator, instead of the default of whitespace

You can also sort on two or more fields in a single sort command.

For example, in /etc/passwd, field 4 is the GID and field 3 is the UID, so to sort the user list by GID and then by UID:

$ sort -t: -k4n -k3n  /etc/passwd
root:x:0:0:root:/root:/bin/bash
sync:x:5:0:sync:/sbin:/bin/sync
shutdown:x:6:0:shutdown:/sbin:/sbin/shutdown
halt:x:7:0:halt:/sbin:/sbin/halt
operator:x:11:0:operator:/root:/sbin/nologin
bin:x:1:1:bin:/bin:/sbin/nologin
daemon:x:2:2:daemon:/sbin:/sbin/nologin
adm:x:3:4:adm:/var/adm:/sbin/nologin
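
The same two-key sort can be tried without touching a real system file. This sketch uses invented records and limits each key to a single field with the -kN,N form; ties on the first key (GID) are broken by the second (UID):

```shell
# Invented sample records: name:password:UID:GID
sample='alice:x:1000:100
root:x:0:0
bob:x:1001:50
sync:x:5:0
carol:x:1002:100'

# Primary key: field 4 (GID), numeric. Secondary key: field 3 (UID), numeric.
by_gid_uid=$(printf '%s\n' "$sample" | sort -t: -k4,4n -k3,3n)
printf '%s\n' "$by_gid_uid"
```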

Note: the -u option also works with -k. When -u is used with keys, only the sort key fields are compared, so records that differ elsewhere still count as duplicates, and only one record per key remains after sorting.

Sorting by Month of the Year

The -M option lets you sort by month name. Here is an example:

$ ls -l | sort -M -k6
-rw-r--r-- 1 root root  2037808913 Feb 22 03:06 00000000FE33877940FF9559E7C493BB4AD1
-rw-r--r-- 1 root root   141480968 Mar 11  2014 00000001CADACF494B9986871D12C5E17120
-rw-r--r-- 1 root root    32940072 Mar 10  2014 000000062AEA473449FDBB19BED670DBAEA5
-rw-r--r-- 1 root root     7487891 Mar  3 20:07 00000001C2D995314EBE99C9489E0BF55EAB
-rw-r--r-- 1 root root     1064400 Jun 30 05:02 000000064F3C03F747AD85E06CE15568E347
-rw-r--r-- 1 root root   154124518 Jun 18 07:44 00000008B90FE28947DB839FDC0662373B81
-rw-r--r-- 1 root root    16206487 Jun  5 00:14 00000001EDE323304A8F8C49B093B25F3F76
-rw-r--r-- 1 root root      355282 Jun  9 14:44 000000027E8E5CD44CD6BBB1C614EB5B6082
-rw-r--r-- 1 root root     1715712 Jul 14  2014 000000079DA742534F0EBF12F24FDA412597
-rw-r--r-- 1 root root     4438469 Jul  9 14:49 00000001BC901B514E6697B58B174D18B884
-rw-r--r-- 1 root root      444401 Aug 17 01:12 00000002E78CF1D74E9AB66EF415CAAFDF8E

The ls -l output is now sorted by month of the year. You can add further keys to refine the order, for example by month and then by day:

ls -l | sort  -k6M -k7n
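
The month-then-day ordering can be checked on a small invented list of dates. This is a sketch that assumes a sort with the -M extension (GNU and BSD sort have it; POSIX does not require it), and sets LC_ALL=C so that the English month abbreviations are recognized:

```shell
# Invented sample dates: month abbreviation, then day of month.
dates='Mar 3
Jan 15
Feb 22
Jan 2'

# Key 1: field 1 compared as a month name. Key 2: field 2 compared numerically.
by_date=$(printf '%s\n' "$dates" | LC_ALL=C sort -k1,1M -k2,2n)
printf '%s\n' "$by_date"
```

Without the second key, the two January lines would fall back to a comparison of the whole line, which places "Jan 15" before "Jan 2".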

To sort files by year, month, and day:

ls -l --time-style=+%b%t%d%t%Y | sort -k8n -k6M -k7n 

Of course, you can get the same result by running ls with the right options; the point here is to show how sort can do the same work. This is useful when you have a large text file to sort that contains many date-formatted fields.

Removing Duplicate Lines

The -u option eliminates duplicate records.

sort -u ./filea 

Be careful when using -u with -k: it eliminates records based on matching keys rather than matching records. In that case, it is safer to use the uniq command:

sort -k2 ./filea | uniq
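
The difference between the two approaches shows up on a few invented records in which the second field repeats. This is a sketch only; the data and variable names are made up for illustration:

```shell
# Invented sample records: an ID followed by a name.
records='1 apple
2 apple
1 apple
3 pear'

# -u with a key keeps one record per distinct value of field 2.
unique_keys=$(printf '%s\n' "$records" | sort -k2,2 -u)

# sort | uniq removes only lines that are identical in full.
unique_lines=$(printf '%s\n' "$records" | sort -k2,2 | uniq)

key_count=$(printf '%s\n' "$unique_keys" | wc -l)
line_count=$(printf '%s\n' "$unique_lines" | wc -l)
printf 'sort -u with key: %d records\n' "$key_count"
printf 'sort then uniq:   %d records\n' "$line_count"
```

With the key-based -u, "2 apple" is treated as a duplicate of "1 apple" and one of them is dropped. With uniq, only the repeated "1 apple" line is removed.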
