Unix Sort By Key Unexpected Behavior
Recently I needed to sort a file of entries by the date in each entry, where the date was a human-readable string spanning several whitespace-separated fields. Each line looked like this:
ENTRY-0009      FAIL    Thu Sep  5 17:39:24 2019        PASS 7  FAIL 6
But the most obvious sort command
sort file.txt -k7 -k4M -k5 -k6
didn't produce results in the expected order (year, then month, then day, then time: 2019, Sep, 5, 17:39:24). Let me show you after the break:
The full file to reproduce this is at the end.
Here's the abridged output of the above command:
$ sort file.txt -k7 -k4M -k5 -k6

ENTRY-0041      FAIL    Sat Jul 13 16:13:35 2019        PASS 0  FAIL 1
ENTRY-0051      FAIL    Sat Jul 13 20:31:09 2019        PASS 0  FAIL 1
ENTRY-0112      FAIL    Mon Jul 15 07:50:46 2019        PASS 0  FAIL 1
ENTRY-0113      FAIL    Mon Jul 15 07:59:11 2019        PASS 0  FAIL 1
ENTRY-0024      FAIL    Mon Jul 15 08:04:43 2019        PASS 0  FAIL 1
ENTRY-0024      FAIL    Mon Jul 15 08:12:14 2019        PASS 0  FAIL 1
ENTRY-0105      FAIL    Tue Jul 16 07:30:09 2019        PASS 0  FAIL 1
ENTRY-0124      FAIL    Tue Jul 16 07:36:08 2019        PASS 0  FAIL 1
ENTRY-0252      FAIL    Sat Aug 10 10:47:59 2019        PASS 0  FAIL 1
ENTRY-0287      FAIL    Fri Sep 27 17:10:00 2019        PASS 0  FAIL 1
ENTRY-0242      FAIL    Fri Sep 27 17:12:00 2019        PASS 0  FAIL 1
ENTRY-0023      FAIL    Sun Oct 27 12:13:11 2019        PASS 0  FAIL 1
ENTRY-0018      FAIL    Thu Oct 17 15:22:08 2019        PASS 0  FAIL 2
ENTRY-0042      FAIL    Sat Oct 26 12:42:20 2019        PASS 10 FAIL 2
ENTRY-0078      FAIL    Sun Oct 27 08:01:55 2019        PASS 10 FAIL 2
ENTRY-0001      OK      Wed Sep  4 20:37:36 2019        PASS 11 FAIL 0
ENTRY-0003      OK      Wed Sep  4 20:57:44 2019        PASS 11 FAIL 0
ENTRY-0054      FAIL    Sat Oct 26 13:58:55 2019        PASS 11 FAIL 1
ENTRY-0004      OK      Thu Sep  5 17:46:48 2019        PASS 12 FAIL 0
ENTRY-0006      OK      Thu Sep  5 17:49:08 2019        PASS 12 FAIL 0
ENTRY-0016      OK      Fri Sep  6 16:42:10 2019        PASS 12 FAIL 0
ENTRY-0026      OK      Fri Sep  6 16:44:48 2019        PASS 12 FAIL 0
ENTRY-0030      OK      Fri Sep  6 16:55:26 2019        PASS 12 FAIL 0
ENTRY-0001      OK      Thu Oct 17 15:07:28 2019        PASS 12 FAIL 0
...snip...
ENTRY-0047      OK      Fri Oct 25 11:28:12 2019        PASS 12 FAIL 0
ENTRY-0060      OK      Fri Oct 25 15:15:41 2019        PASS 12 FAIL 0
ENTRY-0003      OK      Sat Oct 26 07:08:59 2019        PASS 12 FAIL 0
ENTRY-0059      OK      Sat Oct 26 08:59:57 2019        PASS 12 FAIL 0
...snip...
ENTRY-0078      OK      Sun Oct 27 08:03:14 2019        PASS 12 FAIL 0
ENTRY-0162      OK      Wed Jul 17 09:02:36 2019        PASS 16 FAIL 0
ENTRY-0041      OK      Sat Jul 13 16:14:24 2019        PASS 17 FAIL 0
ENTRY-0013      OK      Sun Jul 14 10:39:49 2019        PASS 17 FAIL 0
...snip...
Clearly out of order!
Thankfully, the --debug flag was very illuminating.
$ sort file.txt -k7 -k4M -k5 -k6 --debug

Memory to be used for sorting: 8589934592
Using collate rules of en_US.UTF-8 locale
sort_method=heapsort
; k1=< 2019     PASS 12 FAIL 0>(20), k2=< 2019  PASS 0  FAIL 1>(19);  ... snip ... cmp1=1
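As you can see, although we intended -k7 to refer to just the 7th column (the year), sort actually treats it as "the 7th column and everything after it": the first key runs from the start of field 7 to the end of the line, so the PASS and FAIL counts are compared as part of it.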

 


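The solution: give the key an explicit end so that it covers only that column. Here, -k7.1,7.5 limits the first key to characters 1 through 5 of field 7 (the leading blank plus the four-digit year).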
$ sort file.txt -k7.1,7.5 -k4M -k5 -k6 --debug

Memory to be used for sorting: 8589934592
Using collate rules of en_US.UTF-8 locale
sort_method=heapsort
; k1=< 2019>(5), k2=< 2019>(5); ...snip... cmp1=-1
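
The difference between the two key forms can be reproduced with just two lines from the sample file. What follows is a minimal sketch, not part of the original session: the variable names are made up for illustration, and LC_ALL=C is an assumption added here so that comparison is byte-wise and month names are English regardless of the local locale. Note that -M is an extension found in GNU and BSD sort rather than something POSIX guarantees.

```shell
# Two lines borrowed from the sample data, tab-separated as in file.txt.
# Both have the same year; the July line has the larger PASS count.
sample=$(
  printf 'ENTRY-0041\tOK\tSat Jul 13 16:14:24 2019\tPASS 17\tFAIL 0\n'
  printf 'ENTRY-0287\tFAIL\tFri Sep 27 17:10:00 2019\tPASS 0\tFAIL 1\n'
)

# Open-ended key: -k7 runs from field 7 to the end of the line, so the
# text after the year ("PASS 0 ..." vs "PASS 17 ...") decides the order.
open_ended=$(printf '%s\n' "$sample" | LC_ALL=C sort -k7 -k4M | cut -f1)

# Bounded key: -k7.1,7.5 covers only " 2019", the years tie, and the
# month key (-k4M) breaks the tie.
bounded=$(printf '%s\n' "$sample" | LC_ALL=C sort -k7.1,7.5 -k4M | cut -f1)

printf 'open-ended key:\n%s\n' "$open_ended"   # September entry listed first
printf 'bounded key:\n%s\n' "$bounded"         # July entry listed first
```

With the open-ended key the September entry comes out ahead of the July one, which is the same kind of misordering seen in the first listing.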

$ sort file.txt -k7.1,7.5 -k4M -k5 -k6

ENTRY-0001      OK      Sat Jul 13 13:31:15 2019        PASS 17 FAIL 0
ENTRY-0041      FAIL    Sat Jul 13 16:13:35 2019        PASS 0  FAIL 1
...
ENTRY-0024      FAIL    Mon Jul 15 08:12:14 2019        PASS 0  FAIL 1
ENTRY-0105      FAIL    Tue Jul 16 07:30:09 2019        PASS 0  FAIL 1
ENTRY-0105      OK      Tue Jul 16 07:30:27 2019        PASS 17 FAIL 0
ENTRY-0124      FAIL    Tue Jul 16 07:36:08 2019        PASS 0  FAIL 1
ENTRY-0102      OK      Tue Jul 16 08:26:46 2019        PASS 17 FAIL 0
ENTRY-0162      OK      Wed Jul 17 09:02:36 2019        PASS 16 FAIL 0
ENTRY-0204      OK      Wed Jul 17 09:06:45 2019        PASS 17 FAIL 0
ENTRY-0252      FAIL    Sat Aug 10 10:47:59 2019        PASS 0  FAIL 1
ENTRY-0260      OK      Sat Aug 10 11:27:30 2019        PASS 17 FAIL 0
ENTRY-0222      OK      Sat Aug 10 11:28:43 2019        PASS 17 FAIL 0
ENTRY-0001      OK      Wed Sep  4 20:37:36 2019        PASS 11 FAIL 0
...snip...
ENTRY-0030      OK      Fri Sep  6 16:55:26 2019        PASS 12 FAIL 0
ENTRY-0287      FAIL    Fri Sep 27 17:10:00 2019        PASS 0  FAIL 1
ENTRY-0242      FAIL    Fri Sep 27 17:12:00 2019        PASS 0  FAIL 1
ENTRY-0001      OK      Thu Oct 17 15:07:28 2019        PASS 12 FAIL 0
ENTRY-0018      FAIL    Thu Oct 17 15:22:08 2019        PASS 0  FAIL 2
ENTRY-0018      OK      Thu Oct 17 15:22:22 2019        PASS 17 FAIL 0
ENTRY-0004      OK      Fri Oct 18 10:25:05 2019        PASS 12 FAIL 0
...snip...
ENTRY-0047      OK      Thu Oct 24 10:24:53 2019        PASS 12 FAIL 0
ENTRY-0043      OK      Thu Oct 24 10:38:35 2019        PASS 17 FAIL 0
ENTRY-0047      OK      Fri Oct 25 11:28:12 2019        PASS 12 FAIL 0
ENTRY-0060      OK      Fri Oct 25 15:15:41 2019        PASS 12 FAIL 0
ENTRY-0003      OK      Sat Oct 26 07:08:59 2019        PASS 12 FAIL 0
...snip...
ENTRY-0038      OK      Sat Oct 26 14:06:41 2019        PASS 12 FAIL 0
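
An alternative that avoids counting characters is to bound every key by field number and state the comparison type for each one. This is a hedged sketch rather than the command used above: -k7,7n treats the year as a number, -k4,4M the month as a month name, -k5,5n the day as a number, and -k6,6 the time as plain text. The four sample lines are taken from the file and deliberately shuffled; LC_ALL=C and the variable names are assumptions made for the example.

```shell
# Four lines from the sample data in a deliberately wrong order.
sample=$(
  printf 'ENTRY-0287\tFAIL\tFri Sep 27 17:10:00 2019\tPASS 0\tFAIL 1\n'
  printf 'ENTRY-0009\tFAIL\tThu Sep  5 17:39:24 2019\tPASS 7\tFAIL 6\n'
  printf 'ENTRY-0041\tOK\tSat Jul 13 16:14:24 2019\tPASS 17\tFAIL 0\n'
  printf 'ENTRY-0041\tFAIL\tSat Jul 13 16:13:35 2019\tPASS 0\tFAIL 1\n'
)

# Each key starts and ends on the same field, so nothing after the
# field can leak into the comparison.
sorted=$(printf '%s\n' "$sample" | LC_ALL=C sort -k7,7n -k4,4M -k5,5n -k6,6)

printf '%s\n' "$sorted"
```

Against the full file the same keys would be passed as sort -k7,7n -k4,4M -k5,5n -k6,6 file.txt. Making the day key numeric means the result does not depend on how the single-digit days are padded with blanks.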

 


 


Here's the file for you to play with:
$ cat file.txt

ENTRY-0001	OK	Sat Jul 13 13:31:15 2019	PASS 17 FAIL 0
ENTRY-0041	FAIL	Sat Jul 13 16:13:35 2019	PASS 0	FAIL 1
ENTRY-0041	OK	Sat Jul 13 16:14:24 2019	PASS 17	FAIL 0
ENTRY-0051	FAIL	Sat Jul 13 20:31:09 2019	PASS 0	FAIL 1
ENTRY-0013	OK	Sun Jul 14 10:39:49 2019	PASS 17	FAIL 0
ENTRY-0019	OK	Sun Jul 14 10:47:27 2019	PASS 17	FAIL 0
ENTRY-0112	FAIL	Mon Jul 15 07:50:46 2019	PASS 0	FAIL 1
ENTRY-0114	OK	Mon Jul 15 07:57:02 2019	PASS 17	FAIL 0
ENTRY-0113	FAIL	Mon Jul 15 07:59:11 2019	PASS 0	FAIL 1
ENTRY-0110	OK	Mon Jul 15 08:00:01 2019	PASS 17	FAIL 0
ENTRY-0024	FAIL	Mon Jul 15 08:04:43 2019	PASS 0	FAIL 1
ENTRY-0035	OK	Mon Jul 15 08:07:11 2019	PASS 17	FAIL 0
ENTRY-0028	OK	Mon Jul 15 08:09:36 2019	PASS 17	FAIL 0
ENTRY-0024	FAIL	Mon Jul 15 08:12:14 2019	PASS 0	FAIL 1
ENTRY-0105	FAIL	Tue Jul 16 07:30:09 2019	PASS 0	FAIL 1
ENTRY-0105	OK	Tue Jul 16 07:30:27 2019	PASS 17	FAIL 0
ENTRY-0124	FAIL	Tue Jul 16 07:36:08 2019	PASS 0	FAIL 1
ENTRY-0102	OK	Tue Jul 16 08:26:46 2019	PASS 17	FAIL 0
ENTRY-0162	OK	Wed Jul 17 09:02:36 2019	PASS 16	FAIL 0
ENTRY-0204	OK	Wed Jul 17 09:06:45 2019	PASS 17	FAIL 0
ENTRY-0252	FAIL	Sat Aug 10 10:47:59 2019	PASS 0	FAIL 1
ENTRY-0260	OK	Sat Aug 10 11:27:30 2019	PASS 17	FAIL 0
ENTRY-0222	OK	Sat Aug 10 11:28:43 2019	PASS 17	FAIL 0
ENTRY-0001	OK	Wed Sep  4 20:37:36 2019	PASS 11	FAIL 0
ENTRY-0003	OK	Wed Sep  4 20:57:44 2019	PASS 11	FAIL 0
ENTRY-0009	FAIL	Thu Sep  5 17:39:24 2019	PASS 7	FAIL 6
ENTRY-0010	FAIL	Thu Sep  5 17:42:58 2019	PASS 7	FAIL 6
ENTRY-0004	OK	Thu Sep  5 17:46:48 2019	PASS 12	FAIL 0
ENTRY-0006	OK	Thu Sep  5 17:49:08 2019	PASS 12	FAIL 0
ENTRY-0016	OK	Fri Sep  6 16:42:10 2019	PASS 12	FAIL 0
ENTRY-0026	OK	Fri Sep  6 16:44:48 2019	PASS 12	FAIL 0
ENTRY-0030	OK	Fri Sep  6 16:55:26 2019	PASS 12	FAIL 0
ENTRY-0287	FAIL	Fri Sep 27 17:10:00 2019	PASS 0	FAIL 1
ENTRY-0242	FAIL	Fri Sep 27 17:12:00 2019	PASS 0	FAIL 1
ENTRY-0001	OK	Thu Oct 17 15:07:28 2019	PASS 12	FAIL 0
ENTRY-0018	FAIL	Thu Oct 17 15:22:08 2019	PASS 0	FAIL 2
ENTRY-0018	OK	Thu Oct 17 15:22:22 2019	PASS 17	FAIL 0
ENTRY-0004	OK	Fri Oct 18 10:25:05 2019	PASS 12	FAIL 0
ENTRY-0011	OK	Fri Oct 18 15:42:59 2019	PASS 17	FAIL 0
ENTRY-0004	OK	Fri Oct 18 21:47:37 2019	PASS 12	FAIL 0
ENTRY-0013	OK	Sat Oct 19 08:15:40 2019	PASS 12	FAIL 0
ENTRY-0011	OK	Sat Oct 19 08:19:07 2019	PASS 12	FAIL 0
ENTRY-0007	OK	Mon Oct 21 14:38:53 2019	PASS 12	FAIL 0
ENTRY-0022	OK	Mon Oct 21 14:56:12 2019	PASS 12	FAIL 0
ENTRY-0008	OK	Tue Oct 22 07:35:59 2019	PASS 12	FAIL 0
ENTRY-0009	OK	Tue Oct 22 07:38:24 2019	PASS 12	FAIL 0
ENTRY-0052	OK	Thu Oct 24 09:57:03 2019	PASS 12	FAIL 0
ENTRY-0047	FAIL	Thu Oct 24 10:23:50 2019	PASS 8	FAIL 4
ENTRY-0047	OK	Thu Oct 24 10:24:53 2019	PASS 12	FAIL 0
ENTRY-0043	OK	Thu Oct 24 10:38:35 2019	PASS 17	FAIL 0
ENTRY-0047	OK	Fri Oct 25 11:28:12 2019	PASS 12	FAIL 0
ENTRY-0060	OK	Fri Oct 25 15:15:41 2019	PASS 12	FAIL 0
ENTRY-0003	OK	Sat Oct 26 07:08:59 2019	PASS 12	FAIL 0
ENTRY-0059	OK	Sat Oct 26 08:59:57 2019	PASS 12	FAIL 0
ENTRY-0058	FAIL	Sat Oct 26 09:03:51 2019	PASS 8	FAIL 4
ENTRY-0069	OK	Sat Oct 26 12:08:27 2019	PASS 12	FAIL 0
ENTRY-0054	OK	Sat Oct 26 12:12:37 2019	PASS 12	FAIL 0
ENTRY-0042	FAIL	Sat Oct 26 12:42:20 2019	PASS 10	FAIL 2
ENTRY-0055	OK	Sat Oct 26 13:50:00 2019	PASS 12	FAIL 0
ENTRY-0053	OK	Sat Oct 26 13:56:53 2019	PASS 12	FAIL 0
ENTRY-0054	FAIL	Sat Oct 26 13:58:55 2019	PASS 11	FAIL 1
ENTRY-0038	OK	Sat Oct 26 14:06:41 2019	PASS 12	FAIL 0
ENTRY-0078	FAIL	Sun Oct 27 08:01:55 2019	PASS 10	FAIL 2
ENTRY-0078	OK	Sun Oct 27 08:03:14 2019	PASS 12	FAIL 0
ENTRY-0023	FAIL	Sun Oct 27 12:13:11 2019	PASS 0	FAIL 1
Ches Koblents
March 2, 2020
 
© Copyright Koblents.com, 2012-2024