Pranav February 2016

Find the latest entry for each record in a Unix file

I have a file which has multiple entries for a single record. For example:

abc~20160120~120
abc~20160125~150
xyz~20160201~100
abc~20160205~200
xyz~20160202~90
pqr~20160102~250

The first column is the record name, the second column is the date, and the third column is the entry for that date.

What I want to output is the latest entry for each record. This is how my output should look:

abc~20160205~200
xyz~20160202~90
pqr~20160102~250

Can anybody help with a shell script for this? Keep in mind that I have a very large number of records, which need to be sorted by record name first, and then the latest entry for each record picked out by date.

Answers


choroba February 2016

Sort the lines by record name and date in reverse order, then use the -u (unique) flag of sort to output only the first entry for each record:

sort -t~ -k1,2r  < input-file | sort -t~ -k1,1 -u
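A variant of the same idea, as a rough sketch: the pipeline above depends on sort -u keeping the first line of each group of equal keys, which may differ between sort implementations. If that is a concern, the de-duplication step can be done with awk instead. The temporary file and the variable names (input, latest) are only for illustration, and the sample data is the one from the question.

```shell
# Sample data from the question, written to a temporary file.
input=$(mktemp)
cat > "$input" <<'EOF'
abc~20160120~120
abc~20160125~150
xyz~20160201~100
abc~20160205~200
xyz~20160202~90
pqr~20160102~250
EOF

# Step 1: sort by record name (field 1), then by date (field 2) descending,
#         so the newest line of each record comes first in its group.
# Step 2: awk prints a line only the first time its record name is seen.
# LC_ALL=C makes the sort order byte-wise and independent of the locale.
latest=$(LC_ALL=C sort -t'~' -k1,1 -k2,2r "$input" | awk -F'~' '!seen[$1]++')

printf '%s\n' "$latest"
rm -f "$input"
```

This still sorts the whole file, so it has the same cost as the original pipeline; the only difference is that picking the first line per record no longer relies on how -u breaks ties. The string comparison on the date works here because the dates are fixed-width YYYYMMDD.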


anubhava February 2016

Using awk, you can avoid sorting your huge file and get the result with a single command:

awk -F '~' '$2>a[$1]{a[$1]=$2; r[$1]=$0} END{for (i in r) print r[i]}' file

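Output (awk does not guarantee the iteration order of a for-in loop, so the lines may come out in a different order):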

abc~20160205~200
pqr~20160102~250
xyz~20160202~90
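The desired output in the question lists the records in the order they first appear in the file (abc, xyz, pqr). If that order matters, one possible sketch is to remember the first-seen order in a second array and loop over it by index at the end. The array names (best, line, order) and the variable names are made up for this example, and the +0 just forces a numeric comparison of the dates.

```shell
# Sample data from the question, written to a temporary file.
input=$(mktemp)
cat > "$input" <<'EOF'
abc~20160120~120
abc~20160125~150
xyz~20160201~100
abc~20160205~200
xyz~20160202~90
pqr~20160102~250
EOF

result=$(awk -F'~' '
  # First time a record name is seen: remember its position.
  !($1 in best) { order[++n] = $1 }
  # New record, or a later date than the one stored so far: keep this line.
  !($1 in best) || $2 + 0 > best[$1] + 0 { best[$1] = $2; line[$1] = $0 }
  # Print the kept lines in first-seen order.
  END { for (i = 1; i <= n; i++) print line[order[i]] }
' "$input")

printf '%s\n' "$result"
rm -f "$input"
```

Like the one-liner above, this reads the file once and keeps one line per distinct record name in memory, so no sort is needed.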

Post Status

Asked in February 2016
Viewed 3,090 times
Voted 11
Answered 2 times
