ocp1000 February 2016

grep lines that contain only numbers or only letters up to a delimiter

Kind of a confusing title, but let me explain. I have a tab-delimited file that contains lines in the following format (columns are separated by a tab):

Extra   573102|000473
Extra   ZRY|BC95624
Missing ABC|BC99000
Missing 123456|001122

I'd like to split the file into 4 different files, based on the following logic:

  1. If line contains "Extra" and only numbers until the "|", put that line in file #1 (In the above case, file #1 will contain "Extra 573102|000473").

  2. If line contains "Extra" and only letters until the "|", put that line in file #2 (In the above case, file #2 will contain "Extra ZRY|BC95624").

  3. If line contains "Missing" and only numbers until the "|", put that line in file #3 (In the above case, file #3 will contain "Missing 123456|001122").

  4. If line contains "Missing" and only letters until the "|", put that line in file #4 (In the above case, file #4 will contain "Missing ABC|BC99000").

I have no idea how to combine the literal text, the tab character, and a regex to accomplish the above.

Answers


Jan February 2016

Some dummy code:

regex1 = "^Extra\h+\d+\|"
# This is Extra at the beginning of the string / line in multiline mode
# followed by horizontal whitespace (\h matches spaces and tabs in PCRE)
# and then digits up to the | character
regex2 = "^Extra\h+[a-zA-Z]+\|"
# same with letters
regex3 = "^Missing\h+\d+\|"
regex4 = "^Missing\h+[a-zA-Z]+\|"

if line matches regex1:
    append to file1
else if line matches regex2:
    append to file2
else if line matches regex3:
    append to file3
else if line matches regex4:
    append to file4

See a demo on regex101.com
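
For anyone who wants something runnable, here is a minimal Python sketch of the dummy code above. It is only one possible rendering, under a few assumptions: Python's re module has no \h escape, so [ \t]+ stands in for it; the output names file1 to file4 and the helper names classify and split_file are made up for illustration.

```python
import re

# Python's re module does not support \h (horizontal whitespace),
# so [ \t]+ is used in its place. Output file names are placeholders.
RULES = [
    (re.compile(r"^Extra[ \t]+[0-9]+\|"), "file1"),
    (re.compile(r"^Extra[ \t]+[a-zA-Z]+\|"), "file2"),
    (re.compile(r"^Missing[ \t]+[0-9]+\|"), "file3"),
    (re.compile(r"^Missing[ \t]+[a-zA-Z]+\|"), "file4"),
]


def classify(line):
    """Return the target file name for a line, or None if no rule matches."""
    for pattern, target in RULES:
        if pattern.match(line):
            return target
    return None


def split_file(path):
    """Append each matching line of `path` to its target file."""
    with open(path, encoding="utf-8") as src:
        for line in src:
            target = classify(line)
            if target is not None:
                # Reopening per line keeps the sketch short; for large
                # inputs, keep the four output handles open instead.
                with open(target, "a", encoding="utf-8") as out:
                    out.write(line)
```

Calling split_file("yourfile") would append each line to the matching output file. Lines that mix digits and letters before the "|" match none of the patterns and are skipped.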


Casimir et Hippolyte February 2016

You can use awk:

awk -F'[\t |]+' '
# $1 is the label; $2 is the text between the tab and the first "|"
$1 == "Extra" {
    if ($2 ~ /^[0-9]+$/) print >> "file1"
    else if ($2 ~ /^[A-Za-z]+$/) print >> "file2"
    next
}

$1 == "Missing" {
    if ($2 ~ /^[0-9]+$/) print >> "file3"
    else if ($2 ~ /^[A-Za-z]+$/) print >> "file4"
}' yourfile
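
The same field-based idea (split the line first, then test the second field) can be sketched in Python as well. This is a rough equivalent, not a translation of the awk script: the names TARGETS, kind_of and target_for are hypothetical, and unlike the awk version it assumes a "|" must actually be present in the second column.

```python
import string

ASCII_DIGITS = set(string.digits)
ASCII_LETTERS = set(string.ascii_letters)

# (label, kind of key) -> output file name; the names are placeholders.
TARGETS = {
    ("Extra", "digits"): "file1",
    ("Extra", "letters"): "file2",
    ("Missing", "digits"): "file3",
    ("Missing", "letters"): "file4",
}


def kind_of(key):
    """Classify a key as all ASCII digits, all ASCII letters, or neither."""
    if key and set(key) <= ASCII_DIGITS:
        return "digits"
    if key and set(key) <= ASCII_LETTERS:
        return "letters"
    return None


def target_for(line):
    """Return the output file name for a line, or None if it fits no rule."""
    # Split off the label on the first run of whitespace (tab or spaces).
    fields = line.split(None, 1)
    if len(fields) < 2:
        return None
    label, rest = fields
    key, sep, _ = rest.partition("|")
    if not sep:
        # Assumption: a line without "|" is not classified at all.
        return None
    return TARGETS.get((label, kind_of(key)))
```

Keeping the rules in a dictionary means a new label or a new output file is a one-line change, at the cost of being a little less direct than four regexes.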
