Konrad February 2016

grep/ack on Mac OS X finding multiple strings and respecting file types

I would like to run grep on Mac OS X in a way that meets the following criteria:

  • Search only files with *.R or *.r as the extension and ignore all other files
  • Find the strings wordA and wordB, accounting for the fact that they may be embedded in a longer token such as someRubbishWordARubbish (this is a valid match)
  • List only the files where both strings appear, irrespective of the order
  • Print the lines where the strings appear
  • Highlight the found words in colour
  • Print the file name as a header and the matching lines under the header. I'm inspired by the ack options.
  • Ignore case

Approach

I was thinking of making use of this discussion and starting with the following grep syntax:

grep --include='*.R' -r setHeader .

Then combining it with the following:

grep 'word1\|word2\|word3' /path

However, I would appreciate comments on ensuring that all of the criteria stated above will be evaluated correctly.

Groups

^(.*)(facet|map)(.*)(map|facet)(.*)$

regex101


Ack

Running ack -f shows that *.R files would be searched so solutions using ack will be accepted. For example, running:

ack wordA --colour -i -H --rr

gets the desired results with respect to wordA. I was thinking of combining it with the solutions discussed here but I would like to

Answers


PapaBuduiit February 2016

Well, this is pretty inefficient, but assuming you aren't searching a large directory, this should work.

find . -iname "*.r" | while IFS= read -r file
do
  # keep only files that contain both words (case-insensitive)
  if grep -qi "wordB" "$file" && grep -qi "wordA" "$file"
  then
    echo "======== $file ======="
    grep --color=auto -iE "wordA|wordB" "$file"
  fi
done
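
The same test can also be written without a shell loop, by chaining -exec actions so that find only reaches the printing step for files that passed both checks. This is a minimal sketch, assuming plain-word search terms and a find that supports -iname (the BSD find on OS X and GNU find both do). The function name find_both and the header format are only for illustration.

```shell
#!/bin/sh
# find_both DIR WORD_A WORD_B   (hypothetical helper name)
# Each -exec acts as a test: find moves on to the next action only
# when the previous command exited with status 0.
find_both() {
    find "$1" -type f -iname '*.r' \
        -exec grep -qi -e "$2" {} \; \
        -exec grep -qi -e "$3" {} \; \
        -exec printf '======== %s =======\n' {} \; \
        -exec grep -i --color=auto -e "$2" -e "$3" {} \;
}

# Example call:
#   find_both . wordA wordB
```

Because find passes each path as a single argument, file names with spaces need no extra quoting here.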


Chad Nouis February 2016

Here's a script that combines the closely-named awk and ack commands:

find . -iname '*.r' | while IFS= read -r file; do
    awk '
        BEGIN { IGNORECASE=1; sawWordA = 0; sawWordB = 0 }
        /wordA/ { sawWordA = 1 }
        /wordB/ { sawWordB = 1 }
        sawWordA && sawWordB { exit } # stop reading lines if both matches seen
        END { exit !(sawWordA && sawWordB) }
        ' \
        "${file}" \
    && ack --nofilter -H -i 'wordA|wordB' "${file}"
done

The awk command...

  • Lists only the files where both strings appear irrespectively of the order
  • Ignores the case

...and the ack command...

  • Prints the lines where the strings appear
  • Highlights the found words in colour
  • Prints the file name as a header and lines under the header inspired by the ack options
  • Ignores the case

The awk script sets a flag for each search string it matches. If both strings have been matched, then the snippet exit !(sawWordA && sawWordB) makes awk return 0; otherwise it returns 1. Because the two commands are joined with &&, the ack command runs only when awk returns 0.

The ack --nofilter option tells ack to avoid reading from STDIN. Otherwise, ack would try to use the STDIN that the read command is using.
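
The exit-status handshake described above can be tried on its own. This is a small sketch with made-up input. The helper name both_seen is hypothetical, and the match is case-sensitive to keep the example short.

```shell
#!/bin/sh
# both_seen: read stdin, exit 0 only if both words occurred somewhere.
both_seen() {
    awk '
        /wordA/ { sawWordA = 1 }
        /wordB/ { sawWordB = 1 }
        END { exit !(sawWordA && sawWordB) }   # 0 = both seen, 1 = otherwise
    '
}

# The command after && runs only when awk exits with 0.
printf 'x wordA\ny wordB\n' | both_seen && echo "both found"
# The command after || runs when awk exits with a non-zero status.
printf 'x wordA\n' | both_seen || echo "at least one missing"
```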

In the comments, Konrad asked how to use the above code when passing in variables in a shell script. Below is an example:

#!/bin/sh

if [ $# -ne 2 ]; then
    echo "Usage: $0 {string1} {string2}"
    E_BADARGS=65
    exit $E_BADARGS
fi

find . -maxdepth 1 -iname '*.r' | while IFS= read -r file; do
    awk "
        BEGIN { IGNORECASE=1; sawArg1 = 0; sawArg2 = 0 }
        /$1/ { sawArg1 = 1 }
        /$2/ { sawArg2 = 1 }
        sawArg1 && sawArg2 { exit } # stop reading lines if both matches seen
        END { exit !(sawArg1 && sa 
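
One way to hand the two search strings to awk is `awk -v`, which avoids splicing `$1` and `$2` into the program text. The sketch below assumes the strings are plain words rather than regular expressions, so it matches them as literal substrings with index(). It also lowercases with tolower() instead of relying on IGNORECASE, which is a gawk extension that other awks treat as an ordinary variable. The helper name has_both is made up for illustration.

```shell
#!/bin/sh
# has_both FILE STRING1 STRING2   (hypothetical helper)
# Exits 0 if FILE contains both strings, ignoring case; 1 otherwise.
has_both() {
    awk -v a="$2" -v b="$3" '
        BEGIN { a = tolower(a); b = tolower(b) }
        { line = tolower($0) }
        index(line, a) { sawArg1 = 1 }
        index(line, b) { sawArg2 = 1 }
        sawArg1 && sawArg2 { exit }   # stop reading once both are seen
        END { exit !(sawArg1 && sawArg2) }
    ' "$1"
}

# Possible use inside the loop from the answer (assumes ack is installed):
#   find . -maxdepth 1 -iname '*.r' | while IFS= read -r file; do
#       has_both "$file" "$1" "$2" && ack --nofilter -H -i "$1|$2" "$file"
#   done
```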

Post Status

Asked in February 2016
Viewed 3,965 times
Voted 4
Answered 2 times
