Thursday, May 29, 2008

Grep - global/regular expression/print

Grep has many options; some of the most useful are listed below.
-l For listing the names of matching files.
-r Search the directories recursively.
-e Use when the pattern has a leading '-' character.
-w Match whole words only, not parts of words.
-C NUM Prints NUM lines of context around each matching line (for example, -C 2 prints two).
/dev/null Appended as an extra file argument, forces grep to print the name of the file.
-a or --binary-files=text Forces grep to output the lines even from the files that appear to be binary files.
-I or --binary-files=without-match For eliminating binary file matches.
-lv Lists the names of all files containing one or more lines that do not match.
-L or --files-without-match To list the names of all files that contain no matching lines.

fgrep stands for Fixed grep; egrep stands for Extended grep.
There are four major variants of grep, controlled by the following options.
`-G'
`--basic-regexp'
Interpret the pattern as a basic regular expression. This is the default.

`-E'
`--extended-regexp'
Interpret the pattern as an extended regular expression.

`-F'
`--fixed-strings'
Interpret the pattern as a list of fixed strings, separated by newlines, any of which is to be matched.

`-P'
`--perl-regexp'
Interpret the pattern as a Perl regular expression.
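
As a rough illustration of how the first three variants differ, the sketch below runs the same pattern, `a|b', under each option. The temporary file and its three lines are made up for this example.

```shell
# Hypothetical sample data: three lines in a temporary file.
tmp=$(mktemp -d)
trap 'rm -rf "$tmp"' EXIT
printf 'a\nb\na|b\n' > "$tmp/sample.txt"

# -G (basic): a bare | is an ordinary character, so only the literal
# text "a|b" matches.
basic=$(grep -G 'a|b' "$tmp/sample.txt")

# -E (extended): | means alternation, so any line containing a or b matches.
extended=$(grep -E 'a|b' "$tmp/sample.txt")

# -F (fixed strings): no metacharacters at all, so again only the
# literal text "a|b" matches.
fixed=$(grep -F 'a|b' "$tmp/sample.txt")

printf 'basic:\n%s\nextended:\n%s\nfixed:\n%s\n' "$basic" "$extended" "$fixed"
```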
Examples:

  1. How can I list just the names of matching files?


    grep -l 'main' *.c

    lists the names of all C files in the current directory whose contents mention `main'.

  2. How do I search directories recursively?


    grep -r 'hello' /home/gigi

    searches for `hello' in all files under the directory `/home/gigi'. For more control of which files are searched, use find, grep and xargs. For example, the following command searches only C files:


    find /home/gigi -name '*.c' -print | xargs grep 'hello' /dev/null

    This differs from the command:


    grep -r 'hello' *.c

    which merely looks for `hello' in all files in the current directory whose names end in `.c'. Here the `-r' is probably unnecessary, as recursion occurs only in the unlikely event that one of the `.c' files is a directory.
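
    The difference between the two approaches can be tried on a throwaway directory tree. This is only a sketch: the directory and file names are invented, and it uses `find -exec ... {} +' in place of xargs so that a file name containing a space is passed to grep intact.

```shell
# Hypothetical tree: one C file (in a directory with a space in its name)
# and one text file, both containing "hello".
tmp=$(mktemp -d)
trap 'rm -rf "$tmp"' EXIT
mkdir -p "$tmp/src/sub dir"
printf 'hello from C\n' > "$tmp/src/sub dir/a.c"
printf 'hello from text\n' > "$tmp/src/notes.txt"

# grep -r looks at every file under the directory, whatever its name.
all=$(grep -rl 'hello' "$tmp/src" | sort)

# find restricts the search to *.c files; -exec ... {} + hands the names
# directly to grep, so the space in "sub dir" causes no trouble.
c_only=$(find "$tmp/src" -name '*.c' -exec grep -l 'hello' {} +)

printf 'grep -r found:\n%s\nfind found:\n%s\n' "$all" "$c_only"
```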

  3. What if a pattern has a leading `-'?


    grep -e '--cut here--' *

    searches for all lines matching `--cut here--'. Without `-e', grep would attempt to parse `--cut here--' as a list of options.
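
    A small sketch of this, using an invented three-line file. It also shows `--', the conventional end-of-options marker, which should have the same effect as `-e' here.

```shell
# Hypothetical input file with a "cut here" marker line.
tmp=$(mktemp -d)
trap 'rm -rf "$tmp"' EXIT
printf 'before\n--cut here--\nafter\n' > "$tmp/mail.txt"

# -e marks the next argument as the pattern, even though it starts with '-'.
with_e=$(grep -e '--cut here--' "$tmp/mail.txt")

# -- ends option parsing, so the next argument is taken as the pattern.
with_dashes=$(grep -- '--cut here--' "$tmp/mail.txt")

printf '%s\n%s\n' "$with_e" "$with_dashes"
```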

  4. Suppose I want to search for a whole word, not a part of a word?


    grep -w 'hello' *

    searches only for instances of `hello' that are entire words; it does not match `Othello'. For more control, use `\<' and `\>' to match the start and end of words. For example:


    grep 'hello\>' *

    searches only for words ending in `hello', so it matches the word `Othello'.
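
    The three forms can be compared side by side on a few sample lines. The word list is made up for illustration, and `\<' and `\>' are assumed to be supported by the grep in use (they are in GNU grep).

```shell
# Hypothetical word list.
tmp=$(mktemp -d)
trap 'rm -rf "$tmp"' EXIT
printf 'hello world\nOthello\nhellos\n' > "$tmp/words.txt"

# -w: "hello" must be a whole word.
whole=$(grep -w 'hello' "$tmp/words.txt")

# \> anchors the end of a word: matches words ending in "hello".
ending=$(grep 'hello\>' "$tmp/words.txt")

# \< anchors the start of a word: matches words beginning with "hello".
starting=$(grep '\<hello' "$tmp/words.txt")

printf 'whole:\n%s\nending:\n%s\nstarting:\n%s\n' "$whole" "$ending" "$starting"
```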

  5. How do I output context around the matching lines?


    grep -C 2 'hello' *

    prints two lines of context around each matching line.
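
    A quick way to see the effect is a five-line sample file (invented here). The sketch also tries `-A' and `-B', the related options that print context only after or only before the match.

```shell
# Hypothetical five-line file.
tmp=$(mktemp -d)
trap 'rm -rf "$tmp"' EXIT
printf 'one\ntwo\nthree\nfour\nfive\n' > "$tmp/lines.txt"

# One line of context on both sides of the match.
around=$(grep -C 1 'three' "$tmp/lines.txt")

# One line of context after the match only.
after=$(grep -A 1 'three' "$tmp/lines.txt")

# One line of context before the match only.
before=$(grep -B 1 'three' "$tmp/lines.txt")

printf 'around:\n%s\nafter:\n%s\nbefore:\n%s\n' "$around" "$after" "$before"
```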

  6. How do I force grep to print the name of the file?

    Append `/dev/null':


    grep 'eli' /etc/passwd /dev/null

    gets you:


    /etc/passwd:eli:DNGUTF58.IMe.:98:11:Eli Smith:/home/do/eli:/bin/bash
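
    The same effect can be checked on a scratch file rather than the real `/etc/passwd'. The file name and its one line are invented; the `-H' option shown last is an alternative that GNU grep provides, and other greps may lack it.

```shell
# Hypothetical stand-in for a passwd-style file.
tmp=$(mktemp -d)
trap 'rm -rf "$tmp"' EXIT
printf 'eli:x:98:11:Eli Smith\n' > "$tmp/users"

# One file argument: grep prints the matching line without a file name.
plain=$(grep 'eli' "$tmp/users")

# Two file arguments (the second is empty): grep prefixes the file name.
forced=$(grep 'eli' "$tmp/users" /dev/null)

# -H asks for the file name explicitly, where the option is available.
with_H=$(grep -H 'eli' "$tmp/users")

printf '%s\n%s\n%s\n' "$plain" "$forced" "$with_H"
```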

  7. Why do people use strange regular expressions on ps output?


    ps -ef | grep '[c]ron'

    If the pattern had been written without the square brackets, it would have matched not only the ps output line for cron, but also the ps output line for grep. Note that on some platforms ps limits the output to the width of the screen, whereas grep has no limit on the length of a line other than available memory.
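
    The reason the trick works can be shown without running ps at all. In the sketch below, the two quoted lines are invented stand-ins for ps output: one for the cron daemon and one for the grep process, whose command line contains the pattern exactly as typed.

```shell
# Hypothetical ps output line for the daemon itself.
daemon='root   101  /usr/sbin/cron'

# Plain pattern: the grep process shows up as "grep cron" and matches itself.
plain=$(printf '%s\n' "$daemon" 'user   202  grep cron' | grep -c 'cron')

# Bracketed pattern: the grep process shows up as "grep [c]ron". The regex
# [c]ron needs the contiguous text "cron", which that line does not contain.
bracketed=$(printf '%s\n' "$daemon" 'user   202  grep [c]ron' | grep -c '[c]ron')

echo "plain pattern matched $plain lines, bracketed pattern matched $bracketed"
```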

  8. Why does grep report "Binary file matches"?

    If grep listed all matching "lines" from a binary file, it would probably generate output that is not useful, and it might even muck up your display. So GNU grep suppresses output from files that appear to be binary files. To force GNU grep to output lines even from files that appear to be binary, use the `-a' or `--binary-files=text' option. To eliminate the "Binary file matches" messages, use the `-I' or `--binary-files=without-match' option.
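
    Both options can be tried on a small file that grep will consider binary. This is a sketch with an invented file; it assumes that a NUL byte is enough to trigger the binary-file heuristic, which is the case for GNU grep.

```shell
# Hypothetical "binary" file: ordinary text with an embedded NUL byte.
tmp=$(mktemp -d)
trap 'rm -rf "$tmp"' EXIT
printf 'hello\000world\n' > "$tmp/data.bin"

# -a treats the file as text, so the match is reported (here via -l).
as_text=$(grep -a -l 'hello' "$tmp/data.bin")

# -I treats binary files as non-matching, so nothing is listed and grep
# exits with a non-zero status (hence the "|| true").
skipped=$(grep -I -l 'hello' "$tmp/data.bin" || true)

echo "with -a: ${as_text:-nothing listed}"
echo "with -I: ${skipped:-nothing listed}"
```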

  9. Why doesn't `grep -lv' print nonmatching file names?

    `grep -lv' lists the names of all files containing one or more lines that do not match. To list the names of all files that contain no matching lines, use the `-L' or `--files-without-match' option.
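
    Three small files make the distinction concrete. The file names and contents are invented for this sketch.

```shell
# Hypothetical files: one with matching and non-matching lines, one with
# no matching lines, and one where every line matches.
tmp=$(mktemp -d)
trap 'rm -rf "$tmp"' EXIT
printf 'match\nother\n' > "$tmp/mixed.txt"
printf 'other\n' > "$tmp/none.txt"
printf 'match\n' > "$tmp/all.txt"

# -lv: files that have at least one line NOT matching the pattern.
lv=$(cd "$tmp" && grep -lv 'match' mixed.txt none.txt all.txt)

# -L: files that have NO matching line at all.
big_l=$(cd "$tmp" && grep -L 'match' mixed.txt none.txt all.txt)

printf 'grep -lv:\n%s\ngrep -L:\n%s\n' "$lv" "$big_l"
```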

  10. I can do OR with `|', but what about AND?


    grep 'paul' /etc/motd | grep 'françois'

    finds all lines that contain both `paul' and `françois'.
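
    The pipeline can be tried on a scratch file. The sketch below uses invented sample lines and the hypothetical names `paul' and `marie'; it also shows a single extended regular expression that should select the same lines by spelling out both orders.

```shell
# Hypothetical message file.
tmp=$(mktemp -d)
trap 'rm -rf "$tmp"' EXIT
printf 'paul met marie\nmarie met paul\nonly paul\nonly marie\n' > "$tmp/motd"

# AND as a pipeline: the second grep filters the output of the first.
piped=$(grep 'paul' "$tmp/motd" | grep 'marie')

# AND in one ERE: list both possible orders as alternatives.
single=$(grep -E 'paul.*marie|marie.*paul' "$tmp/motd")

printf 'pipeline:\n%s\nsingle ERE:\n%s\n' "$piped" "$single"
```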

  11. How can I search in both standard input and in files?

    Use the special file name `-':


    cat /etc/passwd | grep 'alain' - /etc/motd
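
    A self-contained version of the same idea, with an invented file standing in for `/etc/motd' and printf standing in for cat. GNU grep labels matches from standard input as `(standard input)'; other greps may use a different label.

```shell
# Hypothetical file to search alongside standard input.
tmp=$(mktemp -d)
trap 'rm -rf "$tmp"' EXIT
printf 'alain in file\n' > "$tmp/motd"

# "-" stands for standard input; it is searched like any other file argument.
out=$(printf 'alain on stdin\n' | grep 'alain' - "$tmp/motd")

printf '%s\n' "$out"
```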

  12. How to express palindromes in a regular expression?

    It can be done using back-references. For example, a palindrome of 5 characters can be written as a BRE:


    grep -w -e '\(.\)\(.\).\2\1' file

    It matches words such as "radar" or "civic".

    Guglielmo Bondioni proposed a single RE that finds all the palindromes up to 19 characters long.


    egrep -e '^(.?)(.?)(.?)(.?)(.?)(.?)(.?)(.?)(.?).?\9\8\7\6\5\4\3\2\1$' file

    Note that this relies on GNU ERE extensions, so it might not be portable to other greps.
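
    The first, shorter pattern above can be tried on a handful of words fed through standard input. The word list is made up for this sketch.

```shell
# Feed a few hypothetical words to the BRE from the first example.
# \(.\)\(.\) captures the first two characters, . matches the middle one,
# and \2\1 requires the captured characters again in reverse order.
found=$(printf 'radar\ncivic\nlevel\nhello\n' | grep -w -e '\(.\)\(.\).\2\1')

printf '%s\n' "$found"
```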

  13. Why do my expressions with the vertical bar fail?


    /bin/echo "ba" | egrep '(a)\1|(b)\1'

    The back-reference `\1' always refers to the first group, `(a)'. When the first alternative fails, that group has not taken part in the match, so the `\1' in the second alternative fails as well. By contrast, "aaba" will match: the first group participates in the match and can be reused in the second branch.
