
grep tutorial

by Dr. Nikolai Bezroukov
Version 2.0 (Oct 23, 2018)


Introduction
fgrep -- searching for fixed string
Grep regular expressions
Major options
Examples
Grep with pipes
Tips
Using grep with find
Grep alternatives

Introduction

The Linux grep command searches files for lines that match a fixed string or a regular expression (often loosely called a pattern). We will assume the GNU version of grep. Alternative implementations are broadly similar, and some are more powerful, but GNU grep is the de facto standard on Linux and it supports Perl-style regular expressions via the -P option, which is the recommended form of regex to use.

By default grep prints the matching lines it finds in the list of files given as arguments. It exits with return code zero if one or more lines were matched and with one if no lines were matched. If an input file is inaccessible or the regex contains a syntax error, grep returns a code greater than one (GNU grep returns 2).
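The three return codes can be checked with a small sketch like the one below. The temporary file and its contents are made up for illustration, and the `|| rc=$?` idiom is just one way to capture the status without aborting a script that runs under `set -e`.

```shell
# Hypothetical sample file; the name and contents are invented for this sketch.
sample=$(mktemp)
printf 'alpha\nbeta\n' > "$sample"

# -q suppresses output; we are only interested in the exit status.
rc_match=0;   grep -q 'alpha' "$sample" || rc_match=$?
rc_nomatch=0; grep -q 'gamma' "$sample" || rc_nomatch=$?
# A file that does not exist triggers the "error" status.
rc_error=0;   grep -q 'alpha' "$sample.missing" 2>/dev/null || rc_error=$?

echo "match=$rc_match nomatch=$rc_nomatch error=$rc_error"
rm -f "$sample"
```

In scripts this is what makes constructs such as `if grep -q regex file; then ...` work.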

The strange name grep originates in the early days of Unix: one of the commands of the Unix ed editor was g/re/p (globally search for a regular expression and print the matching lines). Because this editor command was used so often, a separate grep command was created to search files without first starting the line editor. From Wikipedia:

Regular expressions entered popular use from 1968 in two uses: pattern matching in a text editor^[5] and lexical analysis in a compiler.^[6] Among the first appearances of regular expressions in program form was when Ken Thompson built Kleene's notation into the editor QED as a means to match patterns in text files.^[5]^[7]^[8]^[9] For speed, Thompson implemented regular expression matching by just-in-time compilation (JIT) to IBM 7094 code on the Compatible Time-Sharing System, an important early example of JIT compilation.^[10] He later added this capability to the Unix editor ed, which eventually led to the popular search tool grep's use of regular expressions ("grep" is a word derived from the command for regular expression searching in the ed editor: g/re/p meaning "Global search for Regular Expression and Print matching lines"^[11]). Around the same time when Thompson developed QED, a group of researchers including Douglas T. Ross implemented a tool based on regular expressions that is used for lexical analysis in compiler design.^[6]

... ... ...

Many variations of these original forms of regular expressions were used in Unix^[9] programs at Bell Labs in the 1970s, including vi, lex, sed, AWK, and expr, and in other programs such as Emacs. Regexes were subsequently adopted by a wide range of programs, with these early forms standardized in the POSIX.2 standard in 1992.

... ... ...

Starting in 1997, Philip Hazel developed PCRE (Perl Compatible Regular Expressions), which attempts to closely mimic Perl's regex functionality and is used by many modern tools including PHP and Apache HTTP Server.

The power of grep stems from its use of regular expressions, so studying grep means paying proper attention to regular expressions. GNU grep, as used in Linux, accepts three types of regular expressions, which complicates its usage. Historically they emerged in this order: basic regex, extended regex, and Perl-style regex. Now they should be preferred in reverse order, with Perl-style regex as the preferred notation and engine:

Perl-style regular expressions (option -P ). The Perl language brought a new level of sophistication to regular expressions ( perlre - perldoc.perl.org ), and its notation has become the de facto standard in other languages (for example Java and Python), either via independent implementations or via Perl Compatible Regular Expressions (PCRE), a library written in C that implements a regular expression engine inspired by the capabilities of Perl. Philip Hazel started writing PCRE in summer 1997. PCRE's syntax is much more powerful and flexible than either of the POSIX regular expression flavors and than that of many other regular-expression libraries. I highly recommend using it as the standard option by defining a grep alias, for example:
```
alias grepp="grep -P "
```
Extended regular expressions (option -E ). This type of regular expression is used by AWK and egrep and is now covered by a POSIX standard. Historically it was available via the egrep command.

Basic regular expressions. This is the default, so this notation and engine are used when grep is invoked without any options. Some books mention that you can connect elements of a regular expression with OR ('|') even in basic regex. I do not recommend using alternation with the plain grep invocation. Use egrep or grep -P (aliased to grepp) instead. If you are forced to do it, please remember that you need a backslash before '|' even if the regex is in quotes. Such a perversion. For example:
```
grep 'if|while'    # -- wrong
grep 'if\|while'   # -- will work, please note that despite single quotes we need a backslash
egrep 'if|while'   # -- will work better (this is extended grep)
grep -P 'if|while' # -- will work best
```
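The difference between the flavors can be verified on a few made-up lines. This is only a sketch: it assumes GNU grep built with PCRE support, and it uses grep -E in place of egrep, which is the equivalent invocation.

```shell
# Three invented lines of "source code" to search.
sample=$(printf 'if (x)\nwhile (y)\nfor (z)\n')

# -c prints the number of matching lines; "|| true" keeps going when nothing matches.
bre_plain=$(printf '%s\n' "$sample" | grep -c 'if|while' || true)   # | is a literal character here
bre_escaped=$(printf '%s\n' "$sample" | grep -c 'if\|while')        # \| is alternation in GNU basic regex
ere=$(printf '%s\n' "$sample" | grep -cE 'if|while')                # extended regex
pcre=$(printf '%s\n' "$sample" | grep -cP 'if|while')               # Perl-style regex

echo "$bre_plain $bre_escaped $ere $pcre"
```

The first count is zero because in basic regex the unescaped bar is searched for literally.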

Unfortunately that means that sysadmins need to know at least two flavors ("basic" and "extended", or "basic" and "Perl-style"), and preferably all three. This "multiple personalities" (aka schizoid) behavior is very confusing. I hate the fact that nobody has the courage to implement a new standard grep and that the current implementation has all the warts accumulated during the 30 years of Unix existence.

I highly recommend using the -P option (Perl regular expressions) by default by defining grep as an alias for grep -P. It makes grep behavior less insane. Sysadmins who do not know Perl but widely use AWK are encouraged to use AWK instead of grep in all complex cases that require extended regular expressions.

Knowing extended regular expressions is valuable if you also use AWK instead of Perl. Otherwise I would say learn Perl-style regular expressions.

Linux uses the GNU implementation of grep, which combines the old separate versions of grep into a single utility. This utility has two aliases (fgrep and egrep), and invoking it via an alias changes its behavior by selecting a particular regex engine, as if you specified option -F or -E on the command line. So the classic names survived; they are just implemented via aliases.

fgrep searches for fixed strings only and is equivalent to grep -F (or grep --fixed-strings). It implements a very fast search for fixed strings; no regular expressions are interpreted.
grep is the legacy grep, which implements basic regular expressions with some extensions.
egrep (extended grep) accepts extended regular expressions and is equivalent to grep -E (or grep --extended-regexp). The POSIX named character classes such as [:alnum:] and [:digit:], described in the tips below, can be used inside bracket expressions.

You can also add the alias grepp for grep -P, which is simpler to type than egrep and makes use of the more powerful and flexible Perl-style regular expressions.

-P, --perl-regexp Interpret PATTERN as a Perl regular expression.

A complete list of Linux grep switches can be found in the man page. Below are nine of the most useful additional grep options that any sysadmin must know and use:

-i Ignore case. Match either upper- or lowercase.
-v Print only lines that do not match the pattern.
-H, --with-filename Print the file name for each match. This is the default when there is more than one file to search but essential if you
-A n Show n lines after the matching line.
-B n Show n lines before the matching line.
-C n Show n lines before and after the matching line.
-n Print the matched line and its line number.
-l Print only the names of files with matching lines (this is the lowercase letter “L”), rather than the lines that contain the search pattern. Scanning of each file stops on the first match, which increases efficiency. There is also an opposite option -L ( --files-without-match ), which likewise suppresses normal output and instead prints the name of each input file that has zero matches of the specified regex.
-c Print only the count of matching lines.
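Several of the options listed above can be tried on a pair of throwaway files. The sketch below is illustrative only: the file names a.log and b.log and their contents are invented.

```shell
# Two hypothetical log files in a scratch directory.
dir=$(mktemp -d)
printf 'ok\nERROR disk\nok\nerror net\n' > "$dir/a.log"
printf 'ok\nok\n' > "$dir/b.log"

count=$(grep -ci 'error' "$dir/a.log")                 # -c with -i: count lines, ignoring case
numbered=$(grep -n 'ERROR' "$dir/a.log")               # -n: prefix the line number
context=$(grep -A 1 'ERROR' "$dir/a.log")              # -A 1: one line of trailing context
with=$(cd "$dir" && grep -li 'error' a.log b.log)      # -l: files that contain a match
without=$(cd "$dir" && grep -Li 'error' a.log b.log)   # -L: files that contain no match

printf '%s\n' "$count" "$numbered" "$context" "$with" "$without"
rm -rf "$dir"
```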

Being able to invert the search logic with the -v flag is a very important and widely used feature of grep for sysadmins. Among other things it allows you to delete "log noise". Some daemons in RHEL 7 are configured by default in such a way that they relentlessly spam the log, making it unusable. The two worst offenders are systemd and dbus. On your own server you can, of course, reconfigure them to log only at a higher severity level, which stops this nasty spam. But on servers you do not own this is impossible, and the only way to deal with such messages is to filter them out. For example, it often makes sense to exclude the systemd and dbus messages when you analyze /var/log/messages:

grep -Pv 'systemd\: (Start|Created|Removed|Stopping)|systemd\-login|dbus.*freedesktop\.problems|dbus.*(Activating|bluez)|pulseaudio' messages

Of course, you are better off writing a more sophisticated filter in Perl or Python. But as a "quick and dirty" solution this is OK.
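To see how such a noise filter behaves, it can be run against a few sample lines first. In the sketch below the log lines are invented (they only imitate the /var/log/messages format) and the regex is a shortened version of the one above; GNU grep with -P support is assumed.

```shell
# Invented sample in the style of /var/log/messages.
log=$(mktemp)
cat > "$log" <<'EOF'
Sep 24 22:48:14 localhost systemd: Started Session 12 of user root.
Sep 24 22:48:14 localhost systemd: Created slice User Slice of root.
Sep 24 22:48:15 localhost kernel: tsc: Fast TSC calibration failed
Sep 24 22:48:16 localhost dbus[812]: [system] Activating service name='org.bluez'
Sep 24 22:48:17 localhost rngd: read error
EOF

# Keep the regex in a variable so it can be extended one alternative at a time.
noise='systemd: (Start|Created|Removed|Stopping)|systemd-login|dbus.*(Activating|bluez)|pulseaudio'

kept=$(grep -Pv "$noise" "$log")          # lines that survive the filter
kept_count=$(grep -Pvc "$noise" "$log")   # how many lines survive
printf '%s\n' "$kept"
rm -f "$log"
```

Only the kernel and rngd lines survive; the systemd and dbus lines are dropped.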

NOTE: Please note that you can't use pgrep as the alias name for grep -P: the name was taken before this mode of grep was implemented. The pgrep utility exists and searches the process table, like ps | grep.

TIPS:

grep sometimes misidentifies text files as binary. To suppress this, use option -a
-a, --text -- Process a binary file as if it were text; this is equivalent to the --binary-files=text option.
One interesting test of Red Hat certification books is whether their coverage of grep includes the options -A n (print n lines AFTER the match) and -B n (print n lines BEFORE the match). Lower quality books usually do not mention those options, which is a definite sign that the author is detached from the reality of sysadmin work.
- -A NUM, --after-context=NUM -- Print NUM lines of trailing context after matching lines. Places a line containing a group separator (--) between contiguous groups of matches. With the -o or --only-matching option, this has no effect and a warning is given.
- -B NUM, --before-context=NUM -- Print NUM lines of leading context before matching lines. Places a line containing a group separator (--) between contiguous groups of matches. With the -o or --only-matching option, this has no effect and a warning is given.

Certain named classes of characters are predefined within bracket expressions, as follows. Their names are self explanatory, and they are [:alnum:], [:alpha:], [:cntrl:], [:digit:], [:graph:], [:lower:], [:print:], [:punct:], [:space:], [:upper:], and [:xdigit:]. For example, [[:alnum:]] means [0-9A-Za-z], except the latter form depends upon the C locale and the ASCII character encoding, whereas the former is independent of locale and character set. (Note that the brackets in these class names are part of the symbolic names, and must be included in addition to the brackets delimiting the bracket expression.) Most meta-characters lose their special meaning inside bracket expressions. To include a literal ] place it first in the list. Similarly, to include a literal ^ place it anywhere but first. Finally, to include a literal - place it last.
In GNU grep you can also colorize the output using the --color option, which surrounds the matching string with the marker defined in the GREP_COLOR environment variable (newer releases prefer GREP_COLORS).
grep --color=auto foo myfile
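The bracket-expression rules from the tips above (named classes, and the placement of literal ], ^ and -) can be checked with a short sketch. The four sample lines are made up so that each regex selects exactly one of them.

```shell
# Invented sample: each line contains one "interesting" character.
sample=$(printf 'abc123\na-b\nx]y\nq^r\n')

digits=$(printf '%s\n' "$sample" | grep '[[:digit:]]')   # named class inside a bracket expression
bracket=$(printf '%s\n' "$sample" | grep '[]]')          # literal ] placed first
caret=$(printf '%s\n' "$sample" | grep '[z^]')           # literal ^ placed anywhere but first
dash=$(printf '%s\n' "$sample" | grep '[z-]')            # literal - placed last

printf '%s\n' "$digits" "$bracket" "$caret" "$dash"
```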

fgrep -- searching for fixed string

Invoking grep as fgrep, or using the short option -F or the long option --fixed-strings, switches grep to fixed-string search: it does not interpret any pattern-matching characters. In this case grep returns all lines that contain the given string as a substring. All characters are interpreted literally and are not assigned any special meaning.

[root@test01 log]# fgrep kernel  messages | fgrep failed

You can view a fixed string as the most primitive form of regular expression: a regular expression without any metasymbols. This extreme case allows files to be searched much more efficiently. GNU grep implements a special algorithm for matching such "fixed" strings that is very fast even on very large files. To activate this algorithm you should either use option -F or invoke grep as fgrep. For example,

fgrep foo file # returns all the lines that contain a string "foo" in the file "file".

This option is often used for filtering data which comes from STDIN instead of a file. For example,

locate / | fgrep /sysconfig # lists all entries in the locate database which contain the string /sysconfig (locate needs a pattern; / matches every path)
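The practical difference between a fixed string and a regex shows up as soon as the search string contains a metacharacter such as the dot. A minimal sketch on made-up input, using grep -F (the equivalent of fgrep):

```shell
# Invented sample: only the first line contains the literal text 1.2.3
sample=$(printf 'version 1.2.3\nversion 1x2y3\npath /etc/sysconfig/network\n')

# As a regex, each dot matches any character, so "1x2y3" matches too.
regex_hits=$(printf '%s\n' "$sample" | grep -c '1.2.3')
# As a fixed string, the dots are literal.
fixed_hits=$(printf '%s\n' "$sample" | grep -cF '1.2.3')

echo "regex=$regex_hits fixed=$fixed_hits"
```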

Grep regular expressions

As we mentioned above, grep allows three types of regular expressions: basic, extended and Perl-style.

Perl-style regular expressions

Any regular expression consists of literals (strings that are interpreted "as is") and metacharacters, which specify a particular type of matching, either applied to literals or by themselves. Perl-style regex has the following major metacharacters:

. -- matches any character except newline (character grouping [^\n]). Since grep matches line by line, in practice this means any character on the line.

\d -- matches a digit (character grouping [0-9])

\D -- matches a non-digit (character grouping [^0-9])

\w -- matches a word character (character grouping [a-zA-Z0-9_]; the underscore is counted as a word character here)

\W -- matches a non-word character (character grouping [^a-zA-Z0-9_])

\s -- matches a 'space' character (character grouping [\t\n ]: tab, newline, space)

\S -- matches a 'non-space' character (character grouping [^\t\n ])

$ -- anchor that matches the 'end of line' if placed at the end of a regular expression.

^ -- anchor that matches the 'beginning of line' if placed at the beginning of a regular expression.

\b, \B -- anchors that match a word boundary (\b) or the lack of a word boundary (\B).

It's probably best to build up your use of regular expressions slowly, from the simplest cases to more complex ones. You are always better off starting with simple expressions, making sure that they work, and then adding more complex elements one by one. Unless you have a couple of years of experience with regex, do not even try to construct a complex regex in one giant step.

Here are a few examples:

grep -P '404 - - ' /var/log/http* # shows all 404 error messages in http logs

grep -P '40\d/'  /var/log/http* # matches 400, 401, 403, etc.

Here are more examples of simple regular expression that might be reused in other contexts:

grep -P 't.t'	      # matches t followed by any character followed by t
grep -P '^131'        # 131 at the beginning of a line
grep -P '0$'	      # matches lines that end with zero
grep -P 'error\d+'    # matches lines with the word error followed by digits
grep -Pv '^$'         # removes empty lines from the output
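The examples above take no file argument, so they read standard input. A quick way to experiment is to pipe a few made-up lines into them and count the matches; the sketch assumes GNU grep with -P support.

```shell
# Invented sample with one empty line in the middle.
sample=$(printf '131 start\ntot\nt5t\nerror42\n\nvalue 10\n')

dot=$(printf '%s\n' "$sample" | grep -cP 't.t')            # tot, t5t
start=$(printf '%s\n' "$sample" | grep -cP '^131')         # 131 start
endzero=$(printf '%s\n' "$sample" | grep -cP '0$')         # value 10
errnum=$(printf '%s\n' "$sample" | grep -cP 'error\d+')    # error42
nonempty=$(printf '%s\n' "$sample" | grep -cPv '^$')       # everything except the empty line

echo "$dot $start $endzero $errnum $nonempty"
```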

Character classes

Now let's add complexity by introducing classes of characters.

Character Classes: The square brackets are used to create character classes. A character class is used to match a specific type of character. For example, you can match any decimal digit using m/[0123456789]/. This will match a single character in the range of zero to nine.
Symbolic Character Classes: There are several character classes that are used so frequently that they have a symbolic representation. The period meta-character stands for a special character class that matches all characters except for the newline. The rest are \d, \D, \s, \S, \w, and \W.

Character classes can be sets or ranges and should be put inside square brackets; a - (minus) indicates "between" and a ^ after [ means "not":

grep -P '[abcde]'		# Either a or b or c or d or e
grep -P '[a-e]'			# same thing ("-" denote range here)
grep -P '[a-z]'			# Anything from a to z inclusive
grep -P '[^a-z]'		# Any character that is not a lower case letter

grep -P '[a-zA-Z]' 		# Any letter
grep -P '\w'	 		# Any letter, digit or underscore (a superset of the above)

grep -P '[a-z]+'		# Any non-empty sequence of lower case letters
grep -P '[01]'			# Either "0" or "1"
grep -P '[^0-9a-zA-Z]'    	# Any character that is not a letter or digit (close to \W, but also matches "_")

If you need to match a word whose length is unknown, you probably should not use * or *?, because a zero-length word makes no sense; use + instead.

Now let's introduce two so-called anchors, special characters that tell the regex engine that the match should start or end at a certain position of the string. The two most common anchors are ^ and $:

^ signifies the beginning of the line and
$ signifies the end of the line.

For example to match the first word on the line we can use the following regex :

grep -P '^\w+'

Several additional examples:

grep -P '0'		# zero: "0"
grep -P '0*'		# zero or more zeros (matches every line, since zero occurrences is allowed)
grep -P '0+'		# one or more zeros
grep -P '0*0'		# same as above
grep -P '\d'		# any digit but only one
grep -P '\d+'           # any integer
grep -P '\d+\.\d*'      # a subset of real numbers. Please note that 0. is a real number

grep -P '\d+\.\d+\.\d+\.\d+' # IP addresses (no control of the number of digits, so 1000.1000.1000.1000 would match this regex)

grep -P '\d+\.\d+\.\d+\.255' # IP addresses ending with 255
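The weakness of the loose IP regex, and one possible way to tighten it, can be sketched as follows. The tighter regex with {1,3} and \b is a suggestion of this sketch rather than part of the examples above, and it still accepts values such as 999; the sample lines are invented.

```shell
# Invented sample: two plausible addresses, one impossible one, one line without any.
sample=$(printf '10.0.0.255\n1000.1000.1000.1000\n192.168.1.7\nno address here\n')

# Loose form: any number of digits in each group.
loose=$(printf '%s\n' "$sample" | grep -cP '\d+\.\d+\.\d+\.\d+')
# Tighter form: one to three digits per group, delimited by word boundaries.
tight=$(printf '%s\n' "$sample" | grep -cP '\b\d{1,3}(\.\d{1,3}){3}\b')
# Addresses whose last group is 255.
last255=$(printf '%s\n' "$sample" | grep -P '\d+\.\d+\.\d+\.255')

echo "loose=$loose tight=$tight last255=$last255"
```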

At this point you can probably benefit from doing several exercises on the computer. Let's repeat key Perl regex metacharacters for reference:

\n		# A newline
\t		# A tab
\w		# Any alphanumeric (word) character.
		# The same as [a-zA-Z0-9_]
\W		# Any non-word character.
		# The same as [^a-zA-Z0-9_]
\d		# Any digit. The same as [0-9]
\D		# Any non-digit. The same as [^0-9]
\s		# Any whitespace character: space,
		# tab, newline, etc
\S		# Any non-whitespace character
\b		# A word boundary, outside [] only
\B		# No word boundary

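NOTE: To match the characters $, |, [, ], {, }, (, ), \, ^, *, and several others literally in a regular expression, precede them with a backslash. A slash (/) is a delimiter only in Perl itself, so grep does not require escaping it, although doing so is harmless with -P. For example: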

\|		# Vertical bar
\[		# An open square bracket
\)		# A closing parenthesis
\*		# An asterisk
\^		# A caret symbol
\/		# A slash
\\		# A backslash

Metacharacters in Character Classes

The character class [0123456789] or, shorter, [0-9] defines the class of decimal digits, and [0-9a-fA-F] defines the class of hexadecimal digits. You should use a dash to define a range of consecutive characters. Character classes let you match any of a range of characters. You can use variable interpolation inside the character class, but you must be careful when doing so. You can use metacharacters inside character classes but not as endpoints of a range. For example, you can do the following:

grep -P '[\d\s]'

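Most meta-characters that appear inside the square brackets that define a character class are used in their literal sense: characters such as ., *, + and ? lose their meta-meaning there. The exceptions are ^ as the first character, - between two characters, ] and the backslash, so escapes such as \d and \s keep working in Perl-style regex. This may be confusing but that's how it is.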
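Both behaviors can be observed side by side: the dot and asterisk become literal inside brackets, while \d and \s keep their meaning. A small sketch on invented lines, assuming GNU grep with -P support:

```shell
# Invented sample: "a", one varying character, "b".
sample=$(printf 'a.b\na*b\naxb\na b\na7b\n')

# Inside brackets . and * are literal, so "axb" is not matched.
literal=$(printf '%s\n' "$sample" | grep -P 'a[.*]b')
# Escapes such as \d and \s still work inside brackets.
escapes=$(printf '%s\n' "$sample" | grep -P 'a[\d\s]b')

printf '%s\n' "$literal" "$escapes"
```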

How to Create Complex Regex

Complex patterns are constructed from simple regular expressions using the following metacharacters:

Character Sequences: A sequence of characters (substring) will match the identical substring in the searched string. For example, 'abc' will match "abc" but not "cab" or "bca". If any character in the sequence is a meta-character, you need to use the backslash to match its literal value. Character sequences can be enclosed in round brackets. In this case metasymbols can be applied to them much like to character classes. You can "backreference" any sequence of characters in round brackets, see below.
Alternation: The alternation meta-character (|) will let you match more than one possible string. For example, 'a|b' will match if either the "a" character or the "b" character is in the searched string. You can use sequences of more than one character with alternation. For example, 'dog|cat' will match if either of the strings "dog" or "cat" is in the searched string. You can use several substrings in parentheses like in m/(dog|cat)/; However, this will affect pattern memory (see below)
Anchors: there are two types of anchors: beginning and end of the string, and word boundaries.
- The caret (^) and the dollar sign meta-characters are used to anchor a pattern to the beginning and the end of the searched string. The caret is always the first character in the pattern when used as an anchor. For example, '^one' will only match if the searched string starts with sequence of characters, one. The dollar sign is always the last character in the pattern when used as an anchor. For example, '(last|end)$' will match only if the searched string ends with either the character sequence last or the character sequence end.
- Word Boundaries: The \b meta-sequence will match the spot between a space and the first character of a word or between the last character of a word and the space. The \b will match at the beginning or end of a string if there are no leading or trailing spaces. For example, the regex '\bfoo\b' will match foo even without spaces surrounding the word. It will also match $foo because the dollar sign is not considered a word character. It will match foo but not foobar, and the regex '\bwiz' will match wizard but not geewiz. The \B meta-sequence will match everything except at a word boundary.
Quantifiers: There are several meta-characters that are devoted to controlling how many characters are matched. For example, the regex '\d{5}' means that five digits must be found for a match. The *, +, and ? meta-characters and the curly braces are all used as quantifiers. Ranges are also possible:
- {n} - matches exactly n copies of the preceding character
- {n,m} - matches at least n but not more than m copies of the preceding character
- {n,} - matches at least n copies of the preceding character.
Pattern Memory: Parentheses are used to store matched values into buffers for later recall; references to these buffers are called back-references. After you use the regex '(fish|fowl)' to match a line or a word and a match is found, the built-in backreference \1 will hold either fish or fowl depending on which sequence was matched.
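Quantifiers and word boundaries from the list above can be tried on a handful of invented lines. The sketch assumes GNU grep with -P support; the zip-code lines are made up purely to exercise \d{5}.

```shell
# Invented sample; the last line is the literal text $foo.
sample=$(printf 'zip 12345\nzip 1234\nfoo bar\nfoobar\n$foo\n')

five=$(printf '%s\n' "$sample" | grep -cP '\d{5}')          # needs five digits in a row
four=$(printf '%s\n' "$sample" | grep -P '^zip \d{4}$')     # exactly four digits, anchored on both sides
word=$(printf '%s\n' "$sample" | grep -cP '\bfoo\b')        # "foo bar" and "$foo", but not "foobar"

echo "five=$five four=$four word=$word"
```

Without the anchors, '\d{4}' would also match the line with five digits, because four of them are enough for a match.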

Meta-characters are characters that have an additional meaning above and beyond their literal meaning. For example, the period character can have two meanings in a pattern. First, it can be used to match a period character in the searched string - this is its literal meaning. And second, it can be used to match any character in the searched string except for the newline character - this is its meta-meaning. One more component can be used to construct complex patterns:

Variable Interpolation: Any variable in a regex is expanded to its value first, and only then is the regex evaluated by the regex engine. In Perl this is done by Perl itself; with grep it is done by the shell, and only when the regex is enclosed in double quotes. Only one level of interpolation is done. This means that if the value of the variable includes, for example, $scalar as a string value, then $scalar will not be interpolated.

Anchors

Metacharacters differ in their behavior. Some of them can match zero characters, but most require at least one character of a particular class. Here are examples of metacharacters that we already know:

\D (non-digit),
\d (digit),
\w (word),
\W (non-word)
\s (space)
\S (non-space)
. (dot, any character except newline)

Substrings matched by those metacharacters always have positive width. Or to put it differently the regular expression engine 'eats' characters in the process of matching.

The second group of characters does not eat any characters -- that means that they do not require any character to be present. This subclass is usually called anchors. Here are most important anchors:

^ (beginning of the string),
$ (ending of the string)
\b (word boundary)
\B (non-word boundary)

Anchors don't match a character, they match a condition. For example, the regex '^cat\b' will match a string with the word 'cat' at the beginning of the line.
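That anchors consume no characters is easy to see with the -o (--only-matching) option, which prints just the matched part of the line. A minimal sketch, assuming GNU grep with -P support and two made-up inputs:

```shell
# The anchors ^ and \b contribute nothing to the matched text: only "cat" is printed.
matched=$(printf 'cat catalog\n' | grep -oP '^cat\b')

# Here the line starts with "catalog", so the \b condition after "cat" fails.
rejected=$(printf 'catalog cat\n' | grep -cP '^cat\b' || true)

echo "matched=$matched rejected=$rejected"
```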

Alternation

Alternation is the way to tell Perl regex engine that you wish to match one of two or more patterns. In other words, the regular expression:

grep -P '^foreach|^for|^while' myscript.pl

tells the Perl regex engine "look for a line beginning with the string 'foreach' OR the string 'for' OR the string 'while'."

The | syntax splits a regular expression into sections, and each section is tried independently. Alternation always tries to match the first item first. If it doesn't match, the second pattern is then tried, and so on.

If the alternatives were written as '^for|^foreach', the string foreach would never be matched as a whole, because for would match before it. This is so common a mistake that I would recommend putting the longest string first in such cases, as in the example above.
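The effect of the order of alternatives becomes visible with the -o option, which prints only the matched text. A small sketch on one invented line of Perl code, assuming GNU grep with -P support:

```shell
# One made-up line of Perl source.
line='foreach my $x (@list)'

# Short alternative first: the engine stops at "for".
short_first=$(printf '%s\n' "$line" | grep -oP '^for|^foreach')
# Longest alternative first: the whole keyword is matched.
long_first=$(printf '%s\n' "$line" | grep -oP '^foreach|^for')

echo "short_first=$short_first long_first=$long_first"
```

When grep merely prints matching lines the difference is invisible, since the line is selected either way; it matters once you extract the matched text or add more regex after the alternation.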

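Parentheses followed by ? make part of a pattern optional, so the following regex matches both word and words:

grep -P 'word(s?)'

A useful option for matching words is -i (ignore case). For example:

grep -Pi 'word(s?)'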

Backreferences

Suppose you want to search for a string which contains a certain substring in more than one place. An example is the heading tag in HTML. Suppose I wanted to search for <h1>some string</h1> . This is easy enough to do. But suppose I wanted to do the same but allow H2 H3 H4 H5 H6 in place of H1. The expression <h[1-6]>.*</h[1-6]> is not good enough since it matches <h1>Hello world</h3> but we want the opening tag to match the closing one. To do this, we use a backreference

A backreference is the expression \n, where n is a number; it matches the contents of the n'th set of parentheses in the expression.

For example:

grep -Pi '<h([1-6]).*</h\1>' index.shtml

matches what we were trying to match before.

grep -Pi '\<h([1-6]).*</h\1>' ../Public*/index.shtml
<h2><a name="Latest">Recent updates</a></h2>
<h2><a href="switchboard.shtml">Softpanorama Switchboard</a></h2>
<h4><a href="switchboard.shtml">Switchboard </a>-- Links, Links, Links...</h4>
<h4><a name="Bookshelf">Bookshelf</a></h4>
<h4><a href="switchboard.shtml#recent_papers">Recent articles</a>:</h4>
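The difference between the loose regex and the one with a backreference can be checked on a few invented HTML lines. The sketch assumes GNU grep with -P support.

```shell
# Invented sample: the second line has mismatched opening and closing tags.
html=$(printf '<h1>Hello world</h1>\n<h1>Broken</h3>\n<H2>Upper</H2>\n')

# Without a backreference, any closing heading tag is accepted.
loose=$(printf '%s\n' "$html" | grep -ciP '<h[1-6]>.*</h[1-6]>')
# With \1 the closing tag must repeat the digit captured by ([1-6]).
strict=$(printf '%s\n' "$html" | grep -iP '<h([1-6]).*</h\1>')

echo "loose=$loose"
printf '%s\n' "$strict"
```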

Extended regular expressions

egrep uses matching patterns called extended regular expressions, which are similar to the pattern matching capabilities of the =~ operator of the Bash extended test command ( [[..]] ).

The extended regular expression uses the following metasymbols, which are compatible with Perl regex:

* -- Zero or more repetitions of the preceding item (same as in Perl regex)
+ -- One or more repetitions of the preceding item (same as in Perl regex)
? -- Follows an item that is optional (same as in Perl regex)
. -- Any single character (same as in Perl regex)
^ -- The start of the line (same as in Perl regex)
$ -- The end of the line (same as in Perl regex)
[...] -- A list of characters, including ranges and character classes (same as in Perl regex)
{n} -- Follows an item that is to appear n times (same as in Perl regex)
{n,} -- Follows an item that is to appear n or more times (same as in Perl regex)
{n,m} -- Follows an item that is to appear n to m times (same as in Perl regex)
(...) -- A sub-pattern that's used to change the order of operations (same as in Perl regex)

Notice that the symbols are not exactly the same as the globbing symbols used for file matching. For example, on the command line a question mark represents any character, whereas in grep, the period has this effect.

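In extended regular expressions the characters ?, +, {, |, (, and ) are special as written; precede them with a backslash to match them literally (in basic regular expressions it is the other way around in GNU grep). Enclose the whole regex in single quotes to prevent Bash from treating these characters as file-matching or shell characters.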

For example, if we search /var/log/messages first for messages that contain the word kernel and then for the word failed, we will get

[root@test01 log]# grep  kernel  messages | grep failed
Sep 24 22:48:02 localhost kernel: tsc: Fast TSC calibration failed
Sep 24 22:48:02 localhost kernel: acpi PNP0A03:00: _OSC failed (AE_NOT_FOUND); disabling ASPM
Sep 24 22:48:02 localhost kernel: psmouse serio1: trackpoint: failed to get extended button data
Sep 24 22:48:14 localhost systemd: Dependency failed for ABRT kernel log watcher.

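The asterisk (*) means zero or more repetitions of the preceding item, so .* is a placeholder representing zero or more arbitrary characters. Using it we can rewrite the previous query as a single regex. Note that the regex also fixes the order of the two words, so the systemd line, in which failed precedes kernel, is no longer matched: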

[root@test01 log]# egrep 'kernel.*failed' messages
Sep 24 22:48:02 localhost kernel: tsc: Fast TSC calibration failed
Sep 24 22:48:02 localhost kernel: acpi PNP0A03:00: _OSC failed (AE_NOT_FOUND); disabling ASPM
Sep 24 22:48:02 localhost kernel: psmouse serio1: trackpoint: failed to get extended button data
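The two queries above are not equivalent: the pipe finds lines that contain both words in any order, while 'kernel.*failed' requires kernel to come first. A sketch using the four log lines shown earlier (the temporary file is just a stand-in for /var/log/messages), with a third variant that restores order independence in a single regex:

```shell
# The four lines from the first example, saved to a scratch file.
log=$(mktemp)
cat > "$log" <<'EOF'
Sep 24 22:48:02 localhost kernel: tsc: Fast TSC calibration failed
Sep 24 22:48:02 localhost kernel: acpi PNP0A03:00: _OSC failed (AE_NOT_FOUND); disabling ASPM
Sep 24 22:48:02 localhost kernel: psmouse serio1: trackpoint: failed to get extended button data
Sep 24 22:48:14 localhost systemd: Dependency failed for ABRT kernel log watcher.
EOF

piped=$(grep 'kernel' "$log" | grep -c 'failed')                    # both words, any order
ordered=$(grep -cE 'kernel.*failed' "$log")                         # kernel must precede failed
either=$(grep -cE 'kernel.*failed|failed.*kernel' "$log")           # any order, single regex

echo "piped=$piped ordered=$ordered either=$either"
rm -f "$log"
```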

The caret (^) character indicates the beginning of a line. Use the caret to check for a pattern at the start of a line. The --invert-match (or -v) switch shows the lines that do not match. Lines that match are not shown. This is often valuable for analyzing config files: it allows you to delete all the comments, making the "meaningful" lines more visible

[root@test01 etc]# grep -v '^#' /etc/sudoers

Defaults !visiblepw

Defaults always_set_home

Defaults env_reset
Defaults env_keep = "COLORS DISPLAY HOSTNAME HISTSIZE KDEDIR LS_COLORS"
Defaults env_keep += "MAIL PS1 PS2 QTDIR USERNAME LANG LC_ADDRESS LC_CTYPE"
Defaults env_keep += "LC_COLLATE LC_IDENTIFICATION LC_MEASUREMENT LC_MESSAGES"
Defaults env_keep += "LC_MONETARY LC_NAME LC_NUMERIC LC_PAPER LC_TELEPHONE"
Defaults env_keep += "LC_TIME LC_ALL LANGUAGE LINGUAS _XKB_CHARSET XAUTHORITY"

Defaults secure_path = /sbin:/bin:/usr/sbin:/usr/bin

root ALL=(ALL) ALL

%wheel ALL=(ALL) NOPASSWD: ALL
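The output above still contains the empty lines of the original file. One possible refinement, sketched below on a made-up config fragment, is to drop blank lines and indented comments as well by combining a POSIX class with alternation in an extended regex.

```shell
# Invented config fragment with a comment, a blank line and an indented comment.
conf=$(mktemp)
cat > "$conf" <<'EOF'
# sample config (made up for this sketch)
Defaults env_reset

   # indented comment
root ALL=(ALL) ALL
EOF

# Drop lines that, after optional whitespace, are empty or start with #.
active=$(grep -Ev '^[[:space:]]*(#|$)' "$conf")

printf '%s\n' "$active"
rm -f "$conf"
```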

The --ignore-case (or -i) switch makes the search case insensitive.

grep -i error /var/log/messages

Regular expressions can be joined together with a vertical bar (|). This has the same effect as combining the results of two separate grep commands.

[root@test01 etc]# egrep -i 'error|fail|crash' /var/log/messages
Sep 24 22:48:02 localhost kernel: tsc: Fast TSC calibration failed
Sep 24 22:48:02 localhost kernel: acpi PNP0A03:00: _OSC failed (AE_NOT_FOUND); disabling ASPM
Sep 24 22:48:02 localhost kernel: acpi PNP0A03:00: fail to add MMCONFIG information, can't access extended PCI configuration space under this bridge.
Sep 24 22:48:02 localhost kernel: crash memory driver: version 1.1
Sep 24 22:48:02 localhost kernel: psmouse serio1: trackpoint: failed to get extended button data
Sep 24 22:48:14 localhost systemd: Dependency failed for ABRT Xorg log watcher.
Sep 24 22:48:14 localhost systemd: Job abrt-xorg.service/start failed with result 'dependency'.
Sep 24 22:48:14 localhost systemd: Dependency failed for Harvest vmcores for ABRT.
Sep 24 22:48:14 localhost systemd: Job abrt-vmcore.service/start failed with result 'dependency'.
Sep 24 22:48:14 localhost systemd: Dependency failed for Install ABRT coredump hook.
Sep 24 22:48:14 localhost systemd: Job abrt-ccpp.service/start failed with result 'dependency'.
Sep 24 22:48:14 localhost systemd: Dependency failed for ABRT kernel log watcher.
Sep 24 22:48:14 localhost systemd: Job abrt-oops.service/start failed with result 'dependency'.
Sep 24 22:48:14 localhost rngd: read error
Sep 24 22:48:14 localhost rngd: read error
Sep 24 22:48:29 localhost python: 2018/09/24 22:48:29.828375 INFO sfdisk with --part-type failed [1], retrying with -c
Sep 24 22:48:29 localhost python: 2018/09/24 22:48:29.926344 INFO sfdisk with --part-type failed [1], retrying with -c
Sep 24 22:50:04 localhost python: 2018/09/24 22:50:04.956978 WARNING Download failed, switching to host plugin

To identify the matching line, the --line-number (or -n) switch displays both the line number and the line. Using cut, head, and tail, the first line number can be saved in a variable. The number of bytes into the file can be shown with --byte-offset (or -b).

$ grep -n "crash" orders.txt
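The idea of saving the first matching line number in a variable can be sketched as follows. This is a minimal illustration, not the author's script: the temporary file and its contents are hypothetical stand-ins for orders.txt.

```shell
# Hypothetical sample data standing in for orders.txt
tmp=$(mktemp)
printf 'order 1 ok\norder 2 crash\norder 3 ok\norder 4 crash\n' > "$tmp"

# grep -n prints LINE:TEXT; keep the first hit, then the field before ':'
first=$(grep -n 'crash' "$tmp" | head -n 1 | cut -d: -f1)
echo "first match is on line $first"

# grep -b prints the byte offset at which each matching line starts
offset=$(grep -b 'crash' "$tmp" | head -n 1 | cut -d: -f1)
echo "first matching line starts at byte $offset"

rm -f "$tmp"
```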

The --count (or -c) switch counts the number of matches and displays the total.
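One detail worth checking on sample data: -c counts matching lines, not individual occurrences. The sketch below uses a hypothetical temporary file; combining -o with wc -l is one common way to count occurrences instead.

```shell
tmp=$(mktemp)
printf 'error error\nok\nerror\n' > "$tmp"

lines=$(grep -c 'error' "$tmp")           # number of lines that match
hits=$(grep -o 'error' "$tmp" | wc -l)    # number of individual matches

echo "matching lines: $lines, occurrences: $hits"
rm -f "$tmp"
```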

grep recognizes the standard character classes as well.

$ grep "[[:cntrl:]]" orders.txt

A complete list of Linux grep switches can be found in the grep man page.

Basic regex

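Basic regular expressions (BRE) are the default regex dialect of grep. They are often confused with the DOS-style wildcard (glob) patterns that sysadmins know from using the command line with utilities such as ls, but the two are different: in a glob * means "any string", while in a regular expression it means "zero or more repetitions of the preceding item".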

NOTE: In GNU grep, basic regular expressions allow alternation, but you need to remember to put a backslash before the vertical bar; without the backslash, | is an ordinary character. For example:

grep 'if|while'     #-- wrong

grep 'if\|while'     #-- will work, please note single quotes

Please use egrep (grep -E) or grep -P instead. In complex cases always use Perl itself or the -P option (Perl regular expressions, available only in GNU grep).

Using quotes

Single quotes are the safest to use, because they protect your regular expression from the shell. For example, grep ! file will often produce an error (since the shell thinks that "!" is referring to the shell command history) while grep '!' file will not.

When should you use single quotes?

The answer is this: if you want to use shell variables, you need double quotes; otherwise always use single quotes.

For example,

grep "$HOME" file

searches file for the name of your home directory, while

grep '$HOME' file

searches for the string $HOME
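The difference between the two quoting styles can be checked with a small sketch. The variable dir and the sample lines are hypothetical; dir is used instead of HOME so that the result does not depend on the environment.

```shell
dir=/home/alice                       # hypothetical value standing in for $HOME
tmp=$(mktemp)
printf '%s\n' "path is $dir" 'literal $dir text' > "$tmp"

expanded=$(grep "$dir" "$tmp")        # the shell substitutes: grep sees /home/alice
literal=$(grep '$dir' "$tmp")         # grep sees the four characters $dir

echo "double quotes matched: $expanded"
echo "single quotes matched: $literal"
rm -f "$tmp"
```

In a basic regular expression a $ that is not at the end of the pattern is an ordinary character, which is why the single-quoted form finds the literal string.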

Major options

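A complete list of Linux grep switches can be found in the man page. Default options can be specified via the environment variable GREP_OPTIONS, although newer GNU grep releases treat this variable as obsolescent, so an alias is the safer choice.

Below are the most useful grep options, which any sysadmin should know and use: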

-i Ignore case. Match either upper- or lowercase.
-v Print only lines that do not match the pattern.
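-H, --with-filename Print the file name for each match. This is the default when there is more than one file to search, but it is essential if you search a single file and still need its name in the output (for example, when grep is run from find -exec).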
-A n Show n lines after the matching line.
-B n Show n lines before the matching line.
-C n Show n lines before and after the matching line.
-n Print the matched line and its line number.
-l Print only the names of files with matching lines (this is the lowercase letter "L"), rather than the lines that contain the search pattern. Scanning of each file stops on the first match, which increases efficiency. The opposite option, -L (--files-without-match), also suppresses normal output and instead prints the name of each input file that has zero matches of the specified regex.
-c Print only the count of matching lines.
-w (or --word-regexp ) Select only those lines containing matches that form whole words. The test is that the matching substring must either be at the beginning of the line, or preceded by a non-word constituent character. Similarly, it must be either at the end of the line or followed by a non-word constituent character. Word-constituent characters are letters, digits, and the underscore.
-x (or --line-regexp ) Select only those matches that exactly match the whole line.
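The difference between a plain substring match, -w and -x is easiest to see on a tiny sample. The sketch below uses hypothetical data and combines each option with -c to count the selected lines.

```shell
tmp=$(mktemp)
printf 'abuse\nabused user\nreport abuse now\n' > "$tmp"

sub=$(grep -c 'abuse' "$tmp")      # substring match: every line qualifies
word=$(grep -cw 'abuse' "$tmp")    # whole word only: "abused" is rejected
line=$(grep -cx 'abuse' "$tmp")    # the whole line must be exactly "abuse"

echo "substring=$sub word=$word line=$line"
rm -f "$tmp"
```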

Listing filename of the matching line, options -H and -l

If grep is invoked with two or more files as arguments, it prefixes each match with the name of the file in which it was found. Problems arise with a single file: in this case grep by default does not list the file name. As grep is often used with find via the -exec option (see below), this is a very important use case, and GNU grep provides a special option to handle it: -H. Older versions of grep had no such option, and you need to imitate it by supplying the dummy file /dev/null as the second file. For example:

egrep -H 'error|crash' /var/log/messages
egrep 'error|crash' /var/log/messages /dev/null # same effect as -H; for the older grep versions found on Solaris, HP-UX, and AIX

Option -l lists only the names of the files that contain the search string. To get the reverse, "do not contain" effect, use option -L (combining -l with -v would instead list files that contain at least one non-matching line). This option is mainly useful in scripts and is seldom used on the command line. For example, if we have daily HTTP logs and want to determine when a particular IP accessed the site, we can use:

egrep -l '10\.10\.5\.4' http_logs*
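A small sketch of -l and its opposite -L, using two hypothetical log files created in a temporary directory. The -F option is added so that the dots in the IP address are taken literally.

```shell
d=$(mktemp -d)
printf 'GET / 10.10.5.4\n' > "$d/http_logs.1"
printf 'GET / 10.10.9.9\n' > "$d/http_logs.2"

with=$(cd "$d" && grep -lF '10.10.5.4' http_logs*)      # files that contain the IP
without=$(cd "$d" && grep -LF '10.10.5.4' http_logs*)   # files that do not

echo "contains: $with"
echo "does not contain: $without"
rm -rf "$d"
```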

Printing context of the matching line, options -B (before), and -A (after)

GNU grep is able to output lines in the vicinity of the matching line. In many cases the context of the matching line is extremely important; this is a typical situation in troubleshooting. GNU grep provides very flexible capabilities for this (you can also print the line number of the matching line with option -n).

-A NUM, --after-context=NUM Print NUM lines of trailing context after matching lines. Places a line containing a group separator (--) between contiguous groups of matches. With the -o or --only-matching option, this has no effect and a warning is given.
-B NUM, --before-context=NUM Print NUM lines of leading context before matching lines. Places a line containing a group separator (--) between contiguous groups of matches. With the -o or --only-matching option, this has no effect and a warning is given.
-C NUM, -NUM, --context=NUM Print NUM lines of output context. Places a line containing a group separator (--) between contiguous groups of matches. With the -o or --only-matching option, this has no effect and a warning is given.
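The three context options can be compared on a five-line hypothetical file. Note that when -n is combined with context, GNU grep marks matching lines with a colon and context lines with a hyphen after the line number.

```shell
tmp=$(mktemp)
printf 'l1\nl2\nERROR here\nl4\nl5\n' > "$tmp"

after=$(grep -A 1 'ERROR' "$tmp")       # the match plus one trailing line
before=$(grep -B 1 'ERROR' "$tmp")      # one leading line plus the match
around=$(grep -n -C 1 'ERROR' "$tmp")   # one line on each side, numbered

printf '%s\n' "$around"
rm -f "$tmp"
```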

Ignoring case

-i, --ignore-case: Ignore case distinctions in both the PATTERN and the input files. (-i is specified by POSIX.)

This is an important option, often used when grepping logs for error messages by specific keywords.

Reversal of matching

-v, --invert-match Invert the sense of matching, to select non-matching lines. (-v is specified by POSIX.)

Specifying multiple regular expressions in the file

The relevant option is:

-f FILE, --file=FILE Obtain patterns from FILE, one per line. The empty file contains zero patterns, and therefore matches nothing. (-f is specified by POSIX.)

The regexes on separate lines are treated as connected by a logical "OR", so if there are three lines containing regexes, any input line that matches at least one of them will be printed. See the discussion of a very perverted example at Getting all the matches with 'grep -f' option - Stack Overflow

This is a convenient option for scanning logs for error messages using a list of specific keywords.
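As a sketch of this use, the following keeps the keywords in a pattern file and scans a log with them. Both file names and their contents are hypothetical.

```shell
d=$(mktemp -d)
printf 'error\nfail\ncrash\n' > "$d/patterns.txt"                      # one regex per line
printf 'boot ok\nread error\ndisk FAILED\nall fine\n' > "$d/log.txt"

# A line is printed if it matches at least one pattern; -i ignores case
matched=$(grep -i -f "$d/patterns.txt" "$d/log.txt")

printf '%s\n' "$matched"
rm -rf "$d"
```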

Grep Recursive Search

GNU grep can recursively search for a regex or fixed string via the -r option (or --recursive). With -r, symbolic links are followed only if they are named on the command line. To follow all symbolic links, use the -R option (or --dereference-recursive).

But traditionally, for recursive search grep is combined with find (see below).
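For simple cases GNU grep alone may be enough. The sketch below assumes GNU grep, whose --include and --exclude-dir options restrict a recursive search; the directory layout is hypothetical.

```shell
d=$(mktemp -d)
mkdir -p "$d/site/sub" "$d/site/.git"
printf 'hello\n' > "$d/site/a.html"
printf 'hello\n' > "$d/site/sub/b.html"
printf 'hello\n' > "$d/site/notes.txt"     # wrong suffix, skipped by --include
printf 'hello\n' > "$d/site/.git/c.html"   # inside .git, skipped by --exclude-dir

found=$(cd "$d/site" && grep -rl --include='*.html' --exclude-dir=.git 'hello' . | sort)

printf '%s\n' "$found"
rm -rf "$d"
```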

Examples

Fgrep:

fgrep -l 'hahaha' * # just the names of matching files

fgrep  'May 16'  /var/logs/https/access # we are searching string, so fgrep is better

fgrep -v 'yahoo.com' /var/logs/https/access  # filtering yahoo.com using -v options

find . -type f -print | xargs fgrep -l 'hahaha'

A more complex example: print the lines of invoices.txt that do not appear in paid_today.txt (note the elegance of the solution: one input file serves as a set of fixed strings for grep to match in the other):

fgrep -xvf paid_today.txt invoices.txt > unpaid_invoices.txt
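The same technique on hypothetical sample data, written with grep -F (the modern spelling of fgrep), shows what ends up in the output:

```shell
d=$(mktemp -d)
printf 'INV-1\nINV-2\nINV-3\n' > "$d/invoices.txt"
printf 'INV-2\n' > "$d/paid_today.txt"

# -F fixed strings, -x whole-line match, -v invert, -f read the strings from a file
unpaid=$(grep -F -x -v -f "$d/paid_today.txt" "$d/invoices.txt")

printf '%s\n' "$unpaid"
rm -rf "$d"
```

Without -x a paid invoice such as INV-2 would also remove INV-20 from the list, since it is a substring.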

Grep:

Suppose you want to match a specific number of repetitions of a pattern. A good example is IP address. You could search for an arbitrary IP address like this:

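grep -P '[[:digit:]]{1,3}(\.[[:digit:]]{1,3}){3}' file

For ASCII input there is no practical difference between [0-9] and [[:digit:]]; note that a POSIX class such as [:digit:] is valid only inside a bracket expression, hence the double brackets.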

The same can be done for phone numbers written in 999-999-9999 form:

([[:digit:]]{3}[[:punct:]]){2}[[:digit:]]{4}
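Both patterns can be tried with -E and -o, which prints only the matched text. This is a sketch on hypothetical sample lines; neither regex validates its input (for example, 999.999.999.999 would pass as an IP address).

```shell
tmp=$(mktemp)
printf 'call 555-123-4567 now\nbad 55-1234\nip 192.168.1.10\n' > "$tmp"

phones=$(grep -Eo '([[:digit:]]{3}[[:punct:]]){2}[[:digit:]]{4}' "$tmp")
ips=$(grep -Eo '[[:digit:]]{1,3}(\.[[:digit:]]{1,3}){3}' "$tmp")

echo "phone: $phones"
echo "ip: $ips"
rm -f "$tmp"
```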

To search email that has come from a certain address:

grep -P '^From:.*somebody\@' /var/spool/mail/root

To search several variants of the same name:

grep -P 'Nic?k(olai)? Bezroukov' file  # matches "Nick Bezroukov" or "Nikolai Bezroukov"; with -P the parentheses are not backslashed

grep -P 'cat|dog' file # matches lines containing the string "cat" or the string "dog"

grep -P '^(From|To|Subject):' file  # matches the corresponding part of the email header

Using -l option

grep -Pl 'nobody\@nowhere'

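Using -w option and word anchors

grep -w 'abuse' *
grep '\<abuse' *
grep 'abuse\>' *

The first command selects lines that contain 'abuse' as a whole word. The second searches for lines where any word begins with the letters 'abuse', and the third searches for lines where any word ends with the letters 'abuse'. The anchors \< and \> must be used without -w: with -w the match has to be a complete word, so both anchored forms would behave exactly like the first command.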

Grep with pipes

The output of grep can also be piped to another program as follows:

ps -ef | grep httpd | wc  -l

The example above counts how many 'httpd' processes are running. Note that the count includes the grep process itself; see Tip 1 below.

We can find letters from a particular spammer using the following pipe:

ls | xargs -n1 fgrep -H '[email protected]'

Note the option -H: it instructs grep to output the name of the file in all cases. By default grep outputs the name of the file only if more than one file is specified as an argument.

During debugging comments often obscure the logic of the program and interfere with the search of a bug. Here is how to display non comment lines of myscript.pl

grep -v '^#' ~/myscript.pl | less

Grep is very useful as a simple yet powerful HTML analyzer. Here is how to find HTML tags that are not closed before the line break.

egrep '<[^>]*$' *.html

You can also use grep to search compressed files by decompressing them to standard input, but it is better to use the available wrappers such as zgrep and bzgrep. zgrep is a wrapper that invokes grep on compressed or gzip'ed files. All options specified are passed directly to grep. If no file is specified, the standard input is decompressed and fed to grep. Otherwise the given files are uncompressed if necessary and fed to grep.

grep gzip files
---------------
zgrep foo myfile.gz                           # all lines containing the pattern 'foo'
zgrep 'GET /blog' access_log.gz               # all lines containing 'GET /blog'
zgrep 'GET /blog' access_log.gz | more        # same thing, paged through more

Tips

Tip 1: How to block an extra line when grepping ps output for a string or pattern:

ps -ef | grep '[c]ron'

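If the pattern had been written without the square brackets, it would have matched not only the ps output line for cron, but also the ps output line for grep. With the brackets, the regex still matches the string cron, but the grep command line shown by ps contains the literal text [c]ron, which the regex does not match.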
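The effect can be reproduced without a running cron by feeding grep two hypothetical lines that imitate ps output, one for the daemon and one for the grep command itself:

```shell
# What ps would show while "grep cron" is running (hypothetical lines)
plain_ps='root 101 /usr/sbin/crond -n
joe  202 grep cron'

# What ps would show while "grep [c]ron" is running
trick_ps='root 101 /usr/sbin/crond -n
joe  202 grep [c]ron'

plain=$(printf '%s\n' "$plain_ps" | grep -c 'cron')     # counts the grep line too
trick=$(printf '%s\n' "$trick_ps" | grep -c '[c]ron')   # counts only the daemon

echo "plain=$plain trick=$trick"
```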

Tip 2: How do I search directories recursively?

grep -r --include='*.html' 'hello' ~

Newer versions of grep have the -r option. The example above searches for 'hello' in all html files under the user's home directory. For more control over which files are searched, use find and xargs. For example,

find ~ -name '*.html' -type f -print | xargs grep 'hello'

Tip 3: How do I output context around the matching lines?

grep -C 2 'hello' * # prints two lines of context around each matching line.

Using grep with find

The Linux find command searches for files that meet specific conditions such as files with a certain name or files greater than a certain size. find is similar to the following loop where MATCH is the matching criteria:

ls --recursive | while read FILE ; do
     # test file for a match
    if [ $MATCH ] ; then
       printf "%s\n" "$FILE"
    fi
done

This script recursively searches directories under the current directory, looking for a filename that matches some condition called MATCH.

find is much more powerful than this script fragment. Like the built-in test command, find switches create expressions describing the qualities of the files to find. There are also switches to change the overall behavior of find and other switches to indicate actions to perform when a match is made.

The basic matching switch is -name, which indicates the name of the file to find. Name can be a specific filename or it can contain shell path wildcard globbing characters like * and ?. If pattern matching is used, the pattern must be enclosed in quotation marks to prevent the shell from expanding it before the find command examines it.

find /etc -name "*.conf"

The previous find command matches any type of file, including files such as pipes or directories, which is not usually the intention of a user. The -type switch limits the files to a certain type of file. The -type f switch matches only regular files, the most common kind of search. The type can also be b (block device), c (character device), d (directory), p (pipe), l (symbolic link), or s (socket).

find /etc -name "*.conf"  -type f

The switch -name "*.conf" -type f is an example of a find expression. These switches match a file that meets both of these conditions (implicitly, a logical "and"). There are other operator switches for combining conditions into logical expressions, as follows:

( expr )-- Forces the switches in the parentheses to be tested first
-not expr (or ! expr)-- Ensures that the switch is not matched
expr -and expr (or expr -a expr)-- The default behavior; looks for files that match both sets of switches
expr -or expr (or expr -o expr)-- Logical "or". Looks for files that match either sets of switches
expr , expr-- Always checks both sets of switches, but uses the result of the right set to determine a match

For example, to count the number of regular files with the suffix .conf under /etc, do this:

[root@test01 etc]# find /etc -name "*.conf"  -type f | wc -l
145

The number of files without suffix .conf can be counted as well.

find . ! -name "*.conf" -type f | wc -l

Parentheses must be escaped by a backslash or quotes to prevent Bash from interpreting them as a subshell. Using parentheses, the number of files ending in .conf or .config can be expressed as

$ find . "(" -name "*.conf" -or -name "*.config" ")" -type f | wc -l
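Why the parentheses matter can be seen on a hypothetical directory tree: the implicit "and" binds tighter than -o, so without grouping the -type f test applies only to the second -name.

```shell
d=$(mktemp -d)
mkdir "$d/x.conf"                              # a directory whose name ends in .conf
touch "$d/a.conf" "$d/b.config" "$d/c.txt"

grouped=$(find "$d" \( -name '*.conf' -o -name '*.config' \) -type f | wc -l)
ungrouped=$(find "$d" -name '*.conf' -o -name '*.config' -type f | wc -l)
others=$(find "$d" -type f ! -name '*.conf' ! -name '*.config' | wc -l)

echo "grouped=$grouped ungrouped=$ungrouped others=$others"
rm -rf "$d"
```

The ungrouped form also reports the directory x.conf, because it is read as: name matches *.conf, or (name matches *.config and the entry is a regular file).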

Some expression switches refer to measurements of time. Historically, find times were measured in days, but the GNU version adds min switches for minutes. find looks for an exact match.

To search for files older than an amount of time, include a plus or minus sign. If a plus sign (+) precedes the amount of time, find searches for times greater than this amount. If a minus sign (-) precedes the time measurement, find searches for times less than this amount. The plus and minus zero days designations are not the same: +0 in days means "older than no days," or in other words, files one or more days old. Likewise, -5 in minutes means "younger than 5 minutes" or "zero to four minutes old".

There are several switches used to test the access time, which is the time a file was last read or written. The -anewer switch checks to see whether one file was accessed more recently than a specified file. -atime tests the number of days ago a file was accessed. -amin checks the access time in minutes.

Likewise, you can check the inode change time with -cnewer, -ctime, and -cmin. The inode change time is updated whenever the file's metadata (owner, permissions, link count) or contents change; it is often mistaken for the creation time, which it is not. You can check the modified time, which is the time a file was last written to, by using -newer, -mtime, and -mmin.

To find files that haven't been changed in more than one day:

find /etc -name "*.conf" -type f -mtime +0

To find files that were modified within the last hour:

[root@test01 etc]# find /etc -type f -mmin -60
/etc/sudoers
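The two time tests can be tried on hypothetical files whose modification time is set artificially. The sketch assumes GNU touch (for -d with a relative date) and GNU find (for -printf).

```shell
d=$(mktemp -d)
touch "$d/new.conf"                          # modified just now
touch -d '3 days ago' "$d/old.conf"          # modification time pushed back

old=$(find "$d" -type f -mtime +0 -printf '%f\n')      # at least one full day old
recent=$(find "$d" -type f -mmin -60 -printf '%f\n')   # changed within the last hour

echo "old: $old"
echo "recent: $recent"
rm -rf "$d"
```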

The -size switch tests the size of a file. The default measurement is 512-byte blocks, which is counterintuitive to many users and a common source of errors. Unlike the time-measurement switches, which have different switches for different measurements of time, to change the unit of measurement for size you must follow the amount with b (512-byte blocks), c (bytes), w (two-byte words), k (kibibytes), M (mebibytes), or G (gibibytes). The suffixes are case-sensitive: there is no lowercase m. Like the time measurements, the amount can have a minus sign (-) to test for files smaller than the specified size, or a plus sign (+) to test for larger files.

For example, use this to find files greater than 1 GiB:

$ find / -type f  -size +1G
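A scaled-down sketch with hypothetical files shows one more trap: GNU find rounds the file size up to whole units before comparing, so the unit chosen changes the result.

```shell
d=$(mktemp -d)
head -c 2048 /dev/zero > "$d/big.log"       # 2048 bytes = 2 units of 1k
head -c 10 /dev/zero > "$d/small.log"       # 10 bytes, rounded up to 1 unit of 1k

big=$(find "$d" -type f -size +1k -printf '%f\n')        # more than 1 KiB
small=$(find "$d" -type f -size -1024c -printf '%f\n')   # fewer than 1024 bytes
rounded=$(find "$d" -type f -size -1k | wc -l)           # "less than 1 unit"

echo "big=$big small=$small matched by -1k: $rounded"
rm -rf "$d"
```

Because of the rounding, -size -1k matches only empty files; use the c suffix when an exact byte threshold is needed.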

find shows the matching paths on standard output. Historically, the -print switch had to be used. Printing the paths is now the default behavior for most Unix-like operating systems, including Linux. If compatibility is a concern, add -print to the end of the find parameters.

To perform a different action on a successful match, use -exec. The -exec switch runs a program on each matching file. This is often combined with rm to delete matching files, or grep to further test the files to see whether they contain a certain pattern. The name of the file is inserted into the command by a pair of curly braces ({}) and the command ends with an escaped semicolon. (If the semicolon is not escaped, the shell interprets it as the end of the find command instead.)

$ find . -type f -name "*.txt" -exec grep 10.10.10.10 {} \;

More than one action can be specified. To show the filename after a grep match, include -print.

$ find . -type f -name "*.txt" -exec grep Table {} \; -print

find expects {} to appear by itself (that is, surrounded by whitespace). It can't be combined with other characters, such as in an attempt to form a new pathname.

The -exec switch can be slow for a large number of files: The command must be executed for each match. When you have the option of piping the results to a second command, the execution speed is significantly faster than when using -exec. A pipe generates the results with two commands instead of hundreds or thousands of commands.
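Two common ways to avoid one grep process per file are sketched below on hypothetical files: ending -exec with + instead of \; so that find batches many file names into one command, and piping NUL-separated names to xargs -0, which also survives spaces in file names.

```shell
d=$(mktemp -d)
printf 'Table one\n' > "$d/a.txt"
printf 'nothing\n'   > "$d/b c.txt"          # file name with a space
printf 'Table two\n' > "$d/d.txt"

# One grep invocation receives many file names
batch=$(find "$d" -type f -name '*.txt' -exec grep -l 'Table' {} + | wc -l)

# Same result through a pipe; -print0 and -0 keep odd file names intact
piped=$(find "$d" -type f -name '*.txt' -print0 | xargs -0 grep -l 'Table' | wc -l)

echo "batch=$batch piped=$piped"
rm -rf "$d"
```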

The -ok switch works the same way as -exec except that it interactively verifies whether the command should run.

$ find . -type f -name "*.txt" -ok rm {} \;
< rm ... ./orders.txt > ? n
< rm ... ./advocacy/linux.txt > ? n
< rm ... ./advocacy/old_orders.txt > ? n

The -ls action switch lists the matching files with more detail.

The -printf switch makes find act like a searching version of the statftime command. The % format codes indicate what kind of information about the file to print. Many of these provide the same functions as statftime, but use a different code.

%a-- File's last access time in the format returned by the C ctime function.
%c-- File's last status change time in the format returned by the C ctime function.
%f-- File's name with any leading directories removed (only the last element).
%g-- File's group name, or numeric group ID if the group has no name.
%h-- Leading directories of file's name (all but the last element).
%i-- File's inode number (in decimal).
%m-- File's permission bits (in octal).
%p-- File's pathname.
%P-- File's pathname with the name of the command line argument under which it was found removed.
%s-- File's size in bytes.
%t-- File's last modification time in the format returned by the C ctime function.
%u-- File's username, or numeric user ID if the user has no name.
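A short sketch of -printf with three of the codes listed above, run against a hypothetical file. It assumes GNU find; note that -printf does not add a newline by itself, so \n is part of the format.

```shell
d=$(mktemp -d)
printf 'abc' > "$d/a.conf"                   # three bytes
chmod 640 "$d/a.conf"

# %f = name without directories, %s = size in bytes, %m = permission bits (octal)
info=$(find "$d" -type f -name '*.conf' -printf '%f %s %m\n')

echo "$info"
rm -rf "$d"
```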

Grep alternatives

There are also several variants of grep that can search directly in compressed files, for example gzgrep and bzgrep. gzgrep is a wrapper that invokes grep on compressed or gzip'ed files. All options specified are passed directly to grep. If no file is specified, the standard input is decompressed and fed to grep. Otherwise the given files are uncompressed if necessary and fed to grep.

Grep has one useful option for grepping a file extracted from an archive:

--label=LABEL
Display input actually coming from standard input as input coming from file LABEL.
This is especially useful for tools like zgrep, e.g. gzip -cd foo.gz | grep --label=foo something
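The man page example can be tried as follows. The archive is a hypothetical one created on the spot; -H is added so that grep prints a file name even though it reads a single input, and --label supplies that name in place of "(standard input)".

```shell
d=$(mktemp -d)
printf 'alpha\nsomething here\n' | gzip -c > "$d/foo.gz"

labelled=$(gzip -cd "$d/foo.gz" | grep -H --label=foo 'something')

echo "$labelled"
rm -rf "$d"
```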

Clearly, as one of the oldest Unix utilities, grep can be improved. There are several alternative implementations, each of which is better than the original grep in several major ways, but not enough to displace grep:

pcregrep -- A separate implementation of grep that uses Perl regular expressions exclusively (see pcregrep specification). It uses the same options as grep, and as such is a very convenient replacement. Now this important functionality is re-implemented in GNU grep via option -P.
ack -- ack is designed for searching in source code. It is written in Perl. Major options are compatible, but its behaviour slightly deviates from grep. By default ack searches entire trees while ignoring Subversion, Git and other VCS directories and other files that aren't your source code. Where grep is a general text search tool, ack is especially for the programmer searching source code. Common tasks take fewer keystrokes.

Dr. Nikolai Bezroukov

NEWS CONTENTS

20181115 : Is Glark a Better Grep Linux.com The source for Linux information ( Nov 15, 2018 , www.linux.com )
20181029 : Getting all the matches with 'grep -f' option ( Oct 29, 2018 , stackoverflow.com )
20171109 : Searching files ( sanctum.geek.nz )
20171101 : Default grep options by Tom Ryder ( May 18, 2012 , sanctum.geek.nz )
20171031 : Counting with grep and uniq by Tom Ryder ( Feb 18, 2012 , sanctum.geek.nz )
20110730 : pcregrep(1) grep with Perl-compatible regex - Linux man page ( Jul 30, 2011 )
20090804 : Tech Tip View Config Files Without Comments Linux Journal ( Aug 4, 2009 )
20090318 : UNIX BASH scripting Highlight match with color in grep command ( Mar 18, 2009 )
20080911 : glark by Jeff Pace ( Sep 11, 2008 )
20080506 : ack! - Perl-based grep replacement ( May 06, 2008 )
20060531 : Linux.com GNU grep's new features ( May 31, 2006 )
20060531 : [ z a z z y b o b . c o m ] -usr-share-doc-tips
20060531 : pcregrep-4.5-1.i386 RPM
20060531 : Re Replacing GNU grep revisited

Old News ;-)

[Nov 15, 2018] Is Glark a Better Grep Linux.com The source for Linux information

Nov 15, 2018 | www.linux.com

Is Glark a Better Grep? GNU grep is one of my go-to tools on any Linux box. But grep isn't the only tool in town. If you want to try something a bit different, check out glark, a grep alternative that might be better in some situations.

What is glark? Basically, it's a utility that's similar to grep, but it has a few features that grep does not. This includes complex expressions, Perl-compatible regular expressions, and excluding binary files. It also makes showing contextual lines a bit easier. Let's take a look.

I installed glark (yes, annoyingly it's yet another *nix utility that has no initial cap) on Linux Mint 11. Just grab it with apt-get install glark and you should be good to go.

Simple searches work the same way as with grep: glark string filenames. So it's pretty much a drop-in replacement for those.

But you're interested in what makes glark special. So let's start with a complex expression, where you're looking for this or that term:

glark -r -o thing1 thing2 *

This will search the current directory and subdirectories for "thing1" or "thing2." When the results are returned, glark will colorize the results and each search term will be highlighted in a different color. So if you search for, say "Mozilla" and "Firefox," you'll see the terms in different colors.

You can also use this to see if something matches within a few lines of another term. Here's an example:

glark --and=3 -o Mozilla Firefox -o ID LXDE *

This was a search I was using in my directory of Linux.com stories that I've edited. I used three terms I knew were in one story, and one term I knew wouldn't be. You can also just use the --and option to spot two terms within X number of lines of each other, like so:

glark --and=3 term1 term2

That way, both terms must be present.

You'll note the --and option is a bit simpler than grep's context line options. However, glark tries to stay compatible with grep, so it also supports the -A , -B and -C options from grep.

Miss the grep output format? You can tell glark to use grep format with the --grep option.

Most, if not all, GNU grep options should work with glark.

Before and After

If you need to search through the beginning or end of a file, glark has the --before and --after options (short versions, -b and -a ). You can use these as percentages or as absolute number of lines. For instance:

glark -a 20 expression *

That will find instances of expression after line 20 in a file.

The glark Configuration File

Note that you can have a ~/.glarkrc that will set common options for each use of glark (unless overridden at the command line). The man page for glark does include some examples, like so:

after-context:     1
before-context:    6
context:           5
file-color:        blue on yellow
highlight:         off
ignore-case:       false
quiet:             yes
text-color:        bold reverse
line-number-color: bold
verbose:           false
grep:              true

Just put that in your ~/.glarkrc and customize it to your heart's content. Note that I've set mine to grep: false and added the binary-files: without-match option. You'll definitely want the quiet option to suppress all the notes about directories, etc. See the man page for more options. It's probably a good idea to spend about 10 minutes on setting up a configuration file.

Final Thoughts

One thing that I have noticed is that glark doesn't seem as fast as grep . When I do a recursive search through a bunch of directories containing (mostly) HTML files, I seem to get results a lot faster with grep . This is not terribly important for most of the stuff I do with either utility. However, if you're doing something where performance is a major factor, then you may want to see if grep fits the bill better.

Is glark "better" than grep? It depends entirely on what you're doing. It has a few features that give it an edge over grep, and I think it's very much worth trying out if you've never given it a shot.

[Oct 29, 2018] Getting all the matches with 'grep -f' option

Perverted example, but interesting question.

Oct 29, 2018 | stackoverflow.com

Arturo ,Mar 24, 2017 at 8:59

I would like to find all the matches of the text I have in one file ('file1.txt') that are found in another file ('file2.txt') using the grep option -f, that tells to read the expressions to be found from file.
'file1.txt'

a

a

'file2.txt'

a

When I run the command:

grep -f file1.txt file2.txt -w

I get only once the output of the 'a'. instead I would like to get it twice, because it occurs twice in my 'file1.txt' file. Is there a way to let grep (or any other unix/linux) tool to output a match for each line it reads? Thanks in advance. Arturo

RomanPerekhrest ,Mar 24, 2017 at 9:02

the matches of the text - some exact text? should it compare line to line? – RomanPerekhrest Mar 24 '17 at 9:02

Arturo ,Mar 24, 2017 at 9:04

Yes it contains exact match. I added the -w options, following your input. Yes, it is a comparison line by line. – Arturo Mar 24 '17 at 9:04

Remko ,Mar 24, 2017 at 9:19

Grep works as designed, giving only one output line. You could use another approach:

while IFS= read -r pattern; do
    grep -e "$pattern" file2.txt
done < file1.txt

This would use every line in file1.txt as a pattern for the grep, thus resulting in the output you're looking for.

Arturo ,Mar 24, 2017 at 9:30

That did the trick!. Thank you. And it is even much faster than my previous grep command. – Arturo Mar 24 '17 at 9:30

ar7 ,Mar 24, 2017 at 9:12

When you use

grep -f pattern.txt file.txt

It means match the pattern found in pattern.txt in the file file.txt.

It is giving you only one output because that is all is there in the second file.

Try interchanging the files,

grep -f file2.txt file1.txt -w

Does this answer your question?

Arturo ,Mar 24, 2017 at 9:17

I understand that, but still I would like to find a way to print a match each time a pattern (even a repeated one) from 'pattern.txt' is found in 'file.txt'. Even a tool or a script rather then 'grep -f' would suffice. – Arturo Mar 24 '17 at 9:17

[Nov 09, 2017] Searching files

Notable quotes:

"... With all this said, there's a very popular alternative to grep called ack , which excludes this sort of stuff for you by default. It also allows you to use Perl-compatible regular expressions (PCRE), which are a favourite for many programmers. It has a lot of utilities that are generally useful for working with source code, so while there's nothing wrong with good old grep since you know it will always be there, if you can install ack I highly recommend it. There's a Debian package called ack-grep , and being a Perl script it's otherwise very simple to install. ..."

"... Unix purists might be displeased with my even mentioning a relatively new Perl script alternative to classic grep , but I don't believe that the Unix philosophy or using Unix as an IDE is dependent on sticking to the same classic tools when alternatives with the same spirit that solve new problems are available. ..."

sanctum.geek.nz

More often than attributes of a set of files, however, you want to find files based on their contents, and it's no surprise that grep, in particular grep -R, is useful here. This searches the current directory tree recursively for anything matching 'someVar':

$ grep -FR someVar .

Don't forget the case insensitivity flag either, since by default grep works with fixed case:

$ grep -iR somevar .

Also, you can print a list of files that match without printing the matches themselves with grep -l:

$ grep -lR someVar .

If you write scripts or batch jobs using the output of the above, use a while loop with read to handle spaces and other special characters in filenames:

grep -lR someVar | while IFS= read -r file; do
    head "$file"
done

If you're using version control for your project, this often includes metadata in the .svn, .git, or .hg directories. This is dealt with easily enough by excluding (grep -v) anything matching an appropriate fixed (grep -F) string:

$ grep -R someVar . | grep -vF .svn

Some versions of grep include --exclude and --exclude-dir options, which may be tidier.

With all this said, there's a very popular alternative to grep called ack, which excludes this sort of stuff for you by default. It also allows you to use Perl-compatible regular expressions (PCRE), which are a favourite for many programmers. It has a lot of utilities that are generally useful for working with source code, so while there's nothing wrong with good old grep since you know it will always be there, if you can install ack I highly recommend it. There's a Debian package called ack-grep, and being a Perl script it's otherwise very simple to install.

Unix purists might be displeased with my even mentioning a relatively new Perl script alternative to classic grep, but I don't believe that the Unix philosophy or using Unix as an IDE is dependent on sticking to the same classic tools when alternatives with the same spirit that solve new problems are available.

[Nov 01, 2017] Default grep options by Tom Ryder

May 18, 2012 | sanctum.geek.nz

When you're searching a set of version-controlled files for a string with grep , particularly if it's a recursive search, it can get very annoying to be presented with swathes of results from the internals of the hidden version control directories like .svn or .git , or include metadata you're unlikely to have wanted in files like .gitmodules .

GNU grep uses an environment variable named GREP_OPTIONS to define a set of options that are always applied to every call to grep . This comes in handy when exported in your .bashrc file to set a "standard" grep environment for your interactive shell. Here's an example of a definition of GREP_OPTIONS that excludes a lot of patterns which you'd very rarely if ever want to search with grep :
GREP_OPTIONS=
for pattern in .cvs .git .hg .svn; do
    GREP_OPTIONS="$GREP_OPTIONS --exclude-dir=$pattern"
done
export GREP_OPTIONS
Note that --exclude-dir is a relatively recent addition to the options for GNU grep , but it should only be missing on very legacy GNU/Linux machines by now. If you want to keep your .bashrc file compatible, you could apply a little extra hackery to make sure the option is available before you set it up to be used:
GREP_OPTIONS=
if grep --help | grep -- --exclude-dir &>/dev/null; then
    for pattern in .cvs .git .hg .svn; do
        GREP_OPTIONS="$GREP_OPTIONS --exclude-dir=$pattern"
    done
fi
export GREP_OPTIONS
Similarly, you can ignore single files with --exclude . There's also --exclude-from=FILE if your list of excluded patterns starts getting too long.

Other useful options available in GNU grep that you might wish to add to this environment variable include:

--color -- On appropriate terminal types, highlight the pattern matches in output, among other color changes that make results more readable

-s -- Suppresses error messages about files not existing or being unreadable; helps if you find this behaviour more annoying than useful.

-E, -F, or -P -- Pick a favourite "mode" for grep ; devotees of PCRE may find adding -P for grep 's experimental PCRE support makes grep behave in a much more pleasing way, even though it's described in the manual as being experimental and incomplete

If you don't want to use GREP_OPTIONS , you could instead simply set up an alias :
alias grep='grep --exclude-dir=.git'
You may actually prefer this method as it's essentially functionally equivalent, but if you do it this way, when you want to call grep without your standard set of options, you only have to prepend a backslash to its call:
$ \grep pattern file
Commenter Andy Pearce also points out that using this method can avoid some build problems where GREP_OPTIONS would interfere.
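A third route is a small shell function, which may be worth considering because recent GNU grep releases treat GREP_OPTIONS as obsolescent and may warn about it or ignore it. This is only a sketch: the name vgrep is made up, and the scratch tree exists purely to show the effect.

```shell
# vgrep is a made-up name; plain grep keeps its default behaviour.
vgrep() {
    # "command" bypasses shell functions and aliases named grep.
    command grep --exclude-dir=.cvs --exclude-dir=.git \
                 --exclude-dir=.hg --exclude-dir=.svn "$@"
}

# Demonstration on a scratch tree (paths are illustrative only).
demo=$(mktemp -d)
mkdir -p "$demo/.svn"
echo 'needle' > "$demo/.svn/entries"
echo 'needle' > "$demo/notes.txt"
hits=$(cd "$demo" && vgrep -rl needle .)
echo "$hits"
rm -rf "$demo"
```

Because the function has its own name, scripts and build systems that call grep directly are unaffected.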

Of course, you could solve a lot of these problems simply by using ack, but that's another post.

[Oct 31, 2017] Counting with grep and uniq by Tom Ryder

Feb 18, 2012 | sanctum.geek.nz

A common idiom in Unix is to count the lines of output in a file or pipe with wc -l :
$ wc -l example.txt
43
$ ps -e | wc -l
97
Sometimes you want to count the number of lines of output from a grep call, however. You might do it this way:
$ ps -ef | grep apache | wc -l
6
But grep has built-in counting of its own, with the -c option:
$ ps -ef | grep -c apache
6
The above is more a matter of good style than efficiency, but another tool with a built-in counting option that could save you time is the oft-used uniq . The below example shows a use of uniq to filter a sorted list into unique rows:
$ ps -ef | awk '{print $1}' | sort | uniq
105
daemon
lp
mysql
nagios
postfix
root
snmp
tom
UID
www-data
If it would be useful to know in this case how many processes were being run by each of these users, you can include the -c option for uniq :
$ ps -ef | awk '{print $1}' | sort | uniq -c
    1 105
    1 daemon
    1 lp
    1 mysql
    1 nagios
    2 postfix
    78 root
    1 snmp
    7 tom
    1 UID
    5 www-data
You could even sort this output itself to show the users running the most processes first with sort -rn :
$ ps -ef | awk '{print $1}' | sort | uniq -c | sort -rn
    78 root
    8 tom
    5 www-data
    2 postfix
    1 UID
    1 snmp
    1 nagios
    1 mysql
    1 lp
    1 daemon
    1 105
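The same pipeline can be tried on fixed input, which makes the result reproducible. In this sketch the sample user list is invented and stands in for the first column of ps -ef:

```shell
# Fixed sample input stands in for the first column of ps -ef.
users='root
tom
root
www-data
root
tom'

# sort groups equal lines, uniq -c counts each group, sort -rn ranks them.
# awk only trims the padding uniq puts in front of the counts.
ranked=$(printf '%s\n' "$users" | sort | uniq -c | sort -rn | awk '{print $1, $2}')
echo "$ranked"
```

The initial sort matters: uniq only collapses adjacent duplicate lines, so unsorted input would produce several separate counts for the same user.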
Incidentally, if you're not counting results and really do just want a list of unique users, you can leave out the uniq and just add the -u flag to sort :
$ ps -ef | awk '{print $1}' | sort -u
105
daemon
lp
mysql
nagios
postfix
root
snmp
tom
UID
www-data
The above means I actually find myself using uniq with no options quite seldom.
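One caveat about grep -c from earlier in this post: it counts matching lines, not individual matches. When a line can contain the pattern more than once, the two numbers differ. A small sketch, assuming a grep that supports -o (GNU and BSD grep do); the sample text is invented:

```shell
sample='apache apache
nginx
apache'

# -c reports how many lines matched at least once.
lines=$(printf '%s\n' "$sample" | grep -c apache)

# -o prints every match on its own line, so wc -l counts occurrences.
matches=$(printf '%s\n' "$sample" | grep -o apache | wc -l)

echo "matching lines: $lines"
echo "total matches: $((matches))"
```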

[Jul 30, 2011] pcregrep(1) grep with Perl-compatible regex - Linux man page

pcregrep searches files for character patterns, in the same way as other grep commands do, but it uses the PCRE regular expression library to support patterns that are compatible with the regular expressions of Perl 5. See pcrepattern for a full description of syntax and semantics of the regular expressions that PCRE supports

[Aug 4, 2009] Tech Tip View Config Files Without Comments Linux Journal

I've been using this grep invocation for years to trim comments out of config files. Comments are great but can get in your way if you just want to see the currently running configuration. I've found files hundreds of lines long which had fewer than ten active configuration lines, it's really hard to get an overview of what's going on when you have to wade through hundreds of lines of comments.
$ grep '^[^#]' /etc/ntp.conf
The regex ^[^#] matches the first character of any line, as long as that character is not a #. Because blank lines don't have a first character they're not matched either, resulting in a nice compact output of just the active configuration lines. The quotes keep the shell from interpreting the brackets before grep sees them.
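This pattern still prints lines whose comment is indented, and lines that contain only whitespace, because their first character is a space or tab rather than #. A variant sketch, assuming a grep with -E; the sample file contents are invented:

```shell
conf=$(mktemp)
printf '%s\n' '# top-level comment' '' 'server 0.pool.example' \
    '   # indented comment' '   ' 'driftfile /var/lib/ntp/drift' > "$conf"

# -v inverts the match: drop lines that are blank or whose first
# non-blank character is '#'.
active=$(grep -Ev '^[[:space:]]*(#|$)' "$conf")
echo "$active"
rm -f "$conf"
```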

[Mar 18, 2009] UNIX BASH scripting Highlight match with color in grep command

You can change this color by setting the GREP_COLOR environment variable to different combinations (from the color code list given below).
I use
$ export GREP_COLOR='1;30;43'
which basically highlights the matched pattern with foreground color black and background color yellow.

The set display attributes list:
0 Reset all attributes
1 Bright
2 Dim
4 Underscore
5 Blink
7 Reverse
8 Hidden
Foreground Colours
30 Black
31 Red
32 Green
33 Yellow
34 Blue
35 Magenta
36 Cyan
37 White
Background Colours
40 Black
41 Red
42 Green
43 Yellow
44 Blue
45 Magenta
46 Cyan
47 White
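Newer GNU grep versions prefer the GREP_COLORS variable (plural), in which the mt= capability takes the same attribute and colour codes as the lists above. Whether your grep supports it is an assumption here, so check the man page. A small sketch that forces colour so the escape sequence can be inspected:

```shell
# mt= sets the colour for matched text: bright, black on yellow.
colored=$(printf 'hello world\n' | GREP_COLORS='mt=1;30;43' grep --color=always world)

# Dump the raw bytes; look for ESC [ 1 ; 3 0 ; 4 3 m before the match.
printf '%s\n' "$colored" | od -c | head -n 3
```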

[Sep 11, 2008] glark by Jeff Pace

Ruby based

glark offers grep-like searching of text files, with very powerful, complex regular expressions (e.g., "/foo\w+/ and /bar[^\d]*baz$/ within 4 lines of each other"). It also highlights the matches, displays context (preceding and succeeding lines), does case-insensitive matches, and automatic exclusion of non-text files. It supports most options from the GNU version of grep.

[May 06, 2008] ack! - Perl-based grep replacement

There are some tools that look like you will never replace them. One of those (for me) is grep. It does what it does very well (remarks about the shortcomings of regexen in general aside). It works reasonably well with Unicode/UTF-8 (a great opportunity to Fail Miserably for any tool, viz. a2ps).
Yet, the other day I read about ack, which claims to be "better than grep, a search tool for programmers". Woo. Better than grep? In what way?
The ack homepage lists the top ten reasons why one should use it instead of grep. Actually, it's thirteen reasons but then some are dupes. So I'd say "about ten reasons". Let's look at them in order.
It's blazingly fast because it only searches the stuff you want searched.
Wait, how does it know what I want? A DWIM-Interface at last? Not quite. First off, ack is faster than grep for simple searches. Here's an example:
$ time ack 1Jsztn-000647-SL exim_main.log >/dev/null
real    0m3.463s
user    0m3.280s
sys     0m0.180s
$ time grep -F 1Jsztn-000647-SL exim_main.log >/dev/null
real    0m14.957s
user    0m14.770s
sys     0m0.160s
Two notes: first, yes, the file was in the page cache before I ran ack; second, I even made it easy for grep by telling it explicitly I was looking for a fixed string (not that it helped much, the same command without -F was faster by about 0.1s). Oh and for completeness, the exim logfile I searched has about two million lines and is 250M. I've run those tests ten times for each, the times shown above are typical.
So yes, for simple searches, ack is faster than grep. Let's try with a more complicated pattern, then. This time, let's use the pattern (klausman|gentoo) on the same file. Note that we have to use -E for grep to use extended regexen, which ack in turn does not need, since it (almost) always uses them. Here, grep takes its sweet time: 3:56, nearly four minutes. In contrast, ack accomplished the same task in 49 seconds (all times averaged over ten runs, then rounded to integer seconds).
As for the "being clever" side of speed, see below, points 5 and 6
ack is pure Perl, so it runs on Windows just fine.
This isn't relevant to me, since I don't use Windows for anything where I might need grep. That said, it might be a killer feature for others.
The standalone version uses no non-standard modules, so you can put it in your ~/bin without fear.
Ok, this is not so much a feature as a hard criterion. If I needed extra modules for the whole thing to run, that'd be a deal breaker. I already have tons of libraries, I don't need more undergrowth around my dependency tree.
Searches recursively through directories by default, while ignoring .svn, CVS and other VCS directories.
This is a feature, yet one that wouldn't pry me away from grep: -r is there (though it distinctly feels like an afterthought). Since ack ignores a certain set of files and directories, its recursive capabilities were there from the start, making it feel more seamless.
ack ignores most of the crap you don't want to search
To be precise:

VCS directories

blib, the Perl build directory

backup files like foo~ and #foo#

binary files, core dumps, etc.

Most of the time, I don't want to search those (and have to exclude them with grep -v from find results). Of course, this ignore-mode can be switched off with ack (-u). All that said, it sure makes command lines shorter (and easier to read and construct). Also, this is the first spot where ack's Perl-centricism shows. I don't mind, even though I prefer that other language with P.
Ignoring .svn directories means that ack is faster than grep for searching through trees.
Dupe. See Point 5
Lets you specify file types to search, as in --perl or --nohtml.
While at first glance, this may seem limited, ack comes with a plethora of definitions (45 if I counted correctly), so it's not as perl-centric as it may seem from the example. This feature saves command-line space (if there's such a thing), since it avoids wild find-constructs. The docs mention that --perl also checks the shebang line of files that don't have a suffix, but make no mention of the other "shipped" file type recognizers doing so.
File-filtering capabilities usable without searching with ack -f. This lets you create lists of files of a given type.
This mostly is a consequence of the feature above. Even if it weren't there, you could simply search for "."
Color highlighting of search results.
While I've looked upon color in shells as kinda childish for a while, I wouldn't want to miss syntax highlighting in vim, colors for ls (if they're not as sucky as the defaults we had for years) or match highlighting for grep. It's really neat to see that yes, the pattern you grepped for indeed matches what you think it does. Especially during evolutionary construction of command lines and shell scripts.
Uses real Perl regular expressions, not a GNU subset
Again, this doesn't bother me much. I use egrep/grep -E all the time, anyway. And I'm no Perl programmer, so I don't get withdrawal symptoms every time I use another regex engine.
Allows you to specify output using Perl's special variables
This sounds neat, yet I don't really have a use case for it. Also, my perl-fu is weak, so I probably won't use it anyway. Still, might be a killer feature for you.
The docs have an example:
ack '(Mr|Mr?s)\. (Smith|Jones)' --output='$&'
Many command-line switches are the same as in GNU grep:
Specifically mentioned are -w, -c and -l. It's always nice if you don't have to look up all the flags every time.
Command name is 25% fewer characters to type! Save days of free-time! Heck, it's 50% shorter compared to grep -r
Okay, now we have proof that not only the ack webmaster can't count, he's also making up reasons for fun. Works for me.
Bottom line: yes, ack is an exciting new tool which partly replaces grep. That said, a drop-in replacement it ain't. While the standalone version of ack needs nothing but a perl interpreter and its standard modules, for embedded systems that may not work out (vs. the binary with no deps beside a libc). This might also be an issue if you need grep early on during boot and /usr (where your perl resides) isn't mounted yet. Also, default behaviour is divergent enough that it might yield nasty surprises if you just drop in ack instead of grep. Still, I recommend giving ack a try if you ever use grep on the command line. If you're a coder who often needs to search through working copies/checkouts, even more so.
Update
I've written a followup on this, including some tips for day-to-day usage (and an explanation of grep's sucky performance).
Comments
René "Necoro" Neumann writes (in German, translation by me):

Stumbled across your blog entry about "ack" today. I tried it and found it to be cool :). So I created two ebuilds for it:

sys-apps/ack

dev-perl/File-Next

Just wanted to let you know (there is no comment function on your blog).

[May 31, 2006] Linux.com GNU grep's new features By: Michael Stutz

It looks like GNU grep has become overloaded with features (a "christmas tree"). In many complex cases a custom Perl script can compete with grep.

If you haven't been paying attention to GNU grep recently, you should be happily surprised by some of the new features and options that have come about with the 2.5 series. They bring it functionality you can't get anywhere else -- including the ability to output only matched patterns (not lines), color output, and new file and directory options.

Granted, the addition of this feature set caused a number of bugs that made it necessary to rewrite part of the code, but the latest 2.5.1a bugfix release is eminently usable.
One highlight of the new version is its ability to output only matched patterns. This is one of the most exciting features, because it adds completely new functionality to the tool. Remember, "grep" is an acronym -- it got its name from a function in the old Unix ed utility, global / regular expression / print -- and its purpose was to output lines from its input that match a given regular expression.
It remains such, but the new -o option (or --only-matching) specifies that only the matched patterns themselves are to be output, and not the entire lines they come on. If more than one match is found on a single line, those matches are output on lines of their own.
With this new option, suddenly GNU grep is transformed from a utility that outputs lines into a tool for harvesting patterns. You can use it to harvest data from input files, such as pulling out referrers from your server logs, or URLs from a file:
egrep -o '(((http(s)?|ftp|telnet|news|gopher)://|mailto:)[^[:space:]]+)' logfile

Or grab email addresses from a file:
egrep -o '\<[^@/:[:space:]]+\>@[a-zA-Z_\.]+?\.[a-zA-Z]{2,3}' somefile

Use it to pull out all the senders from an email archive and sort into a file of unique addresses:
grep '^From: ' huge-mail-archive | egrep -o '\<[^@/:[:space:]]+\>@[a-zA-Z_\.]+?\.[a-zA-Z]{2,3}' | sort | uniq > email.addresses

New uses for this feature keep popping up. You can use it, for instance, as a tool for testing regular expressions. Say you've whipped up a complicated regexp to do some task. You think it's the world's greatest regexp, it's going to do everything short of solving all the world's problems -- but at runtime, it doesn't seem to go as planned.
Next time this happens, use the -o option when you're in the design stage, and have grep read from the standard input, where you can feed it test data -- you'll see right away whether or not it matches exactly what you think it does. Since grep will be tossing back to you not the matched lines but the actual matches to the expression, it'll give you a pretty good clue how to fix it.
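As a small illustration of that workflow, the sketch below feeds two invented test lines to grep on standard input and prints only what the expression matched. The URL pattern is a simplified stand-in, not the one from the article:

```shell
# Feed test data on standard input and print only what the regex matched.
urls=$(printf '%s\n' \
    'see http://a.example/x and https://b.example/y today' \
    'no links on this line' |
    grep -Eo 'https?://[^[:space:]]+')
echo "$urls"
```

Each match appears on its own line, so a pattern that grabs too much or too little is visible at once.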
Output matches in color

Use the --color option to display matches in the input in color (red, by default). Color is added via ANSI escape sequences, which don't work in all displays, but grep is smart enough to detect this and won't use color (even if specified) if you're sending the output down a pipeline. Otherwise, if you piped the output to (say) less, the ANSI escape sequences would send garbage to the screen. If, on the other hand, that's really what you want to do, there's a workaround: use --color=always to force it, and call less with the -R flag (which prints all raw control characters). That way, the color codes will escape correctly and you'll page through screens of text with your matched patterns in full color:

grep --color=always "regexp" myfile | less -R

The GREP_COLOR environment variable controls which color is used. To change the color from red to something else, set GREP_COLOR to a numeric value according to this chart:
30	black
31	red
32	green
33	yellow
34	blue
35	purple
36	cyan
37	white
For example, to have matches highlighted in a shade of green:
GREP_COLOR=32; export GREP_COLOR; grep pattern myfile

Use Perl regexps

One of the biggest developments in regular expressions to occur in the last few decades has been the Perl programming language, with its own regular expression dialect. GNU grep now takes Perl-style regexps with the -P option. (It's not always compiled in by default, so if you get an error message of "grep: The -P option is not supported" when you try to use it, you'll have to get the sources and recompile.)
To search for a bell character (Ctrl-g), you can now use:
grep -P '\cG' myfile

This is considered a "major variant" of grep, as with the -E and -F options (which are the egrep and fgrep tools, respectively), but it doesn't yet come with an associated program name -- perhaps new versions will have a prep binary (it sounds much better than pgrep) that will mean the same thing as using -P.

Dealing with input

A number of new features have to do with files and input. The new --label option lets you specify a text "label" to standard input. Where it's really useful is when you're grepping a lot of files at once, plus standard input, and you're making use of the labels that grep prefixes its matches with. Normally, standard input would be the only one with a label you couldn't control -- it's always prefixed with "(standard input)" as its label. Now, it can be prefixed with whatever argument you give the --label option.
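A minimal sketch of --label, assuming GNU grep; the label text syslog-pipe and the input lines are invented:

```shell
# -H forces the file-name prefix even for a single input;
# --label replaces the default "(standard input)" text.
labelled=$(printf 'error: disk full\nok\n' | grep -H --label=syslog-pipe error)
echo "$labelled"
```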

grep changes quick reference

-Cx prints context lines before and after matches and must have argument x.
--color outputs matches in color (default red).
-D action specifies an action to take on device files (the default is "read").
--exclude= filespec excludes files matching filespec.
--include= filespec only searches through files matching filespec.
--label= name makes name the new label for stdin.
--line-buffered turns on line buffering.
-m X stops searching input after finding X matched lines.
-o outputs only matched patterns, not entire lines.
-P uses Perl-style regular expressions.
When searching through multiple files, you can control which files to search for with the --include and --exclude options. For example, to search for "linux" only in files with .txt extensions in the /usr/local/src directory tree, use:
grep -r --include='*.txt' linux /usr/local/src
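The effect of --include can be checked on a scratch tree. This is a sketch assuming GNU grep, with invented file names; quoting the glob is the safe habit, since it keeps the shell from touching it (some shells report an error for an unmatched glob):

```shell
demo=$(mktemp -d)
mkdir -p "$demo/docs"
echo 'linux kernel notes' > "$demo/docs/readme.txt"
echo 'linux build log' > "$demo/docs/build.log"

# Quote the glob so the shell passes it to grep unexpanded.
txt_hits=$(grep -rl --include='*.txt' linux "$demo")
echo "$txt_hits"
rm -rf "$demo"
```

Only the .txt file is listed, although both files contain the word.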

When you're recursively searching directories of files, you'll get errors when grep comes across a device file. With the new --devices option, you can specify what you want it to do on these files, by giving it an optional action. The default action is "read," which means to just read the file as any other file. But you can also specify "skip," which will skip the file entirely. Those are currently the only two methods for handling devices.
To search for "linux" in all files on the system, excluding special device files, use:
grep -r --devices=skip linux /

Finally, the --line-buffered option turns on line buffering, and -m (or --max-count) gives the maximum number of matched lines to show, after which grep will stop searching the given input. For example, this command searches a huge file for pattern with line buffering, exiting after at most 10 matched lines occur:
grep --line-buffered -m 10 pattern huge.file
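The cut-off is easy to see on a few lines of invented input:

```shell
# Stop reading after the first two matching lines.
first_two=$(printf 'match 1\nskip\nmatch 2\nmatch 3\n' | grep -m 2 match)
echo "$first_two"
```

The third matching line is never printed, because grep stops reading its input once the limit is reached.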

POSIX updates

Some of the other updates were made so that GNU grep conforms to POSIX.2, including subtle changes in exit status.
One of these changes is that the interpretation of character classes is now locale-dependent. That means that ranges specified in bracketed expressions like [A-Z] don't mean the same thing everywhere. If the system's current locale environment calls for its own characters or sorting, these settings will override any default character range.
Another related update is a change to the old -C option, which outputs a specified number of lines of context before and after matched lines. In the past, when you used -C without an argument, grep would output two lines of before-and-after context, but now you have to give an argument; if you don't, grep will report an error and exit. That's something to look out for if you've got any old shell scripts or routines sitting around that call grep.
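Both points can be sketched on invented input. Forcing the C locale is one common way to make a range such as [A-Z] mean exactly the ASCII capitals, and -C is given an explicit count:

```shell
# Force the C locale so [A-Z] means exactly the 26 ASCII capitals.
upper=$(printf 'abc\nABC\n' | LC_ALL=C grep '[A-Z]')
echo "$upper"

# -C needs an explicit count: one line of context on each side.
context=$(printf '1\n2\nhit\n4\n5\n' | grep -C 1 hit)
echo "$context"
```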

[zazzybob.com] -usr-share-doc-tips

GNU grep comes with a recursive option (-r,-R) that allows you to recursively grep for a pattern through all files and any subdirectories.

But what happens if you aren't using GNU grep? You can use find to assist...
find /path/to/files -exec grep "pattern" {} \;
You can, of course, provide your usual options to grep, e.g.
find /path/to/files -exec grep -li "pattern" {} \;
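Two refinements are worth knowing, shown here as a sketch on an invented scratch tree: -type f keeps grep away from directories, and ending -exec with + instead of \; passes many file names to each grep run.

```shell
demo=$(mktemp -d)
mkdir -p "$demo/sub"
echo 'pattern here' > "$demo/sub/a.txt"
echo 'nothing' > "$demo/b.txt"

# -type f skips directories; "{} +" hands many files to each grep run
# instead of starting one grep per file as "{} \;" does.
matches=$(find "$demo" -type f -exec grep -l 'pattern' {} +)
echo "$matches"
rm -rf "$demo"
```

With \; a grep that sees only one file at a time does not prefix matching lines with the file name unless you ask for it; -l, as used here, sidesteps that by printing only names.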

pcregrep-4.5-1.i386 RPM

pcregrep searches files for character patterns, in the same way as other grep commands do, but it uses the PCRE regular expression library to support patterns that are compatible with the regular expressions of Perl 5. See pcre(3) for a full description of syntax and semantics.

If no files are specified, pcregrep reads the standard input. By default, each line that matches the pattern is copied to the standard output, and if there is more than one file, the file name is printed before each line of output. However, there are options that can change how pcregrep behaves.

Lines are limited to BUFSIZ characters. BUFSIZ is defined in <stdio.h>. The newline character is removed from the end of each line before it is matched against the pattern.

Re Replacing GNU grep revisited

Subject: Re: Replacing GNU grep revisited

From: "James P. Howard II"

Date: Mon, 23 Jun 2003 10:21:53 -0400 (EDT)

Chris Costello said:
> On Sunday, June 22, 2003, Sean Farley wrote:
>> Reasons to consider for switching:
>> 1. GNU's grep -r option "is broken" according to the following post.
>>    The only thing I have noticed is that FreeGrep has more options for
>> controlling how symbolic links are traversed.
>>       http://groups.google.com/groups?hl=en&lr=lang_en&ie=UTF-8&selm=xzp7kchblor.fsf_flood.ping.uio.no%40ns.sol.net
>
>    A workaround for this problem in the meantime would be to use
>
>      find <directory> -type f | xargs grep EXPR
>
>    Just FYI.

Rumors of my demise are greatly exaggerated.  And to call myself busy any
more is an understatement.

But yes, I got an email from Ted Unangst telling me about the OpenBSD move
to FreeGrep and this pleases me greatly.  I have been glancing over their
CVS tree (via the web) and they have made a number of changes to fix the
bugs being discussed here.  Aside from a handful of errors (which are
presumably correctable), the speed is still an issue.

It is horribly slow when compared to the GNU version.  FreeBSD will see
better times than OpenBSD due to some changes made to the regex code a few
years ago which I adapted from the 4.4BSD-Lite2 code for grep, but it
still lags behind GNU in performance.

Jamie

Copyright © 1996-2021 by Softpanorama Society. www.softpanorama.org was initially created as a service to the (now defunct) UN Sustainable Development Networking Programme (SDNP) without any remuneration. This document is an industrial compilation designed and created exclusively for educational use and is distributed under the Softpanorama Content License. Original materials copyright belong to respective owners. Quotes are made for educational purposes only in compliance with the fair use doctrine.

FAIR USE NOTICE This site contains copyrighted material the use of which has not always been specifically authorized by the copyright owner. We are making such material available to advance understanding of computer science, IT technology, economic, scientific, and social issues. We believe this constitutes a 'fair use' of any such copyrighted material as provided by section 107 of the US Copyright Law according to which such material can be distributed without profit exclusively for research and educational purposes.

This is a Spartan WHYFF (We Help You For Free) site written by people for whom English is not a native language. Grammar and spelling errors should be expected. The site contains some broken links as it develops like a living tree...

Disclaimer:

The statements, views and opinions presented on this web page are those of the author (or referenced source) and are not endorsed by, nor do they necessarily reflect, the opinions of the Softpanorama society. We do not warrant the correctness of the information provided or its fitness for any purpose. The site uses AdSense so you need to be aware of Google privacy policy. If you do not want to be tracked by Google please disable Javascript for this site. This site is perfectly usable without Javascript.

Last modified: February 19, 2020
