Unix Shell Tools


The Unix shell tools awk, sed, and egrep are used for text processing, and each contains its own regular expression engine.

Version: GNU awk: 3.1, GNU sed: 3.02, GNU egrep: 2.4.2

Engine Type: GNU awk: DFA, GNU sed: NFA, GNU egrep: DFA

Web Site: http://www.pcre.org

Supported Metacharacters

The following tables list the supported metacharacters:

Special Characters

Sequence        Tool             Description
\a              awk, sed         Alarm (beep)
\b              awk              Backspace; supported only in a character class
\f              awk, sed         Form feed
\n              awk, sed         Newline (line feed)
\r              awk, sed         Carriage return
\t              awk, sed         Horizontal tab
\v              awk, sed         Vertical tab
\ooctal         sed              A character specified by a one-, two-, or three-digit octal code
\octal          awk              A character specified by a one-, two-, or three-digit octal code
\xhex           awk, sed         A character specified by a two-digit hexadecimal code
\ddecimal       awk, sed         A character specified by a one-, two-, or three-digit decimal code
\cchar          awk, sed         A named control character (e.g., \cC is Control-C)
\metacharacter  awk, sed, egrep  Escape the metacharacter so that it literally represents itself
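As a small illustration of the last row, the sketch below contrasts an unescaped dot with an escaped one. It uses grep -E, which is equivalent to egrep; the sample strings and variable names are assumptions chosen for the demonstration.

```shell
# Unescaped '.' matches any single character, so both input lines match.
any=$(printf 'a.b\naxb\n' | grep -E -c 'a.b')

# Escaped '\.' matches only a literal dot, so one line matches.
literal=$(printf 'a.b\naxb\n' | grep -E -c 'a\.b')

echo "$any $literal"
```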

Character Classes

Class        Tool             Description
[...]        awk, sed, egrep  Matches any single character listed or contained within a listed range
[^...]       awk, sed, egrep  Matches any single character that is not listed or contained within a listed range
.            awk, sed, egrep  Matches any single character, except newline
\w           egrep, sed       Matches an ASCII word character, [a-zA-Z0-9_]
\W           egrep, sed       Matches a character that is not an ASCII word character, [^a-zA-Z0-9_]
[[:prop:]]   awk, sed         Matches any character in the POSIX character class
[^[:prop:]]  awk, sed         Matches any character not in the POSIX character class
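A POSIX class name such as [:digit:] is only meaningful inside a bracket expression. The following sketch shows both the plain and the negated form with sed; the input string and variable names are assumptions for illustration.

```shell
# [[:digit:]] is the POSIX class [:digit:] placed inside a bracket expression.
masked=$(echo 'abc123' | sed 's/[[:digit:]]/#/g')

# Negated form: delete every character that is not a digit.
digits_only=$(echo 'abc123' | sed 's/[^[:digit:]]//g')

echo "$masked $digits_only"
```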

Anchors and other zero-width tests

Sequence  Tool             Description
^         awk, sed, egrep  Matches only the start of the string, even if newlines are embedded
$         awk, sed, egrep  Matches only the end of the search string, even if newlines are embedded
\<        egrep            Matches at the beginning of a word
\>        egrep            Matches at the end of a word
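To see how line anchors differ from word boundaries, consider this sketch. It assumes GNU grep invoked as grep -E (equivalent to egrep); the three sample lines are made up for the demonstration.

```shell
# '^' and '$' anchor the match to the start and end of each line,
# so only the line that is exactly "cat" is counted.
exact=$(printf 'cat\ncat food\nbobcat\n' | grep -E -c '^cat$')

# '\<' and '\>' match at word boundaries, so "cat" and "cat food"
# are counted, but "bobcat" is not.
whole_word=$(printf 'cat\ncat food\nbobcat\n' | grep -E -c '\<cat\>')

echo "$exact $whole_word"
```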

Modifiers

Modifier                    Tool   Description
flag: i or I                sed    Case-insensitive matching for ASCII characters
command-line option: -i     egrep  Case-insensitive matching for ASCII characters
set IGNORECASE to non-zero  awk    Case-insensitive matching for Unicode characters
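The sed and egrep modifiers can be tried as follows. This is a sketch that assumes GNU sed (the I flag is a GNU extension) and grep -E in place of egrep; the awk IGNORECASE variable is specific to GNU awk and is left out so the example does not depend on which awk is installed.

```shell
# sed: the I flag makes this one substitution ignore case.
sed_out=$(echo 'Hello HELLO hello' | sed 's/hello/bye/Ig')

# grep -E: the -i option applies to the whole pattern.
grep_count=$(printf 'Error\nerror\nok\n' | grep -E -i -c 'error')

echo "$sed_out | $grep_count"
```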

Grouping, capturing, conditional, and control

Sequence     Tool             Description
(PATTERN)    awk              Grouping
\(PATTERN\)  sed              Group and capture submatches, filling \1, \2, ..., \9
\n           sed              Contains the nth earlier submatch
...|...      egrep, awk, sed  Alternation; match one or the other

Greedy quantifiers

Sequence  Tool             Description
*         awk, sed, egrep  Match 0 or more times
+         awk, sed, egrep  Match 1 or more times
?         awk, sed, egrep  Match 1 or 0 times
\{n\}     sed, egrep       Match exactly n times
\{n,\}    sed, egrep       Match at least n times
\{x,y\}   sed, egrep       Match at least x times, but no more than y times
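A few of these sequences in action, as a sketch with made-up input. The sed commands use the backslashed group and interval forms from the tables; the alternation example uses grep -E (equivalent to egrep).

```shell
# \(...\) captures a word; \1 then requires the same word to repeat.
dedup=$(echo 'hello hello world' | sed 's/\([a-z]*\) \1/\1/')

# \{3\} matches exactly three digits, so only the first three are replaced.
interval=$(echo 'ab12345cd' | sed 's/[0-9]\{3\}/#/')

# Alternation: count the lines that contain either word.
either=$(printf 'red\ngreen\nblue\n' | grep -E -c 'red|blue')

echo "$dedup / $interval / $either"
```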

egrep

egrep [options] pattern files

egrep searches files for occurrences of pattern and prints out each matching line.

Example

    $ echo 'Spiderman Menaces City!' > dailybugle.txt
    $ egrep -i 'spider[- ]?man' dailybugle.txt
    Spiderman Menaces City!

sed

sed '[address1][,address2]s/pattern/replacement/[flags]' files
sed -f script files

By default, sed applies the substitution to every line in files. Each address can be either a line number or a regular expression pattern. A regular expression address must be enclosed in forward slash delimiters (/.../). If only address1 is supplied, the substitution applies to that line number or to every line matching the pattern. If address2 is also supplied, the substitution begins at the line indicated or matched by address1 and continues through the line indicated or matched by address2, or to the end of the file if address2 is never reached.
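Both kinds of address can be sketched like this; the input lines and variable names are assumptions chosen to keep the output short.

```shell
# Line-number range: substitute only on lines 2 through 3.
by_number=$(printf 'a\nb\nc\nd\n' | sed '2,3s/^/> /')

# Pattern range: from the first line matching /BEGIN/ through the
# next line matching /END/; lines outside the range are left alone.
by_pattern=$(printf 'x\nBEGIN\nx\nEND\nx\n' | sed '/BEGIN/,/END/s/x/y/')

printf '%s\n---\n%s\n' "$by_number" "$by_pattern"
```

Only the "x" between BEGIN and END becomes "y"; the first and last lines fall outside the range.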

Two special sequences, & and \n, are interpreted in replacement based on the results of the match. The sequence & is replaced with the entire text matched by pattern. The sequence \n, where n is a digit from 1 to 9, is replaced with the text matched by the nth capture group in the current match.
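A minimal sketch of both replacement sequences, using sample strings that are assumptions for illustration:

```shell
# & stands for the whole match.
wrapped=$(echo 'price 42' | sed 's/[0-9][0-9]*/<&>/')

# \1 and \2 refer to the first and second capture groups.
swapped=$(echo 'key=value' | sed 's/\(.*\)=\(.*\)/\2=\1/')

echo "$wrapped / $swapped"
```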

The available flags are:

n

Substitute the nth match in a line, where n is between 1 and 512.

g

Substitute all occurrences of pattern in a line.

p

Print lines with successful substitutions.

w file

Write lines with successful substitutions to file.
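The flags above can be compared side by side. This sketch assumes the standard -n option (which suppresses sed's default output) and uses mktemp to create a scratch file for the w flag; the inputs and variable names are illustrative.

```shell
# A numeric flag replaces only the nth match on each line.
second=$(echo 'a-a-a' | sed 's/a/X/2')

# g replaces every match on each line.
all=$(echo 'a-a-a' | sed 's/a/X/g')

# p prints lines where a substitution happened; -n hides everything else.
printed=$(printf 'foo\nbar\n' | sed -n 's/foo/baz/p')

# w writes the changed lines to a file (a temporary file here).
tmp=$(mktemp)
printf 'foo\nbar\n' | sed -n "s/foo/baz/w $tmp"
written=$(cat "$tmp")
rm -f "$tmp"

echo "$second $all $printed $written"
```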

Example

Change date formats from MM/DD/YYYY to DD.MM.YYYY.

    $ echo '12/30/1969' |
      sed 's!\([0-9][0-9]\)/\([0-9][0-9]\)/\([0-9]\{2,4\}\)!\2.\1.\3!g'
    30.12.1969

awk

awk 'instructions' files
awk -f script files

The awk script contained in either instructions or script should be a series of /pattern/ {action} pairs. The action code is applied to each line matched by pattern. awk also supplies several functions for pattern matching.
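As a sketch of a /pattern/ {action} pair, the command below sums the second field of every line that starts with "ap". The END block, which runs after the last input line, is standard awk but is not described in this reference; the input data is made up.

```shell
# The action runs only for lines matching /^ap/; END runs once at the end.
total=$(printf 'apple 3\nbanana 5\napricot 7\n' |
  awk '/^ap/ { sum += $2 } END { print sum }')

echo "$total"
```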

Functions

match(text, pattern)

If pattern matches in text, returns the position in text where the match starts. A failed match returns zero. A successful match also sets the variable RSTART to the position where the match started and the variable RLENGTH to the number of characters in the match.
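For example, match can be combined with substr to extract the matched text. This is a sketch with an assumed input string; substr is a standard awk function not covered in this reference.

```shell
# On success, RSTART and RLENGTH describe where the match sits in $0.
found=$(echo 'foo=12345' |
  awk '{ if (match($0, /[0-9]+/)) print RSTART, RLENGTH, substr($0, RSTART, RLENGTH) }')

# A failed match returns zero.
missing=$(echo 'abc' | awk '{ print match($0, /[0-9]+/) }')

echo "$found / $missing"
```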

gsub(pattern, replacement, text)

Substitutes each match of pattern in text with replacement and returns the number of substitutions. If text is not supplied, it defaults to $0.

sub(pattern, replacement, text)

Substitutes the first match of pattern in text with replacement. A successful substitution returns 1, and an unsuccessful substitution returns 0. If text is not supplied, it defaults to $0.
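The difference between the two functions shows up in both the return value and the modified record. A minimal sketch, with an assumed input line and with text omitted so that both functions operate on $0:

```shell
# gsub replaces every match and returns how many replacements it made.
gsub_out=$(echo 'a b a b' | awk '{ n = gsub(/a/, "X"); print n, $0 }')

# sub replaces only the first match and returns 1 on success.
sub_out=$(echo 'a b a b' | awk '{ n = sub(/a/, "X"); print n, $0 }')

echo "$gsub_out / $sub_out"
```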

Example

Create an awk file and then run it from the command line.

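    $ cat sub.awk
    {
        # In the replacement, an unescaped & stands for the matched text.
        gsub(/https?:\/\/[a-z_.\\w\/\\#~:?+=&;%@!-]*/,
             "<a href=\"&\">&</a>");
        print
    }

    $ echo "Check the website, http://www.oreilly.com/catalog/repr" | awk -f sub.awk
    Check the website, <a href="http://www.oreilly.com/catalog/repr">http://www.oreilly.com/catalog/repr</a>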

