Getting Started

This section covers the very basics of understanding, creating, and using regular expressions.

Simple String Matching

The simplest regular expression is a string of literal characters. For a match, those characters must appear in the target string consecutively and in the same order as in the regular expression.

The following table shows some simple string regular expressions and the matches for a target string:

Regex    Matches    Target String
a        a          "Maui no ka oi"
abc      abc        "abcdef"
Maui     Maui       "Maui no ka oi"
ka oi    ka oi      "Maui no ka oi"
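
As a rough illustration of the table above, the following sketch uses Python's standard re module (the choice of Python is an assumption; the text itself is engine-neutral). It searches the same target string for a few literal patterns and shows that characters in a different order do not match.

```python
import re

target = "Maui no ka oi"

# re.search scans the target and returns the first place the literal text occurs.
first_a = re.search(r"a", target)
maui = re.search(r"Maui", target)
ka_oi = re.search(r"ka oi", target)

# The characters must be adjacent and in the same order, so "oi ka" does not match.
reversed_order = re.search(r"oi ka", target)

print(first_a.group(), first_a.start())  # a 1
print(maui.group())                      # Maui
print(ka_oi.span())                      # (8, 13)
print(reversed_order)                    # None
```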

Note: Not all characters can be used "as is" in a match. Some characters, called metacharacters, have special meanings in regular expressions. To match one of them literally, precede it with a backslash. The metacharacters are:

{}[]()^$.|*+?\
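
A minimal sketch of why metacharacters need care, again assuming Python's re module. The sample text and variable names are made up for illustration. It contrasts an unescaped dot with an escaped one and uses re.escape to escape a whole literal string.

```python
import re

text = "Total: $3.50 (approx.)"

# Unescaped, "." matches any character, so this pattern also accepts "3x50".
loose = re.compile(r"3.50")
# A backslash in front of the dot makes it match a literal "." only.
strict = re.compile(r"3\.50")

print(bool(loose.search("3x50")))   # True
print(bool(strict.search("3x50")))  # False
print(strict.search(text).group())  # 3.50

# re.escape adds the backslashes for an arbitrary literal string.
literal = re.escape("$3.50 (approx.)")
print(re.search(literal, text).group())  # $3.50 (approx.)
```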

Simple string matching works well, but the regular expression needs to be pretty specific for each target string. The next section will address this.

Using Character Classes

A character class allows a set of possible characters, rather than just a single character, to match at a particular point in a regex.

The following table shows some regular expressions with character classes and the matches for a target string:

Regex       Matches             Target String
[a-z]       a, b, c             "abcABC"
[a-zA-Z]    a, b, c, A, B, C    "abcABC"
[^0-9]      a, b, c, A, B, C    "abcABC"
[ch]at      cat, hat            "The cat and the hat"
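
The character classes above can be tried out with a short sketch, assuming Python's re module. re.findall returns every non-overlapping match, which corresponds to the comma-separated matches in the table. The extra "a1b2" target is an illustrative addition.

```python
import re

# A range inside brackets matches any single character in that range.
lower = re.findall(r"[a-z]", "abcABC")
letters = re.findall(r"[a-zA-Z]", "abcABC")

# A leading ^ inside the brackets negates the class: any character except a digit.
non_digits = re.findall(r"[^0-9]", "abcABC")
mixed = re.findall(r"[^0-9]", "a1b2")

# A class stands for exactly one character within a longer pattern.
words = re.findall(r"[ch]at", "The cat and the hat")

print(lower)       # ['a', 'b', 'c']
print(letters)     # ['a', 'b', 'c', 'A', 'B', 'C']
print(non_digits)  # ['a', 'b', 'c', 'A', 'B', 'C']
print(mixed)       # ['a', 'b']
print(words)       # ['cat', 'hat']
```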

Matching This or That (Alternation)

The vertical bar '|' metacharacter matches any one of several alternative strings. To match "cat" or "hat", the regular expression "cat|hat" can be used. At each character position, the regular expression engine first tries to match "cat". If "cat" doesn't match, the engine tries the next alternative, "hat". If "hat" doesn't match either, the match fails at that position and the engine moves to the next position in the string.

It is important to remember that the regular expression engine will try to match the regex at the earliest possible point in the string. At that point, an engine that works as described above takes the first alternative that succeeds, even if a later alternative would match more text.

The following table shows some regular expressions with alternation and the matches for a target string:

Regex              Matches              Target String
c|co|cow           c                    "cows"
cow|co|c           cow                  "cows"
cow|pig|chicken    cow, pig, chicken    "The farmer raises cows, pigs, and chickens"
pig|chicken|cow    cow, pig, chicken    "The farmer raises cows, pigs, and chickens"
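
The effect of alternative order can be sketched as follows, assuming Python's re module, which tries alternatives from left to right in the way described above. Other engines may choose differently. The variable names are illustrative.

```python
import re

# At a given position the first alternative that succeeds wins,
# so listing the shortest alternative first hides the longer ones.
short_first = re.search(r"c|co|cow", "cows").group()
long_first = re.search(r"cow|co|c", "cows").group()

sentence = "The farmer raises cows, pigs, and chickens"

# When the alternatives cannot match at the same position, their order in the
# pattern does not change the result: matches come back in target-string order.
order_a = re.findall(r"cow|pig|chicken", sentence)
order_b = re.findall(r"pig|chicken|cow", sentence)

print(short_first)  # c
print(long_first)   # cow
print(order_a)      # ['cow', 'pig', 'chicken']
print(order_b)      # ['cow', 'pig', 'chicken']
```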

Grouping and Capturing

In a regular expression, the '(' and ')' characters perform two functions: grouping and capturing.

Grouping

A subpattern within the parentheses is treated as a single unit.

The following table shows some regular expressions using grouping and the matches for a target string:

Regex            Matches    Target String
car(toon|pet)    carpet     "There is a spot on the carpet."
car(toon|pet)    cartoon    "Scooby Doo is my favorite cartoon."
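
To show what grouping changes, the sketch below (assuming Python's re module) compares the grouped pattern from the table with an ungrouped variant. The "a pet store" target is an illustrative addition.

```python
import re

grouped = re.compile(r"car(toon|pet)")

# The group limits the alternation to "toon" or "pet", both preceded by "car".
carpet = grouped.search("There is a spot on the carpet.").group()
cartoon = grouped.search("Scooby Doo is my favorite cartoon.").group()

# Without parentheses the alternation spans the whole pattern: "cartoon" or "pet".
ungrouped = re.compile(r"cartoon|pet")
pet_only = ungrouped.search("a pet store").group()
no_match = grouped.search("a pet store")

print(carpet)    # carpet
print(cartoon)   # cartoon
print(pet_only)  # pet
print(no_match)  # None
```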

Capturing

Any text matched by the pattern within parentheses is captured for later use. The captures are numbered by counting the opening parentheses '(' from left to right.

Note: In engines that support named groups, a capture can also be named by using the form (?<name>expression).

If the regular expression engine supports backreferences, the match can be referred to within the same expression with \1, \2, etc.

In many cases, the captured text is also made available after a match, depending upon the implementation. In some engines, the captures are placed in special variables like $1, $2, etc.

The following table shows some regular expressions with captures and backreferences and the matches for a target string:

Regex                Matches                         Target String
(\w)\1               oo                              "scooby"
(?<ch1>\w)\k<ch1>    oo, oo, oo, oo (.NET syntax)    "scooby doooooo!"
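
A minimal sketch of capturing and backreferences, assuming Python's re module. Note that Python spells named groups differently from the form shown in this section: (?P<name>expression) for the group and (?P=name) for the backreference. The date pattern is an illustrative addition that shows how captures are numbered by opening parenthesis.

```python
import re

# \1 refers back to whatever the first group captured.
double = re.search(r"(\w)\1", "scooby")
print(double.group(), double.group(1))  # oo o

# Python's named-group syntax (differs from the (?<name>...) form above).
named = re.compile(r"(?P<ch1>\w)(?P=ch1)")
pairs = [match.group() for match in named.finditer("scooby doooooo!")]
print(pairs)  # ['oo', 'oo', 'oo', 'oo']

# Groups are numbered by the position of their opening parenthesis, left to right.
date = re.search(r"((\w+) (\d+)), (\d{4})", "Nov 5, 1955")
print(date.groups())  # ('Nov 5', 'Nov', '5', '1955')
```

After a match, Python exposes the captures through the match object (group, groups, groupdict) rather than through special variables such as $1.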

Note: To keep the parentheses from capturing matches (i.e., a non-capturing group), use the form: (?:expression).
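
The difference between capturing and non-capturing groups can be sketched like this, assuming Python's re module. The "Ms. Lovelace" target is made up for illustration.

```python
import re

capturing = re.search(r"car(toon|pet)", "carpet")
non_capturing = re.search(r"car(?:toon|pet)", "carpet")

# Both patterns match the same text, but only the first one stores a capture.
print(capturing.group(), capturing.groups())          # carpet ('pet',)
print(non_capturing.group(), non_capturing.groups())  # carpet ()

# A non-capturing group does not take a number, so the name is still group 1.
title = re.search(r"(?:Mr|Ms)\. (\w+)", "Ms. Lovelace")
print(title.group(1))  # Lovelace
```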

Quantifiers (Repetition)

To specify that a portion of a regular expression repeats, use the quantifier metacharacters '*', '?', '+', and '{ }'. These metacharacters have the following meanings:

  • exp* = match exp 0 or more times

  • exp? = match exp 0 or 1 times

  • exp+ = match exp 1 or more times

  • exp{n} = match exp exactly n times

  • exp{n,} = match exp at least n times

  • exp{n,m} = match exp at least n times, but not more than m times

The following table shows some regular expressions with quantifiers and the matches for a target string:

Regex     Matches                                         Target String
[a-z]+    he, farmer, raises, cows, pigs, and, chickens   "The farmer raises cows, pigs, and chickens"
\w.*\w    Green Eggs and Ham                              "Green Eggs and Ham"
\d{4}     1955                                            "Nov 5, 1955"
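
The quantifiers listed above can be exercised with a short sketch, assuming Python's re module. Matching is case-sensitive by default, so [a-z]+ skips the capital "T". The targets in the second half are illustrative additions.

```python
import re

sentence = "The farmer raises cows, pigs, and chickens"

# + needs at least one character from the class; uppercase "T" is not in [a-z].
lower_words = re.findall(r"[a-z]+", sentence)
print(lower_words)  # ['he', 'farmer', 'raises', 'cows', 'pigs', 'and', 'chickens']

# .* takes as much as it can while still letting the final \w match.
whole = re.search(r"\w.*\w", "Green Eggs and Ham").group()
print(whole)  # Green Eggs and Ham

year = re.search(r"\d{4}", "Nov 5, 1955").group()
print(year)  # 1955

# ? makes the preceding item optional; * allows zero or more repetitions.
optional = [bool(re.fullmatch(r"colou?r", word)) for word in ("color", "colour")]
star = re.findall(r"ab*", "a ab abbb")
print(optional)  # [True, True]
print(star)      # ['a', 'ab', 'abbb']

# {n,m} bounds the repetition count; {n,} leaves the upper end open.
bounded = re.findall(r"\d{2,3}", "1 22 333 4444")
open_ended = re.findall(r"\d{2,}", "1 22 333 4444")
print(bounded)     # ['22', '333', '444']
print(open_ended)  # ['22', '333', '4444']
```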

Summary

This section has covered some of the basic and more commonly used regular expression features, but there is much more. Please refer to the specific section regarding each of these subjects along with additional sections covering more advanced features.

License.
All information of this service is derived from the free sources and is provided solely in the form of quotations. This service provides information and interfaces solely for the familiarization (not ownership) and under the "as is" condition.
Copyright 2016 © ELTASK.COM. All rights reserved.