Grouping and Capturing
The grouping constructs delineate the subexpressions of a regular expression and capture the matched substrings of an input string.
These grouping constructs can be used to:
- Match a subexpression that is repeated in the input string
- Apply a quantifier to a subexpression
- Reuse a matched subexpression later in the pattern or in a replacement string via backreferences
- Capture and retrieve matched subexpressions
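As a small illustration of applying a quantifier to a subexpression, the sketch below compares a pattern with and without a group. It uses Python's re module, which is an assumption made for illustration only; the patterns and input string are invented for this example.

```python
import re

# Without a group, the quantifier + applies only to the preceding character "b".
ungrouped = re.fullmatch(r"ab+", "ababab")

# With a group, the quantifier + applies to the whole subexpression "ab".
grouped = re.fullmatch(r"(ab)+", "ababab")

print(ungrouped)          # None
print(grouped.group(0))   # ababab
print(grouped.group(1))   # ab
```

When a capturing group is repeated, the group keeps the text from its last iteration, which is why group 1 holds a single "ab" here.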
The following grouping construct captures a matched subexpression:
( expression )
where expression is any valid regular expression pattern.
Capturing groups are numbered by counting their opening parentheses from left to right, starting from one. In the expression ((A)(B(C))), for example, there are four groups:
- Group 1: ((A)(B(C)))
- Group 2: (A)
- Group 3: (B(C))
- Group 4: (C)
Group zero always stands for the entire expression.
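The numbering rule can be checked directly. The following is a minimal sketch in Python (chosen here for illustration; the input "ABC" is an assumed test string) that prints each group of ((A)(B(C))) alongside its number.

```python
import re

# Match the nested pattern against a string that satisfies every group.
match = re.fullmatch(r"((A)(B(C)))", "ABC")

# match.re.groups is the number of capturing groups in the pattern.
for number in range(match.re.groups + 1):
    print(number, match.group(number))

# Prints:
# 0 ABC
# 1 ABC
# 2 A
# 3 BC
# 4 C
```

Groups 0 and 1 hold the same text here only because the outermost parentheses wrap the entire pattern.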
Capturing groups can be accessed in the following ways:
- By using a backreference within the regular expression
- By using a named backreference within the regular expression
- By using the $number sequence
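The three access methods listed above can be sketched as follows. This example assumes Python's re module, where a named backreference is written (?P=name) and a group reference in a replacement string is written \1 or \g<1> rather than $1; other engines use different spellings. The sample strings are invented for illustration.

```python
import re

text = "2024-2024"

# Numbered backreference inside the pattern: \1 must repeat what group 1 matched.
numbered = re.fullmatch(r"(\d+)-\1", text)

# Named backreference inside the pattern: (?P=year) must repeat the group "year".
named = re.fullmatch(r"(?P<year>\d+)-(?P=year)", text)

# Group references in a replacement string: swap the two captured words.
swapped = re.sub(r"(\w+) (\w+)", r"\2 \1", "hello world")

print(numbered.group(1))     # 2024
print(named.group("year"))   # 2024
print(swapped)               # world hello
```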
Note: Groups beginning with (?: are non-capturing groups. They do not capture text and do not count towards the group total.
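To see the effect on the group total, the sketch below compares two otherwise identical patterns, one with a capturing group and one with a non-capturing group. Python is assumed for illustration, and the patterns are made up for this example.

```python
import re

capturing = re.compile(r"(ab)+(c)")
non_capturing = re.compile(r"(?:ab)+(c)")

# The non-capturing group is skipped when groups are counted and numbered.
print(capturing.groups)       # 2
print(non_capturing.groups)   # 1

# In the second pattern, (c) is therefore group 1 rather than group 2.
print(non_capturing.fullmatch("ababc").group(1))  # c
```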
The following example uses grouping, capturing, and a backreference to match duplicate words in a string:
The first subexpression captures one or more word characters and is followed by a whitespace character. Finally, a backreference to the first capturing group matches the same word again.
For the text "This is a test test...", this regular expression would produce the following matches:
"This is a test test..."
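The duplicate-word match can be reproduced with a short script. This is a sketch under assumptions: the pattern (\w+)\s\1 is inferred from the description above rather than copied from the original example, and Python's re module stands in for whichever engine the example was written for.

```python
import re

# Assumed pattern: one or more word characters, a whitespace character,
# then a backreference to group 1.
pattern = re.compile(r"(\w+)\s\1")
text = "This is a test test..."

matches = [m.group(0) for m in pattern.finditer(text)]
print(matches)  # ['is is', 'test test']

# Hypothetical stricter variant: word boundaries restrict matches to whole words.
whole_words = re.compile(r"\b(\w+)\s\1\b")
print([m.group(0) for m in whole_words.finditer(text)])  # ['test test']
```

Without word boundaries, the assumed pattern also matches "is is" inside "This is", because the backreference only requires the same characters to repeat, not a whole word.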