regex

Regular expressions, also known as RegEx or RegExp, are a standard language for text matching, used by a great many tools. If you work with text matching regularly, you should definitely learn it.

It is very easy to start using RegEx for basic things, but it is very powerful (and complicated) if you want to master it.

Some notes on RegEx to keep in mind:

RegEx is greedy by default - a quantifier will match as much as it is permitted to, which often gives unexpected results. Try to limit your patterns.
RegEx uses anchors - special symbols which match a position rather than a character, like word boundaries, line starts/ends, etc. Anchors are a must for any complicated pattern.
The main instrument of RegEx is masks, which allow more than one specific symbol at a given place. Most likely you already know masks from other notations, like `*` as a mask for any number of any symbols in glob patterns and Windows file masks.
RegEx allows the use of quantifiers - how many of something you want to match.
RegEx allows lookarounds (lookaheads and lookbehinds), positive or negative - a condition on the text ahead of or behind the current position, checked without becoming part of the match.
RegEx allows you to define groups - either positional or named - and to reference their values, either in the same pattern definition or in a substitution.
RegEx uses many special characters. If you need to match one of those literally in text, escape it using `\`. For example, `|` has the meaning of OR, but to match it literally you need to use `\|`.
RegEx allows several ways to write a specific pattern. If you can't write the pattern one way, just use one of the alternatives. For example, OTLoV filtering uses a space to separate individual rules, so you won't be able to use a space directly in RegEx - you can use `\s` instead, or just `.`.
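
The notes on greediness and escaping can be tried out with a short sketch. It uses Python's `re` module purely for illustration (OTLoV itself uses the .NET flavor, but these particular patterns behave the same way in both); the sample strings and variable names are made up for the example.

```python
import re

text = 'name="alpha" value="beta"'

# Greedy: .* grabs as much as it can, so the match runs to the LAST quote.
greedy = re.search(r'"(.*)"', text).group(1)

# Limited: [^"]* cannot cross a quote, so the match stops at the first closing quote.
limited = re.search(r'"([^"]*)"', text).group(1)

# Unescaped | means OR; escaped \| matches a literal pipe character.
alternation = re.findall(r'a|b', 'a|b')
literal_pipe = re.findall(r'a\|b', 'a|b')

print(greedy)        # alpha" value="beta
print(limited)       # alpha
print(alternation)   # ['a', 'b']
print(literal_pipe)  # ['a|b']
```

Replacing `.` with a narrower class such as `[^"]` is one simple way of limiting a pattern so that greediness cannot run past the intended end.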

There are numerous cheat sheets on RegEx; just google a few and use whichever you like.

There are great services which help you understand RegEx better and learn to use it. I prefer regex101.com (you need to select the .NET flavor there).

A few of the most commonly used basic patterns:

Pattern	Meaning
`.`	Any one symbol
`\s`	whiteSpace symbol (space, tab, newline if allowed by options, etc.). Lowercase `s`.
`\S`	not a whiteSpace symbol. Uppercase `S`.
`\w`	Word character (alpha, numeric, underscore)
`[a-z]`	Specific list of characters. Lowercase `a` to `z` in this case (regex is case sensitive by default, but OTLoV uses a default option to make it case insensitive; can be changed)
`[^:]`	Any character except `:`
`.*`	`*` denotes a quantifier - any number (0 or more) of the character before it. In this case `.` before it denotes any character. So `.*` means any number of any characters.
`\w+`	`+` denotes a quantifier - one or more of the character before it. In this case `\w` before it means a word character. So `\w+` means one or more word characters.
`[a-zA-Z0-9_]+`	roughly the same as `\w+`, written as an explicit list (`\w` also covers digits and, in .NET, letters and digits from other alphabets).
`fail(ure)?`	`?` denotes a quantifier - zero or one of the item before it. In this case the `(ure)` group comes before `?`, which must match the characters `ure`. So `fail(ure)?` will match `fail` and `failure` (as we have no anchors here, the pattern will also match `fail` in `failed`, for example).
`^`	anchor to match the start of the line. For multiline text behavior depends on options.
`\n`	matches new line character (OTLoV specially converts any line breaks into only `\n`).
`$`	anchor to match the end of the line. For multiline text behavior depends on options.
`\d{1,5}`	`\d` means a decimal digit, `{1,5}` is a quantifier - 1 to 5 of the item before it. So `\d{1,5}` means from 1 to 5 digits.
`ab|ba`	OR pattern - either `ab` or `ba`.
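
A few of the patterns from the table can be checked with a small sketch. It is written with Python's `re` module as an assumption for illustration only; the option names `re.MULTILINE` and `re.IGNORECASE` are Python's, and the sample strings are invented.

```python
import re

# fail(ure)? has no anchors, so it also finds "fail" inside "failed".
fail_hits = [m.group(0) for m in re.finditer(r'fail(ure)?', 'fail failure failed')]

# \d{1,5} takes at most five digits per match, so a longer run is split.
digit_runs = re.findall(r'\d{1,5}', 'id 42, code 1234567')

# [^:]+ collects everything up to (not including) the next colon.
fields = re.findall(r'[^:]+', 'host:port:path')

# ^ matches at the start of every line only when the multiline option is on.
lines = 'first line\nsecond line'
without_option = re.findall(r'^\w+', lines)
with_option = re.findall(r'^\w+', lines, flags=re.MULTILINE)

# [a-z]+ is case sensitive unless the ignore-case option is set.
case_sensitive = re.findall(r'[a-z]+', 'Error')
case_insensitive = re.findall(r'[a-z]+', 'Error', flags=re.IGNORECASE)

print(fail_hits)         # ['fail', 'failure', 'fail']
print(digit_runs)        # ['42', '12345', '67']
print(fields)            # ['host', 'port', 'path']
print(without_option)    # ['first']
print(with_option)       # ['first', 'second']
print(case_sensitive)    # ['rror']
print(case_insensitive)  # ['Error']
```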

More difficult examples:

Pattern	Meaning
`\b(OK|KO)\b`	`\b` - a word boundary (a position with a `\w` character on exactly one side). `(...)` - a group. `OK|KO` - OR pattern - either `OK` or `KO`. `\b(OK|KO)\b` will match the standalone words `OK` or `KO`, but not the same characters inside a longer word (say, `OKKIE` won't be matched, as there's no word boundary after `OK`).
`(?<!no )fail`	`(?<!...)` - negative lookbehind: the pattern denoted by `...` must NOT appear immediately before. `fail` - match `fail` literally. `(?<!no )fail` means: match `fail` unless it is directly preceded by `no ` (with the space), i.e. `no fail` won't match, while a plain `fail` will be matched.
`\d*[1-9]\d*`	a number which does not consist only of zeroes (it contains at least one digit from 1 to 9).
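
The word-boundary and lookbehind examples can be exercised with the sketch below. Python's `re` module is used as an assumption for illustration (the alternation is written with a plain `|`), and the helper name `has_real_fail` and the test strings are hypothetical.

```python
import re

# \b on both sides: only standalone OK / KO are accepted.
standalone = re.compile(r'\b(OK|KO)\b')
words = standalone.findall('OK, OKKIE, KO and NOK')

# Negative lookbehind: "fail" must not be directly preceded by "no ".
not_negated = re.compile(r'(?<!no )fail')


def has_real_fail(line: str) -> bool:
    """Return True if the line contains a 'fail' that is not preceded by 'no '."""
    return not_negated.search(line) is not None


print(words)                       # ['OK', 'KO']
print(has_real_fail('no fail'))    # False
print(has_real_fail('test fail'))  # True
```

Note that the lookbehind only inspects the three characters directly before `fail`, so a line such as `no big fail` would still be reported.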

To use found values for replacements (Replacement):

Pattern	Meaning
`$0`	Full text matched by the RegEx pattern
`$1`	Text matched by the first RegEx group (groups are defined in parentheses and numbered from left to right; non-capturing groups defined as `(?:...)` are ignored here)
`$2`	Text matched by the second RegEx group
`${name}`	Value of the named group `name`, defined as `(?<name>...)`
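
The same idea can be sketched in Python, with one important caveat: Python's `re` module spells these references differently from the .NET-style table above. It uses `\g<0>` for the full match, `\1` and `\2` for numbered groups, `\g<name>` for named groups, and `(?P<name>...)` to define a named group. The pattern and sample text are invented for the example.

```python
import re

# Two named groups; they are also numbered 1 and 2 from left to right.
pattern = re.compile(r'(?P<key>\w+)=(?P<value>\w+)')
text = 'level=error'

whole = pattern.sub(r'[\g<0>]', text)             # full match, like $0
swapped = pattern.sub(r'\2=\1', text)             # numbered groups, like $2 and $1
named = pattern.sub(r'\g<key>: \g<value>', text)  # named groups, like ${key}

# A non-capturing group (?:...) gets no number, so (\d+) is group 1 here.
skipped = re.sub(r'(?:id|key)=(\d+)', r'#\1', 'id=7 key=9')

print(whole)    # [level=error]
print(swapped)  # error=level
print(named)    # level: error
print(skipped)  # #7 #9
```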

