codeguru85/learn-regex

What is a Regular Expression?

A regular expression is a group of characters or symbols used to find a specific pattern in a text.

A regular expression is a pattern that is matched against a subject string from left to right. "Regular expression" is a mouthful, so you will usually find the term abbreviated as "regex" or "regexp". Regular expressions are used for replacing text within a string, validating forms, extracting a substring from a string based on a pattern match, and much more.

Imagine you are writing an application and want to set the rules users must follow when choosing a username. We want the username to contain only letters, numbers, underscores, and hyphens. We also want to limit the number of characters in the username so it does not look ugly. We use the following regular expression to validate a username:

[Figure: Regular expression]

The regular expression above accepts the strings "john_doe", "jo-hn_doe" and "john12_as". It does not match "Jo" because that string contains an uppercase letter and is also too short.
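The exact pattern is not reproduced in the text here, so the following is only a sketch of a username check that fits the description. The length limits (3 to 15 characters) and the helper name is_valid_username are assumptions made for illustration.

```python
import re

# Hypothetical pattern: lowercase letters, digits, underscore and hyphen.
# The 3-15 length range is an assumption, not taken from the text.
USERNAME_RE = re.compile(r"[a-z0-9_-]{3,15}")


def is_valid_username(name: str) -> bool:
    """Return True if the whole string fits the assumed username rules."""
    # fullmatch requires the pattern to cover the entire string.
    return USERNAME_RE.fullmatch(name) is not None


for candidate in ["john_doe", "jo-hn_doe", "john12_as", "Jo"]:
    print(candidate, is_valid_username(candidate))
```

Under these assumptions the first three names are accepted and "Jo" is rejected, both for its uppercase letter and for its length.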

Table of Contents

  1. Basic Matchers
  2. Meta Characters
  3. Quantifiers
  4. OR operator
  5. Character Sets
  6. Shorthand Character Sets
  7. Grouping
  8. Lookaheads
  9. Flags

1. Basic Matchers

A regular expression is just a pattern of letters and digits that we use to search a text. For example, the regular expression cat means: the letter c, followed by the letter a, followed by the letter t.

"cat" => The cat sat on the mat

The regular expression 123 matches the string "123". The regular expression is matched against an input string by comparing each character in the regular expression to each character in the input string, one after another. Regular expressions are normally case-sensitive, so the regular expression Cat would not match the string "cat".

"Cat" => The cat sat on the Cat
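The two examples above can be tried with Python's built-in re module. This is a minimal sketch; the variable names are chosen only for illustration.

```python
import re

text = "The cat sat on the Cat"

# Literal patterns match character by character and are case-sensitive.
lower_hits = re.findall("cat", text)
upper_hit = re.search("Cat", text)
no_hit = re.search("Cat", "the cat")

print(lower_hits)         # ['cat']
print(upper_hit.start())  # 19
print(no_hit)             # None
```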

2. Meta Characters

Meta characters are the building blocks of regular expressions. Meta characters do not stand for themselves but are instead interpreted in a special way. Some meta characters take on a special meaning when written inside square brackets. The meta characters are as follows:

Meta character   Description
.                Period. Matches any single character except a line break.
[ ]              Character class. Matches any character contained between the square brackets.
[^ ]             Negated character class. Matches any character that is not contained between the square brackets.
*                Matches 0 or more repetitions of the preceding symbol.
+                Matches 1 or more repetitions of the preceding symbol.
?                Makes the preceding symbol optional.
{n}              Braces. Matches “n” repetitions of the preceding symbol.
(xyz)            Character group. Matches the characters xyz in that exact order.
|                Alternation. Matches either the characters before or the characters after the symbol.
\                Escapes the next character. This allows you to match reserved characters [ ] ( ) { } . * + ? ^ $ \ |
^                Matches the beginning of the input.
$                Matches the end of the input.
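A few of the meta characters listed above can be seen in action with Python's re module. This is a small sketch only; the sample strings "a.b" and "color colour" are made up for illustration.

```python
import re

text = "The car parked in the garage."

# An unescaped period matches any character; an escaped one matches only ".".
dots_any = re.findall(r".", "a.b")
dots_literal = re.findall(r"\.", "a.b")

# ^ and $ anchor the pattern to the beginning and end of the input.
starts_with_the = re.search(r"^The", text) is not None
ends_with_period = re.search(r"\.$", text) is not None

# | matches either alternative; ? makes the preceding symbol optional.
either = re.findall(r"car|garage", text)
optional_u = re.findall(r"colou?r", "color colour")

print(dots_any)      # ['a', '.', 'b']
print(dots_literal)  # ['.']
print(starts_with_the, ends_with_period)  # True True
print(either)        # ['car', 'garage']
print(optional_u)    # ['color', 'colour']
```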

2.1 Full stop

The full stop . is the simplest example of a meta character. The meta character . matches any single character. By default it does not match line-break characters. For example, the regular expression .ar means: any character, followed by the letter a, followed by the letter r.

".ar" => The car parked in the garage.
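As a quick check of the example above, Python's re.findall returns every match of .ar in the sentence. The second call is an added illustration showing that, by default, the period does not match a newline.

```python
import re

text = "The car parked in the garage."

# Any single character followed by "ar".
matches = re.findall(r".ar", text)

# A newline in front of "ar" is not accepted by "." by default.
newline_matches = re.findall(r".ar", "\nar")

print(matches)          # ['car', 'par', 'gar']
print(newline_matches)  # []
```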

2.2 Character set

Character sets are also called character classes. Square brackets are used to specify character sets. Use a hyphen inside a character set to specify a range of characters. The order of the characters inside the square brackets doesn't matter, although a range itself must run from its lower to its higher end. For example, the regular expression [Tt]he means: an uppercase T or lowercase t, followed by the letter h, followed by the letter e.

"[Tt]he" => The car parked in the garage.
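The character set example above can be sketched in Python as follows. The range example on "abcd" is an extra illustration, not taken from the text.

```python
import re

text = "The car parked in the garage."

# [Tt] accepts either case for the first letter.
the_matches = re.findall(r"[Tt]he", text)

# Listing the characters in the other order gives the same result.
swapped_matches = re.findall(r"[tT]he", text)

# A hyphen inside the brackets specifies a range of characters.
range_matches = re.findall(r"[a-c]", "abcd")

print(the_matches)      # ['The', 'the']
print(swapped_matches)  # ['The', 'the']
print(range_matches)    # ['a', 'b', 'c']
```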

2.2.1 Negated character set

In general, the caret symbol represents the start of the string, but when it is typed immediately after the opening square bracket it negates the character set. For example, the regular expression [^c]ar means: any character except c, followed by the letter a, followed by the letter r.

"[^c]ar" => The car parked in the garage.
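The negated set from the example above behaves like this in Python. It is a minimal sketch using the same sentence.

```python
import re

text = "The car parked in the garage."

# [^c] accepts any character except "c", so "car" is skipped.
negated_matches = re.findall(r"[^c]ar", text)

print(negated_matches)  # ['par', 'gar']
```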

2.2.2 Repeating character set

We can repeat a character class by using the +, * or ? operators. For example, the regular expression [a-z]+ means: one or more lowercase letters in a row.

"[a-z]+" => The car parked in the garage.
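Here is a short Python sketch of the repeated character set above. The comparison between + and * on an empty string is an added illustration of the difference between "one or more" and "zero or more".

```python
import re

text = "The car parked in the garage."

# One or more lowercase letters in a row; the uppercase "T" is left out.
words = re.findall(r"[a-z]+", text)

# "*" allows zero repetitions, while "+" needs at least one.
star_matches_empty = re.fullmatch(r"[a-z]*", "") is not None
plus_matches_empty = re.fullmatch(r"[a-z]+", "") is not None

print(words)  # ['he', 'car', 'parked', 'in', 'the', 'garage']
print(star_matches_empty, plus_matches_empty)  # True False
```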
