WebFund 2014W Lecture 20
Basics of Regular Expressions
- / marks the start and end of a pattern, e.g. /abc/
- . matches any single character (by default, any character except a line break)
- * is 0 or more repeats, + is one or more repeats
- Thus .* matches any number of characters (including none)
- () denote groups, normally for extraction or later substitution
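- Groups are numbered left to right by opening parenthesis; in replacement strings in many languages (e.g. JavaScript, Perl) the first () is $1, the second is $2, and so on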
- can include letter ranges in [], e.g. [a-z]
- An all lowercase word with at least one character is: /[a-z]+/
- | means or (as usual); "and" is implicit, i.e. writing two items side by side means one followed by the other
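
The basics above can be tried out directly. This is a minimal sketch in JavaScript (assumed here because the /.../ and $1 notation matches it); the function names and sample strings are made up for illustration only.

```javascript
// * allows zero repeats, + requires at least one.
const starMatches = /ab*c/.test("ac");   // true: zero b's is allowed
const plusMatches = /ab+c/.test("ac");   // false: at least one b is needed

// | means or: matches if either alternative appears anywhere in the text.
const eitherMatches = /cat|dog/.test("hotdog");   // true

// [a-z]+ finds the first run of one or more lowercase letters.
// (Hypothetical helper name.)
function firstLowercaseWord(text) {
  const m = text.match(/[a-z]+/);
  return m ? m[0] : null;
}

// Groups for extraction: m[1] is the first (), m[2] the second.
// (Hypothetical helper name.)
function splitAtSign(text) {
  const m = text.match(/(.*)@(.*)/);
  return m ? { before: m[1], after: m[2] } : null;
}

// Groups for substitution: $1 and $2 refer to the first and second ().
// (Hypothetical helper name.)
function swapPair(text) {
  return text.replace(/([a-z]+)=([a-z]+)/, "$2=$1");
}

console.log(firstLowercaseWord("Hello World"));
console.log(splitAtSign("user@example.com"));
console.log(swapPair("key=value"));
```

Note that /[a-z]+/ on "Hello World" skips the capital H and returns the first lowercase run, since the pattern is free to match anywhere in the string.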
There are online tools that test a regular expression and explain what it matches
Escaped characters
- \ is used to treat special characters as literals
- In URLs, % followed by two hex digits denotes a character code (percent-encoding), e.g. %20 is a space
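
A short sketch of both kinds of escaping, again in JavaScript. It assumes the % notation refers to URL percent-encoding; the helper names are hypothetical, while decodeURIComponent, parseInt, and String.fromCharCode are built in.

```javascript
// Escaped, \. matches only a literal dot; unescaped, . matches any character.
function hasLiteralDot(text) {
  return /a\.b/.test(text);
}

// Built-in decoding of percent-encoded text.
function decodePercent(text) {
  return decodeURIComponent(text);
}

// Find the first %XX escape with a regex and return its character code.
// The group captures the two hex digits after the % sign.
function firstPercentCode(text) {
  const m = text.match(/%([0-9A-Fa-f][0-9A-Fa-f])/);
  return m ? parseInt(m[1], 16) : null;
}

console.log(hasLiteralDot("a.b"), hasLiteralDot("axb"));
console.log(decodePercent("hello%20world"));
console.log(String.fromCharCode(firstPercentCode("a%2Fb")));
```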
Chomsky hierarchy