Ron Panduwana edited this page Sep 27, 2015 · 5 revisions

1. IPv4 Address

/byte/dot/byte/dot/byte/dot/byte/
    byte = '0'..'255'
    dot = '.'

Compared to plain regex:

(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)(\.(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)){3}
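
To see what the plain regex accepts, here is a minimal Python sketch using the standard `re` module. The pattern is assembled from the same byte sub-pattern shown above. The helper name `is_ipv4` and the choice to require a whole-string match are assumptions made for illustration.

```python
import re

# One byte, exactly as in the plain regex above.
BYTE = r"(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)"
# byte, then three repetitions of dot + byte.
IPV4 = re.compile(BYTE + r"(\." + BYTE + r"){3}")


def is_ipv4(text):
    """Return True if the whole string is a dotted-quad address (hypothetical helper)."""
    return IPV4.fullmatch(text) is not None


print(is_ipv4("192.168.0.1"))
print(is_ipv4("256.1.1.1"))
```

The second call is rejected because no alternative of the byte sub-pattern can consume `256`.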

--

2. BEGIN-something-END (e.g. extracting <title> from HTML)

/begin/__/end/
    begin = '<title>'
    end = '</title>'

The above oprex compiles to the following regex:

<title>(?:[^<]++|<(?!/title>))++</title>

This regex is correct and fast, but hard to read. The more familiar <title>.+?</title> looks nicer, but it is slow and sometimes problematic.

With oprex we get both: the source stays readable, and the compiled regex is correct and fast.
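
One way the lazy version can go wrong (possibly not the only one the author had in mind) is a title that spans two lines, since `.` does not match a newline by default. The sketch below compares the two in Python. Note that `TEMPERED` is a simplified, non-possessive variant of the compiled regex, an assumption made so the example runs on the standard `re` module in Python versions before 3.11, which lack `++`.

```python
import re

html = "<html><head><title>Hello\nWorld</title></head></html>"

# The "nice-looking" lazy pattern.
LAZY = re.compile(r"<title>.+?</title>")

# Simplified variant of the compiled regex: any character other than "<",
# or a "<" that does not start "</title>". Possessive quantifiers dropped.
TEMPERED = re.compile(r"<title>(?:[^<]|<(?!/title>))+</title>")

lazy_match = LAZY.search(html)
tempered_match = TEMPERED.search(html)

print(lazy_match)
print(tempered_match.group(0) if tempered_match else None)
```

Here the lazy pattern finds nothing, while the negated-class pattern still captures the whole element.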

--

3. Date

/yyyy/separator/mm/=separator/dd/
    yyyy = 4 of digit
    mm = '01'..'12'
    dd = '01'..'31'
    [separator] = 1 of: - /

It compiles to (prettified):

\d{4}
(?P<separator>[\-/])
(?>1[0-2]|0[1-9])
(?P=separator)
(?>3[01]|[12]\d|0[1-9])(?!\d)
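
The backreference is what forces both separators to be the same character. A minimal Python sketch of the compiled regex follows. It swaps the atomic groups `(?>...)` for plain non-capturing groups, an assumption made so it runs on `re` before Python 3.11; `find_date` is a hypothetical helper.

```python
import re

DATE = re.compile(r"""
    \d{4}
    (?P<separator>[\-/])
    (?:1[0-2]|0[1-9])
    (?P=separator)
    (?:3[01]|[12]\d|0[1-9])(?!\d)
""", re.VERBOSE)


def find_date(text):
    """Return the first date found in text, or None (hypothetical helper)."""
    m = DATE.search(text)
    return m.group(0) if m else None


print(find_date("2015-09-27"))
print(find_date("2015-09/27"))
```

The second input mixes `-` and `/`, so `(?P=separator)` fails and nothing is returned.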

--

4. Time

<<|
  |/hh/colon/mm/colon/ss/space/ampm/ -- 12-hour format
  |/HH/colon/mm/colon/ss/            -- 24-hour format

    hh = '1'..'12'
    HH = 'o0'..'23'
    mm = ss = '00'..'59'
    colon: :
    ampm = (ignorecase) <<|
                          |'AM'
                          |'PM'

Compiles to:

(?>1[0-2]|[1-9])
:[0-5]\d
:[0-5]\d (?i:AM|PM)
|
(?>2[0-3]|1\d|0?\d)
:[0-5]\d
:[0-5]\d(?!\d)
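
The two branches can be tried out in Python roughly as follows. This is a sketch only: atomic groups are again replaced by non-capturing groups (an assumption, for `re` before Python 3.11), the literal space is written as `[ ]` because of verbose mode, and `is_time` is a hypothetical helper that requires a whole-string match.

```python
import re

TIME = re.compile(r"""
    (?:1[0-2]|[1-9])            # hh, 12-hour format
    :[0-5]\d
    :[0-5]\d[ ](?i:AM|PM)
    |
    (?:2[0-3]|1\d|0?\d)         # HH, 24-hour format, leading zero optional
    :[0-5]\d
    :[0-5]\d(?!\d)
""", re.VERBOSE)


def is_time(text):
    """Return True if the whole string is a time in either format."""
    return TIME.fullmatch(text) is not None


print(is_time("9:05:30 pm"))
print(is_time("23:59:59"))
print(is_time("24:00:00"))
```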

--

5. Blood Type

/type/rhesus/
    [type] = <<|
               |'AB'
               |1 of: A B O

    [rhesus] = <<|
                 |'+'
                 |'-'
                 |

Compiles to:

(?P<type>AB|[ABO])
(?P<rhesus>\+|-|)
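
The bracketed names become named capture groups, so the parts can be read back by name. A small Python sketch, where `parse_blood_type` is a hypothetical helper and the whole-string match is an assumption:

```python
import re

BLOOD = re.compile(r"(?P<type>AB|[ABO])(?P<rhesus>\+|-|)")


def parse_blood_type(text):
    """Return a dict with 'type' and 'rhesus', or None if text does not match."""
    m = BLOOD.fullmatch(text)
    return m.groupdict() if m else None


print(parse_blood_type("AB+"))
print(parse_blood_type("O"))
```

The empty last alternative in `rhesus` is what lets a bare `O` match, with `rhesus` captured as an empty string.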

--

6. Quoted String (with escape support)

/opening_quote/contents/=opening_quote/
    [opening_quote] = 1 of quote
*)      quote: ' "
    contents = @1.. of <<|
                         |non-quote
                         |non_opening_quote
                         |escaped

        non_opening_quote = <@>
                      |quote|
            <!=opening_quote|

        escaped = <@>
            <backslash|
                      |quote|

Compiles to:

(?P<opening_quote>['"])
(?:[^'"]|['"](?<!(?P=opening_quote))|(?<=\\)['"])++
(?P=opening_quote)

--

7. Comma-Separated Values

//value/more_values?//
    value = @1.. of non-comma
*)      comma: ,
    more_values = /comma/value/more_values?/

Compiles to:

(?m:^)
[^,]++
(?P<more_values>,[^,]++(?&more_values)?)?
(?m:$)
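
Python's standard `re` module has no `(?&name)` recursion, so the compiled regex above will not run there as-is. For this particular pattern the recursion only expresses "zero or more further values", which a plain repetition can also say. The sketch below uses that swapped-in, non-recursive form; it is an assumed equivalent, not what oprex emits, and `is_csv_line` is a hypothetical helper.

```python
import re

# value, then zero or more ",value"; ^ and $ match at line boundaries.
CSV_LINE = re.compile(r"^[^,]+(?:,[^,]+)*$", re.MULTILINE)


def is_csv_line(text):
    """Return True if text contains a line of non-empty comma-separated values."""
    return CSV_LINE.search(text) is not None


print(is_csv_line("a,b,c"))
print(is_csv_line("a,,c"))
```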

--

8. Password Checks

(unicode)
<@>
|min_length_8>
|has_number>
|has_min_2_symbols>

    min_length_8 = @8.. of any
    has_number = /__?/digit/
    has_min_2_symbols = 2 of /__?/non-alnum/

Compiles to:

(?V1wu)
(?=(?s:.){8,}+)
(?=\D*+\d)
(?=(?:\p{Alphanumeric}*+\P{Alphanumeric}){2})
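
The compiled regex relies on `(?V1wu)` and `\p{Alphanumeric}`, which the standard `re` module does not support. The three lookaheads can still be approximated there. In the sketch below, `[^\W_]` stands in for "alphanumeric" and `[\W_]` for "non-alphanumeric"; that substitution, the dropped possessive quantifiers, and the helper name `is_acceptable` are all assumptions, so treat it as an illustration of the stacked-lookahead idea rather than an exact port.

```python
import re

PASSWORD = re.compile(r"""
    (?=.{8,})                  # min_length_8
    (?=\D*\d)                  # has_number
    (?=(?:[^\W_]*[\W_]){2})    # has_min_2_symbols
""", re.VERBOSE | re.DOTALL)


def is_acceptable(password):
    """Return True if all three lookahead checks pass at the start of the string."""
    return PASSWORD.match(password) is not None


print(is_acceptable("abc123!?"))
print(is_acceptable("abc123!x"))
```

Each lookahead starts from the same position and consumes nothing, which is why the checks can be listed independently of one another.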

--

9. Balanced Parentheses

/non_parens?/balanced_parens/non_parens?/.
    non_parens = @1.. of not: ( ) 
    balanced_parens = /open/contents?/close/
        open: (
        close: )
        contents = @1.. of <<|
                             |non_parens
                             |balanced_parens

Compiles to:

[^()]*+
(?P<balanced_parens>\((?:[^()]++|(?&balanced_parens))*+\))
[^()]*+
\Z