An Mptn pattern is applied to a string. The result is the set of all variable assignments under which the pattern matches the string.
Atomic patterns
Matches itself. Characters ., [, ], {, }, (, ), `, ', <, >, ?, *, +, \, & and | must be escaped by a backslash.
Matches any character from the set listed. The set may contain ranges, written as the starting and ending characters separated by -. To include the character ^ in the set, place it anywhere except immediately after the opening bracket.
Matches any character except those listed.
Matches any character.
If the variable named by identifier has a value, the string matched should coincide with this value. If the variable does not have a value, it gets the matched string as its new value. In case there is a pattern associated with the variable name, the string should match this pattern. Variable assignments inside the associated pattern do not propagate to the match where the variable occurs.
The matcher procedure associated with identifier is called. The string after the : is passed to it as a parameter. The procedure is allowed to set variable values in the match where it is called.
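The variable rule above can be illustrated with a minimal sketch in Python. This is not Mptn code: the function name bind, the dictionary of assignments and the restrictions table are all hypothetical, and Python's re module stands in for the pattern associated with a variable name.

```python
import re

def bind(assignments, name, matched, restrictions=None):
    """Return the updated assignments if `matched` is acceptable for
    variable `name`, or None if the match must be rejected.

    `restrictions` is a hypothetical table mapping a variable name to a
    pattern (a Python regular expression here, as a stand-in)."""
    restrictions = restrictions or {}
    # A pattern associated with the variable name must match the string.
    pattern = restrictions.get(name)
    if pattern is not None and re.fullmatch(pattern, matched) is None:
        return None
    # A variable that already has a value accepts only that same value.
    if name in assignments:
        return assignments if assignments[name] == matched else None
    # An unassigned variable takes the matched string as its new value.
    updated = dict(assignments)
    updated[name] = matched
    return updated
```

For instance, bind({}, "x", "ab") assigns ab to x, while bind({"x": "ab"}, "x", "cd") rejects the match because x already holds a different value.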
Non-atomic patterns
The string should either be empty or match pattern.
Matches zero or more occurrences of pattern.
Matches one or more occurrences of pattern.
Matches the concatenation of pattern1 and pattern2.
Same as concatenation, but only one possible assignment is given for pattern1, which makes the corresponding part of the string as short as possible.
Same as concatenation, but only one possible assignment is given for pattern1, which makes the corresponding part of the string as long as possible.
The string should match both pattern1 and pattern2.
The string should match at least one of pattern1 or pattern2.
pattern is matched against the string.
pattern is matched against the string; all the variable assignments done inside the pattern are forgotten.
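The difference between the three concatenation forms described above can be sketched in Python. This is only an illustration under stated assumptions, not Mptn's implementation: sub-patterns are modelled as plain predicates on strings, all function names are hypothetical, and "shortest" and "longest" are taken to mean the shortest or longest first part among the splits for which the whole concatenation matches.

```python
def concat(text, match1, match2):
    """Yield every (left, right) split of text in which left satisfies
    match1 and right satisfies match2, shortest left part first."""
    for i in range(len(text) + 1):
        left, right = text[:i], text[i:]
        if match1(left) and match2(right):
            yield left, right

def concat_shortest(text, match1, match2):
    """Keep only the split whose first part is as short as possible."""
    return list(concat(text, match1, match2))[:1]

def concat_longest(text, match1, match2):
    """Keep only the split whose first part is as long as possible."""
    return list(concat(text, match1, match2))[-1:]

def anything(s):
    """Stand-in for an unrestricted variable: accepts any string."""
    return True

def starts_with_b(s):
    """Stand-in for a sub-pattern that begins with the character b."""
    return s.startswith("b")
```

With these definitions, list(concat("abc", anything, anything)) produces four splits for a three-character string, concat_shortest("abcbd", anything, starts_with_b) keeps only the split at the first b, and concat_longest keeps only the split at the last b.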
In this section I will give several examples of Mptn usage. I will assume that the variable v is restricted to the pattern [aeiou] and the variables c, c1 and c2 to the pattern [bcdfghjklmnpqrstvwxyz].
Pattern: abcd
Matches with: String abcd.
Pattern: a*
Matches with: A (possibly empty) string of letters a.
Pattern: {x}
Matches with: Any string (assigned to x).
Pattern: {x}{y}
Matches with: Any string. For a string n bytes long, the iterator will return n+1 variants, splitting the string between variables x and y.
Pattern: {x}>b{y}
Matches with: Any string containing at least one b. The variable x will contain the part of the string up to the last occurrence of b, variable y the part of the string after the last b.
Pattern: {c}
Matches with: Any consonant.
Pattern: {c}{v}
Matches with: An open syllable.
Pattern: ({x}&`{c1}{v}{c2}?')({y}&`{c1}{v}{c2}?'*)
Matches with: A sequence of syllables of structure CV or CVC. The first syllable is assigned to variable x and the rest of the string to y. Variables c1, c2 and v remain unassigned.
The last example could be expressed in traditional regular expressions as ([bcdfghjklmnpqrstvwxyz][aeiou][bcdfghjklmnpqrstvwxyz]?)(([bcdfghjklmnpqrstvwxyz][aeiou][bcdfghjklmnpqrstvwxyz]?)*). It seems to me that the Mptn variant is easier to write and understand.
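For comparison, the regular-expression form of the syllable example can be tried with Python's re module. This is a sketch, not part of Mptn: the names C, V, SYLLABLE, WORD and split_first_syllable are hypothetical, the repeated part is written as a non-capturing group so that only two groups are returned, and the whole string is required to match.

```python
import re

C = "[bcdfghjklmnpqrstvwxyz]"   # consonants
V = "[aeiou]"                   # vowels
SYLLABLE = f"{C}{V}{C}?"        # CV or CVC

# Group 1 is the first syllable, group 2 the remaining syllables.
WORD = re.compile(f"({SYLLABLE})((?:{SYLLABLE})*)")

def split_first_syllable(text):
    """Return (first_syllable, rest) if text is a sequence of CV/CVC
    syllables, or None if it is not."""
    m = WORD.fullmatch(text)
    return m.groups() if m else None
```

Calling split_first_syllable("banana") splits the word into ba and nana, while a string such as "a" is rejected because it does not start with a consonant.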
At present, Mptn only works with 8-bit characters. However, I have tried to keep the code clean enough to allow a later move to Unicode strings. So far, the biggest problem I see is representing sets of characters. In the future I may borrow Henry Spencer's code from Tcl's excellent regular expression package. But the only provision I have made for this so far is consistently renaming char to chr throughout the Mptn source code. Anyone willing to help is welcome.