Lua LPEG library

LPeg is a pattern-matching library for Lua, based on Parsing Expression Grammars (PEGs).

Operator | Description |
---|---|
lpeg.P(string) | Matches string literally |
lpeg.P(number) | Matches exactly number characters |
lpeg.S(string) | Matches any character in string (Set) |
lpeg.R("xy") | Matches any character between x and y (Range) |
patt^n | Matches at least n repetitions of patt |
patt^-n | Matches at most n repetitions of patt |
patt1 * patt2 | Matches patt1 followed by patt2 |
patt1 + patt2 | Matches patt1 or patt2 (ordered choice) |
patt1 - patt2 | Matches patt1 if patt2 does not match |
-patt | Equivalent to ("" - patt) |
#patt | Matches patt but consumes no input |

As a very simple example, lpeg.R("09")^1 creates a pattern that matches a non-empty sequence of digits. As a not so simple example, -lpeg.P(1) (which can be written as lpeg.P(-1) or simply -1 for operations expecting a pattern) matches an empty string only if it cannot match a single character; so, it succeeds only at the subject's end.

In addition to the functions documented here, some arithmetic operators have special effects on patterns:

#patt

Returns a pattern that matches only if the input string matches patt, but without consuming any input, independently of success or failure. (This pattern is equivalent to &patt in the original PEG notation.) When it succeeds, #patt produces all captures produced by patt.

-patt

Returns a pattern that matches only if the input string does not match patt. It does not consume any input, independently of success or failure. (This pattern is equivalent to !patt in the original PEG notation.) As an example, the pattern -lpeg.P(1) matches only the end of string. This pattern never produces any captures, because either patt fails or -patt fails. (A failing pattern never produces captures.)

patt1 + patt2

Returns a pattern equivalent to an ordered choice of patt1 and patt2. (This is denoted by patt1 / patt2 in the original PEG notation, not to be confused with the / operation in LPeg.)
It matches either patt1 or patt2, with no backtracking once one of them succeeds. The identity element for this operation is the pattern lpeg.P(false), which always fails. If both patt1 and patt2 are character sets, this operation is equivalent to set union.

patt1 - patt2

Returns a pattern equivalent to !patt2 patt1. This pattern asserts that the input does not match patt2 and then matches patt1. If both patt1 and patt2 are character sets, this operation is equivalent to set difference. Note that -patt is equivalent to "" - patt (or 0 - patt). If patt is a character set, 1 - patt is its complement.

patt1 * patt2

Returns a pattern that matches patt1 and then matches patt2, starting where patt1 finished. The identity element for this operation is the pattern lpeg.P(true), which always succeeds. (LPeg uses the * operator [instead of the more obvious ..] both because it has the right priority and because in formal languages it is common to use a dot for denoting concatenation.)

patt^n

If n is nonnegative, this pattern is equivalent to n consecutive copies of patt followed by patt* (in the original PEG notation). It matches at least n occurrences of patt. Otherwise, when n is negative, this pattern is equivalent to (patt?) repeated -n times. That is, it matches at most -n occurrences of patt. In particular, patt^0 is equivalent to patt*, patt^1 is equivalent to patt+, and patt^-1 is equivalent to patt? in the original PEG notation. In all cases, the resulting pattern is greedy with no backtracking (also called a possessive repetition). That is, it matches only the longest possible sequence of matches for patt.

patt / string

Creates a string capture. It creates a capture string based on string. The captured value is a copy of string, except that the character % works as an escape character: any sequence in string of the form %n, with n between 1 and 9, stands for the match of the n-th capture in patt. The sequence %0 stands for the whole match. The sequence %% stands for a single %.

patt / table

Creates a query capture.
It indexes the given table using as key the first value captured by patt, or the whole match if patt produced no value. The value at that index is the final value of the capture. If the table does not have that key, there is no captured value.

patt / function

Creates a function capture. It calls the given function passing all captures made by patt as arguments, or the whole match if patt made no capture. The values returned by the function are the final values of the capture. In particular, if function returns no value, there is no captured value.

Grammars

With the use of Lua variables, it is possible to define patterns incrementally, with each new pattern using previously defined ones. However, this technique does not allow the definition of recursive patterns. For recursive patterns, we need real grammars.

LPeg represents grammars with tables, where each entry is a rule.

The call lpeg.V(v) creates a pattern that represents the nonterminal (or variable) with index v in a grammar. Because the grammar still does not exist when this function is evaluated, the result is an open reference to the respective rule.

A table is fixed when it is converted to a pattern (either by calling lpeg.P or by using it where a pattern is expected). Then every open reference created by lpeg.V(v) is corrected to refer to the rule indexed by v in the table.

When a table is fixed, the result is a pattern that matches its initial rule. The entry with index 1 in the table defines its initial rule. If that entry is a string, it is assumed to be the name of the initial rule. Otherwise, LPeg assumes that the entry 1 itself is the initial rule.

As an example, the following grammar matches strings of a's and b's that have the same number of a's and b's:

    equalcount = lpeg.P{
      "S";   -- initial rule name
      S = "a" * lpeg.V"B" + "b" * lpeg.V"A" + "",
      A = "a" * lpeg.V"S" + "b" * lpeg.V"A" * lpeg.V"A",
      B = "b" * lpeg.V"S" + "a" * lpeg.V"B" * lpeg.V"B",
    } * -1   -- -1 succeeds only at the end of the subject
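
The operators, captures and grammar described on this page can be tried out with a short script. The following is a minimal sketch, not a definitive recipe. It assumes LPeg is reachable through require "lpeg" (a host that preloads the library as a global lpeg table can skip that line). The names whole_number, consonant, a_before_b and colours are illustrative and do not come from the library, and the result comments rely on lpeg.match returning the index of the first character after the match when the pattern produces no capture.

```lua
-- Assumption: LPeg is installed and reachable through require; hosts that
-- preload the library as a global `lpeg` table can drop this line.
local lpeg = require "lpeg"

-- Repetition and end-of-subject: one or more digits, then nothing else.
local digits = lpeg.R("09")^1
local whole_number = digits * -1

-- Without captures, lpeg.match returns the index just past the match.
local after_digits = lpeg.match(digits, "123abc")        -- 4
local not_whole    = lpeg.match(whole_number, "123abc")  -- nil
local is_whole     = lpeg.match(whole_number, "123")     -- 4

-- Set difference on character sets: letters that are not vowels.
local letter    = lpeg.R("az", "AZ")
local consonant = letter - lpeg.S("aeiouAEIOU")
local after_consonants = lpeg.match(consonant^1, "strength")  -- 4 ("str")

-- And-predicate: consume one character only if "ab" lies ahead.
local a_before_b = #lpeg.P("ab") * 1

-- String capture: %0 stands for the whole match.
local bracketed = lpeg.match(digits / "<%0>", "42")      -- "<42>"

-- Query capture: the whole match is used as the table key.
local colours = { red = "#ff0000", green = "#00ff00" }   -- hypothetical lookup table
local word = lpeg.R("az")^1
local hex = lpeg.match(word / colours, "red")            -- "#ff0000"

-- Function capture: the whole match is passed to tonumber.
local number = lpeg.match(digits / tonumber, "42")       -- 42

-- The grammar from this page: equal numbers of a's and b's.
local equalcount = lpeg.P{
  "S";   -- initial rule name
  S = "a" * lpeg.V"B" + "b" * lpeg.V"A" + "",
  A = "a" * lpeg.V"S" + "b" * lpeg.V"A" * lpeg.V"A",
  B = "b" * lpeg.V"S" + "a" * lpeg.V"B" * lpeg.V"B",
} * -1

print(lpeg.match(equalcount, "abba"))  -- 5
print(lpeg.match(equalcount, "aab"))   -- nil
```

Since a failed match returns nil, comparing the result of lpeg.match against nil is enough to test whether a subject was accepted. The trailing -1 on whole_number and equalcount is what turns a prefix match into a whole-string match.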
Information and images on this site are licensed under the Creative Commons Attribution 3.0 Australia License unless stated otherwise.