cvs.delorie.com/djgpp/doc/libc/libc_657.html | search |
#include <sys/types.h>
#include <regex.h>

int regexec(const regex_t *preg, const char *string, size_t nmatch, regmatch_t pmatch[], int eflags);
regexec matches the compiled RE pointed to by preg against
the string, subject to the flags in eflags, and reports
results using nmatch, pmatch, and the returned value. The
RE must have been compiled by a previous invocation of regcomp
(see section regcomp). The compiled form is not altered during execution of
regexec, so a single compiled RE can be used simultaneously by
multiple threads.
By default, the NUL-terminated string pointed to by string is considered to be the text of an entire line, with the NUL indicating the end of the line. (That is, any other end-of-line marker is considered to have been removed and replaced by the NUL.)
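The basic calling sequence can be sketched as follows. This is a minimal illustration, not code from the library: line_matches is a hypothetical helper, and the use of REG_EXTENDED and REG_NOSUB at compile time is an assumption made for the example.

```c
#include <sys/types.h>
#include <regex.h>
#include <stddef.h>

/* Hypothetical helper: returns 1 if line matches pattern, 0 if it does
   not, and -1 if the pattern fails to compile or regexec reports an error. */
int line_matches(const char *pattern, const char *line)
{
    regex_t re;
    int rc;

    /* REG_NOSUB: only a yes/no answer is wanted, so no offsets are recorded. */
    if (regcomp(&re, pattern, REG_EXTENDED | REG_NOSUB) != 0)
        return -1;

    /* nmatch is 0, pmatch is NULL, and eflags is 0 (no flags set). */
    rc = regexec(&re, line, 0, NULL, 0);
    regfree(&re);

    if (rc == 0)
        return 1;
    return rc == REG_NOMATCH ? 0 : -1;
}
```

For instance, line_matches("^ab+c$", "abbbc") reports a match because the whole string is treated as one line. A real program would normally compile the RE once and reuse it across many calls rather than recompiling per line.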
The eflags argument is the bitwise OR of zero or more of the following flags:
REG_NOTBOL
The first character of the string is not the beginning of a line, so the
`^' anchor should not match before it. This does not affect the
behavior of newlines under REG_NEWLINE (see section regcomp).
REG_NOTEOL
The NUL terminating the string does not end a line, so the `$' anchor
should not match before it. This does not affect the behavior of
newlines under REG_NEWLINE (see section regcomp).
REG_STARTEND
The string is considered to start at string + pmatch[0].rm_so
and to have a terminating NUL located at string + pmatch[0].rm_eo
(there need not actually be a NUL at that location), regardless of
the value of nmatch. See below for the definition of pmatch and nmatch.
This is an extension, compatible with but not specified by POSIX 1003.2,
and should be used with caution in software intended to be portable to
other systems. Note that a non-zero rm_so does not imply REG_NOTBOL;
REG_STARTEND affects only the location of the string, not how it is
matched.
REG_TRACE
Trace execution (the trace is printed to stdout).
REG_LARGE
Force use of the large representation.
REG_BACKR
Force use of the back-reference code.
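One situation where REG_NOTBOL matters is scanning a line for successive matches by advancing the string pointer: after the first match the pointer no longer sits at the start of the line, so `^' must not match there. The following is a minimal sketch under that assumption; count_matches is a hypothetical helper, not part of the library.

```c
#include <sys/types.h>
#include <regex.h>

/* Hypothetical helper: counts non-overlapping matches of pattern in
   text. Returns -1 if the pattern fails to compile or regexec fails. */
int count_matches(const char *pattern, const char *text)
{
    regex_t re;
    regmatch_t m[1];
    const char *p = text;
    int count = 0;
    int rc;

    if (regcomp(&re, pattern, REG_EXTENDED) != 0)
        return -1;

    for (;;) {
        /* After the first call, p is no longer the beginning of the line. */
        int eflags = (p == text) ? 0 : REG_NOTBOL;

        rc = regexec(&re, p, 1, m, eflags);
        if (rc != 0)
            break;
        count++;

        /* Offsets are measured from p. Step past the match, or past one
           extra character after an empty match so the loop makes progress. */
        if (m[0].rm_eo > m[0].rm_so)
            p += m[0].rm_eo;
        else if (p[m[0].rm_eo] != '\0')
            p += m[0].rm_eo + 1;
        else
            break;
    }
    regfree(&re);
    return (rc == 0 || rc == REG_NOMATCH) ? count : -1;
}
```

With this loop, the RE `^ab' is counted once in "abab"; without REG_NOTBOL on the second call it would be counted twice.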
See section regcomp (Regular Expressions' Syntax) for a discussion of what is matched in situations where an RE or a portion thereof could match any of several substrings of string.
If REG_NOSUB was specified in the compilation of the RE
(see section regcomp), or if nmatch is 0, regexec
ignores the pmatch argument (but see below for the case where
REG_STARTEND is specified). Otherwise, pmatch should point
to an array of nmatch structures of type regmatch_t. Such
a structure has at least the members rm_so and rm_eo, both
of type regoff_t (a signed arithmetic type at least as large as
an off_t and a ssize_t), containing respectively the offset
of the first character of a substring and the offset of the first
character after the end of the substring. Offsets are measured from the
beginning of the string argument given to regexec. An
empty substring is denoted by equal offsets, both indicating the
character following the empty substring.
When regexec returns, the 0th member of the pmatch array is
filled in to indicate what substring of string was matched by the
entire RE. Remaining members report what substring was matched by
parenthesized subexpressions within the RE; member i reports
subexpression i, with subexpressions counted (starting at 1) by
the order of their opening parentheses in the RE, left to right. Unused
entries in the array (corresponding either to subexpressions that did
not participate in the match at all, or to subexpressions that do not
exist in the RE, that is, i > preg->re_nsub) have both
rm_so and rm_eo set to -1. If a subexpression
participated in the match several times, the reported substring is the
last one it matched. (Note in particular that when the
RE `(b*)+' matches "bbb", the parenthesized subexpression
matches the three `b's and then an infinite number of empty
strings following the last `b', so the reported substring is one of the
empties.)
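Reading the pmatch array can be sketched as follows. This is only an illustration: copy_submatch and MAX_GROUPS are hypothetical names, and the fixed array size of 10 is an arbitrary choice for the example.

```c
#include <sys/types.h>
#include <regex.h>
#include <string.h>

#define MAX_GROUPS 10   /* arbitrary limit chosen for this sketch */

/* Hypothetical helper: copies the text matched by subexpression i
   (0 means the entire RE) into out, truncating to fit. Returns the
   number of characters copied, or -1 if there is no match, the entry
   is unused, or an argument is unusable. */
int copy_submatch(const char *pattern, const char *string,
                  size_t i, char *out, size_t outsz)
{
    regex_t re;
    regmatch_t pm[MAX_GROUPS];
    int len = -1;

    if (i >= MAX_GROUPS || outsz == 0)
        return -1;
    if (regcomp(&re, pattern, REG_EXTENDED) != 0)
        return -1;

    /* An unused entry has rm_so (and rm_eo) set to -1. */
    if (regexec(&re, string, MAX_GROUPS, pm, 0) == 0 && pm[i].rm_so != -1) {
        /* rm_eo is the offset just past the substring. */
        size_t n = (size_t)(pm[i].rm_eo - pm[i].rm_so);

        if (n >= outsz)
            n = outsz - 1;
        memcpy(out, string + pm[i].rm_so, n);
        out[n] = '\0';
        len = (int)n;
    }
    regfree(&re);
    return len;
}
```

Matching `([a-z]+)=([0-9]+)' against "set x=42;" yields "x=42" for member 0, "x" for member 1 and "42" for member 2. With `(a)|(b)' against "b", member 1 is unused because that subexpression did not participate in the match.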
If REG_STARTEND is specified in eflags, pmatch must
point to at least one regmatch_t variable (even if nmatch
is 0 or REG_NOSUB was specified in the compilation of the RE,
see section regcomp), to hold the input offsets for
REG_STARTEND. Use for output is still entirely controlled by
nmatch; if nmatch is 0 or REG_NOSUB was specified,
the value of pmatch[0] will not be changed by a successful
regexec.
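The dual role of pmatch[0] under REG_STARTEND (input range before the call, match offsets after it) can be sketched as follows. search_range is a hypothetical helper; because REG_STARTEND is an extension, the sketch is guarded with #ifdef and simply reports failure on systems whose <regex.h> does not define the flag.

```c
#include <sys/types.h>
#include <regex.h>

/* Hypothetical helper: searches only string[from..to) for pattern.
   Returns 0 and stores the match offsets on success, 1 if nothing
   matched, -1 on error or if REG_STARTEND is not available. */
int search_range(const char *pattern, const char *string,
                 regoff_t from, regoff_t to, regoff_t *so, regoff_t *eo)
{
#ifdef REG_STARTEND
    regex_t re;
    regmatch_t pm[1];
    int rc;

    if (regcomp(&re, pattern, REG_EXTENDED) != 0)
        return -1;

    /* Input: pm[0] delimits the part of string to be searched. */
    pm[0].rm_so = from;
    pm[0].rm_eo = to;
    rc = regexec(&re, string, 1, pm, REG_STARTEND);
    regfree(&re);

    if (rc != 0)
        return rc == REG_NOMATCH ? 1 : -1;

    /* Output: pm[0] now holds the match, measured from string itself. */
    *so = pm[0].rm_so;
    *eo = pm[0].rm_eo;
    return 0;
#else
    (void)pattern; (void)string; (void)from; (void)to; (void)so; (void)eo;
    return -1;
#endif
}
```

Searching "abbb abb" for `b+' in the range 5 to 8 skips the first run of `b's and reports offsets 6 and 8, which are still measured from the start of the full string.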
Normally, regexec returns 0 for success and the non-zero code
REG_NOMATCH for failure. Other non-zero error codes may be
returned in exceptional situations. The list of possible error return
values is below:
REG_ESPACE
ran out of memory
REG_BADPAT
the passed argument preg doesn't point to an RE compiled by regcomp
REG_INVARG
invalid argument(s) (e.g., string + pmatch[0].rm_eo is less
than string + pmatch[0].rm_so)
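A caller can turn any non-zero return value into readable text with regerror, the companion function declared in <regex.h>. The following is a minimal sketch; match_or_explain is a hypothetical helper, not part of the library.

```c
#include <sys/types.h>
#include <regex.h>
#include <stddef.h>

/* Hypothetical helper: runs regexec and, for any non-zero result,
   writes a description of the code into msg. Returns regexec's value. */
int match_or_explain(const regex_t *preg, const char *string,
                     char *msg, size_t msgsz)
{
    int rc = regexec(preg, string, 0, NULL, 0);

    if (msgsz > 0)
        msg[0] = '\0';                    /* empty message on success */
    if (rc != 0)
        regerror(rc, preg, msg, msgsz);   /* covers REG_NOMATCH as well */
    return rc;
}
```

Callers usually treat REG_NOMATCH as an ordinary outcome and report the message only for the other codes.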
This implementation of the POSIX regexp functionality was written by Henry Spencer.
regexec performance is poor. nmatch exceeding 0 is
expensive; nmatch exceeding 1 is worse. regexec is largely
insensitive to RE complexity except that back references are
massively expensive. RE length does matter; in particular, there is a
strong speed bonus for keeping RE length under about 30 characters, with
most special characters counting roughly double.
The implementation of word-boundary matching is a bit of a kludge, and bugs may lurk in combinations of word-boundary matching and anchoring.
ANSI/ISO C | No |
POSIX | 1003.2-1992; 1003.1-2001 |
Copyright © 2004 | Updated Apr 2004 |