Converts a regex string into C code that matches input against the given regular expression.
For exact syntax specifications, look at the comment in regex2c.c.
Run make to build the regex2c executable. The executable outputs either C code or debug info.
To switch between the two, change the preprocessor definition PRINT_DEBUG in common.h: a value of 0 prints the C code, and larger values print more debug info.
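
As an illustration of how a compile-time level such as PRINT_DEBUG can switch the output, here is a minimal sketch. The default value and the helpers debug_enabled and select_output are hypothetical names made up for this example; the real definition and its use are in common.h and the sources.

```c
#include <stddef.h>

/* Assumed default for this sketch only; the real value is set in common.h. */
#ifndef PRINT_DEBUG
#define PRINT_DEBUG 0
#endif

/* Hypothetical helper: nonzero when debug messages at `level` are enabled. */
static int debug_enabled(int level)
{
    return PRINT_DEBUG >= level;
}

/* Hypothetical helper: pick the text to print for the current setting.
 * A level of 0 selects the generated C code, any larger level selects
 * the debug info. */
const char *select_output(const char *c_code, const char *debug_info)
{
    if (debug_enabled(1))
        return debug_info;
    return c_code;
}
```

Because the level is a preprocessor definition, changing it requires rebuilding with make.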
To clean the project, run make clean.
To test the project, write a regex pattern into the file test/pattern.regex and run make test in the root directory. This
builds the executable test/pattern_matcher, which reads strings from stdin and accepts those that match the given regex pattern.
The regex2c executable expects a non-empty regex string from stdin and prints C code to stdout. It uses the following steps to convert the expression:
- Parse the regex and convert it to an AST
- Convert the AST to an NFA using Thompson's algorithm (structures such as repetitions and optionals are converted by inserting epsilon-transitions)
- Convert the NFA to a DFA using powerset construction (making it deterministic)
- Minimize the DFA using Moore's algorithm
- Convert the DFA into C code, which can be compiled and linked with other code
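
To give an idea of what the last step produces, here is a hand-written sketch of a DFA encoded as C code for the pattern ab*. It is not actual regex2c output: the state names, the function names step and match_ab_star, and the overall layout are assumptions chosen for this example.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical minimized DFA for the pattern ab*:
 *   STATE_START  --a--> STATE_ACCEPT
 *   STATE_ACCEPT --b--> STATE_ACCEPT
 * Every other input leads to STATE_DEAD, which can never accept. */
enum { STATE_START, STATE_ACCEPT, STATE_DEAD };

/* Transition function: exactly one successor per (state, character). */
static int step(int state, char c)
{
    switch (state) {
    case STATE_START:
        return c == 'a' ? STATE_ACCEPT : STATE_DEAD;
    case STATE_ACCEPT:
        return c == 'b' ? STATE_ACCEPT : STATE_DEAD;
    default:
        return STATE_DEAD;
    }
}

/* Returns true if the whole input string matches ab*. */
bool match_ab_star(const char *input)
{
    int state = STATE_START;
    for (size_t i = 0; input[i] != '\0' && state != STATE_DEAD; i++)
        state = step(state, input[i]);
    return state == STATE_ACCEPT;
}
```

Since the automaton is deterministic, the matcher reads each input character at most once and needs no backtracking. A caller in other code would simply call match_ab_star("abbb") and check the result.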