CParser
A compact C preprocessor and declaration parser written in pure Lua
This pure Lua module implements (1) a standard-compliant C preprocessor with a couple of useful extensions, and (2) a parser that provides a Lua-friendly description of all global declarations and definitions in a C header or C program file.
The driver program lcpp invokes the preprocessor and outputs
preprocessed code. Although it can be used as a replacement for the
normal preprocessor, it is more useful as an extra preprocessing step
(see option -Zpass, which is on by default). The same capabilities
are offered by functions cparser.cpp and cparser.cppTokenIterator
provided by the module cparser.
The driver program lcdecl analyzes a C header file or a C program
file and outputs short descriptions of the declarations and
definitions. This program is mostly useful to understand the
representations produced by the cparser function
cparser.declarationIterator.
This code is licensed under the MIT license found in the LICENSE file in the root directory of this source tree.
Program lcpp
Synopsis
lcpp [options] inputfile.c [-o outputfile.c]
Preprocess file inputfile.c and write the preprocessed code into
file outputfile.c or to the standard output.
Options
The following options are recognized:
- -Werror
  Cause all warnings to be treated as errors. Note that parsing cannot resume after an error. The parser simply throws a Lua error.
- -w
  Do not print warning messages.
- -Dsym[=val]
  Define preprocessor symbol sym to value val. The default value of val is 1. Note that it is possible to define function-like symbols with syntax -Dsym(args)=val.
- -Usym
  Undefine preprocessor symbol sym.
- -Idir
  Add directory dir to the search path for included files. Note that there is no default search path. When an include file is not found, the include directive is simply ignored with a warning (but see also option -Zpass). Therefore all include directives are ignored unless one uses option -I to specify the search path.
- -I-
  Marks the beginning of the system include path. When an included file is given with angle brackets (as in #include <stdio.h>), one only searches directories specified by the -I options that follow -I-. Therefore all these include directives are ignored unless one uses option -I- followed by one or more option -I.
- -dM
  Instead of producing the preprocessed file, dumps all macros defined at the end of the parse.
- -Zcppdef
  Run the native preprocessor using command cpp -dM < /dev/null and copy its predefined symbols. This is useful when using lcpp as a full replacement for the standard preprocessor.
- -Zpass
  This option is enabled by default (use -Znopass to disable) and indicates that the output of lcpp is going to be reprocessed by a C preprocessor and compiler. This option triggers the following behavior:
  - The preprocessor directives #pragma and #ident are copied verbatim into the output file.
  - When the included file cannot be found in the provided search path, the preprocessor directive #include is copied into the output file.
  - Preprocessor directives prefixed with a double ## are copied verbatim into the output file with a single # prefix. This feature is useful for #if directives that depend on symbols defined by unresolved #include directives.
- -std=(c|gnu)(89|99|11)
  This option selects a C dialect. In the context of the preprocessor, this impacts the symbols predefined by lcpp and potentially enables GCC extensions of the variadic macro definition syntax.
  - Symbol __CPARSER__ is always defined with value 1.
  - Symbols __STDC__ and __STDC_VERSION__ are either defined by option -Zcppdef or take values suitable for the target C dialect.
  - Symbols __GNUC__ and __GNUC_MINOR__ are either defined by option -Zcppdef or are defined to values 4 and 2 if the target dialect starts with string gnu.
  This can be further adjusted using the -D or -U options. The default dialect is gnu99.
Preprocessor extensions
The lcpp preprocessor implements several useful nonstandard features.
The main feature is multiline macros. The other features are mostly
here because they make multiline macros more useful.
String comparison in conditional expressions
The C standard specifies that the expressions following #if
directives are constant expressions of integral type. However, this
preprocessor also handles strings. The only valid operations on strings
are the equality and ordering comparisons. This is quite useful to
make special cases for certain values of the parameters of a multiline
macro, as shown later.
Multiline macros
Preprocessor directives #defmacro and #endmacro can be used to
define a function-like macro whose body spans several lines. The
#defmacro directive contains the macro name and a mandatory argument
list. The body of the macro is composed of all the following lines up
to the matching #endmacro. This offers several benefits:
- The line numbers of the macro expansion are preserved. This ensures that the compiler produces error messages with meaningful line numbers.
- The multiline macro can contain preprocessor directives. Conditional directives are very useful in this context. Note however that preprocessor definitions (with #define, #defmacro, or #undef) nested inside multiline macros are only valid within the macro.
- The standard stringification # and token concatenation ## operators can be used freely in the body of multiline macros. Note that these operators only work with the parameters of the multiline macros and not with ordinary preprocessor definitions. This is consistent with the standard behavior of these operators in ordinary preprocessor macros.

Example
#defmacro DEFINE_VDOT(TNAME, TYPE)
  TYPE TNAME##Vector_dot(TYPE *a, TYPE *b, int n)
  {
    /* try cblas */
#if #TYPE == "float"
    return cblas_sdot(n, a, 1, b, 1);
#elif #TYPE == "double"
    return cblas_ddot(n, a, 1, b, 1);
#else
    int i;
    TYPE s = 0;
    for(i=0;i<n;i++)
      s += a[i] * b[i];
    return s;
#endif
  }
#endmacro

DEFINE_VDOT(Float,float);
DEFINE_VDOT(Double,double);
DEFINE_VDOT(Int,int);
Details -- The values of the macro parameters are normally macro-expanded before substituting them into the text of the macro. However this macro-expansion does not happen when the substitution occurs in the context of a stringification or token concatenation operator. All this is consistent with the standard. The novelty is that this macro-expansion does not occur either when the parameter appears in a nested preprocessor directive or multiline macro.
More details -- The stringification operator only works when the next non-space token is a macro parameter. This provides a good way to distinguish a nested directive from a stringification operator appearing in the beginning of a line.
Even more details -- The standard mandates that the tokens generated by a macro-expansion can be combined with the following tokens to compose a new macro invocation. This is not allowed for multiline macros. An error is signaled if the expansion of a multiline macro generates an incomplete macro argument list.
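For orientation, here is a hand-written sketch of what the DEFINE_VDOT(Int,int) invocation from the example should amount to after preprocessing, assuming the behavior described in this section: #TYPE stringifies to "int", neither string comparison matches, and only the #else branch is kept. It was written by hand rather than captured from lcpp, so the whitespace and line layout of real output will differ.

```c
/* Hand-expanded equivalent of DEFINE_VDOT(Int,int); not actual lcpp output. */
int IntVector_dot(int *a, int *b, int n)
{
    /* "int" equals neither "float" nor "double", so the generic loop remains. */
    int i;
    int s = 0;
    for (i = 0; i < n; i++)
        s += a[i] * b[i];
    return s;
}
```

The Float and Double instantiations would instead keep the single cblas_sdot or cblas_ddot call, which is the point of the string comparison in the #if directives.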
Negative comma in variadic macros
Consider the following variadic macro
#define macro(msg, ...) printf(msg, __VA_ARGS__)
The C standard says that it is an error to call this macro with only
one argument. Calling this macro with an empty second argument
--macro(msg,)-- leaves an annoying comma in the expansion
--printf(msg,)-- and causes a compiler syntax error.
This preprocessor accepts invocations of such a macro with a single
argument. The value of parameter __VA_ARGS__ is then a so-called
negative comma, meaning that the preceding comma is eliminated when
this parameter appears in the macro definition between a comma and a
closing parenthesis.
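For comparison, a minimal sketch of how the same need is commonly met in standard C without the negative comma: the format string is folded into the variadic part, so no fixed parameter is followed by a comma. The names log_msg, one_argument, and two_arguments are made up for this illustration. This is a workaround in plain C, not a description of what lcpp does internally.

```c
#include <stdio.h>

/* Hypothetical macro name. The format string is part of __VA_ARGS__, so a
   call with a single argument never leaves a dangling comma behind. */
#define log_msg(...) printf(__VA_ARGS__)

/* Both calls are valid C99; printf returns the number of characters written. */
int one_argument(void)  { return log_msg("hello\n"); }
int two_arguments(void) { return log_msg("%d\n", 42); }
```

With lcpp, the two-parameter form macro(msg, ...) can be kept as is and invoked with a single argument, since the preprocessor removes the comma itself.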
Recursive macros
When a new invocation of the macro appears in the expansion of a
macro, the standard specifies that the preprocessor must rescan the
expansion but should not recursively expand the macro. Although this
restriction is both wise and useful, there are rare cases where one
would like to use recursive macros. As an experiment, this recursion
prevention feature is turned off when one defines a multiline macro
with #defrecursivemacro instead of #defmacro. Note that this might
prevent the preprocessor from terminating unless the macro eventually
takes a conditional branch that does not recursively invoke the macro.
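To make the standard rule concrete, here is a small sketch in plain C with no lcpp extensions involved. A macro that mentions its own name is rescanned but not expanded a second time, so the inner occurrence is left for the compiler to resolve. The names counter and read_counter are invented for this example.

```c
/* Declared before the macro exists, so this line names the variable itself. */
static int counter = 10;

/* Self-referential macro: the inner `counter` is not expanded again. */
#define counter (4 + counter)

/* Expands once to (4 + counter), which then reads the variable: 4 + 10. */
int read_counter(void) { return counter; }
```

A macro defined with #defrecursivemacro would instead keep expanding such an occurrence, which is why it needs a conditional branch that eventually stops the recursion.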
Program lcdecl
Synopsis
lcdecl [options] inputfile.c [-o outputfile.txt]
Preprocess and parse file inputfile.c.
The output of the parser is a sequence of Lua data structures
representing each C definition or declaration encountered in the code.
Program lcdecl prints each of them in two forms. The first form
directly represents the Lua tables composing the data structure. The
second form reconstructs a piece of C code representing the definition
or declaration of interest.
This program is mostly useful to people working with the Lua functions
offered by the cparser module because it provides a quick way to inspect
the resulting data structures.
Options
Program lcdecl accepts all the preprocessing options
documented for program lcpp. It also accepts an additional
option -Ttypename and extends the meaning of
options -Zpass and -std=dialect.
-Ttypename
Similar to lcpp, program lcdecl only reads the include files that are found along the path specified by the -I options. It is generally not desirable to rea