This chapter describes the syntax and semantics of the AutoGen definition file. In order to instantiate a template, you normally must provide a definitions file that identifies itself and contains some value definitions. Consequently, we keep it very simple. For "advanced" users, there are preprocessing directives, sparse arrays, named indexes and comments that may be used as well.
The definitions file is used to associate values with names. Every value is implicitly an array of values, even if there is only one value. Values may be either simple strings or compound collections of name-value pairs. An array may not contain both simple and compound members. Fundamentally, it is as simple as:
    prog_name = "autogen";
    flag = {
        name    = templ_dirs;
        value   = L;
        descrip = "Template search directory list";
    };
For purposes of commenting and controlling the processing of the definitions, C-style comments and most C preprocessing directives are honored. The major exception is that the #if directive is ignored, along with all following text through the matching #endif directive. The C preprocessor is not actually invoked, so C macro substitution is not performed.
The first definition in this file is used to identify it as an AutoGen file. It consists of the two keywords, `autogen' and `definitions', followed by the default template name and a terminating semicolon (;). That is:
    AutoGen Definitions template-name;
Note that, other than the name template-name, the words `AutoGen' and `Definitions' are searched for without case sensitivity. Most lookups in this program are case insensitive.
Also, if the input contains more identification definitions, they will be ignored. This is done so that you may include (see section 2.5 Controlling What Gets Processed) other definition files without an identification conflict.
AutoGen uses the name of the template to find the corresponding template file. It searches for the file in the following way, stopping when it finds the file:
If AutoGen fails to find the template file in one of these places, it prints an error message and exits.
Any name may have multiple values associated with it in the definition file. If there is more than one instance, the only way to expand all of the copies of it is by using the FOR (see section 3.6.13 FOR - Emit a template block multiple times) text function on it, as described in the next chapter.
There are two kinds of definitions, `simple' and `compound'. They are defined thus (see section 2.9 YACC Language Grammar):
    compound_name '=' '{' definition-list '}' ';'
    simple_name '=' string ';'
    no_text_name ';'
No_text_name is a simple definition with a shorthand empty string value. The string values for definitions may be specified using any of several formation rules.
2.2.1 Definition List
2.2.2 Double Quote String
2.2.3 Single Quote String
2.2.5 An Unquoted String
2.2.4 Shell Output String
2.2.6 Scheme Result String
2.2.7 A Here String
2.2.8 Concatenated Strings
definition-list is a list of definitions that may or may not contain nested compound definitions. Any such definitions may only be expanded within a FOR block iterating over the containing compound definition. See section 3.6.13 FOR - Emit a template block multiple times.
Here are, again, the example definitions from the previous chapter, with three additional name-value pairs: two with an empty value assigned (first and last), and a "global" group_name.
    autogen definitions list;
    group_name = example;
    list = { list_element = alpha;  first;
             list_info    = "some alpha stuff"; };
    list = { list_info    = "more beta stuff";
             list_element = beta; };
    list = { list_element = omega;  last;
             list_info    = "final omega stuff"; };
The string follows the C-style escaping (\, \n, \f, \v, etc.), plus octal character numbers specified as \ooo. The difference from "C" is that the string may span multiple lines. Like ANSI "C", a series of these strings, possibly intermixed with single quote strings, will be concatenated together.
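As a small illustration of these rules, here is a sketch of a double quote string that spans two lines, followed by one assembled by concatenation. The names descrip and banner are made up for this example:

```
descrip = "a string that
spans two lines";

banner  = "first line\n"
          "second line\n";
```

Both definitions yield a single string value. In the second, the two quoted pieces are joined with nothing between them.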
This is similar to the shell single-quote string. However, the escape character \ is honored when it precedes another escape character, a single quote ' or a hash character #. The last of these is honored specifically to disambiguate lines starting with a hash character inside of a quoted string. In other words,
    fumble = '
    #endif
    ';
could be misinterpreted by the definitions scanner, whereas this would not:
    fumble = '
    \#endif
    ';
As with the double quote string, a series of these, even intermixed
with double quote strings, will be concatenated together.
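Read literally, the escaping rule above suggests the following behavior. This is only a sketch: the names are invented, and the values given in the comments are what the rule implies rather than verified output.

```
owner   = 'the user\'s name';   /* \' should yield a single quote       */
dos_dir = 'C:\\temp';           /* \\ should yield one backslash        */
pattern = 'a\tb';               /* \t is not an escape here: stays \t   */
```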
This is assembled according to the same rules as the double quote string, except that there is no concatenation of strings and the resulting string is written to a shell server process. The definition takes on the value of the output string.
NB The text is interpreted by a server shell. There may be left over state from previous ` processing and it may leave state for subsequent processing. However, a cd to the original directory is always issued before the new command is issued.
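For instance, a definition might take its value from a command's output. The sketch below assumes a POSIX-like shell with uname(1) available; the name build_host is hypothetical:

```
build_host = `uname -n`;
```

Because the shell is a long-running server process, it is probably wise not to rely on shell variables set by one back-quoted string being present, or absent, in a later one.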
A simple string that does not contain white space may be left unquoted. The string must not contain any of the characters special to the definition text (i.e. ", #, ', (, ), ,, ;, <, =, >, [, ], `, {, or }). This list is subject to change, but it will never contain underscore (_), period (.), slash (/), colon (:), hyphen (-) or backslash (\\). Basically, if the string looks like it is a normal DOS or UNIX file or variable name, and it is not one of two keywords (`autogen' or `definitions') then it is OK to not quote it, otherwise you should.
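The following sketch contrasts values that can stay unquoted with ones that need quotes. All names and values are invented for illustration:

```
src_file = lib/parse-defs.c;      /* looks like a file name: no quotes needed */
var_name = my_var;                /* looks like a variable name: also fine    */
pair     = "3,4";                 /* contains a comma: must be quoted         */
title    = "two words";           /* contains white space: must be quoted     */
keyword  = "definitions";         /* reserved keyword: must be quoted         */
```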
A scheme result string must begin with an open parenthesis (. The scheme expression will be evaluated by Guile and the value will be the result. The AutoGen expression functions are disabled at this stage, so do not use them.
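A minimal sketch, using only standard Guile procedures (string-append and string-upcase) and invented definition names:

```
greeting = (string-append "hello, " "world");
shout    = (string-upcase "autogen");
```

Each expression here returns a string, which keeps the example independent of how non-string results might be converted.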
A `here string' is formed in much the same way as a shell here doc. It is denoted with a doubled less than character and, optionally, a hyphen. This is followed by optional horizontal white space and an ending marker-identifier. This marker must follow the syntax rules for identifiers. Unlike the shell version, however, you must not quote this marker. The resulting string will start with the first character on the next line and continue up to but not including the newline that precedes the line that begins with the marker token. No backslash or any other kind of processing is done on this string. The characters are copied directly into the result string.
Here are two examples:
    str1 = <<-  STR_END
    	$quotes = " ' `
    	STR_END;

    str2 = <<   STR_END
    	$quotes = " ' `
    	STR_END;
    STR_END;
The second string contains one newline character. The first character is the tab character preceding the dollar sign. The last character is the semicolon after the STR_END. That STR_END does not end the string because it is not at the beginning of the line. In the preceding case, the leading tab was stripped.
If single or double quote characters are used, then you also have the option, a la ANSI-C syntax, of implicitly concatenating a series of them together, with intervening white space ignored.
NB You cannot use directives to alter the string content. That is,
    str = "fumble"
    #ifdef LATER
          "stumble"
    #endif
          ;
will result in a syntax error. The preprocessing directives are not carried out by the C preprocessor. However,
    str = '"fumble\n"
    #ifdef LATER
    " stumble\n"
    #endif
    ';
will work. It will enclose the `#ifdef LATER' and `#endif' in the string. But it may also wreak havoc with the definition processing directives. The hash characters in the first column should either be disambiguated with an escape \ or be joined with the previous lines: "fumble\n#ifdef LATER...".
In AutoGen, every name is implicitly an array of values. When assigning values, they are usually implicitly assigned to the next highest slot. They can also be specified explicitly:
    mumble[9] = stumble;
    mumble[0] = grumble;
If, subsequently, you assign a value to mumble without an index, its index will be 10, not 1.
If indexes are specified, they must not cause conflicts.
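Putting the explicit and implicit forms together, the sketch below shows where an unindexed assignment lands. The value tumble is invented for this example:

```
mumble[9] = stumble;
mumble[0] = grumble;
mumble    = tumble;     /* no index given: takes slot 10, not slot 1 */
```

Assigning mumble[9] a second time would be one way to cause the kind of conflict mentioned above.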
#define-d names may also be used for index values.
This is equivalent to the above:
    #define FIRST 0
    #define LAST  9
    mumble[LAST]  = stumble;
    mumble[FIRST] = grumble;
All values in a range do not have to be filled in. If you leave gaps, then you will have a sparse array. This is fine (see section 3.6.13 FOR - Emit a template block multiple times). You have your choice of iterating over all the defined values, or iterating over a range of slots. This:
    [+ FOR mumble +][+ ENDFOR +]
iterates over all and only the defined entries, whereas this:
    [+ FOR mumble (for-by 1) +][+ ENDFOR +]
will iterate over all 10 "slots". Your template will likely have to contain something like this:
    [+ IF (exist? (sprintf "mumble[%d]" (for-index))) +]
or else "mumble" will have to be a compound value that, say, always contains a "grumble" value:
    [+ IF (exist? "grumble") +]
There are several methods for including dynamic content inside a definitions file. Two of them are mentioned above (see section 2.2.4 Shell Output String and section 2.2.6 Scheme Result String) in the discussion of string formation rules. Another method uses the #shell processing directive. It will be discussed in the next section (see section 2.5 Controlling What Gets Processed).
Guile/Scheme may also be used to create definitions.
When the Scheme expression is preceded by a backslash and single quote, then the expression is expected to be an alist of names and values that will be used to create AutoGen definitions.
This method can be used as follows:
    \'( (name  (value-expression))
        (name2 (another-expr)) )
This is entirely equivalent to:
    name  = (value-expression);
    name2 = (another-expr);
Under the covers, the expression gets handed off to a Guile function named alist->autogen-def in an expression that looks like this:
    (alist->autogen-def
        ( (name  (value-expression))
          (name2 (another-expr)) ) )
Definition processing directives can only be processed if the '#' character is the first character on a line. Also, if you want a '#' as the first character of a line in one of your string assignments, you should either escape it by preceding it with a backslash `\', or embed it in the string as in "\n#".
All of the normal C preprocessing directives are recognized, though several are ignored. There is also an additional #shell - #endshell pair. Another minor difference is that AutoGen directives must have the hash character (#) in column 1. The final tweak is that #! is treated as a comment line. Using this feature, you can use: `#! /usr/local/bin/autogen' as the first line of a definitions file, set the mode to executable and "run" the definitions file as if it were a direct invocation of AutoGen. This was done for its hack value.
The ignored directives are: `#assert', `#ident', `#pragma', and `#if'. Note that when ignoring the #if directive, all intervening text through its matching #endif is also ignored, including the #else clause.
The AutoGen directives that affect the processing of definitions are:
#define name [ <text> ]
Will add the name to the define list as if it were a DEFINE program argument. Its value will be the first non-whitespace token following the name. Quotes are not processed.
After the definitions file has been processed, any remaining entries in the define list will be added to the environment.
#elif
This must follow an #if, otherwise it will generate an error. It will be ignored.
#else
This must follow an #if, #ifdef or #ifndef. If it follows the #if, then it will be ignored. Otherwise, it will change the processing state to the reverse of what it was.
#endif
This must follow an #if, #ifdef or #ifndef. In all cases, this will resume normal processing of text.
#endshell
Ends the text processed by a command shell into autogen definitions.
#error [ <descriptive text> ]
This directive will cause AutoGen to stop processing and exit with a status of EXIT_FAILURE.
#if [ <ignored conditional expression> ]
#if expressions are not analyzed. Everything from here to the matching #endif is skipped.
#ifdef name-to-test
The definitions that follow, up to the matching #endif, will be processed only if there is a corresponding -Dname command line option.
#ifndef name-to-test
The definitions that follow, up to the matching #endif, will be processed only if there is not a corresponding -Dname command line option or there was a canceling -Uname option.
#include unadorned-file-name
This directive will insert definitions from another file into the current collection. If the file name is adorned with double quotes or angle brackets (as in a C program), then the include is ignored.
#line
Alters the current line number and/or file name. You may wish to use this directive if you extract definition source from other files. getdefs uses this mechanism so AutoGen will report the correct file and approximate line number of any errors found in extracted definitions.
#option opt-name [ <text> ]
This directive will pass the option name and associated text to the AutoOpts optionLoadLine routine (see section process a text string for options). The option text may span multiple lines by continuing them with a backslash. The backslash/newline pair will be replaced with two space characters. This directive may be used to set a search path for locating template files. For example, this:
    #option templ-dirs $ENVVAR/dirname
will direct AutoGen to use the ENVVAR environment variable to find a directory named dirname that (may) contain templates. Since these directories are searched in most recently supplied first order, search directories supplied in this way will be searched before any supplied on the command line.
#shell
Invokes $SHELL or `/bin/sh' on a script that should generate AutoGen definitions. It does this using the same server process that handles the back-quoted ` text. CAUTION: let not your $SHELL be csh.
#undef name-to-undefine
Will remove any entries from the define list that match the undef name pattern.
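To show how several of these directives might be combined, here is a sketch of a definitions file fragment. It assumes AutoGen may be invoked with -DDEBUG, that a file named standards.def can be found, and that uname(1) is available to the shell. The names DEBUG, trace_level and build_host are hypothetical:

```
#ifdef DEBUG
trace_level = verbose;
#else
trace_level = quiet;
#endif

#include standards.def

#shell
echo "build_host = '`uname -n`';"
#endshell
```

The script between #shell and #endshell prints text in definition syntax, and that printed text is what gets processed as definitions. Note that every directive starts in column 1.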
When AutoGen starts, it tries to determine several names from the operating environment and put them into environment variables for use in both #ifdef tests in the definitions files and in shell scripts with environment variable tests. __autogen__ is always defined.
For other names, AutoGen will first try to use the POSIX version of the sysinfo(2) system call. Failing that, it will try for the POSIX uname(2) call. If neither is available, then only "__autogen__" will be inserted into the environment. In all cases, the associated names are converted to lower case, surrounded
In all cases, the associated names are converted to lower case, surrounded
by doubled underscores and non-symbol characters are replaced with
underscores.
With Solaris on a sparc platform, sysinfo(2) is available.
The following strings are used:
SI_SYSNAME (e.g., "__sunos__")
SI_HOSTNAME (e.g., "__ellen__")
SI_ARCHITECTURE (e.g., "__sparc__")
SI_HW_PROVIDER (e.g., "__sun_microsystems__")
SI_PLATFORM (e.g., "__sun_ultra_5_10__")
SI_MACHINE (e.g., "__sun4u__")
For Linux and other operating systems that only support the uname(2) call, AutoGen will use these values:
sysname (e.g., "__linux__")
machine (e.g., "__i586__")
nodename (e.g., "__bach__")
By testing these pre-defines in your definitions, you can select pieces of the definitions without resorting to writing shell scripts that parse the output of uname(1). You can also segregate real C code from autogen definitions by testing for "__autogen__".
    #ifdef __bach__
      location = home;
    #else
      location = work;
    #endif
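The second use, keeping C code and definitions in one file, might look like the sketch below. It assumes the file is given both to AutoGen and to a C compiler, and that the C compiler does not define __autogen__. The template name example and the names in the body are invented:

```
#ifdef __autogen__
autogen definitions example;
location = unknown;
#else
/* seen only by the C compiler */
static char const location[] = "unknown";
#endif
```

AutoGen would process only the first branch, while the C preprocessor would skip it and pass only the second branch to the compiler.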
The definitions file may contain C and C++ style comments.
    /*
     *  This is a comment.  It continues for several lines and closes
     *  when the characters '*' and '/' appear together.
     */
    // this comment is a single line comment
This is an extended example:
    autogen definitions `template-name';
    /*
     *  This is a comment that describes what these
     *  definitions are all about.
     */
    global = "value for a global text definition.";

    /*
     *  Include a standard set of definitions
     */
    #include standards.def

    a_block = {
        a_field;
        a_subblock = {
            sub_name  = first;
            sub_field = "sub value.";
        };

    #ifdef FEATURE
        a_subblock = {
            sub_name = second;
        };
    #endif

    };
The preprocessing directives and comments are not part of the grammar. They are handled by the scanner/lexer. The following was extracted directly from the defParse.y source file:
    definitions : identity def_list TK_END
                  { $$ = (YYSTYPE)(rootDefCtx.pDefs = (tDefEntry*)$2); }
                | identity TK_END
                  { $$ = makeEmptyDefs(); }
                ;

    def_list    : definition           { $$ = $1; }
                | definition def_list  { $$ = addSibMacro( $1, $2 ); }
                | identity def_list    { $$ = $2; }
                ;

    identity    : TK_AUTOGEN TK_DEFINITIONS filename ';'
                  { $$ = identify( $3 ); }
                ;

    definition  : value_name ';'
                  { $$ = makeMacro( $1, (YYSTYPE)"", VALTYP_TEXT ); }
                | value_name '=' text_list ';'
                  { $$ = makeMacroList( $1, $3, VALTYP_TEXT ); }
                | value_name '=' block_list ';'
                  { $$ = makeMacroList( $1, $3, VALTYP_BLOCK ); }
                ;

    text_list   : anystring                { $$ = startList( $1 ); }
                | anystring ',' text_list  { $$ = appendList( $1, $3 ); }
                ;

    block_list  : def_block                 { $$ = startList( $1 ); }
                | def_block ',' block_list  { $$ = appendList( $1, $3 ); }
                ;

    def_block   : '{' def_list '}'  { $$ = $2; }
                ;

    anystring   : filename   { $$ = $1; }
                | TK_NUMBER  { $$ = $1; }
                ;

    filename    : TK_OTHER_NAME  { $$ = $1; }
                | TK_STRING      { $$ = $1; }
                | TK_VAR_NAME    { $$ = $1; }
                ;

    value_name  : TK_VAR_NAME
                  { $$ = findPlace( (YYSTYPE)$1, (YYSTYPE)NULL ); }
                | TK_VAR_NAME '[' TK_NUMBER ']'
                  { $$ = findPlace( (YYSTYPE)$1, (YYSTYPE)$3 ); }
                | TK_VAR_NAME '[' TK_VAR_NAME ']'
                  { $$ = findPlace( (YYSTYPE)$1, (YYSTYPE)$3 ); }
                ;
There are several methods for supplying data values for templates.
--override-tpl and --no-definitions options on the command line. See section 5. Invoking autogen.
REQUEST_METHOD is defined and set to either "GET" or "POST". See section 6.2 AutoGen as a CGI server. Obviously, all the values are constrained to strings because there is no way to represent nested values.
xml2ag. Its output can either be redirected to a file for later use, or the program can be used as an AutoGen wrapper. See section 8.6 Invoking xml2ag.
The introductory template example (see section 1.2 A Simple Example) can be rewritten in XML as follows:
    <EXAMPLE  template="list.tpl">
    <LIST list_element="alpha" list_info="some alpha stuff"/>
    <LIST list_info="more beta stuff" list_element="beta"/>
    <LIST list_element="omega" list_info="final omega stuff"/>
    </EXAMPLE>
A more XML-normal form might look like this:
    <EXAMPLE  template="list.tpl">
    <LIST list_element="alpha">some alpha stuff</LIST>
    <LIST list_element="beta" >more beta stuff</LIST>
    <LIST list_element="omega">final omega stuff</LIST>
    </EXAMPLE>
list_info references into text references.