2. AutoGen Definitions File

This chapter describes the syntax and semantics of the AutoGen definition file. In order to instantiate a template, you normally must provide a definitions file that identifies itself and contains some value definitions. Consequently, we keep it very simple. For "advanced" users, there are preprocessing directives, sparse arrays, named indexes and comments that may be used as well.

The definitions file is used to associate values with names. Every value is implicitly an array of values, even if there is only one value. Values may be either simple strings or compound collections of name-value pairs. An array may not contain both simple and compound members. Fundamentally, it is as simple as:

prog_name = "autogen";
flag = {
    name    = templ_dirs;
    value   = L;
    descrip = "Template search directory list";
};

For purposes of commenting and controlling the processing of the definitions, C-style comments and most C preprocessing directives are honored. The major exception is that the #if directive is ignored, along with all following text through the matching #endif directive. The C preprocessor is not actually invoked, so C macro substitution is not performed.

2.1 The Identification Definition

2.2 Named Definitions

2.3 Assigning an Index to a Definition

2.4 Dynamic Text

2.5 Controlling What Gets Processed

2.6 Pre-defined Names

2.7 Commenting Your Definitions

2.8 What it all looks like.

2.9 YACC Language Grammar

2.10 Alternate Definition Forms

2.1 The Identification Definition

The first definition in this file is used to identify it as an AutoGen file. It consists of the two keywords, `autogen' and `definitions', followed by the default template name and a terminating semicolon (;). That is:

AutoGen Definitions template-name;

Note that, unlike template-name, the words `AutoGen' and `Definitions' are matched without regard to case. Most lookups in this program are case insensitive.

Also, if the input contains more identification definitions, they will be ignored. This is done so that you may include (see section 2.5 Controlling What Gets Processed) other definition files without an identification conflict.

AutoGen uses the name of the template to find the corresponding template file. It searches for the file in the following way, stopping when it finds the file:

It tries to open `./template-name'. If it fails,
it tries `./template-name.tpl'.
It searches for either of these files in the directories listed in the templ-dirs command line option.

If AutoGen fails to find the template file in one of these places, it prints an error message and exits.
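The search order above can be sketched in Python. This is only an illustration of the described lookup, not AutoGen's implementation: the function names are hypothetical, the directory list is assumed to already be in search order, and trying the bare name before the `.tpl' name inside each directory is an assumption.

```python
import os

def candidate_template_paths(template_name, templ_dirs=()):
    """List the paths to try, in order, for a given template name.

    templ_dirs is assumed to already be in search order.
    """
    names = [template_name, template_name + ".tpl"]
    # First the current directory: ./name, then ./name.tpl
    candidates = [os.path.join(".", name) for name in names]
    # Then either file name in each template directory
    for directory in templ_dirs:
        candidates.extend(os.path.join(directory, name) for name in names)
    return candidates

def find_template(template_name, templ_dirs=()):
    """Return the first candidate that exists, or fail with an error."""
    for path in candidate_template_paths(template_name, templ_dirs):
        if os.path.isfile(path):
            return path
    raise FileNotFoundError("template not found: " + template_name)
```

Calling candidate_template_paths("list", ["/some/dir"]) yields four candidates: the two in the current directory followed by the two in /some/dir.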

2.2 Named Definitions

Any name may have multiple values associated with it in the definition file. If there is more than one instance, the only way to expand all of the copies of it is by using the FOR (see section 3.6.13 FOR - Emit a template block multiple times) text function on it, as described in the next chapter.

There are two kinds of definitions, `simple' and `compound'. They are defined thus (see section 2.9 YACC Language Grammar):

compound_name '=' '{' definition-list '}' ';'
simple_name '=' string ';'
no_text_name ';'

no_text_name is a simple definition with a shorthand empty string value. The string values for definitions may be specified using any of several formation rules.

2.2.1 Definition List

2.2.2 Double Quote String

2.2.3 Single Quote String

2.2.4 Shell Output String

2.2.5 An Unquoted String

2.2.6 Scheme Result String

2.2.7 A Here String

2.2.8 Concatenated Strings

2.2.1 Definition List

definition-list is a list of definitions that may or may not contain nested compound definitions. Any such definitions may only be expanded within a FOR block iterating over the containing compound definition. See section 3.6.13 FOR - Emit a template block multiple times.

Here, again, are the example definitions from the previous chapter, with three additional name-value pairs: two with an empty value assigned (first and last), and a "global" group_name.

autogen definitions list;
group_name = example;
list = { list_element = alpha;  first;
         list_info    = "some alpha stuff"; };
list = { list_info    = "more beta stuff";
         list_element = beta; };
list = { list_element = omega;  last;
         list_info    = "final omega stuff"; };

2.2.2 Double Quote String

The string follows C-style escaping (\\, \n, \f, \v, etc.), plus octal character numbers specified as \ooo. The difference from "C" is that the string may span multiple lines. As in ANSI "C", a series of these strings, possibly intermixed with single quote strings, will be concatenated together.

2.2.3 Single Quote String

This is similar to the shell single-quote string. However, the escape character \ is honored before another escape character, a single quote ' or a hash character #. The last of these is done specifically to disambiguate lines starting with a hash character inside of a quoted string. In other words,

fumble = '
#endif
';

could be misinterpreted by the definitions scanner, whereas this would not:

fumble = '
\#endif
';

As with the double quote string, a series of these, even intermixed with double quote strings, will be concatenated together.
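As a rough illustration of the escaping rule for single quote strings, the following Python sketch decodes the text between the quotes. The function name is hypothetical and this is a model of the rule as stated here, not AutoGen's scanner: a backslash is dropped only when it precedes a backslash, a single quote or a hash character, and is kept literally otherwise.

```python
def decode_single_quoted(body):
    """Decode the text found between the single quotes."""
    out = []
    i = 0
    while i < len(body):
        ch = body[i]
        # A backslash is an escape only before \, ' or #
        if ch == "\\" and i + 1 < len(body) and body[i + 1] in "\\'#":
            out.append(body[i + 1])
            i += 2
        else:
            # Anything else, including a lone backslash, is copied as-is
            out.append(ch)
            i += 1
    return "".join(out)
```

Under this model, the escaped `\#endif' line from the example above decodes to a line starting with a plain hash character, while a sequence such as \n stays as the two characters backslash and n.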

2.2.4 Shell Output String

This is assembled according to the same rules as the double quote string, except that there is no concatenation of strings and the resulting string is written to a shell server process. The definition takes on the value of the output string.

NB The text is interpreted by a server shell. There may be leftover state from previous back-quote (`) processing and it may leave state for subsequent processing. However, a cd to the original directory is always issued before the new command is issued.

2.2.5 An Unquoted String

A simple string that does not contain white space may be left unquoted. The string must not contain any of the characters special to the definition text (i.e., ", #, ', (, ), ,, ;, <, =, >, [, ], `, {, or }). This list is subject to change, but it will never contain underscore (_), period (.), slash (/), colon (:), hyphen (-) or backslash (\\). Basically, if the string looks like a normal DOS or UNIX file or variable name, and it is not one of the two keywords (`autogen' or `definitions'), then it is OK to not quote it; otherwise you should.
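The rule can be captured as a small check. This Python sketch is illustrative only: the names are hypothetical, the special character set is the one listed above (which the text says may change), and treating the keyword comparison as case insensitive is an assumption based on the remark that most lookups ignore case.

```python
# The characters listed above as special to the definition text
SPECIAL_CHARS = set("\"#'(),;<=>[]`{}")
KEYWORDS = {"autogen", "definitions"}

def may_be_unquoted(text):
    """Return True if text could be written without quotes."""
    # Empty strings and the two keywords need quoting (case folding assumed)
    if not text or text.lower() in KEYWORDS:
        return False
    # No white space and no special characters allowed
    return not any(ch.isspace() or ch in SPECIAL_CHARS for ch in text)
```

For instance, templ_dirs and /usr/local/bin pass this check, whereas a value containing a space or an equal sign does not.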

2.2.6 Scheme Result String

A scheme result string must begin with an open parenthesis (. The scheme expression will be evaluated by Guile and the value will be the result. The AutoGen expression functions are disabled at this stage, so do not use them.

2.2.7 A Here String

A `here string' is formed in much the same way as a shell here doc. It is denoted with a doubled less-than character (<<) and, optionally, a hyphen. This is followed by optional horizontal white space and an ending marker-identifier. This marker must follow the syntax rules for identifiers. Unlike the shell version, however, you must not quote this marker. The resulting string will start with the first character on the next line and continue up to but not including the newline that precedes the line that begins with the marker token. No backslash or any other kind of processing is done on this string. The characters are copied directly into the result string.

Here are two examples:

str1 = <<-  STR_END
	$quotes = " ' `
	STR_END;

str2 = <<   STR_END
	$quotes = " ' `
	STR_END;
STR_END;

The first string contains no new line characters. The first character is the dollar sign, the last the back quote.

The second string contains one new line character. The first character is the tab character preceding the dollar sign. The last character is the semicolon after the STR_END. That STR_END does not end the string because it is not at the beginning of the line. In the preceding case, the leading tab was stripped.
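The two examples can be modeled with a short Python sketch. It is a simplified illustration rather than AutoGen's scanner: the function name is hypothetical, the input is assumed to be the lines following the `<<' line, and stripping only leading tabs when the hyphen is present is an assumption borrowed from the shell here doc.

```python
def read_here_string(body_lines, marker, strip_tabs=False):
    """Collect here-string text up to the line that begins with marker.

    body_lines: the lines that follow the `<<' line, without newlines.
    strip_tabs: True when the `<<-' form was used (assumed: tabs only).
    """
    collected = []
    for line in body_lines:
        if strip_tabs:
            line = line.lstrip("\t")
        rest = line[len(marker):]
        # The marker must start the line and stand alone as a token
        if line.startswith(marker) and not (rest[:1].isalnum() or rest[:1] == "_"):
            # The newline before the marker line is not part of the string
            return "\n".join(collected)
        collected.append(line)
    raise ValueError("unterminated here string")
```

Fed the three lines that follow each `<<' line above, the hyphenated form stops at the first STR_END and returns the text starting at the dollar sign, while the plain form keeps the tab-indented STR_END; line as part of the string.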

2.2.8 Concatenated Strings

If single or double quote characters are used, then you also have the option, a la ANSI-C syntax, of implicitly concatenating a series of them together, with intervening white space ignored.

NB You cannot use directives to alter the string content. That is,

str = "fumble"
#ifdef LATER
      "stumble"
#endif
      ;

will result in a syntax error. The preprocessing directives are not carried out by the C preprocessor. However,

str = '"fumble\n"
#ifdef LATER
"     stumble\n"
#endif
';

will work. It will enclose the `#ifdef LATER' and `#endif' in the string, but it may also wreak havoc with the definition processing directives. Hash characters in the first column should be disambiguated with an escape \ or joined with the previous line: "fumble\n#ifdef LATER....

2.3 Assigning an Index to a Definition

In AutoGen, every name is implicitly an array of values. When values are assigned, they are usually implicitly assigned to the next highest slot. They can also be specified explicitly:

mumble[9] = stumble;
mumble[0] = grumble;

If, subsequently, you assign a value to mumble without an index, its index will be 10, not 1. If indexes are specified, they must not cause conflicts.
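The slot assignment rule can be modeled with a toy Python class. This is a sketch of the behavior described here, not AutoGen's data structure: the class and method names are hypothetical, the value "bumble" is invented for the illustration, and starting at slot 0 for the very first unindexed value is an assumption.

```python
class ValueArray:
    """Toy model of the values attached to one definition name."""

    def __init__(self):
        self.slots = {}  # index -> value; gaps are simply absent keys

    def assign(self, value, index=None):
        if index is None:
            # No explicit index: take the slot after the highest one in use
            index = max(self.slots) + 1 if self.slots else 0
        elif index in self.slots:
            # Explicit indexes must not cause conflicts
            raise ValueError("index %d is already in use" % index)
        self.slots[index] = value
        return index

mumble = ValueArray()
mumble.assign("stumble", 9)
mumble.assign("grumble", 0)
next_index = mumble.assign("bumble")  # no index given
```

Here next_index ends up as 10, matching the rule that an unindexed value follows the highest slot in use rather than filling the first gap.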

#define-d names may also be used for index values. This is equivalent to the above:

#define FIRST  0
#define LAST   9
mumble[LAST]  = stumble;
mumble[FIRST] = grumble;

Not all values in a range have to be filled in. If you leave gaps, then you will have a sparse array. This is fine (see section 3.6.13 FOR - Emit a template block multiple times). You have your choice of iterating over all the defined values, or iterating over a range of slots. This:

[+ FOR mumble +][+ ENDFOR +]

iterates over all and only the defined entries, whereas this:

[+ FOR mumble (for-by 1) +][+ ENDFOR +]

will iterate over all 10 "slots". Your template will likely have to contain something like this:

[+ IF (exist? (sprintf "mumble[%d]" (for-index))) +]

or else "mumble" will have to be a compound value that, say, always contains a "grumble" value:

[+ IF (exist? "grumble") +]
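The difference between the two iteration styles can be pictured in Python with a dictionary standing in for the sparse array. The function names are hypothetical, and the assumption that a stepped iteration runs from the lowest to the highest defined index is inferred from the "all 10 slots" remark above.

```python
# Sparse array with only slots 0 and 9 filled in
mumble = {0: "grumble", 9: "stumble"}

def defined_entries(values):
    """Like a plain FOR: visit all and only the defined entries."""
    return [(i, values[i]) for i in sorted(values)]

def all_slots(values, step=1):
    """Like FOR with (for-by 1): visit every slot, defined or not."""
    lo, hi = min(values), max(values)
    # Undefined slots show up as None, which the caller must test for
    return [(i, values.get(i)) for i in range(lo, hi + 1, step)]
```

The first function visits two entries and the second visits ten, eight of which are empty; that is why the stepped form needs an existence test inside the loop body.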

2.4 Dynamic Text

There are several methods for including dynamic content inside a definitions file. Two of them are mentioned above (see section 2.2.4 Shell Output String and section 2.2.6 Scheme Result String) in the discussion of string formation rules. Another method uses the #shell processing directive. It will be discussed in the next section (see section 2.5 Controlling What Gets Processed). Guile/Scheme may also be used to create definitions.

When the Scheme expression is preceded by a backslash and single quote, then the expression is expected to be an alist of names and values that will be used to create AutoGen definitions.

This method can be used as follows:

\'( (name  (value-expression))
    (name2 (another-expr)) )

This is entirely equivalent to:

name  = (value-expression);
name2 = (another-expr);

Under the covers, the expression gets handed off to a Guile function named alist->autogen-def in an expression that looks like this:

(alist->autogen-def
    ( (name (value-expression))  (name2 (another-expr)) ) )

2.5 Controlling What Gets Processed

Definition processing directives can only be processed if the '#' character is the first character on a line. Also, if you want a '#' as the first character of a line in one of your string assignments, you should either escape it with a preceding backslash `\' or embed it in the string as in "\n#".

All of the normal C preprocessing directives are recognized, though several are ignored. There is also an additional #shell - #endshell pair. Another minor difference is that AutoGen directives must have the hash character (#) in column 1.

The final tweak is that #! is treated as a comment line. Using this feature, you can use: `#! /usr/local/bin/autogen' as the first line of a definitions file, set the mode to executable and "run" the definitions file as if it were a direct invocation of AutoGen. This was done for its hack value.

The ignored directives are: `#assert', `#ident', `#pragma', and `#if'. Note that when ignoring the #if directive, all intervening text through its matching #endif is also ignored, including the #else clause.

The AutoGen directives that affect the processing of definitions are:

#define name [ <text> ]

Will add the name to the define list as if it were a DEFINE program argument. Its value will be the first non-whitespace token following the name. Quotes are not processed.

After the definitions file has been processed, any remaining entries in the define list will be added to the environment.

#elif

This must follow an #if; otherwise it will generate an error. It will be ignored.

#else

This must follow an #if, #ifdef or #ifndef. If it follows the #if, then it will be ignored. Otherwise, it will change the processing state to the reverse of what it was.

#endif

This must follow an #if, #ifdef or #ifndef. In all cases, this will resume normal processing of text.

#endshell

Ends the text processed by a command shell into autogen definitions.

#error [ <descriptive text> ]

This directive will cause AutoGen to stop processing and exit with a status of EXIT_FAILURE.

#if [ <ignored conditional expression> ]

#if expressions are not analyzed. Everything from here to the matching #endif is skipped.

#ifdef name-to-test

The definitions that follow, up to the matching #endif will be processed only if there is a corresponding -Dname command line option.

#ifndef name-to-test

The definitions that follow, up to the matching #endif will be processed only if there is not a corresponding -Dname command line option or there was a canceling -Uname option.

#include unadorned-file-name

This directive will insert definitions from another file into the current collection. If the file name is adorned with double quotes or angle brackets (as in a C program), then the include is ignored.

#line

Alters the current line number and/or file name. You may wish to use this directive if you extract definition source from other files. getdefs uses this mechanism so AutoGen will report the correct file and approximate line number of any errors found in extracted definitions.

#option opt-name [ <text> ]

This directive will pass the option name and associated text to the AutoOpts optionLoadLine routine (see section process a text string for options). The option text may span multiple lines by continuing them with a backslash. The backslash/newline pair will be replaced with two space characters. This directive may be used to set a search path for locating template files. For example, this:

#option templ-dirs $ENVVAR/dirname

will direct autogen to use the ENVVAR environment variable to find a directory named dirname that (may) contain templates. Since these directories are searched in most recently supplied first order, search directories supplied in this way will be searched before any supplied on the command line.

#shell

Invokes $SHELL or `/bin/sh' on a script that should generate AutoGen definitions. It does this using the same server process that handles the back-quoted ` text. CAUTION: do not let your $SHELL be csh.

#undef name-to-undefine

Will remove any entries from the define list that match the undef name pattern.
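The interplay of the conditional directives can be sketched as a line filter in Python. This is a simplified model of the rules listed above and not AutoGen's scanner: the function name is hypothetical, the set of defined names is passed in directly instead of coming from -D options or #define, and every other directive is passed through untouched.

```python
def filter_definitions(lines, defined):
    """Return the lines that survive conditional processing."""
    stack = []  # one entry per open conditional: [kind, parent_active, active]
    kept = []
    for line in lines:
        active = stack[-1][2] if stack else True
        # Directives are recognized only with the hash in column 1
        word = line.split()[0] if line.startswith("#") else ""
        if word == "#if":
            # Not analyzed: skip everything through the matching #endif
            stack.append(["if", active, False])
        elif word in ("#ifdef", "#ifndef"):
            name = line.split()[1]
            wanted = (name in defined) == (word == "#ifdef")
            stack.append(["ifdef", active, active and wanted])
        elif word in ("#else", "#elif", "#endif"):
            if not stack:
                raise ValueError(word + " without an opening conditional")
            kind, parent_active, now = stack[-1]
            if word == "#endif":
                stack.pop()
            elif word == "#elif":
                if kind != "if":
                    raise ValueError("#elif must follow #if")
            elif kind != "if":
                # #else after #ifdef/#ifndef reverses the processing state
                stack[-1][2] = parent_active and not now
        elif active:
            kept.append(line)
    return kept
```

With an #ifdef FEATURE block followed by an #else branch, the filter keeps the first branch when "FEATURE" is in the defined set and the second branch otherwise; anything between #if and its #endif is dropped in both cases.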

2.6 Pre-defined Names

When AutoGen starts, it tries to determine several names from the operating environment and put them into environment variables for use in both #ifdef tests in the definitions files and in shell scripts with environment variable tests. __autogen__ is always defined. For other names, AutoGen will first try to use the POSIX version of the sysinfo(2) system call. Failing that, it will try for the POSIX uname(2) call. If neither is available, then only "__autogen__" will be inserted into the environment. In all cases, the associated names are converted to lower case, surrounded by doubled underscores and non-symbol characters are replaced with underscores.

With Solaris on a sparc platform, sysinfo(2) is available. The following strings are used:

SI_SYSNAME (e.g., "__sunos__")
SI_HOSTNAME (e.g., "__ellen__")
SI_ARCHITECTURE (e.g., "__sparc__")
SI_HW_PROVIDER (e.g., "__sun_microsystems__")
SI_PLATFORM (e.g., "__sun_ultra_5_10__")
SI_MACHINE (e.g., "__sun4u__")

For Linux and other operating systems that only support the uname(2) call, AutoGen will use these values:

sysname (e.g., "__linux__")
machine (e.g., "__i586__")
nodename (e.g., "__bach__")

By testing these pre-defines in your definitions, you can select pieces of the definitions without resorting to writing shell scripts that parse the output of uname(1). You can also segregate real C code from autogen definitions by testing for "__autogen__".

#ifdef __bach__
  location = home;
#else
  location = work;
#endif
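The name conversion described in this section (lower case, doubled underscores around the name, non-symbol characters replaced with underscores) might be sketched in Python as follows. The function name is hypothetical, and reading "symbol characters" as letters, digits and underscore is an assumption.

```python
import re

def predefined_name(raw):
    """Turn a system-reported string into a pre-defined name."""
    lowered = raw.lower()
    # Assumed: anything other than a letter, digit or underscore is replaced
    return "__" + re.sub(r"[^a-z0-9_]", "_", lowered) + "__"
```

Under these assumptions, a sysname of "Linux" becomes "__linux__" and a provider string such as "Sun Microsystems" becomes "__sun_microsystems__".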

2.7 Commenting Your Definitions

The definitions file may contain C and C++ style comments.

/*
 *  This is a comment.  It continues for several lines and closes
 *  when the characters '*' and '/' appear together.
 */
// this comment is a single line comment

2.8 What it all looks like.

This is an extended example:

autogen definitions `template-name';
/*
 *  This is a comment that describes what these
 *  definitions are all about.
 */
global = "value for a global text definition.";

/*
 *  Include a standard set of definitions
 */
#include standards.def

a_block = {
    a_field;
    a_subblock = {
        sub_name  = first;
        sub_field = "sub value.";
    };

#ifdef FEATURE
    a_subblock = {
        sub_name = second;
    };
#endif

};

2.9 YACC Language Grammar

The preprocessing directives and comments are not part of the grammar. They are handled by the scanner/lexer. The following was extracted directly from the defParse.y source file:

definitions
    : identity def_list TK_END
        { $$ = (YYSTYPE)(rootDefCtx.pDefs = (tDefEntry*)$2); }
    | identity TK_END
        { $$ = makeEmptyDefs(); }
    ;

def_list
    : definition
        { $$ = $1; }
    | definition def_list
        { $$ = addSibMacro( $1, $2 ); }
    | identity def_list
        { $$ = $2; }
    ;

identity
    : TK_AUTOGEN TK_DEFINITIONS filename ';'
        { $$ = identify( $3 ); }
    ;

definition
    : value_name ';'
        { $$ = makeMacro( $1, (YYSTYPE)"", VALTYP_TEXT ); }
    | value_name '=' text_list ';'
        { $$ = makeMacroList( $1, $3, VALTYP_TEXT ); }
    | value_name '=' block_list ';'
        { $$ = makeMacroList( $1, $3, VALTYP_BLOCK ); }
    ;

text_list
    : anystring
        { $$ = startList( $1 ); }
    | anystring ',' text_list
        { $$ = appendList( $1, $3 ); }
    ;

block_list
    : def_block
        { $$ = startList( $1 ); }
    | def_block ',' block_list
        { $$ = appendList( $1, $3 ); }
    ;

def_block
    : '{' def_list '}'
        { $$ = $2; }
    ;

anystring
    : filename
        { $$ = $1; }
    | TK_NUMBER
        { $$ = $1; }
    ;

filename
    : TK_OTHER_NAME
        { $$ = $1; }
    | TK_STRING
        { $$ = $1; }
    | TK_VAR_NAME
        { $$ = $1; }
    ;

value_name
    : TK_VAR_NAME
        { $$ = findPlace( (YYSTYPE)$1, (YYSTYPE)NULL ); }
    | TK_VAR_NAME '[' TK_NUMBER ']'
        { $$ = findPlace( (YYSTYPE)$1, (YYSTYPE)$3 ); }
    | TK_VAR_NAME '[' TK_VAR_NAME ']'
        { $$ = findPlace( (YYSTYPE)$1, (YYSTYPE)$3 ); }
    ;

2.10 Alternate Definition Forms

There are several methods for supplying data values for templates.

`no definitions'

It is entirely possible to write a template that does not depend upon external definitions. Such a template would likely have an unvarying output, but be convenient nonetheless because of an external library of either AutoGen or Scheme functions, or both. This can be accommodated by providing the --override-tpl and --no-definitions options on the command line. See section 5. Invoking autogen.

`CGI'

AutoGen behaves as a CGI server if the definitions input is from stdin and the environment variable REQUEST_METHOD is defined and set to either "GET" or "POST". See section 6.2 AutoGen as a CGI server. Obviously, all the values are constrained to strings because there is no way to represent nested values.

`XML'

AutoGen comes with a program named xml2ag. Its output can either be redirected to a file for later use, or the program can be used as an AutoGen wrapper. See section 8.6 Invoking xml2ag.

The introductory template example (see section 1.2 A Simple Example) can be rewritten in XML as follows:

A more XML-normal form might look like this:

<EXAMPLE template="list.tpl">
<LIST list_element="alpha">some alpha stuff</LIST>
<LIST list_element="beta" >more beta stuff</LIST>
<LIST list_element="omega">final omega stuff</LIST>
</EXAMPLE>

but you would have to change the template list_info references into text references.

`standard AutoGen definitions'

Of course. :-)

This document was generated by Bruce Korb on May 5, 2003 using texi2html