[ < ] | [ > ] | [ << ] | [ Up ] | [ >> ] | [Top] | [Contents] | [Index] | [ ? ] |
Autoconf is written on top of two layers: M4sugar, which provides convenient macros for pure M4 programming, and M4sh, which provides macros dedicated to shell script generation.
As of this version of Autoconf, these two layers are still experimental, and their interface might change in the future. As a matter of fact, anything that is not documented must not be used.
8.1 M4 Quotation    Protecting macros from unwanted expansion
8.2 Using autom4te    The Autoconf executables backbone
8.3 Programming in M4sugar    Convenient pure M4 macros
8.4 Programming in M4sh    Common shell constructs
8.1 M4 Quotation
The most common problem with existing macros is an improper quotation. This section, which users of Autoconf can skip, but which macro writers must read, first justifies the quotation scheme that was chosen for Autoconf and then ends with a rule of thumb. Understanding the former helps one to follow the latter.
8.1.1 Active Characters    Characters that change the behavior of M4
8.1.2 One Macro Call    Quotation and one macro call
8.1.3 Quotation and Nested Macros    Macros calling macros
8.1.4 changequote is Evil    Worse than INTERCAL: M4 + changequote
8.1.5 Quadrigraphs    Another way to escape special characters
8.1.6 Quotation Rule Of Thumb    One parenthesis, one quote
8.1.1 Active Characters
To fully understand where proper quotation is important, you first need to know what the special characters are in Autoconf: `#' introduces a comment inside which no macro expansion is performed, `,' separates arguments, `[' and `]' are the quotes themselves, and finally `(' and `)' (which M4 tries to match by pairs).
In order to understand the delicate case of macro calls, we first have to present some obvious failures. Below they are "obvious-ified", but when you find them in real life, they are usually in disguise.
Comments, introduced by a hash and running up to the newline, are opaque tokens to the top level: active characters are turned off, and there is no macro expansion:
    # define([def], ine)
    =># define([def], ine)
Each time there can be a macro expansion, there is a quotation expansion, i.e., one level of quotes is stripped:
    int tab[10];
    =>int tab10;
    [int tab[10];]
    =>int tab[10];
Without this in mind, the reader will try hopelessly to use her macro array:
    define([array], [int tab[10];])
    array
    =>int tab10;
    [array]
    =>array
How can you correctly output the intended results?
8.1.2 One Macro Call
Let's proceed on the interaction between active characters and macros with this small macro, which just returns its first argument:
    define([car], [$1])
The two pairs of quotes above are not part of the arguments of define; rather, they are understood by the top level when it tries to find the arguments of define. Therefore, it is equivalent to write:
    define(car, $1)
But, while it is acceptable for a `configure.ac' to avoid unnecessary quotes, it is bad practice for Autoconf macros which must both be more robust and also advocate perfect style.
At the top level, there are only two possibilities: either you quote or you don't:
    car(foo, bar, baz)
    =>foo
    [car(foo, bar, baz)]
    =>car(foo, bar, baz)
Let's pay attention to the special characters:
    car(#)
    error-->EOF in argument list
The closing parenthesis is hidden in the comment; with a hypothetical quoting, the top level understood it this way:
    car([#)]
Proper quotation, of course, fixes the problem:
    car([#])
    =>#
The reader will easily understand the following examples:
    car(foo, bar)
    =>foo
    car([foo, bar])
    =>foo, bar
    car((foo, bar))
    =>(foo, bar)
    car([(foo], [bar)])
    =>(foo
    car([], [])
    =>
    car([[]], [[]])
    =>[]
With this in mind, we can explore the cases where macros invoke macros....
8.1.3 Quotation and Nested Macros
The examples below use the following macros:
    define([car], [$1])
    define([active], [ACT, IVE])
    define([array], [int tab[10]])
Each additional embedded macro call introduces other possible interesting quotations:
    car(active)
    =>ACT
    car([active])
    =>ACT, IVE
    car([[active]])
    =>active
In the first case, the top level looks for the arguments of car, and finds `active'. Because M4 evaluates its arguments before applying the macro, `active' is expanded, which results in:
    car(ACT, IVE)
    =>ACT
In the second case, the top level gives `active' as first and only argument of car, which results in:
    active
    =>ACT, IVE
i.e., the argument is evaluated after the macro that invokes it.
In the third case, car receives `[active]', which results in:
    [active]
    =>active
exactly as we already saw above.
The example above, applied to a more realistic example, gives:
    car(int tab[10];)
    =>int tab10;
    car([int tab[10];])
    =>int tab10;
    car([[int tab[10];]])
    =>int tab[10];
Huh? The first case is easily understood, but why is the second wrong, and the third right? To understand that, you must know that after M4 expands a macro, the resulting text is immediately subjected to macro expansion and quote removal. This means that the quote removal occurs twice--first before the argument is passed to the car macro, and second after the car macro expands to the first argument.
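The double quote removal can be modeled outside M4. The following shell sketch is only an illustration under stated assumptions, not how M4 is implemented: the hypothetical helper strip_quotes removes one level of `[' `]' quotes from each input line and performs no macro expansion at all. Applying it twice mimics what happens to the argument of car.

```shell
# strip_quotes: hypothetical helper that removes ONE level of [ ] quotes
# from every input line.  It models quote removal only; it expands no
# macros and assumes the brackets are balanced.
strip_quotes() {
  awk '{
    out = ""; depth = 0
    for (i = 1; i <= length($0); i++) {
      c = substr($0, i, 1)
      if (c == "[") {
        if (depth > 0) out = out c   # keep inner quotes
        depth++
      } else if (c == "]") {
        depth--
        if (depth > 0) out = out c   # keep inner quotes
      } else
        out = out c
    }
    print out
  }'
}

# car([int tab[10];]): one removal when the argument is collected,
# one more when the expansion of car is rescanned.
once=$(printf '%s\n' '[int tab[10];]' | strip_quotes)
twice=$(printf '%s\n' "$once" | strip_quotes)
echo "$once"    # int tab[10];
echo "$twice"   # int tab10;

# car([[int tab[10];]]): the extra level survives both removals.
printf '%s\n' '[[int tab[10];]]' | strip_quotes | strip_quotes
```

The last command prints `int tab[10];', which is why the doubly quoted call is the one that preserves the brackets.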
As the author of the Autoconf macro car, you then consider it to be incorrect that your users have to double-quote the arguments of car, so you "fix" your macro. Let's call it qar for quoted car:
    define([qar], [[$1]])
and check that qar is properly fixed:
    qar([int tab[10];])
    =>int tab[10];
Ahhh! That's much better.
But note what you've done: now that the arguments are literal strings, if the user wants to use the results of expansions as arguments, she has to use an unquoted macro call:
    qar(active)
    =>ACT
where she wanted to reproduce what she used to do with car:
    car([active])
    =>ACT, IVE
Worse yet: she wants to use a macro that produces a set of cpp macros:
    define([my_includes], [#include <stdio.h>])
    car([my_includes])
    =>#include <stdio.h>
    qar(my_includes)
    error-->EOF in argument list
This macro, qar, because it double quotes its arguments, forces its users to leave their macro calls unquoted, which is dangerous. Commas and other active symbols are interpreted by M4 before they are given to the macro, often not in the way the users expect. Also, because qar behaves differently from the other macros, it's an exception that should be avoided in Autoconf.
8.1.4 changequote is Evil
The temptation is often high to bypass proper quotation, in particular when it's late at night. Then, many experienced Autoconf hackers finally surrender to the dark side of the force and use the ultimate weapon: changequote.
The M4 builtin changequote belongs to a set of primitives that allow one to adjust the syntax of the language to one's needs. For instance, by default M4 uses ``' and `'' as quotes, but in the context of shell programming (and actually of most programming languages), that's about the worst choice one can make: because of strings and back-quoted expressions in shell code (such as `'this'' and ``that`'), because of literal characters in usual programming languages (as in `'0''), there are many unbalanced ``' and `''. Proper M4 quotation then becomes a nightmare, if not impossible. In order to make M4 useful in such a context, its designers have equipped it with changequote, which makes it possible to choose another pair of quotes. M4sugar, M4sh, Autoconf, and Autotest all have chosen to use `[' and `]'. Not especially because they are unlikely characters, but because they are characters unlikely to be unbalanced.
There are other magic primitives, such as changecom to specify what syntactic forms are comments (it is common to see `changecom(<!--, -->)' when M4 is used to produce HTML pages), changeword and changesyntax to change other syntactic details (such as the character to denote the n-th argument, `$' by default, the parentheses around arguments, etc.).
These primitives are really meant to make M4 more useful for specific domains: they should be considered like command line options: `--quotes', `--comments', `--words', and `--syntax'. Nevertheless, they are implemented as M4 builtins, as it makes M4 libraries self-contained (no need for additional options).
There lies the problem....
The problem is that it is then tempting to use them in the middle of an M4 script, as opposed to its initialization. This, if not carefully thought out, can lead to disastrous effects: you are changing the language in the middle of the execution. Changing and restoring the syntax is often not enough: if you happened to invoke macros in between, these macros will be lost, as the current syntax will probably not be the one they were implemented with.
8.1.5 Quadrigraphs
When writing an Autoconf macro you may occasionally need to generate special characters that are difficult to express with the standard Autoconf quoting rules. For example, you may need to output the regular expression `[^[]', which matches any character other than `['. This expression contains unbalanced brackets so it cannot be put easily into an M4 macro.
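You can work around this problem by using one of the following quadrigraphs:
- `@<:@' produces `['
- `@:>@' produces `]'
- `@S|@' produces `$'
- `@%:@' produces `#'
- `@&t@' expands to nothing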
Quadrigraphs are replaced at a late stage of the translation process, after m4 is run, so they do not get in the way of M4 quoting. For example, the string `^@<:@', independently of its quotation, will appear as `^[' in the output.
The empty quadrigraph can be used:
- to mark trailing spaces explicitly. Trailing spaces are smashed by autom4te. This is a feature.
- to produce other quadrigraphs. For instance `@<@&t@:@' produces `@<:@'.
- to escape occurrences of forbidden patterns. For instance you might want to mention AC_FOO in a comment, while still being sure that autom4te will catch unexpanded `AC_*'. Then write `AC@&t@_FOO'.
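The late substitution step can be approximated with a few sed substitutions. This is a sketch, not the code autom4te runs: the function name expand_quadrigraphs is made up for the illustration, and it assumes the five quadrigraphs `@<:@', `@:>@', `@S|@', `@%:@' and `@&t@'. The empty quadrigraph is removed last, which is what lets it protect the others.

```shell
# expand_quadrigraphs: hypothetical filter mimicking the late replacement
# of quadrigraphs on standard input.  The empty quadrigraph @&t@ is
# handled last so that it can be used to spell the other ones.
expand_quadrigraphs() {
  sed -e 's/@<:@/[/g' \
      -e 's/@:>@/]/g' \
      -e 's/@S|@/$/g' \
      -e 's/@%:@/#/g' \
      -e 's/@&t@//g'
}

printf '%s\n' '^@<:@' '@<@&t@:@' 'AC@&t@_FOO' | expand_quadrigraphs
```

With these three input lines the filter prints `^[', `@<:@' and `AC_FOO', matching the examples given in the text.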
The name `@&t@' was suggested by Paul Eggert:
I should give some credit to the `@&t@' pun. The `&' is my own invention, but the `t' came from the source code of the ALGOL68C compiler, written by Steve Bourne (of Bourne shell fame), and which used `mt' to denote the empty string. In C, it would have looked like something like:
    char const mt[] = "";
but of course the source code was written in Algol 68.
I don't know where he got `mt' from: it could have been his own invention, and I suppose it could have been a common pun around the Cambridge University computer lab at the time.
8.1.6 Quotation Rule Of Thumb
To conclude, the quotation rule of thumb is:
Never over-quote, never under-quote, in particular in the definition of macros. In the few places where the macros need to use brackets (usually in C program text or regular expressions), properly quote the arguments!
It is common to read Autoconf programs with snippets like:
    AC_TRY_LINK(
    changequote(<<, >>)dnl
    <<#include <time.h>
    #ifndef tzname /* For SGI. */
    extern char *tzname[]; /* RS6000 and others reject char **tzname. */
    #endif>>,
    changequote([, ])dnl
    [atoi (*tzname);], ac_cv_var_tzname=yes, ac_cv_var_tzname=no)
which is incredibly useless since AC_TRY_LINK is already double quoting, so you just need:
    AC_TRY_LINK(
    [#include <time.h>
    #ifndef tzname /* For SGI. */
    extern char *tzname[]; /* RS6000 and others reject char **tzname. */
    #endif],
    [atoi (*tzname);],
    [ac_cv_var_tzname=yes],
    [ac_cv_var_tzname=no])
The M4-fluent reader will note that these two examples are rigorously equivalent, since M4 swallows both the `changequote(<<, >>)' and `<<' `>>' when it collects the arguments: these quotes are not part of the arguments!
Simplified, the example above is just doing this:
    changequote(<<, >>)dnl
    <<[]>>
    changequote([, ])dnl
instead of simply:
    [[]]
With macros that do not double quote their arguments (which is the rule), double-quote the (risky) literals:
    AC_LINK_IFELSE([AC_LANG_PROGRAM(
    [[#include <time.h>
    #ifndef tzname /* For SGI. */
    extern char *tzname[]; /* RS6000 and others reject char **tzname. */
    #endif]],
    [atoi (*tzname);])],
    [ac_cv_var_tzname=yes],
    [ac_cv_var_tzname=no])
See section 8.1.5 Quadrigraphs, for what to do if you run into a hopeless case where quoting does not suffice.
When you create a configure script using newly written macros, examine it carefully to check whether you need to add more quotes in your macros. If one or more words have disappeared in the M4 output, you need more quotes. When in doubt, quote.
However, it's also possible to put on too many layers of quotes. If this happens, the resulting configure script will contain unexpanded macros. The autoconf program checks for this problem by doing `grep AC_ configure'.
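The same check is easy to run by hand on any generated script. A minimal sketch, assuming a POSIX shell and that mktemp is available; the function name check_unexpanded is hypothetical, and the pattern is the plain `AC_' search quoted above rather than whatever more elaborate test a given Autoconf release performs.

```shell
# check_unexpanded FILE: hypothetical helper; list the lines that still
# contain "AC_" and fail if there is at least one.
check_unexpanded() {
  if grep -n 'AC_' "$1"; then
    echo "$1: possibly unexpanded macros (too many quotes?)" >&2
    return 1
  fi
  return 0
}

# Demonstration on two throwaway files.
tmpdir=$(mktemp -d)
printf '%s\n' 'echo checking' 'AC_CHECK_FOO' > "$tmpdir/bad"
printf '%s\n' 'echo checking' 'echo done'    > "$tmpdir/good"
check_unexpanded "$tmpdir/good" && good_status=clean
check_unexpanded "$tmpdir/bad" 2>/dev/null || bad_status=suspicious
echo "$good_status $bad_status"
rm -rf "$tmpdir"
```

A hit is only a hint: a legitimate occurrence of `AC_' in a comment or a string triggers it too, so the offending lines are printed for inspection.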
8.2 Using autom4te
The Autoconf suite, including M4sugar, M4sh, and Autotest, in addition to Autoconf per se, relies heavily on M4. All these different uses revealed common needs factored into a layer over m4: autom4te. autom4te should basically be considered as a replacement for m4 itself.
8.2.1 Invoking autom4te    A GNU M4 wrapper
8.2.2 Customizing autom4te    Customizing the Autoconf package
8.2.1 Invoking autom4te
The command line arguments are modeled after M4's:
    autom4te options files
where the files are directly passed to m4. In addition to the regular expansion, it handles the replacement of the quadrigraphs (see section 8.1.5 Quadrigraphs), and of `__oline__', the current line in the output. It supports an extended syntax for the files:
Of course, it supports the Autoconf common subset of options:
As an extension of m4, it includes the following options:
AC_DIAGNOSE, for a comprehensive list of categories. Special values include:
Warnings about `syntax' are enabled by default, and the environment variable WARNINGS, a comma separated list of categories, is honored. autom4te -W category will actually behave as if you had run:
    autom4te --warnings=syntax,$WARNINGS,category
If you want to disable autom4te's defaults and WARNINGS, but (for example) enable the warnings about obsolete constructs, you would use `-W none,obsolete'.
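The way the default, the environment and the command line combine can be sketched in shell. This is an illustration, not autom4te's implementation: the helper resolve_warnings is hypothetical, it assumes the categories are processed from left to right, and it models only the special value `none'.

```shell
# resolve_warnings LIST: hypothetical helper printing the categories that
# remain enabled once the comma-separated LIST is read left to right.
# Assumption: "none" clears everything seen so far; no other special
# value is modeled.
resolve_warnings() {
  enabled=
  old_ifs=$IFS
  IFS=,
  set -f              # no globbing while splitting
  set -- $1
  set +f
  IFS=$old_ifs
  for w
  do
    case $w in
      '')   ;;                          # empty entry, e.g. WARNINGS unset
      none) enabled= ;;
      *)    case " $enabled " in
              *" $w "*) ;;              # already enabled
              *) enabled="$enabled $w" ;;
            esac ;;
    esac
  done
  echo $enabled
}

WARNINGS=cross
resolve_warnings "syntax,$WARNINGS,obsolete"        # as with -W obsolete
resolve_warnings "syntax,$WARNINGS,none,obsolete"   # as with -W none,obsolete
```

The first call prints `syntax cross obsolete'; the second prints only `obsolete', since `none' discards both the default and the contents of WARNINGS.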
autom4te displays a back trace for errors, but not for warnings; if you want them, just pass `-W error'. For instance, on this `configure.ac':
    AC_DEFUN([INNER],
    [AC_RUN_IFELSE([AC_LANG_PROGRAM([exit (0)])])])

    AC_DEFUN([OUTER],
    [INNER])

    AC_INIT
    OUTER
you get:
    $ autom4te -l autoconf -Wcross
    configure.ac:8: warning: AC_RUN_IFELSE called without default \
    to allow cross compiling
    $ autom4te -l autoconf -Wcross,error -f
    configure.ac:8: error: AC_RUN_IFELSE called without default \
    to allow cross compiling
    acgeneral.m4:3044: AC_RUN_IFELSE is expanded from...
    configure.ac:2: INNER is expanded from...
    configure.ac:5: OUTER is expanded from...
    configure.ac:8: the top level
file.m4f will be replaced with file.m4. This helps tracing the macros which are executed only when the files are frozen, typically m4_define. For instance, running:
    autom4te --melt 1.m4 2.m4f 3.m4 4.m4f input.m4
is roughly equivalent to running:
    m4 1.m4 2.m4 3.m4 4.m4 input.m4
while
    autom4te 1.m4 2.m4f 3.m4 4.m4f input.m4
is equivalent to:
    m4 --reload-state=4.m4f input.m4
autom4te freezing is stricter than M4's: it must produce no warnings, and no output other than empty lines (a line with whitespace is not empty) and comments (starting with `#'). Please note that, contrary to m4, this option takes no argument:
    autom4te 1.m4 2.m4 3.m4 --freeze --output=3.m4f
corresponds to
    m4 1.m4 2.m4 3.m4 --freeze-state=3.m4f
As another additional feature over m4, autom4te caches its results. GNU M4 is able to produce a regular output and traces at the same time. Traces are heavily used in the GNU Build System: autoheader uses them to build `config.h.in', autoreconf to determine what GNU Build System components are used, automake to "parse" `configure.ac' etc. To save the long runs of m4, traces are cached while performing regular expansion, and conversely. This cache is (actually, the caches are) stored in the directory `autom4te.cache'. It can safely be removed at any moment (especially if for some reason autom4te considers it is trashed).
Because traces are so important to the GNU Build System, autom4te provides high level tracing features as compared to M4, and helps exploiting the cache:
The format is a regular string, with newlines if desired, and several special escape codes. It defaults to `$f:$l:$n:$%'. It can use the following special escapes:
The escape `$%' produces single-line trace outputs (unless you put newlines in the `separator'), while `$@' and `$*' do not.
See section 3.4 Using autoconf to Create configure, for examples of trace uses.
autoconf preselects all the macros that autoheader, automake, autoreconf etc. will trace, so that running m4 is not needed to trace them: the cache suffices. This results in a huge speed-up.
Finally, autom4te introduces the concept of Autom4te libraries. They consist of a powerful yet extremely simple feature: sets of combined command line arguments:
M4sugar
M4sh
Autotest
Autoconf
As an example, if Autoconf is installed in its default location, `/usr/local', running `autom4te -l m4sugar foo.m4' is strictly equivalent to running `autom4te --prepend-include /usr/local/share/autoconf m4sugar/m4sugar.m4f --warnings syntax foo.m4'. Recursive expansion applies: running `autom4te -l m4sh foo.m4' is the same as `autom4te --language M4sugar m4sugar/m4sh.m4f foo.m4', i.e., `autom4te --prepend-include /usr/local/share/autoconf m4sugar/m4sugar.m4f m4sugar/m4sh.m4f --mode 777 foo.m4'. The definition of the languages is stored in `autom4te.cfg'.
8.2.2 Customizing autom4te
One can customize autom4te via `~/.autom4te.cfg' (i.e., as found in the user home directory), and `./.autom4te.cfg' (i.e., as found in the directory from which autom4te is run). The order is first reading `autom4te.cfg', then `~/.autom4te.cfg', then `./.autom4te.cfg', and finally the command line arguments.
In these text files, comments are introduced with #, and empty lines are ignored. Customization is performed on a per-language basis, wrapped in between a `begin-language: "language"', `end-language: "language"' pair.
Customizing a language stands for appending options (see section 8.2.1 Invoking autom4te) to the current definition of the language. Options, and more generally arguments, are introduced by `args: arguments'. You may use the traditional shell syntax to quote the arguments.
As an example, to disable Autoconf caches (`autom4te.cache') globally, include the following lines in `~/.autom4te.cfg':
    ## ------------------ ##
    ## User Preferences.  ##
    ## ------------------ ##

    begin-language: "Autoconf"
    args: --no-cache
    end-language: "Autoconf"
8.3 Programming in M4sugar
M4 by itself provides only a small, but sufficient, set of all-purpose macros. M4sugar introduces additional generic macros. Its name was coined by Lars J. Aas: "Readability And Greater Understanding Stands 4 M4sugar".
8.3.1 Redefined M4 Macros    M4 builtins changed in M4sugar
8.3.2 Evaluation Macros    More quotation and evaluation control
8.3.3 Forbidden Patterns    Catching unexpanded macros
8.3.1 Redefined M4 Macros
With a few exceptions, all the M4 native macros are moved into the `m4_' pseudo-namespace, e.g., M4sugar renames define as m4_define etc.
Some M4 macros are redefined, and are slightly incompatible with their native equivalent.
m4_dnl is defined.
m4_undefine.
m4exit.
ifelse.
    m4_ifdef([macro], [m4_undefine([macro])])
to recover the behavior of the builtin.
patsubst. The name m4_patsubst is kept for future versions of M4sh, on top of GNU M4 which will provide extended regular expression syntax via epatsubst.
m4_undefine.
regexp. The name m4_regexp is kept for future versions of M4sh, on top of GNU M4 which will provide extended regular expression syntax via eregexp.
m4wrap.
You are encouraged to end text with `[]', so that there are no risks that two consecutive invocations of m4_wrap result in an unexpected pasting of tokens, as in
    m4_define([foo], [Foo])
    m4_define([bar], [Bar])
    m4_define([foobar], [FOOBAR])
    m4_wrap([bar])
    m4_wrap([foo])
    =>FOOBAR
8.3.2 Evaluation Macros
The following macros give some control over the order of the evaluation by adding or removing levels of quotes. They are meant for hard-core M4 programmers.
The following example aims at emphasizing the difference between (i), not using these macros, (ii), using m4_quote, and (iii), using m4_dquote.
    $ cat example.m4
    # Overquote, so that quotes are visible.
    m4_define([show], [$[]1 = [$1], $[]@ = [$@]])
    m4_divert(0)dnl
    show(a, b)
    show(m4_quote(a, b))
    show(m4_dquote(a, b))
    $ autom4te -l m4sugar example.m4
    $1 = a, $@ = [a],[b]
    $1 = a,b, $@ = [a,b]
    $1 = [a],[b], $@ = [[a],[b]]
8.3.3 Forbidden Patterns
M4sugar provides a means to define suspicious patterns, patterns describing tokens which should not be found in the output. For instance, if an Autoconf `configure' script includes tokens such as `AC_DEFINE', or `dnl', then most probably something went wrong (typically a macro was not evaluated because of overquotation).
M4sugar forbids all the tokens matching `^m4_' and `^dnl$'.
Of course, you might encounter exceptions to these generic rules, for instance you might have to refer to `$m4_flags'.
m4_pattern_forbid pattern.
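A rough shell equivalent of this check can be handy for experimenting on an already generated file. It is a sketch under assumptions: the function name find_forbidden is hypothetical, tokens are taken to be maximal runs of letters, digits and underscores, and the optional second argument stands in for whatever exception mechanism is used to allow a token such as `m4_flags'.

```shell
# find_forbidden FILE [ALLOWED_ERE]: hypothetical helper printing every
# token of FILE that matches ^m4_ or ^dnl$, minus the tokens matching
# ALLOWED_ERE.  By default nothing is allowed.
find_forbidden() {
  tr -cs 'A-Za-z0-9_' '\n' < "$1" |
    grep -E '^m4_|^dnl$' |
    grep -E -v "${2:-^\$}" |
    sort -u
}

tmpfile=$(mktemp)
printf '%s\n' 'dnl leftover comment' 'm4_define([x], [y])' \
  'echo "$m4_flags"' > "$tmpfile"
find_forbidden "$tmpfile"                 # dnl, m4_define, m4_flags
find_forbidden "$tmpfile" '^m4_flags$'    # dnl, m4_define
rm -f "$tmpfile"
```

Under this tokenization `$m4_flags' yields the token `m4_flags', which is why a legitimate shell variable of that name needs an explicit exception.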
8.4 Programming in M4sh
M4sh, pronounced "mash", is aiming at producing portable Bourne shell scripts. This name was coined by Lars J. Aas, who notes that, according to the Webster's Revised Unabridged Dictionary (1913):
Mash \Mash\, n. [Akin to G. meisch, maisch, meische, maische, mash, wash, and prob. to AS. miscian to mix. See "Mix".]
- A mass of mixed ingredients reduced to a soft pulpy state by beating or pressure....
- A mixture of meal or bran and water fed to animals.
- A mess; trouble. [Obs.] --Beau. & Fl.
For the time being, it is not mature enough to be widely used.
M4sh provides portable alternatives for some common shell constructs that unfortunately are not portable in practice.
dirname command.
mkdir that lack support for the `-p' option.
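What a replacement for `mkdir -p' has to do can be sketched in plain shell. This is not the code M4sh emits; mkdir_p is a hypothetical function that creates each missing component in turn, using only mkdir without options, and the demonstration assumes mktemp is available.

```shell
# mkdir_p DIR: hypothetical stand-in for "mkdir -p DIR".
mkdir_p() {
  case $1 in
    /*) path=/ ;;     # absolute name: start from the root
    *)  path=  ;;     # relative name: start from the current directory
  esac
  old_ifs=$IFS
  IFS=/
  set -f              # no globbing while splitting
  set -- $1
  set +f
  IFS=$old_ifs
  for part
  do
    test -n "$part" || continue          # skip empty components
    path=$path$part
    test -d "$path" || mkdir "$path" || return 1
    path=$path/
  done
}

tmpdir=$(mktemp -d)
mkdir_p "$tmpdir/a/b/c" && echo created
mkdir_p "$tmpdir/a/b/c" && echo "already there"
rm -rf "$tmpdir"
```

Like `mkdir -p', the sketch succeeds when the directory already exists; unlike a production version, it does not guard against another process creating a component between the test and the mkdir.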