Node:Definition Syntax, Next:Function Example, Previous:User-defined, Up:User-defined
Definitions of functions can appear anywhere between the rules of an
awk
program. Thus, the general form of an awk
program is
extended to include sequences of rules and user-defined function
definitions.
There is no need to put the definition of a function
before all uses of the function. This is because awk
reads the
entire program before starting to execute any of it.
The definition of a function named name looks like this:
function name(parameter-list) { body-of-function }
name is the name of the function to define. A valid function
name is like a valid variable name: a sequence of letters, digits, and
underscores that doesn't start with a digit.
Within a single awk
program, any particular name can only be
used as a variable, array, or function.
parameter-list is a list of the function's arguments and local variable names, separated by commas. When the function is called, the argument names are used to hold the argument values given in the call. The local variables are initialized to the empty string. A function cannot have two parameters with the same name, nor may it have a parameter with the same name as the function itself.
The body-of-function consists of awk
statements. It is the
most important part of the definition, because it says what the function
should actually do. The argument names exist to give the body a
way to talk about the arguments; local variables exist to give the body
places to keep temporary values.
Argument names are not distinguished syntactically from local variable names. Instead, the number of arguments supplied when the function is called determines how many argument variables there are. Thus, if three argument values are given, the first three names in parameter-list are arguments and the rest are local variables.
It follows that if the number of arguments is not the same in all calls to the function, some of the names in parameter-list may be arguments on some occasions and local variables on others. Another way to think of this is that omitted arguments default to the null string.
Usually when you write a function, you know how many names you intend to use for arguments and how many you intend to use as local variables. It is conventional to place some extra space between the arguments and the local variables, in order to document how your function is supposed to be used.
During execution of the function body, the arguments and local variable
values hide, or shadow, any variables of the same names used in the
rest of the program. The shadowed variables are not accessible in the
function definition, because there is no way to name them while their
names have been taken away for the local variables. All other variables
used in the awk
program can be referenced or set normally in the
function's body.
The arguments and local variables last only as long as the function body is executing. Once the body finishes, you can once again access the variables that were shadowed while the function was running.
The function body can contain expressions that call functions. They can even call this function, either directly or by way of another function. When this happens, we say the function is recursive. The act of a function calling itself is called recursion.
In many awk
implementations, including gawk
,
the keyword function
may be
abbreviated func
. However, POSIX only specifies the use of
the keyword function
. This actually has some practical implications.
If gawk
is in POSIX-compatibility mode
(see Command-Line Options), then the following
statement does not define a function:
func foo() { a = sqrt($1) ; print a }
Instead it defines a rule that, for each record, concatenates the value
of the variable func
with the return value of the function foo
.
If the resulting string is non-null, the action is executed.
This is probably not what is desired. (awk
accepts this input as
syntactically valid, because functions may be used before they are defined
in awk
programs.)
To ensure that your awk
programs are portable, always use the
keyword function
when defining a function.