
The Basic Program/System Interface

Processes are the primitive units for allocation of system resources. Each process has its own address space and (usually) one thread of control. A process executes a program; you can have multiple processes executing the same program, but each process has its own copy of the program within its own address space and executes it independently of the other copies. Though it may have multiple threads of control within the same program and a program may be composed of multiple logically separate modules, a process always executes exactly one program.

Note that we are using a specific definition of "program" for the purposes of this manual, which corresponds to a common definition in the context of Unix systems. In popular usage, "program" enjoys a much broader definition; it can refer for example to a system's kernel, an editor macro, a complex package of software, or a discrete section of code executing within a process.

Writing the program is what this manual is all about. This chapter explains the most basic interface between your program and the system that runs, or calls, it. This includes passing of parameters (arguments and environment) from the system, requesting basic services from the system, and telling the system the program is done.

A program starts another program with the exec family of system calls. This chapter looks at program startup from the execee's point of view. To see the event from the execor's point of view, see section Executing a File.

Program Arguments

The system starts a C program by calling the function main. It is up to you to write a function named main; otherwise, you won't even be able to link your program without errors.

In ISO C you can define main either to take no arguments, or to take two arguments that represent the command line arguments to the program, like this:

int main (int argc, char *argv[])

The command line arguments are the whitespace-separated tokens given in the shell command used to invoke the program; thus, in `cat foo bar', the arguments are `foo' and `bar'. The only way a program can look at its command line arguments is via the arguments of main. If main doesn't take arguments, then you cannot get at the command line.

The value of the argc argument is the number of command line arguments. The argv argument is a vector of C strings; its elements are the individual command line argument strings. The file name of the program being run is also included in the vector as the first element; the value of argc counts this element. A null pointer always follows the last element: argv[argc] is this null pointer.

For the command `cat foo bar', argc is 3 and argv has three elements, "cat", "foo" and "bar".
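As a rough illustration of this layout, the sketch below walks an argument vector up to its terminating null pointer and checks the result against argc. The name dump_args and its out parameter are made up for this example; a real program would call such a helper from main with the argc and argv it received.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: print every element of ARGV to OUT and return
   how many elements precede the terminating null pointer.  */
int
dump_args (int argc, char *argv[], FILE *out)
{
  int n = 0;

  /* argv[argc] is a null pointer, so the vector can be walked
     without consulting argc at all.  */
  for (char **p = argv; *p != NULL; p++)
    {
      fprintf (out, "argv[%d] = %s\n", n, *p);
      n++;
    }

  /* The count found this way must agree with argc.  */
  assert (n == argc);
  return n;
}
```

For `cat foo bar', calling dump_args (argc, argv, stdout) from main would list the program name first, followed by the two arguments.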

In Unix systems you can define main a third way, using three arguments:

int main (int argc, char *argv[], char *envp[])

The first two arguments are just the same. The third argument envp gives the program's environment; it is the same as the value of environ. See section Environment Variables. POSIX.1 does not allow this three-argument form, so to be portable it is best to write main to take two arguments, and use the value of environ.
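The portable alternative can be sketched as follows, assuming only that environ is available as described above. The helper lookup_env is hypothetical and exists purely to show what the vector contains; in practice getenv is the usual interface.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* The same vector that the three-argument form of main would
   receive as envp.  */
extern char **environ;

/* Hypothetical helper: return the value of NAME in the environment,
   or a null pointer if NAME is not set.  */
const char *
lookup_env (const char *name)
{
  size_t len;

  assert (name != NULL);
  len = strlen (name);

  if (environ == NULL)
    return NULL;

  for (char **e = environ; *e != NULL; e++)
    /* Each entry has the form NAME=VALUE.  */
    if (strncmp (*e, name, len) == 0 && (*e)[len] == '=')
      return *e + len + 1;

  return NULL;
}
```

Because the helper reads environ directly, main can keep the two-argument form and still reach every environment variable.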

Program Argument Syntax Conventions

POSIX recommends these conventions for command line arguments. getopt (see section Parsing program options using getopt) and argp_parse (see section Parsing Program Options with Argp) make it easy to implement them.

Arguments are options if they begin with a hyphen delimiter (`-').
Multiple options may follow a hyphen delimiter in a single token if the options do not take arguments. Thus, `-abc' is equivalent to `-a -b -c'.
Option names are single alphanumeric characters (as for isalnum; see section Classification of Characters).
Certain options require an argument. For example, the `-o' option of the ld command requires an argument: an output file name.
An option and its argument may or may not appear as separate tokens. (In other words, the whitespace separating them is optional.) Thus, `-o foo' and `-ofoo' are equivalent.
Options typically precede other non-option arguments. The implementations of getopt and argp_parse in the GNU C library normally make it appear as if all the option arguments were specified before all the non-option arguments for the purposes of parsing, even if the user of your program intermixed option and non-option arguments. They do this by reordering the elements of the argv array. This behavior is nonstandard; if you want to suppress it, define the POSIXLY_CORRECT environment variable. See section Standard Environment Variables.
The argument `--' terminates all options; any following arguments are treated as non-option arguments, even if they begin with a hyphen.
A token consisting of a single hyphen character is interpreted as an ordinary non-option argument. By convention, it is used to specify input from or output to the standard input and output streams.
Options may be supplied in any order, or appear multiple times. The interpretation is left up to the particular application program.

GNU adds long options to these conventions. Long options consist of `--' followed by a name made of alphanumeric characters and dashes. Option names are typically one to three words long, with hyphens to separate words. Users can abbreviate the option names as long as the abbreviations are unique.

To specify an argument for a long option, write `--name=value'. This syntax enables a long option to accept an argument that is itself optional.

Eventually, the GNU system will provide completion for long option names in the shell.

Parsing Program Arguments

If the syntax for the command line arguments to your program is simple enough, you can simply pick the arguments off from argv by hand. But unless your program takes a fixed number of arguments, or all of the arguments are interpreted in the same way (as file names, for example), you are usually better off using getopt (see section Parsing program options using getopt) or argp_parse (see section Parsing Program Options with Argp) to do the parsing.

getopt is more standard (the short-option only version of it is a part of the POSIX standard), but using argp_parse is often easier, both for very simple and very complex option structures, because it does more of the dirty work for you.

Parsing program options using `getopt`

The getopt and getopt_long functions automate some of the chores involved in parsing typical Unix command line options.

Using the `getopt` function

Here are the details about how to call the getopt function. To use this facility, your program must include the header file `unistd.h'.

Variable: int opterr: If the value of this variable is nonzero, then getopt prints an error message to the standard error stream if it encounters an unknown option character or an option with a missing required argument. This is the default behavior. If you set this variable to zero, getopt does not print any messages, but it still returns the character ? to indicate an error.

Variable: int optopt: When getopt encounters an unknown option character or an option with a missing required argument, it stores that option character in this variable. You can use this for providing your own diagnostic messages.

Variable: int optind: This variable is set by getopt to the index of the next element of the argv array to be processed. Once getopt has found all of the option arguments, you can use this variable to determine where the remaining non-option arguments begin. The initial value of this variable is 1.

Variable: char * optarg: This variable is set by getopt to point at the value of the option argument, for those options that accept arguments.

Function: int getopt (int argc, char **argv, const char *options)

The getopt function gets the next option argument from the argument list specified by the argv and argc arguments. Normally these values come directly from the arguments received by main.

The options argument is a string that specifies the option characters that are valid for this program. An option character in this string can be followed by a colon (`:') to indicate that it takes a required argument. If an option character is followed by two colons (`::'), its argument is optional; this is a GNU extension.

getopt has three ways to deal with options that follow non-options argv elements. The special argument `--' forces in all cases the end of option scanning.

The default is to permute the contents of argv while scanning it so that eventually all the non-options are at the end. This allows options to be given in any order, even with programs that were not written to expect this.
If the options argument string begins with a hyphen (`-'), this is treated specially. It permits arguments that are not options to be returned as if they were associated with option character `\1'.
POSIX demands the following behaviour: The first non-option stops option processing. This mode is selected by either setting the environment variable POSIXLY_CORRECT or beginning the options argument string with a plus sign (`+').
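The difference between the `+' and `-' prefixes can be observed with a small tracing helper. This is only a sketch: trace_scan and the 'N' marker are invented for the illustration, and the code assumes the GNU implementation, where setting optind to 0 makes getopt reinitialize itself and reread the prefix of the options string.

```c
#include <assert.h>
#include <getopt.h>   /* declares getopt together with the GNU extensions */
#include <stddef.h>

/* Hypothetical helper: scan ARGV with OPTSTRING and record one
   character per getopt return value in TRACE.  A non-option that is
   returned with code 1 (the `-' mode) is recorded as 'N'.  The
   return value is the final value of optind.  */
int
trace_scan (int argc, char **argv, const char *optstring,
            char *trace, size_t size)
{
  size_t n = 0;
  int c;

  assert (trace != NULL && size > 0);

  opterr = 0;   /* Keep getopt from printing diagnostics.  */
  optind = 0;   /* GNU extension: force a full reinitialization.  */

  while ((c = getopt (argc, argv, optstring)) != -1)
    if (n + 1 < size)
      trace[n++] = (c == 1) ? 'N' : (char) c;

  trace[n] = '\0';
  return optind;
}
```

Given the arguments `-a file -b', the options string "+ab" should stop after `-a' and leave optind pointing at `file', while "-ab" should report `-a', then `file' as code 1 with optarg pointing at it, then `-b'.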

The getopt function returns the option character for the next command line option. When no more option arguments are available, it returns -1. There may still be more non-option arguments; you must compare the external variable optind against the argc parameter to check this.

If the option has an argument, getopt returns the argument by storing it in the variable optarg. You don't ordinarily need to copy the optarg string, since it is a pointer into the original argv array, not into a static area that might be overwritten.

If getopt finds an option character in argv that was not included in options, or a missing option argument, it returns `?' and sets the external variable optopt to the actual option character. If the first character of options is a colon (`:'), then getopt returns `:' instead of `?' to indicate a missing option argument. In addition, if the external variable opterr is nonzero (which is the default), getopt prints an error message.

Example of Parsing Arguments with `getopt`

Here is an example showing how getopt is typically used. The key points to notice are:

Normally, getopt is called in a loop. When getopt returns -1, indicating no more options are present, the loop terminates.
A switch statement is used to dispatch on the return value from getopt. In typical use, each case just sets a variable that is used later in the program.
A second loop is used to process the remaining non-option arguments.

#include <ctype.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

int 
main (int argc, char **argv)
{
  int aflag = 0;
  int bflag = 0;
  char *cvalue = NULL;
  int index;
  int c;

  opterr = 0;

  while ((c = getopt (argc, argv, "abc:")) != -1)
    switch (c)
      {
      case 'a':
        aflag = 1;
        break;
      case 'b':
        bflag = 1;
        break;
      case 'c':
        cvalue = optarg;
        break;
      case '?':
        if (isprint (optopt))
          fprintf (stderr, "Unknown option `-%c'.\n", optopt);
        else
          fprintf (stderr,
                   "Unknown option character `\\x%x'.\n",
                   optopt);
        return 1;
      default:
        abort ();
      }

  printf ("aflag = %d, bflag = %d, cvalue = %s\n",
          aflag, bflag, cvalue);

  for (index = optind; index < argc; index++)
    printf ("Non-option argument %s\n", argv[index]);
  return 0;
}

Here are some examples showing what this program prints with different combinations of arguments:

% testopt
aflag = 0, bflag = 0, cvalue = (null)

% testopt -a -b
aflag = 1, bflag = 1, cvalue = (null)

% testopt -ab
aflag = 1, bflag = 1, cvalue = (null)

% testopt -c foo
aflag = 0, bflag = 0, cvalue = foo

% testopt -cfoo
aflag = 0, bflag = 0, cvalue = foo

% testopt arg1
aflag = 0, bflag = 0, cvalue = (null)
Non-option argument arg1

% testopt -a arg1
aflag = 1, bflag = 0, cvalue = (null)
Non-option argument arg1

% testopt -c foo arg1
aflag = 0, bflag = 0, cvalue = foo
Non-option argument arg1

% testopt -a -- -b
aflag = 1, bflag = 0, cvalue = (null)
Non-option argument -b

% testopt -a -
aflag = 1, bflag = 0, cvalue = (null)
Non-option argument -

Parsing Long Options with `getopt_long`

To accept GNU-style long options as well as single-character options, use getopt_long instead of getopt. This function is declared in `getopt.h', not `unistd.h'. You should make every program accept long options if it uses any options, for this takes little extra work and helps beginners remember how to use the program.

Data Type: struct option

This structure describes a single long option name for the sake of getopt_long. The argument longopts must be an array of these structures, one for each long option. Terminate the array with an element containing all zeros.

The struct option structure has these fields:

const char *name: This field is the name of the option. It is a string.
int has_arg: This field says whether the option takes an argument. It is an integer, and there are three legitimate values: no_argument, required_argument and optional_argument.
int *flag
int val: These fields control how to report or act on the option when it occurs. If flag is a null pointer, then the val is a value which identifies this option. Often these values are chosen to uniquely identify particular long options. If flag is not a null pointer, it should be the address of an int variable which is the flag for this option. The value in val is the value to store in the flag to indicate that the option was seen.

Function: int getopt_long (int argc, char *const *argv, const char *shortopts, struct option *longopts, int *indexptr)

Decode options from the vector argv (whose length is argc). The argument shortopts describes the short options to accept, just as it does in getopt. The argument longopts describes the long options to accept (see above).

When getopt_long encounters a short option, it does the same thing that getopt would do: it returns the character code for the option, and stores the option's argument (if it has one) in optarg.

When getopt_long encounters a long option, it takes actions based on the flag and val fields of the definition of that option.

If flag is a null pointer, then getopt_long returns the contents of val to indicate which option it found. You should arrange distinct values in the val field for options with different meanings, so you can decode these values after getopt_long returns. If the long option is equivalent to a short option, you can use the short option's character code in val.

If flag is not a null pointer, that means this option should just set a flag in the program. The flag is a variable of type int that you define. Put the address of the flag in the flag field. Put in the val field the value you would like this option to store in the flag. In this case, getopt_long returns 0.

For any long option, getopt_long tells you the index in the array longopts of the option's definition, by storing it into *indexptr. You can get the name of the option with longopts[*indexptr].name. So you can distinguish among long options either by the values in their val fields or by their indices. You can also distinguish in this way among long options that set flags.

When a long option has an argument, getopt_long puts the argument value in the variable optarg before returning. When the option has no argument, the value in optarg is a null pointer. This is how you can tell whether an optional argument was supplied.
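The null-pointer test on optarg can be sketched like this. The `--color' option, the enumeration and the function parse_color are hypothetical; the point is only that an option declared with optional_argument leaves optarg null when no `=value' is attached.

```c
#include <assert.h>
#include <getopt.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical modes for an option `--color[=WHEN]'.  */
enum color_mode { COLOR_UNSET, COLOR_ALWAYS, COLOR_NEVER, COLOR_INVALID };

enum color_mode
parse_color (int argc, char **argv)
{
  static const struct option longopts[] =
    {
      {"color", optional_argument, NULL, 'c'},
      {0, 0, 0, 0}
    };
  enum color_mode mode = COLOR_UNSET;
  int c;

  assert (argv != NULL);
  opterr = 0;   /* Stay silent about unknown options.  */
  optind = 1;   /* Start scanning at the first argument.  */

  while ((c = getopt_long (argc, argv, "", longopts, NULL)) != -1)
    if (c == 'c')
      {
        if (optarg == NULL)
          /* Plain `--color': no value was supplied.  */
          mode = COLOR_ALWAYS;
        else if (strcmp (optarg, "always") == 0)
          mode = COLOR_ALWAYS;
        else if (strcmp (optarg, "never") == 0)
          mode = COLOR_NEVER;
        else
          mode = COLOR_INVALID;
      }

  return mode;
}
```

Note that with this declaration `--color never' is read as a bare `--color' followed by the non-option argument `never'; the value has to be written as `--color=never'.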

When getopt_long has no more options to handle, it returns -1, and leaves in the variable optind the index in argv of the next remaining argument.

Since long option names were in use before the getopt_long function was invented, there are program interfaces that require programs to recognize options like `-option value' instead of `--option value'. To enable these programs to use the GNU getopt functionality, there is one more function available.

Function: int getopt_long_only (int argc, char *const *argv, const char *shortopts, struct option *longopts, int *indexptr)

The getopt_long_only function is equivalent to the getopt_long function, but it allows the user of the application to pass long options with only `-' instead of `--'. The `--' prefix is still recognized. When a `-' is seen, getopt_long_only first checks whether the parameter names a long option, rather than looking through the short options. If it does not, the parameter is parsed as a short option.

Assuming getopt_long_only is used, starting an application with

  app -foo

makes getopt_long_only first look for a long option named `foo'. If this is not found, the short options `f', `o', and again `o' are recognized.
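A small sketch of this lookup order follows. It assumes a hypothetical program with one long option `foo' and two unrelated short options `a' and `b'; the structure seen and the function scan_long_only are invented for the example.

```c
#include <assert.h>
#include <getopt.h>
#include <stddef.h>

/* Hypothetical counters for the options recognized below.  */
struct seen
{
  int long_foo;
  int a;
  int b;
};

void
scan_long_only (int argc, char **argv, struct seen *out)
{
  static const struct option longopts[] =
    {
      {"foo", no_argument, NULL, 'F'},
      {0, 0, 0, 0}
    };
  int c;

  assert (out != NULL);
  out->long_foo = out->a = out->b = 0;
  opterr = 0;   /* Stay silent about unknown options.  */
  optind = 1;   /* Start scanning at the first argument.  */

  while ((c = getopt_long_only (argc, argv, "ab", longopts, NULL)) != -1)
    switch (c)
      {
      case 'F':   /* `-foo' or `--foo' matched the long option.  */
        out->long_foo++;
        break;
      case 'a':
        out->a++;
        break;
      case 'b':
        out->b++;
        break;
      default:
        break;
      }
}
```

With these definitions `-foo' and `--foo' both select the long option, while `-ab' matches no long option name and is therefore read as the short options `a' and `b'.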

Example of Parsing Long Options with `getopt_long`

#include <stdio.h>
#include <stdlib.h>
#include <getopt.h>

/* Flag set by `--verbose'. */
static int verbose_flag;

int
main (int argc, char **argv)
{
  int c;

  while (1)
    {
      static struct option long_options[] =
        {
          /* These options set a flag. */
          {"verbose", no_argument,       &verbose_flag, 1},
          {"brief",   no_argument,       &verbose_flag, 0},
          /* These options don't set a flag.
             We distinguish them by their val fields. */
          {"add",     no_argument,       0, 'a'},
          {"append",  no_argument,       0, 'b'},
          {"delete",  required_argument, 0, 'd'},
          {"create",  required_argument, 0, 'c'},
          {"file",    required_argument, 0, 'f'},
          {0, 0, 0, 0}
        };
      /* getopt_long stores the option index here. */
      int option_index = 0;

      c = getopt_long (argc, argv, "abc:d:f:",
                       long_options, &option_index);

      /* Detect the end of the options. */
      if (c == -1)
        break;

      switch (c)
        {
        case 0:
          /* If this option set a flag, do nothing else now. */
          if (long_options[option_index].flag != 0)
            break;
          printf ("option %s", long_options[option_index].name);
          if (optarg)
            printf (" with arg %s", optarg);
          printf ("\n");
          break;

        case 'a':
          puts ("option -a");
          break;

        case 'b':
          puts ("option -b");
          break;

        case 'c':
          printf ("option -c with value `%s'\n", optarg);
          break;

        case 'd':
          printf ("option -d with value `%s'\n", optarg);
          break;

        case 'f':
          printf ("option -f with value `%s'\n", optarg);
          break;

        case '?':
          /* getopt_long already printed an error message. */
          break;

        default:
          abort ();
        }
    }

  /* Instead of reporting `--verbose'
     and `--brief' as they are encountered,
     we report the final status resulting from them. */
  if (verbose_flag)
    puts ("verbose flag is set");

  /* Print any remaining command line arguments (not options). */
  if (optind < argc)
    {
      printf ("non-option ARGV-elements: ");
      while (optind < argc)
        printf ("%s ", argv[optind++]);
      putchar ('\n');
    }

  exit (0);
}

Parsing Program Options with Argp

Argp is an interface for parsing Unix-style argument vectors (see section Program Arguments).

Unlike the more common getopt interface, it provides many related convenience features in addition to parsing options, such as automatically producing output in response to `--help' and `--version' options (as defined by the GNU coding standards). Doing these things in argp results in a more consistent look for programs that use it, and makes it less likely that implementors will neglect to implement them or keep them up-to-date.

Argp also provides the ability to merge several independently defined option parsers into one, mediating conflicts between them, and making the result appear seamless. A library can export an argp option parser, which programs can easily use in conjunction with their own option parser. This results in less work for user programs (indeed, some may use only argument parsers exported by libraries, and have no options of their own), and more consistent option-parsing for the abstractions implemented by the library.

The header file `<argp.h>' should be included to use argp.

The `argp_parse` Function

The main interface to argp is the argp_parse function; often, a call to argp_parse is the only argument-parsing code needed in main (see section Program Arguments).

Function: error_t argp_parse (const struct argp *argp, int argc, char **argv, unsigned flags, int *arg_index, void *input)

The argp_parse function parses the arguments in argv, of length argc, using the argp parser argp (see section Specifying Argp Parsers); a value of zero is the same as a struct argp containing all zeros. flags is a set of flag bits that modify the parsing behavior (see section Flags for argp_parse). input is passed through to the argp parser argp, and has meaning defined by it; a typical usage is to pass a pointer to a structure which can be used for specifying parameters to the parser and passing back results from it.

Unless the ARGP_NO_EXIT or ARGP_NO_HELP flags are included in flags, calling argp_parse may result in the program exiting--for instance when an unknown option is encountered. See section Program Termination.

If arg_index is non-null, the index of the first unparsed option in argv is returned in it.

The return value is zero for successful parsing, or an error code (see section Error Codes) if an error was detected. Different argp parsers may return arbitrary error codes, but standard ones are ENOMEM if a memory allocation error occurred, or EINVAL if an unknown option or option argument was encountered.

Argp Global Variables

These variables make it very easy for every user program to implement the `--version' option and provide a bug-reporting address in the `--help' output (which is implemented by argp regardless).

Variable: const char * argp_program_version: If defined or set by the user program to a non-zero value, then a `--version' option is added when parsing with argp_parse (unless the ARGP_NO_HELP flag is used), which will print this string followed by a newline and exit (unless the ARGP_NO_EXIT flag is used).

Variable: const char * argp_program_bug_address: If defined or set by the user program to a non-zero value, argp_program_bug_address should point to a string that is the bug-reporting address for the program. It will be printed at the end of the standard output for the `--help' option, embedded in a sentence that says something like `Report bugs to address.'.

Variable: argp_program_version_hook

If defined or set by the user program to a non-zero value, then a `--version' option is added when parsing with argp_parse (unless the ARGP_NO_HELP flag is used), which calls this function to print the version, and then exits with a status of 0 (unless the ARGP_NO_EXIT flag is used). It should point to a function with the following type signature:

void print_version (FILE *stream, struct argp_state *state)

See section Argp Parsing State, for an explanation of state.

This variable takes precedence over argp_program_version, and is useful if a program has version information that cannot be easily specified as a simple string.

Variable: error_t argp_err_exit_status: The exit status that argp will use when exiting due to a parsing error. If not defined or set by the user program, this defaults to EX_USAGE from `<sysexits.h>'.
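As a rough sketch of how these variables are used, the following defines a version string and a bug-reporting address and then runs argp with a parser that has no options of its own. The strings, the name demo_argp and the function parse_standard_options are placeholders; ARGP_NO_EXIT is passed only so that the function returns to its caller after `--version' or `--help' instead of exiting.

```c
#include <argp.h>
#include <assert.h>
#include <stddef.h>

/* Argp finds these two global variables by name.  The values are
   placeholders for this example.  */
const char *argp_program_version = "demo 1.0";
const char *argp_program_bug_address = "<bug-demo@example.org>";

/* A parser with no options or parser function of its own.  */
static const struct argp demo_argp = { 0 };

/* Hypothetical wrapper: handle only the options argp provides.  */
error_t
parse_standard_options (int argc, char **argv)
{
  assert (argv != NULL);

  /* ARGP_NO_EXIT keeps argp from calling exit after it has printed
     the version or help text.  */
  return argp_parse (&demo_argp, argc, argv, ARGP_NO_EXIT, NULL, NULL);
}
```

Running such a program with `--version' should print the version string followed by a newline; without ARGP_NO_EXIT it would then exit.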

Specifying Argp Parsers

The first argument to the argp_parse function is a pointer to a struct argp, which is known as an argp parser:

Data Type: struct argp

This structure specifies how to parse a given set of options and arguments, perhaps in conjunction with other argp parsers. It has the following fields:

const struct argp_option *options: A pointer to a vector of argp_option structures specifying which options this argp parser understands; it may be zero if there are no options at all. See section Specifying Options in an Argp Parser.
argp_parser_t parser: A pointer to a function that defines actions for this parser; it is called for each option parsed, and at other well-defined points in the parsing process. A value of zero is the same as a pointer to a function that always returns ARGP_ERR_UNKNOWN. See section Argp Parser Functions.
const char *args_doc: If non-zero, a string describing what non-option arguments are wanted by this parser; it is only used to print the `Usage:' message. If it contains newlines, the strings separated by them are considered alternative usage patterns, and printed on separate lines (lines after the first are prefixed by ` or: ' instead of `Usage:').
const char *doc: If non-zero, a string containing extra text to be printed before and after the options in a long help message, with the two sections separated by a vertical tab ('\v', '\013') character. By convention, the documentation before the options is just a short string saying what the program does, and that afterwards is longer, describing the behavior in more detail.
const struct argp_child *children: A pointer to a vector of struct argp_child structures specifying additional argp parsers that should be combined with this one. See section Combining Multiple Argp Parsers.
char *(*help_filter)(int key, const char *text, void *input): If non-zero, a pointer to a function to filter the output of help messages. See section Customizing Argp Help Output.
const char *argp_domain: If non-zero, the strings used in the argp library are translated using the domain described by this string. Otherwise the currently installed default domain is used.

The options, parser, args_doc, and doc fields are usually all that are needed. If an argp parser is defined as an initialized C variable, only the used fields need be specified in the initializer--the rest will default to zero due to the way C structure initialization works (this fact is exploited for most argp structures, grouping the most-used fields near the beginning, so that unused fields can simply be left unspecified).

Specifying Options in an Argp Parser

The options field in a struct argp points to a vector of struct argp_option structures, each of which specifies an option that the argp parser supports (actually, sometimes multiple entries may be used for a single option if it has many names). It should be terminated by an entry with zero in all fields (note that when using an initialized C array for options, writing { 0 } is enough to achieve this).

Data Type: struct argp_option

This structure specifies a single option that an argp parser understands, and how to parse and document it. It has the following fields:

const char *name: The long name for this option, corresponding to the long option `--name'; this field can be zero if this option only has a short name. To specify multiple names for an option, additional entries may follow this one, with the OPTION_ALIAS flag set (see section Flags for Argp Options).
int key: The integer key that is provided to the argp parser's parsing function when this option is being parsed. Also, if key has a value that is a printable ASCII character (i.e., isascii (key) is true), it also specifies a short option `-char', where char is the ASCII character with the code key.
const char *arg: If non-zero, this is the name of an argument associated with this option, which must be provided (e.g., with the `--name=value' or `-char value' syntaxes) unless the OPTION_ARG_OPTIONAL flag (see section Flags for Argp Options) is set, in which case it may be provided.
int flags: Flags associated with this option (some of which are referred to above). See section Flags for Argp Options.
const char *doc: A documentation string for this option, for printing in help messages. If both the name and key fields are zero, this string will be printed out-dented from the normal option column, making it useful as a group header (it will be the first thing printed in its group); in this usage, it's conventional to end the string with a `:' character.
int group: The group this option is in. In a long help message, options are sorted alphabetically within each group, and the groups presented in the order 0, 1, 2, ..., n, -m, ..., -2, -1. Every entry in an options array with this field 0 will inherit the group number of the previous entry, or zero if it's the first one, unless it's a group header (name and key fields both zero), in which case, the previous entry + 1 is the default. Automagic options such as `--help' are put into group -1. Note that because of C structure initialization rules, this field often need not be specified, because 0 is the right value.

Flags for Argp Options

The following flags may be or'd together in the flags field of a struct argp_option, and control various aspects of how that option is parsed or displayed in help messages:

OPTION_ARG_OPTIONAL: The argument associated with this option is optional.
OPTION_HIDDEN: This option isn't displayed in any help messages.
OPTION_ALIAS: This option is an alias for the closest previous non-alias option. This means that it will be displayed in the same help entry, and will inherit fields other than name and key from the aliased option.
OPTION_DOC: This option isn't actually an option (and so should be ignored by the actual option parser), but rather an arbitrary piece of documentation that should be displayed in much the same manner as the options (known as a documentation option). If this flag is set, then the option name field is displayed unmodified (e.g., no `--' prefix is added) at the left-margin (where a short option would normally be displayed), and the documentation string in the normal place. For purposes of sorting, any leading whitespace and punctuation is ignored, except that if the first non-whitespace character is not `-', this entry is displayed after all options (and OPTION_DOC entries with a leading `-') in the same group.
OPTION_NO_USAGE: This option shouldn't be included in `long' usage messages (but is still included in help messages). This is mainly intended for options that are completely documented in an argp's args_doc field (see section Specifying Argp Parsers), in which case including the option in the generic usage list would be redundant. For instance, if args_doc is "FOO BAR\n-x BLAH", and the `-x' option's purpose is to distinguish these two cases, `-x' should probably be marked OPTION_NO_USAGE.
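The option vector below sketches how a group header, OPTION_ALIAS and OPTION_ARG_OPTIONAL might be combined. The option names, the structure flags_seen and the functions are all made up for this illustration. Since an alias entry keeps its own name and key, the parser function handles the alias key alongside the key of the aliased option.

```c
#include <argp.h>
#include <assert.h>
#include <stddef.h>

/* Hypothetical record of what the parser saw.  */
struct flags_seen
{
  int quiet;
  int repeat;
  const char *repeat_arg;   /* Null if no COUNT was given.  */
};

static const struct argp_option flag_options[] =
  {
    /* Name and key both zero: a group header.  */
    {NULL, 0, NULL, 0, "Output control:", 0},
    {"quiet", 'q', NULL, 0, "Don't produce any output", 0},
    /* Shown in the same help entry as `--quiet'.  */
    {"silent", 's', NULL, OPTION_ALIAS, NULL, 0},
    {"repeat", 'r', "COUNT", OPTION_ARG_OPTIONAL,
     "Repeat the output COUNT times", 0},
    {0}
  };

static error_t
parse_flag (int key, char *arg, struct argp_state *state)
{
  struct flags_seen *seen = state->input;

  switch (key)
    {
    case 'q':
    case 's':   /* The alias arrives with its own key.  */
      seen->quiet = 1;
      break;
    case 'r':
      seen->repeat = 1;
      seen->repeat_arg = arg;   /* Null when COUNT was omitted.  */
      break;
    default:
      return ARGP_ERR_UNKNOWN;
    }
  return 0;
}

static const struct argp flag_argp = { flag_options, parse_flag };

error_t
parse_flags (int argc, char **argv, struct flags_seen *seen)
{
  assert (seen != NULL);
  seen->quiet = 0;
  seen->repeat = 0;
  seen->repeat_arg = NULL;
  return argp_parse (&flag_argp, argc, argv, ARGP_NO_EXIT, NULL, seen);
}
```

With this vector, `--silent' and `--quiet' have the same effect, and `--repeat' may be given with or without `=COUNT'.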

Argp Parser Functions

The function pointed to by the parser field in a struct argp (see section Specifying Argp Parsers) defines what actions take place in response to each option or argument that is parsed, and is also used as a hook, to allow a parser to do something at certain other points during parsing.

Argp parser functions have the following type signature:

error_t parser (int key, char *arg, struct argp_state *state)

where the arguments are as follows:

key: For each option that is parsed, parser is called with a value of key from that option's key field in the option vector (see section Specifying Options in an Argp Parser). parser is also called at other times with special reserved keys, such as ARGP_KEY_ARG for non-option arguments. See section Special Keys for Argp Parser Functions.
arg: If key is an option, arg is the value given for it, or zero if no value was specified. Only options that have a non-zero arg field can ever have a value, and those must always have a value, unless the OPTION_ARG_OPTIONAL flag was specified (if the input being parsed specifies a value for an option that doesn't allow one, an error results before parser ever gets called). If key is ARGP_KEY_ARG, arg is a non-option argument; other special keys always have a zero arg.
state: state points to a struct argp_state, containing useful information about the current parsing state for use by parser. See section Argp Parsing State.

When parser is called, it should perform whatever action is appropriate for key, and return either 0 for success, ARGP_ERR_UNKNOWN if the value of key is not handled by this parser function, or a Unix error code if a real error occurred (see section Error Codes).

Macro: int ARGP_ERR_UNKNOWN: Argp parser functions should return ARGP_ERR_UNKNOWN for any key value they do not recognize, or for non-option arguments (key == ARGP_KEY_ARG) that they do not wish to handle.

A typical parser function uses a switch statement on key:

error_t
parse_opt (int key, char *arg, struct argp_state *state)
{
  switch (key)
    {
    case option_key:
      action
      break;
    ...
    default:
      return ARGP_ERR_UNKNOWN;
    }
  return 0;
}

Special Keys for Argp Parser Functions

In addition to key values corresponding to user options, the key argument to argp parser functions may have a number of other special values (arg and state refer to parser function arguments; see section Argp Parser Functions):

ARGP_KEY_ARG

This is not an option at all, but rather a command line argument, whose value is pointed to by arg. When there are multiple parser functions (due to argp parsers being combined), it's impossible to know which one wants to handle an argument, so each is called in turn, until one returns 0 or an error other than ARGP_ERR_UNKNOWN; if an argument is handled by no one, argp_parse immediately returns success, without parsing any more arguments. Once a parser function returns success for this key, that fact is recorded, and the ARGP_KEY_NO_ARGS case won't be used. However, if while processing the argument, a parser function decrements the next field of its state argument, the argument won't be considered processed; this is to allow you to actually modify the argument (perhaps into an option), and have it processed again.

ARGP_KEY_ARGS

If a parser function returns ARGP_ERR_UNKNOWN for ARGP_KEY_ARG, it is immediately called again with the key ARGP_KEY_ARGS, which has a similar meaning, but is slightly more convenient for consuming all remaining arguments. arg is 0, and the tail of the argument vector may be found at state->argv + state->next. If success is returned for this key, and state->next is unchanged, then all remaining arguments are considered to have been consumed; otherwise, the amount by which state->next has been adjusted indicates how many were used. For instance, here's an example that uses both, for different args:

...
case ARGP_KEY_ARG:
  if (state->arg_num == 0)
    /* First argument */
    first_arg = arg;
  else
    /* Let the next case parse it.  */
    return ARGP_ERR_UNKNOWN;
  break;
case ARGP_KEY_ARGS:
  remaining_args = state->argv + state->next;
  num_remaining_args = state->argc - state->next;
  break;
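
As a rough, self-contained sketch of the fragment above, a parser might keep the first non-option argument for itself and collect the remainder through ARGP_KEY_ARGS. The names split_args, split_parser, split_argp, and parse_split are hypothetical and chosen only for illustration; they are not part of argp.

```c
#include <argp.h>
#include <stddef.h>

/* Hypothetical result structure, filled in by split_parser.  */
struct split_args
{
  char *first;      /* First non-option argument, or NULL.  */
  char **rest;      /* Remaining arguments (points into argv), or NULL.  */
  int rest_count;   /* Number of entries in rest.  */
};

static error_t
split_parser (int key, char *arg, struct argp_state *state)
{
  struct split_args *out = state->input;

  switch (key)
    {
    case ARGP_KEY_ARG:
      if (state->arg_num != 0)
        /* Decline; argp then retries with ARGP_KEY_ARGS.  */
        return ARGP_ERR_UNKNOWN;
      out->first = arg;
      break;

    case ARGP_KEY_ARGS:
      /* state->next is left unchanged, so argp treats every
         remaining argument as consumed.  */
      out->rest = state->argv + state->next;
      out->rest_count = state->argc - state->next;
      break;

    default:
      return ARGP_ERR_UNKNOWN;
    }
  return 0;
}

static const struct argp split_argp =
  { 0, split_parser, "FIRST [REST...]", 0 };

/* Parse argv into OUT; returns whatever argp_parse returns.  */
static error_t
parse_split (int argc, char **argv, struct split_args *out)
{
  out->first = NULL;
  out->rest = NULL;
  out->rest_count = 0;
  return argp_parse (&split_argp, argc, argv, ARGP_NO_EXIT, 0, out);
}
```

The rest pointer aliases the caller's argv rather than copying it, so it stays valid only as long as argv does.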

ARGP_KEY_END

There are no more command line arguments at all. For this key the parser functions are called in a different order (children first), which allows each parser to clean up its state for the parent.

ARGP_KEY_NO_ARGS

Because it's common to want to do some special processing if there aren't any non-option args, parser functions are called with this key if they didn't successfully process any non-option arguments. This call is made just before ARGP_KEY_END (where more general validity checks on previously parsed arguments can take place).

ARGP_KEY_INIT

Passed in before any parsing is done. Afterwards, the values of each element of the child_input field of state, if any, are copied to each child's state to be the initial value of the input when their parsers are called.

ARGP_KEY_SUCCESS

Passed in when parsing has successfully been completed (even if there are still arguments remaining).

ARGP_KEY_ERROR

Passed in if an error has occurred, and parsing terminated (in which case a call with a key of ARGP_KEY_SUCCESS is never made).

ARGP_KEY_FINI

The final key ever seen by any parser (even after ARGP_KEY_SUCCESS and ARGP_KEY_ERROR). Any resources allocated by ARGP_KEY_INIT may be freed here (although sometimes certain resources allocated there are to be returned to the caller after a successful parse; in that case, those particular resources can be freed in the ARGP_KEY_ERROR case).

In all cases, ARGP_KEY_INIT is the first key seen by parser functions, and ARGP_KEY_FINI the last (unless an error was returned by the parser for ARGP_KEY_INIT). Other keys can occur in one of the following orders (opt refers to an arbitrary option key):

opt... ARGP_KEY_NO_ARGS ARGP_KEY_END ARGP_KEY_SUCCESS: The arguments being parsed contained no non-option arguments at all.
( opt | ARGP_KEY_ARG )... ARGP_KEY_END ARGP_KEY_SUCCESS: All non-option arguments were successfully handled by a parser function (there may be multiple parser functions if multiple argp parsers were combined).
( opt | ARGP_KEY_ARG )... ARGP_KEY_SUCCESS: Some non-option argument was unrecognized. This occurs when every parser function returns ARGP_ERR_UNKNOWN for an argument, in which case parsing stops at that argument if a non-null arg_index was passed to argp_parse; otherwise an error occurs.

In all cases, if a non-null value for arg_index was passed to argp_parse, the index of the first unparsed command-line argument is passed back in it.
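
The following is a minimal sketch of how arg_index might be used, assuming a program that wants argp to handle only its options and to process the remaining arguments itself. The names lead_options, lead_parser, lead_argp, and parse_leading_options are hypothetical.

```c
#include <argp.h>

static const struct argp_option lead_options[] = {
  { "verbose", 'v', 0, 0, "Produce verbose output" },
  { 0 }
};

static error_t
lead_parser (int key, char *arg, struct argp_state *state)
{
  int *verbose = state->input;
  (void) arg;

  switch (key)
    {
    case 'v':
      *verbose = 1;
      break;
    default:
      /* This includes ARGP_KEY_ARG, so non-option arguments
         are left unparsed.  */
      return ARGP_ERR_UNKNOWN;
    }
  return 0;
}

static const struct argp lead_argp = { lead_options, lead_parser, 0, 0 };

/* Parse the options in argv.  On success, *first_unparsed is the
   index of the first argument that no parser accepted.  */
static error_t
parse_leading_options (int argc, char **argv,
                       int *verbose, int *first_unparsed)
{
  return argp_parse (&lead_argp, argc, argv, ARGP_NO_EXIT,
                     first_unparsed, verbose);
}
```

After a successful call, the caller can walk argv from *first_unparsed up to argc to handle the arguments that argp left alone.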

If an error occurs (either detected by argp, or because a parser function returned an error value), then each parser is called with ARGP_KEY_ERROR, and no further calls are made except the final call with ARGP_KEY_FINI.
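
One way to observe these orderings is a parser that simply records each key it receives. The sketch below assumes a single parser with no options; key_trace, trace_parser, trace_argp, and trace_keys are hypothetical names used only for illustration.

```c
#include <argp.h>
#include <stddef.h>

#define MAX_KEYS 32

/* Hypothetical trace buffer: records every key the parser sees.  */
struct key_trace
{
  int keys[MAX_KEYS];
  size_t count;
};

static error_t
trace_parser (int key, char *arg, struct argp_state *state)
{
  struct key_trace *trace = state->input;
  (void) arg;

  if (trace->count < MAX_KEYS)
    trace->keys[trace->count++] = key;

  /* Accept every key so that parsing runs to completion.  */
  return 0;
}

static const struct argp trace_argp = { 0, trace_parser, 0, 0 };

/* Parse argv, filling TRACE with the sequence of keys seen.  */
static error_t
trace_keys (int argc, char **argv, struct key_trace *trace)
{
  trace->count = 0;
  return argp_parse (&trace_argp, argc, argv, ARGP_NO_EXIT, 0, trace);
}
```

With no non-option arguments, the recorded sequence should correspond to the first ordering listed above, bracketed by ARGP_KEY_INIT and ARGP_KEY_FINI.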

Functions For Use in Argp Parsers

Argp provides a number of functions for the user of argp parser functions (see section Argp Parser Functions), mostly for producing error messages. These take as their first argument the state argument to the parser function (see section Argp Parsing State).

Function: void argp_usage (const struct argp_state *state): Output the standard usage message for the argp parser referred to by state to state->err_stream and terminate the program with exit (argp_err_exit_status) (see section Argp Global Variables).

Function: void argp_error (const struct argp_state *state, const char *fmt, ...): Print the printf format string fmt and following args, preceded by the program name and `:', and followed by a `Try ... --help' message, and terminate the program with an exit status of argp_err_exit_status (see section Argp Global Variables).

Function: void argp_failure (const struct argp_state *state, int status, int errnum, const char *fmt, ...)

Similarly to the standard GNU error-reporting function error, print the printf format string fmt and following args, preceded by the program name and `:', and followed by the standard Unix error text for errnum if it is non-zero; then if status is non-zero, terminate the program with that as its exit status.

The difference between this function and argp_error is that argp_error is for parsing errors, whereas argp_failure is for other problems that occur during parsing but don't reflect a syntactic problem with the input--such as illegal values for options, bad phase of the moon, etc.

Function: void argp_state_help (const struct argp_state *state, FILE *stream, unsigned flags): Output a help message for the argp parser referred to by state to stream. The flags argument determines what sort of help message is produced. See section Flags for the argp_help Function.

Error output is sent to state->err_stream, and the program name printed is state->name.

The output or program termination behavior of these functions may be suppressed if the ARGP_NO_ERRS or ARGP_NO_EXIT flags, respectively, were passed to argp_parse. See section Flags for argp_parse.

This behavior is useful if an argp parser is exported for use by other programs (e.g., by a library), and may be used in a context where it is not desirable to terminate the program in response to parsing errors. In argp parsers intended for such general use, calls to any of these functions should be followed by code that returns an appropriate error code for the case where the program doesn't terminate; for example:

if (bad argument syntax)
  {
     argp_usage (state);
     return EINVAL;
  }

If it's known that a parser function will only be used when ARGP_NO_EXIT is not set, the return may be omitted.
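
A slightly fuller sketch of this pattern, assuming a hypothetical `--count' option whose value must be a positive integer, could look like the following. It reports the bad value with argp_failure and then returns EINVAL in case the call did not terminate the program. The names count_options, count_parser, count_argp, and parse_count are illustrative only.

```c
#include <argp.h>
#include <errno.h>
#include <stdlib.h>

static const struct argp_option count_options[] = {
  { "count", 'c', "N", 0, "Repeat N times (N must be positive)" },
  { 0 }
};

static error_t
count_parser (int key, char *arg, struct argp_state *state)
{
  long *count = state->input;
  char *end;

  switch (key)
    {
    case 'c':
      errno = 0;
      *count = strtol (arg, &end, 10);
      if (errno != 0 || end == arg || *end != '\0' || *count <= 0)
        {
          /* Exits with status 1 unless ARGP_NO_EXIT is in effect.  */
          argp_failure (state, 1, 0, "invalid count: %s", arg);
          /* Reached only when the program was not terminated.  */
          return EINVAL;
        }
      break;

    default:
      return ARGP_ERR_UNKNOWN;
    }
  return 0;
}

static const struct argp count_argp = { count_options, count_parser, 0, 0 };

/* Library-style entry point: never prints and never exits.  */
static error_t
parse_count (int argc, char **argv, long *count)
{
  return argp_parse (&count_argp, argc, argv, ARGP_SILENT, 0, count);
}
```

Because parse_count passes ARGP_SILENT, a caller receives EINVAL for a malformed count instead of having the process exit.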

Argp Parsing State

The third argument to argp parser functions (see section Argp Parser Functions) is a pointer to a struct argp_state, which contains information about the state of the option parsing.

Data Type: struct argp_state

This structure has the following fields, which may be modified as noted:

const struct argp *const root_argp: The top level argp parser being parsed. Note that this is often not the same struct argp passed into argp_parse by the invoking program (see section Parsing Program Options with Argp), but instead an internal argp parser that contains options implemented by argp_parse itself (such as `--help').
int argc
char **argv: The argument vector being parsed. May be modified.
int next: The index in argv of the next argument to be parsed. May be modified. One way to consume all remaining arguments in the input is to set state->next = state->argc (perhaps after recording the value of the next field to find the consumed arguments). Also, you can cause the current option to be re-parsed by decrementing this field, and then modifying state->argv[state->next] to be the option that should be reexamined.
unsigned flags: The flags supplied to argp_parse. May be modified, although some flags may only take effect when argp_parse is first invoked. See section Flags for argp_parse.
unsigned arg_num: While calling a parsing function with the key argument ARGP_KEY_ARG, this is the number of the current arg, starting at 0, and incremented after each such call returns. At all other times, this is the number of such arguments that have been processed.
int quoted: If non-zero, the index in argv of the first argument following a special `--' argument (which prevents anything following being interpreted as an option). Only set once argument parsing has proceeded past this point.
void *input: An arbitrary pointer passed in from the caller of argp_parse, in the input argument.
void **child_inputs: Values to pass to child parsers. This vector will be the same length as the number of children in the current parser, and each child parser will be given the value of state->child_inputs[i] as its state->input field, where i is the index of the child in this parser's children field. See section Combining Multiple Argp Parsers.
void *hook: For the parser function's use. Initialized to 0, but otherwise ignored by argp.
char *name: The name used when printing messages. This is initialized to argv[0], or program_invocation_name if that is unavailable.
FILE *err_stream
FILE *out_stream: Stdio streams used when argp prints something; error messages are printed to err_stream, and all other output (such as `--help' output) to out_stream. These are initialized to stderr and stdout respectively (see section Standard Streams).
void *pstate: Private, for use by the argp implementation.
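
As an illustration of modifying the next field, here is a minimal sketch of a parser that records the first non-option argument and then consumes everything after it by setting state->next to state->argc. The names tail_args, tail_parser, tail_argp, and parse_tail are hypothetical.

```c
#include <argp.h>
#include <stddef.h>

/* Hypothetical result structure, filled in by tail_parser.  */
struct tail_args
{
  char *first;      /* First non-option argument, or NULL.  */
  char **rest;      /* Arguments after the first (points into argv).  */
  int rest_count;   /* Number of entries in rest.  */
};

static error_t
tail_parser (int key, char *arg, struct argp_state *state)
{
  struct tail_args *out = state->input;

  switch (key)
    {
    case ARGP_KEY_ARG:
      out->first = arg;
      /* Record where the remaining arguments start, then mark
         them all as consumed so no further calls are made
         for them.  */
      out->rest = &state->argv[state->next];
      out->rest_count = state->argc - state->next;
      state->next = state->argc;
      break;

    default:
      return ARGP_ERR_UNKNOWN;
    }
  return 0;
}

static const struct argp tail_argp = { 0, tail_parser, "FIRST [REST...]", 0 };

/* Parse argv into OUT; returns whatever argp_parse returns.  */
static error_t
parse_tail (int argc, char **argv, struct tail_args *out)
{
  out->first = NULL;
  out->rest = NULL;
  out->rest_count = 0;
  return argp_parse (&tail_argp, argc, argv, ARGP_NO_EXIT, 0, out);
}
```

Since the remaining arguments are consumed in the first ARGP_KEY_ARG call, this parser is called with that key at most once per parse.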

Combining Multiple Argp Parsers

The children field in a struct argp allows other argp parsers to be combined with the referencing one to parse a single set of arguments. It should point to a vector of struct argp_child, terminated by an entry having a value of zero in the argp field.

Where conflicts between combined parsers arise (for instance, if two specify an option with the same name), they are resolved in favor of the parent argp parsers, or earlier argp parsers in the list of children.

Data Type: struct argp_child

An entry in the list of subsidiary argp parsers pointed to by the children field in a struct argp. The fields are as follows:

const struct argp *argp: The child argp parser, or zero to end the list.
int flags: Flags for this child.
const char *header: If non-zero, an optional header to be printed in help output before the child options. As a side-effect, a non-zero value forces the child options to be grouped together; to achieve this effect without actually printing a header string, use a value of "". As with header strings specified in an option entry, the value conventionally has `:' as the last character. See section Specifying Options in an Argp Parser.
int group: Where to group the child options relative to the other (`consolidated') options in the parent argp parser. The values are the same as the group field in struct argp_option (see section Specifying Options in an Argp Parser), but all child-groupings follow parent options at a particular group level. If both this field and header are zero, then the child's options aren't grouped together at all, but rather merged with the parent options (merging the child's grouping levels with the parent's).
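
The sketch below shows one possible way to combine a parent parser with a single child, assuming the parent hands the child its own result structure through child_inputs when it sees ARGP_KEY_INIT. All of the names (child_result, parent_result, child_parser, parent_parser, combo_children, parse_combined, and so on) are hypothetical.

```c
#include <argp.h>

struct child_result { int debug; };
struct parent_result { int verbose; struct child_result child; };

static const struct argp_option child_options[] = {
  { "debug", 'd', 0, 0, "Enable debugging output" },
  { 0 }
};

static error_t
child_parser (int key, char *arg, struct argp_state *state)
{
  /* For a child, input is the value the parent stored in
     its child_inputs vector.  */
  struct child_result *res = state->input;
  (void) arg;

  switch (key)
    {
    case 'd':
      res->debug = 1;
      break;
    default:
      return ARGP_ERR_UNKNOWN;
    }
  return 0;
}

static const struct argp child_argp = { child_options, child_parser, 0, 0 };

/* The list of children, terminated by an entry with a zero argp.  */
static const struct argp_child combo_children[] = {
  { &child_argp, 0, "Debugging options:", 0 },
  { 0 }
};

static const struct argp_option parent_options[] = {
  { "verbose", 'v', 0, 0, "Produce verbose output" },
  { 0 }
};

static error_t
parent_parser (int key, char *arg, struct argp_state *state)
{
  struct parent_result *res = state->input;
  (void) arg;

  switch (key)
    {
    case ARGP_KEY_INIT:
      /* Give the first (and only) child its own input.  */
      state->child_inputs[0] = &res->child;
      break;
    case 'v':
      res->verbose = 1;
      break;
    default:
      return ARGP_ERR_UNKNOWN;
    }
  return 0;
}

static const struct argp parent_argp =
  { parent_options, parent_parser, 0, 0, combo_children };

static error_t
parse_combined (int argc, char **argv, struct parent_result *res)
{
  res->verbose = 0;
  res->child.debug = 0;
  return argp_parse (&parent_argp, argc, argv, ARGP_NO_EXIT, 0, res);
}
```

Keeping the child's state in its own structure means the child parser can be reused under a different parent without knowing about parent_result.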

Flags for argp_parse

The default behavior of argp_parse is designed to be convenient for the most common case of parsing program command line arguments. To modify these defaults, the following flags may be or'd together in the flags argument to argp_parse:

ARGP_PARSE_ARGV0: Don't ignore the first element of the argv argument to argp_parse. Normally (and always unless ARGP_NO_ERRS is set) the first element of the argument vector is skipped for option parsing purposes, as it corresponds to the program name in a command line.
ARGP_NO_ERRS: Don't print error messages for unknown options to stderr; unless this flag is set, ARGP_PARSE_ARGV0 is ignored, as argv[0] is used as the program name in the error messages. This flag implies ARGP_NO_EXIT (on the assumption that silent exiting upon errors is bad behaviour).
ARGP_NO_ARGS: Don't parse any non-option args. Normally non-option args are parsed by calling the parse functions with a key of ARGP_KEY_ARG, and the actual arg as the value. This flag needn't normally be set, as the normal behavior is to stop parsing as soon as some argument isn't accepted by a parsing function. See section Argp Parser Functions.
ARGP_IN_ORDER: Parse options and arguments in the same order they occur on the command line--normally they're rearranged so that all options come first.
ARGP_NO_HELP: Don't provide the standard long option `--help', which ordinarily causes usage and option help information to be output to stdout, and exit (0) called.
ARGP_NO_EXIT: Don't exit on errors (they may still result in error messages).
ARGP_LONG_ONLY: Use the GNU getopt `long-only' rules for parsing arguments. This allows long-options to be recognized with only a single `-' (for instance, `-help'), but results in a generally somewhat less useful interface, that conflicts with the way most GNU programs work. For this reason, its use is discouraged.
ARGP_SILENT: Turn off all message printing and exiting; this is equivalent to setting ARGP_NO_EXIT, ARGP_NO_ERRS, and ARGP_NO_HELP together.

Customizing Argp Help Output

The help_filter field in a struct argp is a pointer to a function that filters the text of help messages before they are displayed. Such a function has a signature like:

char *help_filter (int key, const char *text, void *input)

where key is either a key from an option, in which case text is that option's help text (see section Specifying Options in an Argp Parser), or one of the special keys with names beginning with `ARGP_KEY_HELP_', describing which other help text text is (see section Special Keys for Argp Help Filter Functions).

The function should return either text, if it should be used as-is, a replacement string, which should be allocated using malloc, and will be freed by argp, or zero, meaning `print nothing'. The value of text supplied is after any translation has been done, so if any of the replacement text also needs translation, that should be done by the filter function. input is either the input supplied to argp_parse, or zero, if argp_help was called directly by the user.
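
As a minimal sketch of such a filter, assume a hypothetical `--level' option (key 'l') whose help text should have a note about its default value appended at display time. The names level_options, level_parser, level_help_filter, and level_argp, and the default of 3, are all illustrative.

```c
#include <argp.h>
#include <stdlib.h>
#include <string.h>

static const struct argp_option level_options[] = {
  { "level", 'l', "N", 0, "Set the level to N" },
  { 0 }
};

static error_t
level_parser (int key, char *arg, struct argp_state *state)
{
  int *level = state->input;

  if (key != 'l')
    return ARGP_ERR_UNKNOWN;
  *level = atoi (arg);
  return 0;
}

/* Append a note to the help text of the `--level' option;
   pass every other piece of help text through unchanged.  */
static char *
level_help_filter (int key, const char *text, void *input)
{
  static const char suffix[] = " (default: 3)";
  char *copy;
  (void) input;

  if (key != 'l' || text == NULL)
    return (char *) text;

  /* A replacement string must come from malloc; argp frees it.  */
  copy = malloc (strlen (text) + sizeof suffix);
  if (copy == NULL)
    return (char *) text;
  strcpy (copy, text);
  strcat (copy, suffix);
  return copy;
}

static const struct argp level_argp =
  { level_options, level_parser, 0, 0, 0, level_help_filter };
```

Returning text itself for keys the filter does not care about avoids needless allocation and leaves the special `ARGP_KEY_HELP_' keys untouched.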

Special Keys for Argp Help Filter Functions

The following special values may be passed to an argp help filter function as the first argument, in addition to key values for user options, and specify which help text the text argument contains:

ARGP_KEY_HELP_PRE_DOC: Help text preceding options.
ARGP_KEY_HELP_POST_DOC: Help text following options.
ARGP_KEY_HELP_HEADER: Option header string.
ARGP_KEY_HELP_EXTRA: After all other documentation; text is zero for this key.
ARGP_KEY_HELP_DUP_ARGS_NOTE: The explanatory note emitted when duplicate option arguments have been suppressed.
ARGP_KEY_HELP_ARGS_DOC: The argument doc string (the args_doc field from the argp parser; see section Specifying Argp Parsers).

The argp_help Function

Normally programs using argp need not worry too much about printing argument-usage-type help messages, because the standard `--help' option is handled automatically by argp, and the typical error cases can be handled using argp_usage and argp_error (see section Functions For Use in Argp Parsers).

However, if it's desirable to print a standard help message in some context other than parsing the program options, argp offers the argp_help interface.

Function: void argp_help (const struct argp *argp, FILE *stream, unsigned flags, char *name)

Output a help message for the argp parser argp to stream. What sort of message is printed is determined by flags.

Any options such as `--help' that are implemented automatically by argp itself will not be present in the help output; for this reason, it is better to use argp_state_help if calling from within an argp parser function. See section Functions For Use in Argp Parsers.
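
For instance, a program might want to print a brief usage summary outside of option parsing. The following is a small sketch under that assumption; usage_options, usage_argp, and print_usage_summary are hypothetical names, and the flags used are described in the next section.

```c
#include <argp.h>
#include <stdio.h>

static const struct argp_option usage_options[] = {
  { "verbose", 'v', 0, 0, "Produce verbose output" },
  { 0 }
};

/* No parser function is given, since this argp is only used
   here to generate help text.  */
static const struct argp usage_argp =
  { usage_options, 0, "FILE...", "Process each FILE." };

/* Print a short `Usage:' line followed by a `Try ...' hint
   to STREAM, using NAME as the program name.  */
static void
print_usage_summary (FILE *stream, char *name)
{
  argp_help (&usage_argp, stream,
             ARGP_HELP_SHORT_USAGE | ARGP_HELP_SEE, name);
}
```

A caller could invoke print_usage_summary (stderr, argv[0]) after detecting a problem that is unrelated to command line syntax.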

Flags for the argp_help Function

When calling argp_help (see section The argp_help Function), or argp_state_help (see section Functions For Use in Argp Parsers), exactly what is output is determined by the flags argument, which should consist of any of the following flags, or'd together:

ARGP_HELP_USAGE: A unix `Usage:' message that explicitly lists all options.
ARGP_HELP_SHORT_USAGE: A unix `Usage:' message that displays only an appropriate placeholder to indicate where the options go; useful for showing the non-option argument syntax.
ARGP_HELP_SEE: A `Try ... for more help' message; `...' contains the program name and `--help'.
ARGP_HELP_LONG: A verbose option help message that gives each option understood along with its documentation string.
ARGP_HELP_PRE_DOC: The part of the argp parser doc string that precedes the verbose option help.
ARGP_HELP_POST_DOC: The part of the argp parser doc string that follows the verbose option help.
ARGP_HELP_DOC: (ARGP_HELP_PRE_DOC | ARGP_HELP_POST_DOC)
ARGP_HELP_BUG_ADDR: A message saying where to report bugs for this program, if the argp_program_bug_address variable contains one.
ARGP_HELP_LONG_ONLY: Modify any output appropriately to reflect ARGP_LONG_ONLY mode.

The following flags are only understood when used with argp_state_help, and control whether the function returns after printing its output, or terminates the program:

ARGP_HELP_EXIT_ERR: Terminate the program with exit (argp_err_exit_status).
ARGP_HELP_EXIT_OK: Terminate the program with exit (0).

The following flags are combinations of the basic ones for printing standard messages:

ARGP_HELP_STD_ERR: Assuming an error message for a parsing error has already been printed, prints a note on how to get help, and terminates the program with an error.
ARGP_HELP_STD_USAGE: Prints a standard usage message and terminates the program with an error. This is used when no more specific error message is appropriate.
ARGP_HELP_STD_HELP: Prints the standard response for a `--help' option, and terminates the program successfully.

Argp Examples

These example programs demonstrate the basic usage of argp.

A Minimal Program Using Argp

This is (probably) the smallest possible program that uses argp. It won't do much except give an error message and exit when there are any arguments, and print a (rather pointless) message for `--help'.

/* Argp example #1 -- a minimal program using argp */

/* This is (probably) the smallest possible program that
   uses argp.  It won't do much except give an error
   message and exit when there are any arguments, and print
   a (rather pointless) message for --help. */

#include <stdlib.h>
#include <argp.h>

int main (int argc, char **argv)
{
  argp_parse (0, argc, argv, 0, 0, 0);
  exit (0);
}

A Program Using Argp with Only Default Options

This program doesn't use any options or arguments, but uses argp to be compliant with the GNU standard command line format.

In addition to making sure no arguments are given, and implementing a `--help' option, this example will have a `--version' option, and will put the given documentation string and bug address in the `--help' output, as per GNU standards.

The variable argp contains the argument parser specification; adding fields to this structure is the way most parameters are passed to argp_parse (the first three fields are usually used, but not in this small program). There are also two global variables that argp knows about defined here, argp_program_version and argp_program_bug_address (they are global variables because they will almost always be constant for a given program, even if it uses different argument parsers for various tasks).

/* Argp example #2 -- a pretty minimal program using argp */

/* This program doesn't use any options or arguments, but uses
   argp to be compliant with the GNU standard command line
   format.

   In addition to making sure no arguments are given, and
   implementing a --help option, this example will have a
   --version option, and will put the given documentation string
   and bug address in the --help output, as per GNU standards.

   The variable ARGP contains the argument parser specification;
   adding fields to this structure is the way most parameters are
   passed to argp_parse (the first three fields are usually used,
   but not in this small program).  There are also two global
   variables that argp knows about defined here,
   ARGP_PROGRAM_VERSION and ARGP_PROGRAM_BUG_ADDRESS (they are
   global variables because they will almost always be constant
   for a given program, even if it uses different argument
   parsers for various tasks). */

#include <stdlib.h>
#include <argp.h>

const char *argp_program_version =
  "argp-ex2 1.0";
const char *argp_program_bug_address =
  "<bug-gnu-utils@gnu.org>";

/* Program documentation. */
static char doc[] =
  "Argp example #2 -- a pretty minimal program using argp";

/* Our argument parser.  The options, parser, and
   args_doc fields are zero because we have neither options nor
   arguments; doc and argp_program_bug_address will be
   used in the output for `--help', and the `--version'
   option will print out argp_program_version. */
static struct argp argp = { 0, 0, 0, doc };

int main (int argc, char **argv)
{
  argp_parse (&argp, argc, argv, 0, 0, 0);
  exit (0);
}

A Program Using Argp with User Options

This program uses the same features as example 2, and adds user options and arguments.

We now use the first four fields in argp (see section Specifying Argp Parsers), and specify parse_opt as the parser function (see section Argp Parser Functions).

Note that in this example, main uses a structure to communicate with the parse_opt function, a pointer to which it passes in the input argument to argp_parse (see section Parsing Program Options with Argp), and is retrieved by parse_opt through the input field in its state argument (see section Argp Parsing State). Of course, it's also possible to use global variables instead, but using a structure like this is somewhat more flexible and clean.

/* Argp example #3 -- a program with options and arguments using argp */

/* This program uses the same features as example 2, and uses options and
   arguments.

   We now use the first four fields in ARGP, so here's a description of them:
     OPTIONS  -- A pointer to a vector of struct argp_option (see below)
     PARSER   -- A function to parse a single option, called by argp
     ARGS_DOC -- A string describing how the non-option arguments should look
     DOC      -- A descriptive string about this program; if it contains a
                 vertical tab character (\v), the part after it will be
                 printed *following* the options

   The function PARSER takes the following arguments:
     KEY  -- An integer specifying which option this is (taken
             from the KEY field in each struct argp_option), or
             a special key specifying something else; the only
             special keys we use here are ARGP_KEY_ARG, meaning
             a non-option argument, and ARGP_KEY_END, meaning
             that all arguments have been parsed
     ARG  -- For an option KEY, the string value of its
             argument, or NULL if it has none
     STATE-- A pointer to a struct argp_state, containing
             various useful information about the parsing state; used here
             are the INPUT field, which reflects the INPUT argument to
             argp_parse, and the ARG_NUM field, which is the number of the
             current non-option argument being parsed
   It should return either 0, meaning success, ARGP_ERR_UNKNOWN, meaning the
   given KEY wasn't recognized, or an errno value indicating some other
   error.

   Note that in this example, main uses a structure to communicate with the
   parse_opt function, a pointer to which it passes in the INPUT argument to
   argp_parse.  Of course, it's also possible to use global variables
   instead, but this is somewhat more flexible.

   The OPTIONS field contains a pointer to a vector of struct argp_option's;
   that structure has the following fields (if you assign your option
   structures using array initialization like this example, unspecified
   fields will be defaulted to 0, and need not be specified):
     NAME   -- The name of this option's long option (may be zero)
     KEY    -- The KEY to pass to the PARSER function when parsing this option,
               *and* the name of this option's short option, if it is a
               printable ascii character
     ARG    -- The name of this option's argument, if any
     FLAGS  -- Flags describing this option; some of them are:
                 OPTION_ARG_OPTIONAL -- The argument to this option is optional
                 OPTION_ALIAS        -- This option is an alias for the
                                        previous option
                 OPTION_HIDDEN       -- Don't show this option in --help output
     DOC    -- A documentation string for this option, shown in --help output

   An options vector should be terminated by an option with all fields zero. */

#include <stdio.h>
#include <stdlib.h>
#include <argp.h>

const char *argp_program_version =
  "argp-ex3 1.0";
const char *argp_program_bug_address =
  "<bug-gnu-utils@gnu.org>";

/* Program documentation. */
static char doc[] =
  "Argp example #3 -- a program with options and arguments using argp";

/* A description of the arguments we accept. */
static char args_doc[] = "ARG1 ARG2";

/* The options we understand. */
static struct argp_option options[] = {
  {"verbose",  'v', 0,      0,  "Produce verbose output" },
  {"quiet",    'q', 0,      0,  "Don't produce any output" },
  {"silent",   's', 0,      OPTION_ALIAS },
  {"output",   'o', "FILE", 0,
   "Output to FILE instead of standard output" },
  { 0 }
};

/* Used by main to communicate with parse_opt. */
struct arguments
{
  char *args[2];                /* arg1 & arg2 */
  int silent, verbose;
  char *output_file;
};

/* Parse a single option. */
static error_t
parse_opt (int key, char *arg, struct argp_state *state)
{
  /* Get the input argument from argp_parse, which we
     know is a pointer to our arguments structure. */
  struct arguments *arguments = state->input;

  switch (key)
    {
    case 'q': case 's':
      arguments->silent = 1;
      break;
    case 'v':
      arguments->verbose = 1;
      break;
    case 'o':
      arguments->output_file = arg;
      break;

    case ARGP_KEY_ARG:
      if (state->arg_num >= 2)
        /* Too many arguments. */
        argp_usage (state);

      arguments->args[state->arg_num] = arg;

      break;

    case ARGP_KEY_END:
      if (state->arg_num < 2)
        /* Not enough arguments. */
        argp_usage (state);
      break;

    default:
      return ARGP_ERR_UNKNOWN;
    }
  return 0;
}

/* Our argp parser. */
static struct argp argp = { options, parse_opt, args_doc, doc };

int main (int argc, char **argv)
{
  struct arguments arguments;

  /* Default values. */
  arguments.silent = 0;
  arguments.verbose = 0;
  arguments.output_file = "-";

  /* Parse our arguments; every option seen by parse_opt will
     be reflected in arguments. */
  argp_parse (&argp, argc, argv, 0, 0, &arguments);

  printf ("ARG1 = %s\nARG2 = %s\nOUTPUT_FILE = %s\n"
          "VERBOSE = %s\nSILENT = %s\n",
          arguments.args[0], arguments.args[1],
          arguments.output_file,
          arguments.verbose ? "yes" : "no",
          arguments.silent ? "yes" : "no");

  exit (0);
}

A Program Using Multiple Combined Argp Parsers

This program uses the same features as example 3, but has more options, and somewhat more structure in the `--help' output. It also shows how you can `steal' the remainder of the input arguments past a certain point, for programs that accept a list of items, and the special key value ARGP_KEY_NO_ARGS, which is only given if no non-option arguments were supplied to the program (see section Special Keys for Argp Parser Functions).

For structuring the help output, two features are used: headers, which are entries in the options vector (see section Specifying Options in an Argp Parser) with the first four fields being zero, and a two part documentation string (in the variable doc), which allows documentation both before and after the options (see section Specifying Argp Parsers); the two parts of doc are separated by a vertical-tab character ('\v', or '\013'). By convention, the documentation before the options is just a short string saying what the program does, and that afterwards is longer, describing the behavior in more detail. All documentation strings are automatically filled for output, although newlines may be included to force a line break at a particular point. All documentation strings are also passed to the gettext function, for possible translation into the current locale.

/* Argp example #4 -- a program with somewhat more complicated options */

/* This program uses the same features as example 3, but has more
   options, and somewhat more structure in the --help output.  It
   also shows how you can `steal' the remainder of the input
   arguments past a certain point, for programs that accept a
   list of items.  It also shows the special argp KEY value
   ARGP_KEY_NO_ARGS, which is only given if no non-option
   arguments were supplied to the program.

   For structuring the help output, two features are used,
   *headers* which are entries in the options vector with the
   first four fields being zero, and a two part documentation
   string (in the variable DOC), which allows documentation both
   before and after the options; the two parts of DOC are
   separated by a vertical-tab character ('\v', or '\013').  By
   convention, the documentation before the options is just a
   short string saying what the program does, and that afterwards
   is longer, describing the behavior in more detail.  All
   documentation strings are automatically filled for output,
   although newlines may be included to force a line break at a
   particular point.  All documentation strings are also passed to
   the `gettext' function, for possible translation into the
   current locale. */

#include <stdlib.h>
#include <error.h>
#include <argp.h>

const char *argp_program_version =
  "argp-ex4 1.0";
const char *argp_program_bug_address =
  "<bug-gnu-utils@prep.ai.mit.edu>";

/* Program documentation. */
static char doc[] =
  "Argp example #4 -- a program with somewhat more complicated\
 options\
\vThis part of the documentation comes *after* the options;\
 note that the text is automatically filled, but it's possible\
 to force a line-break, e.g.\n<-- here.";

/* A description of the arguments we accept. */
static char args_doc[] = "ARG1 [STRING...]";

/* Keys for options without short-options. */
#define OPT_ABORT  1            /* --abort */

/* The options we understand. */
static struct argp_option options[] = {
  {"verbose",  'v', 0,       0, "Produce verbose output" },
  {"quiet",    'q', 0,       0, "Don't produce any output" },
  {"silent",   's', 0,       OPTION_ALIAS },
  {"output",   'o', "FILE",  0,
   "Output to FILE instead of standard output" },

  {0,0,0,0, "The following options should be grouped together:" },
  {"repeat",   'r', "COUNT", OPTION_ARG_OPTIONAL,
   "Repeat the output COUNT (default 10) times"},
  {"abort",    OPT_ABORT, 0, 0, "Abort before showing any output"},

  { 0 }
};

/* Used by main to communicate with parse_opt. */
struct arguments
{
  char *arg1;                   /* arg1 */
  char **strings;               /* [string...] */
  int silent, verbose, abort;   /* `-s', `-v', `--abort' */
  char *output_file;            /* file arg to `--output' */
  int repeat_count;             /* count arg to `--repeat' */
};

/* Parse a single option. */
static error_t
parse_opt (int key, char *arg, struct argp_state *state)
{
  /* Get the input argument from argp_parse, which we
     know is a pointer to our arguments structure. */
  struct arguments *arguments = state->input;

  switch (key)
    {
    case 'q': case 's':
      arguments->silent = 1;
      break;
    case 'v':
      arguments->verbose = 1;
      break;
    case 'o':
      arguments->output_file = arg;
      break;
    case 'r':
      arguments->repeat_count = arg ? atoi (arg) : 10;
      break;
    case OPT_ABORT:
      arguments->abort = 1;
      break;

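    case ARGP_KEY_NO_ARGS:
      /* With the parser flags used here (none), argp_usage prints
         a usage message and exits, so control does not fall
         through to the next case. */
      argp_usage (state);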

    case ARGP_KEY_ARG:
      /* Here we know that state->arg_num == 0, since we
         force argument parsing to end before any more arguments can
         get here. */
      arguments->arg1 = arg;

      /* Now we consume all the rest of the arguments.
         state->next is the index in state->argv of the
         next argument to be parsed, which is the first string
         we're interested in, so we can just use
         &state->argv[state->next] as the value for
         arguments->strings.

         In addition, by setting state->next to the end
         of the arguments, we can force argp to stop parsing here and
         return. */
      arguments->strings = &state->argv[state->next];
      state->next = state->argc;

      break;

    default:
      return ARGP_ERR_UNKNOWN;
    }
  return 0;
}

/* Our argp parser. */
static struct argp argp = { options, parse_opt, args_doc, doc };

int main (int argc, char **argv)
{
  int i, j;
  struct arguments arguments;

  /* Default values. */
  arguments.silent = 0;
  arguments.verbose = 0;
  arguments.output_file = "-";
  arguments.repeat_count = 1;
  arguments.abort = 0;

  /* Parse our arguments; every option seen by parse_opt will be
     reflected in arguments. */
  argp_parse (&argp, argc, argv, 0, 0, &arguments);

  if (arguments.abort)
    error (10, 0, "ABORTED");

  for (i = 0; i < arguments.repeat_count; i++)
    {
      printf ("ARG1 = %s\n", arguments.arg1);
      printf ("STRINGS = ");
      for (j = 0; arguments.strings[j]; j++)
        printf (j == 0 ? "%s" : ", %s", arguments.strings[j]);
      printf ("\n");
      printf ("OUTPUT_FILE = %s\nVERBOSE = %s\nSILENT = %s\n",
              arguments.output_file,
              arguments.verbose ? "yes" : "no",
              arguments.silent ? "yes" : "no");
    }

  exit (0);
}

Argp User Customization

The formatting of argp `--help' output may be controlled to some extent by a program's users, by setting the ARGP_HELP_FMT environment variable to a comma-separated list (whitespace is ignored) of the following tokens:

`dup-args'
`no-dup-args': Turn duplicate-argument-mode on or off. In duplicate argument mode, if an option which accepts an argument has multiple names, the argument is shown for each name; otherwise, it is only shown for the first long option, and a note is emitted later so the user knows that it applies to the other names as well. The default is `no-dup-args', which is less consistent, but prettier.
`dup-args-note'
`no-dup-args-note': Enable or disable the note informing the user of suppressed option argument duplication. The default is `dup-args-note'.
`short-opt-col=n': Show the first short option in column n (default 2).
`long-opt-col=n': Show the first long option in column n (default 6).
`doc-opt-col=n': Show `documentation options' (see section Flags for Argp Options) in column n (default 2).
`opt-doc-col=n': Show the documentation for options starting in column n (default 29).
`header-col=n': Indent group headers (which document groups of options) to column n (default 1).
`usage-indent=n': Indent continuation lines in `Usage:' messages to column n (default 12).
`rmargin=n': Word wrap help output at or before column n (default 79).

Parsing of Suboptions

Having a single level of options is sometimes not enough. There might be too many options to keep them all at the top level, or a set of options might be closely related.

For this case some programs use suboptions. One of the most prominent examples is certainly mount(8). Its -o option takes one argument, which is itself a comma-separated list of options. To ease the programming of code like this, the function getsubopt is available.

Function: int getsubopt (char **optionp, char *const *tokens, char **valuep)

The optionp parameter must be a pointer to a variable containing the address of the string to process. When the function returns, the variable is updated to point to the next suboption, or to the terminating `\0' character if there are no more suboptions.

The tokens parameter references an array of strings containing the known suboptions. All strings must be `\0' terminated, and the end of the array must be marked by a null pointer. When getsubopt finds a possible legal suboption, it compares it with all the strings in the tokens array and returns the index of the matching string in the array.

In case the suboption has an associated value introduced by a `=' character, a pointer to the value is returned in valuep. The string is `\0' terminated. If no argument is available valuep is set to the null pointer. By doing this the caller can check whether a necessary value is given or whether no unexpected value is present.

In case the next suboption in the string is not mentioned in the tokens array the starting address of the suboption including a possible value is returned in valuep and the return value of the function is `-1'.
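
The return values described above can be seen in a small sketch. The names parse_spec, known, OPT_RO and OPT_RSIZE are made up for this illustration, and the token array is declared as char *const [] on the assumption that the library header declares the tokens parameter as char *const *. Note that getsubopt modifies the string it parses, so the caller must pass a writable buffer.

```c
/* Assumption: a glibc feature test macro, so that <stdlib.h>
   declares getsubopt.  */
#define _DEFAULT_SOURCE 1
#include <stdlib.h>

/* Hypothetical suboption indices for this sketch.  */
enum { OPT_RO, OPT_RSIZE };

static char *const known[] =
{
  [OPT_RO] = "ro",
  [OPT_RSIZE] = "rsize",
  NULL                          /* end marker required by getsubopt */
};

/* Parse the comma-separated list in SPEC (which is modified in
   place).  Store what was found in *READ_ONLY and *RSIZE, and
   return the number of suboptions that were not recognized.  */
int
parse_spec (char *spec, int *read_only, int *rsize)
{
  int unknown = 0;
  char *value;

  while (*spec != '\0')
    switch (getsubopt (&spec, known, &value))
      {
      case OPT_RO:
        *read_only = 1;
        break;
      case OPT_RSIZE:
        /* VALUE is a null pointer if no `=value' part was given.  */
        if (value != NULL)
          *rsize = atoi (value);
        break;
      default:
        /* Return value -1: VALUE points at the unknown suboption.  */
        unknown++;
        break;
      }
  return unknown;
}
```

Called on a writable copy of "ro,rsize=512,bogus=1", this sketch should set the read-only flag, store 512, and count one unknown suboption.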

Parsing of Suboptions Example

The code which might appear in the mount(8) program is a perfect example of the use of getsubopt:

#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

int do_all;
const char *type;
int read_size;
int write_size;
int read_only;

enum
{
  RO_OPTION = 0,
  RW_OPTION,
  READ_SIZE_OPTION,
  WRITE_SIZE_OPTION,
  THE_END
};

char *const mount_opts[] =
{
  [RO_OPTION] = "ro",
  [RW_OPTION] = "rw",
  [READ_SIZE_OPTION] = "rsize",
  [WRITE_SIZE_OPTION] = "wsize",
  [THE_END] = NULL
};

int
main (int argc, char *argv[])
{
  char *subopts, *value;
  int opt;

  while ((opt = getopt (argc, argv, "at:o:")) != -1)
    switch (opt)
      {
      case 'a':
        do_all = 1;
        break;
      case 't':
        type = optarg;
        break;
      case 'o':
        subopts = optarg;
        while (*subopts != '\0')
          switch (getsubopt (&subopts, mount_opts, &value))
            {
            case RO_OPTION:
              read_only = 1;
              break;
            case RW_OPTION:
              read_only = 0;
              break;
            case READ_SIZE_OPTION:
              if (value == NULL)
                abort ();
              read_size = atoi (value);
              break;
            case WRITE_SIZE_OPTION:
              if (value == NULL)
                abort ();
              write_size = atoi (value);
              break;
            default:
              /* Unknown suboption. */
              printf ("Unknown suboption `%s'\n", value);
              break;
            }
        break;
      default:
        abort ();
      }

  /* Do the real work. */

  return 0;
}

Environment Variables

When a program is executed, it receives information about the context in which it was invoked in two ways. The first mechanism uses the argv and argc arguments to its main function, and is discussed in section Program Arguments. The second mechanism uses environment variables and is discussed in this section.

The argv mechanism is typically used to pass command-line arguments specific to the particular program being invoked. The environment, on the other hand, keeps track of information that is shared by many programs, changes infrequently, and that is less frequently used.

The environment variables discussed in this section are the same environment variables that you set using assignments and the export command in the shell. Programs executed from the shell inherit all of the environment variables from the shell.

Standard environment variables are used for information about the user's home directory, terminal type, current locale, and so on; you can define additional variables for other purposes. The set of all environment variables that have values is collectively known as the environment.

Names of environment variables are case-sensitive and must not contain the character `='. System-defined environment variables are invariably uppercase.

The values of environment variables can be anything that can be represented as a string. A value must not contain an embedded null character, since this is assumed to terminate the string.

Environment Access

The value of an environment variable can be accessed with the getenv function. This is declared in the header file `stdlib.h'. All of the following functions can be safely used in multi-threaded programs; the library ensures that concurrent modifications to the environment do not lead to errors.

Function: char * getenv (const char *name)

This function returns a string that is the value of the environment variable name. You must not modify this string. In some non-Unix systems not using the GNU library, it might be overwritten by subsequent calls to getenv (but not by any other library function). If the environment variable name is not defined, the value is a null pointer.
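
Because getenv returns a null pointer for an undefined variable, callers usually want a fallback. The following is a minimal sketch of that pattern; the helper name env_or_default is hypothetical and not part of the library.

```c
#include <stdlib.h>

/* Return the value of the environment variable NAME, or FALLBACK
   if NAME is not defined.  The result must not be modified.  */
const char *
env_or_default (const char *name, const char *fallback)
{
  const char *value = getenv (name);
  return value != NULL ? value : fallback;
}
```

A call such as env_or_default ("TERM", "dumb") then always yields a usable string.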

Function: int putenv (char *string)

The putenv function adds or removes definitions from the environment. If the string is of the form `name=value', the definition is added to the environment. Otherwise, the string is interpreted as the name of an environment variable, and any definition for this variable in the environment is removed.

The difference from the setenv function is that the exact string given as the parameter string is put into the environment. If the user changes the string after the putenv call, the change is automatically reflected in the environment. This also means that string must not be an automatic variable whose scope is left before the variable is removed from the environment. The same applies, of course, to dynamically allocated variables that are freed later.

This function is part of the extended Unix interface. Since it was also available in old SVID libraries you should define either _XOPEN_SOURCE or _SVID_SOURCE before including any header.
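
The sharing of storage described above can be made visible with a short sketch. The variable name GR_DEMO and the function putenv_shares_storage are invented for this illustration, and the result shown is what one would expect with the GNU library, where the string itself becomes part of the environment.

```c
/* Assumption: a glibc feature test macro, so that <stdlib.h>
   declares putenv.  */
#define _DEFAULT_SOURCE 1
#include <stdlib.h>
#include <string.h>

/* Static storage: the string must stay valid for as long as it is
   part of the environment.  */
static char demo_entry[] = "GR_DEMO=one";

/* Return 1 if a change made to the string after the putenv call is
   visible through getenv, 0 if not, and -1 if putenv failed.  */
int
putenv_shares_storage (void)
{
  const char *seen;

  if (putenv (demo_entry) != 0)
    return -1;

  /* Overwrite the value part in place (same length).  */
  memcpy (demo_entry + strlen ("GR_DEMO="), "two", 3);

  seen = getenv ("GR_DEMO");
  return seen != NULL && strcmp (seen, "two") == 0;
}
```

This is exactly why passing an automatic array to putenv is dangerous: the environment would keep pointing into a stack frame that no longer exists.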

Function: int setenv (const char *name, const char *value, int replace)

The setenv function can be used to add a new definition to the environment. The entry with the name name is replaced by the value `name=value'. Please note that this is also true if value is the empty string. To do this a new string is created and the strings name and value are copied. A null pointer for the value parameter is illegal. If the environment already contains an entry with key name the replace parameter controls the action. If replace is zero, nothing happens. Otherwise the old entry is replaced by the new one.

Please note that you cannot remove an entry completely using this function.

This function was originally part of the BSD library but is now part of the Unix standard.
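
The effect of the replace parameter can be sketched as follows. The variable name GR_DEMO_MODE and the function replace_flag_demo are hypothetical and exist only for this illustration.

```c
/* Assumption: a glibc feature test macro, so that <stdlib.h>
   declares setenv.  */
#define _DEFAULT_SOURCE 1
#include <stdlib.h>
#include <string.h>

/* Return 1 if setenv honors the replace parameter as described,
   0 if it does not, and -1 if a call failed.  */
int
replace_flag_demo (void)
{
  if (setenv ("GR_DEMO_MODE", "first", 1) != 0)
    return -1;

  /* replace == 0: the existing entry is left alone.  */
  if (setenv ("GR_DEMO_MODE", "second", 0) != 0)
    return -1;
  if (strcmp (getenv ("GR_DEMO_MODE"), "first") != 0)
    return 0;

  /* replace != 0: the old entry is replaced by the new one.  */
  if (setenv ("GR_DEMO_MODE", "third", 1) != 0)
    return -1;
  return strcmp (getenv ("GR_DEMO_MODE"), "third") == 0;
}
```

Passing zero for replace is a convenient way to supply a default without overriding a value the user has already set.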

Function: int unsetenv (const char *name)

Using this function one can remove an entry completely from the environment. If the environment contains an entry with the key name this whole entry is removed. A call to this function is equivalent to a call to putenv when the value part of the string is empty.

The function returns -1 if name is a null pointer, points to an empty string, or points to a string containing a `=' character. It returns 0 if the call succeeded.

This function was originally part of the BSD library but is now part of the Unix standard. The BSD version had no return value, though.
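
A short sketch of the return values listed above; the variable name GR_DEMO_TMP and the function unsetenv_demo are made up for this illustration. The null pointer case is left out because the header may declare the parameter as non-null.

```c
/* Assumption: a glibc feature test macro, so that <stdlib.h>
   declares setenv and unsetenv.  */
#define _DEFAULT_SOURCE 1
#include <stdlib.h>

/* Return 1 if unsetenv behaves as described, 0 otherwise.  */
int
unsetenv_demo (void)
{
  if (setenv ("GR_DEMO_TMP", "x", 1) != 0)
    return 0;

  /* A valid name: the whole entry is removed.  */
  if (unsetenv ("GR_DEMO_TMP") != 0 || getenv ("GR_DEMO_TMP") != NULL)
    return 0;

  /* Invalid names: an empty string, and a name containing `='.  */
  if (unsetenv ("") != -1)
    return 0;
  if (unsetenv ("A=B") != -1)
    return 0;

  /* Removing a name that is not defined is not an error.  */
  return unsetenv ("GR_DEMO_TMP") == 0;
}
```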

There is one more function to modify the whole environment. This function is said to be used in POSIX.9 (the POSIX bindings for Fortran 77), so one would expect it to have made it into POSIX.1, but this never happened. We still provide this function as a GNU extension to enable writing standard-compliant Fortran environments.

Function: int clearenv (void)

The clearenv function removes all entries from the environment. Using putenv and setenv new entries can be added again later.

If the function is successful it returns 0. Otherwise the return value is nonzero.
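
One plausible use of clearenv is to start from a known, minimal environment before running another program. The sketch below assumes the GNU library; the function name reset_environment and the particular values chosen for PATH and LANG are illustrative only.

```c
/* Assumption: a glibc feature test macro, so that <stdlib.h>
   declares clearenv and setenv.  */
#define _DEFAULT_SOURCE 1
#include <stdlib.h>

/* Drop every inherited variable, then add back a small fixed set.
   Return 0 on success and -1 on failure.  */
int
reset_environment (void)
{
  if (clearenv () != 0)
    return -1;
  if (setenv ("PATH", "/bin:/usr/bin", 1) != 0)
    return -1;
  if (setenv ("LANG", "C", 1) != 0)
    return -1;
  return 0;
}
```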

You can deal directly with the underlying representation of environment objects to add more variables to the environment (for example, to communicate with another program you are about to execute; see section Executing a File).

Variable: char ** environ

The environment is represented as an array of strings. Each string is of the format `name=value'. The order in which strings appear in the environment is not significant, but the same name must not appear more than once. The last element of the array is a null pointer.

This variable is declared in the header file `unistd.h'.

If you just want to get the value of an environment variable, use getenv.

Unix systems, and the GNU system, pass the initial value of environ as the third argument to main. See section Program Arguments.
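
To illustrate the representation, here is a sketch of a lookup that walks environ directly. It is not how getenv is implemented in any particular library; the function name lookup_in_environ is hypothetical, and environ is declared explicitly so that the sketch does not depend on what `unistd.h' exposes.

```c
#include <stddef.h>
#include <string.h>

extern char **environ;

/* Return the value part of the `name=value' entry for NAME, or a
   null pointer if there is no such entry.  */
const char *
lookup_in_environ (const char *name)
{
  size_t len = strlen (name);
  char **entry;

  if (environ == NULL)
    return NULL;

  /* The array ends with a null pointer.  */
  for (entry = environ; *entry != NULL; entry++)
    if (strncmp (*entry, name, len) == 0 && (*entry)[len] == '=')
      return *entry + len + 1;

  return NULL;
}
```

In real programs getenv remains the better choice; the loop is shown only to make the array layout concrete.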

Standard Environment Variables

These environment variables have standard meanings. This doesn't mean that they are always present in the environment; but if these variables are present, they have these meanings. You shouldn't try to use these environment variable names for some other purpose.

HOME

This is a string representing the user's home directory, or initial default working directory. The user can set HOME to any value. If you need to make sure to obtain the proper home directory for a particular user, you should not use HOME; instead, look up the user's name in the user database (see section User Database). For most purposes, it is better to use HOME, precisely because this lets the user specify the value.

LOGNAME

This is the name that the user used to log in. Since the value in the environment can be tweaked arbitrarily, this is not a reliable way to identify the user who is running a program; a function like getlogin (see section Identifying Who Logged In) is better for that purpose. For most purposes, it is better to use LOGNAME, precisely because this lets the user specify the value.

PATH

A path is a sequence of directory names which is used for searching for a file. The variable PATH holds a path used for searching for programs to be run. The execlp and execvp functions (see section Executing a File) use this environment variable, as do many shells and other utilities which are implemented in terms of those functions. The syntax of a path is a sequence of directory names separated by colons. An empty string instead of a directory name stands for the current directory (see section Working Directory). A typical value for this environment variable might be a string like:

:/bin:/etc:/usr/bin:/usr/new/X11:/usr/new:/usr/local/bin

This means that if the user tries to execute a program named foo, the system will look for files named `foo', `/bin/foo', `/etc/foo', and so on. The first of these files that exists is the one that is executed.
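
The path syntax described above can be sketched with a small helper that extracts one directory at a time and maps an empty component to the current directory. The function name path_component is invented for this illustration; it is not how execvp or any shell is actually implemented.

```c
#include <stddef.h>
#include <string.h>

/* Copy the INDEX-th directory (counting from zero) of the
   colon-separated PATH into OUT, which has room for SIZE bytes.
   An empty component is reported as ".".  Return 0 on success and
   -1 if there is no such component or OUT is too small.  */
int
path_component (const char *path, size_t index, char *out, size_t size)
{
  const char *start = path;

  for (;;)
    {
      size_t len = strcspn (start, ":");

      if (index == 0)
        {
          if (len == 0)
            {
              /* Empty string: the current directory.  */
              start = ".";
              len = 1;
            }
          if (len + 1 > size)
            return -1;
          memcpy (out, start, len);
          out[len] = '\0';
          return 0;
        }

      if (start[len] == '\0')
        return -1;              /* ran out of components */
      start += len + 1;
      index--;
    }
}
```

For the value shown above, component 0 is the current directory (because of the leading colon), component 1 is /bin, and so on.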

TERM

This specifies the kind of terminal that is receiving program output. Some programs can make use of this information to take advantage of special escape sequences or terminal modes supported by particular kinds of terminals. Many programs which use the termcap library (see section `Finding a Terminal Description' in The Termcap Library Manual) use the TERM environment variable, for example.

TZ

This specifies the time zone. See section Specifying the Time Zone with TZ, for information about the format of this string and how it is used.

LANG

This specifies the default locale to use for attribute categories where neither LC_ALL nor the specific environment variable for that category is set. See section Locales and Internationalization, for more information about locales.

LC_ALL

If this environment variable is set it overrides the selection for all the locales done using the other LC_* environment variables. The value of the other LC_* environment variables is simply ignored in this case.

LC_COLLATE

This specifies what locale to use for string sorting.

LC_CTYPE

This specifies what locale to use for character sets and character classification.

LC_MESSAGES

This specifies what locale to use for printing messages and to parse responses.

LC_MONETARY

This specifies what locale to use for formatting monetary values.

LC_NUMERIC

This specifies what locale to use for formatting numbers.

LC_TIME

This specifies what locale to use for formatting date/time values.

NLSPATH

This specifies the directories in which the catopen function looks for message translation catalogs.

_POSIX_OPTION_ORDER

If this environment variable is defined, it suppresses the usual reordering of command line arguments by getopt and argp_parse. See section Program Argument Syntax Conventions.

System Calls

A system call is a request for service that a program makes of the kernel. The service is generally something that only the kernel has the privilege to do, such as doing I/O. Programmers don't normally need to be concerned with system calls because there are functions in the GNU C library to do virtually everything that system calls do. These functions work by making system calls themselves. For example, there is a system call that changes the permissions of a file, but you don't need to know about it because you can just use the GNU C library's chmod function.

System calls are sometimes called kernel calls.

However, there are times when you want to make a system call explicitly, and for that, the GNU C library provides the syscall function. syscall is harder to use and less portable than functions like chmod, but easier and more portable than coding the system call in assembler instructions.

syscall is most useful when you are working with a system call which is special to your system or is newer than the GNU C library you are using. syscall is implemented in an entirely generic way; the function does not know anything about what a particular system call does or even if it is valid.

The description of syscall in this section assumes a certain protocol for system calls on the various platforms on which the GNU C library runs. That protocol is not defined by any strong authority, but we won't describe it here either because anyone who is coding syscall probably won't accept anything less than kernel and C library source code as a specification of the interface between them anyway.

syscall is declared in `unistd.h'.

Function: long int syscall (long int sysno, ...)

syscall performs a generic system call.

sysno is the system call number. Each kind of system call is identified by a number. Macros for all the possible system call numbers are defined in `sys/syscall.h'.

The remaining arguments are the arguments for the system call, in order, and their meanings depend on the kind of system call. Each kind of system call has a definite number of arguments, from zero to five. If you code more arguments than the system call takes, the extra ones to the right are ignored.

The return value is the return value from the system call, unless the system call failed. In that case, syscall returns -1 and sets errno to an error code that the system call returned. Note that system calls do not return -1 when they succeed.

If you specify an invalid sysno, syscall returns -1 with errno = ENOSYS.

Example:


#include <stdio.h>
#include <unistd.h>
#include <sys/syscall.h>
#include <errno.h>

...

int rc;

rc = syscall(SYS_chmod, "/etc/passwd", 0444);

if (rc == -1)
   fprintf(stderr, "chmod failed, errno = %d\n", errno);

This, if all the compatibility stars are aligned, is equivalent to the following preferable code:


#include <stdio.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <errno.h>

...

int rc;

rc = chmod("/etc/passwd", 0444);
if (rc == -1)
   fprintf(stderr, "chmod failed, errno = %d\n", errno);
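
A system call without arguments or side effects makes a gentler first experiment. The sketch below assumes a Linux kernel, where `sys/syscall.h' defines SYS_getpid; the function name raw_getpid is hypothetical.

```c
/* Assumption: a glibc feature test macro, so that <unistd.h>
   declares syscall.  */
#define _DEFAULT_SOURCE 1
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

/* Ask the kernel for the process ID directly, without going
   through the library's getpid function.  */
pid_t
raw_getpid (void)
{
  return (pid_t) syscall (SYS_getpid);
}
```

The value returned should agree with what getpid reports for the same process.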

Program Termination

The usual way for a program to terminate is simply for its main function to return. The exit status value returned from the main function is used to report information back to the process's parent process or shell.

A program can also terminate normally by calling the exit function.

In addition, programs can be terminated by signals; this is discussed in more detail in section Signal Handling. The abort function causes a signal that kills the program.

Normal Termination

A process terminates normally when its program signals it is done by calling exit. Returning from main is equivalent to calling exit, and the value that main returns is used as the argument to exit.

Function: void exit (int status)

The exit function tells the system that the program is done, which causes it to terminate the process.

status is the program's exit status, which becomes part of the process' termination status. This function does not return.

Normal termination causes the following actions:

Functions that were registered with the atexit or on_exit functions are called in the reverse order of their registration. This mechanism allows your application to specify its own "cleanup" actions to be performed at program termination. Typically, this is used to do things like saving program state information in a file, or unlocking locks in shared data bases.
All open streams are closed, writing out any buffered output data. See section Closing Streams. In addition, temporary files opened with the tmpfile function are removed; see section Temporary Files.
_exit is called, terminating the program. See section Termination Internals.
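
The reverse order of the first action can be observed with a sketch like the following. It runs the handlers in a child process and reports them to the parent through a pipe, so that the caller can inspect the order after the child has terminated. All the names here (exit_handler_order, report, registered_first, registered_second) are invented for this illustration.

```c
/* Assumption: a glibc feature test macro, so that the POSIX
   declarations used below are visible.  */
#define _DEFAULT_SOURCE 1
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

static int report_fd = -1;      /* write end of the pipe, child only */

static void
report (char tag)
{
  ssize_t n = write (report_fd, &tag, 1);
  (void) n;                     /* nothing useful to do on failure */
}

static void registered_first (void)  { report ('1'); }
static void registered_second (void) { report ('2'); }

/* Fork a child that registers two handlers with atexit and then
   calls exit.  Store the tags the handlers wrote, in the order they
   ran, as a string in BUF.  Return 0 on success, -1 on failure.  */
int
exit_handler_order (char *buf, size_t size)
{
  int fds[2];
  pid_t pid;
  size_t used = 0;
  ssize_t n;

  if (size < 3 || pipe (fds) == -1)
    return -1;

  pid = fork ();
  if (pid == -1)
    {
      close (fds[0]);
      close (fds[1]);
      return -1;
    }
  if (pid == 0)
    {
      close (fds[0]);
      report_fd = fds[1];
      atexit (registered_first);
      atexit (registered_second);
      exit (EXIT_SUCCESS);      /* runs the handlers */
    }

  close (fds[1]);
  while (used < size - 1
         && (n = read (fds[0], buf + used, size - 1 - used)) > 0)
    used += (size_t) n;
  buf[used] = '\0';
  close (fds[0]);
  waitpid (pid, NULL, 0);
  return 0;
}
```

The handler registered last runs first, so the string collected by the parent should read "21".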

Exit Status

When a program exits, it can return to the parent process a small amount of information about the cause of termination, using the exit status. This is a value between 0 and 255 that the exiting process passes as an argument to exit.

Normally you should use the exit status to report very broad information about success or failure. You can't provide a lot of detail about the reasons for the failure, and most parent processes would not want much detail anyway.

There are conventions for what sorts of status values certain programs should return. The most common convention is simply 0 for success and 1 for failure. Programs that perform comparison use a different convention: they use status 1 to indicate a mismatch, and status 2 to indicate an inability to compare. Your program should follow an existing convention if an existing convention makes sense for it.

A general convention reserves status values 128 and up for special purposes. In particular, the value 128 is used to indicate failure to execute another program in a subprocess. This convention is not universally obeyed, but it is a good idea to follow it in your programs.

Warning: Don't try to use the number of errors as the exit status. This is actually not very useful; a parent process would generally not care how many errors occurred. Worse than that, it does not work, because the status value is truncated to eight bits. Thus, if the program tried to report 256 errors, the parent would receive a report of 0 errors--that is, success.

For the same reason, it does not work to use the value of errno as the exit status--these can exceed 255.
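
The truncation can be checked with a small sketch that forks a child and looks at the status the parent receives. The function name observed_exit_status is hypothetical. The child uses _exit rather than exit only so that it does not flush stdio buffers inherited from the parent; the status is truncated in the same way.

```c
/* Assumption: a glibc feature test macro, so that the POSIX
   declarations used below are visible.  */
#define _DEFAULT_SOURCE 1
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Fork a child that terminates with STATUS, and return the exit
   status that the parent observes, or -1 on failure.  */
int
observed_exit_status (int status)
{
  int wstatus;
  pid_t pid = fork ();

  if (pid == -1)
    return -1;
  if (pid == 0)
    _exit (status);

  if (waitpid (pid, &wstatus, 0) == -1 || !WIFEXITED (wstatus))
    return -1;
  return WEXITSTATUS (wstatus);
}
```

With this sketch, a child that tries to report 256 errors is seen by its parent as having exit status 0.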

Portability note: Some non-POSIX systems use different conventions for exit status values. For greater portability, you can use the macros EXIT_SUCCESS and EXIT_FAILURE for the conventional status value for success and failure, respectively. They are declared in the file `stdlib.h'.

Macro: int EXIT_SUCCESS

This macro can be used with the exit function to indicate successful program completion.

On POSIX systems, the value of this macro is 0. On other systems, the value might be some other (possibly non-constant) integer expression.

Macro: int EXIT_FAILURE

This macro can be used with the exit function to indicate unsuccessful program completion in a general sense.

On POSIX systems, the value of this macro is 1. On other systems, the value might be some other (possibly non-constant) integer expression. Other nonzero status values also indicate failures. Certain programs use different nonzero status values to indicate particular kinds of "non-success". For example, diff uses status value 1 to mean that the files are different, and 2 or more to mean that there was difficulty in opening the files.

Don't confuse a program's exit status with a process' termination status. There are lots of ways a process can terminate besides having its program finish. In the event that the process termination is caused by program termination (i.e. exit), though, the program's exit status becomes part of the process' termination status.

Cleanups on Exit

Your program can arrange to run its own cleanup functions if normal termination happens. If you are writing a library for use in various application programs, then it is unreliable to insist that all applications call the library's cleanup functions explicitly before exiting. It is much more robust to make the cleanup invisible to the application, by setting up a cleanup function in the library itself using atexit or on_exit.

Function: int atexit (void (*function) (void))

The atexit function registers the function function to be called at normal program termination. The function is called with no arguments.

The return value from atexit is zero on success and nonzero if the function cannot be registered.

Function: int on_exit (void (*function)(int status, void *arg), void *arg)

This function is a somewhat more powerful variant of atexit. It accepts two arguments, a function function and an arbitrary pointer arg. At normal program termination, the function is called with two arguments: the status value passed to exit, and the arg.

This function is included in the GNU C library only for compatibility with SunOS, and may not be supported by other implementations.

Here's a trivial program that illustrates the use of exit and atexit:

#include <stdio.h>
#include <stdlib.h>

void 
bye (void)
{
  puts ("Goodbye, cruel world....");
}

int
main (void)
{
  atexit (bye);
  exit (EXIT_SUCCESS);
}

When this program is executed, it just prints the message and exits.

Aborting a Program

You can abort your program using the abort function. The prototype for this function is in `stdlib.h'.

Function: void abort (void)

The abort function causes abnormal program termination. This does not execute cleanup functions registered with atexit or on_exit.

This function actually terminates the process by raising a SIGABRT signal, and your program can include a handler to intercept this signal; see section Signal Handling.
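
From the parent's side, a process that called abort shows up as terminated by a signal rather than as having exited. The following sketch, with the hypothetical function name signal_from_abort, forks a child that aborts and returns the signal number the parent sees. Disabling core files in the child is an extra precaution for the illustration, not something abort requires.

```c
/* Assumption: a glibc feature test macro, so that the POSIX
   declarations used below are visible.  */
#define _DEFAULT_SOURCE 1
#include <signal.h>
#include <stdlib.h>
#include <sys/resource.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Fork a child that calls abort, and return the number of the
   signal that terminated it, or -1 on failure.  */
int
signal_from_abort (void)
{
  int wstatus;
  pid_t pid = fork ();

  if (pid == -1)
    return -1;
  if (pid == 0)
    {
      /* Avoid leaving a core file behind.  */
      struct rlimit no_core = { 0, 0 };
      setrlimit (RLIMIT_CORE, &no_core);
      abort ();
    }

  if (waitpid (pid, &wstatus, 0) == -1 || !WIFSIGNALED (wstatus))
    return -1;
  return WTERMSIG (wstatus);
}
```

The value returned should be SIGABRT.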

Future Change Warning: Proposed Federal censorship regulations may prohibit us from giving you information about the possibility of calling this function. We would be required to say that this is not an acceptable way of terminating a program.

Termination Internals

The _exit function is the primitive used for process termination by exit. It is declared in the header file `unistd.h'.

Function: void _exit (int status)

The _exit function is the primitive for causing a process to terminate with status status. Calling this function does not execute cleanup functions registered with atexit or on_exit.

Function: void _Exit (int status)

The _Exit function is the ISO C equivalent to _exit. The ISO C committee members were not sure whether the definitions of _exit and _Exit were compatible so they have not used the POSIX name.

This function was introduced in ISO C99 and is declared in `stdlib.h'.

When a process terminates for any reason--either because the program terminates, or as a result of a signal--the following things happen:

All open file descriptors in the process are closed. See section Low-Level Input/Output. Note that streams are not flushed automatically when the process terminates; see section Input/Output on Streams.
A process exit status is saved to be reported back to the parent process via wait or waitpid; see section Process Completion. If the program exited, this status includes as its low-order 8 bits the program exit status.
Any child processes of the process being terminated are assigned a new parent process. (On most systems, including GNU, this is the init process, with process ID 1.)
A SIGCHLD signal is sent to the parent process.
If the process is a session leader that has a controlling terminal, then a SIGHUP signal is sent to each process in the foreground job, and the controlling terminal is disassociated from that session. See section Job Control.
If termination of a process causes a process group to become orphaned, and any member of that process group is stopped, then a SIGHUP signal and a SIGCONT signal are sent to each process in the group. See section Job Control.
