Using gawk on PC Operating Systems

With the exception of the Cygwin environment, the |& operator and TCP/IP networking (see Using gawk for Network Programming) are not supported for MS-DOS or MS-Windows. EMX (OS/2 only) does support at least the |& operator.

The OS/2 and MS-DOS versions of gawk search for program files as described in The AWKPATH Environment Variable. However, semicolons (rather than colons) separate elements in the AWKPATH variable. If AWKPATH is not set or is empty, then the default search path for OS/2 (16 bit) and MS-DOS versions is ".;c:/lib/awk;c:/gnu/lib/awk".
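
As a small illustration, the search path can be set explicitly from an sh-like shell. This is only a sketch: the directories are the defaults quoted above, and the single quotes are there because an unquoted semicolon would end the command in such a shell.

```shell
# Quote the value: in an sh-like shell an unquoted ';' ends the command.
AWKPATH='.;c:/lib/awk;c:/gnu/lib/awk'
export AWKPATH
```

With this setting, gawk looks for a program file named with -f in each listed directory in turn.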

The search path for OS/2 (32 bit, EMX) is determined by the prefix directory (most likely /usr or c:/usr) given as an option to the configure script, as is the case for the Unix versions. If c:/usr is the prefix directory, then the default search path contains . and c:/usr/share/awk. Additionally, to support binary distributions of gawk for OS/2 systems whose drive c: might not support long file names or might not exist at all, there is a special environment variable, UNIXROOT. If UNIXROOT specifies a drive, then that drive is also searched for program files. For example, if UNIXROOT is set to e:, the complete default search path is ".;c:/usr/share/awk;e:/usr/share/awk".
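
The rule above can be sketched as a small shell function. The name default_awkpath is hypothetical and not part of gawk, and the sketch assumes that the UNIXROOT drive simply replaces any drive letter in the prefix; it only mirrors the example in the text, not gawk's actual implementation.

```shell
# Sketch only: derive the default search path from the configure prefix
# and an optional UNIXROOT drive. default_awkpath is a hypothetical helper.
default_awkpath() {
    prefix=$1        # e.g. c:/usr or /usr
    unixroot=$2      # e.g. e: (may be empty)
    path=".;$prefix/share/awk"
    if [ -n "$unixroot" ]; then
        # Assumption: the UNIXROOT drive replaces any drive letter in the prefix.
        nodrive=${prefix#[A-Za-z]:}
        path="$path;$unixroot$nodrive/share/awk"
    fi
    printf '%s\n' "$path"
}

default_awkpath c:/usr e:    # prints .;c:/usr/share/awk;e:/usr/share/awk
```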

An sh-like shell (as opposed to command.com under MS-DOS or cmd.exe under OS/2) may be useful for awk programming. Ian Stewartson has written an excellent shell for MS-DOS and OS/2, Daisuke Aoyama has ported GNU bash to MS-DOS using the DJGPP tools, and several shells are available for OS/2, including ksh. The file README_d/README.pc in the gawk distribution contains information on these shells. Users of Stewartson's shell on DOS should examine its documentation for handling command lines; in particular, the setting for gawk in the shell configuration may need to be changed and the ignoretype option may also be of interest.

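Under OS/2 and DOS, gawk (like many other text programs) silently translates the end-of-line sequence "\r\n" to "\n" on input and "\n" to "\r\n" on output. A special BINMODE variable allows control over these translations and is interpreted as follows:

If BINMODE is "r", or (BINMODE & 1) is nonzero, binary mode is set on reads (i.e., no translations on reads).

If BINMODE is "w", or (BINMODE & 2) is nonzero, binary mode is set on writes (i.e., no translations on writes).

If BINMODE is "rw" or "wr", binary mode is set for both reads and writes (the same as (BINMODE & 3)).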

The modes for standard input and standard output are set one time only (after the command line is read, but before processing any of the awk program). Setting BINMODE for standard input or standard output is accomplished by using an appropriate -v BINMODE=N option on the command line. For any other file or pipe, the mode is taken from the value of BINMODE at the time the file or pipe is opened and cannot be changed mid-stream.
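
A minimal sketch of the standard-input case, assuming an sh-like shell: because the mode of standard input is fixed before the program runs, BINMODE is given with -v, and RS is set to "\r\n" so that the untranslated line endings still terminate each record. The function name crlf_lengths is hypothetical and used only for illustration.

```shell
# Sketch: read standard input in binary mode (BINMODE=1 given with -v)
# and split records on the untranslated DOS-style end-of-line.
# crlf_lengths is a hypothetical helper name.
crlf_lengths() {
    gawk -v BINMODE=1 -v RS='\r\n' '{ print length($0) }'
}

# Possible use: print the length of each record arriving on standard input.
#   crlf_lengths < data.txt
```

Assigning BINMODE in a BEGIN rule instead would come too late for standard input, although it would still apply to files opened afterwards.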

The name BINMODE was chosen to match mawk (see Other Freely Available awk Implementations). Both mawk and gawk handle BINMODE similarly; however, mawk adds a -W BINMODE=N option and an environment variable that can set BINMODE, RS, and ORS. The files binmode[1-3].awk (under gnu/lib/awk in some of the prepared distributions) have been chosen to match mawk's -W BINMODE=N option. These can be changed or discarded; in particular, the setting of RS giving the fewest "surprises" is open to debate. mawk uses RS = "\r\n" if binary mode is set on read, which is appropriate for files with the DOS-style end-of-line.

To illustrate, the following examples set binary mode on writes for standard output and other files, and set ORS as the "usual" DOS-style end-of-line:

gawk -v BINMODE=2 -v ORS="\r\n" ...

or:

gawk -v BINMODE=w -f binmode2.awk ...

These give the same result as the -W BINMODE=2 option in mawk. The following changes the record separator to "\r\n" and sets binary mode on reads, but does not affect the mode on standard input:

gawk -v RS="\r\n" --source "BEGIN { BINMODE = 1 }" ...

or:

gawk -f binmode1.awk ...

With proper quoting, in the first example the setting of RS can be moved into the BEGIN rule.
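
One possible form of that variant is sketched below, assuming an sh-like shell in which single quotes protect the embedded double quotes. The wrapper name gawk_crlf_read is hypothetical, and quoting under command.com or cmd.exe differs and is not shown.

```shell
# Sketch: both RS and BINMODE are set inside the BEGIN rule.
# Single quotes keep the inner double quotes intact in an sh-like shell.
# gawk_crlf_read is a hypothetical wrapper; any remaining arguments
# (further --source or -f options, data files) pass through unchanged.
gawk_crlf_read() {
    gawk --source 'BEGIN { RS = "\r\n"; BINMODE = 1 }' "$@"
}

# Possible use:
#   gawk_crlf_read --source '{ print length($0) }' data.txt
```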