
The GNU Awk User's Guide

This file documents awk, a program that you can use to select particular records in a file and perform operations upon them.

This is Edition 3 of GAWK: Effective AWK Programming: A User's Guide for GNU Awk, for the 3.1.0 version of the GNU implementation of AWK.

Foreword  Some nice words about this Web page.
Preface  What this Web page is about; brief history and acknowledgments.
2. Getting Started with awk  A basic introduction to using awk. How to run an awk program. Command-line syntax.
3. Regular Expressions  All about matching things using regular expressions.
4. Reading Input Files  How to read files and manipulate fields.
5. Printing Output  How to print using awk. Describes the print and printf statements. Also describes redirection of output.
6. Expressions  Expressions are the basic building blocks of statements.
7. Patterns, Actions, and Variables  Overviews of patterns and actions.
8. Arrays in awk  The description and use of arrays. Also includes array-oriented control statements.
9. Functions  Built-in and user-defined functions.
10. Internationalization with gawk  Getting gawk to speak your language.
11. Advanced Features of gawk  Stuff for advanced users, specific to gawk.
12. Running awk and gawk  How to run gawk.
13. A Library of awk Functions  
14. Practical awk Programs  Many awk programs with complete explanations.
A. The Evolution of the awk Language  The evolution of the awk language.
B. Installing gawk  Installing gawk under various operating systems.
C. Implementation Notes  Notes about gawk extensions and possible future work.
D. Basic Programming Concepts  A very quick introduction to programming concepts.
Glossary  An explanation of some unfamiliar terms.
GNU General Public License  Your right to copy and distribute gawk.
GNU Free Documentation License  The license for this Web page.
Index  Concept and Variable Index.

History of awk and gawk  The history of gawk and awk.
1.0 A Rose by Any Other Name  What name to use to find awk.
1.1 Using This Book  Using this Web page. Includes sample input files that you can use.
1.2 Typographical Conventions  
The GNU Project and This Book  Brief history of the GNU project and this Web page.
How to Contribute  Helping to save the world.
Acknowledgments  
2.1 How to Run awk Programs  How to run gawk programs; includes command-line syntax.
2.1.1 One-Shot Throw-Away awk Programs  Running a short throw-away awk program.
2.1.2 Running awk Without Input Files  Using no input files (input from terminal instead).
2.1.3 Running Long Programs  Putting permanent awk programs in files.
2.1.4 Executable awk Programs  Making self-contained awk programs.
2.1.5 Comments in awk Programs  Adding documentation to gawk programs.
2.1.6 Shell Quoting Issues  More discussion of shell quoting issues.
2.2 Data Files for the Examples  Sample data files for use in the awk programs illustrated in this Web page.
2.3 Some Simple Examples  A very simple example.
2.4 An Example with Two Rules  A less simple one-line example using two rules.
2.5 A More Complex Example  A more complex example.
2.6 awk Statements Versus Lines  Subdividing or combining statements into lines.
2.7 Other Features of awk  
2.8 When to Use awk  When to use gawk and when to use other things.
3.1 How to Use Regular Expressions  
3.2 Escape Sequences  How to write non-printing characters.
3.3 Regular Expression Operators  
3.4 Using Character Lists  What can go between `[...]'.
3.5 gawk-Specific Regexp Operators  Operators specific to GNU software.
3.6 Case Sensitivity in Matching  How to do case-insensitive matching.
3.7 How Much Text Matches?  How much text matches.
3.8 Using Dynamic Regexps  
4.1 How Input Is Split into Records  Controlling how data is split into records.
4.2 Examining Fields  An introduction to fields.
4.3 Non-Constant Field Numbers  Non-constant Field Numbers.
4.4 Changing the Contents of a Field  
4.5 Specifying How Fields Are Separated  The field separator and how to change it.
4.5.1 Using Regular Expressions to Separate Fields  Using regexps as the field separator.
4.5.2 Making Each Character a Separate Field  Making each character a separate field.
4.5.3 Setting FS from the Command Line  Setting FS from the command-line.
4.5.4 Field Splitting Summary  Some final points and a summary table.
4.6 Reading Fixed-Width Data  Reading constant width data.
4.7 Multiple-Line Records  Reading multi-line records.
4.8 Explicit Input with getline  Reading files under explicit program control using the getline function.
4.8.1 Using getline with No Arguments  Using getline with no arguments.
4.8.2 Using getline into a Variable  Using getline into a variable.
4.8.3 Using getline from a File  Using getline from a file.
4.8.4 Using getline into a Variable from a File  Using getline into a variable from a file.
4.8.5 Using getline from a Pipe  Using getline from a pipe.
4.8.6 Using getline into a Variable from a Pipe  Using getline into a variable from a pipe.
4.8.7 Using getline from a Coprocess  Using getline from a coprocess.
4.8.8 Using getline into a Variable from a Coprocess  Using getline into a variable from a coprocess.
4.8.9 Points About getline to Remember  Important things to know about getline.
4.8.10 Summary of getline Variants  
5.1 The print Statement  The print statement.
5.2 Examples of print Statements  Simple examples of print statements.
5.3 Output Separators  The output separators and how to change them.
5.4 Controlling Numeric Output with print  Controlling numeric output with print.
5.5 Using printf Statements for Fancier Printing  The printf statement.
5.5.1 Introduction to the printf Statement  Syntax of the printf statement.
5.5.2 Format-Control Letters  Format-control letters.
5.5.3 Modifiers for printf Formats  Format-specification modifiers.
5.5.4 Examples Using printf  Several examples.
5.6 Redirecting Output of print and printf  How to redirect output to multiple files and pipes.
5.7 Special File Names in gawk  File name interpretation in gawk. gawk allows access to inherited file descriptors.
5.7.1 Special Files for Standard Descriptors  Special files for I/O.
5.7.2 Special Files for Process-Related Information  Special files for process information.
5.7.3 Special Files for Network Communications  Special files for network communications.
5.7.4 Special File Name Caveats  Things to watch out for.
5.8 Closing Input and Output Redirections  Closing Input and Output Files and Pipes.
6.1 Constant Expressions  String, numeric and regexp constants.
6.1.1 Numeric and String Constants  Numeric and string constants.
6.1.2 Octal and Hexadecimal Numbers  What are octal and hex numbers.
6.1.3 Regular Expression Constants  Regular Expression constants.
6.2 Using Regular Expression Constants  When and how to use a regexp constant.
6.3 Variables  Variables give names to values for later use.
6.3.1 Using Variables in a Program  Using variables in your programs.
6.3.2 Assigning Variables on the Command Line  Setting variables on the command-line and a summary of command-line syntax. This is an advanced method of input.
6.4 Conversion of Strings and Numbers  The conversion of strings to numbers and vice versa.
6.5 Arithmetic Operators  Arithmetic operations (`+', `-', etc.)
6.6 String Concatenation  Concatenating strings.
6.7 Assignment Expressions  Changing the value of a variable or a field.
6.8 Increment and Decrement Operators  Incrementing the numeric value of a variable.
6.9 True and False in awk  What is "true" and what is "false".
6.10 Variable Typing and Comparison Expressions  How variables acquire types and how this affects comparison of numbers and strings with `<', etc.
6.11 Boolean Expressions  Combining comparison expressions using boolean operators `||' ("or"), `&&' ("and") and `!' ("not").
6.12 Conditional Expressions  Conditional expressions select between two subexpressions under control of a third subexpression.
6.13 Function Calls  A function call is an expression.
6.14 Operator Precedence (How Operators Nest)  How various operators nest.
7.1 Pattern Elements  What goes into a pattern.
7.1.1 Regular Expressions as Patterns  Using regexps as patterns.
7.1.2 Expressions as Patterns  Any expression can be used as a pattern.
7.1.3 Specifying Record Ranges with Patterns  Pairs of patterns specify record ranges.
7.1.4 The BEGIN and END Special Patterns  Specifying initialization and cleanup rules.
7.1.4.1 Startup and Cleanup Actions  How and why to use BEGIN/END rules.
7.1.4.2 Input/Output from BEGIN and END Rules  I/O issues in BEGIN/END rules.
7.1.5 The Empty Pattern  The empty pattern, which matches every record.
7.2 Using Shell Variables in Programs  How to use shell variables with awk.
7.3 Actions  What goes into an action.
7.4 Control Statements in Actions  Describes the various control statements in detail.
7.4.1 The if-else Statement  Conditionally execute some awk statements.
7.4.2 The while Statement  Loop until some condition is satisfied.
7.4.3 The do-while Statement  Do specified action while looping until some condition is satisfied.
7.4.4 The for Statement  Another looping statement, that provides initialization and increment clauses.
7.4.5 The break Statement  Immediately exit the innermost enclosing loop.
7.4.6 The continue Statement  Skip to the end of the innermost enclosing loop.
7.4.7 The next Statement  Stop processing the current input record.
7.4.8 Using gawk's nextfile Statement  Stop processing the current file.
7.4.9 The exit Statement  Stop execution of awk.
7.5 Built-in Variables  Summarizes the built-in variables.
7.5.1 Built-in Variables That Control awk  Built-in variables that you change to control awk.
7.5.2 Built-in Variables That Convey Information  Built-in variables where awk gives you information.
7.5.3 Using ARGC and ARGV  Ways to use ARGC and ARGV.
8.1 Introduction to Arrays  
8.2 Referring to an Array Element  How to examine one element of an array.
8.3 Assigning Array Elements  How to change an element of an array.
8.4 Basic Array Example  Basic Example of an Array
8.5 Scanning All Elements of an Array  A variation of the for statement. It loops through the indices of an array's existing elements.
8.6 The delete Statement  The delete statement removes an element from an array.
8.7 Using Numbers to Subscript Arrays  How to use numbers as subscripts in awk.
8.8 Using Uninitialized Variables as Subscripts  Using uninitialized variables as subscripts.
8.9 Multidimensional Arrays  Emulating multidimensional arrays in awk.
8.10 Scanning Multidimensional Arrays  Scanning multidimensional arrays.
8.11 Sorting Array Values and Indices with gawk  Sorting array values and indices.
9.1 Built-in Functions  Summarizes the built-in functions.
9.1.1 Calling Built-in Functions  How to call built-in functions.
9.1.2 Numeric Functions  Functions that work with numbers, including int, sin and rand.
9.1.3 String Manipulation Functions  Functions for string manipulation, such as split, match and sprintf.
9.1.3.1 More About `\' and `&' with sub, gsub, and gensub  More than you want to know about `\' and `&' with sub, gsub, and gensub.
9.1.4 Input/Output Functions  Functions for files and shell commands.
9.1.5 Using gawk's Timestamp Functions  Functions for dealing with timestamps.
9.1.6 Using gawk's Bit Manipulation Functions  Functions for bitwise operations.
9.1.7 Using gawk's String Translation Functions  Functions for string translation.
9.2 User-Defined Functions  Describes User-defined functions in detail.
9.2.1 Function Definition Syntax  How to write definitions and what they mean.
9.2.2 Function Definition Examples  An example function definition and what it does.
9.2.3 Calling User-Defined Functions  Things to watch out for.
9.2.4 The return Statement  Specifying the value a function returns.
9.2.5 Functions and Their Effect on Variable Typing  How variable types can change at runtime.
10.1 Internationalization and Localization  
10.2 GNU gettext  How GNU gettext works.
10.3 Internationalizing awk Programs  Features for the programmer.
10.4 Translating awk Programs  Features for the translator.
10.4.1 Extracting Marked Strings  Extracting marked strings.
10.4.2 Rearranging printf Arguments  Rearranging printf arguments.
10.4.3 awk Portability Issues  awk-level portability issues.
10.5 A Simple Internationalization Example  A simple i18n example.
10.6 gawk Can Speak Your Language  gawk is also internationalized.
11.1 Allowing Non-Decimal Input Data  Allowing non-decimal input data.
11.2 Two-Way Communications with Another Process  Two-way communications with another process.
11.3 Using gawk for Network Programming  Using gawk for network programming.
11.4 Using gawk with BSD Portals  Using gawk with BSD portals.
11.5 Profiling Your awk Programs  Profiling your awk programs.
12.1 Invoking awk  How to run awk.
12.2 Command-Line Options  Command-line options and their meanings.
12.3 Other Command-Line Arguments  Input file names and variable assignments.
12.4 The AWKPATH Environment Variable  Searching directories for awk programs.
12.5 Obsolete Options and/or Features  Obsolete Options and/or features.
12.6 Undocumented Options and Features  
12.7 Known Bugs in gawk  
13.1 Naming Library Function Global Variables  How to best name private global variables in library functions.
13.2 General Programming  Functions that are of general use.
13.2.1 Implementing nextfile as a Function  Two implementations of a nextfile function.
13.2.2 Assertions  A function for assertions in awk programs.
13.2.3 Rounding Numbers  A function for rounding if sprintf does not do it correctly.
13.2.4 The Cliff Random Number Generator  
13.2.5 Translating Between Characters and Numbers  Functions for using characters as numbers and vice versa.
13.2.6 Merging an Array into a String  A function to join an array into a string.
13.2.7 Managing the Time of Day  A function to get formatted times.
13.3 Data File Management  Functions for managing command-line data files.
13.3.1 Noting Data File Boundaries  A function for handling data file transitions.
13.3.2 Rereading the Current File  A function for rereading the current file.
13.3.3 Checking for Readable Data Files  Checking that data files are readable.
13.3.4 Treating Assignments as File Names  Treating assignments as file names.
13.4 Processing Command-Line Options  A function for processing command-line arguments.
13.5 Reading the User Database  Functions for getting user information.
13.6 Reading the Group Database  Functions for getting group information.
14.1 Running the Example Programs  How to run these examples.
14.2 Reinventing Wheels for Fun and Profit  Clones of common utilities.
14.2.1 Cutting out Fields and Columns  The cut utility.
14.2.2 Searching for Regular Expressions in Files  The egrep utility.
14.2.3 Printing out User Information  The id utility.
14.2.4 Splitting a Large File into Pieces  The split utility.
14.2.5 Duplicating Output into Multiple Files  The tee utility.
14.2.6 Printing Non-Duplicated Lines of Text  The uniq utility.
14.2.7 Counting Things  The wc utility.
14.3 A Grab Bag of awk Programs  Some interesting awk programs.
14.3.1 Finding Duplicated Words in a Document  Finding duplicated words in a document.
14.3.2 An Alarm Clock Program  An alarm clock.
14.3.3 Transliterating Characters  A program similar to the tr utility.
14.3.4 Printing Mailing Labels  Printing mailing labels.
14.3.5 Generating Word Usage Counts  A program to produce a word usage count.
14.3.6 Removing Duplicates from Unsorted Text  Eliminating duplicate entries from a history file.
14.3.7 Extracting Programs from Texinfo Source Files  Pulling out programs from Texinfo source files.
14.3.8 A Simple Stream Editor  
14.3.9 An Easy Way to Use Library Functions  A wrapper for awk that includes files.
A.1 Major Changes Between V7 and SVR3.1  The major changes between V7 and System V Release 3.1.
A.2 Changes Between SVR3.1 and SVR4  Minor changes between System V Releases 3.1 and 4.
A.3 Changes Between SVR4 and POSIX awk  New features from the POSIX standard.
A.4 Extensions in the Bell Laboratories awk  New features from the Bell Laboratories version of awk.
A.5 Extensions in gawk Not in POSIX awk  The extensions in gawk not in POSIX awk.
A.6 Major Contributors to gawk  The major contributors to gawk.
B.1 The gawk Distribution  What is in the gawk distribution.
B.1.1 Getting the gawk Distribution  How to get the distribution.
B.1.2 Extracting the Distribution  How to extract the distribution.
B.1.3 Contents of the gawk Distribution  What is in the distribution.
B.2 Compiling and Installing gawk on Unix  Installing gawk under various versions of Unix.
B.2.1 Compiling gawk for Unix  Compiling gawk under Unix.
B.2.2 Additional Configuration Options  Other compile-time options.
B.2.3 The Configuration Process  How it's all supposed to work.
B.3 Installation on Other Operating Systems  
B.3.1 Installing gawk on an Amiga  
B.3.2 Installing gawk on BeOS  
B.3.3 Installation on PC Operating Systems  Installing and Compiling gawk on MS-DOS and OS/2.
B.3.3.1 Installing a Prepared Distribution for PC Systems  Installing a prepared distribution.
B.3.3.2 Compiling gawk for PC Operating Systems  Compiling gawk for MS-DOS, Win32, and OS/2.
B.3.3.3 Using gawk on PC Operating Systems  Running gawk on MS-DOS, Win32 and OS/2.
B.3.4 How to Compile and Install gawk on VMS  Installing gawk on VMS.
B.3.4.1 Compiling gawk on VMS  How to compile gawk under VMS.
B.3.4.2 Installing gawk on VMS  How to install gawk under VMS.
B.3.4.3 Running gawk on VMS  How to run gawk under VMS.
B.3.4.4 Building and Using gawk on VMS POSIX  Alternate instructions for VMS POSIX.
B.4 Unsupported Operating System Ports  Systems whose ports are no longer supported.
B.4.1 Installing gawk on the Atari ST  
B.4.1.1 Compiling gawk on the Atari ST  Compiling gawk on Atari.
B.4.1.2 Running gawk on the Atari ST  Running gawk on Atari.
B.4.2 Installing gawk on a Tandem  
B.5 Reporting Problems and Bugs  
B.6 Other Freely Available awk Implementations  Other freely available awk implementations.
C.1 Downward Compatibility and Debugging  How to disable certain gawk extensions.
C.2 Making Additions to gawk  Making Additions To gawk.
C.2.1 Adding New Features  Adding code to the main body of gawk.
C.2.2 Porting gawk to a New Operating System  Porting gawk to a new operating system.
C.3 Adding New Built-in Functions to gawk  Adding new built-in functions to gawk.
C.3.1 A Minimal Introduction to gawk Internals  A brief look at some gawk internals.
C.3.2 Directory and File Operation Built-ins  An example of new functions.
C.3.2.1 Using chdir and stat  What the new functions will do.
C.3.2.2 C Code for chdir and stat  The code for internal file operations.
C.3.2.3 Integrating the Extensions  How to use an external extension.
C.4 Probable Future Extensions  New features that may be implemented one day.
D.1 What a Program Does  The high level view.
D.2 Data Values in a Computer  A very quick intro to data types.
D.3 Floating-Point Number Caveats  Stuff to know about floating-point numbers.

To Miriam, for making me complete.
To Chana, for the joy you bring us.
To Rivka, for the exponential increase.
To Nachum, for the added dimension.
To Malka, for the new beginning.



This document was generated on May 2, 2002 using texi2html