Node:More Complex, Next:Statements/Lines, Previous:Two Rules, Up:Getting Started
Now that we've mastered some simple tasks, let's look at
what typical awk
programs do. This example shows how awk
can be used to
summarize, select, and rearrange the output of another utility. It uses
features that haven't been covered yet, so don't worry if you don't
understand all the details:
ls -l | awk '$6 == "Nov" { sum += $5 } END { print sum }'
This command prints the total number of bytes in all the files in the
current directory that were last modified in November (of any year).
1
The ls -l
part of this example is a system command that gives
you a listing of the files in a directory, including each file's size and the date
the file was last modified. Its output looks like this:
-rw-r--r-- 1 arnold user 1933 Nov 7 13:05 Makefile -rw-r--r-- 1 arnold user 10809 Nov 7 13:03 awk.h -rw-r--r-- 1 arnold user 983 Apr 13 12:14 awk.tab.h -rw-r--r-- 1 arnold user 31869 Jun 15 12:20 awk.y -rw-r--r-- 1 arnold user 22414 Nov 7 13:03 awk1.c -rw-r--r-- 1 arnold user 37455 Nov 7 13:03 awk2.c -rw-r--r-- 1 arnold user 27511 Dec 9 13:07 awk3.c -rw-r--r-- 1 arnold user 7989 Nov 7 13:03 awk4.c
The first field contains read-write permissions, the second field contains the number of links to the file, and the third field identifies the owner of the file. The fourth field identifies the group of the file. The fifth field contains the size of the file in bytes. The sixth, seventh, and eighth fields contain the month, day, and time, respectively, that the file was last modified. Finally, the ninth field contains the name of the file.2
The $6 == "Nov"
in our awk
program is an expression that
tests whether the sixth field of the output from ls -l
matches the string Nov
. Each time a line has the string
Nov
for its sixth field, the action sum += $5
is
performed. This adds the fifth field (the file's size) to the variable
sum
. As a result, when awk
has finished reading all the
input lines, sum
is the total of the sizes of the files whose
lines matched the pattern. (This works because awk
variables
are automatically initialized to zero.)
After the last line of output from ls
has been processed, the
END
rule executes and prints the value of sum
.
In this example, the value of sum
is 140963.
These more advanced awk
techniques are covered in later sections
(see Actions). Before you can move on to more
advanced awk
programming, you have to know how awk
interprets
your input and displays your output. By manipulating fields and using
print
statements, you can produce some very useful and
impressive-looking reports.
In the C shell (csh
), you need to type
a semicolon and then a backslash at the end of the first line; see
awk
Statements Versus Lines, for an
explanation. In a POSIX-compliant shell, such as the Bourne
shell or bash
, you can type the example as shown. If the command
echo $path
produces an empty output line, you are most likely
using a POSIX-compliant shell. Otherwise, you are probably using the
C shell or a shell derived from it.
On some
very old systems, you may need to use ls -lg
to get this output.