6. Summarizing files

These commands generate just a few numbers representing the entire contents of files.

6.1 wc: Print byte, word, and line counts
6.2 sum: Print checksum and block counts
6.3 cksum: Print CRC checksum and byte counts
6.4 md5sum: Print or check message-digests


6.1 wc: Print byte, word, and line counts

wc counts the number of bytes, characters, whitespace-separated words, and newlines in each given file, or standard input if none are given or for a file of `-'. Synopsis:

 
wc [option]... [file]...

wc prints one line of counts for each file, and if the file was given as an argument, it prints the file name following the counts. If more than one file is given, wc prints a final line containing the cumulative counts, with the file name `total'. The counts are printed in this order: newlines, words, characters, bytes. By default, each count is output right-justified in a 7-byte field with one space between fields so that the numbers and file names line up nicely in columns. However, POSIX requires that there be exactly one space separating columns. You can make wc use the POSIX-mandated output format by setting the POSIXLY_CORRECT environment variable.
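As a rough illustration of the output described above, the following sketch runs wc on two small files. The file names `a.txt' and `b.txt' and their contents are made up for this example, and the scratch directory is only there to avoid touching existing files. The exact column padding depends on the wc version and on POSIXLY_CORRECT, so it is not reproduced here.

```shell
# Work in a scratch directory so no existing files are touched.
dir=$(mktemp -d)
cd "$dir" || exit 1

# Two hypothetical sample files.
printf 'one two\nthree\n' > a.txt   # 2 newlines, 3 words, 14 bytes
printf 'four\n' > b.txt             # 1 newline, 1 word, 5 bytes

# One line of counts per file, each followed by the file name,
# then a final line with the cumulative counts named "total".
wc a.txt b.txt

# Reading standard input: no file name follows the counts.
wc < a.txt
```

With these inputs the `total' line should report 3 newlines, 4 words, and 19 bytes.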

By default, wc prints three counts: the newline, word, and byte counts. Options can specify that only certain counts be printed. Options do not undo others previously given, so

 
wc --bytes --words

prints both the byte counts and the word counts.
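A small sketch of this behavior, using made-up input piped through standard input. The variable name `counts' is just for illustration. Note that the counts appear in wc's fixed order (words before bytes), not in the order the options were given.

```shell
# Hypothetical input: 2 newlines, 3 words, 14 bytes.
counts=$(printf 'one two\nthree\n' | wc --bytes --words)

# Both requested counts are present: the word count (3) first,
# then the byte count (14). Padding between them may vary.
echo "$counts"
```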

With the --max-line-length option, wc prints the length of the longest line per file, and if there is more than one file it prints the maximum (not the sum) of those lengths.
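The difference between a maximum and a sum can be seen with two hypothetical files (the names and contents below are invented for this sketch):

```shell
# Work in a scratch directory so no existing files are touched.
dir=$(mktemp -d)
cd "$dir" || exit 1

printf 'short\na much longer line\n' > a.txt   # longest line: 18 characters
printf 'mid-size\n' > b.txt                    # longest line: 8 characters

# Prints 18 for a.txt, 8 for b.txt, and 18 (not 26) on the "total" line.
wc --max-line-length a.txt b.txt
```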

The program accepts the following options. Also see 2. Common options.

`-c'
`--bytes'
Print only the byte counts.

`-m'
`--chars'
Print only the character counts.

`-w'
`--words'
Print only the word counts.

`-l'
`--lines'
Print only the newline counts.

`-L'
`--max-line-length'
Print only the maximum line lengths.


6.2 sum: Print checksum and block counts

sum computes a 16-bit checksum for each given file, or standard input if none are given or for a file of `-'. Synopsis:

 
sum [option]... [file]...

sum prints the checksum for each file followed by the number of blocks in the file (rounded up). If more than one file is given, file names are also printed (by default). (With the `--sysv' option, corresponding file names are printed when there is at least one file argument.)

By default, GNU sum computes checksums using an algorithm compatible with BSD sum and prints file sizes in units of 1024-byte blocks.

The program accepts the following options. Also see 2. Common options.

`-r'
Use the default (BSD compatible) algorithm. This option is included for compatibility with the System V sum. Unless `-s' was also given, it has no effect.

`-s'
`--sysv'
Compute checksums using an algorithm compatible with System V sum's default, and print file sizes in units of 512-byte blocks.

sum is provided for compatibility; the cksum program (see next section) is preferable in new applications.
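To make the two block sizes concrete, here is a minimal sketch that checksums a 1025-byte file both ways. The file name `blocks.dat' is hypothetical, and the checksum values themselves are left out since only the block counts matter here.

```shell
# Work in a scratch directory so no existing files are touched.
dir=$(mktemp -d)
cd "$dir" || exit 1

# 1025 bytes (all spaces): just over one 1024-byte block,
# and just over two 512-byte blocks.
printf '%1025s' '' > blocks.dat

# Default (BSD-compatible) algorithm: the block count is 2.
sum blocks.dat

# System V algorithm: the block count is 3, and the file name is printed
# because a file argument was given.
sum -s blocks.dat
```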


6.3 cksum: Print CRC checksum and byte counts

cksum computes a cyclic redundancy check (CRC) checksum for each given file, or standard input if none are given or for a file of `-'. Synopsis:

 
cksum [option]... [file]...

cksum prints the CRC checksum for each file along with the number of bytes in the file, and the filename unless no arguments were given.

cksum is typically used to ensure that files transferred by unreliable means (e.g., netnews) have not been corrupted, by comparing the cksum output for the received files with the cksum output for the original files (typically given in the distribution).
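The comparison described above might be scripted along these lines. This is only a sketch: `original.txt' and `received.txt' are invented names, and a plain cp stands in for the unreliable transfer. Reading each file from standard input keeps the file name out of the output, so the two results can be compared as strings.

```shell
# Work in a scratch directory so no existing files are touched.
dir=$(mktemp -d)
cd "$dir" || exit 1

printf 'hello\n' > original.txt
cp original.txt received.txt      # stand-in for the real transfer

# Each result is "<CRC> <byte count>", with no file name.
sent=$(cksum < original.txt)
got=$(cksum < received.txt)

if [ "$sent" = "$got" ]; then
    echo "received.txt matches the original"
else
    echo "received.txt differs from the original"
fi
```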

The CRC algorithm is specified by the POSIX standard. It is not compatible with the BSD or System V sum algorithms (see the previous section); it is more robust.

The only options are `--help' and `--version'. See section 2. Common options.


6.4 md5sum: Print or check message-digests

md5sum computes a 128-bit checksum (or fingerprint or message-digest) for each specified file. If a file is specified as `-' or if no files are given, md5sum computes the checksum for the standard input. md5sum can also determine whether a file and checksum are consistent. Synopses:

 
md5sum [option]... [file]...
md5sum [option]... --check [file]

For each file, `md5sum' outputs the MD5 checksum, a flag indicating a binary or text input file, and the filename. If file is omitted or specified as `-', standard input is read.

The program accepts the following options. Also see 2. Common options.

`-b'
`--binary'
Treat all input files as binary. This option has no effect on Unix systems, since they don't distinguish between binary and text files. This option is useful on systems that have different internal and external character representations. On MS-DOS and MS-Windows, this is the default.

`-c'
`--check'
Read filenames and checksum information from the single file (or from stdin if no file was specified) and report whether each named file and the corresponding checksum data are consistent. The input to this mode of md5sum is usually the output of a prior, checksum-generating run of `md5sum'. Each valid line of input consists of an MD5 checksum, a binary/text flag, and then a filename. Binary files are marked with `*', text with ` '.

For each such line, md5sum reads the named file and computes its MD5 checksum. Then, if the computed message digest does not match the one on the line with the filename, the file is noted as having failed the test. Otherwise, the file passes the test. By default, for each valid line, one line is written to standard output indicating whether the named file passed the test. After all checks have been performed, if there were any failures, a warning is issued to standard error. Use the `--status' option to inhibit that output.

If any listed file cannot be opened or read, if any valid line has an MD5 checksum inconsistent with the associated file, or if no valid line is found, md5sum exits with nonzero status. Otherwise, it exits successfully.

`--status'
This option is useful only when verifying checksums. When verifying checksums, don't generate the default one-line-per-file diagnostic and don't output the warning summarizing any failures. Failures to open or read a file still evoke individual diagnostics to standard error. If all listed files are readable and are consistent with the associated MD5 checksums, exit successfully. Otherwise exit with a status code indicating there was a failure.

`-t'
`--text'
Treat all input files as text files. This is the reverse of `--binary'.

`-w'
`--warn'
When verifying checksums, warn about improperly formatted MD5 checksum lines. This option is useful only if all but a few lines in the checked input are valid.


This document was generated by Jeff Bailey on December 28, 2002 using texi2html