When downloading material from the web, you will often want to restrict
the retrieval to only certain file types. For example, if you are
interested in downloading GIFs, you will not be overjoyed to get
loads of PostScript documents, and vice versa.
Wget offers two options to deal with this problem. Each option
description lists a short name, a long name, and the equivalent command
in `.wgetrc'.
- `-A acclist'
- `--accept acclist'
- `accept = acclist'
The argument to the `--accept' option is a list of file suffixes or
patterns that Wget will download during recursive retrieval. A suffix
is the ending part of a file name, and consists of "normal" letters,
e.g. `gif' or `.jpg'. A matching pattern contains shell-like
wildcards, e.g. `books*' or `zelazny*196[0-9]*'.
So, specifying `wget -A gif,jpg' will make Wget download only the
files ending with `gif' or `jpg', i.e. GIFs and
JPEGs. On the other hand, `wget -A "zelazny*196[0-9]*"' will
download only files beginning with `zelazny' and containing numbers
from 1960 to 1969 anywhere within. Look up the manual of your shell for
a description of how pattern matching works.
Of course, any number of suffixes and patterns can be combined into a
comma-separated list, and given as an argument to `-A'.
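The distinction between suffixes and patterns can be sketched in the shell itself. This is only a rough emulation under stated assumptions, not Wget's own code: the helper name `in_list' is hypothetical, entries containing a wildcard character are treated as patterns matched against the whole file name, and all other entries are treated as suffixes.

```sh
#!/bin/sh
# Hypothetical helper (not part of Wget): succeeds if file name $1
# matches any entry of the comma-separated list $2.
# The body runs in a subshell so the IFS and "set -f" changes stay local.
in_list() (
    name=$1
    IFS=,       # split the list on commas
    set -f      # do not expand wildcards in the list against local files
    for entry in $2; do
        case $entry in
            *[][*?]*)   # entry contains a wildcard: match it as a pattern
                case $name in $entry) exit 0 ;; esac ;;
            *)          # plain entry: match it as a suffix
                case $name in *"$entry") exit 0 ;; esac ;;
        esac
    done
    exit 1
)

# Classify a few sample names against a mixed list.
for f in picture.gif photo.jpg paper.ps zelazny-1967.txt; do
    if in_list "$f" 'gif,jpg,zelazny*196[0-9]*'; then
        echo "accept $f"
    else
        echo "skip   $f"
    fi
done
```

With this list, only `paper.ps' is skipped: the first two names end in a listed suffix, and the last one matches the pattern.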
- `-R rejlist'
- `--reject rejlist'
- `reject = rejlist'
The `--reject' option works the same way as `--accept', only
its logic is the reverse; Wget will download all files except the
ones matching the suffixes (or patterns) in the list.
So, if you want to download a whole page except for the cumbersome
MPEGs and .AU files, you can use `wget -R mpg,mpeg,au'.
Analogously, to download all files except the ones beginning with
`bjork', use `wget -R "bjork*"'. The quotes are to prevent
expansion by the shell.
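The effect of the quotes can be seen without running Wget at all. The sketch below assumes a scratch directory that happens to contain files matching the pattern; the file names are made up for illustration.

```sh
#!/bin/sh
# Work in a throwaway directory holding two files that match the pattern.
dir=$(mktemp -d)
cd "$dir" || exit 1
touch bjork1.au bjork2.au

# Unquoted: the shell replaces the pattern with the matching file names
# before the command ever sees it.
unquoted=$(echo bjork*)

# Quoted: the pattern reaches the command unchanged.
quoted=$(echo "bjork*")

echo "unquoted: $unquoted"
echo "quoted:   $quoted"

# Clean up the scratch directory.
cd / && rm -rf "$dir"
```

Without the quotes, Wget would receive the local file names rather than the pattern whenever files matching it exist in the current directory.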
The `-A' and `-R' options may be combined to fine-tune further
which files are retrieved. E.g. `wget -A
"*zelazny*" -R .ps' will download all the files having `zelazny' as
a part of their name, but not the PostScript files.
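The combined rule for that example can be written out as a small shell function. This is an illustrative sketch, not Wget's implementation: the function name `wanted' is hypothetical, and it assumes a file is kept only when it matches the accept list and does not match the reject list.

```sh
#!/bin/sh
# Hypothetical helper mirroring `-A "*zelazny*" -R .ps':
# succeeds only for names that would be kept.
wanted() {
    case $1 in
        *.ps)       return 1 ;;   # rejected: ends with the suffix .ps
        *zelazny*)  return 0 ;;   # accepted: contains zelazny
        *)          return 1 ;;   # not in the accept list at all
    esac
}

for f in zelazny-lord.txt zelazny-lord.ps notes.txt; do
    if wanted "$f"; then
        echo "keep $f"
    else
        echo "drop $f"
    fi
done
```

Checking the reject suffix first means a PostScript file is dropped even though its name contains `zelazny'.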
Note that these two options do not affect the downloading of HTML
files; Wget must load every HTML file to find the links it should
follow, and recursive retrieval would make no sense otherwise.