The format of the set1 and set2 arguments resembles
the format of regular expressions; however, they are not regular
expressions, only lists of characters. Most characters simply
represent themselves in these strings, but the strings can contain
the shorthands listed below, for convenience. Some of them can be
used only in set1 or set2, as noted below.
- Backslash escapes.
-
A backslash followed by a character not listed below causes an error
message.
- `\a'
-
Control-G (alert, or bell).
- `\b'
-
Control-H (backspace).
- `\f'
-
Control-L (form feed).
- `\n'
-
Control-J (newline).
- `\r'
-
Control-M (carriage return).
- `\t'
-
Control-I (horizontal tab).
- `\v'
-
Control-K (vertical tab).
- `\ooo'
-
The character with the value given by ooo, which is 1 to 3
octal digits.
- `\\'
-
A backslash.
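As a brief illustration of the escapes above, the following sketch
uses `\t', `\012', and `\\' in set1. The sample input strings and the
variable names are made up for the example, and it assumes a POSIX
shell with printf available.

```shell
# \t names Control-I (tab): turn each tab into a space.
spaced=$(printf 'a\tb\tc\n' | tr '\t' ' ')
printf '%s\n' "$spaced"      # a b c

# \012 is the octal value of Control-J (newline): turn each newline
# into a comma, including the final one.
joined=$(printf 'x\ny\nz\n' | tr '\012' ',')
printf '%s\n' "$joined"      # x,y,z,

# \\ stands for a single backslash: turn backslashes into slashes.
slashed=$(printf '%s\n' 'C:\dir\file' | tr '\\' '/')
printf '%s\n' "$slashed"     # C:/dir/file
```

The set arguments are single-quoted so that the shell passes the
backslashes through to tr unchanged.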
- Ranges.
-
The notation `m-n' expands to all of the characters
from m through n, in ascending order. m should
collate before n; if it doesn't, an error results. As an example,
`0-9' is the same as `0123456789'. Although GNU tr
does not support the System V syntax that uses square brackets to
enclose ranges, translations specified in that format will still work as
long as the brackets in set1 correspond to identical brackets
in set2.
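A short sketch of ranges in use, with made-up input text. The second
command uses the System V bracket style; it works here only because
`[' and `]' occupy the same positions in both sets, so each bracket is
simply translated to itself.

```shell
# a-z and A-Z expand to the 26 letters in ascending order, so each
# lowercase letter lines up with its uppercase counterpart.
upper=$(printf 'hello world\n' | tr 'a-z' 'A-Z')
printf '%s\n' "$upper"       # HELLO WORLD

# System V style: the brackets are ordinary characters in both sets.
sysv=$(printf 'hello [world]\n' | tr '[a-z]' '[A-Z]')
printf '%s\n' "$sysv"        # HELLO [WORLD]
```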
- Repeated characters.
-
The notation `[c*n]' in set2 expands to n
copies of character c. Thus, `[y*6]' is the same as
`yyyyyy'. If n begins with `0', it is interpreted in
octal, otherwise in decimal. The notation `[c*]' in set2 expands
to as many copies of c as are needed to make set2 as long as
set1.
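The two repeat notations might be used as in the sketch below. The
input strings are invented for the example.

```shell
# [#*] grows to match the ten characters of 0-9, so every digit
# is translated to '#'.
masked=$(printf 'card 4929 1234\n' | tr '0-9' '[#*]')
printf '%s\n' "$masked"      # card #### ####

# [x*3]y is the same as xxxy: a, b and c become x, and d becomes y.
mixed=$(printf 'abcd\n' | tr 'abcd' '[x*3]y')
printf '%s\n' "$mixed"       # xxxy
```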
- Character classes.
-
The notation `[:class:]' expands to all of the characters in
the (predefined) class class. The characters expand in no
particular order, except for the upper and lower classes,
which expand in ascending order. When the `--delete' (`-d')
and `--squeeze-repeats' (`-s') options are both given, any
character class can be used in set2. Otherwise, only the
character classes lower and upper are accepted in
set2, and then only if the corresponding character class
(upper and lower, respectively) is specified in the same
relative position in set1. Doing this specifies case conversion.
The class names are given below; an error results when an invalid class
name is given.
alnum
-
Letters and digits.
alpha
-
Letters.
blank
-
Horizontal whitespace.
cntrl
-
Control characters.
digit
-
Digits.
graph
-
Printable characters, not including space.
lower
-
Lowercase letters.
print
-
Printable characters, including space.
punct
-
Punctuation characters.
space
-
Horizontal or vertical whitespace.
upper
-
Uppercase letters.
xdigit
-
Hexadecimal digits.
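The following sketch shows the three situations described above: case
conversion with lower and upper in matching positions, a class in set1
with `-d', and an arbitrary class in set2 when `-d' and `-s' are
combined. It relies on the usual GNU tr behavior for `-ds' (delete the
characters in set1, then squeeze repeats of the characters in set2);
the input strings are made up.

```shell
# Case conversion: [:upper:] in set2 sits in the same position as
# [:lower:] in set1.
shout=$(printf 'Hello, World\n' | tr '[:lower:]' '[:upper:]')
printf '%s\n' "$shout"       # HELLO, WORLD

# With -d alone, only set1 is given: delete every digit.
letters=$(printf 'a1b2c3\n' | tr -d '[:digit:]')
printf '%s\n' "$letters"     # abc

# With -d and -s together, set2 may hold any class: delete the digits,
# then squeeze each run of whitespace down to a single character.
cleaned=$(printf 'a1  b2   c3\n' | tr -ds '[:digit:]' '[:space:]')
printf '%s\n' "$cleaned"     # a b c
```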
- Equivalence classes.
-
The syntax `[=c=]' expands to all of the characters that are
equivalent to c, in no particular order. Equivalence classes are
a relatively recent invention intended to support non-English alphabets.
But there seems to be no standard way to define them or determine their
contents. Therefore, they are not fully implemented in GNU tr;
each character's equivalence class consists only of that character,
which is of no particular use.
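A small sketch of the consequence, assuming a tr that behaves as
described here: since the class for `e' holds only `e' itself, writing
`[=e=]' in set1 gives the same result as writing a plain `e'. The input
word is arbitrary.

```shell
# Equivalence class in set1: matches only the letter e itself.
with_class=$(printf 'hello\n' | tr '[=e=]' 'E')
printf '%s\n' "$with_class"  # hEllo

# The plain character gives the identical translation.
plain=$(printf 'hello\n' | tr 'e' 'E')
printf '%s\n' "$plain"       # hEllo
```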