GNU Emacs Manual: Specify Coding

Q.9 Specifying a Coding System

In cases where Emacs does not automatically choose the right coding system, you can use these commands to specify one:

C-x RET f coding RET: Use coding system coding for the visited file in the current buffer.
C-x RET c coding RET: Specify coding system coding for the immediately following command.
C-x RET k coding RET: Use coding system coding for keyboard input.
C-x RET t coding RET: Use coding system coding for terminal output.
C-x RET p input-coding RET output-coding RET: Use coding systems input-coding and output-coding for subprocess input and output in the current buffer.
C-x RET x coding RET: Use coding system coding for transferring selections to and from other programs through the window system.
C-x RET X coding RET: Use coding system coding for transferring one selection--the next one--to or from the window system.

The command C-x RET f (set-buffer-file-coding-system) specifies the file coding system for the current buffer--in other words, which coding system to use when saving or rereading the visited file. You enter the coding system name in the minibuffer. Since this command applies to a file you have already visited, it affects only the way the file is saved.
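From Lisp, the same effect can be sketched by calling the function behind C-x RET f directly. The coding system iso-latin-1 is only an illustrative choice here; any coding system name known to your Emacs would do.

```elisp
;; Mark the current buffer to be saved as ISO Latin-1.
;; `iso-latin-1' is an illustrative choice, not a recommendation.
(set-buffer-file-coding-system 'iso-latin-1)

;; The choice is kept in the buffer-local variable
;; `buffer-file-coding-system', which you can inspect:
(message "This buffer will be saved as %s" buffer-file-coding-system)
```

The change takes effect the next time the buffer is saved.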

Another way to specify the coding system for a file is to do so when you visit the file. First use the command C-x RET c (universal-coding-system-argument); this command uses the minibuffer to read a coding system name. After you exit the minibuffer, the specified coding system is used for the immediately following command.

So if the immediately following command is C-x C-f, for example, it reads the file using that coding system (and records the coding system for when the file is saved). Or if the immediately following command is C-x C-w, it writes the file using that coding system. Other file commands affected by a specified coding system include C-x C-i and C-x C-v, as well as the other-window variants of C-x C-f.

C-x RET c also affects commands that start subprocesses, including M-x shell (see section AC.15 Running Shell Commands from Emacs).

However, if the immediately following command does not use the coding system, then C-x RET c ultimately has no effect.

An easy way to visit a file with no conversion is with the M-x find-file-literally command. See section M.2 Visiting Files.
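In Lisp code, an effect comparable to C-x RET c can be sketched by binding the variables coding-system-for-read and coding-system-for-write around a single file operation. The file names below are placeholders, and iso-latin-1 is an assumed example coding system.

```elisp
;; Visit a file, decoding it as ISO Latin-1.
;; The path is a placeholder for illustration.
(let ((coding-system-for-read 'iso-latin-1))
  (find-file "/tmp/example.txt"))

;; Write the current buffer to a new file, encoded as ISO Latin-1.
(let ((coding-system-for-write 'iso-latin-1))
  (write-file "/tmp/example-latin1.txt"))

;; Visit a file with no conversion at all.
(find-file-literally "/tmp/example.bin")
```

Binding these variables with let, rather than setting them globally, keeps the override limited to the one operation, much as C-x RET c applies only to the immediately following command.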

The variable default-buffer-file-coding-system specifies the choice of coding system to use when you create a new file. It applies when you find a new file, and when you create a buffer and then save it in a file. Selecting a language environment typically sets this variable to a good choice of default coding system for that language environment.
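As a minimal sketch, the variable can be set in an init file as shown below. The value iso-latin-1 is an assumed example. Later Emacs releases declare this variable obsolete; there, setting the default value of buffer-file-coding-system is the usual replacement, shown here as a commented alternative.

```elisp
;; New files default to ISO Latin-1 (illustrative choice).
(setq default-buffer-file-coding-system 'iso-latin-1)

;; Alternative for later Emacs releases, where the variable above
;; is obsolete:
;; (setq-default buffer-file-coding-system 'iso-latin-1)
```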

The command C-x RET t (set-terminal-coding-system) specifies the coding system for terminal output. If you specify a coding system for terminal output, all characters output to the terminal are translated into that coding system.

This feature is useful for certain character-only terminals built to support specific languages or character sets--for example, European terminals that support one of the ISO Latin character sets. You need to specify the terminal coding system when using multibyte text, so that Emacs knows which characters the terminal can actually handle.

By default, output to the terminal is not translated at all, unless Emacs can deduce the proper coding system from your terminal type or your locale specification (see section Q.3 Language Environments).
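If Emacs does not deduce the terminal coding system by itself, an init file could set it explicitly. This is a sketch that assumes a Latin-1 text terminal.

```elisp
;; Translate terminal output into ISO Latin-1.
;; Only meaningful when Emacs runs on a character terminal.
(set-terminal-coding-system 'iso-latin-1)

;; Report the coding system currently used for terminal output.
(message "Terminal coding system: %s" (terminal-coding-system))
```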

The command C-x RET k (set-keyboard-coding-system) or the Custom option keyboard-coding-system specifies the coding system for keyboard input. Character-code translation of keyboard input is useful for terminals with keys that send non-ASCII graphic characters--for example, some terminals designed for ISO Latin-1 or subsets of it.

By default, keyboard input is not translated at all.
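The keyboard counterpart looks much the same. The following sketch assumes a terminal whose keys send ISO Latin-1 codes.

```elisp
;; Decode keyboard input as ISO Latin-1 (assumed terminal behavior).
(set-keyboard-coding-system 'iso-latin-1)

;; Report the coding system currently used for keyboard input.
(message "Keyboard coding system: %s" (keyboard-coding-system))
```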

There is a similarity between using a coding system translation for keyboard input, and using an input method: both define sequences of keyboard input that translate into single characters. However, input methods are designed to be convenient for interactive use by humans, and the sequences that are translated are typically sequences of ASCII printing characters. Coding systems typically translate sequences of non-graphic characters.

The command C-x RET x (set-selection-coding-system) specifies the coding system for sending selected text to the window system, and for receiving the text of selections made in other applications. This command applies to all subsequent selections, until you override it by using the command again. The command C-x RET X (set-next-selection-coding-system) specifies the coding system for the next selection made in Emacs or read by Emacs.
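The two selection commands can also be called from Lisp. In this sketch the coding system names compound-text and iso-latin-1 are assumed examples; which one suits you depends on the other programs involved.

```elisp
;; Use compound text for all subsequent selections.
(set-selection-coding-system 'compound-text)

;; Override the coding system for the next selection only.
(set-next-selection-coding-system 'iso-latin-1)
```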

The command C-x RET p (set-buffer-process-coding-system) specifies the coding system for input and output to a subprocess. This command applies to the current buffer; normally, each subprocess has its own buffer, and thus you can use this command to specify translation to and from a particular subprocess by giving the command in the corresponding buffer.

The default for translation of process input and output depends on the current language environment.
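To override that default for one subprocess from Lisp, a sketch along the following lines may help. It assumes a buffer named *shell* with a live subprocess, such as the one M-x shell creates. The first argument is used to decode text received from the subprocess, and the second to encode text sent to it.

```elisp
;; Set the coding systems for the subprocess in the *shell* buffer,
;; if such a buffer exists and still has a running process.
(let ((buf (get-buffer "*shell*")))
  (if (and buf (get-buffer-process buf))
      (with-current-buffer buf
        ;; Arguments: decoding (from the process), encoding (to it).
        (set-buffer-process-coding-system 'iso-latin-1 'iso-latin-1))
    (message "No *shell* buffer with a running process")))
```

The check for a live process matters because the command signals an error in a buffer that has no subprocess.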

The variable file-name-coding-system specifies a coding system to use for encoding file names. If you set the variable to a coding system name (as a Lisp symbol or a string), Emacs encodes file names using that coding system for all file operations. This makes it possible to use non-ASCII characters in file names--or, at least, those non-ASCII characters which the specified coding system can encode.

If file-name-coding-system is nil, Emacs uses a default coding system determined by the selected language environment. In the default language environment, any non-ASCII characters in file names are not encoded specially; they appear in the file system using the internal Emacs representation.

Warning: if you change file-name-coding-system (or the language environment) in the middle of an Emacs session, problems can result if you have already visited files whose names were encoded using the earlier coding system and cannot be encoded (or are encoded differently) under the new coding system. If you try to save one of these buffers under the visited file name, saving may use the wrong file name, or it may get an error. If such a problem happens, use C-x C-w to specify a new file name for that buffer.
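Given the warning above, one reasonable approach is to set the variable once, early in the init file, before any files are visited. The value below is an assumed example.

```elisp
;; Encode file names as ISO Latin-1 for all file operations.
;; Set this before visiting any files to avoid mismatched names.
(setq file-name-coding-system 'iso-latin-1)
```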

The variable locale-coding-system specifies a coding system to use when encoding and decoding system strings such as system error messages and format-time-string formats and time stamps. That coding system is also used for decoding non-ASCII keyboard input on X Window systems. You should choose a coding system that is compatible with the underlying system's text representation, which is normally specified by one of the environment variables LC_ALL, LC_CTYPE, and LANG. (The first one, in the order specified above, whose value is nonempty is the one that determines the text representation.)
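The precedence rule for the environment variables can be sketched in Lisp as follows. The function name my-effective-locale is hypothetical and exists only for this illustration; the final setq uses an assumed example value.

```elisp
(defun my-effective-locale ()
  "Return the first nonempty value among LC_ALL, LC_CTYPE and LANG.
Return nil if none of them is set to a nonempty string."
  (let ((vars '("LC_ALL" "LC_CTYPE" "LANG"))
        (value nil))
    ;; Check the variables in order and stop at the first nonempty one.
    (while (and vars (null value))
      (let ((v (getenv (car vars))))
        (if (and v (not (string= v "")))
            (setq value v)))
      (setq vars (cdr vars)))
    value))

;; Show which locale setting determines the text representation.
(message "Effective locale: %s" (my-effective-locale))

;; Choose a matching coding system by hand (illustrative value).
(setq locale-coding-system 'iso-latin-1)
```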

This document was generated on April 2, 2002 using texi2html