Name

console-terminal-emulator — emulate a real terminal using a pseudo-terminal

Synopsis

console-terminal-emulator [--linux] [--sco] [--freebsd] [--netbsd] [--decvt] [--vcsa] [--inverted] {directory}

Description

console-terminal-emulator is a utility that expects file descriptor 4 to be the back end (called "master side" in older documentation) of a pseudo-terminal and the TTY environment variable to be the full device filename of the front end (called "slave side" in older documentation). (pty-get-tty(1) can be used to set this process state up.)

First it sets up the various files in directory, opening the input FIFO for reading and the display buffers for reading and writing. The FIFOs are created unconditionally, and opened in non-blocking mode. The display buffers are created if they do not exist. All created display buffer files have permissions rw-r-----. All created input FIFO files have permissions rw--w----.

It then enters a loop where it simultaneously:

  • processes all data received from the pseudo-terminal back end as terminal output, handling printing characters, control characters, escape sequences, and control sequences.

  • processes all input events from the input FIFO, sending terminal character and escape sequences to the back end.

It finishes when the back end signals hangup (i.e. when the line discipline would set modem control lines on a real serial device to signal hangup because the last front-end file descriptor is closed) and there are no more received data to output. At termination, it unlinks the file for the front end, and erases the display as if by a Clear Display control sequence.

Files

The files used are as follows:

directory/tty

A stable and predictable name for the front end of the pseudo-terminal device. This is a link to the device, if possible, or a symbolic link if not. (It is usually not. On Linux, the pseudo-terminal devices are on a filesystem of their own, which would make a link a cross-filesystem link; and on BSDs the devfs driver disallows the creation of directories under /dev, forcing directory to be on another filesystem.)

If directory is /run/dev/vc1, for example, then the stable name for the terminal, for use in login services, will be /run/dev/vc1/tty.

directory/input

The input FIFO, through which realizer processes send keyboard and mouse events. Events are in a uniform packet format, which the terminal emulator converts into appropriate escape sequences. This has its group ID explicitly set to the effective GID of the emulator process.

directory/display

The UTF-32, 24-bit colour, display buffer. This has its group ID explicitly set to the effective GID of the emulator process.

directory/vcsa

An 8-bit character set, 3-bit colour, display buffer that is compatible with the Linux vcsa devices for kernel virtual terminals. This is provided for compatibility with screen reader softwares. This has its group ID explicitly set to the effective GID of the emulator process.

Security

console-terminal-emulator requires no superuser privileges and is designed to be run entirely under the aegis of a dedicated unprivileged user account. It only requires write and search access to directory and need not have owner access to it.

Usually directory will be set-group-ID to a group different to the effective group ID of the emulator process. Changing the groups of directory/input, directory/display, and directory/vcsa to the effective group ID of the emulator process thus distinguishes group access to those files, allowing one to add ordinary users to that group in order to give them direct realizer access to the emulated terminal.

Terminal output

console-terminal-emulator emulates the character sequence processing logic of a hardware terminal, taking the data received from the back end of the pseudo-terminal and translating them into modifications to the screen buffer data files.

All character data are first decoded from UTF-8 to a stream of Unicode code points. The decoding is directly from UTF-8 to the code points; in particular, UTF-16 is not employed as an intermediary step. If a UTF-8 sequence decodes to one of the reserved code points used for UTF-16, no special treatment is given, and the code point is treated as any other.

The decoder has to do something with invalid UTF-8 encodings, including overlong and incomplete character sequences. Whatever code point they decode to, such characters are not processed as control characters or as parts of control/escape sequences. They abort any control/escape sequence that they interrupt, and if not incomplete encodings they are printed as ordinary printing characters even if they are the code points for control characters. This behaviour should not be relied upon, and programs should not send such UTF-8 sequences to a terminal.

vcsa buffer

The --vcsa command line option enables output to the directory/vcsa file.

This file is structured as per the Linux character device file of that name, as a 4-byte header giving size and cursor position information followed by a series of 2-byte character and attribute pairs in IBM PC CGA format. It is implemented mainly as a compatibility mechanism for the benefit of terminal realizing softwares that expect Linux vcsa devices, such as screen readers for the blind or partially sighted.

The terminal emulator neither recognizes nor handles CSI sequences that deal with 8-bit character sets. The vcsa screen buffer is always ISO 8859-1, with Unicode code points outwith that range converted to code point 255. To handle a wider range of characters, terminal realizing softwares should use the Unicode screen buffer. Code points in the ranges 0x00 to 0x1F and 0x80 to 0x9F may appear, and terminal realizing softwares are expected to render these with some form of ordinary printing graphic.

The 24-bit RGB foreground and background colour values for character cells are mapped to 3-bit CGA by comparing the relative intensities of red, green, and blue. The "bright foreground" and "bright background" CGA attribute bits are taken from the boldface and blink terminal attributes, respectively. No attempt is made to provide MDA or VGA in monochrome mode attributes, such as underline. To handle the full range of attributes, terminal realizing softwares should use the Unicode screen buffer.
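As a rough illustration of the vcsa layout and character conversion described above, the following Python sketch reads the 4-byte header and applies the "outwith ISO 8859-1 becomes 255" rule. It is not the emulator's code. The header field order (rows, columns, cursor x, cursor y) is an assumption taken from the Linux vcsa device convention, which this manual does not spell out, and both function names are hypothetical.

```python
import struct

def parse_vcsa_header(data: bytes) -> dict:
    """Split the 4-byte vcsa header into its fields.

    Assumption: the bytes are rows, columns, cursor x, cursor y,
    as on Linux vcsa devices.
    """
    rows, cols, x, y = struct.unpack("4B", data[:4])
    return {"rows": rows, "cols": cols, "x": x, "y": y}

def to_vcsa_char(code_point: int) -> int:
    """Map a Unicode code point to the 8-bit vcsa character.

    Code points outside ISO 8859-1 become 255, per the text above.
    """
    return code_point if 0 <= code_point <= 0xFF else 0xFF
```

A realizer that needs anything beyond this 8-bit view would read directory/display instead.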

Unicode buffer

The directory/display file begins with a 16-byte header:

  1. 4-byte UCS-4 Byte Order Mark in host byte order.

  2. 2-byte width in host byte order.

  3. 2-byte height in host byte order.

  4. 2-byte cursor X position in host byte order.

  5. 2-byte cursor Y position in host byte order.

  6. cursor glyph type byte (upper nybble reserved).

  7. cursor attributes byte (upper nybble reserved).

  8. screen flags and pointer attributes byte, upper and lower nybble.

  9. Reserved byte.

That is followed by a series of 16-byte records, one per character cell, containing:

  1. Foreground alpha value byte.

  2. Foreground red value byte.

  3. Foreground green value byte.

  4. Foreground blue value byte.

  5. Background alpha value byte.

  6. Background red value byte.

  7. Background green value byte.

  8. Background blue value byte.

  9. 4-byte UCS-4 value in host byte order.

  10. 2-byte attributes.

  11. Reserved bytes.

Unassigned code points, reserved code points, control code points, combining code points, and zero-width code points may appear, and terminal realizing softwares are expected to render these with some form of ordinary printing graphic. For forwards and backwards compatibility, reserved bits in records should be written as zeroes, ignored when read, and preserved when copied.
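The header and cell layouts listed above can be read with fixed-size structures. The Python sketch below is illustrative only. It assumes that the fields are packed in exactly the listed order with no padding, that the cell record ends with two reserved bytes (the remainder of its 16 bytes), and that cells are addressed by their ordinal position in the file. The names HEADER, CELL, parse_header, and parse_cell are hypothetical.

```python
import struct

# "=" selects native (host) byte order with no alignment padding.
HEADER = struct.Struct("=IHHHHBBBB")   # 16 bytes
CELL = struct.Struct("=BBBBBBBBIH2x")  # 16 bytes; 2 trailing reserved bytes (assumed)

def parse_header(data: bytes) -> dict:
    """Decode the 16-byte header at the start of the display file."""
    bom, width, height, cx, cy, glyph, cattr, flags, _reserved = HEADER.unpack_from(data, 0)
    return {
        "bom": bom,
        "width": width,
        "height": height,
        "cursor_x": cx,
        "cursor_y": cy,
        "cursor_glyph": glyph & 0x0F,       # upper nybble is reserved, so ignored
        "cursor_attributes": cattr & 0x0F,  # upper nybble is reserved, so ignored
        "flags": flags,                     # raw byte; nybbles left undecoded
    }

def parse_cell(data: bytes, index: int) -> dict:
    """Decode the character cell record at ordinal position index."""
    fa, fr, fg, fb, ba, br, bg, bb, code_point, attributes = CELL.unpack_from(
        data, HEADER.size + index * CELL.size
    )
    return {
        "foreground": (fa, fr, fg, fb),  # alpha, red, green, blue
        "background": (ba, br, bg, bb),
        "code_point": code_point,
        "attributes": attributes,
    }
```

A reader can check the byte order by comparing the decoded "bom" field with 0xFEFF, and should treat any other value as a buffer written on a host of the opposite endianness.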

Control character processing

Characters in the "Cc" ("Other, Control") Unicode code point category (a.k.a. "C0" and "C1" characters) are control characters. They are always processed, even in the middle of escape or control sequences. All control characters are no-ops except for the following:

CR

Carriage return. Move to column 0.

NEL

Newline. Move to column 0 and one row down.

LF, VT, FF, and IND

Linefeed/Vertical tab/Index. Move one row down, remaining in the current column.

RI

Reverse index. Move one row up, remaining in the current column.

TAB

Horizontal tab. Move to the next tabstop, or the last column.

BS

Backspace. Nondestructively move to the previous column, stopping at the first column.

DEL

Delete. Delete the character at the cursor position, moving the remainder of the row to the left and padding the final column with a space.

HTS

Horizontal tab set. Set a tabstop at the current column.

CAN

Cancel. Cancel any control/escape sequence currently in progress.

ESC

Escape. Cancel any control/escape sequence currently in progress and begin an escape sequence.

CSI

Control sequence introducer. Cancel any control/escape sequence currently in progress and begin a control sequence.

Escape sequences

Escape sequences are multiple-character sequences comprising:

  • the ESC character, an optional intermediate character in the range U+0020 to U+002F, and a single final character in the range U+0040 to U+007E.

    Most such escape sequences for real terminals are ISO 2022 character set switching sequences, which the terminal emulator has no need of since it uses UTF-8 natively, and so does not support. The only supported escape sequences are:

    DECALN

    the DEC VT extension that fills the display with the letter "E" as a screen alignment test

    S7C1T

    set the terminal emulator to send C1 characters in control sequences encoded with the ECMA-48 7-bit extensions (as below)

    S8C1T

    set the terminal emulator to send C1 characters in control sequences as plain single C1 code points (but UTF-8 encoded)

  • the ESC character and a single final character in the range U+0040 to U+005F.

    This is the two-character 7-bit mechanism of ECMA-48 section 5. It is strictly speaking unneeded given that the terminal emulator is 8-bit clean and employs UTF-8 as standard. The entire U+0080 to U+009F control character range is accessible via this mechanism. So NEL can be emitted as either the single character U+0085 or as the two-character sequence U+001B U+0045. (Of course, U+0085 encodes to two bytes in UTF-8.) Similarly, CSI can be emitted as either the single character U+009B or as the two-character sequence U+001B U+005B.
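The correspondence between C1 characters and their two-character 7-bit forms is a fixed offset of 0x40 on the final character. The following Python sketch shows that arithmetic; the function names are hypothetical and it does not represent the emulator's own parser.

```python
ESC = "\x1b"

def c1_to_7bit(code_point: int) -> str:
    """Return the ESC-prefixed 7-bit form of a C1 control character."""
    if not 0x80 <= code_point <= 0x9F:
        raise ValueError("not a C1 control character")
    return ESC + chr(code_point - 0x40)

def c1_from_7bit(sequence: str) -> int:
    """Return the C1 code point denoted by ESC plus a final in U+0040 to U+005F."""
    if len(sequence) != 2 or sequence[0] != ESC or not 0x40 <= ord(sequence[1]) <= 0x5F:
        raise ValueError("not a two-character 7-bit C1 sequence")
    return ord(sequence[1]) + 0x40
```

For instance, c1_to_7bit(0x85) gives ESC followed by "E" for NEL, and c1_from_7bit applied to ESC followed by "[" gives 0x9B for CSI, matching the examples in the text.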

Control sequences

Per ECMA-48, control sequences are multiple-character sequences comprising the CSI character followed by parameter characters in the range U+0030 to U+003F followed optionally by an intermediate character in the range U+0020 to U+002F and terminated by a single final character in the range U+0040 to U+007E.

Parameters comprise the digits, semi-colon (U+003B), and colon (U+003A), forming up to 16 semi-colon separated sub-sequences; with an initial character in the range U+003C to U+003F denoting a DEC vendor-private control sequence that is an extension to ECMA-48. Each sub-sequence is up to 16 colon separated digit sequences. More than 16 parameters or sub-parameters simply causes trailing ones to be discarded. As an extension to DEC, the terminal allows vendor-private characters anywhere in the parameter character sequence, rather than only in the initial position, ignoring all but the last one. However, this is largely to simplify implementation and should not be relied upon.

The digit sequences are parameters to the action, and zero-length digit sequences are a parameter with the value zero (or, in some cases, 1 or 2). In most cases, a control sequence with N parameters is equivalent to N such control sequences in order each with one of the parameters.

ISO 8613-6/ITU T.416 SGR colour extensions make use of this mechanism, with colour values being encoded as sub-parameters behind a leading 38 or 48 sub-parameter. That standard explicitly states in section 13.1.8 that "Pe" values are separated by character "3/10" (i.e. U+003A). Such SGR sequences are unambiguous, since it is not possible to mistake a sub-parameter specifying a colour value for an SGR attribute code: CSI 1;4;38:5:14;48:2:0:224:3:7m.
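The parameter structure described above (semi-colon separated sub-sequences, each made of colon separated digit sequences, with at most 16 of each) can be sketched in Python as follows. This is a simplified illustration rather than the emulator's parser. It assumes that the parameter text contains only digits, colons, and semi-colons (vendor-private characters are not handled), and it always reads a zero-length digit sequence as zero. The function name is hypothetical.

```python
MAX_ITEMS = 16  # parameters per sequence, and sub-parameters per parameter

def parse_parameters(text: str) -> list[list[int]]:
    """Split control sequence parameter characters into sub-parameter lists."""
    parameters = []
    # Trailing parameters beyond the limit are simply discarded.
    for sub_sequence in text.split(";")[:MAX_ITEMS]:
        # A zero-length digit sequence counts as the value zero here.
        subs = [int(digits) if digits else 0
                for digits in sub_sequence.split(":")[:MAX_ITEMS]]
        parameters.append(subs)
    return parameters
```

Applied to the parameter characters of the SGR example above, this yields four parameters: [1], [4], [38, 5, 14], and [48, 2, 0, 224, 3, 7], so the colour values can never be mistaken for attribute codes.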

As extensions, the row and column counts to the DTTerm DECSLPP 8 control sequence can also be expressed using sub-parameters.

As explained at length in the console-control-sequence(1) user manual, the RIS control sequence has been proscribed for decades. This terminal emulator implements the DECSTR (Soft Terminal Reset) control sequence, which should always be used instead of RIS.

There is one control sequence that is peculiar to this terminal emulator. DEC Private Mode #1369 is so-called "square" mode. If it is off, all Unicode characters classified as "full-width" or "wide" have an extra space printed after them. This is for the benefit of realizers that have oblong cells that cannot hold full-width or wide glyphs. It defaults to on, where no such special processing occurs.

Colours and attributes

The terminal emulator maintains a current set of character attributes, a 24-bit RGB foreground colour, and a 24-bit RGB background colour. These are combined with character code points when writing to the character cells of each screen buffer. Each character cell thus has its own, independent, attributes and 24-bit foreground and background colours.

The ANSI colour attributes set by the SGR control sequence map from the 3-bit RGB ANSI colours to 24-bit RGB using 255 for a 1 bit and 0 for a 0 bit. The terminal emulator also supports the ISO 8613-6/ITU T.416 SGR colour extensions, that are often misattributed to xterm:

  • The indexed colour extension (SGR 38:5 and SGR 48:5) has the conventional mapping of 16 "standard" colours, a 24 value grayscale, and a 6×6×6 colour cube.

  • The RGB direct colour extension (SGR 38:2 and SGR 48:2) simply sets the 24-bit RGB values directly.
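The colour mappings above can be sketched in Python. The 3-bit ANSI mapping follows the 255-or-0 rule stated in the text. For the indexed extension, this manual does not give the exact component values, so the sketch assumes the widely used xterm convention (cube levels 0, 95, 135, 175, 215, 255 and grey levels 8 + 10n); the emulator's actual values may differ. The 16 "standard" colours are left out because their values are not stated here. Function names are hypothetical.

```python
def ansi_to_rgb(colour: int) -> tuple:
    """Map a 3-bit ANSI colour (0 to 7) to 24-bit RGB.

    Bit 0 is red, bit 1 is green, bit 2 is blue; a set bit becomes 255.
    """
    if not 0 <= colour <= 7:
        raise ValueError("not a 3-bit ANSI colour")
    return tuple(255 if colour & bit else 0 for bit in (1, 2, 4))

# Assumption: xterm's conventional levels for the 6x6x6 cube.
CUBE_LEVELS = (0, 95, 135, 175, 215, 255)

def indexed_to_rgb(index: int) -> tuple:
    """Map an indexed colour (16 to 255) to 24-bit RGB, xterm style."""
    if 16 <= index <= 231:
        index -= 16
        return (CUBE_LEVELS[index // 36],
                CUBE_LEVELS[(index // 6) % 6],
                CUBE_LEVELS[index % 6])
    if 232 <= index <= 255:
        level = 8 + 10 * (index - 232)  # 24-step grayscale ramp (assumed values)
        return (level, level, level)
    raise ValueError("standard colours 0 to 15 are not covered by this sketch")
```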

When characters or lines are deleted, or characters or lines or the screen are erased, the background colour used for the "filler" that is placed in the newly blanked cells is the current background colour. This is "background colour erase" and is the default for DEC VT520 terminals as well as the only option on Linux and BSD kernel virtual terminals. By changing the "DECECM" private mode setting, it can be switched to "screen colour erase", which uses the default terminal background colour (as set by SGR 0 and SGR 49).

The --invert command line option determines the initial state of the "DECSCNM" private mode setting, defaulting to off (i.e. "dark"). Use this for a default colour pair akin to a Sun SPARC machine's virtual terminal console, or GUI terminal emulators.

Printing characters

All other characters are "printing" characters.

Characters in the "Cf" ("Other, Format") and "Mn" ("Mark, Non-spacing") Unicode code point categories are simply discarded. Note that this includes U+00AD (soft hyphen). The terminal emulator does not have enough information about word breaks or context to handle soft hyphenation. This behaviour differs from some other terminal emulation programs, which are based upon code written by Markus Kuhn that treats U+00AD unconditionally as a spacing character and explicitly overrides its "Cf" categorization.

Characters in the "Me" ("Mark, Enclosing") Unicode code point category overstrike the character at the current cursor position without advancing.

All other printing characters are printed as-is, using the currently set attributes, foreground colour, and background colour. After each character is printed, the cursor position is advanced.

By default, the emulator has automatic right margins turned on. Conceptually, automatic right margins means that writing a character in the last column automatically returns to the first column and moves down a row, a line wrap. If automatic right margins are turned off, writing a character in the last column does not move down a row or return to the first column. In practice, things are slightly more complex if automatic margins are turned on. Rather than a wrap happening there and then when the character is written to the last column, instead a pending wrap is flagged. If the cursor is then explicitly moved, the pending wrap is cancelled. Otherwise, when the next graphic is to be printed the pending wrap is enacted immediately beforehand.

The purpose of pending wrap is to allow full-screen TUI programs to write right up to the lower-right-hand corner without scrolling the screen. Programs must be aware of and careful about its effect. In the pending state, the terminal cursor is in a different position to where an application would have expected it to immediately wrap to; and using relative cursor motions in that state, including BackSpace and Next Line, needs to account for this.
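The pending wrap behaviour described above amounts to a small state machine. The Python sketch below models only the cursor, for single-width characters, and does not model scrolling or the screen contents. It is an illustration of the described behaviour, not the emulator's implementation, and the class and method names are hypothetical.

```python
class Cursor:
    """Cursor model with automatic right margins and pending wrap."""

    def __init__(self, width: int, height: int):
        self.width, self.height = width, height
        self.x = self.y = 0
        self.pending_wrap = False
        self.auto_margin = True

    def print_char(self) -> None:
        """Advance as if one single-width printing character were written."""
        if self.pending_wrap:
            # Enact the deferred wrap immediately before printing.
            self.pending_wrap = False
            self.x = 0
            self.y = min(self.y + 1, self.height - 1)  # scrolling not modelled
        if self.x < self.width - 1:
            self.x += 1
        elif self.auto_margin:
            # Written to the last column: stay put and flag the wrap.
            self.pending_wrap = True

    def move_to(self, x: int, y: int) -> None:
        """Explicit cursor motion cancels any pending wrap."""
        self.pending_wrap = False
        self.x, self.y = x, y
```

On a 3-column model, printing three characters leaves the cursor in the last column of row 0 with a wrap pending, not at the start of row 1. That is the discrepancy that relative cursor motions must account for.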

By default, the emulator also has scrolling turned on, so that moving down from the last row scrolls the buffer up and moving up from the first row scrolls the buffer down. If scrolling is turned off, moving down from the last row or moving up from the first row has no effect. Scrolling only applies to cursor advancement by printing characters or the Newline, Index, or Reverse Index control characters. It does not apply to cursor motion control sequences.

Terminal input

Terminal input operates in terms of a stream of input events, comprising Unicode characters or special keys.

Input protocol

The directory/input FIFO receives a sequence of 4-byte messages. To avoid message tearing, realizers must ensure that they do not write messages using multiple system calls. A message is a 32-bit word in host byte order. The most significant byte denotes the message type and the interpretation of the remainder of the message.

0x00nnnnnn

A null message. This is ignored.

0x01cccccc

Unicode character. The UCS-4 code point is cccccc. This is UTF-8 encoded and sent through to the terminal line discipline as terminal input. If bracketed paste has been switched on, the control sequence denoting end of paste is sent beforehand.

0x02kkkkmm

A system keypad key. The key number is kkkk and mm is a set of bitflags indicating the current state of modifier keys. System keys are ignored by the terminal emulator.

0x04kkkkmm

The absolute horizontal (x axis) position of the mouse. The column number is kkkk and mm is a set of bitflags indicating the current state of modifier keys. If bracketed paste has been switched on, the control sequence denoting end of paste is sent before any generated mouse report.

0x05kkkkmm

The absolute vertical (y axis) position of the mouse. The row number is kkkk and mm is a set of bitflags indicating the current state of modifier keys. If bracketed paste has been switched on, the control sequence denoting end of paste is sent before any generated mouse report.

0x06kkkkmm

The absolute depth (z axis) position of the mouse. The depth position is kkkk and mm is a set of bitflags indicating the current state of modifier keys. If bracketed paste has been switched on, the control sequence denoting end of paste is sent before any generated mouse report.

0x07kkbbmm

A mouse button. The button number is kk, its state is bb, and mm is a set of bitflags indicating the current state of modifier keys. The well-known buttons are numbered in DEC VT order: left, middle, right, side. If bracketed paste has been switched on, the control sequence denoting end of paste is sent before any generated mouse report.

0x08kknnmm

A mouse wheel motion. The wheel number is kk, the (signed) amount by which the wheel is scrolled is nn, and mm is a set of bitflags indicating the current state of modifier keys. If bracketed paste has been switched on, the control sequence denoting end of paste is sent before any generated mouse report.

0x09cccccc

Pasted Unicode character. The UCS-4 code point is cccccc. This is UTF-8 encoded and sent through to the terminal line discipline as terminal input. If bracketed paste has been switched on, the control sequence denoting start of paste is sent beforehand. If the character is ESC or CSI, the control sequence denoting end of paste is sent afterward.

0x0Ckkkkmm

A consumer device key. The key number is kkkk and mm is a set of bitflags indicating the current state of modifier keys. Consumer device keys are ignored by the terminal emulator, because of a lack of appropriate escape sequences and control sequences for representing them.

0x0Ekkkkmm

An extended key. The key number is kkkk and mm is a set of bitflags indicating the current state of modifier keys. The terminal emulator sends the equivalent control sequence through to the line discipline as terminal input. If bracketed paste has been switched on, the control sequence denoting end of paste is sent beforehand.

0x0Fkkkkmm

A function key. The function key number is kkkk and mm is a set of bitflags indicating the current state of modifier keys. The terminal emulator sends the equivalent control sequence through to the line discipline as terminal input. If bracketed paste has been switched on, the control sequence denoting end of paste is sent beforehand.

0x11cccccc

Unicode accelerator character. The UCS-4 code point is cccccc. This is UTF-8 encoded and sent through to the terminal line discipline as terminal input, prefixed with ESC. If bracketed paste has been switched on, the control sequence denoting end of paste is sent beforehand.

The modifier flags represent abstract level2, level3, control, group2, and super modifiers. (See ISO 9995 for the concepts of levels and groups; super is an abstract "Command" or "GUI" modifier.) The input protocol does not comprise standalone events for modifier keys or associated modifier lock keys.

The numbering of system, extended, and consumer keys largely follows the ID numbers used for keys in the USB HID protocol, with some exceptions, and a number of additions for keys that are not present in USB. Importantly, keys that generate ordinary Unicode characters are sent as a Unicode character message and not as an extended key with the USB keycode.
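A realizer might build and send input messages along the following lines. This Python sketch covers only the 0x01 (Unicode character) and 0x0F (function key) message types from the list above. It treats the modifier byte as an opaque value, because the individual flag bit assignments are not given in this manual, and the function names are hypothetical.

```python
import os
import struct

def character_message(code_point: int) -> bytes:
    """Build a 0x01cccccc Unicode character message in host byte order."""
    return struct.pack("=I", (0x01 << 24) | (code_point & 0xFFFFFF))

def function_key_message(key: int, modifiers: int) -> bytes:
    """Build a 0x0Fkkkkmm function key message in host byte order."""
    return struct.pack("=I", (0x0F << 24) | ((key & 0xFFFF) << 8) | (modifiers & 0xFF))

def send(fd: int, message: bytes) -> None:
    """Write one whole 4-byte message with a single system call."""
    os.write(fd, message)
```

Issuing one write per complete message, instead of writing a message piecemeal, is what avoids the message tearing mentioned above.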

The abstract keyboard

The terminal emulator has no dealing in keyboard maps and exactly how keystrokes translate to Unicode characters; which are the province of a realizer.

The terminal emulator has no dealing in keyboard modifier state tracking; nor, similarly, does it deal in using modifiers to change dual-mode keypads. It has no dealings in numlock, capslock, shiftlock and these are not part of the abstract keyboard that is used in the input protocol. Rather, dealing with this is entirely and solely within the remits of realizers. For capslock, numlock, and shiftlock, the interactions of the locks and shifts is handled before the input stream is translated into the abstract keyboard.

Consequently, for the calculator keypad keys and the cursor/editing keypad keys on the abstract keyboard there are exactly two behaviours: "application mode" and "normal mode". There are no sub-states of "normal mode" controlled by numlock, albeit that in "normal mode" whatever the realizer transmits as an accompanying modifier state may be reflected in generated control sequences using the DEC VT augmentations to DECFNK employed by the VT420 onwards. (Numeric lock is a function internal to most keyboard hardwares anyway on PS/2 and USB keyboards, with many keyboards sending different scan codes according to the value of the NumLock status LED. Dealing with it lies in the part of the system that speaks to hardware.) These sub-modes are provided by realizers and their various keyboard mapping systems, and numlock and the associated shifting are handled before the input stream is translated into the abstract keyboard.

Realizers also perform any substitution of PF1 to PF5 on the calculator keypad (which are distinct, extended, keys) for the F1 to F5 function keys. This is a substitution that originates with DEC VTs, where at reset F1 to F5 were in a "local" mode and did not transmit characters, resulting in applications developers relying upon the calculator keypad keys instead, and hence terminal emulator softwares performing the same substitution for application compatibility. (In later model DEC VTs these function keys could all be reconfigured to "host mode" and would transmit DECFNK control sequences like any other function key.) The input protocol distinguishes the function keys from the calculator keypad keys, and a message denoting F1 to F5 denotes those function keys in their "host" mode, not the calculator keypad keys.

Control sequences for mouse

The DEC status reports report that a locator is always present and available, and that it is a mouse device.

Two mouse protocols are supported, the xterm Private Mode #1006 protocol and the DEC VT Locator protocol. These correspond to the ttymouse=sgr and ttymouse=dec settings in vim, respectively. These are superior to the unsupported xterm Private Mode #9 ("xterm.X10"), #1005 ("xterm.UTF-8"), and #1015 ("urxvt") protocols, for several reasons including avoiding encoding ambiguities and display dimension limits, and supersede them.

In the DECPM #1006 protocol, the click-only (DECPM #1000), button motion (DECPM #1002), and all events (DECPM #1003) modes are all available. The mouse grabber mode (DECPM #1001) is not.

In the DEC VT Locator protocol, only character coordinates are available, the terminal emulator having no dealings in displays; they being the province of a realizer and possibly not even being pixel-addressable.

Control sequences for keyboard

The DEC status reports report that a keyboard is always present, with an unknown country layout.

Extended keys and function keys cause the terminal emulator to send control sequences through the line discipline to terminal input. What control sequences are sent depends on which emulation type the terminal emulator is set to: SCO, Linux, FreeBSD, NetBSD, or DEC VT. No emulated terminal type defines control sequences for all keys, or distinct control sequences where it does define them.

  • DEC VTs only define control sequences for function keys from 1 to 20 and have no defined control sequences for function keys outwith that range. These function keys, and keys on the editing keypad, are encoded as the DECFNK control sequence. In particular, note that function keys F1 to F5 have documented DECFNK numbers (given in the VT420 and VT525 programmers' references), should a realizer actually send those keys.

    DEC VTs employ SS3 sequences for the PF1 to PF5 keys on the calculator keypad. Structurally, a single shift cannot carry parameterized modifier information. (XTerm is faulty in this regard.)

  • The SCO kernel virtual terminal defines SCO FNK control sequences for function keys from 1 to 48. (These clash with some control sequences defined by other terminal type families, in particular XTerm.)

    • PF1 to PF5 do not exist in a real SCO KVT, which does not replace F1 to F5.

    • For F1 to F12 the level 2 shift and control modifiers are encoded into the function key number, and the SCO KVT's original SCO FNK control sequences (which have no parameters) are emitted. For example: CSI U is F9. CSI ^ is Shift+Control+F9.

    • For F13 to F48 the SCO FNK control sequences are extended to have parameters. The function key numbers are used as-is, and the modifiers are instead encoded into the second sub-parameter of the control sequence, in similar form to DECFNK (except zero-based). The first sub-parameter is a (usually 1) repeat count, again similar to DECFNK. For example: CSI 1:5U is Shift+Control+F9.

  • The FreeBSD kernel virtual terminal only has function keys from 1 to 12. Because of incomplete functionality in its Teken library, it uses a mongrel admixture of DECFNK and SCO FNK sequences.

    • PF1 to PF4 (which replace F1 to F4, of course) always generate SS3 sequences, irrespective of modifiers.

    • Unmodified F1 to F12 are transmitted as DECFNK control sequences, the result of a hardwired map in the Teken library itself.

    • Modified F1 to F12 are transmitted as SCO FNK control sequences, the result of the default contents of keyboard maps set up by the kbdcontrol(1) tool.

  • The Linux kernel virtual terminal only defines control sequences for function keys from 1 to 20 and has no defined control sequences for function keys outwith that range. It employs DECFNK for actual function keys.

    The Linux KVT employs Linux FNK for the PF1 to PF5 keys on the calculator keypad. Linux FNK is not conformant with the ECMA-48 rules for control sequences and easy to erroneously confuse with the CSI control sequences for cursor keys, and should be avoided if possible.

The terminal emulator may fall back to the ECMA-48 FNK sequences, for function key numbers and modifier chords which have no DECFNK, Linux FNK, or SCO FNK control sequences. As an extension, the first sub-parameter to FNK encodes modifiers, in similar form to DECFNK (except zero-based). For example: FNK 1:0 is F1. FNK 9:5 is Shift+Control+F9.

Control sequences for pasted input

The Private Mode #2004 controls whether bracketed paste is switched on. When it is, sequences of successive pasted characters are bracketed by DECFNK sequences denoting paste on and paste off.

To prevent actual pasted character sequences from resembling DECFNK, paste is switched off after every pasted ESC or CSI character. If the next character is a pasted one, paste will then be switched back on again.
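The bracketing rules above, together with the per-message rules in the input protocol, can be modelled as a two-state machine. The Python sketch below is an illustration only. It assumes the xterm-style sequences CSI 200~ and CSI 201~ for paste on and paste off, in their 7-bit ESC [ form; this manual only says that they are DECFNK sequences and does not give them. The event representation and function name are hypothetical.

```python
PASTE_START = "\x1b[200~"  # assumption: xterm's "paste on" sequence
PASTE_END = "\x1b[201~"    # assumption: xterm's "paste off" sequence

def encode_events(events, bracketed: bool = True) -> str:
    """Encode (kind, char) events, where kind is "paste" or "key"."""
    out = []
    pasting = False
    for kind, char in events:
        if kind == "paste":
            if bracketed and not pasting:
                out.append(PASTE_START)  # start of paste sent beforehand
                pasting = True
            out.append(char)
            if pasting and char in ("\x1b", "\x9b"):
                # A pasted ESC or CSI switches paste off straight afterwards.
                out.append(PASTE_END)
                pasting = False
        else:
            if pasting:
                out.append(PASTE_END)    # end of paste sent beforehand
                pasting = False
            out.append(char)
    return "".join(out)
```

In this model a pasted ESC is always immediately followed by the paste off sequence, so pasted text cannot itself spell out a complete paste off sequence while paste is on.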

Differences from terminals and documented standards

The terminal emulator does not replicate all features of a real hardware terminal. Its goal is to provide a workalike for (the TUI parts of) the virtual terminals that are/were built in to the Linux and BSD operating system kernels. There is no support for the historical features of real terminal hardwares such as attached printers, page switching, status lines, XON/XOFF modem flow control, programmable function keys, alternative (WYSE/TVI) control/escape sequences, direct auxiliary serial device I/O, and graphics modes.

Rather, the terminal emulator is aimed at handling the outputs of TUI programs that use the "linux" terminal type (for the Linux kernel virtual terminals), the "pcvtXX" terminal type (for NetBSD kernel virtual terminals), the "pccon" terminal type (for OpenBSD kernel virtual terminals), or the "cons25" and "teken" terminal types (for FreeBSD kernel virtual terminals). (These are the terminal types set by vc-get-tty(1).)

The terminal emulator also has no dealing in the things that are the domain of separate realizing tools. The modular nature of user-space virtual terminals means that the terminal emulator has no knowledge of the actual devices used to realize the terminal, and that there can be zero or many realizers for any given virtual terminal. There is no support for font definitions, window titles, 256-colour VGA palettes, VGA-specific overscan and underline, VESA power, screen-savers, or multi-mode keypads. In response to device status report requests, it always responds with fixed information about a set of pseudo-devices.

The terminal emulator is 8-bit clean and employs UTF-8 as standard. Therefore, the terminal emulator has no need for mechanisms to switch 8-bit code pages amongst multiple character sets. There is no ISO 2022 support.

The two-character 7-bit mechanism of ECMA-48 (section 5) is not only present, but is more completely implemented than in several kernel virtual terminal emulators, which usually only implement the 7-bit aliases for CSI and OSC.

Both ECMA-48 and the DEC VT520 Video Terminal Programmer Information [EK-VT520-RM] reference are straightforward and clear about numeric parameters to cursor motion and screen editing control sequences: a value of 0 is given no special meaning and just means zero rows/columns/repeats/whatever. There is a full explanation of this in Annex E of ECMA-48:1986 and Annex F of ECMA-48:1991. Strictly speaking, that is not the behaviour of the Linux or FreeBSD kernel virtual terminal emulators, both of which apply the old 1976 rule; nor is it the behaviour of old DEC VTs.

This is one place where the terminal emulator deviates from the aforementioned goal, in favour of conforming with the ECMA-48 standard. Per the standard, it defaults to zero meaning zero. It allows this to be controlled with Mode #22, even though using this mode to switch back to the old behaviour was deprecated in ECMA-48:1991. Nowadays, several common GUI terminal emulators obey the zero-means-zero rule; and at the time of writing this manual it has been 34 years since the standard was changed.
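The difference between the two rules can be sketched for a cursor motion repeat count as follows. This is an illustrative Python sketch only: the function name is made up, and the zero_means_default flag is a hypothetical stand-in for whichever state of Mode #22 selects the old behaviour.

```python
def repeat_count(param, zero_means_default=False):
    """Resolve a cursor-motion parameter.

    param is the parsed integer, or None when the parameter was omitted.
    zero_means_default is a hypothetical flag standing in for the old rule.
    """
    if param is None:
        return 1  # an omitted parameter takes the default, which is 1 for cursor motion
    if param == 0 and zero_means_default:
        return 1  # old rule: an explicit 0 is treated like an omitted parameter
    return param  # standard rule: 0 really means zero rows/columns/repeats

print(repeat_count(0), repeat_count(0, zero_means_default=True))
```

Under the standard rule a cursor motion with an explicit 0 moves nowhere; under the old rule it moves by one.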

Neither ECMA-48 nor the DEC VT520 Video Terminal Programmer Information [EK-VT520-RM] reference documents a quirk of most DEC VT-family emulators: holding line wrap pending. (This was only ever documented by DEC in internal documentation that was marked "company confidential".) Actual implementations, including the FreeBSD kernel virtual terminal emulator, have this mechanism. This is largely undocumented behaviour, but it is behaviour that many programs (including the prompt display of the Z Shell, for example) rely upon. So it is implemented here.
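A pending line wrap can be modelled roughly as below. This is a simplified Python sketch under stated assumptions: the class and attribute names are invented, scrolling is ignored, and only printing characters and carriage return are modelled, so it should not be read as the emulator's actual logic.

```python
class Screen:
    """Toy cursor model showing a deferred (pending) line wrap."""

    def __init__(self, width):
        self.width = width
        self.row = 0
        self.col = 0
        self.pending_wrap = False
        self.cells = {}

    def put(self, ch):
        # A wrap left pending by the previous character happens only now,
        # when another printing character actually arrives.
        if self.pending_wrap:
            self.row += 1
            self.col = 0
            self.pending_wrap = False
        self.cells[(self.row, self.col)] = ch
        if self.col == self.width - 1:
            # Last column: stay put and remember that a wrap is owed.
            self.pending_wrap = True
        else:
            self.col += 1

    def carriage_return(self):
        # Cursor motion cancels the pending wrap instead of performing it.
        self.col = 0
        self.pending_wrap = False
```

The practical effect is that a program can fill the last column of a line and then emit a carriage return without an unwanted blank line appearing.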

There is a long history of terminal emulators getting ISO 8613-6/ITU T.416 SGR 38 and SGR 48 wrong. Terminal emulators have variously forgotten the colour space selector in sub-parameter 2 of SGR 38:2 and SGR 48:2. A long-standing misunderstanding/misreading on the parts of many people has led, moreover, to most programs (from terminfo to the Linux kernel terminal emulator) sending and expecting semi-colons instead of colons, leading to ambiguous SGR sequences: CSI 1;4;38;5;14;48;2;0;224;3;7m. For compatibility, the terminal emulator understands these non-standard and ambiguous control sequences, converting (all remaining) parameters into sub-parameters to resolve ambiguity; but the standard control sequences are preferred. Please fix this 25-year-old bug if your application has it.
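The grouping step can be sketched in Python as follows. This is a literal reading of the sentence above, not the terminal emulator's code: both function names are hypothetical, an empty sub-parameter is represented as None, and how the folded sub-parameters are subsequently consumed is not described here, so the sketch stops at the grouping.

```python
def parse_sgr(body):
    """Split an SGR parameter string into parameters, each a list of sub-parameters."""
    return [[int(s) if s else None for s in p.split(":")] for p in body.split(";")]

def fold_legacy_colour(params):
    """On a bare 38 or 48, fold all remaining parameters into its sub-parameters."""
    out = []
    for i, p in enumerate(params):
        if p in ([38], [48]) and i + 1 < len(params):
            folded = list(p)
            for rest in params[i + 1:]:
                folded.extend(rest)
            out.append(folded)
            return out
        out.append(p)
    return out

# Standard form: the colon-separated sub-parameters arrive already grouped.
print(fold_legacy_colour(parse_sgr("1;4;38:5:14")))
# Non-standard form: every parameter after the bare 38 becomes a sub-parameter.
print(fold_legacy_colour(parse_sgr("1;4;38;5;14;48;2;0;224;3;7")))
```

A control sequence that already uses colons passes through unchanged, which is one reason the standard form is the unambiguous one.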

There is a degree of variance amongst kernel virtual terminals that can cause problems if the TERM environment variable (and hence termcap/terminfo terminal type) used by applications connected to the front end of the terminal does not match the emulation being employed by the terminal emulator. As explained in TERM(5), it is an error to set this variable to a value that does not match the terminal emulator's emulation. (This is an error called out in the XTerm FAQ.)

Note

None of the emulator modes in this terminal emulator, and none of the kernel virtual terminals being emulated, are the "xterm" terminal type.

DEC VT

This mode is selected by the --decvt command-line option. It matches (a subset of) a DEC VT in its native VT mode.

The termcap/terminfo records being matched are the "vt520" terminal type family. Variances:

  • A DEC VT sets no tabstops in response to the RIS and DECSTR control sequences. The Linux and FreeBSD kernel terminal emulators both do, however; so for compatibility so too does this terminal emulator. (As explained earlier, do not use RIS.)

  • A DEC VT does not produce distinct application mode control sequences for the asterisk, minus, and slash graphic keys on the calculator keypad. This terminal emulator does, giving them SS3-shifted j, m, and o in application calculator keypad mode.
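The keypad variance above can be illustrated with a small Python sketch. The names are hypothetical, and the sketch assumes that SS3 is sent in its 7-bit form (ESC O) and that j, m, and o correspond to asterisk, minus, and slash respectively.

```python
SS3 = "\x1bO"  # assumption: the 7-bit alias of SS3 is what gets sent

# Final characters for the graphic keys named in the variance above.
APPLICATION_FINALS = {"*": "j", "-": "m", "/": "o"}

def keypad_output(key, application_mode):
    """Return what a calculator keypad graphic key sends in each keypad mode."""
    if application_mode and key in APPLICATION_FINALS:
        return SS3 + APPLICATION_FINALS[key]
    return key  # normal mode: the graphic character itself

print(repr(keypad_output("*", True)), repr(keypad_output("*", False)))
```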

SCO

This mode is selected by the --sco command-line option. With variances as laid out here, it matches the SCO kernel virtual terminal, the FreeBSD KVT with the Teken library in CONS25 mode, the old SYSCONS FreeBSD KVT from version 8 and earlier, and (a subset of) a DEC VT in its SCO Console mode.

For F1 to F12 the level 2 shift and control modifiers are encoded into the function key number, and the SCO KVT's original SCO FNK control sequences (which have no parameters) are emitted. For F13 to F48 the SCO FNK control sequences are extended to have parameters. The function key numbers are used as-is, and the modifiers are instead encoded into the second sub-parameter of the control sequence, in similar form to DECFNK (except zero-based). The first sub-parameter is a (usually 1) repeat count, again similar to DECFNK.

The termcap/terminfo records being matched are the "cons" terminal type family (including its height variants). Variances:

  • The SCO KVT does not implement application/normal modes for the calculator or cursor keypads. This terminal emulator does.

  • The SCO KVT does not make a distinction between the calculator keypad keys (with numeric lock off) and the matching editing/cursor keypad keys. This terminal emulator follows suit in its SCO mode, even though the input protocol actually does.

  • The SCO KVT does not ever produce control sequences for the graphic keys on the calculator keypad (i.e. plus, minus, enter, slash, and asterisk). It always produces the graphic characters. This terminal emulator produces SS3-shifted characters when that keypad is in application mode.

  • The SCO KVT always yields the DEL character for the Del keys on the editing and calculator keypads. In addition to providing application mode for the calculator keypad, this terminal emulator provides in normal keypad mode the switchable Del behaviour of a DEC VT (via DEC Private Mode #1037).

  • The SCO KVT always yields the BS character for the Backspace key on the main keypad. This terminal emulator provides the switchable Backspace behaviour of a DEC VT (via DEC Private Mode #67).

  • The SCO KVT does not produce control sequences when calculator keypad keys (in normal mode) or cursor keypad keys are used with the Alt modifier. This terminal emulator adds modifier parameters to the CSI control sequences.

  • The SCO KVT does not support equals or (ABNT2) comma on the calculator keypad. This terminal emulator does, giving them SS3-shifted X and l in application calculator keypad mode.

  • The SCO KVT does not support actual F13 and upwards function keys. A DEC VT in its SCO Console mode falls back to its native VT mode, using DECFNK, for F13 to F20. This terminal emulator instead falls back to an extended SCO FNK mechanism, as described earlier.
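The switchable Del and Backspace behaviours in the list above are selected through DEC Private Modes, which applications set and reset with control sequences of the form CSI ? Pm h and CSI ? Pm l. A minimal Python sketch for building them follows; the helper name is hypothetical, and the sketch does not say which state selects which character, because this section does not state it.

```python
CSI = "\x1b["  # 7-bit form of CSI

def dec_private_mode(mode, enable):
    """Build the control sequence that sets (h) or resets (l) a DEC Private Mode."""
    return f"{CSI}?{mode}{'h' if enable else 'l'}"

# Mode 67 governs the Backspace key; mode 1037 governs the Del key.
print(repr(dec_private_mode(67, True)))
print(repr(dec_private_mode(1037, False)))
```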

FreeBSD (Teken)

This mode is selected by the --freebsd command-line option, and is the default when the terminal emulator is compiled for FreeBSD. With variances as laid out here, it matches the FreeBSD 9 and later kernel virtual terminal, that uses the Teken library, in its native Teken mode.

The termcap/terminfo records being matched are the "teken" terminal type family (including its colouring variants). This is now the proper terminal type for the FreeBSD kernel virtual terminal. Importantly, it is not the "xterm" or "cons25" terminal type families; and it is an error to think that Teken is XTerm or even a subset of it.

Variances:

  • The FreeBSD KVT does not report modifiers for keys on the editing, cursor, or calculator keypads. This terminal emulator does.

  • The FreeBSD KVT does not implement application/normal modes for the calculator keypad. This terminal emulator does.

  • Except for Del, the FreeBSD KVT does not make a distinction between the calculator keypad keys (with numeric lock off) and the matching editing/cursor keypad keys. This terminal emulator follows suit in its Teken mode, even though the input protocol actually does.

  • The FreeBSD KVT does not ever produce control sequences for the graphic keys on the calculator keypad (i.e. plus, minus, enter, slash, and asterisk). It always produces the graphic characters. This terminal emulator produces SS3-shifted characters when that keypad is in application mode.

  • The FreeBSD KVT always yields the DEL character for the Del key on the calculator keypad, and always yields DECFNK for the Del key on the editing keypad. In addition to providing application mode for the calculator keypad, this terminal emulator provides in normal keypad mode the switchable Del behaviour of a DEC VT (via DEC Private Mode #1037).

  • The FreeBSD KVT always yields the BS character for the Backspace key on the main keypad. This terminal emulator provides the switchable Backspace behaviour of a DEC VT (via DEC Private Mode #67).

  • The FreeBSD KVT does not produce control sequences when calculator keypad keys (in normal mode) or cursor keypad keys are used with the Alt modifier. This terminal emulator adds modifier parameters to the CSI control sequences.

  • The FreeBSD KVT does not support equals or (ABNT2) comma on the calculator keypad. This terminal emulator does, giving them SS3 sequences with final characters X and l in application calculator keypad mode.

NetBSD ("vt100")

This mode is selected by the --netbsd command-line option, and is the default when the terminal emulator is compiled for NetBSD. With variances as laid out here, it matches the NetBSD kernel virtual terminal in its "vt100" mode.

Note

It is a misnomer to name this "vt100", as NetBSD does. A true DEC VT100 series does not match this emulation. It did not do ECMA-48 colour, for example.

The termcap/terminfo records being matched are the "wsvt" terminal type family (including its height variants). Variances:

  • The NetBSD KVT does not report modifiers for keys on the editing, cursor, or calculator keypads. This terminal emulator does.

  • The NetBSD KVT does not implement application/normal modes for the cursor and calculator keypads. This terminal emulator does.

  • The NetBSD KVT does not make a distinction between the calculator keypad keys (with numeric lock off) and the matching editing/cursor keypad keys. This terminal emulator follows suit in its NetBSD mode, even though the input protocol actually does.

  • The NetBSD KVT does not ever produce control sequences for the graphic keys on the calculator keypad (i.e. plus, minus, enter, slash, and asterisk). It always produces the graphic characters. This terminal emulator produces SS3-shifted characters when that keypad is in application mode.

  • The NetBSD KVT always yields the DEL character for the Del key on the calculator keypad, and always yields DECFNK for the Del key on the editing keypad. In addition to providing application mode for the calculator keypad, this terminal emulator provides in normal keypad mode the switchable Del behaviour of a DEC VT (via DEC Private Mode #1037).

  • The NetBSD KVT always yields the BS character for the Backspace key on the main keypad. This terminal emulator provides the switchable Backspace behaviour of a DEC VT (via DEC Private Mode #67).

  • The NetBSD KVT does not produce control sequences when calculator keypad keys (in normal mode) or cursor keypad keys are used with the Alt modifier. This terminal emulator adds modifier parameters to the CSI control sequences.

  • The NetBSD KVT does not support equals or (ABNT2) comma on the calculator keypad. This terminal emulator does, giving them SS3 sequences with final characters X and l in application calculator keypad mode. The latter is defined by DEC VTs.

Linux

This mode is selected by the --linux command-line option, and is the default when the terminal emulator is compiled for Linux.

The termcap/terminfo records being matched are the "linux" terminal type family (including its colouring variants). Variances:

  • The Linux kernel virtual terminal sends the DECFNK sequences for FIND and SELECT for the HOME and END keys, rather than the (different) proper control sequences for those keys.

    This has two major consequences: If the terminal emulator is in Linux emulation mode, HOME and END will not be correctly recognized if the TERM environment variable does not also specify "linux"; and if the terminal emulator is in another mode but the TERM environment variable is set to "linux", FIND and SELECT will be incorrectly recognized and HOME and END will not be recognized at all.

  • The Linux KVT does not report modifiers for keys on the editing, cursor, or calculator keypads. This terminal emulator does.

  • The Linux KVT does not make a distinction between the calculator keypad keys (with numeric lock off) and the matching editing/cursor keypad keys. This terminal emulator follows suit in its Linux mode, even though the input protocol actually does.

  • The Linux KVT does not ever produce control sequences for the graphic keys on the calculator keypad (i.e. plus, minus, enter, slash, and asterisk). It always produces the graphic characters. This terminal emulator produces SS3-shifted characters when that keypad is in application mode.

  • The Linux KVT does not produce control sequences when calculator keypad keys (in normal mode) or cursor keypad keys are used with the Alt modifier. This terminal emulator adds modifier parameters to the CSI control sequences.

  • The Linux KVT does not support equals or (ABNT2) comma on the calculator keypad. This terminal emulator does, giving them SS3 sequences with final characters X and l in application calculator keypad mode. The latter is defined by DEC VTs.
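The HOME/END mismatch described in the first variance above can be made concrete with a short Python sketch. It assumes the usual DECFNK encodings, CSI 1 ~ and CSI 4 ~, for the two sequences in question; the lookup table is illustrative only and is not a dump of any terminfo record.

```python
DECFNK_1 = "\x1b[1~"
DECFNK_4 = "\x1b[4~"

# How an application reads the same two sequences depending on its TERM value.
MEANING = {
    "linux": {DECFNK_1: "HOME", DECFNK_4: "END"},
    "vt520": {DECFNK_1: "FIND", DECFNK_4: "SELECT"},
}

def interpret(term, sequence):
    """Return the key an application would recognize for this TERM value."""
    return MEANING[term].get(sequence, "unrecognized")

# The same input is read as two different keys.
print(interpret("linux", DECFNK_1), interpret("vt520", DECFNK_1))
```

This is why the emulation mode and the TERM environment variable have to agree: the bytes on the wire are identical, and only the terminal type record decides which key they mean.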

See also

pty-run(1)

an I/O pump that pumps data in both directions between the back end of a pseudo-terminal and its own standard I/O

console-control-sequence(1)

a utility for emitting a range of useful control sequences

console-multiplexor(1), console-input-method(1)

mechanisms that layer on top of a terminal emulator

Author

Jonathan de Boyne Pollard