Normally, the compiler requires the program source to be in EBCDIC, although there are compiler options to translate from ASCII or Baudot. Since there is no such thing as a standard EBCDIC, we have designed our own non-standard one. The principle is simple: for each character, we selected a code that was used for that character by at least one IBM terminal. However, to guarantee incompatibility, our set differs in at least one character from any IBM hardware for which we have been able to find documentation.
Here's the character table:
+ | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | a | b | c | d | e | f |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
00 | BSP | TAB | LF | CR | ||||||||||||
10 | ||||||||||||||||
20 | ||||||||||||||||
30 | ||||||||||||||||
40 | SP | ¢ | . | < | ( | + | ! | |||||||||
50 | & | ] | $ | * | ) | ; | ¬ | |||||||||
60 | - | / | xor | | | , | % | _ | > | ? | |||||||
70 | : | # | @ | ' | = | " | ||||||||||
80 | a | b | c | d | e | f | g | h | i | |||||||
90 | j | k | l | m | n | o | p | q | r | { | [ | |||||
a0 | ~ | s | t | u | v | w | x | y | z | ® | ||||||
b0 | ^ | £ | © | |||||||||||||
c0 | A | B | C | D | E | F | G | H | I | |||||||
d0 | J | K | L | M | N | O | P | Q | R | } | ||||||
e0 | S | T | U | V | W | X | Y | Z | ||||||||
f0 | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | DEL |
While the compiler and runtime accept ASCII and EBCDIC for input/output, internally everything is represented in extended Baudot. The "letters" and "figures" sets are identical to standard Baudot, but we have a nonstandard convention: shifting to letters while already in letters causes a shift to lowercase letters, and shifting to figures while already in figures causes a shift to a set containing special characters. Thus, to guarantee uppercase letters, one would first shift to figures and then to letters. If this extended Baudot is sent to a teletype which understands standard Baudot, the result will be a text in ALL CAPS, with some of the symbols it cannot print replaced by others it can.
Here's the character table:
Code | Uppercase | Lowercase | Figures | Symbols |
---|---|---|---|---|
00 | Invalid code | |||
01 | E | e | 3 | ¢ |
02 | Line Feed | |||
03 | A | a | - | + |
04 | Space | |||
05 | S | s | Bell | \ |
06 | I | i | 8 | # |
07 | U | u | 7 | = |
08 | Carriage Return | |||
09 | D | d | $ | * |
10 | R | r | 4 | { |
11 | J | j | ' | ~ |
12 | N | n | , | xor |
13 | F | f | ! | | |
14 | C | c | : | ^ |
15 | K | k | ( | < |
16 | T | t | 5 | [ |
17 | Z | z | " | } |
18 | W | w | ) | > |
19 | L | l | 2 | ] |
20 | H | h | Invalid | backspace |
21 | Y | y | 6 | @ |
22 | P | p | 0 | Invalid |
23 | Q | q | 1 | £ |
24 | O | o | 9 | ¬ |
25 | B | b | ? | delete |
26 | G | g | & | Invalid |
27 | Figures | Symbols | ||
28 | M | m | . | % |
29 | X | x | / | _ |
30 | V | v | ; | Invalid |
31 | Lowercase | Uppercase |
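The shift convention described before the table can be sketched as a small state machine. This is only an illustration, not the CLC-INTERCAL implementation: the names `decode`, `ROWS`, `FIXED` and `COLUMN` are made up for the example, only a handful of rows are copied from the table, the starting state is assumed to be uppercase, and a figures shift while already in symbols is assumed to stay in symbols.

```python
# Shift codes, taken from the table above (codes are decimal).
FIGURES_SHIFT = 27
LETTERS_SHIFT = 31

# Column index for each shift state, in the table's column order.
COLUMN = {"upper": 0, "lower": 1, "figures": 2, "symbols": 3}

# Codes that mean the same thing in every set.
FIXED = {2: "\n", 4: " ", 8: "\r"}

# A few rows copied from the table; a full decoder would list all of them.
ROWS = {
    1: ("E", "e", "3", "¢"),
    3: ("A", "a", "-", "+"),
    5: ("S", "s", "\a", "\\"),
    16: ("T", "t", "5", "["),
    19: ("L", "l", "2", "]"),
}


def decode(codes, state="upper"):
    """Decode extended Baudot codes (assumed initial state: uppercase)."""
    out = []
    for code in codes:
        if code == LETTERS_SHIFT:
            # Letters shift while in letters selects lowercase,
            # otherwise it selects uppercase.
            state = "lower" if state in ("upper", "lower") else "upper"
        elif code == FIGURES_SHIFT:
            # Figures shift while in figures selects the symbols set
            # (assumption: it stays in symbols if already there).
            state = "symbols" if state in ("figures", "symbols") else "figures"
        elif code in FIXED:
            out.append(FIXED[code])
        else:
            out.append(ROWS[code][COLUMN[state]])
    return "".join(out)


# Figures then letters guarantees uppercase; a second letters shift
# switches to lowercase.
print(decode([27, 31, 16, 31, 1, 5, 16]))  # prints "Test"
```

A teletype that only knows standard Baudot would treat both shift codes as idempotent, which is why the same stream would come out as "TEST" there.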
CLC-INTERCAL 1.-94 introduces support for the "Hollerith" character set, for compatibility with punched card devices and similar. A column in a punched card corresponds to 12 bits, so tail registers can store one character per element (with 4 bits wasted); similarly, a Hollerith file requires two bytes per character. The first byte contains punch lines 12, 0, 2, 4, 6, 8; the second byte contains lines 11, 1, 3, 5, 7, 9. The 12-bit number corresponding to one column is therefore the interleave of the two bytes. The two most significant bits in each byte are ignored on input; when producing Hollerith, CLC-INTERCAL will clear bit 7 and set bit 6 to the complement of bit 5. The result will be printable on an ASCII terminal, although it is unlikely to be easy to read.
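The two-byte layout can be illustrated with a short sketch. It is not taken from the CLC-INTERCAL sources: the function names are invented for the example, and it assumes that the punch lines are stored most significant bit first within the low six bits of each byte (so bit 5 of the first byte is line 12 and bit 0 is line 8), with the first byte supplying the more significant bit of each interleaved pair.

```python
def interleave6(first, second):
    """Interleave the low 6 bits of two bytes into a 12-bit column value.

    Bits 6 and 7 of each byte are ignored, as on input.
    """
    column = 0
    for bit in range(5, -1, -1):
        column = (column << 1) | ((first >> bit) & 1)
        column = (column << 1) | ((second >> bit) & 1)
    return column


def split_column(column):
    """Inverse of interleave6: split a 12-bit column into two 6-bit values."""
    first = second = 0
    for bit in range(5, -1, -1):
        first = (first << 1) | ((column >> (2 * bit + 1)) & 1)
        second = (second << 1) | ((column >> (2 * bit)) & 1)
    return first, second


def printable(six_bits):
    """Clear bit 7 and set bit 6 to the complement of bit 5."""
    six_bits &= 0x3F
    bit5 = (six_bits >> 5) & 1
    return six_bits | ((bit5 ^ 1) << 6)


def encode_column(column):
    """Produce the two output bytes for one 12-bit column."""
    first, second = split_column(column)
    return bytes([printable(first), printable(second)])


# Every output byte falls in the range 0x20..0x5F, which is printable ASCII.
data = encode_column(0xAAA)
print(data, hex(interleave6(data[0], data[1])))
```

Because the reader ignores the top two bits, decoding the printable bytes with `interleave6` gives back the original 12-bit column.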
The Hollerith encoding used by CLC-INTERCAL is an extension of one of the many character sets used for punched cards; lowercase letters are added by overpunching the corresponding uppercase character with a single extra hole. Some extra characters useful for INTERCAL programs have also been added.
Overpunches, where two different characters are punched on the same column, are fully supported: when converting from Hollerith to another character set, these may result in sequences of characters.
The following three cards summarise the encoding. The third card shows two examples of overpunch and some control characters which do not exist in real punched cards, but may be useful for storing virtual punched cards in a file (real punched cards would just use a new card for a carriage return, newline sequence).