CLC-INTERCAL Reference

Expressions

Table of contents:

Constants
Variables
Indirect Variables
Unary Operators
Binary Operators
Operand Overloading
Grouping
Examples

Constants

Constants are simply the symbol # (mesh) followed by an integer between 0 and 65535. For example #1, #65535.

To create 32 bit constants, use the interleave operator (see binary operators below). For example, #0¢#256 has value 65536, and #65535¢#65535 has value 4294967295.

Variables

Variables are represented by up to four pieces of information:

The ownership path. This part is optional.
The variable identifier, as described below.
The variable number, an integer between 1 and 65535.
Subscripts. This part must not be included if the variable is not an array. If you are trying to extract a value, this part must be present when the variable is an array register. If you just want to name a register, always leave it out.

The ownership path is a simple way to follow the BELONGS TO relationship between registers. Prefixing a register with big-money ($) will take you to its owner. Prefixing with a digit from 2 to 9 will take you to any minor owner. For more information, see the chapter on Belongs TO. Note that ownership paths cannot be used if the compiler is ick, as the big-money is used for interleave.

The variable identifier defines the variable type. There are six types:

One spot (.) - these can contain 16 bit values.
Two spots (:) - these can contain 32 bit values.
Tail (,) - arrays of 16 bit values.
Hybrid (;) - arrays of 32 bit values.
Whirlpool (@) - these cannot contain any value.
Crawling Horror (_) - these contain compilers.

Subscripts, if present, are introduced by the word "SUB" and an expression. Multidimensional arrays will require many subscripts, for example ,1 SUB #1 SUB #2.

For example, the following are all valid variable names without ownership path:

,12 SUB .1 #2 :3
.1
.0001 (this is the same as .1)
:18 SUB #2
@21
;1
@65535

The following are not valid for some reason:

,12 - requires a subscript (unless ,12 is overloaded)
.0 - variable number cannot be zero
1E-3 - you might think this is the same as .0001 but it isn't
.18 SUB #2 - this cannot have subscripts (unless .18 is overloaded)
@65536 - variable number too big
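
The number rules in the two lists above can be checked mechanically. The following is a minimal sketch in Python, not part of any INTERCAL compiler; parse_register is a hypothetical helper which looks only at the identifier and the number range, and knows nothing about ownership paths, subscripts or overloading.

```python
import re

def parse_register(name):
    """Return (identifier, number) for a bare register name, or None.

    Hypothetical helper: it accepts one of the six identifiers followed
    by a decimal number between 1 and 65535, and nothing else.
    """
    match = re.fullmatch(r"([.:,;@_])([0-9]+)", name)
    if match is None:
        return None               # e.g. 1E-3 is not a register name
    number = int(match.group(2))  # leading zeros are harmless: .0001 is .1
    if not 1 <= number <= 65535:
        return None               # .0 and @65536 end up here
    return match.group(1), number
```

Under these assumptions .0001 and .1 parse to the same register, while .0, @65536 and 1E-3 are all rejected.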

When an ownership path is specified, the variable type need not be the one specified. For this reason, the following can be valid in some cases and invalid in others, depending on whether the result of following the ownership chain is an array or not:

$,12 SUB .1 SUB #2 SUB :3
2.01
$49.99
$:1 SUB $49.99
$$21$$21$$21$$21@21 SUB 1$$21$$21$$21$$21@21 SUB 2$$21$$21$$21$$21@21
1;1
1_1
65535$65535@65535

However, the "crawling horror" registers do not currently enjoy full rights and cannot contain any value (the compilers are stored in them, but there is no direct access to them). They can nevertheless be subject to overloading and the Belongs TO relation. Please note that at present there are only two crawling horrors, _1 and _2.

CLC-INTERCAL 0.05 introduces a new "variable", "*" (splat). It is not possible to assign a value to it, but it contains the code of the last error (hence the name). It is an error to use this variable if the program did not encounter an error. Since all runtime errors are fatal, it is usually an error to read this variable, and it should be avoided. However, a quantum program might have encountered an error and at the same time avoided it, therefore it is meaningful to use "splat" in quantum programs. Also, if used inside an event, it makes perfect sense to refer to the splat.

CLC-INTERCAL 1.-94 contains a number of special registers. These are invisible to programs, but accessible to compilers, and control the internal working of the system. They are documented in the chapter about writing compilers.

Indirect Variables

CLC-INTERCAL 0.05 introduced two new operators which might be useful to figure out what a register is even after following BELONGS TO, or during overloading.

The worm applied to a register returns its number. For example, -.5 is the same as #5 (do not confuse the worm with the bookworm, which might be printed the same on VDUs). As another example, inside overloading (see below), one can obtain the number of the register being overloaded with -$@0.

The intersection-worm (never heard of this type of worm? we haven't either) introduces indirect registers. The syntax is "intersection register worm register", and represents a register with the type of the first, the number of the second, for example +:7-.3 is the same as :3. Things get interesting when the registers start having ownership paths etc.

We were supposed to write some examples here, but frankly, we can't stomach it just now.

Unary Operators

The unary operators are the standard logical operators, AND, OR and XOR (exclusive OR). They should be written as & for AND and V for OR. For XOR, you should use the bookworm symbol, which we cannot represent in this page because it's not in the character set. As an approximation, we use the "yen" (¥) symbol, which is accepted by the compiler when the input alphabet is ASCII. This only works if the font is ISO-8859-1, so the compiler also accepts V-backspace-worm. If the input alphabet is EBCDIC, only the bookworm symbol can be used for XOR. If the compiler is in "C-INTERCAL compatibility mode", the what (?) is accepted instead of yen. (The "what" has a completely different meaning in CLC-INTERCAL mode, and should only be used inside a CREATE statement).

The value of a unary operator is determined by rotating the operand to the right one bit, and applying the corresponding bitwise binary operation to the result and the original operand.

The INTERCAL-72 specification says that the operation is inserted between the one spot, two spot or mesh and the number, so #¥1 means "unary XOR applied to the number 1" (the result is 32769). However, older versions of CLC-INTERCAL also allowed the operator to be used as a prefix to an expression, as in V¥&V#¥1 (it will be obvious by now that the value is 61440). Current versions no longer allow that. You should not have been using it anyway.
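
In case the two values quoted above are not obvious after all, here is a minimal sketch in Python of the rotate-and-combine rule for 16 bit operands in base 2. The function names are ours and purely illustrative; the bits parameter is an assumption added so that the same code could model 32 bit operands.

```python
def rotate_right(value, bits=16):
    """Rotate value right by one bit within a word of the given width."""
    mask = (1 << bits) - 1
    value &= mask
    return ((value >> 1) | ((value & 1) << (bits - 1))) & mask

def unary_and(value, bits=16):
    return value & rotate_right(value, bits)

def unary_or(value, bits=16):
    return value | rotate_right(value, bits)

def unary_xor(value, bits=16):
    return value ^ rotate_right(value, bits)

# The operator nearest the number is applied first.
print(unary_xor(1))                                             # 32769
print(unary_or(unary_xor(unary_and(unary_or(unary_xor(1))))))   # 61440
```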

Note that we have absolutely no idea whether the unary operators "bind" more or less than other things. In case of doubt, assign the result of subexpressions to some register, or use sparks and rabbit ears to control the order of evaluation.

C-INTERCAL introduced new unary operators for use with bases between 3 and 7. These are now available in CLC-INTERCAL 1.-94, but they use a different symbol. These are the unary BUT (a whirlpool (@) in C-INTERCAL, or a what (?) in CLC-INTERCAL) and the unary "add without carry" (a shark fin (^) in C-INTERCAL, or a spike (|) in CLC-INTERCAL). For bases 4 or greater, several types of unary BUT are available (C-INTERCAL: 2@, 3@, etc; CLC-INTERCAL: 2?, 3?, etc). Please consult the documentation which comes with C-INTERCAL for more information about these operators.

CLC-INTERCAL 1.-94.-4 introduced a new unary operator, division. This differs from normal unary operators because it is arithmetic, not bitwise. The operation is as follows: the operand is shifted right arithmetically, then the original value is divided by the result of the shift and truncated to an integer. Note that the most frequent result is the base, since a right shift is equivalent to a division by the base, truncating the result to an integer. For example, in base 5, unary division of #62 is #62 divided by #12, which just happens to be #5. However, the operation can also return other values, for example in base 5 unary division of #12 is #6. And of course any value smaller than the base produces a division by zero splat.
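
The arithmetic can be sketched in a few lines of Python. This assumes non-negative values and models the arithmetic right shift as an integer division by the base; unary_divide is a name invented for this illustration, and the splat is represented by a Python exception.

```python
def unary_divide(value, base):
    """Divide value by itself shifted right one digit in the given base."""
    shifted = value // base          # right shift by one digit
    if shifted == 0:
        # Any value smaller than the base ends up here.
        raise ZeroDivisionError("unary division by zero")
    return value // shifted          # truncate to an integer

print(unary_divide(62, 5))   # 5, because 62 // 12 is 5
print(unary_divide(12, 5))   # 6, because 12 // 2 is 6
```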

A compiler option, bitwise-divide, changes the unary division to behave like a normal unary operation, performing a bitwise rotate of its operand and so on. You can figure out what it does.

The symbol for the unary division is the worm (-), so for example #-62 is the unary division of #62. Note that the worm is also used to construct indirect registers, but that's OK because the compiler does not get confused. The programmer might.

Binary Operators

There are four binary operators: interleave, select, and two forms of operand overloading. Operand overloading is implemented in CLC-INTERCAL 0.05 or newer, and is described in the next section.

Interleave is written ¢ (change), but can also be represented as C-backspace-slat or C-backspace-spike if the input alphabet is ASCII. If the compiler is in "C-INTERCAL compatibility mode", the big money ($) can be used as well. Note that this means you can't use it for ownership paths, but that's OK since C-INTERCAL has no BELONGS TO relation.

Interleave takes two 16 bit numbers and "interleaves" their bits. For example, #3¢#0 is 10. To see why, write the numbers in binary (3 is 0000000000000011 in binary, so interleaving the bits with 0000000000000000 you get 00000000000000000000000000001010, which is 10). It can be used to simply form 32 bit constants by writing all the "even" bits to the left of the ¢ and all the "odd" bits to the right.

Interleave fails if it tries to produce more than 32 bits. Use it only on 16 bit values!
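
For base 2, the bit shuffling can be modelled as follows. This is only a sketch in Python of the behaviour described above, with a function name of our own choosing; the failure on oversized operands is represented by a ValueError rather than a real splat.

```python
def interleave(left, right):
    """Interleave the bits of two 16 bit values into one 32 bit value."""
    if not (0 <= left <= 0xFFFF and 0 <= right <= 0xFFFF):
        raise ValueError("interleave only works on 16 bit values")
    result = 0
    for i in range(16):
        # Bit i of the left operand lands just above bit i of the right one.
        result |= ((left >> i) & 1) << (2 * i + 1)
        result |= ((right >> i) & 1) << (2 * i)
    return result

print(interleave(3, 0))           # 10
print(interleave(65535, 65535))   # 4294967295
```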

If the base is not 2, interleave works the same way, but interleaves digits instead of bits; for example, in base 3, #3¢#0 is #27 (3 is 10 in base 3, and interleaving its digits with 00 gives 1000, which is 27).

Select is written ~ (sqiggle [sic]). It uses the second number to "select" bits in the first number. The bits selected are the ones where the second number has a 1. All the bits of the result are right-aligned, and padded with 0 to form a 16 bit or a 32 bit number depending on the size of the result. Note that if you are planning to apply a unary operator on the result of select you don't know in advance whether the 16 bit or 32 bit operator will be used, because this is data-dependent. As an example, :1~#32768 selects bit 15 of :1 and returns 0 or 1 accordingly. .1~#32770 selects bits 15 and 1 of .1 and can return 0, 1, 2, or 3.
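
A minimal sketch of select in base 2, written in Python with a function name of our own, may make the right-alignment clearer. It returns a plain integer and does not attempt to model the 16 bit versus 32 bit size of the result.

```python
def select(value, mask):
    """Collect the bits of value where mask has a 1, packed to the right."""
    result = 0
    out = 0     # position of the next bit in the result
    bit = 0     # position currently examined in value and mask
    while mask:
        if mask & 1:
            result |= ((value >> bit) & 1) << out
            out += 1
        mask >>= 1
        bit += 1
    return result

print(select(0x8000, 32768))   # 1: the only selected bit is set
print(select(0x8002, 32770))   # 3: both selected bits are set
```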

If the base is not 2, select works similarly. See the documentation coming with C-INTERCAL for a full discussion.

Operand Overloading

There are two operand overloading operators: they are written / (slat) and \ (backslat). These are binary operators, but the first operand of slat must be a register or a register with subscripts. The second operand can be any expression.

Both overloading operators return the value of their left operand. In the case of slat, any previous overloading which applies to the left operand is removed before evaluating it. For example, ".1/#1" returns the value contained in .1

The side effect of the overloading operators is to change the way some registers are used in future. Slat applies to a single value, be it a register or an array element. Whenever that value is used after the overloading, the expression is evaluated instead. For example, after ".1/.2" using .1 will return the value of .2. The register @0 is temporarily created and enslaved to the register being overloaded, so ".1/$@0" is a slightly less efficient way to access the value of .1

Assigning to an overloaded register or array element attempts to invert the relevant operations. For example, if .1 is overloaded to &&.2, then assigning #4 to .1 will leave .1 unchanged but assign #28 to .2 (this is because &&#28 is #4). This does not always work, so you might get a runtime error. However, if the expression includes only interleave, select, overload, and registers, the assignment always succeeds. The unary operators sometimes fail because there are values which they can never return (for example, there is no way to get #10 as the result of a unary AND, so if .1 is overloaded to .&2, assigning #10 to .1 results in an error, while of course assigning #12 would be fine because #&28 is #12). Currently, assigning to a constant works in CLC-INTERCAL 1.-94, as does assigning to splat (which however produces an error because now the "splat" variable has a value!). In addition, assigning to anything which corresponds to a constant causes modification of a constant (for example, assigning to -.2 is equivalent to assigning to #2 or to ?SYMBOL). If you are not confused now, you never will be.
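
The claims about unary AND can be checked by brute force. The Python sketch below simply tries every 16 bit value; it says nothing about how the compiler actually inverts an expression, and the function names are invented for the illustration.

```python
def unary_and(value, bits=16):
    """AND a value with itself rotated right by one bit."""
    mask = (1 << bits) - 1
    value &= mask
    rotated = (value >> 1) | ((value & 1) << (bits - 1))
    return value & rotated

def preimages_of_unary_and(target):
    """All 16 bit values whose unary AND is the target (exhaustive search)."""
    return [x for x in range(1 << 16) if unary_and(x) == target]

print(preimages_of_unary_and(10))        # []: no value gives #10
print(min(preimages_of_unary_and(12)))   # 28
print(unary_and(unary_and(28)))          # 4
```

Note that #12 has more than one preimage; 28 is merely the smallest.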

New in CLC-INTERCAL 1.-94. Constants are no longer constants. An example will make this clear as mud. Suppose .2 is overloaded to the double unary AND of -.2 (that is, of its own number, the constant #2). Assigning #4 to .2 will then effectively mean assigning #28 to #2 (see the discussion in the previous paragraph). Next time your program uses the number 2, it will actually use 28 instead. For example, .2 will now have the same value as .28, and an expression containing #2 will use #28 instead. Moreover, if your program used to have a COME FROM (2) it will now have a COME FROM (28) in the same place. You can use it as a more elegant alternative to computed COME FROM.

The backslat operator is similar to slat, except that it affects a range of registers. The expression on the left of the backslat is taken as the interleaving of two values. The overloading applies to any register with a number which is between the two values (if the first value is greater than the second, no overloading is done). The register $@0 might be useful to retrieve the original register. In the current implementation, this form of overloading only applies to numeric registers (spot and two-spot), not to arrays or classes. For example, "#1¢#5"\"$@0~#1" replaces any registers between .1 and .5, :1 and :5, with their least significant bit.

Overloading loops are eliminated. So if you have .1/.2 and .2/.1, using .1 will return .1 (.1 causes evaluation of .2, which causes evaluation of .1 - the loop is noted and the overloading of .1 is not applied). This means that, in particular, .1/.1 can be used to remove any overloading associated with .1 - however, the resulting code will be slower than the case when no overloading has been specified, and you should instead localise the effects of overloading using statements STASH and RETRIEVE as described in the chapter about statements.
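
The loop elimination can be pictured with a toy model in Python. It is a deliberate simplification: each overload is just the name of another register rather than a full expression, and read_register, values and overloads are names made up for this sketch.

```python
def read_register(name, values, overloads, active=frozenset()):
    """Read a register, following overloads but never the same one twice."""
    if name in overloads and name not in active:
        # Apply the overload, remembering that it is now being evaluated.
        return read_register(overloads[name], values, overloads,
                             active | {name})
    # No overload, or a loop was noticed: use the stored value.
    return values[name]

values = {".1": 7, ".2": 9}
print(read_register(".1", values, {".1": ".2"}))               # 9
print(read_register(".1", values, {".1": ".2", ".2": ".1"}))   # 7
print(read_register(".1", values, {".1": ".1"}))               # 7
```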

Note that programmer overloading is implemented by all INTERCAL compilers known to mankind - it's just that their documentation doesn't mention this.

Also note that you cannot ABSTAIN from overloading, because overloading is not a statement. However, you can prevent overloading by IGNORING a register.

Grouping

The precedence rules for operators are not defined by INTERCAL-72. For CLC-INTERCAL, we have absolutely no idea, and different versions use different precedences. It might help to either save the results of subexpressions in registers or, if all else fails, group subexpressions using the grouping constructs. A group is started with a spark (') or rabbit ears (") and closed with the same symbol it started with. Any expression can go inside a group, including any number of sparks or rabbit ears. However, remember that the compiler has to read it somehow, so don't be too cruel on the poor thing.

For example, the expression '"#3~#2"¢#0'~#2 has value 1, whereas the similar expression "#3~#2"¢#0~#2 could have value 2 - because the compiler may use the whole #0~#2 as right operand for the interleave.
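
The two readings can be reproduced with a small Python model of interleave and select in base 2. The helper functions are illustrative only; the second reading is just one way a compiler might group the expression.

```python
def interleave(left, right):
    """Interleave two 16 bit values; left bits go to the higher positions."""
    result = 0
    for i in range(16):
        result |= ((left >> i) & 1) << (2 * i + 1)
        result |= ((right >> i) & 1) << (2 * i)
    return result

def select(value, mask):
    """Pack the bits of value chosen by mask to the right."""
    result, out, bit = 0, 0, 0
    while mask:
        if mask & 1:
            result |= ((value >> bit) & 1) << out
            out += 1
        mask >>= 1
        bit += 1
    return result

# Fully grouped: the interleave is evaluated first, then the outer select.
print(select(interleave(select(3, 2), 0), 2))     # 1
# Ungrouped, with #0~#2 taken as the right operand of the interleave.
print(interleave(select(3, 2), select(0, 2)))     # 2
```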

Note that different compilers might have different ideas about precedence, so always include enough sparks and rabbit ears to make the expression unambiguous if you intend to write portable programs.

If a spark is followed immediately by a spot, the two can be "overpunched", and they will look like a bang. So, for example, '.1~.2' could be written !1~.2'. A similar effect applies to the rabbit ears, but in this case you use a real overpunch (rabbit ears, backspace, spot) because there isn't a character looking like the result.

Examples

Everything should be clear by now, so you won't need any examples.