Constants are simply the symbol # (mesh) followed by an integer
between 0 and 65535. For example, #1, #65535.
To create 32 bit constants, use the interleave operator
(see binary operators below).
For example, #0¢#256 has value 65536, and
#65535¢#65535 has value 4294967295.
Variables are represented by up to four pieces of information: an optional ownership path, the variable identifier, the variable number, and optional subscripts.
The ownership path is a simple way to follow the BELONGS TO relationship
between registers. Prefixing a register with big-money ($)
will take you to its owner. Prefixing with a digit from 2 to 9 will take
you to any minor owner. For more information, see
the chapter on Belongs TO. Note that ownership
paths cannot be used if the compiler is ick,
as the big-money is used for interleave.
The variable identifier defines the variable type. There are six types:
Spot (.) - these can contain 16 bit values.
Two-spot (:) - these can contain 32 bit values.
Tail (,) - arrays of 16 bit values.
Hybrid (;) - arrays of 32 bit values.
Whirlpool (@) - these cannot contain any value.
Crawling horror (_) - these contain compilers.
Subscripts, if present, are introduced by the word "SUB" and an expression.
Multidimensional arrays will require many subscripts, for example
,1 SUB #1 SUB #2.
For example, the following are all valid variable names without ownership path:
,12 SUB .1 #2 :3
.1
.0001 (this is the same as .1)
:18 SUB #2
@21
;1
@65535
The following are not valid for some reason:
,12 - requires a subscript (unless ,12 is overloaded)
.0 - variable number cannot be zero
1E-3 - you might think this is the same as .0001 but it isn't
.18 SUB #2 - this cannot have subscripts (unless .18 is overloaded)
@65536 - variable number too big
When an ownership path is specified, the variable type need not be the one specified. For this reason, the following can be valid in some cases and invalid in others, depending on whether the result of following the ownership chain is an array or not:
$,12 SUB .1 SUB #2 SUB :3
2.01
$49.99
$:1 SUB $49.99
$$21$$21$$21$$21@21 SUB 1$$21$$21$$21$$21@21 SUB 2$$21$$21$$21$$21@21
1;1
1_1
65535$65535@65535
However, the "crawling horror" registers do not currently enjoy full rights
and cannot contain any value (the compilers are stored in them, but there
is no direct access to them). They can, however, be subject to overloading
and the Belongs TO relation. Please note, however, that at present there
are only two crawling horrors, _1 and _2.
CLC-INTERCAL 0.05 introduces a new "variable", "*" (splat). It is not possible to assign a value to it, but it contains the code of the last error (hence the name). It is an error to use this variable if the program did not encounter an error. Since all runtime errors are fatal, it is usually an error to read this variable, and it should be avoided. However, a quantum program might have encountered an error and at the same time avoided it, therefore it is meaningful to use "splat" in quantum programs. Also, if used inside an event, it makes perfect sense to refer to the splat.
CLC-INTERCAL 1.-94 contains a number of special registers. These are invisible to programs, but accessible to compilers, and control the internal working of the system. They are documented in the chapter about writing compilers.
CLC-INTERCAL 0.05 introduced two new operators which might be useful to figure out what a register is even after following BELONGS TO, or during overloading.
The worm applied to a register returns its number. For example,
-.5 is the same as #5 (do not confuse the worm with the bookworm,
which might be printed the same on VDUs). As another example, inside
overloading (see below), one can obtain the number of the register being
overloaded with -$@0.
The intersection-worm (never heard of this type of worm? we haven't either)
introduces indirect registers. The syntax is "intersection register worm
register", and represents a register with the type of the first, the number
of the second, for example +:7-.3 is the same as :3.
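The two operators above can be modelled in a few lines of Python. This is only a sketch: the Register tuple and the function names worm and intersection_worm are made up for illustration, and ownership paths and overloading are ignored entirely.

```python
from typing import NamedTuple


class Register(NamedTuple):
    """Hypothetical model of a register: a type character and a number."""
    kind: str    # one of '.', ':', ',', ';', '@', '_'
    number: int  # 1 to 65535


def worm(register: Register) -> int:
    """Model of the worm: return the number of the register."""
    return register.number


def intersection_worm(type_source: Register, number_source: Register) -> Register:
    """Model of the intersection-worm: type of the first, number of the second."""
    return Register(type_source.kind, number_source.number)


# -.5 is the same as #5
print(worm(Register('.', 5)))  # 5

# +:7-.3 is the same as :3
print(intersection_worm(Register(':', 7), Register('.', 3)))
```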
Things get interesting when the registers start having ownership paths etc.
We were supposed to write some examples here, but frankly, we can't stomach it just now.
The unary operators are the standard logical operators, AND, OR and XOR
(exclusive OR). They should be written as & for AND and V for OR. For XOR,
you should use the bookworm symbol,
which we cannot represent in this page because it's not in the character
set. As an approximation, we use the "yen" (¥) symbol, which
is accepted by the compiler when the input alphabet is ASCII. This only works
if the font is ISO-8859-1, so the compiler also accepts
V-backspace-worm. If the input alphabet is EBCDIC, only the
bookworm symbol can be used for XOR. If the compiler is
in "C-INTERCAL compatibility mode", the what (?) is accepted instead of yen.
(The "what" has a completely different meaning in CLC-INTERCAL mode, and
should only be used inside a CREATE statement).
The value of a unary operator is determined by rotating the operand to the right one bit, and applying the corresponding bitwise binary operation to the result and the original operand.
The INTERCAL-72 specification says that the operation is inserted between the
one spot, two spot or mesh and the number, so #¥1 means
"unary XOR applied to the number 1" (the result is 32769).
However, older versions of CLC-INTERCAL also allowed the operator to be used
as a prefix to an expression,
as in V¥&V#¥1 (it will be obvious by now that the
value is 61440). Current versions no longer allow that. You should not
have been using it anyway.
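In case the values above are not quite as obvious as claimed, the rotate-and-combine rule can be sketched in Python. The function names are invented for this illustration, and the sketch assumes base 2 and 16 bit operands only.

```python
def rotate_right(value: int, bits: int = 16) -> int:
    """Rotate a `bits`-wide value right by one bit."""
    mask = (1 << bits) - 1
    value &= mask
    return ((value >> 1) | ((value & 1) << (bits - 1))) & mask


def unary_and(value: int, bits: int = 16) -> int:
    """Unary AND: the operand ANDed with itself rotated right one bit."""
    return value & rotate_right(value, bits)


def unary_or(value: int, bits: int = 16) -> int:
    """Unary OR: the operand ORed with itself rotated right one bit."""
    return value | rotate_right(value, bits)


def unary_xor(value: int, bits: int = 16) -> int:
    """Unary XOR: the operand XORed with itself rotated right one bit."""
    return value ^ rotate_right(value, bits)


# #¥1
print(unary_xor(1))  # 32769

# V¥&V#¥1, evaluated from the operand outwards
print(unary_or(unary_xor(unary_and(unary_or(unary_xor(1))))))  # 61440
```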
Note that we have absolutely no idea whether the unary operators "bind" more or less than other things. In case of doubt, assign the result of subexpressions to some register, or use sparks and rabbit ears to control the order of evaluation.
C-INTERCAL introduced new unary operators for use with bases between 3 and
7. These are now available in CLC-INTERCAL 1.-94, but they use a different
symbol. These are the unary BUT (a whirlpool (@) in C-INTERCAL,
or a what (?) in CLC-INTERCAL) and the unary "add without carry" (a shark
fin (^) in C-INTERCAL, or a spike (|) in CLC-INTERCAL). For bases 4 or greater,
several types of unary BUT are available (C-INTERCAL: 2@, 3@, etc;
CLC-INTERCAL: 2?, 3?, etc). Please consult the documentation which comes
with C-INTERCAL for more information about these operators.
CLC-INTERCAL 1.-94.-4 introduced a new unary operator, division. This differs from normal unary operators because it is arithmetic, not bitwise. The operation is as follows: the operand is shifted right arithmetically, then the original value is divided by the result of the shift and truncated to an integer. Note that the most frequent result is the base, since a right shift is equivalent to a division by the base, truncating the result to an integer. For example, in base 5, unary division of #62 is #62 divided by #12, which just happens to be #5. However, the operation can also return other values, for example in base 5 unary division of #12 is #6. And of course any value smaller than the base produces a division by zero splat.
A compiler option, bitwise-divide, changes the unary division to behave like a normal unary operation, performing a bitwise rotate of its operand and so on. You can figure out what it does.
The symbol for the unary division is the worm (-), so
for example #-62 is the unary division of #62. Note that the worm
is also used to construct indirect registers, but that's OK because
the compiler does not get confused. The programmer might.
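The arithmetic form of unary division described above can be sketched as follows. The function name unary_divide is hypothetical, the sketch assumes non-negative operands (so the arithmetic right shift is just an integer division by the base), and Python's ZeroDivisionError stands in for the splat. The bitwise-divide variant is not modelled.

```python
def unary_divide(value: int, base: int = 2) -> int:
    """Divide `value` by itself shifted right one digit in `base`, truncating."""
    shifted = value // base  # right shift by one digit
    if shifted == 0:
        # Any value smaller than the base ends up here.
        raise ZeroDivisionError("unary division by zero")
    return value // shifted


# Base 5: #62 shifted right is #12, and 62 / 12 truncates to 5.
print(unary_divide(62, base=5))  # 5

# Base 5: #12 shifted right is #2, and 12 / 2 is 6.
print(unary_divide(12, base=5))  # 6
```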
There are four binary operators: interleave, select, and two forms of operand overloading. Operand overloading is implemented in CLC-INTERCAL 0.05 or newer, and is described in the next section.
Interleave is written ¢ (change), but can also be
represented as C-backspace-slat or C-backspace-spike
if the input alphabet is ASCII. If the compiler is in "C-INTERCAL compatibility
mode", the big money ($) can be used as well. Note that this means you can't
use it for ownership paths, but that's OK since C-INTERCAL has no BELONGS TO
relation.
Interleave takes two 16 bit numbers and "interleaves" their bits. For
example, #3¢#0 is 10. To see why, write the numbers
in binary (3 is 0000000000000011 in binary, so interleaving the bits
with 0000000000000000 you get 00000000000000000000000000001010, which
is 10). It can be used to simply form 32 bit constants by writing all
the "even" bits to the left of the ¢ and all the
"odd" bits to the right.
Interleave fails if it tries to produce more than 32 bits. Use it only on 16 bit values!
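A minimal Python sketch of binary interleave, for those who prefer not to do it by hand. The function name is invented here, only base 2 is handled, and a ValueError stands in for whatever the compiler does when an operand does not fit in 16 bits.

```python
def interleave(left: int, right: int) -> int:
    """Interleave the bits of two 16 bit values into one 32 bit value.

    Bits of `left` land in the higher position of each pair and bits of
    `right` in the lower one, which reproduces the #3¢#0 example.
    """
    for operand in (left, right):
        if not 0 <= operand <= 0xFFFF:
            raise ValueError("interleave operands must fit in 16 bits")
    result = 0
    for bit in range(16):
        result |= ((right >> bit) & 1) << (2 * bit)
        result |= ((left >> bit) & 1) << (2 * bit + 1)
    return result


print(interleave(3, 0))          # 10
print(interleave(65535, 65535))  # 4294967295
```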
If the base is not 2, interleave works the same way, but interleaves digits instead of bits; for example, in base 3, #3¢#0 is #27 (3 is 10 in base 3, and interleaving it with 00 gives 1000).
Select is written ~ (sqiggle [sic]). It uses the second number
to "select" bits in the first number. The bits selected are the ones where
the second number has a 1. All the bits of the result are right-aligned, and
padded with 0 to form a 16 bit or a 32 bit number depending on the size of
the result. Note that if you are planning to apply a unary operator to the
result of select you don't know in advance whether the 16 bit or 32 bit
operator will be used, because this is data-dependent. As an example,
:1~#32768 selects bit 15 of :1 and returns 0 or 1 accordingly.
.1~#32770 selects bits 15 and 1 of .1 and can return 0, 1, 2, or 3.
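Binary select can be sketched in Python along the same lines. The function name is made up for this illustration, only base 2 is handled, and the sketch returns a plain integer, so it says nothing about whether the result counts as 16 or 32 bits.

```python
def select(value: int, mask: int) -> int:
    """Pick the bits of `value` where `mask` has a 1 and right-align them."""
    result = 0
    out_position = 0
    position = 0
    while mask >> position:
        if (mask >> position) & 1:
            result |= ((value >> position) & 1) << out_position
            out_position += 1
        position += 1
    return result


# A mask with a single bit set can only return 0 or 1.
print(select(32768, 32768))  # 1
print(select(32767, 32768))  # 0

# #32770 has two bits set, so the result is somewhere from 0 to 3.
print(select(32770, 32770))  # 3
```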
If the base is not 2, select works similarly. See the documentation coming with C-INTERCAL for a full discussion.
There are two operand overloading operators: they are written / (slat)
and \ (backslat). These are binary operators, but the
first operand of slat must be a register or a register with subscripts.
The second operand can be any expression.
Both overloading operators return the value of their left operand. In the case of slat, any previous overloading which applies to the left operand is removed before evaluating it. For example, ".1/#1" returns the value contained in .1.
The side effect of the overloading operators is to change the way some registers are used in the future. Slat applies to a single value, be it a register or an array element. Whenever that value is used after the overloading, the expression is evaluated instead. For example, after ".1/.2" using .1 will return the value of .2. The register @0 is temporarily created and enslaved to the register being overloaded, so ".1/$@0" is a slightly less efficient way to access the value of .1.
Assigning to an overloaded register or array element attempts to invert the relevant operations. For example, if .1 is overloaded to &&.2, then assigning #4 to .1 will leave .1 unchanged but assign #28 to .2 (this is because applying unary AND twice to #28 gives #4). This does not always work, so you might get a runtime error. However, if the expression includes only interleave, select, overload, and registers, the assignment always succeeds. The unary operators sometimes fail because there are values which they can never return (for example, there is no way to get #10 as the result of a unary AND, so if .1 is overloaded to .&2, assigning #10 to .1 results in an error, while of course assigning #12 would be fine because #&28 is #12). Currently, assigning to a constant works in CLC-INTERCAL 1.-94, as does assigning to splat (which however produces an error because now the "splat" variable has a value!). In addition, assigning to anything which corresponds to a constant causes modification of a constant (for example, assigning to -.2 is equivalent to assigning to #2 or to ?SYMBOL). If you are not confused now, you never will be.
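The numbers in the paragraph above can be checked with a brute-force search in Python. This is emphatically not how CLC-INTERCAL inverts expressions; it only assumes 16 bit unary AND in base 2 and looks for the smallest operand producing a given result. The function names are invented for the sketch.

```python
from typing import Callable, Optional


def unary_and(value: int, bits: int = 16) -> int:
    """Unary AND: the operand ANDed with itself rotated right one bit."""
    mask = (1 << bits) - 1
    rotated = ((value >> 1) | ((value & 1) << (bits - 1))) & mask
    return value & rotated


def smallest_preimage(operation: Callable[[int], int], target: int) -> Optional[int]:
    """Return the smallest 16 bit operand mapped to `target`, or None."""
    for candidate in range(1 << 16):
        if operation(candidate) == target:
            return candidate
    return None


def double_and(value: int) -> int:
    """Unary AND applied twice, as in the &&.2 example."""
    return unary_and(unary_and(value))


print(smallest_preimage(double_and, 4))  # 28
print(smallest_preimage(unary_and, 12))  # 28
print(smallest_preimage(unary_and, 10))  # None
```

Other operands also map to #4 under a double unary AND (#29, for instance), so the search settling on #28 is a property of picking the smallest candidate, not a statement about the compiler's own choice.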
New in CLC-INTERCAL 1.-94. Constants are no longer constants. An
example will make this clear as mud. Suppose .2 is overloaded to a double
unary AND of -.2; assigning #4 to .2 will then effectively mean assigning #28
to #2 (see the discussion in the previous paragraph). Next time your program
uses the number 2, it will actually use 28 instead. For example, .2 will now
have the same value as .28, and an expression containing #2 will use #28 instead.
Moreover, if your program used to have a COME FROM (2) it
will now have a COME FROM (28) in the same place. You can
use it as a more elegant alternative to computed COME FROM.
The backslat operator is similar to slat, except that it affects a range of registers. The expression on the left of the backslat is taken as the interleaving of two values. The overloading applies to any register with a number which is between the two values (if the first value is greater than the second, no overloading is done). The register $@0 might be useful to retrieve the original register. In the current implementation, this form of overloading only applies to numeric registers (spot and two-spot), not to arrays or classes. For example, "#1¢#5"\"$@0~#1" replaces any registers between .1 and .5, and between :1 and :5, with their least significant bit.
Overloading loops are eliminated. So if you have .1/.2 and .2/.1, using .1 will return .1 (.1 causes evaluation of .2, which causes evaluation of .1 - the loop is noted and the overloading of .1 is not applied). This means that, in particular, .1/.1 can be used to remove any overloading associated with .1 - however, the resulting code will be slower than the case when no overloading has been specified, and you should instead localise the effects of overloading using statements STASH and RETRIEVE as described in the chapter about statements.
Note that programmer overloading is implemented by all INTERCAL compilers known to mankind - it's just that their documentation doesn't mention this.
Also note that you cannot ABSTAIN from overloading, because overloading is not a statement. However, you can prevent overloading by IGNORING a register.
The precedence rules for operators are not defined by INTERCAL-72. For
CLC-INTERCAL, we have absolutely no idea, and different versions use
different precedences. It might help to either save the results of
subexpressions in registers or, if all else fails, group subexpressions
using the grouping constructs. A group is started with a spark (')
or rabbit ears (") and closed with the
same symbol it started with. Any expression can go inside a group,
including any number of sparks or rabbit ears. However, remember that
the compiler has to read it somehow, so don't be too cruel on the poor
thing.
For example, the expression '"#3~#2"¢#0'~#2 has value
1, whereas the similar expression "#3~#2"¢#0~#2 could
have value 2 - because the compiler may use the whole #0~#2 as
right operand for the interleave.
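The two readings can be compared with a short Python sketch. The helper names interleave and select are invented here and assume base 2 with 16 bit operands; the nesting of the Python calls plays the role of the sparks and rabbit ears.

```python
def interleave(left: int, right: int) -> int:
    """Interleave two 16 bit values; bits of `left` take the higher positions."""
    result = 0
    for bit in range(16):
        result |= ((right >> bit) & 1) << (2 * bit)
        result |= ((left >> bit) & 1) << (2 * bit + 1)
    return result


def select(value: int, mask: int) -> int:
    """Pick the bits of `value` where `mask` has a 1 and right-align them."""
    result = 0
    out_position = 0
    for position in range(32):
        if (mask >> position) & 1:
            result |= ((value >> position) & 1) << out_position
            out_position += 1
    return result


# '"#3~#2"¢#0'~#2: the interleave is grouped, the outer select comes last.
grouped = select(interleave(select(3, 2), 0), 2)
print(grouped)  # 1

# "#3~#2"¢#0~#2 read with #0~#2 as the right operand of the interleave.
ungrouped = interleave(select(3, 2), select(0, 2))
print(ungrouped)  # 2
```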
Note that different compilers might have different ideas about precedence, so always include enough sparks and rabbit ears to make the expression unambiguous if you intend to write portable programs.
If a spark is followed immediately by a spot, the two can be "overpunched",
and they will look like a bang. So, for example, '.1~.2' could
be written !1~.2'. A similar effect applies to the rabbit ears,
but in this case you use a real overpunch (rabbit ears, backspace, spot)
because there isn't a character looking like the result.
Everything should be clear by now, so you won't need any examples.