summaryrefslogtreecommitdiff
path: root/doc/tutorial.txt
diff options
context:
space:
mode:
Diffstat (limited to 'doc/tutorial.txt')
-rw-r--r--doc/tutorial.txt473
1 files changed, 473 insertions, 0 deletions
diff --git a/doc/tutorial.txt b/doc/tutorial.txt
new file mode 100644
index 0000000..e9198af
--- /dev/null
+++ b/doc/tutorial.txt
@@ -0,0 +1,473 @@
+Newsgroups: rec.games.corewar
+From: DURHAM@ricevm1.rice.edu (Mark A. Durham)
+Subject: Intro to Redcode Part I
+Organization: Rice University, Houston, TX
+Date: Thu, 14 Nov 1991 09:41:37 GMT
+
+Introduction to Redcode
+-----------------------
+
+ I. Preface - Reader Beware! { Part I }
+
+ II. Notation { Part I }
+
+III. MARS Peculiarities { Part I }
+
+ IV. Address Modes { Part II }
+
+ V. Instruction Set { Part II }
+
+----------------------------------------------------------------------
+
+I. Preface - Reader Beware!
+
+ The name "Core War" arguably can be claimed as public domain.
+Thus, any program can pass itself off as an implementation of Core
+War. Ideally, one would like to write a Redcode program on one system
+and know that it will run in exactly the same manner on every other
+system. Alas, this is not the case.
+ Basically, Core War systems fall under one of four catagories:
+Non-ICWS, ICWS'86, ICWS'88, or Extended. Non-ICWS systems are usually
+a variant of Core War as described by A. K. Dewdney in his "Computer
+Recreations" articles appearing in Scientific American. ICWS'86 and
+ICWS'88 systems conform to the standards set out by the International
+Core War Society in their standards of 1986 and 1988, respectively.
+Extended systems generally support ICWS'86, ICWS'88, and proprietary
+extensions to those standards. I will discuss frequently common
+extensions as if they were available on all Extended systems (which
+they most certainly are not).
+ I will not describe Non-ICWS systems in this article. Most Non-
+ICWS systems will be easily understood if you understand the systems
+described in this article however. Although called "standards",
+ICWS'86 and ICWS'88 (to a lesser extent) both suffer from ambiguities
+and extra-standard issues which I will try to address.
+ This is where the reader should beware. Because almost any
+interpretation of the standard(s) is as valid as any other, I
+naturally prefer MY interpretation. I will try to point out other
+common interpretations when ambiguities arise though, and I will
+clearly indicate what is interpretation (mine or otherwise) as such.
+You have been warned!
+
+----------------------------------------------------------------------
+
+II. Notation
+
+ "86:" will indicate an ICWS'86 feature. "88:" will indicate an
+ICWS'88 feature. "X:" will indicate an Extended feature. "Durham:"
+will indicate my biased interpretation. "Other:" will indicate
+interpretations adhered to by others. "Commentary:" is me explaining
+what I am doing and why. "Editorial:" is me railing for or against
+certain usages. Items without colon-suffixed prefaces can be
+considered universal.
+
+ Redcode consists of assembly language instructions of the form
+
+<label> <opcode> <A-mode><A-field>, <B-mode><B-field> <comment>
+
+An example Recode program:
+
+; Imp
+; by A. K. Dewdney
+;
+imp MOV imp, imp+1 ; This program copies itself ahead one
+ END ; instruction and moves through memory.
+
+The <label> is optional.
+86: <label> begins in the first column, is one to eight characters
+ long, beginning with an alphabetic character and consisting
+ entirely of alphanumerals. Case is ignored ("abc" is equivalent
+ to "ABC").
+88: <label> as above, except length is not limited and case is not
+ addressed. Only the first eight characters are considered
+ significant.
+X: <label> can be preceded by any amount of whitespace (spaces, tabs,
+ and newlines), consists of any number of significant alphanumerals
+ but must start with an alphabetic, and case is significant ("abc"
+ is different from "ABC").
+Commentary: I will always use lowercase letters for labels to
+ distinguish labels from opcodes and family operands.
+
+The <opcode> is separated from the <label> (if there is one) by
+ whitespace. Opcodes may be entered in either uppercase or
+ lowercase. The case does not alter the instruction. DAT, MOV,
+ ADD, SUB, JMP, JMZ, JMN, DJN, CMP, SPL, and END are acceptable
+ opcodes.
+86: SPACE is also recognized as an opcode.
+88: SLT and EQU are recognized as opcodes. SPACE is not.
+X: All of the above are recognized as opcodes as well as XCH and PCT,
+ plus countless other extensions.
+Commentary: END, SPACE, and EQU are known as pseudo-ops because they
+ really indicate instructions to the assembler and do not produce
+ executable code. I will always capitalize opcodes and pseudo-ops
+ to distinguish them from labels and text.
+
+The <A-mode> and <A-field> taken together are referred to as the
+ A-operand. Similarly, the <B-mode><B-field> combination is known
+ as the B-operand. The A-operand is optional for some opcodes.
+ The B-operand is optional for some opcodes. Only END can go
+ without at least one operand.
+86: Operands are separated by a comma.
+88: Operands are separated by whitespace.
+X: Operands are separated by whitespace and/or a comma. Lack of a
+ comma can lead to unexpected behaviour for ambiguous constructs.
+Commentary: The '88 standard forces you to write an operand without
+ whitespace, reserving whitespace to separate the operands. I like
+ whitespace in my expressions, therefore I prefer to separate my
+ operands with a comma and will do so here for clarity.
+
+<mode> is # (Immediate Addressing), @ (Indirect Addressing), or <
+ (86: Auto-Decrement Indirect, 88: Pre-Decrement Indirect). A
+ missing mode indicates Direct Addressing.
+86: $ is an acceptable mode, also indicating Direct Addressing.
+88: $ is not an acceptable mode.
+X: $ is an acceptable mode as in 86:.
+Commentary: The distinction between Auto-Decrement Indirect Addressing
+ and Pre-Decrement Indirect Addressing is semantic, not syntactic.
+
+<field> is any combination of labels and integers separated by the
+ arithmetic operators + (addition) and - (subtraction).
+86: Parentheses are explicitly forbidden. "*" is defined as a special
+ label symbol meaning the current statement.
+88: Arithmetic operators * (multiplication) and / (integer division)
+ are added. "*" is NOT allowed as a special label as in 86:.
+X: Parentheses and whitespace are permitted in expressions.
+Commentary: The use of "*" as meaning the current statement may be
+ useful in some real assemblers, but is completely superfluous in a
+ Redcode assembler. The current statement can always be referred
+ to as 0 in Redcode.
+
+<comment> begins with a ; (semicolon), ends with a newline, and can
+ have any number of intervening characters. A comment may appear
+ on a line by itself with no instruction preceding it.
+88: Blank lines are explicitly allowed.
+
+ I will often use "A" to mean any A-operand and "B" to mean any
+B-operand (capitalization is important). I use "a" to mean any A-
+field and "b" to mean any B-field. For this reason, I never use "a"
+or "b" as an actual label.
+ I enclose sets of operands or instructions in curly braces. Thus
+"A" is equivalent to "{ a, #a, @a, <a }". I use "???" to mean any
+opcode and "x" or "label" as an arbitrary label. Thus, the complete
+family of acceptable Redcode statements can be represented as
+
+x ??? A, B ; This represents all possible Redcode statements.
+
+"???" is rarely used as most often we wish to discuss the behaviour of
+a specific opcode. I will often use labels such as "x-1" (despite its
+illegality) for the instruction before the instruction labelled "x",
+for the logically obvious reason. "M" always stands for the integer
+with the same value as the MARS memory size.
+
+----------------------------------------------------------------------
+
+III. MARS Peculiarities
+
+ There are two things about MARS which make Redcode different from
+any other assembly language. The first of these is that there are no
+absolute addresses in MARS. The second is that memory is circular.
+ Because there are no absolute addresses, all Redcode is written
+using relative addressing. In relative addressing, all addresses are
+interpreted as offsets from the currently executing instruction.
+Address 0 is the currently executing instruction. Address -1 was the
+previously executed instruction (assuming no jumps or branches).
+Address +1 is the next instruction to execute (again assuming no jumps
+or branches).
+ Because memory is circular, each instruction has an infinite number
+of addresses. Assuming a memory size of M, the current instruction
+has the addresses { ..., -2M, -M, 0, M, 2M, ... }. The previous
+instruction is { ..., -1-2M, -1-M, -1, M-1, 2M-1, ... }. The next
+instruction is { ..., 1-2M, 1-M, 1, M+1, 2M+1, ... }.
+
+Commentary: MARS systems have historically been made to operate on
+ object code which takes advantage of this circularity by insisting
+ that fields be normalized to positive integers between 0 and M-1,
+ inclusive. Since memory size is often not known at the time of
+ assembly, a loader in the MARS system (which does know the memory
+ size) takes care of field normalization in addition to its normal
+ operations of code placement and task pointer initialization.
+
+Commentary: Redcode programmers often want to know what the memory
+ size of the MARS is ahead of time. This is not always possible.
+ Since normalized fields can only represent integers between 0 and
+ M-1 inclusive, we can not represent M in a normalized field. The
+ next best thing? M-1. But how can we write M-1 when we do not
+ know the memory size? Recall from above that -1 is equivalent to
+ M-1. Final word of caution: -1/2 is assembled as 0 (not as M/2)
+ since the expression is evaluated within the assembler as -0.5 and
+ then truncated.
+
+86: Only two assembled-Redcode programs (warriors) are loaded into
+ MARS memory (core).
+88: Core is initialized to (filled with) DAT 0, 0 before loading any
+ warriors. Any number of warriors may be loaded into core.
+
+Commentary: Tournaments almost always pit warrior versus warrior with
+ only two warriors in core.
+
+ MARS is a multi-tasking system. Warriors start as just one task,
+but can "split" off additional tasks. When all of a warriors tasks
+have been killed, the warrior is declared dead. When there is a sole
+warrior still executing in core, that warrior is declared the winner.
+86: Tasks are limited to a maximum of 64 for each warrior.
+88: The task limit is not set by the standard.
+
+----------------------------------------------------------------------
+IV. Address Modes
+
+ Addressing modes subtly (sometimes not-so-subtly) alter the
+behaviour of instructions. A somewhat brief description of their
+general properties is given here. Specifics will be left to the
+instruction set section.
+ An octothorpe (#) is used to indicate an operand with an Immediate
+Address Mode. Immediate mode data is contained in the current
+instruction's field. If the A-mode is immediate, the data is in the
+A-field. If the B-mode is immediate, the data is in the B-field.
+ If no mode indicator is present (86: or the US dollar sign '$' is
+present), Direct Address Mode is used. Direct addresses refer to
+instructions relative to the current instruction. Address 0 refers to
+the current instruction. Direct address -1 refers to the (physically)
+previous instruction. Direct address +1 refers to the (physically)
+next instruction.
+ The commercial-at (@) is used to indicate Indirect Address Mode.
+In indirect addressing, the indirect address points to an instruction
+as in direct addressing, except the target is not the instruction to
+which the indirect address points but rather the instruction pointed
+to by the B-field of the instruct pointed to by the indirect address.
+Example:
+
+x-2 DAT #0, #0 ; Target instruction.
+x-1 DAT #0, #-1 ; Pointer instruction.
+x MOV 0, @-1 ; Copies this instruction to location x-2.
+
+ The less-than (<) is used to indicate (86: Auto-, 88: Pre-)
+Decrement Indirect Address Mode. Its behaviour is just like that of
+Indirect Address Mode, except the pointer is decremented before use.
+Example:
+
+x-2 DAT #0, #0 ; Target instruction
+x-1 DAT #0, #0 ; Pointer instruction. Compare to @ example.
+x MOV 0, <-1 ; Copies this instruction to location x-2.
+
+Commentary: Although Decrement Indirect addressing appears to be a
+ simple extension of Indirect addressing, it is really very tricky
+ at times - especially when combined with DJN. There are sematic
+ differences between the '86 and '88 standards, thus the change in
+ name from Auto-Decrement to Pre-Decrement. These differences are
+ discussed below. This discussion is non-essential for the average
+ Redcode programmer. I suggesting skipping to the next section for
+ the weak-stomached.
+
+86: Durham: Instructions are fetched from memory into an instruction
+ register. Each operand is evaluated, storing a location (into an
+ address register) and an instruction (into a value register) for
+ each operand. After the operands have been evaluated, the
+ instruction is executed.
+ Operand Evaluation: If the mode is immediate, the address register
+ is loaded with 0 (the current instruction's address) and the value
+ register is loaded with the current instruction. If the mode is
+ direct, the address register is loaded with the field value and
+ the value register is loaded with the instruction pointed to by
+ the address register. If the mode is indirect, the address
+ register is loaded with the sum of the field value and the B-field
+ value of the instruction pointed to by the field value and the
+ value register is loaded with the instruction pointed to by the
+ address register. If the mode is auto-decrement, the address
+ register is loaded with a value one less than the sum of the field
+ value and the B-field value of the instruction pointed to by the
+ field value and the value register is loaded with the instruction
+ pointed to by the address register. AFTER the operands have been
+ evaluated (but before instruction execution), if either mode was
+ auto-decrement, the appropriate memory location is decremented.
+ If both modes were auto-decrement and both fields pointed to the
+ same pointer, that memory location is decremented twice. Note
+ that this instruction in memory then points to a different
+ instruction than either operand and also differs from any copies
+ of it in registers.
+86: Other: As above, except there are no registers. Everything is
+ done in memory.
+Commentary: ICWS'86 clearly states the use of an instruction register,
+ but the other operand address and value registers are only
+ implied. Ambiguities and lack of strong statements delineating
+ what takes place in memory and what takes place in registers
+ condemned ICWS'86 to eternal confusion and gave birth to ICWS'88.
+88: As above except everything is done in memory and Pre-Decrement
+ Indirect replaces Auto-Decrement Indirect. Pre-Decrement Indirect
+ decrements memory as it is evaluating the operands rather than
+ after. It evaluates operand A before evaluating operand B.
+
+----------------------------------------------------------------------
+
+V. Instruction Set
+
+DAT A, B
+ The DAT (data) instruction serves two purposes. First, it allows
+you to store data for use as pointers, offsets, etc. Second, any task
+which executes a DAT instruction is removed from the task queue. When
+all of warrior's tasks have been removed from the queue, that warrior
+has lost.
+86: DAT allows only one operand - the B-operand. The A-field is left
+ undefined (the example shows #0), but comparisons of DAT
+ instructions with identical B-operands must yield equality.
+88: DAT allows two operands but only two modes - immediate and
+ pre-decrement.
+X: DAT takes one or two operands and accepts all modes. If only one
+ operand is present, that operand is considered to be the B-operand
+ and the A-operand defaults to #0.
+Commentary: It is important to note that any decrement(s) WILL occur
+ before the task is removed from the queue since the instruction
+ executes only after the operand evaluation.
+
+MOV A, B
+ The MOV (move) instruction either copies a field value (if either
+mode is immediate) or an entire instruction (if neither mode is
+immediate) to another location in core (from A to B).
+86: Durham: MOV #a, #b changes itself to MOV #a, #a.
+Commentary: There is a clear typographical error in ICWS'86 which
+ changes the interpretation of MOV #a, B to something non-sensical.
+ For those with a copy of ICWS'86, delete the term "B-field" from
+ the next-to-last line of the second column on page 4.
+88: No immediate B-modes are allowed.
+X: Immediate B-modes are allowed and have the same effect as a
+ B-operand of 0. (See 86: Durham: above).
+
+ADD A, B
+86: The ADD instruction adds the value at the A-location to the value
+ at the B-location, replacing the B-location's old contents.
+88: If the A-mode is immediate, ADD is interpreted as above. If the
+ A-mode is not immediate, both the A-field and the B-field of the
+ instruction pointed to by the A-operand are added to the A-field
+ and B-field of the instruction pointed to by the B-operand,
+ respectively. The B-mode can not be immediate.
+X: Immediate B-modes are allowed and have the same effect as in 86:.
+ Example: ADD #2, #3 becomes ADD #2, #5 when executed once.
+
+SUB A, B
+ The SUB (subtract) instruction is interpreted as above for all
+three cases, except A is subtracted from B.
+
+JMP A, B
+ The JMP (jump) instruction changes the instruction pointer to point
+to the instruction pointed to by the A-operand.
+86: JMP allows only one operand - the A-operand. The B-operand is
+ shown as #0.
+88: JMP allows both operands, but the A-mode can not be immediate.
+X: JMP allows both operands and the A-mode can be immediate. An
+ immediate A-mode operand is treated just like JMP 0, B when
+ executed.
+
+JMZ A, B
+ The JMZ (jump if zero) instruction jumps to the instruction pointed
+to by the A-operand only if the B-field of the instruction pointed to
+by the B-operand is zero.
+88: Immediate A-modes are not allowed.
+
+JMN A, B
+ The JMN (jump if non-zero) instruction jumps to the instruction
+pointed to by the A-operand only if the B-field of the instruction
+pointed to by the B-operand is non-zero.
+88: Immediate A-modes are not allowed.
+
+DJN A, B
+ The DJN (decrement and jump if non-zero) instruction causes the
+B-field of the instruction pointed to by the B-operand to be
+decremented. If the decremented values is non-zero, a jump to the
+instruction pointed to by the A-operand is taken.
+88: Immediate A-modes are not allowed.
+
+CMP A, B
+ The CMP (compare, skip if equal) instruction compares two fields
+(if either mode is immediate) or two entire instructions (if neither
+mode is immediate) and skips the next instruction if the two are
+equivalent.
+Commentary: There is a clear typographical error in ICWS'86 which
+ changes the interpretation of CMP #a, B to something non-sensical.
+ For those with a copy of ICWS'86, delete the term "B-field" from
+ the fifth line from the bottom of the second column on page 5.
+ Also, the comments to the example on page 6 have been switched
+ (equal is not equal and vice versa). The labels are correct
+ though.
+88: Immediate B-modes are not allowed.
+
+SPL A, B
+ The SPL (split) instruction splits the execution between this
+warrior's currently running tasks and a new task. Example: A battle
+between two warriors, 1 and 2, where warrior 1 has two tasks (1 and
+1') and warrior 2 has only one task would look like this: 1, 2, 1', 2,
+1, 2, 1', 2, etc.
+86: SPL allows only one operand - the B-operand. The A-operand is
+ shown as #0. After executing the SPL, the next instruction to
+ execute for this warrior is that of the newly added task (the new
+ task is placed at the front of the task queue). A maximum of 64
+ tasks is allowed for each warrior.
+88: SPL splits the A-operand, not the B-operand. After executing the
+ SPL, the next instruction to execute for this warrior is the same
+ instruction which would have executed had another task not been
+ added (the new task is placed at the back of the task queue).
+ There is no explicit task limit on warriors. Immediate A-operands
+ are not allowed.
+X: Immediate A-operands are allowed and behave as SPL 0, B when
+ executed.
+
+88: SLT A, B: The SLT (skip if less than) instruction skips the next
+ instruction if A is less than B. No Immediate B-modes are
+ allowed.
+X: Immediate B-modes are allowed.
+
+X: XCH A, B: The XCH (exchange) instructions exchanges the A-field and
+ the B-field of the instruction pointed to by the A-operand.
+
+X: PCT A, B: The PCT (protect) instruction protects the instruction
+ pointed to by the A-operand until the protection is removed by an
+ instruction attempting to copy over the protected instruction.
+
+Pseudo-Ops: Instructions to the Assembler
+-----------------------------------------
+
+END
+ The END pseudo-op indicates the end of the Redcode source program.
+86: END takes no operands.
+88: If END is followed by a label, the first instruction to be
+ executed is that with the label following END.
+X: ORG A (origin) takes over this initial instruction function from
+ END.
+Commentary: If no initial instruction is identified, the first
+ instruction of your program will be the initial instruction. You
+ can accomplish the same effect as "END start" or "ORG start" by
+ merely starting your program with the instruction "JMP start".
+
+86: SPACE A, B: The SPACE pseudo-op helps pretty-up Redcode source
+ listings. SPACE A, B means to skip A lines, then keep B lines on
+ the next page. Some assemblers do not support SPACE, but will
+ treat it as a comment.
+
+88: label EQU A: The EQU (equate) pseudo-op gives the programmer a
+ macro-like facility by replacing every subsequent occurrence of
+ the label "label" with the string "A".
+Commentary: A normal label is a relative thing. Example:
+
+x DAT #0, #x ; Here x is used in the B-field
+x+1 DAT #0, #x ; Each instruction's B-field gives
+x+2 DAT #0, #x ; the offset to x.
+
+is the same as
+
+x DAT #0, #0 ; Offset of zero
+x+1 DAT #0, #-1 ; one
+x+2 DAT #0, #-2 ; two
+
+but
+
+x! EQU 0 ; Equate label like #define x! 0
+ DAT #0, #x! ; Exclamation points can be used
+ DAT #0, #x! ; in labels (in Extended systems)
+ DAT #0, #x! ; I use them exclusively to indicate
+ ; immediate equate labels.
+
+is the same as
+
+ DAT #0, #0 ; A direct text replacement
+ DAT #0, #0 ; appears the same on every
+ DAT #0, #0 ; line it is used.
+
+----------------------------------------------------------------------
+