icasm : Retargetable IcpuED ASSEMBLER

Version 5.2 - 6/12/2000  - A. K. Uht

1. Introduction

This page describes the syntax and semantics of the icasm assembler, which generates the equivalent of machine code for the ICED Canned CPU. The assembler creates an ASCII hex code file suitable for downloading to the main memory of the ICED protosys hardware. It also generates a listing file consisting of the source lines prefixed with instruction addresses and opcodes (or data); Section 9 shows an example listing file.

icasm allows the use of symbolic names for instruction labels or constants. Address computations to determine relative or absolute addresses for branches are automatically performed by icasm. Regular decimal and hexadecimal numbers may also be used, of course.

A symbol table file may also optionally be created. This file is expressly for use with an ICED HP-1662CS logic analyzer. The analyzer uses it to recognize addresses and display the source labels with their corresponding instructions. The analyzer also uses it with the HP Software Analyzer to map (static) program source lines to (dynamic) executed instructions.

Retargeting the Assembler

icasm is retargetable, i.e., it can be modified to assemble any set of mnemonics or assembly language. Of course, the closer the new language is to the old one, the easier the job. In many cases an instruction can be added with only two new lines of code and the modification of a third. Retargeting instructions are in the source files themselves.

The icasm source code is available separately. (It's in C.)
 

2. File name extensions

The following extensions are recommended, but any valid filenames can be used. (NOTE: the logic analyzer expects the symbol table file to have the .gpa extension.)
 
name.src - source file, input to icasm
name.hex - assembler output: the IcpuED hex file, which is the input to the host monitor program; the monitor loads it into ICED main memory. icasm deletes this file if it detects any errors in name.src.
name.lst - listing file, output of icasm
name.gpa - symbol table file, output of icasm

 

3. icasm command line

icasm <source file> <output file> {-l <listing file>} {-s <symbol table file>}
         (.src)        (.hex)            (.lst)            (.gpa)


Suitable .cshrc file alias:

alias ic 'icasm \!*.src \!*.hex -l \!*.lst'


 Then the Unix command would be:

ic name

4. Source file format

 Each line not starting with a "*" has the format:
 

       field 1   field 2   field 3   field 4
     -----------------------------------------
     | label   | opcode  | operands| comment |
     |   or    |   or    |         |         |
     | blank   |directive|         |(ignored)|
     -----------------------------------------

  1. A "*" in column 1 indicates the entire line is a comment.
  2. A delimiter separates each field.
  3. Delimiters between fields are one or more whitespace characters (blanks and tabs).
  4. If there is no label, then there must be at least one delimiter, i.e., there must be at least one space character before an opcode or directive.
  5. Operands are separated by commas; whitespace between operands is skipped.
  6. Lines with a label but without an opcode/directive ARE allowed. The address of the next line with an instruction is used for the label.
    • Note: this feature is not currently fully tested.
  7. Labels are made up of letters, digits, and underscores ('_'). The first character of a label must be a letter.
  8. All opcodes and directives may appear in any order.
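
The field rules above can be sketched in a few lines of Python. This is only an illustration, not how icasm itself (which is in C) does it; the function name split_source_line is made up here. It assumes that the operand field continues across whitespace only while the previous token ends with a comma. It also cannot tell that an opcode such as halt takes no operands, so for those lines the first comment word lands in the operand list; a real assembler would consult its opcode table.

```python
def split_source_line(line):
    """Split one icasm source line into (label, opcode, operands, comment).

    Returns None for a full-line comment ("*" in column 1) or a blank line.
    Hypothetical helper for illustration only.
    """
    if line.startswith("*") or not line.strip():
        return None
    # Rule 4: a line that starts with whitespace has no label.
    has_label = not line[0].isspace()
    tokens = line.split()
    label = tokens.pop(0) if has_label else ""
    opcode = tokens.pop(0) if tokens else ""
    # Assumption: the operand field keeps going across whitespace only
    # while the previous token ends with a comma, e.g. "r1,r2, r3".
    operand_tokens = []
    while tokens:
        operand_tokens.append(tokens.pop(0))
        if not operand_tokens[-1].endswith(","):
            break
    operands = "".join(operand_tokens).split(",") if operand_tokens else []
    comment = " ".join(tokens)  # field 4: everything left over
    return label, opcode, operands, comment


print(split_source_line("lp      add r3, r4, r2  add loop count"))
# ('lp', 'add', ['r3', 'r4', 'r2'], 'add loop count')
```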

5. Pseudo-ops, i.e., assembler directives

pc #X - Set PC (program counter) to X, initialize the PC internal to the assembler for code location.
There may be multiple pc directives in a program, and they need not have strictly ascending values of X, e.g., pc #2000 may be followed by pc #1000.
label equ #X - Assign the value X to "label".
The label may be used in place of numeric constants anywhere (I think) in the program, including memory offsets.
Labels may be used before being declared.
ds  #X - Allocate X words of storage at the current PC value; this increases the PC by 4*X.
This directive takes one line in the name.lst file, but X lines of '0' entries in the name.hex file.
dw  #X - Allocates one word of storage at the current value of the PC and initializes the word to X. The PC is incremented by 4.
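
As a rough sketch of how these directives move the assembler's internal PC, the following Python fragment tracks the location counter statement by statement. The function name advance_pc is hypothetical. It assumes that every instruction occupies one 4-byte word, which is what the addresses in the Section 9 listing suggest, and that equ emits nothing.

```python
WORD_BYTES = 4  # ds and dw advance the PC by 4 per word


def advance_pc(pc, opcode, value=0):
    """Return the location counter after one statement.

    `value` is the numeric operand X of a pc or ds directive.
    Hypothetical helper for illustration only.
    """
    if opcode == "pc":
        return value                    # pc #X: move to X, in any order
    if opcode == "ds":
        return pc + WORD_BYTES * value  # ds #X: reserve X words
    if opcode == "equ":
        return pc                       # assumed: equ emits no code or data
    # dw, and (assumed from the listing) every instruction, take one word.
    return pc + WORD_BYTES


pc = advance_pc(0, "pc", 0x2000)   # pc #%2000
pc = advance_pc(pc, "ds", 3)       # ds #3  -> 0x200c
pc = advance_pc(pc, "dw")          # dw #X  -> 0x2010
print(hex(pc))                     # 0x2010
```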

6. Assembler misc. and examples

Special characters:

"r"     - MUST be used as first character of register operands. Valid register operands are: "r0" through "r15"
"#"     - Indicates that the following string should be treated as a number (decimal or hex). Labels should be preceded by a "#".
"%"     - Treat the following string as a hex number. Must follow a "#" character, can't have a "%" by itself.
< "+"     - Used after a symbolic name to add a constant offset; not supported in this version of icasm, but may work. >

Constants:

Decimal numbers may be negative or positive. Hex numbers are written as positive only, although they may be interpreted as negative if their msb is a 1. Not all numbers may be negative: offsets and absolute word addresses must be positive, and in these cases hex numbers are treated as unsigned integers.

Character constants are not supported at this time, except in their numeric representation (of course).
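
To make the "#" and "%" rules concrete, here is a small Python sketch of how a numeric operand could be read. The names parse_constant and as_signed are made up for this illustration, and the 16-bit width used in the last line is an assumption; the real field width depends on the instruction.

```python
def parse_constant(text):
    """Parse a numeric operand such as "#11", "#-5" or "#%10".

    Hypothetical helper for illustration only.
    """
    if not text.startswith("#"):
        raise ValueError("numeric operands start with '#'")
    body = text[1:]
    if body.startswith("%"):
        digits = body[1:]
        if digits.startswith(("+", "-")):
            raise ValueError("hex constants cannot carry a sign")
        return int(digits, 16)  # "%" must follow "#": hex number
    return int(body, 10)        # plain decimal, may be negative


def as_signed(value, bits):
    """Reinterpret an unsigned bit pattern as two's complement."""
    sign_bit = 1 << (bits - 1)
    return (value & (sign_bit - 1)) - (value & sign_bit)


print(parse_constant("#%10"))                    # 16
print(parse_constant("#11"))                     # 11
print(as_signed(parse_constant("#%fffc"), 16))   # -4 (msb is 1)
```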
 

Examples:

add r1,r2, r3 (r1 <- r2 + r3)
ldi r1, #%10 (load r1 with 16 [hex 10])
ldiu r1, #11 (load the upper 16 bits of r1 with 11 [decimal])
brt r2, loop (branch if r2 is non-zero to statement with label "loop"; relative address is computed by icasm automatically)
brl r1, subrt1 (branch-and-link to statement with label "subrt1"; the absolute address is computed by icasm automatically)
ld r1, #8(r2) (load r1 with the contents of memory at address 8+[value of r2])
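
The address arithmetic icasm performs for brt/brf and brl can be sketched as follows. The formulas are not stated anywhere on this page; they are inferred from the machine words in the Section 9 listing, so treat them as an assumption: relative branches appear to use a signed word offset counted from the instruction after the branch, and brl appears to use the target's word address (byte address divided by 4). The function names are made up.

```python
WORD_BYTES = 4


def relative_word_offset(branch_addr, target_addr):
    """Signed word offset from the instruction following the branch.

    Inferred from the listing, not from a documented rule.
    """
    return (target_addr - (branch_addr + WORD_BYTES)) // WORD_BYTES


def absolute_word_address(target_addr):
    """Byte address converted to a word address (inferred, as above)."""
    return target_addr // WORD_BYTES


# "brt r6, lp" sits at 000144 and lp is at 00012c.
print(relative_word_offset(0x144, 0x12C))   # -7
# "brl r14, strt" targets strt at 000100.
print(hex(absolute_word_address(0x100)))    # 0x40
```

-7 is fff9 in 16-bit two's complement, which matches the fff9 in the listing word 10fff960; likewise 40 appears in a000040e. The exact bit-field layout of each instruction is given in the IcpuED Instruction Set, not here.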

Other:

Macros are not supported at this time.
 
 

7. Assembler mnemonics

For a full list of valid IcpuED mnemonics, see IcpuED Instruction Set.

icasm also recognizes the directives given above: pc, equ, ds, dw
There is no "define byte" directive at this time.
 
 

8. Suggested programming conventions

  1. Use r0 for a constant '0'
  2. Use r1 for a constant '1'
  3. Use r4 for a constant '4' - for word pointer incrementation
  4. Use r14 for subroutine linkage (see example below)
  5. Use r15 for the stack pointer (see example below); initialize it to the highest word address in main memory: %7ffffc (8MB DRAM)
  6. If a no-op is needed and you are following convention 1 above, use:  add  r0,r0,r0
  7. Place a branch to the entry point of your program at memory address  %000000, and start your program later on in memory (say at  %100). 
    This way you can load multiple programs into memory at once and select which one is executed by modifying the above branch. 
    (IcpuED always starts execution at pc=%000000.)
  8. Reserve locations %000004 and %000008 for exception routine vectors, i.e., a jump to the Interrupt Service Routine (ISR)
    is put at %000008. (These "vector" addresses are fixed in IcpuED and may not be changed, at least not in the common version.)
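
The stack pointer value in convention 5 is just the memory size minus one word. Since the benchmark in Section 9 loads it with an ldi/ldiu pair, the short Python sketch below also splits the value into its upper and lower 16-bit halves. The function names are made up for this illustration.

```python
WORD_BYTES = 4


def top_word_address(memory_bytes):
    """Highest word-aligned byte address in a memory of the given size."""
    return memory_bytes - WORD_BYTES


def split_halves(value):
    """Split a 32-bit value into (upper 16 bits, lower 16 bits)."""
    return (value >> 16) & 0xFFFF, value & 0xFFFF


sp = top_word_address(8 * 1024 * 1024)   # 8MB DRAM
print(hex(sp))                           # 0x7ffffc
upper, lower = split_halves(sp)
print(hex(upper), hex(lower))            # 0x7f 0xfffc
```

These halves are the operands of "ldi r15, #%fffc" and "ldiu r15, #%7f" in the benchmark listing.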

Example of subroutine calling and return assuming above conventions:

          brl r14, subrt3 branch to subroutine subrt3
           .
           .
           .
           .
   subrt3 st #0(r15),r14  store the old value of the PC on the stack
          sub r15,r15,r4  decrement stack pointer
           .
           .   body of subroutine, with other calls
           .
          add r15,r15,r4  increment stack pointer
          ld r14,#0(r15)  restore link
          bri r14         return from the subroutine call
 

9. Complete example program.

This is the benchmark program appearing elsewhere in ICED (or ELE 405) documentation. This is the listing file output of icasm; the source file is bmk.src. The source file does not contain the first three columns. Column one is just a line number. Column two contains the data or instruction data address. Column three contains the data or instruction machine code.
 

IcpuED Assembler icasm.c Version 5.0
        Input source code file "bmk.src".
Pass 1:  0 errors.
     1                          * IcpuED basic benchmark and test program (not exhaustive).
     2                          *  Note: in order to make it fit in 32 words the initial
     3                          *  part (before "strt") and all pc directives must be removed.
     4
     5                                  pc #0
     6   000000   a000040e              brl r14, strt branch to start of program
     7
     8                                  pc #%8
     9   000008   00000000              dw #0   ISR vector goes here
    10
    11                          * Start of program
    12                                  pc #%100
    13
    14   000100   c000000c      strt    ldi r12, #0     r12 holds an msb mask
    15   000104   c18000cc              ldiu r12, #%8000
    16   000108   c0000008              ldi r8, #0      r8 - running negative sum
    17   00010c   c0000009              ldi r9, #0      r9 - running positive sum
    18   000110   c0000807              ldi r7, #8      r7 - loop count limit
    19   000114   c0000000              ldi r0, #0      r0 - used for constant 0
    20   000118   c0000101              ldi r1, #1      r1 - used for constant 1
    21   00011c   c0fffc0f              ldi r15, #%fffc r15 - stack pointer - had to add ldiu
    22   000120   c1007fff              ldiu r15, #%7f
    23   000124   c0000002              ldi r2, #0      r2 - loop counter
    24   000128   c0030004              ldi r4, data    r4 - pointer to the data (not r4 convention)
    25
    26   00012c   89200243      lp      add r3, r4, r2  add loop count to data pointer (byte)
    27   000130   c2800035              ldb r5, #0(r3)  get the byte from the array
    28   000134   410005f0              st #0(r15), r5  save the byte on the stack (param)
    29   000138   a000080e              brl r14, elmnt  branch to the subroutine
    30   00013c   89200212              add r2, r1, r2  increment the loop counter
    31   000140   8a200726              sub r6, r2, r7  see if the count has been reached
    32   000144   10fff960              brt r6, lp      if not, go back (loop)
    33   000148   40808840              stb #8(r4), r8  save the negative sum in memory
    34   00014c   40809940              stb #9(r4), r9  save the positive sum in memory
    35   000150   03000000              halt            stop the CPU, assert "cpu-halted"
    36
    37                                  pc #%200
    38   000200   c30000fb      elmnt   ld r11, #0(r15)         get the datum off of the stack
    39   000204   8a2000ba              sub r10, r11, r0        dummy; subtract 0 from the datum
    40   000208   85000caa              and r10, r10, r12       test the msb (negative datum?)
    41   00020c   120002a0              brf r10, pos            if not, go to positive # handler
    42   000210   89200b88              add r8, r8, r11         negative, so add to negative sum
    43   000214   12000100              brf r0, ret             skip to return
    44   000218   89200b99      pos     add r9, r9, r11         positive, so add to positive sum
    45   00021c   200000e0      ret     bri r14                 return from subroutine
    46
    47                                  pc #%300
    48   000300   f00238f9      data    dw #%f00238f9   data, first four bytes:  -16, 2, 56, -7
    49   000304   0ffde700              dw #%0ffde700   data, last four bytes:   15, -3, -25, 0
    50   000308   00ffeedd      results dw #%00ffeedd   garbage; last two bytes --> 49 and cd
    51
    52                                  pc #%7ffffc     set up the stack
    53   7ffffc   ffffffff      stack   dw #%ffffffff   garbage; stack
Pass 2:  0 errors.
 
 
 

Appendix. Hex file format

You may never need to know the following details, but they are here for your reference should you ever want to hack the binary or create interfaces for ICED memory loading or retrieval.

The output hex file format of icasm consists of a series of ASCII characters interpreted as hex numbers, one number per line, no blank lines. There may be multiple blocks in a file, with no blank lines between the blocks.

Each block consists of a 3-word header followed by the data. The header entries are:
 

  1. Starting address in main memory where the block is to be placed.
  2. Size in bits of each datum, written in hex like every other number in the file; valid values are 8 and 20 (i.e., 8 and 32 decimal).
  3. Number of data in the block, not including the header.

Example:

 There are two blocks in the following example, the first composed of 5 bytes to start at location 1000 and the second consisting of 2 32-bit words to start at location 2000. Therefore, when this code is loaded into ICED protosys memory, address 1002 holds 03, address 2004 holds 9abcdef0 (all numbers are hex), etc.

    1000
       8
       5
      01
      02
      03
      04
      05
    2000
      20
       2
12345678
9abcdef0
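
For anyone writing a loader or another interface, the block format can be read with a few lines of Python. This is a minimal sketch, not the monitor's actual loader; the function name parse_hex_blocks is made up. It assumes that every number, including the count in the header, is hex, and that consecutive data in a block are placed at consecutive addresses stepping by the datum size in bytes, as the example above implies.

```python
def parse_hex_blocks(text):
    """Parse icasm hex-file text into a {byte address: datum} mapping.

    Hypothetical helper for illustration only.
    """
    words = [int(line, 16) for line in text.split()]
    memory = {}
    i = 0
    while i < len(words):
        # 3-word header: start address, datum size in bits, datum count.
        start, bits, count = words[i:i + 3]
        if bits not in (8, 0x20):
            raise ValueError(f"unsupported datum size: {bits:#x}")
        step = bits // 8  # bytes occupied by each datum
        data = words[i + 3:i + 3 + count]
        if len(data) != count:
            raise ValueError("block is shorter than its header claims")
        for n, datum in enumerate(data):
            memory[start + n * step] = datum
        i += 3 + count
    return memory


example = """
    1000
       8
       5
      01
      02
      03
      04
      05
    2000
      20
       2
12345678
9abcdef0
"""
mem = parse_hex_blocks(example)
print(hex(mem[0x1002]), hex(mem[0x2004]))   # 0x3 0x9abcdef0
```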