coursework

Two-Pass Assembler

ANSI C90 compiler-like assembler. Two-stage: symbol-table build, then instruction parsing and base-4 machine-code emission. Built for the Systems Programming Laboratory course.

Year: 2023
Discipline: EMB
Stack: C (ANSI C90) · Make · GDB
Two-Pass Assembler — source, symbol table, and base-4 machine code output

The problem

Course assignment for Systems Programming Laboratory in C: implement a two-pass assembler in ANSI C90 for a custom instruction set, emitting base-4 machine code. No external libraries. Stand on your own data structures.

What I built

A two-stage compiler-like pipeline. Pass one walks the source, builds a symbol table, allocates memory for instructions and data, and resolves sizes. Pass two emits machine code with every symbol and addressing mode finalized. Strict ANSI C90 throughout — no comfy C99 features, no <stdbool.h> shortcuts.
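
A minimal sketch of why two passes are needed, written in C90 style. It is not the project's code: the `struct line` input, the `symtab` layout, the function names `first_pass` and `second_pass`, the load address of 100, and the one-word-per-line sizing are all assumptions made for illustration. Pass one records where each label lives; pass two can then resolve a reference to a label defined later in the file.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define BUCKETS 16
#define MAX_SYMBOLS 32
#define LOAD_ADDRESS 100 /* assumed start address, not taken from the spec */

/* One toy source line (hypothetical format): every line occupies one word. */
struct line {
    const char *label;   /* label defined on this line, or NULL */
    const char *operand; /* label referenced by this line, or NULL */
};

struct symbol {
    const char *name;
    int address;
    struct symbol *next; /* next symbol in the same bucket */
};

struct symtab {
    struct symbol pool[MAX_SYMBOLS]; /* fixed pool, so no malloc is needed */
    struct symbol *bucket[BUCKETS];
    int used;
};

static unsigned hash_name(const char *s)
{
    unsigned h = 5381u;
    while (*s != '\0') {
        h = h * 33u + (unsigned char)*s;
        s++;
    }
    return h % BUCKETS;
}

static struct symbol *symtab_find(struct symtab *t, const char *name)
{
    struct symbol *p;
    for (p = t->bucket[hash_name(name)]; p != NULL; p = p->next) {
        if (strcmp(p->name, name) == 0) {
            return p;
        }
    }
    return NULL;
}

/* Pass one: give every line an address and record each label.
   Returns the number of duplicate label definitions. */
int first_pass(struct symtab *t, const struct line *src, int n)
{
    int i, errors = 0;
    for (i = 0; i < BUCKETS; i++) {
        t->bucket[i] = NULL;
    }
    t->used = 0;
    for (i = 0; i < n; i++) {
        struct symbol *s;
        unsigned h;
        if (src[i].label == NULL) {
            continue;
        }
        if (symtab_find(t, src[i].label) != NULL) {
            errors++;
            continue;
        }
        assert(t->used < MAX_SYMBOLS);
        s = &t->pool[t->used++];
        h = hash_name(src[i].label);
        s->name = src[i].label;
        s->address = LOAD_ADDRESS + i;
        s->next = t->bucket[h];
        t->bucket[h] = s;
    }
    return errors;
}

/* Pass two: all labels are known, so forward references resolve.
   out[i] receives the operand's address, or -1 if line i has no operand.
   Returns the number of undefined symbols. */
int second_pass(struct symtab *t, const struct line *src, int n, int *out)
{
    int i, errors = 0;
    for (i = 0; i < n; i++) {
        struct symbol *s;
        out[i] = -1;
        if (src[i].operand == NULL) {
            continue;
        }
        s = symtab_find(t, src[i].operand);
        if (s == NULL) {
            errors++;
            continue;
        }
        out[i] = s->address;
    }
    return errors;
}
```

With three lines where line 0 references a label END that is only defined on line 2, `first_pass` stores END at address 102 and `second_pass` writes 102 into `out[0]`. A single pass would have met the reference before the definition.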

Highlights

  • Symbol table with hashed lookup; each symbol is tagged as a label, an external, or an entry.
  • Instruction parser covering every addressing mode in the spec, including the weird ones.
  • Macro pre-processing as a separate pass before assembly.
  • Base-4 encoding emitted alongside human-readable object files for debugging.
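
The base-4 output in the last bullet can be illustrated with a short C90-style sketch. The write-up does not give the word width or the digit alphabet, so the 12-bit word, the plain digits '0' to '3', and the name `to_base4` are assumptions here, not the assignment's actual format. Each base-4 digit carries exactly two bits, so the conversion is a shift-and-mask loop.

```c
#include <assert.h>

#define WORD_BITS 12                /* assumed word width, not from the spec */
#define WORD_DIGITS (WORD_BITS / 2) /* one base-4 digit per two bits */

/* Write `word` as WORD_DIGITS base-4 digits, most significant digit first.
   `buf` must have room for WORD_DIGITS + 1 characters. */
void to_base4(unsigned word, char *buf)
{
    int i;
    assert(buf != 0);
    word &= (1u << WORD_BITS) - 1u; /* keep only the low WORD_BITS bits */
    for (i = WORD_DIGITS - 1; i >= 0; i--) {
        buf[i] = (char)('0' + (word & 3u)); /* lowest two bits give one digit */
        word >>= 2;
    }
    buf[WORD_DIGITS] = '\0';
}
```

Under these assumptions, the value 27 (1·16 + 2·4 + 3) comes out as "000123". Fixed-width output keeps every word the same length, which makes an object file easy to compare line by line against a readable listing.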

What I learned

C90 makes you appreciate every affordance you get in higher-level languages — and then you stop missing them, because you finally understand what they're doing underneath. The fastest way to learn memory management is to have none of it handed to you.