Skip to content

A virtual CPU with some memory and I/O capabilities and some support for a 6502-based basic assembly

Notifications You must be signed in to change notification settings

charlie-gallagher/cpu

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

86 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

Virtual CPU

This is a quick, informal project emulating a processor. You can set a clock frequecy with the CLK_FREQ constant defined in main.h -- it's given in Hz. For watching instructions, something like 1 or 2 is fun, but pretty painfully slow.

See opcodes.c for a full list of implemented opcodes, and opcodes.h for the constant definitions for all opcodes I plan on implementing.

I've implemented a basic assembler that understands mnemonics, comments, and labels. It's not smart enough to figure out addressing modes, so these are included in the mnemonics explicitly. See the section "Assembler" below.

Compile

This program depends on the unistd.h header file, so it can't be compiled on Windows.

gcc -I include -o cpu src/*.c

# Or just make
make

Example

I've included a few test files, which are the best way to run this for the first time.

./cpu test/test_file.txt
./cpu test/subroutine_test.txt

They are good examples of the structure of the programs you can write for this CPU. Some of the other files are for testing various features, trying to trick the compiler and so on.

Assembler

The assembler is basic. The user is responsible for writing machine code instructions using mnemonics of the form "INSTRUCTION_MODE". MODE can be one of "D", "I", "X", or "M" for Direct, Indirect, indexed with the X register, and iMmediate address modes. The operands may be in hex (format xxh, e.g. 4fh) or decimal if the hex format is not used. So, for example, 25 will be read as decimal, 2fh as hexadecimal, and 2f as 2, because atoi is used to convert numbers. Alphabetical hex characters may be uppercase or lowercase.

You can add up to 20 labels per file. I can't imagine you being able to use many more than this when you only have 256 bytes of memory. The array of labels is printed at startup so you can troubleshoot if anything goes wrong. Labels are most useful for subroutines, jumping to the start of the program, and defining variables. See test/subroutine_test.txt for an example. Labels are processed before the instructions, so they can be used before they are defined.

JMP
start

; Define some subroutines and data
mysubroutine:
    ...

start:
    ...   ; Start of program

Comments begin with ; and may be included on a line by itself or at the end of a line. Any part of a line after the ";" is dropped during processing.

Whitespace is also stripped from the lines, so you can indent lines for clarity.

Instructions are placed in the first addresses in RAM. Every operation is 2 bytes (1 byte instruction, 1 byte operand), and blank lines and comment lines do not affect the numbering of address locations. So, for example,

; Begin program
LDA_M  ; Load accumulator
2fh

; Store value in 3F
STA_D  ; Store accumulator
3Fh

will be entered into RAM as

0: LDA_M
1: 2fh
2: STA_D
3: 3fh

Conditional branch instructions go to address PC + OPERAND, so to jump backwards you must pass a negative number or the appropriate hex conversion for a 1-byte 2's complement number. Conditional jump addressing matches the addressing used by the 6502: the offset is relative to the next value of the PC after the operand of the branch instruction. In other words, to properly jump from a conditional branch instruction, go to the next word in memory and count from there.

LDA_M
10
STA_X
IO_START
BNE
-6      ; Branch back to "LDA_M"

Every instruction takes an operand, even implied addressing instructions like INCX. If you use INCX, you must follow it with a null byte (or any other byte, but for clarity's sake use a null byte).

The types of input allowed are:

  • Mnemonics (see next section for instruction set)
  • Decimal numbers (no suffix)
  • Hexadecimal numbers (h suffix)
  • The IO_START builtin constant
  • Character constants surrounded by single quotes (e.g., 'm' or 'c')
  • A label definition of the form [A-Za-z_]+: (e.g., a: or hello_world:)
  • A label name defined in your file

Labels may include characters or underscores (_). They are attached to the address of the next RAM entry, so you won't run into trouble if you define multiple labels for a single address.

one:
two:
    LDA_M
    10h

In this case, both one and two will be associated with address 0 in RAM, a.k.a. the instruction LDA_M. Labels of course do not affect the arrangement of memory in RAM.

Instruction set

The instructions use suffixes to indicate the addressing mode.

  • "D" Direct, or absolute, addressing
  • "I" Indirect addressing
  • "X" Indexed addressing with the X register
  • "M" Immediate addressing

I have tested many but not all of the instructions. Report bugs if you find them.

ADC_D       Add number to accumulator
ADC_I
ADC_M
ADC_X
AND_D       AND operand with accumulator
AND_I
AND_M
AND_X
BCC         Jump if carry clear
BCS         Jump if carry set
BEQ         Jump if zero flag set
BMI         Jump if negative flag set
BNE         Jump if zero flag not set
BPL         Jump if negative flag clear
CLC         Clear carry flag
CMP_D       Compare operand to accumulator
CMP_I
CMP_X
DECX        Decrement X register
DEC_D       Decrement place in memory
DEC_I
DEC_X
HLT         Halt
INCX        Increment X register
INC_D       Increment place in memory
INC_I
INC_X
JMP         Unconditional jump
JSR         Jump to subroutine at operand address
LDA_D       Load accumulator
LDA_I
LDA_M
LDA_X
LDX_D       Load X register
LDX_M
OR_D        OR operand with accumulator
OR_I
OR_M
OR_X
PHA         Push accumulator onto stack
PHP         Push status register onto stack
PLA         Pull accumulator from stack
PLP         Pull status register from stack
RTS         Return from subroutine
SBC_D       Subtract number from accumulator
SBC_I
SBC_M
SBC_X
SEC         Set carry flag
STA_D       Store accumulator
STA_I
STA_X
TAX         ...and vice versa
TSX         ...and vice versa
TXA         Transfer from index to accumulator...
TXS         Transfer from index to stack pointer...

Known problems and deficiencies

While this is close to the Motorolla 6502 instruction set, it is not a copy of it.

  • The 6502 has two registers besides the accumulator, X and Y, but I only included X (index register)
  • All instructions must be 2 bytes, even if they do not take an operand
  • I haven't implemented bit shift or bit testing instructions (yet)

Report bugs to [email protected].


Charlie Gallagher, September 2021

About

A virtual CPU with some memory and I/O capabilities and some support for a 6502-based basic assembly

Resources

Stars

Watchers

Forks

Packages

No packages published