Reality Signal Processor/CPU Core: Difference between revisions

From N64brew Wiki
Revision as of 22:46, 3 April 2022

Scalar unit (SU)

The scalar unit is the half of the RSP core that is similar to a standard MIPS R4000 32-bit CPU. It has 32 32-bit registers (conventionally called r0-r31) and implements most standard opcodes. This page does not describe the whole scalar unit, as standard MIPS documentation suffices, but it highlights the main differences.

Missing opcodes

The following opcodes are not implemented by the RSP:

  • Multiplication and division. The RSP does not have a multiplication unit, so there are no MULT, MULTU, DIV, DIVU, MFHI, MFLO, MTHI, or MTLO opcodes.
  • 64-bit instructions. The RSP only has 32-bit scalar registers in the SU, so there are no 64-bit opcodes (the ones starting with D, such as DADDIU, DSRL, etc.) and no 64-bit memory accesses such as LD, SD, LDL, SDL.
  • Opcodes for misaligned memory accesses. All memory accesses to DMEM work correctly on misaligned addresses with the standard opcodes (LW / SW, LH / LHU / SH), so there are no LWL, LWR, SWL, or SWR opcodes.
  • Traps and exceptions. The RSP does not implement any form of interrupt or exception handling, so there are neither SYSCALL nor trap instructions (TGE, TLT, etc.). BREAK is available but it has a special behavior (see below).
  • Likely branches. The "likely" variant of all branches is not supported. The missing opcodes are the ones ending with L (such as BEQL, BLEZL, etc.).

Memory access

The RSP is a Harvard architecture. All opcodes are fetched from IMEM (4 KB) and all data is accessed in DMEM (4 KB).

The PC register is 12-bit. All higher address bits in branch / call instructions are thus ignored. When PC reaches the last opcode (at 0xFFC), execution continues to the first opcode in IMEM (PC wraps to 0x000).
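The PC behavior described above can be modelled with a single mask. This is a minimal sketch, not taken from any emulator: the names `IMEM_SIZE`, `PC_MASK`, `next_pc` and `branch_target` are hypothetical, and it only covers sequential stepping and address truncation.

```python
IMEM_SIZE = 0x1000        # 4 KB of instruction memory, as stated in the text
PC_MASK = IMEM_SIZE - 1   # keeps only the lowest 12 bits


def next_pc(pc):
    """Advance to the following opcode, wrapping inside IMEM."""
    return (pc + 4) & PC_MASK


def branch_target(address):
    """Drop every address bit above bit 11, since the PC cannot hold them."""
    return address & PC_MASK
```

With these helpers, `next_pc(0xFFC)` gives `0x000`, and a jump to a full bus address such as `0x04001080` lands on `0x080`.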

All accesses to DMEM are performed using the lowest 12 bits of the address calculated by the load/store instruction (higher bits are ignored). Moreover, contrary to standard MIPS architecture, the RSP can correctly perform misaligned memory accesses (e.g. it is possible to fetch a 32-bit word at address 0x001, which will contain the 4 bytes at 0x1-0x4). Standard MIPS architecture allows misaligned accesses only through the LWL/LWR or SWL/SWR pairs, which are not required on the RSP.
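As an illustration of the addressing rule, the following sketch models a 32-bit load from DMEM. The function name `load_word` is hypothetical. Two points are assumptions rather than statements from this page: the word is assembled in big-endian order (as on the rest of the N64), and an access that runs past the end of DMEM continues from address 0x000.

```python
DMEM_SIZE = 0x1000          # 4 KB of data memory, as stated in the text
DMEM_MASK = DMEM_SIZE - 1   # keeps only the lowest 12 bits


def load_word(dmem, address):
    """Model a 32-bit load: mask the address, no alignment requirement."""
    start = address & DMEM_MASK
    # Assumption: each byte address is masked on its own, so an access
    # crossing the end of DMEM wraps around to the beginning.
    data = bytes(dmem[(start + i) & DMEM_MASK] for i in range(4))
    # Assumption: big-endian byte order.
    return int.from_bytes(data, "big")
```

For example, if bytes 0x1 to 0x4 of DMEM hold `11 22 33 44`, then `load_word(dmem, 0x001)` and `load_word(dmem, 0xABCD1001)` both return `0x11223344`.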

Vector Unit (VU)

The VU is the internal unit of the RSP CPU core that is able to perform fixed-point SIMD calculations. It is a proprietary design which does not follow any standard specification. Its opcodes and registers are exposed to the core via the COP2 interface.

Vector registers and glossary

The VU contains 32 128-bit SIMD registers, each organized as 8 lanes of 16 bits. Most VU opcodes perform the same operation in parallel on each of the 8 lanes. The arrangement is thus similar to x86 SSE2 registers in EPI16 format.
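The lane layout can be sketched by treating a register as 16 bytes and splitting it into eight 16-bit values. The helpers `to_lanes`, `from_lanes` and `lanewise` are hypothetical names, and the wrapping addition used in the example is a generic per-lane operation chosen for illustration, not the semantics of any actual VU opcode.

```python
import struct


def to_lanes(reg):
    """Split a 16-byte register into 8 unsigned 16-bit lanes.

    Lane 0 is taken from the highest part of the register (big-endian).
    """
    assert len(reg) == 16
    return list(struct.unpack(">8H", reg))


def from_lanes(lanes):
    """Pack 8 lane values back into 16 bytes, truncating each to 16 bits."""
    return struct.pack(">8H", *[x & 0xFFFF for x in lanes])


def lanewise(op, a, b):
    """Apply the same two-operand operation to every pair of lanes."""
    return from_lanes([op(x, y) for x, y in zip(to_lanes(a), to_lanes(b))])
```

Calling `lanewise(lambda x, y: x + y, a, b)` adds the two registers lane by lane, with each result truncated to 16 bits independently of its neighbours.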

The vector register array is called VPR in this document, so VPR[4] refers to the fifth register (usually called v4 in assembly). When referring to specific portions of a register, we use the following conventions:

  • VPR[vt][4..7] refers to byte indices, that is bytes from 4 to 7, counting from the higher part of the register (in big-endian order).
  • VPR[vt]<4..7> refers to specific lane indices, that is lanes from 4 to 7 counting from the higher part of the register (in big-endian order).
  • Within each lane, VPR[vt]<2>(3..0) refers to inclusive bit ranges. Notice that bits are counted as usual in little-endian order (bit 0 is the lowest, bit 15 is the highest), and thus they are written as (high..low).

Ranges are specified using the beg..end inclusive notation (that is, both beg and end are part of the range).

The concatenation of disjoint ranges is written with a comma (,). For instance, [0..3,8..11] means 8 bytes formed by concatenating 4 bytes starting at 0 with 4 bytes starting at 8.
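The notation above maps directly onto slicing a 16-byte value. The following sketch assumes a register is held as a Python `bytes` object with byte 0 as the most significant byte. All four helper names are hypothetical and exist only to make the conventions concrete.

```python
def byte_range(reg, beg, end):
    """VPR[vt][beg..end]: inclusive byte range, byte 0 is the highest."""
    return reg[beg:end + 1]


def lane_range(reg, beg, end):
    """VPR[vt]<beg..end>: inclusive lane range, lane 0 is the highest."""
    return [int.from_bytes(reg[2 * i:2 * i + 2], "big")
            for i in range(beg, end + 1)]


def lane_bits(reg, lane, high, low):
    """VPR[vt]<lane>(high..low): inclusive bit range, bit 0 is the lowest."""
    value = int.from_bytes(reg[2 * lane:2 * lane + 2], "big")
    width = high - low + 1
    return (value >> low) & ((1 << width) - 1)


def concat_bytes(reg, *ranges):
    """VPR[vt][a..b,c..d]: concatenation of disjoint inclusive byte ranges."""
    return b"".join(byte_range(reg, beg, end) for beg, end in ranges)
```

With `reg = bytes(range(16))`, `byte_range(reg, 4, 7)` yields bytes `04 05 06 07`, `lane_range(reg, 4, 7)` yields `[0x0809, 0x0A0B, 0x0C0D, 0x0E0F]`, `lane_bits(reg, 2, 3, 0)` yields `5` (the low 4 bits of lane 2, which is `0x0405`), and `concat_bytes(reg, (0, 3), (8, 11))` yields bytes `00 01 02 03 08 09 0A 0B`.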