MIPS Interface: Difference between revisions

From N64brew Wiki
No edit summary
(Rewrite RepeatMode with better examples)
Line 41: Line 41:
| 9 | Upper | Upper mode enabled.
| 9 | Upper | Upper mode enabled.
| 8 | EBus | EBus mode enabled.
| 8 | EBus | EBus mode enabled.
| 7 | Repeat | Repeat mode enabled, Automatically clears after a single write.
| 7 | Repeat | Repeat mode enabled, Automatically clears after a single Rambus write.
| 6-0 | RepeatCount[6:0] | Number of bytes (minus 1) to write in repeat mode
| 6-0 | RepeatCount[6:0] | Number of bytes (minus 1) to write in repeat mode
}}
}}
Line 71: Line 71:
| 10 | SetEBus | Set Ebus mode.
| 10 | SetEBus | Set Ebus mode.
| 9 | ClearEBus | Clear Ebus mode.
| 9 | ClearEBus | Clear Ebus mode.
| 8 | SetRepeat | Set repeat mode. Automatically clears after a single write.
| 8 | SetRepeat | Set repeat mode. Automatically clears after a single Rambus write.
| 7 | ClearRepeat | Clear repeat mode.
| 7 | ClearRepeat | Clear repeat mode.
| 6-0 | RepeatCount[6:0] | Number of bytes (minus 1) to write in repeat mode
| 6-0 | RepeatCount[6:0] | Number of bytes (minus 1) to write in repeat mode
Line 90: Line 90:
: '''EBus mode:''' The lower 4 bits of the 32bit word are mapped onto 4 bits of EBus.<Br>In typical operation, EBus is used by RDP and VI to access the extra 9th bit (aka parity/error bit) that RDRAM provides for each byte. This mode allows the CPU to read this extra information back.<Br>Unfortunately this mode doesn't appear to be useful for writing to Antialiased framebuffers, as you can't combine a normal mode write and an EBus mode write without overwriting each other (Future testing required, maybe 64bit transfers work?)
: '''EBus mode:''' The lower 4 bits of the 32bit word are mapped onto 4 bits of EBus.<Br>In typical operation, EBus is used by RDP and VI to access the extra 9th bit (aka parity/error bit) that RDRAM provides for each byte. This mode allows the CPU to read this extra information back.<Br>Unfortunately this mode doesn't appear to be useful for writing to Antialiased framebuffers, as you can't combine a normal mode write and an EBus mode write without overwriting each other (Future testing required, maybe 64bit transfers work?)


: '''Repeat Mode:''' Writes cause a repeating pattern of '''RepeatCount+1''' bytes (up to 128 bytes) to be written. Reading can cause a hang (further testing needed).<br> First a 64bit value is loaded into the DBus FIFO to be the pattern:
: '''Repeat Mode:''' 32bit writes result in <code>RepeatCount + 1</code> bytes being written, with the same 32bit word repeating every 32bits. Reading with this mode can cause a hang. <br>In this mode, MI duplicates the 32bit word into both the upper and lower half of DBUS, and then lies to RI about the transfer count. Instead of 4, it sends ```RepeatCount```. The full Count is inserted into the Rambus request packet and the Rambus device uses the lower 3 bits for byte masking. RI looks at RepeatCount[6:3] to calculate the number 64bit transfers.
: This mode is labeled as '''"Init Mode"''' in some documentation. It's only used once during IPL3's RDRAM initialization to do a broadcast write to the '''Delay''' register. This is needed because after reset, the default timings in the '''Delay''' register result in the Rambus device sampling the data from the Rambus way too late and sampling garbage data. RI's timings are baked into hardware and can't be changed, but IPL3 uses a clever trick:<Br>
By enabling Repeat Mode and setting RepeatCount to 15, the 32bit value is repeated 4 times, and the Rambus device now samples valid data into the '''Delay''' register. However, the default timings aren't off by an integer multiple of the RCP clock, and the Rambus device samples halfway between two repetitions of the 32bit word. To work around this, IPL3 rotates the value by 16 bits before writing.


:: {| class="wikitable"
|-
| rowspan="2" | Uncached 64bit write || The 64bit value is used directly
|-
| <code>reg64=0x01234567_89abcdef -> pattern=0x01234567_89abcdef</code>
|-
| rowspan="2" | Uncached 32bit write || Only the lower 32bits are sent over SYSAD, and that is duplicated into both upper and lower halves
|-
| <code>reg64=0x01234567_89abcdef -> pattern=0x89abcdef_89abcdef</code>
|-
| rowspan="2" | Uncached 16bit write
|| When writing less than 32bits, the cpu pre-shifts the value and MI sees it as a 32bit write. Normal byte masking is suppressed as it was replaced by RepeatCount.
|-
| <code>reg64=0x01234567_89abcdef, addr[1:0]=0b10 -> pattern=0xcdef0000_cdef0000</code>
|-
| rowspan="2" | Uncached 8bit write || ''As Above.'' Notice that some bytes are now unmasked.
|-
| <code>reg64=0x01234567_89abcdef, addr[2:0]=0b001 -> pattern=0xabcdef00_abcdef00</code>
|-
| rowspan="2" | Cache writeback || The last 8 bytes of the cacheline are used, everything else is discarded.
|-
| <code>cacheline="an ascii example" -> pattern=" example example"</code>
|}

: Then MI lies to RI about the transfer count, with '''RepeatCount''' overriding the correct value, causing RI to do <code>RepeatCount[6:3]</code> transfers, with MI repeating the same 64bit value for all transfers.<br> The Rambus device uses '''Addr[2:0]''' and '''RepeatCount[2:0]''' to implement byte masking for the first and last transfers. RI does not correctly implement support for unaligned writes that cross a 64bit boundary, so you end up with <code>RepeatCount[6:3] | RepeatCount[2:0] + Addr[2:0]</code> written, which discards the potential carry out from bit 2 to bit 3.

: This mode is labeled as '''"Init Mode"''' in some documentation. It's used once during Nintendo's IPL3's RDRAM initialization as a workaround for the default timings after RDRAM reset not being compatible with RI's hardcoded timings. See [[RDRAM#0x02_-_Delay|RDRAM Delay register]] for a detailed explanation.

: Repeat Mode is ideal for fast Memset operations, as you can quickly write a repeating pattern to up to 128 bytes of RAM with just two 32bit uncached writes.<br> For cache-coherency, you do need to invalidate those cache lines, but you can execute the CACHE instructions while MI is finishing the Repeat write with almost no impact to performance.<br> MI repeat is almost as fast as using RSP's DMA, but it will often be faster and more convenient to use MI repeat as the RSP is busy doing other things.

:: {| class="wikitable"
|+ Memset benchmarks for 1MiB of data.
|-
| 64bit uncached writes || 25.7 ms
|-
| 64bit cached writes || 49.8 ms
|-
| RSP DMA || 2.58 ms
|-
| MI repeat || 3.80 ms (plus 0.20ms with inline CACHE instructions)
|}

: If you are careful, it should be possible to do unaligned memsets too. An uncached byte write can be used to set '''Addr[2:0]''' to any value, and you can adjust '''RepeatCount[6:3]''' to compensate for the missing carry out, so the Rambus device writes the correct number of bytes.

==== <span style="display:none;">0x0430 0004 - MI_VERSION</code> ====
==== <span style="display:none;">0x0430 0004 - MI_VERSION</code> ====
----
----

Revision as of 05:20, 13 November 2023

The MIPS Interface (or MI) is one of multiple I/O interfaces in the RCP. It is the interface between the RCP and the VR4300 CPU, primarily used for enabling/disabling interrupts and checking their status.

Memory mapped registers are used to configure the MIPS Interface. The base address for these registers is 0x0430 0000, also known as MI_BASE. However, because all memory accesses in the CPU are made using virtual addresses, the following addresses must be offset appropriately. For non-cached reads/writes, add 0xA000 0000 to the address. As an example, to directly write to the MI_MODE register, use address 0xA430 0000.
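
The address arithmetic above can be sketched in C as follows. This is a minimal illustration; the macro and function names are hypothetical and not taken from any SDK.

```c
#include <stdint.h>

/* Physical base address of the MI registers (MI_BASE in the text). */
#define MI_BASE_PHYS    0x04300000u
/* Offset added to a physical address for non-cached access. */
#define UNCACHED_OFFSET 0xA0000000u

/* Hypothetical helper: physical address -> uncached virtual address. */
static inline uint32_t mi_uncached_addr(uint32_t phys)
{
    return phys + UNCACHED_OFFSET;
}
```

For example, mi_uncached_addr(MI_BASE_PHYS) yields 0xA4300000, the MI_MODE address quoted above.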

Note that some of these registers behave differently when written than when read. When writing to a register that has Set and Clear bits, write a 1 to the desired bit; writing a 0 has no effect. The behavior when writing 1 to both the Set and Clear bits of a pair at the same time is unknown.

Accesses beyond 0x0430 0010 are mirrored: only the lowest four address bits are taken into account for address decoding.
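
As a rough model of the mirroring (assuming it applies uniformly across the MI address range, which is an assumption), the register that an access lands on can be found by masking the address. The function name is hypothetical.

```c
#include <stdint.h>

/* Hypothetical model: only address bits [3:0] select an MI register,
 * so any mirrored address decodes to an offset of 0x0, 0x4, 0x8 or 0xC. */
static inline uint32_t mi_decoded_offset(uint32_t phys_addr)
{
    return phys_addr & 0xFu;
}
```

Under this model an access to 0x0430 0014 decodes to offset 0x4, i.e. MI_VERSION.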

Registers

Table Notation:

R = Readable bit
W = Writable bit
U = Undefined/Unused bit
-n = Default value n at power on
[x:y] = Specifies bits x to y, inclusively

0x0430 0000 - MI_MODE


When Reading:

MI_MODE 0x0430 0000
31:24 U-0 U-0 U-0 U-0 U-0 U-0 U-0 U-0
23:16 U-0 U-0 U-0 U-0 U-0 U-0 U-0 U-0
15:8 U-0 U-0 W-0 W-0 W-0 W-0 R-0 R-0
Upper Ebus
7:0 R-0 RW-0 RW-0 RW-0 RW-0 RW-0 RW-0 RW-0
Repeat RepeatCount[6:0]
bit 9 Upper: Upper mode enabled.
bit 8 EBus: EBus mode enabled.
bit 7 Repeat: Repeat mode enabled, Automatically clears after a single Rambus write.
bit 6-0 RepeatCount[6:0]: Number of bytes (minus 1) to write in repeat mode
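
A value read from MI_MODE could be unpacked along these lines. This is only a sketch based on the bit list above; the struct and function names are made up for illustration.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical decoded view of an MI_MODE read. */
typedef struct {
    bool     upper;        /* bit 9: Upper mode enabled */
    bool     ebus;         /* bit 8: EBus mode enabled */
    bool     repeat;       /* bit 7: Repeat mode enabled */
    unsigned repeat_bytes; /* bits 6-0 hold the byte count minus 1 */
} mi_mode_state;

static inline mi_mode_state mi_mode_decode(uint32_t value)
{
    mi_mode_state s;
    s.upper        = ((value >> 9) & 1u) != 0;
    s.ebus         = ((value >> 8) & 1u) != 0;
    s.repeat       = ((value >> 7) & 1u) != 0;
    s.repeat_bytes = (value & 0x7Fu) + 1u;
    return s;
}
```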

When Writing:

MI_MODE 0x0430 0000
31:24 U-0 U-0 U-0 U-0 U-0 U-0 U-0 U-0
23:16 U-0 U-0 U-0 U-0 U-0 U-0 U-0 U-0
15:8 U-0 U-0 W-0 W-0 W-0 W-0 RW-0 RW-0
Set Upper Clear Upper ClearDP Set Ebus Clear Ebus Set Repeat
7:0 W-0 RW-0 RW-0 RW-0 RW-0 U-0 RW-0 RW-0
Clear Repeat RepeatCount[6:0]
bit 13 SetUpper: Set Upper mode.
bit 12 ClearUpper: Clear Upper mode.
bit 11 ClearDP: Clear the DP Interrupt.
bit 10 SetEBus: Set Ebus mode.
bit 9 ClearEBus: Clear Ebus mode.
bit 8 SetRepeat: Set repeat mode. Automatically clears after a single Rambus write.
bit 7 ClearRepeat: Clear repeat mode.
bit 6-0 RepeatCount[6:0]: Number of bytes (minus 1) to write in repeat mode
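
The write layout differs from the read layout, so a separate set of constants is needed. The sketch below builds the value that enables repeat mode for a given byte count; the macro and function names are hypothetical, and clamping out-of-range counts is a choice made for this example, not documented behavior.

```c
#include <stdint.h>

/* Write-side bit positions of MI_MODE, following the list above. */
#define MI_MODE_W_SET_UPPER    (1u << 13)
#define MI_MODE_W_CLEAR_UPPER  (1u << 12)
#define MI_MODE_W_CLEAR_DP     (1u << 11)
#define MI_MODE_W_SET_EBUS     (1u << 10)
#define MI_MODE_W_CLEAR_EBUS   (1u << 9)
#define MI_MODE_W_SET_REPEAT   (1u << 8)
#define MI_MODE_W_CLEAR_REPEAT (1u << 7)

/* Hypothetical helper: value to write to MI_MODE so that the next
 * Rambus write covers nbytes bytes. nbytes is clamped to 1..128. */
static inline uint32_t mi_mode_repeat_value(unsigned nbytes)
{
    if (nbytes < 1u)   nbytes = 1u;
    if (nbytes > 128u) nbytes = 128u;
    return MI_MODE_W_SET_REPEAT | (uint32_t)(nbytes - 1u);
}
```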

MI's Mode:

The mode controls how reads and writes coming from the CPU are forwarded into RDRAM (technically, how 32bit SysAD bus transfers are mapped onto DBus and EBus, which are the internal RCP buses that transfer data from and to RDRAM).

RI only uses 64bit aligned reads/writes when accessing RDRAM. For operations smaller than 64bits, the data needs to be shifted into the correct bytes of DBUS. When writing, the Rambus device will use the lower 3 bits of the address and byte count as a byte mask. When reading, all 64bits are returned and the receiving device will need to implement its own byte masking. The VR4300's pipeline already shifts smaller loads/stores into the correct part of the 64bit double word and implements byte masking for loads, but because 8bit/16bit/32bit operations only result in 32bits of data being transferred across SysAD bus, MI still needs to shift each 32bit word into the correct half of DBus.

These modes impact all access to the RDRAM address range, not just Rambus registers, but actual memory too. Reads and Writes to other RCP registers and MMIO (like SI, PI or RSP DMEM/IMEM) are unaffected, as that data goes over CBus.

Normal mode: When all special mode bits are disabled, MI maps words onto the DBus as expected. 32bit words are shifted into the upper or lower half of the 64bits depending on Addr[2] and any "critical word first" rules.
EBus appears to default to the sign extension of each byte (further testing needed)
Upper mode: 32bit transfers are always shifted into the upper half of the 64bit bus.
This mode is labeled as "RDRAM register mode" in some documentation and is useful for accessing registers on Rambus devices. The Rambus Rreg, Wreg, and WregB commands are hardcoded to ignore the count field of request packets and always do a 32bit transfer. When misinterpreting the RI's 8 byte transfer, the Rambus device always takes the first 4 bytes (which are the upper 32bits of DBus, because RCP is big endian) and ignores the next 4 bytes. Normal mode should produce correct results for registers at even offsets, but you need to switch MI into Upper mode to correctly access odd registers.
EBus mode: The lower 4 bits of the 32bit word are mapped onto 4 bits of EBus.
In typical operation, EBus is used by RDP and VI to access the extra 9th bit (aka parity/error bit) that RDRAM provides for each byte. This mode allows the CPU to read this extra information back.
Unfortunately this mode doesn't appear to be useful for writing to Antialiased framebuffers, as you can't combine a normal mode write and an EBus mode write without overwriting each other (Future testing required, maybe 64bit transfers work?)
Repeat Mode: Writes cause a repeating pattern of RepeatCount+1 bytes (up to 128 bytes) to be written. Reading can cause a hang (further testing needed).
First a 64bit value is loaded into the DBus FIFO to be the pattern:
Uncached 64bit write The 64bit value is used directly
reg64=0x01234567_89abcdef -> pattern=0x01234567_89abcdef
Uncached 32bit write Only the lower 32bits are sent over SYSAD, and that is duplicated into both upper and lower halves
reg64=0x01234567_89abcdef -> pattern=0x89abcdef_89abcdef
Uncached 16bit write When writing less than 32bits, the cpu pre-shifts the value and MI sees it as a 32bit write. Normal byte masking is suppressed as it was replaced by RepeatCount.
reg64=0x01234567_89abcdef, addr[1:0]=0b10 -> pattern=0xcdef0000_cdef0000
Uncached 8bit write As Above. Notice that some bytes are now unmasked.
reg64=0x01234567_89abcdef, addr[2:0]=0b001 -> pattern=0xabcdef00_abcdef00
Cache writeback The last 8 bytes of the cacheline are used, everything else is discarded.
cacheline="an ascii example" -> pattern=" example example"
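
The 64bit, 32bit and cache writeback rows of the table can be modeled on a host machine as below. This is a sketch of the described behavior, not hardware code; the function names are hypothetical, and the 8bit/16bit cases are left out because their pre-shift depends on the address.

```c
#include <stdint.h>
#include <string.h>

/* Uncached 64bit write: the register value is the pattern. */
static inline uint64_t repeat_pattern_from_64(uint64_t reg64)
{
    return reg64;
}

/* Uncached 32bit write: the lower 32 bits are duplicated into
 * both halves of the 64bit pattern. */
static inline uint64_t repeat_pattern_from_32(uint64_t reg64)
{
    uint64_t low = reg64 & 0xFFFFFFFFull;
    return (low << 32) | low;
}

/* Cache writeback: only the last 8 bytes of the line become the pattern.
 * line_size is a parameter here; the table's example uses 16 bytes. */
static inline void repeat_pattern_from_cacheline(const uint8_t *line,
                                                 size_t line_size,
                                                 uint8_t out[8])
{
    memcpy(out, line + line_size - 8, 8);
}
```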
Then MI lies to RI about the transfer count, with RepeatCount overriding the correct value, causing RI to do RepeatCount[6:3] transfers, with MI repeating the same 64bit value for all transfers.
The Rambus device uses Addr[2:0] and RepeatCount[2:0] to implement byte masking for the first and last transfers. RI does not correctly implement support for unaligned writes that cross a 64bit boundary, so you end up with RepeatCount[6:3] | RepeatCount[2:0] + Addr[2:0] written, which discards the potential carry out from bit 2 to bit 3.
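
To make the dropped carry concrete, the following sketch evaluates one literal reading of the expression above, where the addition is confined to the low 3 bits, and compares it with the carry-correct sum. How the hardware consumes the resulting value is not specified here, so treat this purely as arithmetic; the function names are hypothetical.

```c
#include <stdint.h>

/* Literal reading of RepeatCount[6:3] | RepeatCount[2:0] + Addr[2:0]:
 * the low 3 bits are added without carrying into bit 3. */
static inline unsigned repeat_count_no_carry(unsigned repeat_count, unsigned addr)
{
    unsigned high = repeat_count & 0x78u;
    unsigned low  = ((repeat_count & 0x7u) + (addr & 0x7u)) & 0x7u;
    return high | low;
}

/* What a carry-correct addition would give, for comparison. */
static inline unsigned repeat_count_with_carry(unsigned repeat_count, unsigned addr)
{
    return (repeat_count & 0x7Fu) + (addr & 0x7u);
}
```

With RepeatCount = 15 and Addr[2:0] = 1, the first function gives 8 while the second gives 16. With RepeatCount = 12 and Addr[2:0] = 2 no carry occurs and both give 14.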
This mode is labeled as "Init Mode" in some documentation. It's used once during Nintendo's IPL3's RDRAM initialization as a workaround for the default timings after RDRAM reset not being compatible with RI's hardcoded timings. See RDRAM Delay register for a detailed explanation.
Repeat Mode is ideal for fast Memset operations, as you can quickly write a repeating pattern to up to 128 bytes of RAM with just two 32bit uncached writes.
For cache-coherency, you do need to invalidate those cache lines, but you can execute the CACHE instructions while MI is finishing the Repeat write with almost no impact to performance.
MI repeat is almost as fast as RSP DMA, and it will often be the faster and more convenient choice in practice because the RSP is usually busy doing other things.
Memset benchmarks for 1MiB of data.
64bit uncached writes 25.7 ms
64bit cached writes 49.8 ms
RSP DMA 2.58 ms
MI repeat 3.80 ms (plus 0.20ms with inline CACHE instructions)
If you are careful, it should be possible to do unaligned memsets too. An uncached byte write can be used to set Addr[2:0] to any value, and you can adjust RepeatCount[6:3] to compensate for the missing carry out, so the Rambus device writes the correct number of bytes.
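
The "two 32bit uncached writes" of an aligned repeat fill might look roughly like this. It is a minimal sketch: the function name is hypothetical, the destination is assumed to be 8-byte aligned, nbytes is assumed to be in the range 1..128, and cache invalidation is not shown. The register pointer is passed in so the sequence can also be exercised off-target.

```c
#include <stdint.h>

#define MI_MODE_W_SET_REPEAT (1u << 8)

/* Hypothetical helper: fill up to 128 bytes with a repeating 32bit word.
 * mi_mode:      pointer to MI_MODE (0xA4300000 on hardware).
 * dst_uncached: uncached, 8-byte aligned RDRAM address (assumption).
 * Repeat mode clears itself after one Rambus write, so MI_MODE has to
 * be written again before every burst. */
static inline void mi_repeat_fill(volatile uint32_t *mi_mode,
                                  volatile uint32_t *dst_uncached,
                                  uint32_t pattern, unsigned nbytes)
{
    /* Write 1: enable repeat mode with RepeatCount = nbytes - 1. */
    *mi_mode = MI_MODE_W_SET_REPEAT | ((nbytes - 1u) & 0x7Fu);
    /* Write 2: the store that MI expands into the repeated pattern. */
    *dst_uncached = pattern;
}
```

On hardware, a larger memset would call this in a loop, advancing the destination by nbytes each time and invalidating the affected cache lines as described above.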


0x0430 0004 - MI_VERSION


MI_VERSION 0x0430 0004
31:24 R-0 R-0 R-0 R-0 R-0 R-0 R-? R-?
RSP_VERSION[7:0]
23:16 R-0 R-0 R-0 R-0 R-0 R-0 R-? R-?
RDP_VERSION[7:0]
15:8 R-0 R-0 R-0 R-0 R-0 R-0 R-? R-?
RAC_VERSION[7:0]
7:0 R-0 R-0 R-0 R-0 R-0 R-0 R-? R-?
IO_VERSION[7:0]
bit 31-24 RSP_VERSION[7:0]: RSP hardware version
bit 23-16 RDP_VERSION[7:0]: RDP hardware version
bit 15-8 RAC_VERSION[7:0]: RAC hardware version
bit 7-0 IO_VERSION[7:0]: IO hardware version

Extra Details:

The full range of values that can appear here is not known for certain. Most consoles report 0x0202_0102, though emulators and other docs mention similar values such as 0x0101_0101 or 0x0201_0202. iQue retail consoles report 0x0202_b0b0. Testing should be performed on all revisions of the N64 motherboard and development systems. Results will be listed here.
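
Splitting a MI_VERSION value into its four fields is a matter of shifting out each byte. The sketch below uses hypothetical struct and function names.

```c
#include <stdint.h>

/* Hypothetical decoded view of MI_VERSION. */
typedef struct {
    uint8_t rsp; /* bits 31-24 */
    uint8_t rdp; /* bits 23-16 */
    uint8_t rac; /* bits 15-8  */
    uint8_t io;  /* bits 7-0   */
} mi_version;

static inline mi_version mi_version_decode(uint32_t value)
{
    mi_version v;
    v.rsp = (uint8_t)(value >> 24);
    v.rdp = (uint8_t)(value >> 16);
    v.rac = (uint8_t)(value >> 8);
    v.io  = (uint8_t)value;
    return v;
}
```

For the common value 0x0202_0102 this gives RSP 2, RDP 2, RAC 1 and IO 2.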

0x0430 0008 - MI_INTERRUPT


MI_INTERRUPT 0x0430 0008
31:24 U-0 U-0 U-0 U-0 U-0 U-0 U-0 U-0
23:16 U-0 U-0 U-0 U-0 U-0 U-0 U-0 U-0
15:8 U-0 U-0 U-0 U-0 U-0 U-0 U-0 U-0
7:0 U-0 U-0 R-0 R-0 R-0 R-0 R-0 R-0
DP PI VI AI SI SP
bit 31-6 Undefined: Initialized to 0
bit 5 DP: Interrupt flag - Set when the RDP finishes a full sync (requested explicitly via a SYNC_FULL command)
bit 4 PI: Interrupt flag - Set when a PI DMA transfer finishes
bit 3 VI: Interrupt flag - Set when the VI starts processing a specific half-line of the screen (VI_V_CURRENT == VI_V_INTR). Usually, this is configured with VI_V_INTR = 2 so that it behaves as a VBlank interrupt.
bit 2 AI: Interrupt flag - Set when the AI begins playing back a new audio buffer (to notify that the next one should be enqueued as soon as possible, to avoid crackling)
bit 1 SI: Interrupt flag - Set when a SI DMA to/from PIF RAM finishes
bit 0 SP: Interrupt flag - Set when the RSP executes a BREAK opcode while SP_STATUS has been configured with the INTERRUPT_ON_BREAK bit; alternatively, it can also be set by explicitly writing the INTERRUPT flag in the SP_STATUS register (by either the CPU or the RSP itself).
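
The flag positions above can be given names and tested as follows. The enum and function names are hypothetical, chosen only for this sketch.

```c
#include <stdbool.h>
#include <stdint.h>

/* Bit numbers of the flags in MI_INTERRUPT, per the list above. */
typedef enum {
    MI_INTR_SP = 0,
    MI_INTR_SI = 1,
    MI_INTR_AI = 2,
    MI_INTR_VI = 3,
    MI_INTR_PI = 4,
    MI_INTR_DP = 5
} mi_intr_bit;

/* Returns true if the given flag is set in a value read from MI_INTERRUPT. */
static inline bool mi_interrupt_is_set(uint32_t mi_interrupt, mi_intr_bit bit)
{
    return ((mi_interrupt >> bit) & 1u) != 0;
}
```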

0x0430 000C - MI_MASK


MI_MASK 0x0430 000C
31:24 U-0 U-0 U-0 U-0 U-0 U-0 U-0 U-0
23:16 U-0 U-0 U-0 U-0 U-0 U-0 U-0 U-0
15:8 U-0 U-0 U-0 U-0 W-0 W-0 W-0 W-0
Details Below
7:0 W-0 W-0 RW-0 RW-0 RW-0 U-0 RW-0 RW-0
Details Below
READ:                             WRITE:
    [11]   —                          [11]   Set DP Interrupt Mask
    [10]   —                          [10]   Clear DP Interrupt Mask
    [9]    —                          [9]    Set PI Interrupt Mask
    [8]    —                          [8]    Clear PI Interrupt Mask
    [7]    —                          [7]    Set VI Interrupt Mask
    [6]    —                          [6]    Clear VI Interrupt Mask
    [5]    DP Interrupt Mask          [5]    Set AI Interrupt Mask
    [4]    PI Interrupt Mask          [4]    Clear AI Interrupt Mask
    [3]    VI Interrupt Mask          [3]    Set SI Interrupt Mask
    [2]    AI Interrupt Mask          [2]    Clear SI Interrupt Mask
    [1]    SI Interrupt Mask          [1]    Set SP Interrupt Mask
    [0]    SP Interrupt Mask          [0]    Clear SP Interrupt Mask
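
In the write layout each interrupt n (SP = 0 up to DP = 5, matching the read layout) has its Clear bit at position 2n and its Set bit at position 2n + 1. A value to write to MI_MASK could be built as sketched below; the function name is hypothetical, and callers are assumed never to request both Set and Clear for the same interrupt, since that behavior is described as unknown.

```c
#include <stdint.h>

/* Hypothetical helper: build a MI_MASK write value.
 * enable/disable use the read layout (bit 0 = SP ... bit 5 = DP).
 * Assumption: enable and disable do not overlap. */
static inline uint32_t mi_mask_write_value(uint32_t enable, uint32_t disable)
{
    uint32_t value = 0;
    for (unsigned n = 0; n < 6; n++) {
        if (enable & (1u << n))
            value |= 1u << (2u * n + 1u); /* Set bit for interrupt n */
        if (disable & (1u << n))
            value |= 1u << (2u * n);      /* Clear bit for interrupt n */
    }
    return value;
}
```

For example, enabling VI (bit 3) while disabling SP (bit 0) produces 0x81: bit 7 is Set VI and bit 0 is Clear SP.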