MIPS Interface: Difference between revisions
Rewrite RepeatMode with better examples
Line 41:
| 9 | Upper | Upper mode enabled.
| 8 | EBus | EBus mode enabled.
| 7 | Repeat | Repeat mode enabled. Automatically clears after a single Rambus write.
| 6-0 | RepeatCount[6:0] | Number of bytes (minus 1) to write in repeat mode
}}
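As a rough illustration of the read-side rows above, the following Python sketch splits a value read from MI_MODE into those fields. The function name and dictionary keys are made up for this example, and only the bits listed in the rows shown here are decoded.

```python
def decode_mi_mode_read(value: int) -> dict:
    """Split a value read from MI_MODE into the fields listed above.

    Hypothetical helper: only bits 9..0 from the rows shown are handled.
    """
    return {
        "upper": bool(value & (1 << 9)),    # bit 9: Upper mode enabled
        "ebus": bool(value & (1 << 8)),     # bit 8: EBus mode enabled
        "repeat": bool(value & (1 << 7)),   # bit 7: Repeat mode enabled
        "repeat_count": value & 0x7F,       # bits 6-0: bytes to write, minus 1
    }


# Example: Repeat mode armed for 16 bytes (RepeatCount = 15).
fields = decode_mi_mode_read(0x8F)
```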
Line 71:
| 10 | SetEBus | Set EBus mode.
| 9 | ClearEBus | Clear EBus mode.
| 8 | SetRepeat | Set repeat mode. Automatically clears after a single Rambus write.
| 7 | ClearRepeat | Clear repeat mode.
| 6-0 | RepeatCount[6:0] | Number of bytes (minus 1) to write in repeat mode
Line 90:
: '''EBus mode:''' The lower 4 bits of the 32bit word are mapped onto 4 bits of EBus.<br>In typical operation, EBus is used by RDP and VI to access the extra 9th bit (aka parity/error bit) that RDRAM provides for each byte. This mode allows the CPU to read this extra information back.<br>Unfortunately, this mode doesn't appear to be useful for writing to antialiased framebuffers, as you can't combine a normal mode write and an EBus mode write without one overwriting the other (further testing required; 64bit transfers might work).
: '''Repeat Mode:''' Writes cause a repeating pattern of '''RepeatCount+1''' bytes (up to 128 bytes) to be written. Reading can cause a hang (further testing needed).<br> First, a 64bit value is loaded into the DBus FIFO to be the pattern:
:: {| class="wikitable"
|-
| rowspan="2" | Uncached 64bit write || The 64bit value is used directly
|-
| <code>reg64=0x01234567_89abcdef -> pattern=0x01234567_89abcdef</code>
|-
| rowspan="2" | Uncached 32bit write || Only the lower 32bits are sent over SYSAD, and they are duplicated into both the upper and lower halves
|-
| <code>reg64=0x01234567_89abcdef -> pattern=0x89abcdef_89abcdef</code>
|-
| rowspan="2" | Uncached 16bit write
|| When writing less than 32bits, the CPU pre-shifts the value and MI sees it as a 32bit write. Normal byte masking is suppressed because RepeatCount replaces it.
|-
| <code>reg64=0x01234567_89abcdef, addr[1:0]=0b10 -> pattern=0xcdef0000_cdef0000</code>
|-
| rowspan="2" | Uncached 8bit write || ''As Above.'' Notice that some bytes are now unmasked.
|-
| <code>reg64=0x01234567_89abcdef, addr[2:0]=0b001 -> pattern=0xabcdef00_abcdef00</code>
|-
| rowspan="2" | Cache writeback || The last 8 bytes of the cacheline are used; everything else is discarded.
|-
| <code>cacheline="an ascii example" -> pattern=" example example"</code>
|}
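: The 64bit, 32bit and cache writeback rows of the table above can be modeled with a small Python sketch. The function names are invented for illustration, the 16-byte cacheline length follows the table's ASCII example, and the sub-32bit cases are left out because they depend on the CPU's pre-shifting.

```python
MASK64 = 0xFFFF_FFFF_FFFF_FFFF


def pattern_from_64bit_write(reg64: int) -> int:
    """Uncached 64bit write: the value is used directly."""
    return reg64 & MASK64


def pattern_from_32bit_write(reg64: int) -> int:
    """Uncached 32bit write: the low word fills both halves of the pattern."""
    low = reg64 & 0xFFFF_FFFF
    return (low << 32) | low


def pattern_from_writeback(cacheline: bytes) -> bytes:
    """Cache writeback: only the last 8 bytes of the line become the pattern."""
    if len(cacheline) != 16:
        raise ValueError("this sketch assumes a 16-byte cacheline")
    return cacheline[-8:]


reg64 = 0x0123_4567_89AB_CDEF
p64 = pattern_from_64bit_write(reg64)
p32 = pattern_from_32bit_write(reg64)
pwb = pattern_from_writeback(b"an ascii example")
```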
: Then MI lies to RI about the transfer count: '''RepeatCount''' overrides the correct value, causing RI to do <code>RepeatCount[6:3]</code> transfers, with MI repeating the same 64bit value for every transfer.<br> The Rambus device uses '''Addr[2:0]''' and '''RepeatCount[2:0]''' to implement byte masking for the first and last transfers. RI does not correctly implement support for unaligned writes that cross a 64bit boundary, so you end up with <code>RepeatCount[6:3] | RepeatCount[2:0] + Addr[2:0]</code> written, which discards the potential carry out from bit 2 to bit 3.
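: One way to read the expression above is sketched below in Python. This is a model of the description, not verified hardware behaviour, and the function names are hypothetical. It compares the value that ends up written (3-bit sum, carry dropped) with what a carry-propagating adder would have produced.

```python
def effective_count(repeat_count: int, addr: int) -> int:
    """RepeatCount[6:3] | (RepeatCount[2:0] + Addr[2:0]), carry discarded."""
    high = repeat_count & 0x78                            # RepeatCount[6:3], untouched
    low = ((repeat_count & 0x7) + (addr & 0x7)) & 0x7     # 3-bit sum, carry lost
    return high | low


def count_with_carry(repeat_count: int, addr: int) -> int:
    """What a full adder would give, for comparison only."""
    return (repeat_count & 0x7F) + (addr & 0x7)


# Aligned address: nothing is lost.
aligned = effective_count(0x0F, 0)
# Addr[2:0] = 4: the low sum 7 + 4 = 11 overflows 3 bits and the carry vanishes.
unaligned = effective_count(0x0F, 4)
ideal = count_with_carry(0x0F, 4)
```

: In this model the unaligned case comes out 8 lower than the carry-propagating sum, which is the lost carry into bit 3.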
: This mode is labeled as '''"Init Mode"''' in some documentation. It's used once during the RDRAM initialization in Nintendo's IPL3, as a workaround for the default timings after RDRAM reset not being compatible with RI's hardcoded timings. See [[RDRAM#0x02_-_Delay|RDRAM Delay register]] for a detailed explanation.
: Repeat Mode is ideal for fast memset operations, as you can quickly write a repeating pattern to up to 128 bytes of RAM with just two 32bit uncached writes.<br> For cache coherency, you do need to invalidate the affected cache lines, but you can execute the CACHE instructions while MI is finishing the Repeat write with almost no impact on performance.<br> MI repeat is almost as fast as RSP DMA, and in practice it will often be the faster and more convenient option, because the RSP is usually busy doing other things.
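: The two-write sequence can be planned as in the following Python sketch. It only computes the values involved and does not touch hardware. The helper names are hypothetical, the start address is assumed to be 8-byte aligned, and the MI_MODE value uses the SetRepeat (bit 8) and RepeatCount (bits 6-0) fields from the write table.

```python
SET_REPEAT = 1 << 8   # SetRepeat bit of the MI_MODE write layout
MAX_BURST = 128       # RepeatCount is 7 bits, holding (bytes - 1)


def mi_mode_repeat_value(nbytes: int) -> int:
    """Value to write to MI_MODE to arm a repeat write of nbytes bytes."""
    if not 1 <= nbytes <= MAX_BURST:
        raise ValueError("repeat writes cover 1 to 128 bytes")
    return SET_REPEAT | (nbytes - 1)


def plan_memset(start: int, length: int) -> list:
    """Return (address, MI_MODE value) pairs, one pair per repeat write.

    Each pair stands for two uncached writes: the MI_MODE value first,
    then the pattern word stored to the given address.
    """
    if start % 8:
        raise ValueError("this sketch only handles 8-byte aligned starts")
    steps = []
    offset = 0
    while offset < length:
        chunk = min(MAX_BURST, length - offset)
        steps.append((start + offset, mi_mode_repeat_value(chunk)))
        offset += chunk
    return steps


steps = plan_memset(0x1000, 300)
```

: Because Repeat mode clears itself after a single Rambus write, the plan re-arms MI_MODE before every chunk.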
:: {| class="wikitable"
|+ Memset benchmarks for 1MiB of data.
|-
| 64bit uncached writes || 25.7 ms
|-
| 64bit cached writes || 49.8 ms
|-
| RSP DMA || 2.58 ms
|-
| MI repeat || 3.80 ms (plus 0.20 ms with inline CACHE instructions)
|}
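: To put the table in perspective, the short Python calculation below derives speedups and a per-burst time from the listed figures. It assumes the MI repeat run used full 128-byte bursts throughout, which the benchmark table does not state.

```python
MIB = 1024 * 1024

# Timings copied from the table above, in milliseconds per 1 MiB.
timings_ms = {
    "64bit uncached writes": 25.7,
    "64bit cached writes": 49.8,
    "RSP DMA": 2.58,
    "MI repeat": 3.80,
}


def speedup(baseline: str, candidate: str) -> float:
    """How many times faster candidate is than baseline."""
    return timings_ms[baseline] / timings_ms[candidate]


# Assumption: every repeat write covers the maximum of 128 bytes.
bursts = MIB // 128
ns_per_burst = timings_ms["MI repeat"] * 1e6 / bursts

repeat_vs_uncached = speedup("64bit uncached writes", "MI repeat")
dma_vs_repeat = speedup("MI repeat", "RSP DMA")

print(f"MI repeat vs uncached writes: {repeat_vs_uncached:.2f}x")
print(f"RSP DMA vs MI repeat: {dma_vs_repeat:.2f}x")
print(f"{bursts} bursts, about {ns_per_burst:.0f} ns each")
```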
: If you are careful, it should be possible to do unaligned memsets too. An uncached byte write can be used to set '''Addr[2:0]''' to any value, and you can adjust '''RepeatCount[6:3]''' to compensate for the missing carry out, so the Rambus device writes the correct number of bytes.
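: A possible way to compute the compensated value is sketched below in Python. It is untested against hardware, the names are hypothetical, and it assumes the carry-discarding model described earlier, with the written value acting as the offset of the last byte relative to the aligned 64bit base.

```python
def effective_count(repeat_count: int, addr: int) -> int:
    """Model of RepeatCount[6:3] | (RepeatCount[2:0] + Addr[2:0]), no carry."""
    high = repeat_count & 0x78
    low = ((repeat_count & 0x7) + (addr & 0x7)) & 0x7
    return high | low


def compensated_repeat_count(nbytes: int, addr: int) -> int:
    """RepeatCount that makes the model cover nbytes starting at addr.

    The low 3 bits stay at (nbytes - 1)[2:0]; bits 6:3 are taken from the
    true end offset, which adds back the carry the hardware drops.
    """
    start = addr & 0x7
    end = (nbytes - 1) + start          # last byte, relative to aligned base
    if nbytes < 1 or end > 0x7F:
        raise ValueError("span does not fit in a single repeat write")
    return (end & 0x78) | ((nbytes - 1) & 0x7)


# 16 bytes starting at Addr[2:0] = 4: the naive RepeatCount would be 0x0F.
rc = compensated_repeat_count(16, 4)
covered_end = effective_count(rc, 4)
```

: With an unaligned start, the largest span that fits shrinks accordingly: at most <code>128 - Addr[2:0]</code> bytes under this model.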
==== <span style="display:none;">0x0430 0004 - MI_VERSION</code> ====
----