RDRAM Interface: Difference between revisions

(RI_REFRESH: Emphasize that "refresh enable" refers specifically to the automatic refresh issued on VI HSYNC, other small tweaks.)
 
(3 intermediate revisions by one other user not shown)
Line 193:
{{#invoke:Register table|definitions
| 31-4 | Undefined | Undefined
| 3-0 | DmaLatencyOverlap[3:0] | ? Defaults to <code>0xf</code>
}}
 
'''Speculation:'''
 
: This might control the maximum size of DMA transfers. RCP supports DMA bursts of up to 16 octbytes (128 bytes), which matches the default value.<br> Perhaps this register forces a smaller transfer size, which would allow better interleaving of multiple DMA requests, or a lower guaranteed latency when a high-priority device (like VI) requests a DMA transfer.
: This register isn't used by any known N64 software. Maybe it's broken, or maybe it didn't improve performance.
 
==== <span style="display:none;">0x0470 0018 - </span>RI_ERROR ====
Line 256 ⟶ 261:
 
'''Note:''' Some sources, such as libultra's <code>rcp.h</code> header, call this register <code>RI_WERROR</code>, but it is unrelated to errors. The name <code>RI_BANK_STATUS</code> comes from a patent and describes the function of this register much better.
 
= Bank Status Tracking =
Each 1 MiB bank can only have one row (2 KiB) open at a time. With version 1 of the Rambus spec, the only way to open a row is to attempt a read or write operation. If the row is already open, the operation succeeds (hits) and the Rambus device responds with an Ack packet. If the row wasn't open, the operation fails and the Rambus device responds with a NAck packet, while simultaneously closing the currently open row and loading the requested one. This takes even longer if the current row is dirty and needs to be written back to the DRAM array first. The controller must send a new request packet once the device has finished opening the row.
 
One possible implementation for a Rambus controller is to simply retry any operation that misses until it eventually succeeds. RI, however, doesn't have any retry logic. It does detect unexpected NAcks and sets the '''<small>NAck</small>''' bit in the '''RI_ERROR''' register.
 
Instead, RI tracks the current status of the state machine for each bank. Part of this shadow state machine is exposed via the '''RI_BANK_STATUS''' register, where you can find the row valid and row dirty bits. The '''<small>MultiBank</small>''' field of the '''RI_REFRESH''' register also has some effect, as the two banks of a 2 MiB chip share some resources. ''(Research Needed: exactly which timings are affected by MultiBank?)'' Other parts of the shadow state machine are not exposed via registers, such as whether the chips are currently executing a refresh operation, or which row is currently open. With this state tracking, RI always knows which requests will cause a miss and how long it needs to wait before resending the request packet.
 
RI only has resources for tracking 8 banks (of 1 MiB each, for a total of 8 MiB), and these banks are hardwired to the bottom 8 MiB of the memory-space as 8 contiguous banks.
 
You could initialise more Rambus devices in the space above 8 MiB, or move one of the existing devices there, but without Bank Status tracking the timings will be wrong. ''(Research Needed: wrong in what way? Presumably RI assumes operations will always hit.)''
 
Bank Status tracking also interferes with any attempt to use Rambus' Address Swapping feature, as there is no way to configure the addresses used by Bank Status tracking to match the new layout.
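
The idea of predicting hits and misses from shadow state can be illustrated with a toy model. This is only a sketch under stated assumptions, not a description of RI's actual logic: the class <code>BankTracker</code>, its method, and the result strings are hypothetical, and it assumes that a miss immediately makes the requested row the open one. It uses the 1 MiB bank and 2 KiB row sizes described above and ignores refresh and MultiBank effects entirely.

<syntaxhighlight lang="python">
NUM_TRACKED_BANKS = 8   # RI only has resources for 8 banks
BANK_SHIFT = 20         # 1 MiB per bank
ROW_SHIFT = 11          # 2 KiB per row
ROW_MASK = 0x1FF        # 512 rows of 2 KiB in a 1 MiB bank


class BankTracker:
    """Toy shadow state (hypothetical): open row and dirty bit per bank."""

    def __init__(self):
        # None means "no valid row open" for that bank.
        self.open_row = [None] * NUM_TRACKED_BANKS
        self.dirty = [False] * NUM_TRACKED_BANKS

    def access(self, address, is_write):
        """Predict 'hit', 'miss', 'miss-dirty' or 'untracked' for an access."""
        bank = address >> BANK_SHIFT
        if bank >= NUM_TRACKED_BANKS:
            return "untracked"          # above 8 MiB: no tracking resources
        row = (address >> ROW_SHIFT) & ROW_MASK
        if self.open_row[bank] == row:
            result = "hit"
        else:
            # A dirty row must be written back before the new row is loaded.
            result = "miss-dirty" if self.dirty[bank] else "miss"
            self.open_row[bank] = row   # assumption: the new row is now open
            self.dirty[bank] = False
        if is_write:
            self.dirty[bank] = True
        return result
</syntaxhighlight>

In this model, a controller would use the predicted result to decide how long to wait before resending a request, rather than discovering the miss from a NAck.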
 
= Memory addressing =
Line 271 ⟶ 289:
|-
|<code>0x0000 0000</code>
|<code>0x007F FFFF</code>
|0
|(address >> 20) & 0x3F
Line 278 ⟶ 296:
|0
|Memory-space access
|-
|<code>0x0080 0000</code>
|<code>0x03EF FFFF</code>
|0
|(address >> 20) & 0x3F
|(address >> 11) & 0x1FF
|address & 0x7FF
|0
|Broken Memory-space access
Not covered by bank status tracking
|-
|<code>0x03F0 0000</code>
Line 297 ⟶ 325:
|Broadcast register write
|}
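
The memory-space expressions from the table can be sketched in code as follows. This is a minimal illustration: the function name and the dictionary keys (<code>bank</code>, <code>row</code>, <code>column</code>, <code>tracked</code>) are labels chosen for this example and are not the table's column names.

<syntaxhighlight lang="python">
MEMORY_SPACE_END = 0x03EFFFFF   # last memory-space address in the table
TRACKED_END = 0x007FFFFF        # last address covered by bank status tracking


def decode_memory_address(address):
    """Split a memory-space address using the expressions from the table."""
    if not 0 <= address <= MEMORY_SPACE_END:
        raise ValueError("not a memory-space address")
    return {
        "bank": (address >> 20) & 0x3F,     # 1 MiB granularity
        "row": (address >> 11) & 0x1FF,     # 2 KiB granularity within that
        "column": address & 0x7FF,          # byte offset within 2 KiB
        "tracked": address <= TRACKED_END,  # inside the bottom 8 MiB
    }
</syntaxhighlight>

For example, <code>decode_memory_address(0x00800000)</code> yields bank 8, row 0, column 0, with <code>tracked</code> set to false, which corresponds to the "Broken Memory-space access" row.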
 
 
Examples :
 
Line 312 ⟶ 336:
 
* Early versions of the RCP reserved fewer bits for the RDRAM register address (e.g. Adr[35:20] = (address >> 9) & 0x3FF; Adr[19:0] = address & 0x1FF), which made it impossible to access RDRAM register 128 (the Row register), which is at offset 0x200.
* The presented address map has space for up to 32x 2x9Mbit RDRAM modules. However, RI only has bank tracking resources for 8 MiB.
* Standard DRAM initialization only supports up to 8 modules, but can mix 2x9Mbit and 1x9Mbit modules. In that case, 2x9Mbit modules are placed before 1x9Mbit modules.
* The standard DRAM initialization procedure doesn't make use of the address swapping feature, because bank tracking doesn't support it.
* Register-space addresses duplicate the content between Adr[28:20] and Adr[19:11] so that they are not affected by the RDRAM address swapping feature. Address swapping is desirable for RDRAM memory, which benefits from internal row caching, but registers would not benefit from it and swapping would only complicate register access.
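
The limitation of early RCP versions mentioned in the first point can be checked with a little arithmetic. The sketch below is illustrative only: the function names are hypothetical, and the 4-byte register stride is an assumption inferred from register 128 sitting at offset 0x200.

<syntaxhighlight lang="python">
REGISTER_STRIDE = 4   # assumption: register 128 at offset 0x200 implies 4 bytes each
EARLY_RCP_MASK = 0x1FF  # 9-bit low address field of early RCP versions


def register_offset(index):
    """Byte offset of an RDRAM register, under the stride assumption above."""
    return index * REGISTER_STRIDE


def early_rcp_can_reach(offset):
    """True if the 9-bit field can represent this register offset unchanged."""
    return (offset & EARLY_RCP_MASK) == offset
</syntaxhighlight>

With these definitions, <code>register_offset(128)</code> is 0x200, and masking it with 0x1FF gives 0, so a request for the Row register would alias register 0 instead.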
 
==== Accesses outside of mapped RDRAM chips ====
Memory-space accesses (0x00000000 - 0x03EFFFFF) that hit addresses where no RDRAM chip is mapped result in a sort of "no-operation" behavior: reads return zero, and writes are ignored. For instance, on an N64 with 4 MiB (no Expansion Pak), reads at the 5 MiB mark are not a mirror of reads at the first MiB: they just return zero because no chip on the Rambus bus replies to those requests.
 
The same goes for accesses to addresses above 8 MiB: no Rambus device will respond to the requests.
 
This is true in theory for RDRAM buses, but in practice there is some odd behavior, at least during reads, that causes some areas of the address space to return non-zero values. These 32-bit non-zero values appear in slots every 0x80 bytes (though not all slots contain a value) within an 8 KiB area (0 - 0x2000), and the pattern repeats every 512 KiB (0x80000 - 0x82000, and so on). The dump below was taken from an N64; an identical pattern can be observed on different consoles, though an extensive comparison has not been run.
 
What seems to happen is that an RDRAM register value is somehow returned as part of a memory read. This is probably an RI bug, but it has not been fully investigated yet. For instance, the value <code>0xb4190010</code> shown at several addresses (e.g. 0x1400) is a very common value for the [[RDRAM#0x00 - DeviceType|RDRAM register DeviceType]].
 
<syntaxhighlight lang="text">
00000 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
*
00380 78 01 fe 02 00 00 00 00 00 00 00 00 00 00 00 00 |x...............|
*
00d80 b4 19 00 10 00 00 00 00 00 00 00 00 00 00 00 00 |................|
00d90 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
*
00e40 fe 03 fe 03 00 00 00 00 00 00 00 00 00 00 00 00 |................|
00e50 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
*
00f80 fe 03 fe 03 00 00 00 00 00 00 00 00 00 00 00 00 |................|
00f90 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
*
01380 fe 03 fe 03 00 00 00 00 00 00 00 00 00 00 00 00 |................|
01390 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
*
01400 b4 19 00 10 00 00 00 00 00 00 00 00 00 00 00 00 |................|
01410 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
*
01480 b4 19 00 10 00 00 00 00 00 00 00 00 00 00 00 00 |................|
01490 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
*
01540 b4 19 00 10 00 00 00 00 00 00 00 00 00 00 00 00 |................|
01550 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
*
01600 fe 03 fe 03 00 00 00 00 00 00 00 00 00 00 00 00 |................|
01610 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
*
016c0 fe 03 fe 03 00 00 00 00 00 00 00 00 00 00 00 00 |................|
016d0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
*
01740 fe 03 fe 03 00 00 00 00 00 00 00 00 00 00 00 00 |................|
01750 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
*
804c0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
*
806c0 94 01 fe 02 00 00 00 00 00 00 00 00 00 00 00 00 |................|
806d0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
*
80700 9c 01 fe 02 00 00 00 00 00 00 00 00 00 00 00 00 |................|
80710 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
*
80780 a6 01 fe 02 00 00 00 00 00 00 00 00 00 00 00 00 |................|
80790 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
*
807c0 ae 01 fe 02 00 00 00 00 00 00 00 00 00 00 00 00 |................|
807d0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
*
80840 b4 19 00 10 00 00 00 00 00 00 00 00 00 00 00 00 |................|
80850 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
*
808c0 b4 19 00 10 00 00 00 00 00 00 00 00 00 00 00 00 |................|
808d0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
*
80940 b4 19 00 10 00 00 00 00 00 00 00 00 00 00 00 00 |................|
80950 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
*
809c0 b4 19 00 10 00 00 00 00 00 00 00 00 00 00 00 00 |................|
809d0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
*
80a40 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
*
80b80 fe 03 fe 03 00 00 00 00 00 00 00 00 00 00 00 00 |................|
80b90 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
*
80bc0 fe 03 fe 03 00 00 00 00 00 00 00 00 00 00 00 00 |................|
80bd0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
*
80c40 b4 19 00 10 00 00 00 00 00 00 00 00 00 00 00 00 |................|
80c50 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
*
80cc0 b4 19 00 10 00 00 00 00 00 00 00 00 00 00 00 00 |................|
80cd0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
*
80d40 b4 19 00 10 00 00 00 00 00 00 00 00 00 00 00 00 |................|
80d50 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
*
80dc0 b4 19 00 10 00 00 00 00 00 00 00 00 00 00 00 00 |................|
80dd0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
*
80ec0 fe 03 fe 03 00 00 00 00 00 00 00 00 00 00 00 00 |................|
80ed0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
*
80fc0 fe 03 fe 03 00 00 00 00 00 00 00 00 00 00 00 00 |................|
80fd0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
*
81380 fe 03 fe 03 00 00 00 00 00 00 00 00 00 00 00 00 |................|
81390 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
*
81480 b4 19 00 10 00 00 00 00 00 00 00 00 00 00 00 00 |................|
81490 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
*
815c0 b4 19 00 10 00 00 00 00 00 00 00 00 00 00 00 00 |................|
815d0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
*
816c0 fe 03 fe 03 00 00 00 00 00 00 00 00 00 00 00 00 |................|
816d0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
*
81700 fe 03 fe 03 00 00 00 00 00 00 00 00 00 00 00 00 |................|
81710 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
*
81780 fe 03 fe 03 00 00 00 00 00 00 00 00 00 00 00 00 |................|
81790 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
*
817c0 fe 03 fe 03 00 00 00 00 00 00 00 00 00 00 00 00 |................|
817d0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
*
81ac0 fe 03 fe 03 00 00 00 00 00 00 00 00 00 00 00 00 |................|
81ad0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
*
81b00 fe 03 fe 03 00 00 00 00 00 00 00 00 00 00 00 00 |................|
81b10 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
*
81b80 fe 03 fe 03 00 00 00 00 00 00 00 00 00 00 00 00 |................|
81b90 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
*
81bc0 fe 03 fe 03 00 00 00 00 00 00 00 00 00 00 00 00 |................|
81bd0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
*
</syntaxhighlight>
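
A survey like the one above could be reproduced from a raw memory dump with a small scan. The following is a minimal sketch: the function name <code>nonzero_words</code> is hypothetical, the dump is assumed to be a byte string of big-endian 32-bit words, and how the dump is captured from a console is out of scope.

<syntaxhighlight lang="python">
import struct


def nonzero_words(dump, base=0):
    """Yield (address, value) for every non-zero big-endian 32-bit word."""
    usable = len(dump) - len(dump) % 4      # ignore a trailing partial word
    for offset in range(0, usable, 4):
        (value,) = struct.unpack_from(">I", dump, offset)
        if value:
            yield base + offset, value
</syntaxhighlight>

Printing each result with <code>hex()</code> gives a compact list of addresses and values that can be compared between consoles.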
 
 
= Count =
RCP supports DMA bursts of up to 128 bytes (16 octbytes).
 
The Rambus datasheet recommends mapping the Rambus request Count field as <code>Count = NumBytes + Address[2:0]</code>, as this produces the correct byte masking for writes that aren't 64-bit aligned. RI, however, actually implements the mapping as:<syntaxhighlight>
Count[6:3] = NumBytes[6:3]
Count[2:0] = NumBytes[2:0] + Address[2:0]
</syntaxhighlight>This drops any carry from bit 2 into bit 3. It works fine for unaligned writes that fit within a single 64-bit transfer (and all unaligned writes from the CPU fit this rule).
 
However, PI can be used to create misaligned DMA bursts of any length from 1 to 128 bytes, which makes a dropped carry possible. Testing shows that this results in a DMA transfer of <code>NumBytes - Address[2:0]</code> bytes. It's possible to compensate for this "bug" by increasing the transfer length (at least for short transfers under 128 bytes).
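
The difference between the two Count mappings can be sketched as follows. This is an illustration only: the function names are hypothetical, and <code>NumBytes</code> is treated as a plain 7-bit value exactly as written in the formulas above, which may not match the encoding used on the bus.

<syntaxhighlight lang="python">
def count_datasheet(num_bytes, address):
    """Mapping recommended by the Rambus datasheet, as quoted above."""
    return num_bytes + (address & 0x7)


def count_ri(num_bytes, address):
    """Mapping RI implements: the carry out of the low 3 bits is lost."""
    high = num_bytes & 0x78                             # Count[6:3] = NumBytes[6:3]
    low = ((num_bytes & 0x7) + (address & 0x7)) & 0x7   # Count[2:0], carry dropped
    return high | low


def carry_dropped(num_bytes, address):
    """True when the two mappings disagree for this transfer."""
    return count_datasheet(num_bytes, address) != count_ri(num_bytes, address)
</syntaxhighlight>

For example, with <code>NumBytes = 4</code> and <code>Address[2:0] = 2</code> both mappings give 6, while with <code>NumBytes = 6</code> and <code>Address[2:0] = 4</code> the datasheet mapping gives 10 and the RI mapping gives 2.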
 
SI also allows misaligned DMA transfers, but the exact results haven't been documented. No other device allows the lower bits of the address to be set.
 
= RI_SELECT configurations =