RDRAM

From N64brew Wiki

Rambus DRAM (or RDRAM) is a type of synchronous dynamic random-access memory (SDRAM) designed by Rambus. The N64 motherboard came with either one or two chips, totaling 4 MB (4,194,304 bytes) of general purpose storage which can be accessed by the CPU. The optional Expansion Pak could increase this by an additional 4 MB and is required for some games to run.

Each byte of RDRAM actually has an extra bit, which can only be used by the RDP and VI core. This 9th bit is used to store things like anti-aliasing coverage in the color buffer. On systems other than the N64, the 9th bit would likely be used for parity checks.

RDRAM system overview

A typical RDRAM system is composed of three main elements:

  • a controller, which acts as the channel master. This part resides in the RI.
  • the channel, which is a synchronous bus connecting the RDRAM devices together
  • RDRAM modules, each containing memory banks and some registers.

TODO: insert a small diagram

The N64 system implements the "Base RDRAM" protocol, which is the earliest version of the RDRAM protocol. As a historical note, later versions of the protocol are "Concurrent RDRAM" and "Direct RDRAM".

Interface Pinouts

Pin Signal RSL I/O Description
1 VDD - +3.3V power supply.
2 GND - Circuit ground.
3 DQ8 Y I/O Signal line (bit 8) for REQ, DIN, and DOUT packets.
4 GND - Circuit ground.
5 DQ7 Y I/O Signal line (bit 7) for REQ, DIN, and DOUT packets.
6 NC* - Not connected.
7 ADDRESS Y I Signal line for COL packets with column addresses.
8 VDD - +3.3V power supply.
9 DQ6 Y I/O Signal line (bit 6) for REQ, DIN, and DOUT packets.
10 GND - Circuit ground.
11 DQ5 Y I/O Signal line (bit 5) for REQ, DIN, and DOUT packets.
12 VDDA - Separate analog power supply for clock generation in the RDRAM.
13 RXCLK Y I Receive clock. All input packets are aligned to this clock.
14 GNDA - Separate analog ground for clock generation in the RDRAM.
15 TXCLK Y I Transmit clock. DOUT packets are aligned with this clock.
16 VDD - +3.3V power supply.
17 DQ4 Y I/O Signal line (bit 4) for REQ, DIN, and DOUT packets.
18 GND - Circuit ground.
19 COMMAND Y I Signal line for REQ, RSTRB, RTERM, WSTRB, WTERM, RESET, and CKE packets.
20 SIN I Initialization daisy chain input. CMOS levels. See section on initialization for more details.
21 VREF I Logic threshold reference voltage for RSL signals.
22 SOUT O Initialization daisy chain output. CMOS levels. See section on initialization for more details.
23 DQ3 Y I/O Signal line (bit 3) for REQ, DIN, and DOUT packets.
24 GND - Circuit ground.
25 DQ2 Y I/O Signal line (bit 2) for REQ, DIN, and DOUT packets.
26 NC - Not connected.
27 DQ1 Y I/O Signal line (bit 1) for REQ, DIN, and DOUT packets.
28 GND - Circuit ground.
29 DQ0 Y I/O Signal line (bit 0) for REQ, DIN, and DOUT packets.
30 NC - Not connected.
31 GND - Circuit ground.
32 VDD - +3.3V power supply.

RSL stands for Rambus Signaling Levels, a low-voltage-swing, active-low signaling technology.

Source: Rambus concurrent RDRAM datasheet [1]

RDRAM registers

Register summary
Number Name Description
0 DeviceType Read-only register which describes RDRAM configuration
1 DeviceId Specifies base address of RDRAM
2 Delay Specifies CAS timing parameters
3 Mode Controls operating mode and IOL output current
4 RefInterval Specifies refresh interval for devices that require refresh
5 RefRow Next row and bank to be refreshed
6 RasInterval Specifies RAS access interval
7 MinInterval Provides minimum delay information and some special control
8 AddressSelect Specifies Adr field subfield swapping to maximize hit rate
9 DeviceManufacturer Read-only register providing manufacturer and device information
128 Row Address of currently sensed row in each bank

See RI page for details about how they are mapped into CPU address space.

Programming cautions:

  • Before reading any RDRAM register, the RDRAM current control must be calibrated.
  • It also seems that RDRAM register reads should be surrounded by MI_MODE = SET_DRAM_REG / CLR_DRAM_REG.

NOTE: In the following register description we will omit the ninth bit which is unused when accessing RDRAM registers, and describe them as a 32bit word instead of 4x{8,9}bit.

TODO: detailed register description, with bit layout and arrows.

AdrS[9:2] 0x00 - DeviceType


DeviceType 0x00
31:24 R-0 R-0 R-0 R-? R-0 R-0 R-0 R-0
Version Type
23:16 U-0 U-0 U-0 U-0 U-0 U-0 U-0 U-0
15:8 R-? R-? R-? R-? R-? R-? R-? R-?
BankBits RowBits
7:0 R-? R-? R-? R-? U-0 R-1 U-0 R-?
ColumnBits Bn En
bit 31-28 Version: RDRAM version.
0001 = Extended architecture (Base RDRAM protocol)
bit 27-24 Type: Device type.
0000 = RDRAM device
bit 15-12 BankBits: Number of bank address bits, or said differently, declares that this RDRAM device has 2^BankBits banks.
bit 11-8 RowBits: Number of row address bits, or said differently, declares that this RDRAM device has 2^RowBits rows per bank.
bit 7-4 ColumnBits: Number of column address bits, or said differently, declares that this RDRAM device has 2^ColumnBits bytes per row.
bit 2 Bn: Bonus, number of bits per byte.
0 = 8bit byte
1 = 9bit byte
bit 0 En: Enhanced speed grade.
0 = Normal
1 = Low Latency
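
As an illustration of the field layout above, here is a minimal C sketch that splits a DeviceType word into its fields and derives the device capacity as 2^(BankBits + RowBits + ColumnBits) bytes. The struct and function names are hypothetical (they come from no SDK), and the sketch assumes the register has already been read as a plain 32-bit word with the ninth bit dropped.

```c
#include <stdint.h>

/* Hypothetical container for the decoded DeviceType fields. */
typedef struct {
    unsigned version;        /* bits 31-28 */
    unsigned type;           /* bits 27-24 */
    unsigned bank_bits;      /* bits 15-12 */
    unsigned row_bits;       /* bits 11-8  */
    unsigned column_bits;    /* bits 7-4   */
    unsigned nine_bit_bytes; /* bit 2 (Bn) */
    unsigned low_latency;    /* bit 0 (En) */
} rdram_device_type_t;

/* Split a 32-bit DeviceType word according to the layout described above. */
rdram_device_type_t rdram_decode_device_type(uint32_t reg)
{
    rdram_device_type_t dt;
    dt.version        = (reg >> 28) & 0xFu;
    dt.type           = (reg >> 24) & 0xFu;
    dt.bank_bits      = (reg >> 12) & 0xFu;
    dt.row_bits       = (reg >> 8)  & 0xFu;
    dt.column_bits    = (reg >> 4)  & 0xFu;
    dt.nine_bit_bytes = (reg >> 2)  & 0x1u;
    dt.low_latency    = reg & 0x1u;
    return dt;
}

/* Capacity in bytes: 2^BankBits banks x 2^RowBits rows x 2^ColumnBits bytes. */
uint64_t rdram_device_bytes(const rdram_device_type_t *dt)
{
    return UINT64_C(1) << (dt->bank_bits + dt->row_bits + dt->column_bits);
}
```

For example, the made-up value 0x100019B4 (not a value read from real hardware) decodes to BankBits = 1, RowBits = 9, ColumnBits = 11 and Bn = 1, which gives 2^21 bytes, i.e. 2 MB of 9-bit bytes.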

AdrS[9:2] 0x01 - DeviceId


DeviceId 0x01
31:24 RW-0 U-0 U-0 U-0 U-0 U-0 U-0 U-0
IdField[35]
23:16 RW-0 RW-0 RW-0 RW-0 RW-0 RW-0 RW-0 RW-0
IdField[34:27]
15:8 RW-0 U-0 U-0 U-0 U-0 U-0 U-0 U-0
IdField[26]
7:0 RW-0 RW-0 RW-0 RW-0 RW-0 U-0 U-0 U-0
IdField[25:20]
bit 31,23-16,15,7-3 IdField[35:k]: Compared to AdrS[35:k] to select RDRAM.
k = 21 for 16/18Mbit RDRAM.
k = 20 for 8/9Mbit RDRAM.

TODO: Delay register

AdrS[9:2] 0x03 - Mode


Mode 0x03
31:24 RW-1 RW-1 U-0 U-0 U-0 U-0 U-0 U-0
C3 C0
23:16 RW-1 RW-1 U-0 U-0 U-0 U-0 U-0 U-0
C4 C1
15:8 RW-1 RW-1 U-0 U-0 RW-0 U-0 U-0 U-0
C5 C2 AD
7:0 RW-1 RW-1 RW-0 R-0 RW-0 RW-1 RW-0 RW-0
CE X2 PL SV SK AS DE LE
bit 15,23,31,14,22,30 C[5:0] / CCValue: Current Control value, which ultimately controls the output current IOL.
In manual mode (CE=0), IOL is proportional to (63-CC) with IOL ~ (0.95±0.3)×(63-CC) mA, for CC = 0..63. (These coefficients derive from Imax±△/63)
This field is inverted when read.
In auto mode (CE=1), IOL is proportional to (63-CC) with IOL ~ (1.25±0.1)×(63-CC) mA, for CC = 31..63. (These coefficients derive from I40±△/(63-31))
An internally generated value is returned when read.
bit 11 AD / AckDis: For low latency RDRAM only. Suppresses the acknowledge response when set to 1.
bit 7 CE / CCEnable: Current Control Enable.
0 = manual
1 = auto
bit 6 X2 / CCMult: Should be 1. Inverted when read.
bit 5 PL: Select PowerDown Latency
bit 4 SV / SkipValue: For tests. 0
bit 3 SK / Skip: For tests. 0
bit 2 AS / AutoSkip: For tests. 1
bit 1 DE / DeviceEnable: Enable RDRAM device. When disabled, only broadcast register requests can be executed.
0 = disabled
1 = enabled
bit 0 LE: Enable PowerDown mode for RDRAM that supports it to reduce power consumption.
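
Because the six CC bits are scattered across the Mode register (C5..C0 sit in bits 15, 23, 31, 14, 22 and 30 respectively), packing and unpacking them is easy to get wrong. The following C sketch shows one way to do it; the function names are hypothetical, and it deliberately ignores the read-side behaviour described above (inversion in manual mode, internally generated value in auto mode).

```c
#include <stdint.h>

/* Bit position in the Mode register of C0, C1, ..., C5 (index = CC bit). */
static const unsigned rdram_cc_bit_pos[6] = { 30, 22, 14, 31, 23, 15 };

/* Scatter a 6-bit CC value into the Mode register bit positions. */
uint32_t rdram_mode_pack_cc(uint32_t cc)
{
    uint32_t reg = 0;
    for (unsigned i = 0; i < 6; i++) {
        reg |= ((cc >> i) & 1u) << rdram_cc_bit_pos[i];
    }
    return reg;
}

/* Gather the 6-bit CC value back from a Mode register word. */
uint32_t rdram_mode_unpack_cc(uint32_t reg)
{
    uint32_t cc = 0;
    for (unsigned i = 0; i < 6; i++) {
        cc |= ((reg >> rdram_cc_bit_pos[i]) & 1u) << i;
    }
    return cc;
}
```

With this mapping, a CC value of 63 sets bits 31, 30, 23, 22, 15 and 14, i.e. the word 0xC0C0C000 before the other Mode fields are OR-ed in.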

RDRAM addressing

Warning : In this paragraph, we describe RDRAM addressing within the RDRAM protocol. This is not to be confused with RDRAM addresses "as seen" by the CPU or RCP. See RI memory addressing paragraph for details about how the RI converts addresses between the two address spaces.

The RDRAM protocol addresses RDRAM memory and registers using a 36-bit address and a variety of commands:

  • many types of memory read
  • many types of memory write
  • register read
  • register write
  • broadcast register write (all connected RDRAM will write the same value to the specified register)

The upper part of the address identifies an RDRAM device; the lower part is an offset within the device (in register space for register commands, and in memory space for memory commands).

The procedure of identifying which RDRAM device is addressed by a given command and address is called Id matching.

It works as follows:

Given a 36-bit address Adr[35:0], we compute a "partially bit-swapped" AdrS[35:0] in which bits [28:20] and bits [19:11] are swapped on a bit-by-bit basis according to the value of SwapField (from the AddressSelect register). Bits [35:29] and [10:0] are left untouched. This swapping provides a flexible way of remapping addresses across the banks of a given device and across devices to benefit from internal row caching, which can increase the DRAM hit rate in some applications.

The upper 16 bits (or 15 bits for 2x{8,9}Mbit devices) of AdrS are then compared to the IdField contained in the DeviceId register. If both are equal, the RDRAM device has an Id match.

More formally, this can be written as follows:

AdrS[35:29] = Adr[35:29]
AdrS[28:20] = ( SwapField[8:0] & Adr[19:11]) | (~SwapField[8:0] & Adr[28:20])
AdrS[19:11] = (~SwapField[8:0] & Adr[19:11]) | ( SwapField[8:0] & Adr[28:20])
AdrS[10:0] = Adr[10:0]
IdMatch = AdrS[35:M] == IdField[35:M]
with M = 21 for 2x{8,9}Mbit modules, M = 20 for 1x{8,9}Mbit modules.
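
The equations above translate fairly directly into C. The sketch below is only an illustration: the function names are hypothetical, the 36-bit address is carried in a uint64_t, and IdField is assumed to be passed as a 36-bit value aligned to the same bit positions as AdrS (not in the scattered layout of the DeviceId register).

```c
#include <stdbool.h>
#include <stdint.h>

#define RDRAM_ADR_MASK  ((UINT64_C(1) << 36) - 1)  /* 36-bit address space */
#define RDRAM_SWAP_MASK UINT64_C(0x1FF)            /* 9-bit swap fields    */

/* Compute AdrS from Adr: swap bits [28:20] and [19:11] where SwapField is 1. */
uint64_t rdram_swap_address(uint64_t adr, uint32_t swap_field)
{
    uint64_t swap = swap_field & RDRAM_SWAP_MASK;
    uint64_t hi   = (adr >> 20) & RDRAM_SWAP_MASK;  /* Adr[28:20] */
    uint64_t lo   = (adr >> 11) & RDRAM_SWAP_MASK;  /* Adr[19:11] */

    uint64_t new_hi = (swap & lo) | (~swap & hi);   /* AdrS[28:20] */
    uint64_t new_lo = (~swap & lo) | (swap & hi);   /* AdrS[19:11] */

    /* Bits [35:29] and [10:0] pass through unchanged. */
    uint64_t keep = adr & RDRAM_ADR_MASK
                  & ~((RDRAM_SWAP_MASK << 20) | (RDRAM_SWAP_MASK << 11));
    return keep | (new_hi << 20) | (new_lo << 11);
}

/* IdMatch: compare AdrS[35:m] with IdField[35:m], with m = 20 or 21. */
bool rdram_id_match(uint64_t adrs, uint64_t id_field, unsigned m)
{
    return ((adrs & RDRAM_ADR_MASK) >> m) == ((id_field & RDRAM_ADR_MASK) >> m);
}
```

With SwapField = 0 the address is returned unchanged, and with SwapField = 0x1FF the two 9-bit fields are exchanged entirely.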

Remark: An IdMatch does not necessarily mean that the RDRAM device will act on the request, and conversely a non-matching RDRAM device can still act on a request. Other factors, such as the DeviceEnable bit of the Mode register or the SIn pin, can inhibit a request, and a broadcast register write can force a request even on a non-matching RDRAM device.

Current Control calibration

Any RDRAM device (module and controller) wishing to "talk" on the RDRAM channel must configure its output current IOL controlled by the current control (CC for short) register.

2 modes are possible to configure the current control register :

  1. Manual mode. In this mode, the value of the current control register is linearly correlated to IOL, such that IOL @ CC=63 -> 0mA, and IOL @ CC=0 -> Imax (Imax will vary between RDRAM devices due to process differences). In this mode, fluctuations due to temperature and changes over time are not compensated, so it may require periodic manual readjustment.
  2. Automatic mode. In this mode, small fluctuations are automatically corrected, so no further readjustment should be required. The relation between CC value and IOL is still mostly linear but with a different slope. Note also that, in this mode, the CC register value read will be an internally generated one, not the one used to program the CC register.

The purpose of the CC calibration procedure is to find the CC value in Automatic mode that maximizes the signal margin.
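
To get a feel for the numbers, the nominal relations quoted in the Mode register description (about 0.95 mA per step in manual mode, about 1.25 mA per step in auto mode for CC = 31..63) can be sketched in C as follows. This is only an estimate using the central coefficients and ignoring the stated tolerances; the function name is hypothetical, and it is not part of any calibration routine.

```c
/* Nominal output current in mA for a given CC value.
 * auto_mode = 0: manual mode (CE=0), valid for CC = 0..63.
 * auto_mode = 1: auto mode (CE=1), valid for CC = 31..63.
 * Returns a negative value when CC is outside the valid range. */
double rdram_nominal_iol_ma(unsigned cc, int auto_mode)
{
    if (cc > 63) {
        return -1.0;
    }
    if (auto_mode) {
        if (cc < 31) {
            return -1.0;
        }
        return 1.25 * (double)(63 - cc);  /* central value of 1.25 +/- 0.1 */
    }
    return 0.95 * (double)(63 - cc);      /* central value of 0.95 +/- 0.3 */
}
```

In both modes CC = 63 corresponds to 0 mA; in auto mode CC = 31 corresponds to a nominal 40 mA.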


TODO : continue description, and describe calibration algorithm

Known RDRAM Console Chip Configurations

N64 Version Board revision Region Number of RDRAM Chips Size per Chip (Mbytes) Size per Chip (Mbits)
NUS-001 (P) - 01 PAL 2 2.25 Mbytes 18 Mbits
NUS-002 (P) - 02 PAL 1 4.5 Mbytes 36 Mbits

Known RDRAM Expansion Pak Configurations

Expansion Pak Type Number of RDRAM Chips Size per Chip (Mbytes) Size per Chip (Mbits)
Jumper Pak (Nintendo Official) 0 0 0
Expansion Pak (Nintendo Official) 1 4.5 Mbytes 36 Mbits

Some third-party Expansion Paks have two chips of 2.25 Mbytes each. Please provide images and makers here.

Initialization Sequence

This Initialization sequence is based on the 6102 CIC boot code

File:Cncrntug.pdf

Expansion Pak Detection

The typical way to detect how much memory is installed is to probe it.

LibUltra provides a function called osGetMemSize() which does this. The function writes different values at addresses in the uncached KSEG1 direct map, starting at 0xa0300000, and then reads the values back. It tries successively higher addresses, jumping by 1 MB each time through the loop. It returns the amount of RAM which it successfully wrote and read back, rounded up to a number of megabytes.

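// C-like sketch of the probing loop (not the actual libultra source).
// u32 is the libultra 32-bit unsigned type.
#include <stdbool.h>
#include <stdint.h>

u32 osGetMemSize(void) {
    // Base address of RAM in KSEG1 (uncached, direct-mapped).
    const uintptr_t base_addr = 0xa0000000;
    const uintptr_t megabyte = 1024 * 1024;
    // Placeholder test pattern; the values used by the real function may differ.
    const u32 pattern = 0x12345678;
    // Address where we will probe, starting at 0xa0300000.
    uintptr_t cur_addr = base_addr + 3 * megabyte;
    while (true) {
        volatile u32 *probe = (volatile u32 *)cur_addr;
        *probe = pattern;           // write to the probe address
        if (*probe != pattern) {    // read back and compare
            break;
        }
        cur_addr += megabyte;
    }
    return (u32)(cur_addr - base_addr);
}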

During boot, IPL3 will also write the amount of RAM available, in bytes, to a 32-bit value at address 0x80000318 (or 0x800003f0, for CIC 6105). On retail hardware, this should always have the value 0x400000 (no expansion pak) or 0x800000 (expansion pak). When using LibUltra, this variable can be accessed with the name osMemSize, which is defined like this:

extern u32 osMemSize;
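
As a small usage sketch, a game can compare that value against the two sizes quoted above to decide whether the Expansion Pak is present. The macro and function names below are hypothetical; only the constants 0x400000 and 0x800000 come from the text, and the size is passed as a parameter so the helper does not depend on libultra.

```c
#include <stdbool.h>
#include <stdint.h>

#define RDRAM_SIZE_BASE     0x400000u  /* 4 MB: no Expansion Pak */
#define RDRAM_SIZE_EXPANDED 0x800000u  /* 8 MB: Expansion Pak    */

/* Returns true when the reported memory size indicates an Expansion Pak. */
bool has_expansion_pak(uint32_t mem_size)
{
    return mem_size >= RDRAM_SIZE_EXPANDED;
}
```

With libultra this could be called as has_expansion_pak(osMemSize).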

LibDragon provides the amount of memory installed through the get_memory_size() function.

Drawbacks and Limitations

Opinion

RDRAM has excellent data transfer speed for the era (bytes per second) but due to the protocol used and serial interface, memory transactions were somewhat slower (how much time it took from starting a read/write operation to finishing it). In practice, you may find that the available memory bandwidth is a limiting factor for the performance of your game. See: How fast was Rambus compared to regular EDO RAM?
— Vanadium

Datasheets

Several manufacturers produced compatible "Base RDRAM" modules such as :

Reference : [2]