From N64brew Wiki

This page tries to provide detailed answers to frequently asked questions about the Nintendo 64. Some of these questions have spawned myths in online communities over the years, so this page aims to answer them properly and debunk those myths.

What is the maximum cartridge size (ROM) supported by a Nintendo 64?

Short answer: 4 GiB, without bank switchers. Unlimited otherwise.

Many people believe that the Nintendo 64 is limited to cartridges of 64 MiB. This belief likely originates from the fact that 64 MiB is the largest size used by retail games (such as Conker's Bad Fur Day). In fact, there is no such hardware limit. The cartridge is accessed via the PI bus, a serial bus that allows for full 32-bit addresses, accessible via DMA, so a hardware cartridge can reply with data to any address in the 32-bit range, for a total of 4 GiB. ROMs must have a valid header at address 1000'0000, but beyond that there is no constraint at all: a cartridge could also reply with data in the range 0000'0000 - 0FFF'FFFF, as long as the application itself knows about it and retrieves it when necessary.

Others believe the maximum size is 252 MiB. That belief comes from the fact that part of the PI bus address space is also memory-mapped to the VR4300 (and that normally includes the whole ROM): 252 MiB of it is mapped in the physical range 0x10000000-0x1FBFFFFF. This is not a limit or a constraint in any way, though. There are no technical constraints preventing a cartridge from exposing a larger area to the PI bus (up to 4 GiB). It is true that the full 32-bit address space is accessible only via DMA, but DMA is the main (and almost only) way a ROM is normally accessed: direct I/O accesses by the CPU in memory-mapped areas are rather slow, cannot be cached, and only work correctly with a 32-bit access size, so in practice they are rarely used.
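The two figures quoted above (4 GiB and 252 MiB) follow directly from the address ranges. The short Python sketch below only redoes that arithmetic; the constant and function names are made up for illustration and do not come from any SDK.

```python
MIB = 1024 * 1024
GIB = 1024 * MIB

# Full PI address space reachable via DMA: 32-bit addresses.
PI_ADDRESS_BITS = 32
pi_space_bytes = 1 << PI_ADDRESS_BITS

# CPU memory-mapped window quoted in the text (inclusive bounds).
CPU_WINDOW_START = 0x10000000
CPU_WINDOW_END = 0x1FBFFFFF
cpu_window_bytes = CPU_WINDOW_END - CPU_WINDOW_START + 1


def is_cpu_mapped(pi_address: int) -> bool:
    """Return True if the PI address falls inside the CPU-visible window."""
    return CPU_WINDOW_START <= pi_address <= CPU_WINDOW_END


print(pi_space_bytes // GIB, "GiB")    # 4 GiB
print(cpu_window_bytes // MIB, "MiB")  # 252 MiB
print(is_cpu_mapped(0x20000000))       # False: reachable only via DMA
```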

At the hardware level, the presence of a serial bus means that ROM contents can be split across an array of several chips if required; as long as the PI bus decoding logic knows how to map each address to the correct chip, this works fine.
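As a rough illustration of such decoding logic, the sketch below maps a 32-bit PI address to a chip index and an offset within that chip. It assumes a hypothetical cartridge built from equally sized 64 MiB chips laid out back to back; real cartridges may decode addresses differently, and `decode_pi_address` and `ChipAccess` are invented names.

```python
from typing import NamedTuple

MIB = 1024 * 1024


class ChipAccess(NamedTuple):
    chip_index: int  # which ROM chip answers the request
    offset: int      # byte offset inside that chip


def decode_pi_address(pi_address: int, chip_size: int = 64 * MIB) -> ChipAccess:
    """Hypothetical decoder: equally sized chips tile the 32-bit PI space."""
    if not 0 <= pi_address <= 0xFFFFFFFF:
        raise ValueError("PI addresses are 32 bits wide")
    return ChipAccess(pi_address // chip_size, pi_address % chip_size)


# With 64 MiB chips, the ROM header address 0x10000000 lands at the
# start of chip 4 in this made-up layout.
print(decode_pi_address(0x10000000))  # ChipAccess(chip_index=4, offset=0)
```

In hardware the same split is just a matter of routing the upper address bits to chip-select lines and the lower bits to the selected chip.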

Homebrew developers should nonetheless think carefully before going past 64 MiB with their ROMs. Most emulators do not support ROMs beyond that size (even though the modification is expected to be quick once their authors are made aware that it is not a hardware limit), and the flashcarts commonly used to play homebrew productions on hardware (such as EverDrive 64 or SummerCart 64) only have about 64 MiB of SDRAM to hold ROM contents, so they do not support larger ROMs either.

Would it be possible to create a larger RAM expansion for N64, to go beyond the total 8 MiB we can get today with the Expansion Pak?

Short answer: Currently, it is thought to be impossible because of a physical hardware limit of the RDRAM controller (RI) within the RCP chip.

The RDRAM chips are connected to the RCP via a bus called RAMBUS. This bus allows multiple chips to be connected to a single controller; the controller can then talk to each chip and configure it to reply to a certain range of addresses (that is, "map it" into a memory map).

The RDRAM initialization is performed by IPL3, a piece of the Nintendo 64 secure boot code (there are a few variants of its contents, but the differences are not related to RAM management). IPL3 initializes the RDRAM chips using a process called "current calibration", and then maps them into the (physical) address space by giving each chip its own address. The code in IPL3 is ready to handle 1 MiB and 2 MiB chips, but it currently maps chips only until 8 MiB is reached.
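The mapping step can be pictured with the following simplified model. It is not IPL3 code and skips current calibration entirely; it only assumes that chips are assigned consecutive base addresses in order and that mapping stops once the 8 MiB ceiling would be exceeded. The function name and the tuple layout are hypothetical.

```python
MIB = 1024 * 1024
MAP_LIMIT = 8 * MIB  # ceiling described in the text


def assign_base_addresses(chip_sizes, limit=MAP_LIMIT):
    """Give each chip a consecutive base address, stopping at the limit.

    Returns a list of (chip_index, base_address, size) tuples for the
    chips that were mapped. Simplified model, not the real boot code.
    """
    mapping = []
    next_base = 0
    for index, size in enumerate(chip_sizes):
        if size not in (1 * MIB, 2 * MIB):
            raise ValueError("only 1 MiB and 2 MiB chips are modelled")
        if next_base + size > limit:
            break  # chips beyond the limit are left unmapped
        mapping.append((index, next_base, size))
        next_base += size
    return mapping


# Six hypothetical 2 MiB chips: only the first four get an address.
for chip_index, base, size in assign_base_addresses([2 * MIB] * 6):
    print(chip_index, hex(base), size // MIB, "MiB")
```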

For a long time, it was therefore believed that changing IPL3 would be enough to allow more chips to be mapped, assuming somebody built an Expansion Pak card with more chips in it. This is made difficult by the fact that IPL3 contents are verified through a hash and are part of the secure boot chain, so even though the hash can now be bruteforced with GPUs (and this has been done a few times as a proof of concept), nobody has yet written and released an open source IPL3 boot code to tinker with.

Even so, Mazamars312 investigated the matter while implementing his N64-on-FPGA system, and reported that, even if the chips are mapped beyond 8 MiB, the RCP chip itself (specifically, the RI subsystem, which is in charge of converting VR4300 memory accesses to RAMBUS packets) seems to be physically limited to handling requests in the first 8 MiB range. That is, even if the VR4300 generates accesses beyond 8 MiB, the RI refuses to generate the corresponding RAMBUS packets to talk to the chips that have been mapped there. In other words, the 8 MiB limit seems to be hardcoded in the RCP chip. If there is a way to unlock this limit via RI hardware registers, it is not known at this time.

The tests Mazamars312 conducted are currently not reproducible with open source code and tools. Libdragon is planning to eventually release an open source IPL3, at which point it would be easier for others to experiment with the RDRAM initialization sequence.

Is it true that the RSP has hardware MPEG-1 acceleration? Is it used by full motion videos in Resident Evil 2?

Short answer: The RSP has two pairs of opcodes (VMULQ/VMACQ and VRNDN/VRNDP) that are meant to simplify the implementation of a very small part of an MPEG-1 decoder (inverse quantization and oddification of IDCT coefficients). They are not used by Resident Evil 2 or any other commercial game, though.

The RSP is the vector coprocessor in the Nintendo 64. Its design is also well suited to accelerating video codecs such as those of the MPEG family. To do so, though, careful assembly code must be written using the specific RSP vector opcodes to perform the various operations required by a video decoder. When the RSP was designed by SGI (in the early 90s), only MPEG-1 existed as a finished standard (MPEG-2 was finalized in 1994, after the RSP design was frozen), so the SGI designers decided to add a few opcodes to the instruction set to help implement one part of the MPEG-1 decoder: specifically, the algorithm that performs inverse quantization and oddification of IDCT coefficients. This is quite a specific part of the whole pipeline, and it is not even the most resource-intensive one. It is hard to guess why the SGI designers thought it was important to add instructions specifically to speed up this part of the pipeline.
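For readers unfamiliar with the step in question, the sketch below shows inverse quantization and oddification for one AC coefficient of an intra block, written as plain scalar Python following the reconstruction rule in the MPEG-1 video standard. It is not RSP code and not the libdragon implementation; the function names are invented, and the intra DC coefficient (which is handled separately in MPEG-1) is left out.

```python
def sign(value: int) -> int:
    """Return -1, 0 or 1 depending on the sign of value."""
    return (value > 0) - (value < 0)


def div_toward_zero(numerator: int, denominator: int) -> int:
    """Integer division truncating toward zero (denominator > 0)."""
    quotient = abs(numerator) // denominator
    return quotient if numerator >= 0 else -quotient


def dequantize_intra_ac(level: int, quantizer_scale: int, weight: int) -> int:
    """Inverse-quantize one intra AC coefficient, MPEG-1 style.

    level           -- quantized coefficient read from the bitstream
    quantizer_scale -- current quantizer scale
    weight          -- entry of the intra quantization matrix
    """
    coeff = div_toward_zero(2 * level * quantizer_scale * weight, 16)
    # Oddification: even values are moved one step toward zero
    # (zero stays zero because sign(0) == 0).
    if coeff % 2 == 0:
        coeff -= sign(coeff)
    # Saturate to the 12-bit signed range.
    return max(-2048, min(2047, coeff))


print(dequantize_intra_ac(3, 4, 16))   # 23  (24 is even, so it becomes 23)
print(dequantize_intra_ac(-3, 4, 16))  # -23
```

A decoder applies this to each of the 64 coefficients of every block, which is why the operation maps naturally onto vector instructions.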

Resident Evil 2, which is a marvelous example of careful code crafting, did manage to cram full motion videos together with the whole game into the limited 64 MiB cartridge. To do so, the developers encoded the videos with MPEG-1 at a low resolution, bitrate and frame rate (around 15 fps), and then performed a nice interpolation (cross-fade) between frames using the RDP. The MPEG decoder was not specifically accelerated, though (except for the final YUV to RGB conversion, which is technically not even part of MPEG): they simply recompiled a C player that worked well enough at that bitrate, so they did not get to use the special RSP instructions for it.
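The cross-fade mentioned above amounts to a linear blend between two consecutive decoded frames. The sketch below shows only that arithmetic on lists of 8-bit samples; how the game actually drives the RDP to do it is not described here, and the function name and the 0..256 weight convention are assumptions made for this example.

```python
def cross_fade(frame_a, frame_b, weight_b):
    """Blend two frames of 8-bit samples.

    weight_b is an integer from 0 (only frame_a) to 256 (only frame_b).
    Uses fixed-point arithmetic with rounding.
    """
    if not 0 <= weight_b <= 256:
        raise ValueError("weight_b must be in the range 0..256")
    if len(frame_a) != len(frame_b):
        raise ValueError("frames must have the same number of samples")
    weight_a = 256 - weight_b
    return [
        (a * weight_a + b * weight_b + 128) >> 8
        for a, b in zip(frame_a, frame_b)
    ]


# Halfway between two tiny three-sample "frames".
print(cross_fade([0, 100, 255], [255, 100, 0], 128))  # [128, 100, 128]
```

Displaying a few such in-between frames for every decoded frame makes a 15 fps video look smoother without decoding any additional data.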

In modern times, libdragon provides a fully accelerated MPEG decoder that runs on the RSP and makes use of these special instructions. This can be taken as further proof that the instructions SGI designed were indeed useful for their goal, albeit for a very small part of the whole algorithm.