COP1: Difference between revisions

From N64brew Wiki
Revision as of 23:43, 1 November 2022

Overview

The COP1 is the FPU of the main CPU. It operates on floats (either 32 bit singles or 64 bit doubles).

Just like the main CPU, the COP1 has 32 registers which are each 64 bit wide. Unlike the main CPU registers, all registers are equal (there is no zero register).

Getting data to/from the COP1

Numbers can be passed from main registers to FPU registers via MTC1 (32 bit) and DMTC1 (64 bit). For the way back, use MFC1/DMFC1. Alternatively, numbers can be loaded from RAM via LWC1/LDC1 and stored via SWC1/SDC1.

Supported formats and conversions

The instructions above simply transfer bits. In order to actually calculate with them, the bits need to be interpreted correctly. The COP1 understands four formats:

Supported formats
Name    Abbreviation  Explanation
Single  S             32 bit float
Double  D             64 bit double
Word    W             32 bit integer
Long    L             64 bit integer

The COP1 can only perform calculations on singles and doubles; word and long are temporary formats merely used for conversion.

Example: The following snippet puts 6 into V0, which is then moved to the COP1 into F0. It then converts that number to a double and puts it into F2. At the end, F2 will have the value 6.0:

ORI V0, R0, 6
MTC1 V0, F0
CVT.D.W F2, F0   // read: convert to Double from Word
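
The difference between merely transferring bits and converting them can be sketched in Python. This is only an illustration built on the standard struct module, not a description of how the COP1 works internally; the helper names (mtc1_bits, bits_as_single, cvt_d_w) are made up for this example.

```python
import struct

def mtc1_bits(value: int) -> int:
    """Model MTC1: the 32-bit pattern is copied unchanged."""
    return value & 0xFFFFFFFF

def bits_as_single(bits: int) -> float:
    """Reinterpret a 32-bit pattern as a single, without converting it."""
    return struct.unpack(">f", struct.pack(">I", bits))[0]

def cvt_d_w(bits: int) -> float:
    """Model CVT.D.W: read the pattern as a signed word, then convert it."""
    word = struct.unpack(">i", struct.pack(">I", bits))[0]
    return float(word)

f0 = mtc1_bits(6)
print(bits_as_single(f0))  # a tiny denormal, not 6.0
print(cvt_d_w(f0))         # 6.0
```

Without the conversion step, the integer pattern for 6 would be read as an extremely small float rather than as 6.0.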

The COP1 supports almost all conversions:

Supported conversions
Conversion  From Single  From Double      From Word        From Long
To Single   N/A          CVT.S.D          CVT.S.W          CVT.S.L
To Double   CVT.D.S      N/A              CVT.D.W          CVT.D.L
To Word     CVT.W.S      CVT.W.D          N/A              (doesn't exist)
To Long     CVT.L.S      CVT.L.D          (doesn't exist)  N/A

Rounding modes and inexact results

Most of the conversions mentioned above, as well as most regular instructions, can be lossy. When that happens, the COP1 has to perform some sort of rounding to fit the result in the destination. It provides four modes:

  • ROUND: Round towards nearest number (e.g. 4.4 => 4 and 4.6 => 5),
  • TRUNC: Round towards zero (e.g. 4.9 => 4 and -4.9 => -4),
  • CEIL: Round towards larger number (e.g. 4.1 => 5 and -4.1 => -4),
  • FLOOR: Round towards smaller number (e.g. 4.9 => 4 and -4.9 => -5).
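
The four modes can be tried out with a minimal Python sketch. It maps each mode name to a standard library function and only mirrors the float->int examples from the list; the ROUNDING dictionary is a name invented for this illustration. Note that Python's round() resolves exact ties (such as 4.5) to the even neighbour; tie handling is not covered by the text above, so the examples avoid such values.

```python
import math

# Hypothetical lookup table: rounding mode name -> Python equivalent.
ROUNDING = {
    "ROUND": round,        # towards the nearest number
    "TRUNC": math.trunc,   # towards zero
    "CEIL": math.ceil,     # towards the larger number
    "FLOOR": math.floor,   # towards the smaller number
}

ROUNDING["ROUND"](4.6)    # 5
ROUNDING["TRUNC"](-4.9)   # -4
ROUNDING["CEIL"](-4.1)    # -4
ROUNDING["FLOOR"](-4.9)   # -5
```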

The COP1 has a configurable rounding mode in FCSR (see below), which is applied for most instructions where it's applicable. For the specific case of float->int conversions, it provides specialized instructions that override the global rounding mode: ROUND.x.y, TRUNC.x.y, CEIL.x.y, FLOOR.x.y (where x is either W or L and y is either S or D; all 16 combinations are supported).

When rounding happens, inexact is signaled (see exceptions below).

FCSR

In addition to the data registers, the COP1 also provides the Floating Point Status Register (FCSR), which is read via CFC1 and written via CTC1 (using index 31). It provides the following bits:

FCSR
Bits    Description
0 to 1  RoundingMode: Nearest (ROUND) = 0, Zero (TRUNC) = 1, PositiveInfinity (CEIL) = 2, NegativeInfinity (FLOOR) = 3
2       Flag: Inexact Operation
3       Flag: Underflow
4       Flag: Overflow
5       Flag: Division By Zero
6       Flag: Invalid Operation
7       Enable: Inexact Operation
8       Enable: Underflow
9       Enable: Overflow
10      Enable: Division By Zero
11      Enable: Invalid Operation
12      Cause: Inexact Operation
13      Cause: Underflow
14      Cause: Overflow
15      Cause: Division By Zero
16      Cause: Invalid Operation
17      Cause: Unimplemented Operation
23      Condition
24      Flush Denorm To Zero
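
As a reading aid for the table above, the following Python sketch splits a raw FCSR value (as it would be returned by CFC1) into its fields. The function decode_fcsr and the dictionary keys are hypothetical names chosen for this example; only the bit positions come from the table.

```python
ROUNDING_MODES = ["ROUND", "TRUNC", "CEIL", "FLOOR"]
EXCEPTIONS = ["Inexact", "Underflow", "Overflow",
              "Division By Zero", "Invalid Operation"]

def decode_fcsr(fcsr: int) -> dict:
    """Split a raw 32-bit FCSR value into named fields."""
    def set_bits(first_bit, names):
        # Collect the names whose bit is set, starting at first_bit.
        return [name for i, name in enumerate(names)
                if (fcsr >> (first_bit + i)) & 1]

    return {
        "rounding_mode": ROUNDING_MODES[fcsr & 0b11],
        "flags": set_bits(2, EXCEPTIONS),
        "enables": set_bits(7, EXCEPTIONS),
        # Cause has one extra bit (17) for Unimplemented Operation.
        "causes": set_bits(12, EXCEPTIONS + ["Unimplemented Operation"]),
        "condition": bool((fcsr >> 23) & 1),
        "flush_denorm_to_zero": bool((fcsr >> 24) & 1),
    }

# Example value: TRUNC mode, Flag and Cause Inexact set, flush enabled.
fields = decode_fcsr((1 << 24) | (1 << 12) | (1 << 2) | 1)
print(fields["rounding_mode"])  # TRUNC
```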

Exceptions Overview

The COP1 supports 6 exceptions:

  • Inexact: The destination can't hold the full result, so some data loss occurred and rounding was performed.
  • Underflow: The resulting number was so small it was rounded to 0. This is always in combination with inexact. The COP1 has a quirk here: unlike other CPUs (e.g. x64 or arm64), it takes the rounding modes FLOOR/CEIL literally even on underflow; if the result is smaller than the smallest possible float, it might not be rounded to 0 but to the minimum regular float.
  • Overflow: The resulting number was so large it couldn't be represented as a regular number and was instead "rounded up to infinity" (which is a special floating point value). This is always in combination with inexact.
  • Division By Zero: This just happens for DIV.S and DIV.D when the divisor is 0.
  • Invalid Operation: This happens in a bunch of special cases (see "Special Cases" below).
  • Unimplemented Operation: This happens in a bunch of special cases (see "Special Cases" below).

Instructions that can fire exceptions (e.g. ADD.S, CVT.S.W) always clear all Cause bits that aren't being signaled by that specific instruction. For example, CVT.W.S converting 5.5 to an int would affect the bits in the following way:

  • Clear all Cause bits
  • Perform operation
  • Set "Cause: Inexact"
  • If "Enable: Inexact" is true, fire exception. Otherwise, set "Flag: Inexact"

This means that Cause can be inspected to see the result of the directly preceding instruction. Flags, however, are cumulative: they are true if any instruction since the last clear signaled that exception, assuming the exception was disabled.
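
The interplay of Cause, Enable and Flag can be modeled with a small Python sketch. The class and constant names are invented for this illustration. It also makes one assumption that the text above does not cover: when an exception fires, the sketch sets no Flag bits at all, even if other, disabled exceptions were signaled by the same instruction.

```python
# One bit per exception, in the same order as the Cause field.
INEXACT, UNDERFLOW, OVERFLOW, DIV_BY_ZERO, INVALID, UNIMPLEMENTED = (
    1 << i for i in range(6)
)

class FpuException(Exception):
    """Stands in for the exception the COP1 would fire."""

class Fcsr:
    def __init__(self):
        self.cause = 0
        self.enables = 0
        self.flags = 0

    def signal(self, raised: int) -> None:
        """Apply the outcome of one instruction (bit mask of exceptions)."""
        # Clear all Cause bits, then set the ones signaled right now.
        self.cause = raised
        # Unimplemented Operation can't be disabled, so it always fires.
        firing = raised & (self.enables | UNIMPLEMENTED)
        if firing:
            raise FpuException(firing)
        # Nothing fired: remember the signaled exceptions in the Flags.
        self.flags |= raised

fcsr = Fcsr()
fcsr.signal(INEXACT)  # e.g. CVT.W.S on 5.5: Cause and Flag Inexact set
fcsr.signal(0)        # an exact operation: Cause cleared, Flag remains
```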

Unimplemented Operation is special as it can't be disabled - if it happens, it will always fire.

Floating Point Numbers

At this point, it makes sense to take a quick look at what floats actually are. The following is the bit representation of a single (doubles work exactly the same, but have more bits in the exponent and the mantissa):

Single
31            30 - 23            22 - 0
Sign (1 bit)  Exponent (8 bits)  Mantissa (23 bits)

If the sign bit is 0, the number is positive. If it is 1, the number is negative (because of this, a floating point number is easily negated - just XOR with 0x80000000).
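
The layout and the XOR trick can be checked with a short Python sketch using the standard struct module. The helper names are made up for this example.

```python
import struct

def single_to_bits(x: float) -> int:
    """Return the 32-bit pattern of x stored as a single."""
    return struct.unpack(">I", struct.pack(">f", x))[0]

def bits_to_single(bits: int) -> float:
    """Interpret a 32-bit pattern as a single."""
    return struct.unpack(">f", struct.pack(">I", bits))[0]

def split_single(bits: int) -> tuple:
    """Return (sign, exponent, mantissa) of a 32-bit pattern."""
    return bits >> 31, (bits >> 23) & 0xFF, bits & 0x7FFFFF

def negate(x: float) -> float:
    """Negate a single by flipping only its sign bit."""
    return bits_to_single(single_to_bits(x) ^ 0x80000000)

split_single(single_to_bits(1.0))  # (0, 127, 0)
negate(1.5)                        # -1.5
```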

There are some special cases for the exponent and mantissa:

Special Numbers
Sign bit  Exponent  Mantissa                 Description
0         0         0                        Regular zero
1         0         0                        "Negative zero", which is considered equal to regular zero
any       0         != 0                     Denormal/subnormal
0         0xFF      0                        Positive Infinity
1         0xFF      0                        Negative Infinity
any       0xFF      != 0 with highest bit 0  sNAN (signaling Not-A-Number)
any       0xFF      != 0 with highest bit 1  qNAN (quiet Not-A-Number)
0         0<x<0xFF  any                      A regular positive number
1         0<x<0xFF  any                      A regular negative number


  • Exponent=0 and Mantissa=0: The number is 0.0 (if sign is 0) or -0.0 (negative zero if sign is 1). Note that for all intents and purposes, -0 is considered equal to 0.
  • Exponent=0 and Mantissa!=0: The number is a denormal or subnormal. If a number like this is given to a calculating instruction, an Unimplemented Operation is signaled.
  • Exponent=0xFF and Mantissa=0: The number is INFINITY (if sign is 0) or -INFINITY (if sign is 1).
  • Exponent=0xFF and Mantissa!=0: The number is a NAN (Not a Number), which indicates an incorrect result (this is for example the result of 0.0/0.0 or sqrt(-2)). There are two varieties of NAN, which are differentiated by the most significant bit of the mantissa: msb=1 is qNAN (quiet) and msb=0 is sNAN (signaling). Any sNAN that is given to a calculating instruction will immediately trigger an Unimplemented Operation. A qNAN as input will cause the output of the instruction to be a qNAN (though with a different payload) and no exception will be signaled.
  • Anything else: This is a regular floating point number.
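
The cases above can be summarized in a small Python classifier for the bit pattern of a single. The function name and the returned labels are hypothetical; the qNAN/sNAN distinction follows the convention stated in this section (most significant mantissa bit set means quiet).

```python
def classify_single(bits: int) -> str:
    """Name the category of a 32-bit single bit pattern."""
    sign = bits >> 31
    exponent = (bits >> 23) & 0xFF
    mantissa = bits & 0x7FFFFF

    if exponent == 0:
        if mantissa == 0:
            return "negative zero" if sign else "zero"
        return "denormal"
    if exponent == 0xFF:
        if mantissa == 0:
            return "-infinity" if sign else "+infinity"
        # Bit 22 is the most significant bit of the mantissa.
        return "qNAN" if mantissa & 0x400000 else "sNAN"
    return "regular negative" if sign else "regular positive"

classify_single(0x3F800000)  # "regular positive" (this is 1.0)
classify_single(0x00000001)  # "denormal"
classify_single(0x7FC00000)  # "qNAN"
```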