Reality Signal Processor/CPU Core

Single-lane instructions are a group of instructions that operate on a single lane of a single input register (<code>VT<se></code>) and store the result into a single lane of a single output register (<code>VD<de></code>).
 
Example syntax:<syntaxhighlight lang="asm">
VMOV $v01, e(4), $v05, e(6)
</syntaxhighlight>In this example, the value in lane <code>$v05<6></code> is moved to lane <code>$v01<4></code>. The assembly syntax reuses the [[Reality Signal Processor/CPU Core#Broadcast modifiers|broadcast modifier syntax]], but no broadcast is actually performed, because each instruction operates only on the single specified lane. Only the single-lane broadcast modifiers (<code>e(0)</code> ... <code>e(7)</code>) are supported.
 
In the opcode, the fields <code>vt_elem</code> and <code>vd_elem</code> are used to compute <code>se</code> and <code>de</code>, which specify the affected lane of the input and output register, respectively.

<code>vd_elem</code> is 5 bits long (range 0..31); the highest bits are always ignored, and the destination lane <code>de</code> is simply <code>vd_elem(2..0)</code>.

<code>vt_elem</code> is 4 bits long (range 0..15). <code>vt_elem(4)</code> must be zero. When <code>vt_elem(3)</code> is 1, <code>vt_elem(2..0)</code> is used directly as the source lane <code>se</code>, as expected. When <code>vt_elem(3)</code> is 0, a hardware bug is triggered and some bits of <code>vt_elem</code> are replaced with the same-position bits of <code>vd_elem</code> while computing <code>se</code>. Specifically, all bits in <code>vt_elem</code> from the topmost set bit upwards are replaced with the same-position bits in <code>vd_elem</code>. Notice that this behaviour is consistent with what happens when <code>vt_elem(3)</code> is 1 (the topmost set bit is then bit 3, so bits 2..0 are left untouched), which means there is no need to treat it as a special case. Pseudo-code:
<code>de(2..0) = vd_elem(2..0)
msb = highest_set_bit(vt_elem)
se(2..0) = vd_elem(2..msb) || vt_elem(msb-1..0)</code>
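
The pseudo-code above can be sketched as a small executable model, which may be handy when checking an emulator or assembler against it. This is only an illustration, not a reference implementation: the function name <code>single_lane_select</code> is hypothetical, <code>vt_elem</code> is restricted to 0..15 (the <code>vt_elem(4)</code> == 1 case is not modelled), and the result for <code>vt_elem</code> == 0, which the pseudo-code leaves undefined because there is no set bit, is assumed to take every bit from <code>vd_elem</code>.

```python
def single_lane_select(vt_elem: int, vd_elem: int) -> tuple[int, int]:
    """Return (se, de) for a single-lane instruction.

    Hypothetical helper mirroring the pseudo-code in this section.
    """
    if not 0 <= vt_elem <= 15:
        # vt_elem(4) == 1 is not covered by the analysis in this section.
        raise ValueError("vt_elem must be in range 0..15")
    if not 0 <= vd_elem <= 31:
        raise ValueError("vd_elem must be in range 0..31")

    # de(2..0) = vd_elem(2..0); the upper bits of vd_elem are ignored.
    de = vd_elem & 0b111

    # msb = highest_set_bit(vt_elem); bit_length() is 0 when vt_elem == 0.
    msb = vt_elem.bit_length() - 1

    # Bits below msb come from vt_elem, bits msb and above from vd_elem.
    # Assumption: with no set bit (vt_elem == 0) all bits come from vd_elem.
    low_mask = (1 << max(msb, 0)) - 1

    se = (de & ~low_mask & 0b111) | (vt_elem & low_mask)
    return se, de
```

For instance, <code>single_lane_select(0b1110, 4)</code> yields <code>(6, 4)</code>: bit 3 is set, so the source lane is just <code>vt_elem(2..0)</code>. With <code>vt_elem = 0b0101</code> and <code>vd_elem = 0b110</code>, bit 2 of the source lane is taken from <code>vd_elem</code> instead, giving <code>(5, 6)</code>.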
TODO: complete analysis for <code>vt_elem(4)</code> == 1.
{| class="wikitable"
|+Single-lane instructions