Reality Signal Processor/CPU Core: Difference between revisions

No edit summary
Line 615: Line 615:
</syntaxhighlight>Note that the <code>element</code> field can be written with the lane syntax to refer to a specific lane, but if the access is made using <code>llv</code> or <code>ldv</code> (4 or 8 bytes), it will overflow into the following lanes.
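
The overflow into the following lanes can be illustrated with a small host-side sketch. It is not part of the RSP instruction set: the helper names <code>lane_to_element</code> and <code>lanes_touched</code> are hypothetical, and the sketch assumes that lanes are 16 bits wide, so lane <code>N</code> begins at byte <code>2*N</code> of the 128-bit register. Accesses that would run past the end of the register are simply clipped here and not modeled.

<syntaxhighlight lang="c">
/* Assumption: 16-bit lanes, so lane N starts at byte element 2*N. */
static int lane_to_element(int lane)
{
    return lane * 2;
}

/* Return a bitmask (bit N = lane N) of the lanes covered by an access of
 * access_size bytes that starts at the given lane. Bytes past the end of
 * the 16-byte register are ignored in this sketch. */
static unsigned lanes_touched(int lane, int access_size)
{
    unsigned mask = 0;
    int first = lane_to_element(lane);

    for (int b = first; b < first + access_size && b < 16; b++)
        mask |= 1u << (b / 2);
    return mask;
}
</syntaxhighlight>

For example, <code>lanes_touched(2, 2)</code> (a 2-byte access at <code>e(2)</code>) covers only lane 2, while <code>lanes_touched(2, 4)</code> (a 4-byte access such as <code>llv</code>) also covers lane 3.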


===== '''Pseudo-code''' =====
===== Pseudo-code =====
<syntaxhighlight lang="c">
addr = GPR[base] + offset * access_size
Line 624: Line 624:
</syntaxhighlight>


===== '''Description''' =====
===== Description =====
These instructions load a scalar value (1, 2, 4, or 8 bytes) from DMEM into a VPR. Loads affect only a portion of the vector register (which is 128-bit); other bytes in the register are not modified.
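
A minimal host-side sketch of this partial-register behavior follows. It is an illustration, not the hardware implementation: the function name <code>scalar_load_sketch</code> is hypothetical, the VPR is modeled as a plain 16-byte array, DMEM is assumed to be a 4 KiB byte array whose addresses wrap at <code>0xFFF</code>, and accesses that would run past byte 15 of the register are rejected rather than modeled.

<syntaxhighlight lang="c">
#include <stdint.h>

/* Copy access_size bytes (1, 2, 4 or 8) from dmem[addr..] into the 16-byte
 * register image vt, starting at byte `element`. All other bytes of vt are
 * left untouched. Returns 0 on success, or -1 when the access would extend
 * past byte 15 (a case this sketch deliberately does not model). */
static int scalar_load_sketch(uint8_t vt[16], int element,
                              const uint8_t *dmem, uint32_t addr,
                              int access_size)
{
    if (element < 0 || access_size < 1 || element + access_size > 16)
        return -1;

    for (int i = 0; i < access_size; i++)
        vt[element + i] = dmem[(addr + (uint32_t)i) & 0xFFF]; /* assumed 4 KiB wrap */
    return 0;
}
</syntaxhighlight>

With <code>element=4</code> and <code>access_size=4</code>, only bytes 4 to 7 of the register image change; bytes 0 to 3 and 8 to 15 keep their previous contents.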


Line 663: Line 663:
|}


===== '''Assembly''' =====
===== Assembly =====
<syntaxhighlight lang="asm">
ssv $v01,e(2), 0,s0 ; Store the 16-bit word in the third lane of $v01 into DMEM at address s0
Line 669: Line 669:
</syntaxhighlight>Note that the <code>element</code> field can be written with the lane syntax to refer to a specific lane, but if the access is made using <code>slv</code> or <code>sdv</code> (4 or 8 bytes), it will overflow into the following lanes.


===== '''Pseudo-code''' =====
===== Pseudo-code =====
<syntaxhighlight lang="c">
addr = GPR[base] + offset * access_size
Line 684: Line 684:
The part of the vector register being accessed is <code>VPR[vt][element..element+access_size-1]</code>; that is, <code>element</code> selects the first accessed byte within the vector register. When the last accessed byte would fall beyond byte 15, the element access wraps around within the vector and a full-size store is still performed (e.g. <code>slv</code> with <code>element=15</code> stores <code>VPR[vt][15,0..2]</code> into memory, for a total of 4 bytes).
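
The wrap-around can be sketched on the host as follows. This is an illustrative model only: the function name <code>scalar_store_sketch</code> is hypothetical, the VPR is modeled as a 16-byte array, and DMEM is assumed to be a 4 KiB byte array whose addresses wrap at <code>0xFFF</code>.

<syntaxhighlight lang="c">
#include <stdint.h>

/* Store access_size bytes (1, 2, 4 or 8) from the 16-byte register image vt
 * into dmem[addr..], starting at byte `element`. The byte index wraps modulo
 * 16, so the full access_size is always written. */
static void scalar_store_sketch(const uint8_t vt[16], int element,
                                uint8_t *dmem, uint32_t addr,
                                int access_size)
{
    for (int i = 0; i < access_size; i++)
        dmem[(addr + (uint32_t)i) & 0xFFF] = vt[(element + i) & 15]; /* wrap in register */
}
</syntaxhighlight>

Calling it with <code>element=15</code> and <code>access_size=4</code> writes register bytes 15, 0, 1 and 2 to four consecutive DMEM bytes, matching the <code>slv</code> example above.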


===== '''Usage''' =====
===== Usage =====
These instructions are seldom used. Normally, it is better to structure RSP code to work across full vectors to maximize parallelism. Data flow between the RSP and the VR4300 should be structured in a vectorized format, so that it is possible to use a vector store (<code>sqv</code>, when the output is made of 16-bit data) or a packed store (<code>suv</code>/<code>spv</code>, when the output is made of 8-bit data). Consider also using <code>mfc2</code> to move a 16-bit value from a lane of a VPR into a GPR.