Vector Saturating Rounding Doubling Multiply Returning High Half. VQRDMULH multiplies corresponding elements in two vectors, doubles the results, and places the most significant half of the final results in the destination vector.
Reference implementation (from gemmlowp):
https://github.com/google/gemmlowp/blob/master/fixedpoint/fixedpoint.h#L329
<code>
// This function implements the same computation as the ARMv7 NEON VQRDMULH
// instruction.
template <>
inline std::int32_t SaturatingRoundingDoublingHighMul(std::int32_t a,
                                                      std::int32_t b) {
  bool overflow = a == b && a == std::numeric_limits<std::int32_t>::min();
  std::int64_t a_64(a);
  std::int64_t b_64(b);
  std::int64_t ab_64 = a_64 * b_64;
  std::int32_t nudge = ab_64 >= 0 ? (1 << 30) : (1 - (1 << 30));
  std::int32_t ab_x2_high32 =
      static_cast<std::int32_t>((ab_64 + nudge) / (1ll << 31));
  return overflow ? std::numeric_limits<std::int32_t>::max() : ab_x2_high32;
}
</code>
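1. Why is ab_x2_high32 computed by dividing by 1<<31 rather than 1<<32?
Ans: The 64-bit product of two signed 32-bit fixed-point values carries two sign bits (except for INT32_MIN * INT32_MIN), so its significant bits are [62:0] and the most significant half starts at the second MSB, i.e. bits [62:31]. Put another way, doubling the product and taking the high 32 bits is (2 * ab_64) >> 32, which equals ab_64 >> 31, hence the division by 1<<31.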
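2. If ab_64 >= 0, why is the rounding constant 1<<30 rather than 1<<31?
Ans: For the same reason as above. Although the product occupies bits [63:0], its significant bits are [62:0], so the most significant half is bits [62:31]. Rounding adds half of the divisor 1<<31, which is 1<<30. For a negative product the nudge is 1 - (1<<30) because C++ integer division truncates toward zero; with this offset the truncated quotient equals floor((ab_64 + (1<<30)) / (1<<31)).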
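The two answers above can be checked with a small sketch. The function names `srdhm_ref`, `srdhm_doubling` and `floor_div` are hypothetical, chosen here for illustration; `srdhm_ref` restates the quoted gemmlowp code as a plain function (returning early on the overflow case), and `srdhm_doubling` spells out "double, round, take the high half" literally. Values are read as Q31, so 1<<30 stands for 0.5.

```cpp
#include <cstdint>
#include <limits>

// Floor division for a positive divisor (portable stand-in for an
// arithmetic right shift on negative values).
constexpr std::int64_t floor_div(std::int64_t x, std::int64_t d) {
  return x / d - ((x % d != 0 && x < 0) ? 1 : 0);
}

// Restatement of the quoted reference code: nudge, then truncating
// division by 1<<31. Hypothetical name, not part of gemmlowp.
constexpr std::int32_t srdhm_ref(std::int32_t a, std::int32_t b) {
  if (a == b && a == std::numeric_limits<std::int32_t>::min()) {
    return std::numeric_limits<std::int32_t>::max();  // saturate
  }
  const std::int64_t ab = static_cast<std::int64_t>(a) * b;
  const std::int64_t nudge = ab >= 0 ? (1 << 30) : (1 - (1 << 30));
  return static_cast<std::int32_t>((ab + nudge) / (1ll << 31));
}

// Literal form: double the product, add half of 1<<32, keep the high half.
constexpr std::int32_t srdhm_doubling(std::int32_t a, std::int32_t b) {
  if (a == b && a == std::numeric_limits<std::int32_t>::min()) {
    return std::numeric_limits<std::int32_t>::max();  // saturate
  }
  const std::int64_t ab = static_cast<std::int64_t>(a) * b;
  return static_cast<std::int32_t>(
      floor_div(2 * ab + (1ll << 31), 1ll << 32));
}

// 0.5 * 0.5 == 0.25 in Q31.
static_assert(srdhm_ref(1 << 30, 1 << 30) == (1 << 29), "Q31 product");
// Both forms agree, including on a negative tie.
static_assert(srdhm_ref(-3, 1 << 30) == srdhm_doubling(-3, 1 << 30),
              "forms agree");
```

With these definitions, `srdhm_ref(1, 1 << 30)` gives 1 (a plain truncation would give 0), and `srdhm_ref(-1, 1 << 30)` gives 0, so ties round toward positive infinity in both forms.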
Note: VQRDMULH is similar to VSMUL in the RISC-V Vector extension (RVV).
When multiplying two N-bit signed numbers, the largest magnitude is obtained for -2^(N-1) * -2^(N-1), producing a result +2^(2N-2), which has a single (zero) sign bit when held in 2N bits. All other products have two sign bits in 2N bits. To retain greater precision in N result bits, the product is shifted right by one bit less than N, saturating the largest magnitude result but increasing result precision by one bit for all other products.
# Signed saturating and rounding fractional multiply
vsmul.vv vd, vs2, vs1, vm   # vd[i] = clip((vs2[i]*vs1[i]+round)>>(SEW-1))
vsmul.vx vd, vs2, rs1, vm   # vd[i] = clip((vs2[i]*x[rs1]+round)>>(SEW-1))
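A minimal scalar sketch of one vsmul lane may make the formula concrete. It assumes SEW = 32 and the round-to-nearest-up rounding mode (vxrm = rnu), under which `round` is 1 << (SEW-2); other rounding modes and the vxsat flag are not modeled. The name `vsmul_rnu_i32` is hypothetical.

```cpp
#include <cstdint>
#include <limits>

// One lane of vsmul for SEW = 32, assuming vxrm = rnu (an assumption of
// this sketch): clip((vs2 * vs1 + (1 << 30)) >> 31).
constexpr std::int32_t vsmul_rnu_i32(std::int32_t vs2, std::int32_t vs1) {
  constexpr int kSew = 32;
  const std::int64_t product = static_cast<std::int64_t>(vs2) * vs1;
  const std::int64_t rounded = product + (std::int64_t{1} << (kSew - 2));
  const std::int64_t divisor = std::int64_t{1} << (kSew - 1);

  // Floor division by 2^(SEW-1); equivalent to an arithmetic right shift,
  // written this way to stay portable for negative values.
  std::int64_t shifted = rounded / divisor;
  if (rounded % divisor != 0 && rounded < 0) {
    --shifted;
  }

  // clip(): saturate to the signed SEW-bit range.
  const std::int64_t hi = std::numeric_limits<std::int32_t>::max();
  const std::int64_t lo = std::numeric_limits<std::int32_t>::min();
  if (shifted > hi) return std::numeric_limits<std::int32_t>::max();
  if (shifted < lo) return std::numeric_limits<std::int32_t>::min();
  return static_cast<std::int32_t>(shifted);
}

// 0.5 * 0.5 == 0.25 when the operands are read as Q31.
static_assert(vsmul_rnu_i32(1 << 30, 1 << 30) == (1 << 29), "Q31 product");
```

Under these assumptions the only input that reaches the clip is INT32_MIN * INT32_MIN, which saturates to INT32_MAX, the same case the gemmlowp code handles with its `overflow` flag.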