Convert 32-bit floating point value
- RISCV_DSP_ATTRIBUTE void riscv_float_to_f16 (const float32_t *pSrc, float16_t *pDst, uint32_t blockSize)
- RISCV_DSP_ATTRIBUTE void riscv_float_to_f64 (const float32_t *pSrc, float64_t *pDst, uint32_t blockSize)
- RISCV_DSP_ATTRIBUTE void riscv_float_to_q15 (const float32_t *pSrc, q15_t *pDst, uint32_t blockSize)
- RISCV_DSP_ATTRIBUTE void riscv_float_to_q31 (const float32_t *pSrc, q31_t *pDst, uint32_t blockSize)
- RISCV_DSP_ATTRIBUTE void riscv_float_to_q7 (const float32_t *pSrc, q7_t *pDst, uint32_t blockSize)
- group float_to_x
Functions
- RISCV_DSP_ATTRIBUTE void riscv_float_to_f16 (const float32_t *pSrc, float16_t *pDst, uint32_t blockSize)
Converts the elements of the f32 vector to an f16 vector.
- Parameters
pSrc – [in] points to the f32 input vector
pDst – [out] points to the f16 output vector
blockSize – [in] number of samples in each vector
- RISCV_DSP_ATTRIBUTE void riscv_float_to_f64 (const float32_t *pSrc, float64_t *pDst, uint32_t blockSize)
Converts the elements of the f32 vector to a 64-bit floating-point (f64) vector.
- Parameters
pSrc – [in] points to the f32 input vector
pDst – [out] points to the f64 output vector
blockSize – [in] number of samples in each vector
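As an illustration of what an f32 to f64 conversion amounts to, here is a minimal sketch in plain C. It assumes `float32_t` and `float64_t` correspond to `float` and `double`; the name `sketch_float_to_f64` is hypothetical and this is not the library's implementation.

```c
#include <stdint.h>

/* Hypothetical reference loop, assuming float32_t == float and
 * float64_t == double. Not the library's implementation. */
static void sketch_float_to_f64(const float *pSrc, double *pDst, uint32_t blockSize)
{
    for (uint32_t i = 0; i < blockSize; i++) {
        /* Widening is exact: every float value is representable as a double. */
        pDst[i] = (double)pSrc[i];
    }
}
```

Widening does not recover precision that was already lost in the f32 input: `(double)0.1f` is not equal to the double literal `0.1`.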
- RISCV_DSP_ATTRIBUTE void riscv_float_to_q15 (const float32_t *pSrc, q15_t *pDst, uint32_t blockSize)
Converts the elements of the floating-point vector to Q15 vector.
- Details
The equation used for the conversion process is:
pDst[n] = (q15_t)(pSrc[n] * 32768); 0 <= n < blockSize.
- Scaling and Overflow Behavior
The function uses saturating arithmetic. Results outside of the allowable Q15 range [0x8000 0x7FFF] are saturated.
Note
In order to apply rounding, the library should be rebuilt with the ROUNDING macro defined in the preprocessor section of project options.
- Parameters
pSrc – [in] points to the floating-point input vector
pDst – [out] points to the Q15 output vector
blockSize – [in] number of samples in each vector
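The scaling, saturation, and optional rounding described for this function can be sketched in portable C. This is a hedged reference model, not the library's code: it assumes a scale factor of 2^15 = 32768 (Q15 has 15 fractional bits), that `q15_t` is a 16-bit signed integer, and that rounding means adding ±0.5 before truncation. The names `sketch_float_to_q15` and `sketch_float_to_q15_vec` are hypothetical.

```c
#include <stdint.h>

/* Hypothetical scalar model of float -> Q15 conversion.
 * rounding != 0 mimics a build with rounding enabled (assumed to add
 * +/-0.5 before truncation). NaN inputs are not handled in this sketch. */
static int16_t sketch_float_to_q15(float x, int rounding)
{
    float scaled = x * 32768.0f;            /* assumed scale: 2^15 */

    if (rounding) {
        scaled += (scaled > 0.0f) ? 0.5f : -0.5f;
    }

    /* Saturate to the Q15 range [0x8000, 0x7FFF]. */
    if (scaled >= 32767.0f) {
        return INT16_MAX;
    }
    if (scaled <= -32768.0f) {
        return INT16_MIN;
    }
    return (int16_t)scaled;                 /* truncates toward zero */
}

/* Vector form with the same parameter order as the documented function. */
static void sketch_float_to_q15_vec(const float *pSrc, int16_t *pDst, uint32_t blockSize)
{
    for (uint32_t i = 0; i < blockSize; i++) {
        pDst[i] = sketch_float_to_q15(pSrc[i], 0);
    }
}
```

Under these assumptions, 0.5 maps to 16384, while +1.0 saturates to 0x7FFF because +1.0 itself is not representable in Q15.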
- RISCV_DSP_ATTRIBUTE void riscv_float_to_q31 (const float32_t *pSrc, q31_t *pDst, uint32_t blockSize)
Converts the elements of the floating-point vector to Q31 vector.
- Details
The equation used for the conversion process is:
pDst[n] = (q31_t)(pSrc[n] * 2147483648); 0 <= n < blockSize.
- Scaling and Overflow Behavior
The function uses saturating arithmetic. Results outside of the allowable Q31 range [0x80000000 0x7FFFFFFF] are saturated.
Note
In order to apply rounding, the library should be rebuilt with the ROUNDING macro defined in the preprocessor section of project options.
Note
If the input float values are very large (on the order of 2**32 or more), the function cannot saturate them to the correct values. If you expect such values in the input array, force them to +1 or -1 (matching their sign) before calling this function. For reasonable float values (< 2**32), the function saturates correctly.
- Parameters
pSrc – [in] points to the floating-point input vector
pDst – [out] points to the Q31 output vector
blockSize – [in] number of samples in each vector
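The pre-clamping recommended in the note above could look like the following sketch. The helper `sketch_clamp_unit` is hypothetical and not part of the library; `sketch_float_to_q31` is a scalar reference model that assumes a scale factor of 2^31 and a 32-bit signed `q31_t`, and it computes in `double` so that the saturation comparisons are exact.

```c
#include <stdint.h>

/* Hypothetical helper: force every sample into [-1.0, +1.0] in place,
 * so that a later float -> Q31 conversion never sees huge values. */
static void sketch_clamp_unit(float *buf, uint32_t blockSize)
{
    for (uint32_t i = 0; i < blockSize; i++) {
        if (buf[i] > 1.0f) {
            buf[i] = 1.0f;
        } else if (buf[i] < -1.0f) {
            buf[i] = -1.0f;
        }
    }
}

/* Hypothetical scalar model of float -> Q31 conversion (no rounding).
 * NaN inputs are not handled in this sketch. */
static int32_t sketch_float_to_q31(float x)
{
    double scaled = (double)x * 2147483648.0;   /* assumed scale: 2^31 */

    /* Saturate to the Q31 range [0x80000000, 0x7FFFFFFF]. */
    if (scaled >= 2147483647.0) {
        return INT32_MAX;
    }
    if (scaled <= -2147483648.0) {
        return INT32_MIN;
    }
    return (int32_t)scaled;                     /* truncates toward zero */
}
```

In application code, the clamped buffer would then be passed as `pSrc` to `riscv_float_to_q31`.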
- RISCV_DSP_ATTRIBUTE void riscv_float_to_q7 (const float32_t *pSrc, q7_t *pDst, uint32_t blockSize)
Converts the elements of the floating-point vector to Q7 vector.
- Details
The equation used for the conversion process is:
pDst[n] = (q7_t)(pSrc[n] * 128); 0 <= n < blockSize.
- Scaling and Overflow Behavior
The function uses saturating arithmetic. Results outside of the allowable Q7 range [0x80 0x7F] are saturated.
Note
In order to apply rounding, the library should be rebuilt with the ROUNDING macro defined in the preprocessor section of project options.
- Parameters
pSrc – [in] points to the floating-point input vector
pDst – [out] points to the Q7 output vector
blockSize – [in] length of the input vector
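A short worked example may help make the Q7 range concrete. The sketch below is a plain C reference model under stated assumptions (scale factor 2^7 = 128, `q7_t` as an 8-bit signed integer, no rounding); `sketch_float_to_q7` is a hypothetical name and not the library's implementation.

```c
#include <stdint.h>

/* Hypothetical reference loop for float -> Q7 conversion.
 * NaN inputs are not handled in this sketch. */
static void sketch_float_to_q7(const float *pSrc, int8_t *pDst, uint32_t blockSize)
{
    for (uint32_t i = 0; i < blockSize; i++) {
        float scaled = pSrc[i] * 128.0f;    /* assumed scale: 2^7 */

        /* Saturate to the Q7 range [0x80, 0x7F]. */
        if (scaled >= 127.0f) {
            pDst[i] = INT8_MAX;
        } else if (scaled <= -128.0f) {
            pDst[i] = INT8_MIN;
        } else {
            pDst[i] = (int8_t)scaled;       /* truncates toward zero */
        }
    }
}
```

With the input {0.5, -0.25, 1.0, -3.0}, this model produces {64, -32, 127, -128}: the last two values are saturated.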