Convert 32-bit floating point value

RISCV_DSP_ATTRIBUTE void riscv_float_to_f16 (const float32_t *pSrc, float16_t *pDst, uint32_t blockSize)
RISCV_DSP_ATTRIBUTE void riscv_float_to_f64 (const float32_t *pSrc, float64_t *pDst, uint32_t blockSize)
RISCV_DSP_ATTRIBUTE void riscv_float_to_q15 (const float32_t *pSrc, q15_t *pDst, uint32_t blockSize)
RISCV_DSP_ATTRIBUTE void riscv_float_to_q31 (const float32_t *pSrc, q31_t *pDst, uint32_t blockSize)
RISCV_DSP_ATTRIBUTE void riscv_float_to_q7 (const float32_t *pSrc, q7_t *pDst, uint32_t blockSize)
group float_to_x

Functions

RISCV_DSP_ATTRIBUTE void riscv_float_to_f16 (const float32_t *pSrc, float16_t *pDst, uint32_t blockSize)

Converts the elements of the floating-point vector to f16 vector.

Converts the elements of the f32 vector to a 16-bit floating-point (f16) vector.

Parameters
  • pSrc[in] points to the f32 input vector

  • pDst[out] points to the f16 output vector

  • blockSize[in] number of samples in each vector

RISCV_DSP_ATTRIBUTE void riscv_float_to_f64 (const float32_t *pSrc, float64_t *pDst, uint32_t blockSize)

Converts the elements of the floating-point vector to f64 vector.

Converts the elements of the floating-point vector to 64 bit floating-point vector.

Parameters
  • pSrc[in] points to the f32 input vector

  • pDst[out] points to the f64 output vector

  • blockSize[in] number of samples in each vector
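As an illustration of what this conversion amounts to, the following is a minimal sketch of a portable reference loop. It assumes that `float32_t` maps to `float` and `float64_t` maps to `double`; the name `ref_float_to_f64` is hypothetical and is not part of the library.

```c
#include <stdint.h>

/* Hypothetical reference helper, not a library function.
 * Assumes float32_t == float and float64_t == double. */
void ref_float_to_f64(const float *pSrc, double *pDst, uint32_t blockSize)
{
    for (uint32_t i = 0; i < blockSize; i++) {
        /* Widening float -> double is exact: every float value
         * is representable as a double. */
        pDst[i] = (double)pSrc[i];
    }
}
```

Because the widening is exact, no scaling, rounding, or saturation is involved in this direction.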

RISCV_DSP_ATTRIBUTE void riscv_float_to_q15 (const float32_t *pSrc, q15_t *pDst, uint32_t blockSize)

Converts the elements of the floating-point vector to Q15 vector.

Details

The equation used for the conversion process is:

    pDst[n] = (q15_t)(pSrc[n] * 32768);   0 <= n < blockSize.

Scaling and Overflow Behavior

The function uses saturating arithmetic. Results outside of the allowable Q15 range [0x8000 0x7FFF] are saturated.

Note

In order to apply rounding, the library should be rebuilt with the ROUNDING macro defined in the preprocessor section of project options.

Parameters
  • pSrc[in] points to the floating-point input vector

  • pDst[out] points to the Q15 output vector

  • blockSize[in] number of samples in each vector
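The saturating behavior described above can be sketched in portable C. This is only an illustrative reference, not the library's implementation: it assumes the conversion scales by 2^15 and truncates toward zero (no ROUNDING), uses `int16_t` in place of `q15_t`, and does not handle NaN inputs. The name `ref_float_to_q15` is hypothetical.

```c
#include <stdint.h>

/* Hypothetical reference helper, not a library function.
 * Scales by 2^15, truncates toward zero, saturates to the Q15 range.
 * NaN inputs are not handled. */
void ref_float_to_q15(const float *pSrc, int16_t *pDst, uint32_t blockSize)
{
    for (uint32_t i = 0; i < blockSize; i++) {
        float scaled = pSrc[i] * 32768.0f;

        if (scaled >= 32767.0f) {
            pDst[i] = INT16_MAX;        /* 0x7FFF */
        } else if (scaled <= -32768.0f) {
            pDst[i] = INT16_MIN;        /* 0x8000 */
        } else {
            pDst[i] = (int16_t)scaled;  /* in range: truncate */
        }
    }
}
```

With this sketch, an input of 0.5 maps to 16384, while +1.0 and anything larger saturate to 0x7FFF, because +1.0 itself is not representable in Q15.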

RISCV_DSP_ATTRIBUTE void riscv_float_to_q31 (const float32_t *pSrc, q31_t *pDst, uint32_t blockSize)

Converts the elements of the floating-point vector to Q31 vector.

Details

The equation used for the conversion process is:

    pDst[n] = (q31_t)(pSrc[n] * 2147483648);   0 <= n < blockSize.

Scaling and Overflow Behavior

The function uses saturating arithmetic. Results outside of the allowable Q31 range [0x80000000 0x7FFFFFFF] are saturated.

Note

In order to apply rounding, the library should be rebuilt with the ROUNDING macro defined in the preprocessor section of project options.

Note

If the input float values are very large (2**32 or larger), the function cannot saturate them to the correct values. If you expect such values in the input array, force them to +1 or -1 before calling this function. For reasonable float values (< 2**32), the function saturates correctly.

Parameters
  • pSrc[in] points to the floating-point input vector

  • pDst[out] points to the Q31 output vector

  • blockSize[in] number of samples in each vector
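The pre-clamping recommended in the note above could look like the following sketch. The helper name `clamp_to_unit_range` is hypothetical and not part of the library; it assumes the input buffer is writable and that `float32_t` maps to `float`.

```c
#include <stdint.h>

/* Hypothetical helper, not a library function.
 * Forces every sample into [-1.0, +1.0] in place so that a later
 * conversion to Q31 never sees very large magnitudes. */
void clamp_to_unit_range(float *pBuf, uint32_t blockSize)
{
    for (uint32_t i = 0; i < blockSize; i++) {
        if (pBuf[i] > 1.0f) {
            pBuf[i] = 1.0f;
        } else if (pBuf[i] < -1.0f) {
            pBuf[i] = -1.0f;
        }
    }
}
```

A caller would run `clamp_to_unit_range(buf, n)` first and then pass the same buffer to `riscv_float_to_q31(buf, out, n)`. Samples already inside the range are left untouched, so the extra pass only affects values that would have saturated anyway.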

RISCV_DSP_ATTRIBUTE void riscv_float_to_q7 (const float32_t *pSrc, q7_t *pDst, uint32_t blockSize)

Converts the elements of the floating-point vector to Q7 vector.

Details

The equation used for the conversion process is:

    pDst[n] = (q7_t)(pSrc[n] * 128);   0 <= n < blockSize.

Scaling and Overflow Behavior

The function uses saturating arithmetic. Results outside of the allowable Q7 range [0x80 0x7F] are saturated.

Note

In order to apply rounding, the library should be rebuilt with the ROUNDING macro defined in the preprocessor section of project options.

Parameters
  • pSrc[in] points to the floating-point input vector

  • pDst[out] points to the Q7 output vector

  • blockSize[in] length of the input vector
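To illustrate how rounding can change the result, here is a minimal single-sample sketch. It is an assumption, not a statement of the library's exact behavior: it models a build with ROUNDING defined as round-half-away-from-zero and a build without it as truncation toward zero. The name `ref_float_to_q7` and the `use_rounding` flag are hypothetical, `int8_t` stands in for `q7_t`, and NaN inputs are not handled.

```c
#include <stdint.h>

/* Hypothetical reference helper, not a library function.
 * use_rounding != 0 models a ROUNDING build (assumed: round half
 * away from zero); use_rounding == 0 truncates toward zero. */
int8_t ref_float_to_q7(float x, int use_rounding)
{
    float scaled = x * 128.0f;          /* scale by 2^7 */

    if (use_rounding) {
        scaled += (scaled > 0.0f) ? 0.5f : -0.5f;
    }

    if (scaled >= 127.0f) {
        return INT8_MAX;                /* 0x7F */
    }
    if (scaled <= -128.0f) {
        return INT8_MIN;                /* 0x80 */
    }
    return (int8_t)scaled;              /* in range: truncate */
}
```

For example, an input of 0.0125 scales to about 1.6, which this sketch converts to 1 without rounding and to 2 with rounding.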