NMSIS-DSP  Version 1.3.1
NMSIS DSP Software Library
Complex Dot Product

Computes the dot product of two complex vectors. The vectors are multiplied element-by-element and then summed.

Functions

RISCV_DSP_ATTRIBUTE void riscv_cmplx_dot_prod_f16 (const float16_t *pSrcA, const float16_t *pSrcB, uint32_t numSamples, float16_t *realResult, float16_t *imagResult)
 Floating-point complex dot product.
 
RISCV_DSP_ATTRIBUTE void riscv_cmplx_dot_prod_f32 (const float32_t *pSrcA, const float32_t *pSrcB, uint32_t numSamples, float32_t *realResult, float32_t *imagResult)
 Floating-point complex dot product.
 
RISCV_DSP_ATTRIBUTE void riscv_cmplx_dot_prod_q15 (const q15_t *pSrcA, const q15_t *pSrcB, uint32_t numSamples, q31_t *realResult, q31_t *imagResult)
 Q15 complex dot product.
 
RISCV_DSP_ATTRIBUTE void riscv_cmplx_dot_prod_q31 (const q31_t *pSrcA, const q31_t *pSrcB, uint32_t numSamples, q63_t *realResult, q63_t *imagResult)
 Q31 complex dot product.
 

Detailed Description

Computes the dot product of two complex vectors. The vectors are multiplied element-by-element and then summed.

The pSrcA points to the first complex input vector and pSrcB points to the second complex input vector. numSamples specifies the number of complex samples and the data in each array is stored in an interleaved fashion (real, imag, real, imag, ...). Each array has a total of 2*numSamples values.

The following algorithm is used:

realResult = 0;
imagResult = 0;
for (n = 0; n < numSamples; n++) {
    realResult += pSrcA[(2*n)+0] * pSrcB[(2*n)+0] - pSrcA[(2*n)+1] * pSrcB[(2*n)+1];
    imagResult += pSrcA[(2*n)+0] * pSrcB[(2*n)+1] + pSrcA[(2*n)+1] * pSrcB[(2*n)+0];
}

There are separate functions for the floating-point (f16 and f32), Q15, and Q31 data types.
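As an illustration of the algorithm above, here is a minimal plain-C sketch for interleaved single-precision data. The names ref_cmplx_dot_prod_f32 and ref_cmplx_dot_prod_f32_example are hypothetical helpers written for this page; they are not part of NMSIS-DSP and only mirror the pseudocode, without any of the library's optimizations. Note that neither operand is conjugated.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical reference helper (not an NMSIS-DSP function): a direct
 * transcription of the pseudocode for interleaved (real, imag) data. */
static void ref_cmplx_dot_prod_f32(const float *pSrcA, const float *pSrcB,
                                   uint32_t numSamples,
                                   float *realResult, float *imagResult)
{
    float realSum = 0.0f;
    float imagSum = 0.0f;

    for (uint32_t n = 0; n < numSamples; n++) {
        float a = pSrcA[2 * n];      /* Re(A[n]) */
        float b = pSrcA[2 * n + 1];  /* Im(A[n]) */
        float c = pSrcB[2 * n];      /* Re(B[n]) */
        float d = pSrcB[2 * n + 1];  /* Im(B[n]) */

        realSum += a * c - b * d;    /* real part of A[n] * B[n] */
        imagSum += a * d + b * c;    /* imaginary part of A[n] * B[n] */
    }

    *realResult = realSum;
    *imagResult = imagSum;
}

/* Usage sketch: A = {1+2i, 3+4i}, B = {5+6i, 7+8i}.
 * (1+2i)(5+6i) = -7+16i and (3+4i)(7+8i) = -11+52i, so the sum is -18+68i. */
static void ref_cmplx_dot_prod_f32_example(void)
{
    const float a[4] = {1.0f, 2.0f, 3.0f, 4.0f};
    const float b[4] = {5.0f, 6.0f, 7.0f, 8.0f};
    float re, im;

    ref_cmplx_dot_prod_f32(a, b, 2, &re, &im);
    assert(re == -18.0f);
    assert(im == 68.0f);
}
```

Each input array holds 2*numSamples values, so the two-sample example uses arrays of four floats.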

Function Documentation

◆ riscv_cmplx_dot_prod_f16()

RISCV_DSP_ATTRIBUTE void riscv_cmplx_dot_prod_f16 ( const float16_t *  pSrcA,
const float16_t *  pSrcB,
uint32_t  numSamples,
float16_t *  realResult,
float16_t *  imagResult 
)

Floating-point complex dot product.

Parameters
[in]   pSrcA       points to the first input vector
[in]   pSrcB       points to the second input vector
[in]   numSamples  number of samples in each vector
[out]  realResult  real part of the result returned here
[out]  imagResult  imaginary part of the result returned here

◆ riscv_cmplx_dot_prod_f32()

RISCV_DSP_ATTRIBUTE void riscv_cmplx_dot_prod_f32 ( const float32_t *  pSrcA,
const float32_t *  pSrcB,
uint32_t  numSamples,
float32_t *  realResult,
float32_t *  imagResult 
)

Floating-point complex dot product.

Parameters
[in]   pSrcA       points to the first input vector
[in]   pSrcB       points to the second input vector
[in]   numSamples  number of samples in each vector
[out]  realResult  real part of the result returned here
[out]  imagResult  imaginary part of the result returned here

◆ riscv_cmplx_dot_prod_q15()

RISCV_DSP_ATTRIBUTE void riscv_cmplx_dot_prod_q15 ( const q15_t *  pSrcA,
const q15_t *  pSrcB,
uint32_t  numSamples,
q31_t *  realResult,
q31_t *  imagResult 
)

Q15 complex dot product.

Parameters
[in]   pSrcA       points to the first input vector
[in]   pSrcB       points to the second input vector
[in]   numSamples  number of samples in each vector
[out]  realResult  real part of the result returned here
[out]  imagResult  imaginary part of the result returned here

Scaling and Overflow Behavior
The function is implemented using an internal 64-bit accumulator. The intermediate 1.15 by 1.15 multiplications are performed with full precision and yield a 2.30 result. These are accumulated in a 64-bit accumulator with 34.30 precision. As a final step, the accumulators are converted to 8.24 format. The return results realResult and imagResult are in 8.24 format.
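The scaling described above can be sketched in plain C as follows. This is an illustrative model, not the library source: ref_cmplx_dot_prod_q15 and its example function are hypothetical names, and the final conversion from 34.30 to 8.24 is assumed to be a right shift by 6 bits (30 fractional bits down to 24).

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of the Q15 scaling (not an NMSIS-DSP function).
 * Inputs are 1.15, each product is 2.30, the accumulators are 34.30,
 * and the outputs are 8.24. */
static void ref_cmplx_dot_prod_q15(const int16_t *pSrcA, const int16_t *pSrcB,
                                   uint32_t numSamples,
                                   int32_t *realResult, int32_t *imagResult)
{
    int64_t realSum = 0;  /* 34.30 accumulator */
    int64_t imagSum = 0;  /* 34.30 accumulator */

    for (uint32_t n = 0; n < numSamples; n++) {
        int64_t a = pSrcA[2 * n];      /* Re(A[n]) */
        int64_t b = pSrcA[2 * n + 1];  /* Im(A[n]) */
        int64_t c = pSrcB[2 * n];      /* Re(B[n]) */
        int64_t d = pSrcB[2 * n + 1];  /* Im(B[n]) */

        realSum += a * c - b * d;      /* 2.30 products, full precision */
        imagSum += a * d + b * c;
    }

    /* Assumed conversion: 34.30 -> 8.24 by dropping 6 fractional bits.
     * Right-shifting a negative value is implementation-defined in C;
     * common compilers perform an arithmetic shift. */
    *realResult = (int32_t)(realSum >> 6);
    *imagResult = (int32_t)(imagSum >> 6);
}

/* Usage sketch: (0.5 + 0.25i) * (0.5 + 0.5i) = 0.125 + 0.375i.
 * In Q15, 0.5 is 16384 and 0.25 is 8192; in 8.24, 0.125 is 2097152
 * and 0.375 is 6291456. */
static void ref_cmplx_dot_prod_q15_example(void)
{
    const int16_t a[2] = {16384, 8192};
    const int16_t b[2] = {16384, 16384};
    int32_t re, im;

    ref_cmplx_dot_prod_q15(a, b, 1, &re, &im);
    assert(re == 2097152);
    assert(im == 6291456);
}
```

To read a result back as a real number, divide the q31_t output by 2^24 rather than 2^31, since the outputs are in 8.24 format.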

◆ riscv_cmplx_dot_prod_q31()

RISCV_DSP_ATTRIBUTE void riscv_cmplx_dot_prod_q31 ( const q31_t *  pSrcA,
const q31_t *  pSrcB,
uint32_t  numSamples,
q63_t *  realResult,
q63_t *  imagResult 
)

Q31 complex dot product.

Parameters
[in]   pSrcA       points to the first input vector
[in]   pSrcB       points to the second input vector
[in]   numSamples  number of samples in each vector
[out]  realResult  real part of the result returned here
[out]  imagResult  imaginary part of the result returned here

Scaling and Overflow Behavior
The function is implemented using an internal 64-bit accumulator. The intermediate 1.31 by 1.31 multiplications are performed with 64-bit precision and then shifted to 16.48 format. The internal real and imaginary accumulators are in 16.48 format and provide 15 guard bits. Additions are nonsaturating and no overflow will occur as long as numSamples is less than 32768. The return results realResult and imagResult are in 16.48 format. Input down scaling is not required.
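A plain-C sketch of this scaling is given below. It is an illustrative model rather than the library source: ref_cmplx_dot_prod_q31 and its example function are hypothetical names, and the shift of each 64-bit product into 16.48 format is assumed to be a right shift by 14 bits (62 fractional bits down to 48).

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of the Q31 scaling (not an NMSIS-DSP function).
 * Inputs are 1.31, each 64-bit product is 2.62, and every product is
 * shifted to 16.48 before it is accumulated. */
static void ref_cmplx_dot_prod_q31(const int32_t *pSrcA, const int32_t *pSrcB,
                                   uint32_t numSamples,
                                   int64_t *realResult, int64_t *imagResult)
{
    int64_t realSum = 0;  /* 16.48 accumulator */
    int64_t imagSum = 0;  /* 16.48 accumulator */

    for (uint32_t n = 0; n < numSamples; n++) {
        int64_t a = pSrcA[2 * n];      /* Re(A[n]) */
        int64_t b = pSrcA[2 * n + 1];  /* Im(A[n]) */
        int64_t c = pSrcB[2 * n];      /* Re(B[n]) */
        int64_t d = pSrcB[2 * n + 1];  /* Im(B[n]) */

        /* Assumed shift: 2.62 -> 16.48 by dropping 14 fractional bits.
         * Right-shifting a negative value is implementation-defined in C;
         * common compilers perform an arithmetic shift. */
        realSum += (a * c) >> 14;
        realSum -= (b * d) >> 14;
        imagSum += (a * d) >> 14;
        imagSum += (b * c) >> 14;
    }

    *realResult = realSum;
    *imagResult = imagSum;
}

/* Usage sketch: (0.5 + 0.25i) * (0.5 + 0.5i) = 0.125 + 0.375i.
 * In Q31, 0.5 is 2^30 and 0.25 is 2^29; in 16.48, 0.125 is 2^45
 * and 0.375 is 3 * 2^45. */
static void ref_cmplx_dot_prod_q31_example(void)
{
    const int32_t a[2] = {1 << 30, 1 << 29};
    const int32_t b[2] = {1 << 30, 1 << 30};
    int64_t re, im;

    ref_cmplx_dot_prod_q31(a, b, 1, &re, &im);
    assert(re == ((int64_t)1 << 45));
    assert(im == 3 * ((int64_t)1 << 45));
}
```

To read a result back as a real number, divide the q63_t output by 2^48, since the outputs are in 16.48 format.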