The QN9080 integrates an Arm Cortex-M4F core, so DSP instructions are available for signal-processing algorithms.
The QN9080 also integrates a co-processor called the Fusion Signal Processor (FSP), which is well suited to running algorithms on sensor data.
Thanks to the FSP, sensor algorithm processing can run in parallel with a high-priority task such as Bluetooth stack processing. The FSP is also much faster than the Cortex-M4F DSP instructions.
How fast is it, then?
OK, let me show you the performance of the QN9080 Fusion Signal Processor.
What processing is done on sensor data?
In general, what do you need in order to use sensor data?
Sensor data is basically analog. It is usually converted to a digital signal by an analog-to-digital converter (ADC).
However, the converted digital signal cannot be used as-is because it contains a lot of noise.
The noise needs to be filtered out.
Besides filtering out high-frequency sensor noise, you might also need, for example, an integral computation when you calculate speed from accelerometer data.
Beyond filtering, integration and differentiation, frequency analysis quite often uses the FFT (Fast Fourier Transform).
Algorithms such as filtering and the FFT quite often rely on mathematical functions like sine, cosine and power, as well as matrix and floating-point calculations.
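To make the filtering and integration steps above a bit more concrete, here is a minimal sketch in plain C. It is only an illustration: the function names, the first-order (exponential) low-pass filter and the trapezoidal rule are my own choices for the example, not something taken from the QN9080 SDK.

```c
#include <stddef.h>

/* First-order (exponential) low-pass filter, applied in place.
 * alpha is expected in (0, 1]; smaller values smooth more strongly.
 * Hypothetical helper for illustration only. */
static void low_pass_filter(float *samples, size_t count, float alpha)
{
    if (count == 0) {
        return;
    }
    float state = samples[0];               /* start from the first sample */
    for (size_t i = 0; i < count; i++) {
        state += alpha * (samples[i] - state);
        samples[i] = state;
    }
}

/* Trapezoidal integration of acceleration (m/s^2) sampled every dt seconds.
 * Returns the change in velocity (m/s) over the whole buffer. */
static float integrate_acceleration(const float *accel, size_t count, float dt)
{
    float velocity = 0.0f;
    for (size_t i = 1; i < count; i++) {
        velocity += 0.5f * (accel[i - 1] + accel[i]) * dt;
    }
    return velocity;
}
```

In a real application you would typically filter the raw ADC samples first and then integrate the filtered buffer, for example low_pass_filter(buf, n, 0.2f) followed by integrate_acceleration(buf, n, dt).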
There is CMSIS-DSP, too!

For mathematical functions like sine, cosine and power, you can use the built-in functions of the C standard library, which are declared in math.h.
When the device has an Arm Cortex-M core, you can also use the CMSIS-DSP library instead of the C standard math library.
CMSIS-DSP is already optimized for Cortex-M cores
What is CMSIS-DSP? It is a library of common signal processing functions optimized for Cortex-M cores, and it runs faster than the C standard math library (math.h).
If DSP instructions and an FPU are not available, as on a Cortex-M0 core, the code size is of course bigger and the processing speed slower than on an M3/M4F.
Then which is faster: the FSP, or the Cortex-M4F using CMSIS-DSP?
First, I will explain how to use CMSIS-DSP on a Cortex-Mx core and how to use the Fusion Signal Processor.
Then I will get both of them working and benchmark them.
How to use CMSIS-DSP
1. Include arm_math.h

To use mathematical functions like sine and cosine from the C standard math library, you need to include math.h.
To use the mathematical functions available in CMSIS-DSP, you need to include arm_math.h instead. It is quite easy!
2. Put the CMSIS-DSP library in the project

The CMSIS-DSP library itself is stored in the folder below.
SDK package folder/libs/arm_cortexm4lf_math.a
Specify the library under Option-Property -> C/C++ Build -> MCU Linker -> Libraries (see the picture above). Enter the library name "arm_cortexm4lf_math", without the .a extension.
For the library search path, enter the path of the CMSIS-DSP library above. You can again refer to the picture.
3. Enable the hardware floating-point unit (FPU)

The Cortex-M4F core has a hardware floating-point unit (FPU). To use it, you first need to enable it in the build settings.
Set "Architecture" to "FPv4-SP (Hard ABI)" to enable it.
Make the same setting in the Assembler and Linker settings.
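Once the three steps above are done, calling a CMSIS-DSP function looks roughly like the sketch below. It uses arm_mean_f32() from CMSIS-DSP to average a block of samples. The USE_CMSIS_DSP macro and the block_mean() wrapper are names I made up for this example; when the macro is not defined, a plain C loop is used so that the same code also builds on a PC.

```c
#include <stdint.h>

#ifdef USE_CMSIS_DSP                /* hypothetical build switch for this sketch */
#include "arm_math.h"               /* CMSIS-DSP: float32_t, arm_mean_f32() */
#else
typedef float float32_t;            /* fallback so the sketch builds without CMSIS-DSP */
#endif

/* Mean of a block of sensor samples (hypothetical wrapper).
 * The pointer is non-const because older CMSIS-DSP releases declare
 * arm_mean_f32() with a non-const source pointer. */
static float32_t block_mean(float32_t *samples, uint32_t count)
{
    float32_t mean = 0.0f;
#ifdef USE_CMSIS_DSP
    arm_mean_f32(samples, count, &mean);    /* optimized CMSIS-DSP routine */
#else
    for (uint32_t i = 0; i < count; i++) {  /* plain C reference path */
        mean += samples[i];
    }
    if (count > 0) {
        mean /= (float32_t)count;
    }
#endif
    return mean;
}
```

Other CMSIS-DSP functions such as arm_sin_f32() or arm_power_f32() are called in the same way after including arm_math.h.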
How to use Fusion Signal Processor (FSP)

All you have to do to use the FSP is include fsl_fsp.h.
Unlike CMSIS-DSP, no additional library needs to be added to the project.
You can now call the FSP API, the FSP_xxx() functions.
Because of this, porting from CMSIS-DSP to the FSP math functions is relatively easy.
For detailed explanations of the FSP functions, see the FSP documentation.
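One way to keep such a port easy is to hide the backend behind a small wrapper, as in the sketch below. Please treat it as an assumption-laden illustration: the USE_FSP and USE_CMSIS_DSP macros and the sum_of_squares() wrapper are my own names, and the FSP branch simply reuses the two calls and the s_se_done completion flag that appear in the SDK sample in the Benchmarking section. It assumes DEMO_FSP_BASE, s_se_done and the FSP interrupt handling are set up elsewhere, as in that sample.

```c
#include <stdint.h>

#if defined(USE_FSP)                /* hypothetical build switch */
#include "fsl_fsp.h"
/* Assumption: DEMO_FSP_BASE and the completion flag s_se_done are defined
 * elsewhere in the project, as in the SDK sample, and the flag is set
 * when the FSP signals that the calculation has finished. */
#elif defined(USE_CMSIS_DSP)        /* hypothetical build switch */
#include "arm_math.h"
#endif

/* Sum of the squares of count samples, on whichever backend is selected. */
static float sum_of_squares(float *input, uint32_t count)
{
    float result = 0.0f;
#if defined(USE_FSP)
    s_se_done = 0;
    FSP_PowerIntF32(DEMO_FSP_BASE, input, count);   /* start on the FSP */
    while (!s_se_done) {
        /* the Cortex-M4F could do other work here instead of spinning */
    }
    FSP_GetPowerIntResultF32(DEMO_FSP_BASE, &result);
#elif defined(USE_CMSIS_DSP)
    arm_power_f32(input, count, &result);           /* CMSIS-DSP on the core */
#else
    for (uint32_t i = 0; i < count; i++) {          /* plain C reference path */
        result += input[i] * input[i];
    }
#endif
    return result;
}
```

With a wrapper like this, the calling code stays the same and only the build switch decides whether the work runs on the FSP or on the Cortex-M4F.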
Benchmarking
FSP sample code is available in the QN9080 SDK, which you can download from the MCUXpresso SDK web page (https://www.nxp.com/mcuxpresso).
The sample code runs FFT, matrix and power calculations and measures the cycle count for each case.
Here is the sample code. Only the power calculation part is shown below.
void power_example(void)
{
    uint32_t fsp_cycles, mcu_cycles;

    // Fill the first 256 input samples (the calls below process 1024 elements)
    for (uint32_t i = 0; i < 256; i++)
    {
        power_acc_input[i] = 0.1 * i;
    }
    s_se_done = 0;

    // FSP: start the calculation, wait until s_se_done is set, then read the result
    START_COUNTING();
    FSP_PowerIntF32(DEMO_FSP_BASE, power_acc_input, 1024);
    while (!s_se_done)
    {
    }
    FSP_GetPowerIntResultF32(DEMO_FSP_BASE, &power_acc_fsp_output);
    fsp_cycles = GET_CYCLES();

    // MCU: the same calculation with CMSIS-DSP on the Cortex-M4F
    START_COUNTING();
    arm_power_f32(power_acc_input, 1024, &power_acc_mcu_output);
    mcu_cycles = GET_CYCLES();

    PRINTF("-- Sum of the squares of the 1024 elements\r\n");
    PRINTF("FSP %d cycles, MCU %d cycles\r\n", fsp_cycles, mcu_cycles);
}
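The excerpt does not show how START_COUNTING() and GET_CYCLES() are defined. One common way to count cycles on a Cortex-M4 is the DWT cycle counter, sketched below. This is my assumption of a possible approach, not the SDK's actual definition: the function names and the USE_DWT_CYCCNT switch are made up, and I assume that fsl_device_registers.h pulls in the CMSIS-Core definitions of CoreDebug and DWT. Without the switch, the sketch falls back to clock() so it can be tried on a PC.

```c
#include <stdint.h>

#ifdef USE_DWT_CYCCNT                       /* hypothetical build switch */
#include "fsl_device_registers.h"           /* assumed to provide CoreDebug and DWT */

static void cycle_counter_start(void)
{
    CoreDebug->DEMCR |= CoreDebug_DEMCR_TRCENA_Msk; /* enable the trace/debug block */
    DWT->CYCCNT = 0;                                /* reset the cycle counter */
    DWT->CTRL |= DWT_CTRL_CYCCNTENA_Msk;            /* start counting core cycles */
}

static uint32_t cycle_counter_read(void)
{
    return DWT->CYCCNT;                     /* cycles since cycle_counter_start() */
}
#else
#include <time.h>

static clock_t s_start_ticks;               /* host fallback: clock() ticks, not cycles */

static void cycle_counter_start(void)
{
    s_start_ticks = clock();
}

static uint32_t cycle_counter_read(void)
{
    return (uint32_t)(clock() - s_start_ticks);
}
#endif
```

Whatever counter is used, the important point is that both measurements use the same one, so that the FSP and MCU cycle counts can be compared directly.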
Measurement result

The results show that the FSP is much faster than the Cortex-M4F using CMSIS-DSP in each calculation.
In particular, it is about 7 times faster for the FFT and about 6 times faster for the power calculation.
Because the FSP is a co-processor alongside the Cortex-M4F, it can also offload this work from the core.
Summary
This time I measured the performance of the Fusion Signal Processor, which turned out to be much faster than the Cortex-M4F core using CMSIS-DSP: about 6 to 7 times faster for the FFT and power calculations.
Small wearable devices require low power, BLE wireless connectivity and sensor algorithm processing. The QN9080 meets all of these requirements.