CMSIS-DSP
About
CMSIS-DSP is an optimized compute library for embedded systems (DSP is in the name for legacy reasons).
It provides optimized compute kernels for Cortex-M and for Cortex-A.
On Cortex-M, different variants are available depending on the core, and most functions use a vectorized implementation when the Helium extension is available.
The latest documentation is available at https://arm-software.github.io/CMSIS-DSP/main.
Kernels
Kernels provided by CMSIS-DSP (list not exhaustive):
- Basic mathematics (real, complex, quaternion, linear algebra, fast math functions)
- DSP (filtering)
- Transforms (FFT, MFCC, DCT)
- Statistics
- Classical ML (Support Vector Machine, Distance functions for clustering ...)
Kernels are provided for several datatypes: f64, f32, f16, q31, q15, q7.
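To make the fixed-point datatypes more concrete, here is a minimal sketch in plain Python of how a q7, q15 or q31 value relates to a float: the value is scaled by 2^7, 2^15 or 2^31 and saturated to the signed integer range. The helper names float_to_q and q_to_float are hypothetical and are not part of CMSIS-DSP. The truncation toward zero used here is an assumption, so results may not be bit-exact with the library's own conversion functions.

```python
def float_to_q(value, frac_bits):
    """Convert a float to a signed fixed-point integer with frac_bits fractional bits.

    Hypothetical helper for illustration only (not a CMSIS-DSP function).
    frac_bits is 7 for q7, 15 for q15 and 31 for q31.
    """
    scaled = int(value * (1 << frac_bits))  # int() truncates toward zero (assumption)
    lowest = -(1 << frac_bits)
    highest = (1 << frac_bits) - 1
    return max(lowest, min(highest, scaled))  # saturate instead of wrapping around


def q_to_float(raw, frac_bits):
    """Convert a fixed-point integer back to a float."""
    return raw / (1 << frac_bits)


print(float_to_q(0.5, 15))      # 0.5 in q15
print(float_to_q(1.0, 15))      # 1.0 is not representable and saturates
print(q_to_float(16384, 15))    # back to float
```

Saturation matters because +1.0 has no exact representation in these formats: the largest q15 value is slightly below 1.0.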
Python wrapper
A PythonWrapper is also available and can be installed with:
pip install cmsisdsp
With this wrapper you can design your algorithm in Python using an API as close as possible to the C API. The wrapper is compatible with NumPy and supports fixed-point arithmetic.
The goal is to make it easier to move from a design to a final implementation in C.
Synchronous Data Flow
CMSIS-DSP also provides an experimental synchronous data flow scheduler:
- You define your compute graph in Python
- A static schedule is generated (computed by the Python script)
- The static schedule can be run on the device
The scripts for the synchronous data flow (SDF) are part of the CMSIS-DSP Python wrapper.
The SDF makes it easier to implement a streaming solution: connecting different compute kernels, each consuming and producing a different amount of data.
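The idea of a static schedule can be sketched for the simplest possible graph: one producer feeding one consumer through a FIFO. The code below is a generic illustration of synchronous data flow scheduling, not the CMSIS-DSP scheduler and not its API. The function name two_node_schedule and the node names "source" and "sink" are hypothetical.

```python
from math import gcd


def two_node_schedule(produced, consumed):
    """Return one period of a static schedule for source -> FIFO -> sink.

    produced: samples written by the source each time it runs
    consumed: samples read by the sink each time it runs
    Hypothetical helper for illustration only.
    """
    # Balance equation: src_runs * produced == sink_runs * consumed
    common = produced * consumed // gcd(produced, consumed)
    src_runs = common // produced
    sink_runs = common // consumed

    schedule = []
    fifo = 0  # number of samples currently waiting in the FIFO
    while src_runs or sink_runs:
        if sink_runs and fifo >= consumed:
            # Enough data is available: run the sink as early as possible
            fifo -= consumed
            sink_runs -= 1
            schedule.append("sink")
        else:
            fifo += produced
            src_runs -= 1
            schedule.append("source")
    return schedule


print(two_node_schedule(5, 7))
```

After one period the FIFO is empty again, so the same sequence can be repeated forever on the device without any dynamic scheduling decision.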
Support / Contact
For any questions or to reach the CMSIS-DSP team, please create a new issue in https://github.com/ARM-software/CMSIS-DSP/issues
Building for speed
CMSIS-DSP is used when you need performance. As a consequence, CMSIS-DSP should be compiled with the options giving the best performance:
Options to use
- -Ofast must be used for best performance.
- When using Helium, it is strongly advised to use -Ofast.
When floats are used, the FPU should be selected to ensure that the compiler does not fall back to software float emulation.
When building with Helium support, it will be automatically detected by CMSIS-DSP. For Neon, it is not the case and you must enable the option -DARM_MATH_NEON for the C compilation. With cmake this option is controlled with -DNEON=ON.
- -DLOOPUNROLL=ON can also be used when compiling with cmake.
- It corresponds to the C option -DARM_MATH_LOOPUNROLL.
Compilers already perform loop unrolling, so this option may not be needed, but this is highly dependent on the compiler. With some compilers, this option is needed to get better performance.
Memory speed is important. If you can, map the data and the constant tables used by CMSIS-DSP to DTCM memory. If you have a cache, enable it.
Options to avoid
- -fno-builtin
- -ffreestanding, because it enables the previous option (-fno-builtin)
The library does some type punning to process a 32-bit word from memory as a pair of q15 or a quadruple of q7. Those type manipulations are done through memcpy calls. Most compilers should be able to optimize out those function calls when the length to copy is small (4 bytes).
This optimization will not occur when -fno-builtin is used, which has a very bad impact on performance.
Some compilers may also require the option -munaligned-access to specify that unaligned accesses are used.
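To visualize what this reinterpretation means, the following Python sketch unpacks the same 4 bytes once as two q15 values and once as four q7 values. It only illustrates the data layout; it is not library code, the function name split_word is hypothetical, and a little-endian byte order is assumed.

```python
import struct


def split_word(word):
    """View one unsigned 32-bit word as two q15 values and as four q7 values.

    Hypothetical helper for illustration only; assumes little-endian layout.
    """
    raw = struct.pack("<I", word)         # the 4 bytes as they sit in memory
    q15_pair = struct.unpack("<2h", raw)  # two signed 16-bit values
    q7_quad = struct.unpack("<4b", raw)   # four signed 8-bit values
    return q15_pair, q7_quad


pair, quad = split_word(0x80017FFF)
print(pair)
print(quad)
```

In C, the same effect is obtained by copying 4 bytes with memcpy into a 32-bit variable, which is why it matters that the compiler can replace such a small memcpy with a single load.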
How to build
The standard way to build is through the Open CMSIS-Pack included in the repository (or available in your IDE).
But cmake can also be used.
How to build CMSIS-DSP with cmake
Create a CMakeLists.txt and inside add a project.
Add CMSIS-DSP as a subdirectory. In the example below, the variable CMSISDSP is the path to the CMSIS-DSP repository.
cmake_minimum_required (VERSION 3.14)
# Define the project
project (testcmsisdsp VERSION 0.1)
add_subdirectory(${CMSISDSP}/Source bin_dsp)
CMSIS-DSP is dependent on the CMSIS Core includes. So, you should use a target_include_directories to define where the CMSIS_5\CMSIS\Core\Include is located. Or you can also define CMSISCORE on the cmake command line.
You should also set the compilation options to use to build the library.
You can also rely on the CMSIS-DSP test framework. The framework will analyze the ARM_CPU option and deduce from its values the compilation options and the location of the CMSIS Core includes.
Note that the test framework only supports a subset of all the cores.
The following lines:
- Add the test framework to the cmake module path
- Load the module configLib
- Define the compilation options and core includes using the configLib function
configLib requires the variable CMSIS to be defined (on the cmake command line) with the path to the CMSIS repository.
list(APPEND CMAKE_MODULE_PATH "${CMSISDSP}/Testing")
include(configLib)
configLib(CMSISDSP)
A typical cmake command (when using CMSIS-DSP test framework) may be:
cmake -DCMAKE_PREFIX_PATH="path to compiler" \
-DCMAKE_TOOLCHAIN_FILE="path_to_cmsisdsp/Testing/armac6.cmake" \
-DARM_CPU="cortex-m55" \
-DLOOPUNROLL=ON \
-DCMSISDSP="path_to_cmsisdsp" \
-DCMSIS="path_to_cmsis" \
-DCMAKE_C_FLAGS_RELEASE="-std=c11 -Ofast -ffast-math -DNDEBUG -Wall -Wextra" \
-DCMAKE_CXX_FLAGS_RELEASE="-fno-rtti -std=c++11 -Ofast -ffast-math -DNDEBUG -Wall -Wextra -Wno-unused-parameter" \
-DHELIUM=ON \
-G "Unix Makefiles" ..
It is also possible to build on the host PC:
cmake -DHOST=YES \
-DLOOPUNROLL=ON \
-DCMSISDSP="path_to_cmsisdsp" \
-DCMSIS="path_to_cmsis" \
-DCMAKE_C_FLAGS_RELEASE="-std=c11 -Ofast -ffast-math -DNDEBUG -Wall -Wextra" \
-DCMAKE_CXX_FLAGS_RELEASE="-fno-rtti -std=c++11 -Ofast -ffast-math -DNDEBUG -Wall -Wextra -Wno-unused-parameter" \
-G "Unix Makefiles" ..
How to build CMSIS-DSP examples with cmake and CMSIS-DSP test framework
Building the examples with cmake is similar to building the CMSIS-DSP library alone, except that the examples also rely on the CMSIS-DSP test framework for the boot code.
In addition to the CMSIS variable, the variable CMSISDSPFOLDER must also be defined.
Examples/CMakeLists.txt can be used to build all the examples.
A possible cmake command may be:
cmake -DLOOPUNROLL=ON \
-DMATRIXCHECK=ON \
-DCMAKE_BUILD_TYPE=Release \
-DCMAKE_C_FLAGS_RELEASE="-Ofast -ffast-math -DNDEBUG -Wall -Wextra " \
-DCMAKE_CXX_FLAGS_RELEASE="-fno-rtti -Ofast -ffast-math -DNDEBUG -Wall -Wextra" \
-DCMAKE_PREFIX_PATH="path to compiler" \
-DCMAKE_TOOLCHAIN_FILE=../../Testing/armac6.cmake \
-DCMSIS="path_to_cmsis" \
-DCMSISDSPFOLDER="path_to_cmsisdsp" \
-DARM_CPU="cortex-m55" \
-DPLATFORM="FVP" \
-DHELIUM=ON \
-DFLOAT16=OFF \
-DBASICMATH=ON \
-DCOMPLEXMATH=ON \
-DQUATERNIONMATH=ON \
-DCONTROLLER=ON \
-DFASTMATH=ON \
-DFILTERING=ON \
-DMATRIX=ON \
-DSTATISTICS=ON \
-DSUPPORT=ON \
-DTRANSFORM=ON \
-DSVM=ON \
-DBAYES=ON \
-DDISTANCE=ON \
-DINTERPOLATION=ON \
-G "Unix Makefiles" "../ARM"
Building
Once cmake has generated the makefiles, you can use GNU Make to build.
make VERBOSE=1
Running
The generated executable can be run on a fast model. For instance, if you built for m7, you could just do:
FVP_MPS2_Cortex-M7.exe -a arm_variance_example
The final executable has no extension in the filename.
Folders and files
Folders
- cmsisdsp:
  - Required to build the CMSIS-DSP PythonWrapper for the Python repository
  - It contains all Python packages
- ComputeLibrary:
  - Some kernels required when building CMSIS-DSP with Neon acceleration
- Examples:
  - Examples of use of CMSIS-DSP
- Include:
  - Include files for CMSIS-DSP
- PrivateInclude:
  - Some includes needed to build CMSIS-DSP
- PythonWrapper:
  - C code for the CMSIS-DSP PythonWrapper
  - Examples for the PythonWrapper
- Scripts:
  - Debugging scripts
  - Script to generate some coefficient tables used by CMSIS-DSP
- SDFTools:
  - Examples for the Synchronous Data Flow
  - C++ templates for the Synchronous Data Flow
- Source:
  - CMSIS-DSP source
- Testing:
  - CMSIS-DSP Test framework
Files
Some files are needed to generate the PythonWrapper:
- PythonWrapper_README.md
- LICENSE.txt
- MANIFEST.in
- pyproject.toml
- setup.py
And we have a script to make it easier to customize the build:
- cmsisdspconfig.py:
  - Web browser UI to generate build configurations (temporary until the CMSIS-DSP configuration is reworked to be simpler and more maintainable)
Compilation symbols for tables
Some new compilation symbols have been introduced to avoid including all the tables if they are not needed.
If no new symbol is defined, everything will behave as usual. If ARM_DSP_CONFIG_TABLES is defined then the new symbols will be taken into account.
It is strongly suggested to use the new Python script cmsisdspconfig.py to generate the -D options to use on the compiler command line.
pip install streamlit
streamlit run cmsisdspconfig.py
If you use cmake, it is also easy since high level options are defined and they will select the right compilation symbols.
For instance, if you want to use the arm_rfft_fast_f32, in fft.cmake you'll see an option RFFT_FAST_F32_32.
If you don't use cmake nor the Python script, you can just look at fft.cmake or interpol.cmake in Source to see which compilation symbols are needed.
We see, for arm_rfft_fast_f32, that the following symbols need to be enabled:
- ARM_TABLE_TWIDDLECOEF_F32_16
- ARM_TABLE_BITREVIDX_FLT_16
- ARM_TABLE_TWIDDLECOEF_RFFT_F32_32
- ARM_TABLE_TWIDDLECOEF_F32_16
In addition to that, ARM_DSP_CONFIG_TABLES must be enabled and finally ARM_FFT_ALLOW_TABLES must also be defined.
This last symbol is required because if no transform functions are included in the build, then by default all flags related to FFT tables are ignored.
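As a rough sketch of what the resulting compiler command line looks like, the Python snippet below assembles the -D options for the arm_rfft_fast_f32 case described above. It is not a replacement for cmsisdspconfig.py: the function name table_defines and the variable RFFT_FAST_F32_32_TABLES are hypothetical, and only the symbols named in this section are used.

```python
def table_defines(table_symbols):
    """Build the -D options needed to include only the given tables.

    Hypothetical helper for illustration only.
    """
    # These two symbols are always required when selecting FFT tables
    symbols = ["ARM_DSP_CONFIG_TABLES", "ARM_FFT_ALLOW_TABLES"]
    for name in table_symbols:
        if name not in symbols:  # skip symbols listed more than once
            symbols.append(name)
    return ["-D" + name for name in symbols]


# Table symbols listed in this section for arm_rfft_fast_f32 (RFFT_FAST_F32_32)
RFFT_FAST_F32_32_TABLES = [
    "ARM_TABLE_TWIDDLECOEF_F32_16",
    "ARM_TABLE_BITREVIDX_FLT_16",
    "ARM_TABLE_TWIDDLECOEF_RFFT_F32_32",
    "ARM_TABLE_TWIDDLECOEF_F32_16",
]

print(" ".join(table_defines(RFFT_FAST_F32_32_TABLES)))
```

The printed string can be appended to the C compiler flags when neither cmake nor the configuration script is used.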
Bit Reverse Tables for FFTs in CMSIS DSP
This is a frequently asked question.
It is now detailed in this github issue
Someone from the community has written a Python script to help