Added example for use of CMSIS-DSP in computegraph and cyclo-static scheduling

pull/94/head
Christophe Favergeon 3 years ago
parent 4415ded3fb
commit 981917584c

@ -1,5 +1,7 @@
# Dynamic Data Flow
This feature is illustrated in the [Example 10 : The dynamic dataflow mode](examples/example10/README.md)
Versions of the compute graph corresponding to CMSIS-DSP version >= `1.14.3` and Python wrapper version >= `1.10.0` support a new dynamic / asynchronous mode.
With a dynamic flow, the flow of data can change at each execution. The IOs can generate or consume a different amount of data at each execution of their node (including no data).

@ -1,5 +1,7 @@
# Cyclo static scheduling
This feature is illustrated in the [cyclo](examples/cyclo/README.md) example.
Cyclo static scheduling is available from version `1.7.0` of the Python wrapper and version `1.12` of CMSIS-DSP.
## What is the problem it is trying to solve ?

@ -4,7 +4,11 @@
1. ### [Introduction](Introduction.md)
2. ### [How to get started](examples/simple/README.md)
2. ### How to get started
1. [Simple graph creation example](examples/simple/README.md)
2. [Simple graph creation example with CMSIS-DSP](examples/simpledsp/README.md)
3. ### [Examples](examples/README.md)

@ -1,6 +1,8 @@
# Generic Nodes
# Generic and function nodes
The generic node classes are used to build new kind of nodes. There are 3 classes provided by the framework :
The generic and function nodes are the basic nodes that you use to create other kinds of nodes in the graph.
There are 3 generic classes provided by the framework to be used to create new nodes :
* `GenericSource`
* `GenericNode`
@ -8,6 +10,14 @@ The generic node classes are used to build new kind of nodes. There are 3 classe
They are defined in `cmsisdsp.cg.scheduler`
There are 3 other classes that can be used to create new nodes from functions:
* `Unary`
* `Binary`
* `Dsp`
## Generic Nodes
Any new kind of node must inherit from one of those classes. These classes provide the methods `addInput` and/or `addOutput` to define new IOs.
The method `typeName` from the parent class must be overridden.
@ -28,7 +38,7 @@ class ProcessingNode(GenericNode):
See the [simple](../examples/simple/README.md) example for more explanation about how to define a new node.
## Methods
### Methods
The constructor of the node is using the `addInput` and/or `addOutput` to define new IOs.
@ -56,7 +66,7 @@ def typeName(self):
This method defines the name of the C++ class implementing the wrapper for this node.
## Datatypes
### Datatypes
Datatypes for the IOs are inheriting from `CGStaticType`.
@ -65,7 +75,7 @@ Currently there are two classes defined:
* `CType` for the standard CMSIS-DSP types
* `CStructType` for a C struct
### CType
#### CType
You create such a type with `CType(id)` where `id` is one of the constant coming from the Python wrapper:
@ -84,7 +94,7 @@ You create such a type with `CType(id)` where `id` is one of the constant coming
For instance, to define a `float32_t` type for an IO you can use `CType(F32)`
### CStructType
#### CStructType
The constructor has the following definition
@ -100,4 +110,20 @@ In Python, there is no `struct`. This datatype is mapped to an object. Object ha
As a consequence, on the Python side you should never copy those structs, since that would only copy the reference. You should instead copy the members of the struct.
If you don't plan on generating a Python scheduler, you can just use whatever name you want for the `python_name`. It will be ignored by the C++ code generation.
If you don't plan on generating a Python scheduler, you can just use whatever name you want for the `python_name`. It will be ignored by the C++ code generation.
## Function and constant nodes
A Compute graph C++ wrapper is useful when the software components you use have a state that needs to be initialized in the C++ constructor, and preserved between successive calls to the `run` method of the wrapper.
Most CMSIS-DSP functions have no state. The compute graph framework provides ways to use such functions in the graph without having to write a wrapper.
This feature relies on the nodes:
* `Unary`
* `Binary`
* `Dsp`
* `Constant`
All of this is explained in detail in the [simple example with CMSIS-DSP](../examples/simpledsp/README.md).

@ -4,7 +4,7 @@ Python APIs to describe the nodes and graph and generate the C++, Python or Grap
1. ## [Graph class](Graph.md)
2. ## [Generic Node, Source and Sink classes](Generic.md)
2. ## [Generic and function nodes](Generic.md)
3. ## Scheduler

@ -74,6 +74,8 @@ add_subdirectory(example8 bin_example8)
add_subdirectory(example9 bin_example9)
add_subdirectory(example10 bin_example10)
add_subdirectory(simple bin_simple)
add_subdirectory(simpledsp bin_simpledsp)
add_subdirectory(cyclo bin_cyclo)
# Python examples
add_subdirectory(example4 bin_example4)

@ -48,8 +48,9 @@ python main.py
# List of examples
* [Simple example](simple/README.md) : How to get started
* [Example 1](example1/README.md) : Same as the simple example but explaining how to add arguments to the scheduler API and node constructors. This example is also giving a very detailed explanation of the C++ code generated for the scheduler
* [Simple example without CMSIS-DSP](simple/README.md) : **How to get started**
* [Simple example with CMSIS-DSP](simpledsp/README.md) : **How to get started with CMSIS-DSP**
* [Example 1](example1/README.md) : Same as the simple example but explaining how to add arguments to the scheduler API and node constructors. This example is also giving a **detailed explanation of the C++ code** generated for the scheduler
* [Example 2](example2/README.md) : Explain how to use CMSIS-DSP pure functions (no state) and add delay on the arcs of the graph. Explain some configuration options for the schedule generation.
* [Example 3](example3/README.md) : A full signal processing example with CMSIS-DSP using FFT and sliding windows and overlap and add node
* [Example 4](example4/README.md) : Same as example 3 but where we generate a Python implementation rather than a C++ implementation. The resulting graph can be executed thanks to the CMSIS-DSP Python wrapper
@ -59,4 +60,5 @@ python main.py
* [Example 8](example8/README.md) : Introduce structured datatype for the samples and implicit `Duplicate` nodes for the graph
* [Example 9](example9/README.md) : Check that duplicate nodes and arc delays are working together and a scheduling is generated
* [Example 10 : The dynamic dataflow mode](example10/README.md)
* [Cyclo-static scheduling](cyclo/README.md)

@ -0,0 +1,158 @@
/* ----------------------------------------------------------------------
* Project: CMSIS DSP Library
* Title: AppNodes.h
* Description: Application nodes for the cyclo example
*
* $Date: 29 July 2021
* $Revision: V1.10.0
*
* Target Processor: Cortex-M and Cortex-A cores
* -------------------------------------------------------------------- */
/*
* Copyright (C) 2010-2021 ARM Limited or its affiliates. All rights reserved.
*
* SPDX-License-Identifier: Apache-2.0
*
* Licensed under the Apache License, Version 2.0 (the License); you may
* not use this file except in compliance with the License.
* You may obtain a copy of the License at
*
* www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an AS IS BASIS, WITHOUT
* WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*/
#ifndef _APPNODES_H_
#define _APPNODES_H_
#include <iostream>
template<typename IN, int inputSize>
class Sink: public GenericSink<IN, inputSize>
{
public:
Sink(FIFOBase<IN> &src):GenericSink<IN,inputSize>(src){};
int prepareForRunning() final
{
if (this->willUnderflow())
{
return(CG_SKIP_EXECUTION_ID_CODE); // Skip execution
}
return(0);
};
int run() final
{
IN *b=this->getReadBuffer();
printf("Sink\n");
for(int i=0;i<inputSize;i++)
{
std::cout << (int)b[i] << std::endl;
}
return(0);
};
};
template<typename OUT,int outputSize>
class Source: public GenericSource<OUT,outputSize>
{
public:
Source(FIFOBase<OUT> &dst):GenericSource<OUT,outputSize>(dst),
mPeriod(0),mValuePeriodStart(0){};
int getSamplesForPeriod() const
{
if (mPeriod == 0)
{
return(3);
}
return(2);
}
void updatePeriod(){
mPeriod++;
mValuePeriodStart = 3;
if (mPeriod == 2)
{
mPeriod = 0;
mValuePeriodStart = 0;
}
}
int prepareForRunning() final
{
/* Cyclo-static scheduling does not make sense in
asynchronous mode, so the default outputSize is used.
This function is never used in cyclo-static scheduling.
*/
if (this->willOverflow())
{
return(CG_SKIP_EXECUTION_ID_CODE); // Skip execution
}
return(0);
};
int run() final{
OUT *b=this->getWriteBuffer(getSamplesForPeriod());
printf("Source\n");
for(int i=0;i<getSamplesForPeriod();i++)
{
b[i] = mValuePeriodStart + (OUT)i;
}
updatePeriod();
return(0);
};
protected:
int mPeriod;
OUT mValuePeriodStart;
};
template<typename IN, int inputSize,typename OUT,int outputSize>
class ProcessingNode;
template<typename IN, int inputOutputSize>
class ProcessingNode<IN,inputOutputSize,IN,inputOutputSize>:
public GenericNode<IN,inputOutputSize,IN,inputOutputSize>
{
public:
ProcessingNode(FIFOBase<IN> &src,
FIFOBase<IN> &dst):GenericNode<IN,inputOutputSize,
IN,inputOutputSize>(src,dst){};
int prepareForRunning() final
{
if (this->willOverflow() ||
this->willUnderflow())
{
return(CG_SKIP_EXECUTION_ID_CODE); // Skip execution
}
return(0);
};
int run() final{
printf("ProcessingNode\n");
IN *a=this->getReadBuffer();
IN *b=this->getWriteBuffer();
for(int i=0;i<inputOutputSize;i++)
{
b[i] = a[i]+1;
}
return(0);
};
};
#endif

@ -0,0 +1,13 @@
cmake_minimum_required (VERSION 3.14)
include(CMakePrintHelpers)
project(cyclo)
add_executable(cyclo main.cpp)
sdf(cyclo create.py cyclo)
add_sdf_dir(cyclo)
target_include_directories(cyclo PRIVATE ${CMAKE_CURRENT_SOURCE_DIR})
target_include_directories(cyclo PRIVATE ${CMAKE_CURRENT_SOURCE_DIR}/generated)

@ -0,0 +1,18 @@
# Makefile for MSVC compiler on Windows
SHELL = cmd
CC = cl.exe
RM = del /Q /F
INCLUDES = /Igenerated /I../../cg/src /I.
WINFLAGS = /DWIN32 /D_WINDOWS /EHsc /Zi /Ob0 /Od /RTC1 -MDd
CFLAGS = $(INCLUDES) $(WINFLAGS)
all:
$(CC) /Fecyclo.exe $(CFLAGS) generated/scheduler.cpp main.cpp
clean:
$(RM) main.obj
$(RM) scheduler.obj
$(RM) cyclo.ilk
$(RM) cyclo.exe
$(RM) *.pdb

@ -0,0 +1,143 @@
# README
This example is inside the folder `examples/cyclo` of the Compute graph folder. Before reading this documentation you need to understand the principles explained in the [simple example without CMSIS-DSP](../simple/README.md)
![cyclo](docassets/cyclo.png)
The nodes are:
* A source generating floating point values (0,1,2,3,4).
* A processing node adding 1 to those values
* A sink printing its input values (1,2,3,4,5)
The graph generates an infinite stream of values : 1,2,3,4,5,1,2,3,4,5,1,2,3,4,5 ... For this example, the number of iterations is limited so that it does not run forever.
The big difference compared to the [simple example without CMSIS-DSP](../simple/README.md) is the source node:
* The source node no longer generates samples in packets of 5
* The first call to the source node generates 3 samples
* The second call to the source node generates 2 samples
* Subsequent executions repeat this pattern : 3,2,3,2 ...
The flow is not static, but it is periodically static : **cyclo-static scheduling**.
## C++ Implementation
The C++ wrapper must take into account this periodic schedule of sample generation.
The first call should generate only 3 samples and the second call only 2.
We want the first call to generate `0,1,2` and the second call to generate `3,4`.
The C++ wrapper has been modified for this. Here is the body of the `run` function:
```C++
OUT *b=this->getWriteBuffer(getSamplesForPeriod());
printf("Source\n");
for(int i=0;i<getSamplesForPeriod();i++)
{
b[i] = mValuePeriodStart + (OUT)i;
}
updatePeriod();
```
The `run` function generates only the number of samples required for the current period.
The generated values start at `mValuePeriodStart`, which is `0` for the first period and `3` for the second.
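To make the period bookkeeping easier to follow, here is a minimal Python model of the C++ `Source` wrapper. It is only an illustrative sketch: the class `CycloSource` and its method names are hypothetical and are not part of the compute graph Python package. They mirror `getSamplesForPeriod`, `updatePeriod` and `run` from `AppNodes.h`.

```python
class CycloSource:
    """Illustrative Python model of the C++ Source wrapper (not a framework class)."""

    def __init__(self):
        self.period = 0              # mirrors mPeriod
        self.value_period_start = 0  # mirrors mValuePeriodStart

    def samples_for_period(self):
        # 3 samples in the first period, 2 samples in the second one
        return 3 if self.period == 0 else 2

    def update_period(self):
        # Move to the next period and wrap around after two periods
        self.period += 1
        self.value_period_start = 3
        if self.period == 2:
            self.period = 0
            self.value_period_start = 0

    def run(self):
        # Generate only the samples required for the current period
        n = self.samples_for_period()
        out = [self.value_period_start + i for i in range(n)]
        self.update_period()
        return out


source = CycloSource()
packets = [source.run() for _ in range(4)]
print(packets)  # [[0, 1, 2], [3, 4], [0, 1, 2], [3, 4]]
```

Two successive executions together produce the same `0,1,2,3,4` packet as the source of the simple example.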
The template for `Source` has not changed and is :
```C++
template<typename OUT,int outputSize>
class Source: public GenericSource<OUT,outputSize>
```
`outputSize` is a single integer template argument, so it cannot be the list `[3,2]`.
The generated code uses the maximum of the values, here `3`:
```C++
Source<float32_t,3> source(fifo0);
```
## Expected output:
```
Schedule length = 26
Memory usage 88 bytes
```
The schedule length is `26` compared to `19` for the simple example where the source generates samples in packets of 5. The number of source node executions must be a multiple of 2 in this graph because the sample generation period has length 2. In the original graph, the number of executions could be odd. That is why there are more executions with cyclo-static scheduling.
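The two schedule lengths can be cross-checked with a small balance computation. The sketch below is not the algorithm used by the scheduler. It is a simplified calculation that only applies to this three-node chain, and it assumes that one schedule iteration brings both FIFOs back to empty and that the source always runs a whole number of periods. The function name `executions_per_iteration` is hypothetical.

```python
from math import gcd


def lcm(a, b):
    return a * b // gcd(a, b)


def executions_per_iteration(source_pattern, processing_size, sink_size):
    """Return (source, processing, sink) executions for one schedule iteration."""
    # Samples produced by one full period of the source (3 + 2 = 5 here)
    samples_per_period = sum(source_pattern)
    # Smallest number of samples that every node can handle in whole executions
    total = lcm(lcm(samples_per_period, processing_size), sink_size)
    # The source needs len(source_pattern) executions for each period
    source_runs = (total // samples_per_period) * len(source_pattern)
    return source_runs, total // processing_size, total // sink_size


cyclo = executions_per_iteration([3, 2], 7, 5)
simple = executions_per_iteration([5], 7, 5)
print(cyclo, sum(cyclo))    # (14, 5, 7) 26
print(simple, sum(simple))  # (7, 5, 7) 19
```

Both graphs move 35 samples per iteration. The cyclo-static source needs 14 executions to do so instead of 7, which accounts for the 7 extra entries in the schedule.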
The memory usage (FIFO) is the same as the one for the simple example without cyclo-static scheduling.
The expected output of the execution is still 1,2,3,4,5,1,2,3,4,5 ... but the scheduling is different. There are more source executions.
```
Start
Source
Source
Source
ProcessingNode
Sink
1
2
3
4
5
Source
Source
Source
ProcessingNode
Sink
1
2
3
4
5
Source
Source
Source
ProcessingNode
Sink
1
2
3
4
5
Sink
1
2
3
4
5
Source
Source
ProcessingNode
Sink
1
2
3
4
5
Source
Source
Source
ProcessingNode
Sink
1
2
3
4
5
Sink
1
2
3
4
5
```

@ -0,0 +1,31 @@
# Include definition of the nodes
from nodes import *
# Include definition of the graph
from graph import *
# Create a configuration object
conf=Configuration()
# The number of schedule iterations is limited to 1
# to prevent the scheduling from running forever
# (which should be the case for a stream computation)
conf.debugLimit=1
# Disable inclusion of CMSIS-DSP headers so that we don't have
# to recompile CMSIS-DSP for such a simple example
conf.CMSISDSP = False
# Compute a static scheduling of the graph
# The size of FIFO is also computed
scheduling = the_graph.computeSchedule(config=conf)
# Print some statistics about the compute schedule
# and the memory usage
print("Schedule length = %d" % scheduling.scheduleLength)
print("Memory usage %d bytes" % scheduling.memory)
# Generate the C++ code for the static scheduler
scheduling.ccode("generated",conf)
# Generate a graphviz representation of the graph
with open("cyclo.dot","w") as f:
scheduling.graphviz(f)

@ -0,0 +1,5 @@
#ifndef _CUSTOM_H_
typedef float float32_t;
#endif

@ -0,0 +1,48 @@
digraph structs {
node [shape=plaintext]
rankdir=LR
edge [arrowsize=0.5]
fontname="times"
processing [label=<
<TABLE BORDER="0" CELLBORDER="1" CELLSPACING="0" CELLPADDING="4">
<TR>
<TD ALIGN="CENTER" PORT="i">processing<BR/>(ProcessingNode)</TD>
</TR>
</TABLE>>];
sink [label=<
<TABLE BORDER="0" CELLBORDER="1" CELLSPACING="0" CELLPADDING="4">
<TR>
<TD ALIGN="CENTER" PORT="i">sink<BR/>(Sink)</TD>
</TR>
</TABLE>>];
source [label=<
<TABLE BORDER="0" CELLBORDER="1" CELLSPACING="0" CELLPADDING="4">
<TR>
<TD ALIGN="CENTER" PORT="i">source<BR/>(Source)</TD>
</TR>
</TABLE>>];
source:i -> processing:i [label="f32(11)"
,headlabel=<<TABLE BORDER="0" CELLPADDING="2"><TR><TD><FONT COLOR="blue" POINT-SIZE="12.0" >7</FONT>
</TD></TR></TABLE>>
,taillabel=<<TABLE BORDER="0" CELLPADDING="2"><TR><TD><FONT COLOR="blue" POINT-SIZE="12.0" >[3, 2]</FONT>
</TD></TR></TABLE>>]
processing:i -> sink:i [label="f32(11)"
,headlabel=<<TABLE BORDER="0" CELLPADDING="2"><TR><TD><FONT COLOR="blue" POINT-SIZE="12.0" >5</FONT>
</TD></TR></TABLE>>
,taillabel=<<TABLE BORDER="0" CELLPADDING="2"><TR><TD><FONT COLOR="blue" POINT-SIZE="12.0" >7</FONT>
</TD></TR></TABLE>>]
}

Binary file not shown.


@ -0,0 +1,170 @@
/*
Generated with CMSIS-DSP Compute Graph Scripts.
The generated code is not covered by CMSIS-DSP license.
The support classes and code is covered by CMSIS-DSP license.
*/
#include "custom.h"
#include "GenericNodes.h"
#include "AppNodes.h"
#include "scheduler.h"
#if !defined(CHECKERROR)
#define CHECKERROR if (cgStaticError < 0) \
{\
goto errorHandling;\
}
#endif
#if !defined(CG_BEFORE_ITERATION)
#define CG_BEFORE_ITERATION
#endif
#if !defined(CG_AFTER_ITERATION)
#define CG_AFTER_ITERATION
#endif
#if !defined(CG_BEFORE_SCHEDULE)
#define CG_BEFORE_SCHEDULE
#endif
#if !defined(CG_AFTER_SCHEDULE)
#define CG_AFTER_SCHEDULE
#endif
#if !defined(CG_BEFORE_BUFFER)
#define CG_BEFORE_BUFFER
#endif
#if !defined(CG_BEFORE_FIFO_BUFFERS)
#define CG_BEFORE_FIFO_BUFFERS
#endif
#if !defined(CG_BEFORE_FIFO_INIT)
#define CG_BEFORE_FIFO_INIT
#endif
#if !defined(CG_BEFORE_NODE_INIT)
#define CG_BEFORE_NODE_INIT
#endif
#if !defined(CG_AFTER_INCLUDES)
#define CG_AFTER_INCLUDES
#endif
#if !defined(CG_BEFORE_SCHEDULER_FUNCTION)
#define CG_BEFORE_SCHEDULER_FUNCTION
#endif
#if !defined(CG_BEFORE_NODE_EXECUTION)
#define CG_BEFORE_NODE_EXECUTION
#endif
#if !defined(CG_AFTER_NODE_EXECUTION)
#define CG_AFTER_NODE_EXECUTION
#endif
CG_AFTER_INCLUDES
/*
Description of the scheduling.
*/
static unsigned int schedule[26]=
{
2,2,2,0,1,2,2,2,0,1,2,2,2,0,1,1,2,2,0,1,2,2,2,0,1,1,
};
CG_BEFORE_FIFO_BUFFERS
/***********
FIFO buffers
************/
#define FIFOSIZE0 11
#define FIFOSIZE1 11
#define BUFFERSIZE1 11
CG_BEFORE_BUFFER
float32_t buf1[BUFFERSIZE1]={0};
#define BUFFERSIZE2 11
CG_BEFORE_BUFFER
float32_t buf2[BUFFERSIZE2]={0};
CG_BEFORE_SCHEDULER_FUNCTION
uint32_t scheduler(int *error)
{
int cgStaticError=0;
uint32_t nbSchedule=0;
int32_t debugCounter=1;
CG_BEFORE_FIFO_INIT;
/*
Create FIFOs objects
*/
FIFO<float32_t,FIFOSIZE0,0,0> fifo0(buf1);
FIFO<float32_t,FIFOSIZE1,0,0> fifo1(buf2);
CG_BEFORE_NODE_INIT;
/*
Create node objects
*/
ProcessingNode<float32_t,7,float32_t,7> processing(fifo0,fifo1);
Sink<float32_t,5> sink(fifo1);
Source<float32_t,3> source(fifo0);
/* Run several schedule iterations */
CG_BEFORE_SCHEDULE;
while((cgStaticError==0) && (debugCounter > 0))
{
/* Run a schedule iteration */
CG_BEFORE_ITERATION;
for(unsigned long id=0 ; id < 26; id++)
{
CG_BEFORE_NODE_EXECUTION;
switch(schedule[id])
{
case 0:
{
cgStaticError = processing.run();
}
break;
case 1:
{
cgStaticError = sink.run();
}
break;
case 2:
{
cgStaticError = source.run();
}
break;
default:
break;
}
CG_AFTER_NODE_EXECUTION;
CHECKERROR;
}
debugCounter--;
CG_AFTER_ITERATION;
nbSchedule++;
}
errorHandling:
CG_AFTER_SCHEDULE;
*error=cgStaticError;
return(nbSchedule);
}

@ -0,0 +1,26 @@
/*
Generated with CMSIS-DSP Compute Graph Scripts.
The generated code is not covered by CMSIS-DSP license.
The support classes and code is covered by CMSIS-DSP license.
*/
#ifndef _SCHEDULER_H_
#define _SCHEDULER_H_
#ifdef __cplusplus
extern "C"
{
#endif
extern uint32_t scheduler(int *error);
#ifdef __cplusplus
}
#endif
#endif

@ -0,0 +1,39 @@
# Include definitions from the Python package to
# define datatype for the IOs and to have access to the
# Graph class
from cmsisdsp.cg.scheduler import *
# Include definition of the nodes
from nodes import *
# Define the datatype we are using for all the IOs in this
# example
floatType=CType(F32)
# Instantiate a Source node with a float datatype and
# a cyclo-static output: successive executions of the source
# in the C code generate 3 samples, then 2 samples, and so on
# "source" is the name of the C variable that will identify
# this node
src=Source("source",floatType,[3,2])
# Instantiate a Processing node using a float data type for
# both the input and output. The number of samples consumed
# on the input and produced on the output is 7 each time
# the node is executed in the C code
# "processing" is the name of the C variable that will identify
# this node
processing=ProcessingNode("processing",floatType,7,7)
# Instantiate a Sink node with a float datatype and consuming
# 5 samples each time the node is executed in the C code
# "sink" is the name of the C variable that will identify
# this node
sink=Sink("sink",floatType,5)
# Create a Graph object
the_graph = Graph()
# Connect the source to the processing node
the_graph.connect(src.o,processing.i)
# Connect the processing node to the sink
the_graph.connect(processing.o,sink.i)

@ -0,0 +1,11 @@
#include <cstdio>
#include <cstdint>
#include "scheduler.h"
int main(int argc, char const *argv[])
{
int error;
printf("Start\n");
uint32_t nbSched=scheduler(&error);
return 0;
}

@ -0,0 +1,77 @@
# Include definitions from the Python package
from cmsisdsp.cg.scheduler import GenericNode,GenericSink,GenericSource
### Define new types of Nodes
class ProcessingNode(GenericNode):
"""
Definition of a ProcessingNode for the graph
Parameters
----------
name : str
Name of the C variable identifying this node
in the C code
theType : CGStaticType
The datatype for the input and output
inLength : int
The number of samples consumed by input
outLength : int
The number of samples produced on output
"""
def __init__(self,name,theType,inLength,outLength):
GenericNode.__init__(self,name)
self.addInput("i",theType,inLength)
self.addOutput("o",theType,outLength)
@property
def typeName(self):
"""The name of the C++ class implementing this node"""
return "ProcessingNode"
class Sink(GenericSink):
"""
Definition of a Sink node for the graph
Parameters
----------
name : str
Name of the C variable identifying this node
in the C code
theType : CGStaticType
The datatype for the input
inLength : int
The number of samples consumed by input
"""
def __init__(self,name,theType,inLength):
GenericSink.__init__(self,name)
self.addInput("i",theType,inLength)
@property
def typeName(self):
"""The name of the C++ class implementing this node"""
return "Sink"
class Source(GenericSource):
"""
Definition of a Source node for the graph
Parameters
----------
name : str
Name of the C variable identifying this node
in the C code
theType : CGStaticType
The datatype for the output
outLength : int
The number of samples produced on output
"""
def __init__(self,name,theType,outLength):
GenericSource.__init__(self,name)
self.addOutput("o",theType,outLength)
@property
def typeName(self):
"""The name of the C++ class implementing this node"""
return "Source"

@ -2,11 +2,11 @@
Please refer to the [simple example](../simple/README.md) to have an overview of how to define a graph and its nodes and how to generate the C++ code for the static scheduler.
The [simple example with CMSIS-DSP](../simpledsp/README.md) is giving more details about `Constant` nodes and CMSIS-DSP functions in the compute graph.
In this example, we are just analyzing a much more complex graph to see some new features:
- Delay
- CMSIS-DSP function
- Constant node
- SlidingBuffer
This example is not really using an MFCC or a TensorFlow Lite node. It is just providing some wrappers to show how such nodes could be included in a graph:

@ -0,0 +1,128 @@
/* ----------------------------------------------------------------------
* Project: CMSIS DSP Library
* Title: AppNodes.h
* Description: Application nodes for the simpledsp example
*
* $Date: 29 July 2021
* $Revision: V1.10.0
*
* Target Processor: Cortex-M and Cortex-A cores
* -------------------------------------------------------------------- */
/*
* Copyright (C) 2010-2021 ARM Limited or its affiliates. All rights reserved.
*
* SPDX-License-Identifier: Apache-2.0
*
* Licensed under the Apache License, Version 2.0 (the License); you may
* not use this file except in compliance with the License.
* You may obtain a copy of the License at
*
* www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an AS IS BASIS, WITHOUT
* WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*/
#ifndef _APPNODES_H_
#define _APPNODES_H_
#include <iostream>
template<typename IN, int inputSize>
class Sink: public GenericSink<IN, inputSize>
{
public:
Sink(FIFOBase<IN> &src):GenericSink<IN,inputSize>(src){};
int prepareForRunning() final
{
if (this->willUnderflow())
{
return(CG_SKIP_EXECUTION_ID_CODE); // Skip execution
}
return(0);
};
int run() final
{
IN *b=this->getReadBuffer();
printf("Sink\n");
for(int i=0;i<inputSize;i++)
{
std::cout << (int)b[i] << std::endl;
}
return(0);
};
};
template<typename OUT,int outputSize>
class Source: public GenericSource<OUT,outputSize>
{
public:
Source(FIFOBase<OUT> &dst):GenericSource<OUT,outputSize>(dst){};
int prepareForRunning() final
{
if (this->willOverflow())
{
return(CG_SKIP_EXECUTION_ID_CODE); // Skip execution
}
return(0);
};
int run() final{
OUT *b=this->getWriteBuffer();
printf("Source\n");
for(int i=0;i<outputSize;i++)
{
b[i] = (OUT)i;
}
return(0);
};
};
template<typename IN, int inputSize,typename OUT,int outputSize>
class ProcessingNode;
template<typename IN, int inputOutputSize>
class ProcessingNode<IN,inputOutputSize,IN,inputOutputSize>:
public GenericNode<IN,inputOutputSize,IN,inputOutputSize>
{
public:
ProcessingNode(FIFOBase<IN> &src,
FIFOBase<IN> &dst):GenericNode<IN,inputOutputSize,
IN,inputOutputSize>(src,dst){};
int prepareForRunning() final
{
if (this->willOverflow() ||
this->willUnderflow())
{
return(CG_SKIP_EXECUTION_ID_CODE); // Skip execution
}
return(0);
};
int run() final{
printf("ProcessingNode\n");
IN *a=this->getReadBuffer();
IN *b=this->getWriteBuffer();
for(int i=0;i<inputOutputSize;i++)
{
b[i] = a[i]+1;
}
return(0);
};
};
#endif

@ -0,0 +1,14 @@
cmake_minimum_required (VERSION 3.14)
include(CMakePrintHelpers)
project(simpledsp)
add_executable(simpledsp main.cpp)
sdf(simpledsp create.py simpledsp)
add_sdf_dir(simpledsp)
target_include_directories(simpledsp PRIVATE ${CMAKE_CURRENT_SOURCE_DIR})
target_include_directories(simpledsp PRIVATE ${CMAKE_CURRENT_SOURCE_DIR}/generated)
target_link_libraries(simpledsp PRIVATE CMSISDSP)

@ -0,0 +1,175 @@
# README
This example is inside the folder `examples/simpledsp` of the Compute graph folder. Before reading this documentation you need to understand the principles explained in the [simple example without CMSIS-DSP](../simple/README.md)
This example explains how to create a very simple synchronous compute graph with 3 nodes. The difference with the [simple example without CMSIS-DSP](../simple/README.md) is that the `Processing` node has been replaced by a CMSIS-DSP function.
![simpledsp](docassets/simpledsp.png)
A CMSIS-DSP function can be used as explained so far : by creating a C++ wrapper. It is indeed the only way when you need to integrate functions, like FFT and FIR, that have a state (`arm_cfft_instance_f32` or `arm_fir_instance_f32`). The state is initialized in the constructor of the C++ wrapper and is preserved between successive executions of the `run` function.
But most CMSIS-DSP functions are pure functions with no state and no side effects: they take an input and generate an output.
For instance, the interface of `arm_offset_f32` is:
```c
void arm_offset_f32(
const float32_t * pSrc,
float32_t offset,
float32_t * pDst,
uint32_t blockSize);
```
This function adds an offset to its input array. There is no state to initialize or preserve. To make it easier to integrate functions like this one, it is possible to use them directly in the compute graph. No C++ wrapper is needed. The Python scripts generate the code calling the function automatically.
This integration is done with the nodes `Unary` and `Binary` defined in `cmsisdsp.cg.scheduler`.
`arm_offset_f32` can be used by creating a node `Binary` in the Python script:
```python
processing = Binary("arm_offset_f32",floatType,7)
```
Functions used with `Binary` must have a type like:
```C
void binary_function(
const T* pFirst or T pFirst,
T* pSecond or T pSecond,
T *pResult,
uint32_t numberOfSamplesGenerated);
```
Where `T` is a basic CMSIS-DSP type like `float32_t` ...
Functions used with `Unary` are similar but with just one input.
When the type is `T` only (and not a pointer), the argument cannot be connected to a FIFO. It occurs, for instance, when some arguments are scalars (offset, scaling ...)
To handle this case, a new kind of node is available : The `Constant` node. A constant node is defined with:
```python
offsetValue = Constant("OFFSET_VALUE")
```
The string `"OFFSET_VALUE"` is a C symbol (variable or `#define`).
As you can see in the picture, the node `OFFSET_VALUE` has no IO. There is no value displayed close to the node to show the amount of samples generated on an output.
The edge connecting this constant node to the CMSIS-DSP function is not a FIFO : there is no length displayed on this edge since no memory buffer is allocated for it.
You can see in `graph.py` that a constant node is connected directly. There is no IO property:
```python
the_graph.connect(offsetValue,processing.ib)
```
Here we are using `offsetValue` directly. There is no `.o` property used.
Constant nodes and edges are ignored by the scheduling. But the code generator uses them to replace some arguments with a `C` symbol : variable, `#define` ...
In this example, in `custom.h`, we have defined:
```C
#define OFFSET_VALUE 2.0f
```
The code generated to call the `arm_offset_f32` is:
```c
float32_t* i0;
float32_t* o2;
i0=fifo0.getReadBuffer(7);
o2=fifo1.getWriteBuffer(7);
arm_offset_f32(i0,OFFSET_VALUE,o2,7);
cgStaticError = 0;
```
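The effect of this generated call on the FIFOs can be reproduced with a short Python simulation. This is only a sketch under stated assumptions: the FIFOs are modeled with `collections.deque`, the schedule array is copied from the generated `scheduler.cpp` of this example, and the function name `replay` is hypothetical. It does not use the compute graph Python package.

```python
from collections import deque

OFFSET_VALUE = 2.0  # same value as the #define in custom.h

# Schedule copied from the generated scheduler.cpp:
# 0 = arm_offset_f32 call, 1 = sink, 2 = source
SCHEDULE = [2, 2, 0, 1, 2, 0, 1, 2, 2, 0, 1, 1, 2, 0, 1, 2, 0, 1, 1]


def replay(schedule):
    """Replay one schedule iteration and return what the sink receives."""
    fifo0, fifo1 = deque(), deque()
    sink_packets = []
    max_fill = 0
    for node in schedule:
        if node == 2:
            # Source: write 5 samples 0,1,2,3,4 to fifo0
            fifo0.extend(float(i) for i in range(5))
        elif node == 0:
            # Function node: read 7 samples, add the offset, write 7 samples
            block = [fifo0.popleft() for _ in range(7)]
            fifo1.extend(x + OFFSET_VALUE for x in block)
        elif node == 1:
            # Sink: read 5 samples from fifo1
            sink_packets.append([int(fifo1.popleft()) for _ in range(5)])
        max_fill = max(max_fill, len(fifo0), len(fifo1))
    return sink_packets, max_fill, len(fifo0), len(fifo1)


packets, max_fill, left0, left1 = replay(SCHEDULE)
print(packets[0])               # [2, 3, 4, 5, 6]
print(len(packets), max_fill)   # 7 11
```

In this model both FIFOs are empty again at the end of the iteration and never hold more than 11 samples, which is consistent with `FIFOSIZE0` and `FIFOSIZE1` in the generated code (2 FIFOs of 11 `float32_t` samples give the 88 bytes reported by the Python script).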
Note that constant nodes can only be used with function nodes like `Binary` and `Unary`. The Python scripts do not (currently) check that a constant node is connected only to function nodes.
There is another function node : `Dsp`.
It is a work in progress. `Dsp` attempts to detect whether a CMSIS-DSP function is unary or binary and uses the sample type to generate the function name.
For instance, you would write:
```python
scale=Dsp("scale",floatType,NB)
```
instead of:
```python
scale=Binary("arm_scale_f32",floatType,NB)
```
The `Dsp` node currently detects only a very small subset of the `Binary` nodes, so it is better to use `Binary` for now.
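The naming convention that `Dsp` relies on can be illustrated with a few lines of Python. This is a hypothetical helper written for this explanation only. It is not the implementation of `Dsp`, and `cmsis_function_name` is not part of `cmsisdsp.cg.scheduler`. It only shows how an operation name and a sample type suffix combine into a CMSIS-DSP function name.

```python
def cmsis_function_name(operation, type_suffix):
    """Build a CMSIS-DSP style function name from an operation and a type suffix."""
    return "arm_%s_%s" % (operation, type_suffix)


print(cmsis_function_name("scale", "f32"))   # arm_scale_f32
print(cmsis_function_name("offset", "f32"))  # arm_offset_f32
```

With `Binary`, the full function name is written explicitly, so no such detection is required.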
## How to build the example
This example requires CMSIS-DSP. Contrary to the [simple example without CMSIS-DSP](../simple/README.md), there is no simple `Makefile` to build it. You need to build it like all other examples using `cmake` as explained in the [top level documentation for the examples](../README.md).
## Expected output
Python output:
```
Schedule length = 19
Memory usage 88 bytes
```
Executable output:
```
Start
Source
Source
Sink
2
3
4
5
6
Source
Sink
2
3
4
5
6
Source
Source
Sink
2
3
4
5
6
Sink
2
3
4
5
6
Source
Sink
2
3
4
5
6
Source
Sink
2
3
4
5
6
Sink
2
3
4
5
6
```

@ -0,0 +1,28 @@
# Include definition of the nodes
from nodes import *
# Include definition of the graph
from graph import *
# Create a configuration object
conf=Configuration()
# The number of schedule iterations is limited to 1
# to prevent the scheduling from running forever
# (which should be the case for a stream computation)
conf.debugLimit=1
# Compute a static scheduling of the graph
# The size of FIFO is also computed
scheduling = the_graph.computeSchedule(config=conf)
# Print some statistics about the compute schedule
# and the memory usage
print("Schedule length = %d" % scheduling.scheduleLength)
print("Memory usage %d bytes" % scheduling.memory)
# Generate the C++ code for the static scheduler
scheduling.ccode("generated",conf)
# Generate a graphviz representation of the graph
with open("simpledsp.dot","w") as f:
scheduling.graphviz(f)

@ -0,0 +1,7 @@
#ifndef _CUSTOM_H_
typedef float float32_t;
#define OFFSET_VALUE 2.0f
#endif

Binary file not shown.


@ -0,0 +1,179 @@
/*
Generated with CMSIS-DSP Compute Graph Scripts.
The generated code is not covered by CMSIS-DSP license.
The support classes and code is covered by CMSIS-DSP license.
*/
#include "arm_math.h"
#include "custom.h"
#include "GenericNodes.h"
#include "AppNodes.h"
#include "scheduler.h"
#if !defined(CHECKERROR)
#define CHECKERROR if (cgStaticError < 0) \
{\
goto errorHandling;\
}
#endif
#if !defined(CG_BEFORE_ITERATION)
#define CG_BEFORE_ITERATION
#endif
#if !defined(CG_AFTER_ITERATION)
#define CG_AFTER_ITERATION
#endif
#if !defined(CG_BEFORE_SCHEDULE)
#define CG_BEFORE_SCHEDULE
#endif
#if !defined(CG_AFTER_SCHEDULE)
#define CG_AFTER_SCHEDULE
#endif
#if !defined(CG_BEFORE_BUFFER)
#define CG_BEFORE_BUFFER
#endif
#if !defined(CG_BEFORE_FIFO_BUFFERS)
#define CG_BEFORE_FIFO_BUFFERS
#endif
#if !defined(CG_BEFORE_FIFO_INIT)
#define CG_BEFORE_FIFO_INIT
#endif
#if !defined(CG_BEFORE_NODE_INIT)
#define CG_BEFORE_NODE_INIT
#endif
#if !defined(CG_AFTER_INCLUDES)
#define CG_AFTER_INCLUDES
#endif
#if !defined(CG_BEFORE_SCHEDULER_FUNCTION)
#define CG_BEFORE_SCHEDULER_FUNCTION
#endif
#if !defined(CG_BEFORE_NODE_EXECUTION)
#define CG_BEFORE_NODE_EXECUTION
#endif
#if !defined(CG_AFTER_NODE_EXECUTION)
#define CG_AFTER_NODE_EXECUTION
#endif
CG_AFTER_INCLUDES
/*
Description of the scheduling.
*/
static unsigned int schedule[19]=
{
2,2,0,1,2,0,1,2,2,0,1,1,2,0,1,2,0,1,1,
};
CG_BEFORE_FIFO_BUFFERS
/***********
FIFO buffers
************/
#define FIFOSIZE0 11
#define FIFOSIZE1 11
#define BUFFERSIZE1 11
CG_BEFORE_BUFFER
float32_t buf1[BUFFERSIZE1]={0};
#define BUFFERSIZE2 11
CG_BEFORE_BUFFER
float32_t buf2[BUFFERSIZE2]={0};
CG_BEFORE_SCHEDULER_FUNCTION
uint32_t scheduler(int *error)
{
    int cgStaticError=0;
    uint32_t nbSchedule=0;
    int32_t debugCounter=1;
    CG_BEFORE_FIFO_INIT;
    /*
    Create FIFOs objects
    */
    FIFO<float32_t,FIFOSIZE0,0,0> fifo0(buf1);
    FIFO<float32_t,FIFOSIZE1,0,0> fifo1(buf2);
    CG_BEFORE_NODE_INIT;
    /*
    Create node objects
    */
    Sink<float32_t,5> sink(fifo1);
    Source<float32_t,5> source(fifo0);
    /* Run several schedule iterations */
    CG_BEFORE_SCHEDULE;
    while((cgStaticError==0) && (debugCounter > 0))
    {
        /* Run a schedule iteration */
        CG_BEFORE_ITERATION;
        for(unsigned long id=0 ; id < 19; id++)
        {
            CG_BEFORE_NODE_EXECUTION;
            switch(schedule[id])
            {
                case 0:
                {
                    {
                        float32_t* i0;
                        float32_t* o2;
                        i0=fifo0.getReadBuffer(7);
                        o2=fifo1.getWriteBuffer(7);
                        arm_offset_f32(i0,OFFSET_VALUE,o2,7);
                        cgStaticError = 0;
                    }
                }
                break;
                case 1:
                {
                    cgStaticError = sink.run();
                }
                break;
                case 2:
                {
                    cgStaticError = source.run();
                }
                break;
                default:
                break;
            }
            CG_AFTER_NODE_EXECUTION;
            CHECKERROR;
        }
        debugCounter--;
        CG_AFTER_ITERATION;
        nbSchedule++;
    }
errorHandling:
    CG_AFTER_SCHEDULE;
    *error=cgStaticError;
    return(nbSchedule);
}

@ -0,0 +1,26 @@
/*
Generated with CMSIS-DSP Compute Graph Scripts.
The generated code is not covered by CMSIS-DSP license.
The support classes and code is covered by CMSIS-DSP license.
*/
#ifndef _SCHEDULER_H_
#define _SCHEDULER_H_
#ifdef __cplusplus
extern "C"
{
#endif
extern uint32_t scheduler(int *error);
#ifdef __cplusplus
}
#endif
#endif

@ -0,0 +1,41 @@
# Include definitions from the Python package to
# define datatype for the IOs and to have access to the
# Graph class
from cmsisdsp.cg.scheduler import *
# Include definition of the nodes
from nodes import *
# Define the datatype we are using for all the IOs in this
# example
floatType=CType(F32)
# Instantiate a Source node with a float datatype and
# working with packet of 5 samples (each execution of the
# source in the C code will generate 5 samples)
# "source" is the name of the C variable that will identify
# this node
src=Source("source",floatType,5)
# Instantiate a function node wrapping the CMSIS-DSP function
# arm_offset_f32. It uses a float datatype for its inputs and
# output. Each execution in the C code consumes 7 samples on
# the input and produces 7 samples on the output.
# No C++ object is created for this node: the generated
# scheduler calls the function directly
processing=Binary("arm_offset_f32",floatType,7)
# The second argument of the function is a constant: the C
# identifier OFFSET_VALUE, which is defined in custom.h
offsetValue=Constant("OFFSET_VALUE")
# Instantiate a Sink node with a float datatype and consuming
# 5 samples each time the node is executed in the C code
# "sink" is the name of the C variable that will identify
# this node
sink=Sink("sink",floatType,5)
# Create a Graph object
the_graph = Graph()
# Connect the source to the processing node
the_graph.connect(src.o,processing.ia)
# Connect the constant to the second input of the function node
the_graph.connect(offsetValue,processing.ib)
# Connect the processing node to the sink
the_graph.connect(processing.o,sink.i)

@ -0,0 +1,11 @@
#include <cstdio>
#include <cstdint>
#include "scheduler.h"
int main(int argc, char const *argv[])
{
    int error;
    printf("Start\n");
    uint32_t nbSched=scheduler(&error);
    return 0;
}

@ -0,0 +1,49 @@
# Include definitions from the Python package
from cmsisdsp.cg.scheduler import GenericNode,GenericSink,GenericSource
class Sink(GenericSink):
    """
    Definition of a Sink node for the graph

    Parameters
    ----------
    name : str
        Name of the C variable identifying this node
        in the C code
    theType : CGStaticType
        The datatype for the input
    inLength : int
        The number of samples consumed by input
    """
    def __init__(self,name,theType,inLength):
        GenericSink.__init__(self,name)
        self.addInput("i",theType,inLength)

    @property
    def typeName(self):
        """The name of the C++ class implementing this node"""
        return "Sink"

class Source(GenericSource):
    """
    Definition of a Source node for the graph

    Parameters
    ----------
    name : str
        Name of the C variable identifying this node
        in the C code
    theType : CGStaticType
        The datatype for the output
    outLength : int
        The number of samples produced on output
    """
    def __init__(self,name,theType,outLength):
        GenericSource.__init__(self,name)
        self.addOutput("o",theType,outLength)

    @property
    def typeName(self):
        """The name of the C++ class implementing this node"""
        return "Source"

@ -0,0 +1,65 @@
digraph structs {
node [shape=plaintext]
rankdir=LR
edge [arrowsize=0.5]
fontname="times"
arm_offset_f321 [label=<
<TABLE BORDER="0" CELLBORDER="1" CELLSPACING="0" CELLPADDING="4">
<TR>
<TD PORT="ia"><FONT POINT-SIZE="9.0">ia</FONT></TD>
<TD ALIGN="CENTER" ROWSPAN="2">arm_offset_f32<BR/>(Function)</TD>
<TD PORT="o"><FONT POINT-SIZE="9.0">o</FONT></TD>
</TR>
<TR>
<TD PORT="ib"><FONT POINT-SIZE="9.0">ib</FONT></TD>
<TD></TD></TR>
</TABLE>>];
sink [label=<
<TABLE BORDER="0" CELLBORDER="1" CELLSPACING="0" CELLPADDING="4">
<TR>
<TD ALIGN="CENTER" PORT="i">sink<BR/>(Sink)</TD>
</TR>
</TABLE>>];
source [label=<
<TABLE BORDER="0" CELLBORDER="1" CELLSPACING="0" CELLPADDING="4">
<TR>
<TD ALIGN="CENTER" PORT="i">source<BR/>(Source)</TD>
</TR>
</TABLE>>];
source:i -> arm_offset_f321:ia [label="f32(11)"
,headlabel=<<TABLE BORDER="0" CELLPADDING="2"><TR><TD><FONT COLOR="blue" POINT-SIZE="12.0" >7</FONT>
</TD></TR></TABLE>>
,taillabel=<<TABLE BORDER="0" CELLPADDING="2"><TR><TD><FONT COLOR="blue" POINT-SIZE="12.0" >5</FONT>
</TD></TR></TABLE>>]
arm_offset_f321:o -> sink:i [label="f32(11)"
,headlabel=<<TABLE BORDER="0" CELLPADDING="2"><TR><TD><FONT COLOR="blue" POINT-SIZE="12.0" >5</FONT>
</TD></TR></TABLE>>
,taillabel=<<TABLE BORDER="0" CELLPADDING="2"><TR><TD><FONT COLOR="blue" POINT-SIZE="12.0" >7</FONT>
</TD></TR></TABLE>>]
OFFSET_VALUE [label=<
<TABLE BORDER="0" CELLBORDER="1" CELLSPACING="0" CELLPADDING="4">
<TR>
<TD ALIGN="CENTER" PORT="i">OFFSET_VALUE</TD>
</TR>
</TABLE>>];
OFFSET_VALUE:i -> arm_offset_f321:ib
}