diff --git a/ComputeGraph/Dynamic.md b/ComputeGraph/Async.md similarity index 86% rename from ComputeGraph/Dynamic.md rename to ComputeGraph/Async.md index bf70407b..6780d7f3 100644 --- a/ComputeGraph/Dynamic.md +++ b/ComputeGraph/Async.md @@ -1,5 +1,7 @@ # Dynamic Data Flow +This feature is illustrated in the [Example 10 : The dynamic dataflow mode](examples/example10/README.md) + Versions of the compute graph corresponding to CMSIS-DSP Version >= `1.14.3` and Python wrapper version >= `1.10.0` are supporting a new dynamic / asynchronous mode. With a dynamic flow, the flow of data is potentially changing at each execution. The IOs can generate or consume a different amount of data at each execution of their node (including no data). @@ -13,7 +15,7 @@ With a dynamic flow and scheduling, there is no more any way to ensure that ther * Another node may decide to do nothing and skip the execution * Another node may decide to raise an error. -With dynamic scheduling, a node must implement the function `prepareForRunning` and decide what to do. +With dynamic flow, a node must implement the function `prepareForRunning` and decide what to do. 3 error / status codes are reserved for this. They are defined in the header `cg_status.h`. This header is not included by default, but if you define you own error codes, they should be coherent with `cg_status` and use the same values for the 3 status / error codes which are used in dynamic mode: @@ -23,9 +25,9 @@ With dynamic scheduling, a node must implement the function `prepareForRunning` Any other returned value will stop the execution. -The dynamic mode (also named asynchronous), is enabled with option : `asynchronous` +The dynamic mode (also named asynchronous), is enabled with option : `asynchronous` of the configuration object used with the scheduling functions. -The system will still compute a scheduling and FIFO sizes as if the flow was static. We can see the static flow as an average of the dynamic flow. 
In dynamic mode, the FIFOs may need to be bigger than the ones computed in static mode. The static estimation is giving a first idea of what the size of the FIFOs should be. The size can be increased by specifying a percent increase with option `FIFOIncrease`. +The system will still compute a synchronous scheduling and FIFO sizes as if the flow was static. We can see the static flow as an average of the dynamic flow. In dynamic mode, the FIFOs may need to be bigger than the ones computed in static mode. The static estimation is giving a first idea of what the size of the FIFOs should be. The size can be increased by specifying a percent increase with option `FIFOIncrease`. For pure compute functions (like CMSIS-DSP ones), which are not packaged into a C++ class, there is no way to customize the decision logic in case of a problem with FIFO. There is a global option : `asyncDefaultSkip`. @@ -82,7 +84,7 @@ If the `getReadBuffer` and `getWriteBuffer` are causing an underflow or overflow ## Graph constraints -The dynamic / asynchronous mode is using a synchronous graph as average / ideal case. But it is important to understand that we are no more in static / synchronous mode and some static graph may be too complex for the dynamic mode. Let's take the following graph as example: +The dynamic mode is using a synchronous graph as average / ideal case. But it is important to understand that we are no more in static / synchronous mode and some static graph may be too complex for the dynamic mode. 
Let's take the following graph as example: ![async_topological2](documentation/async_topological2.png) @@ -104,14 +106,14 @@ sink If we use a strategy of skipping the execution of a node in case of overflow / underflow, what will happen is: -* Schedule execution 1 +* Schedule iteration 1 * First `src` node execution is successful since there is a sample * All other execution attempts will be skipped -* Schedule execution 2 +* Schedule iteration 2 * First `src` node execution is successful since there is a sample * All other execution attempt will be skipped * ... -* Schedule execution 5: +* Schedule iteration 5: * First `src` node execution is successful since there is a sample * 4 other `src` node executions are skipped * The `filter` execution can finally take place since enough data has been generated @@ -143,5 +145,3 @@ As consequence, the recommendation in dynamic / asynchronous mode is to: * Ensure that the amount of data produced and consumed on each FIFO end is the same (so that each node execution is attempted only once during a schedule) * Use the maximum amount of samples required on both ends of the FIFO * Here `sink` is generating at most `1` sample, `filter` needs 5. So we use `5` on both ends of the FIFO -* More complex graphs will create a useless overhead in dynamic / asynchronous mode - diff --git a/ComputeGraph/CycloStatic.md b/ComputeGraph/CycloStatic.md index db57be3b..aefb01a5 100644 --- a/ComputeGraph/CycloStatic.md +++ b/ComputeGraph/CycloStatic.md @@ -1,5 +1,7 @@ # Cyclo static scheduling +This feature is illustrated in the [cyclo](examples/cyclo/README.md) example. + Beginning with the version `1.7.0` of the Python wrapper and version >= `1.12` of CMSIS-DSP, cyclo static scheduling has been added. ## What is the problem it is trying to solve ? 
diff --git a/ComputeGraph/FAQ.md b/ComputeGraph/FAQ.md index 2c5aadee..1fd89b8a 100644 --- a/ComputeGraph/FAQ.md +++ b/ComputeGraph/FAQ.md @@ -20,8 +20,8 @@ The read buffer and write buffers used to interact with a FIFO have the alignmen If the number of samples read is `NR` and the number of samples written if `NW`, the alignments (in number of samples) may be: -* `r0 . NR` (where `r0 ` if an integer with `r0 >= 0`) -* `w . NW - r1 . NR` (where `r1 ` and `w` are integers with `r1 >= 0` and `w >= 0`) +* `r0 . NR` for a read buffer in the FIFO (where `r0 ` if an integer with `r0 >= 0`) +* `w . NW - r1 . NR` for a write buffer in the FIFO (where `r1 ` and `w` are integers with `r1 >= 0` and `w >= 0`) If you need a stronger alignment, you'll need to chose `NR` and `NW` in the right way. @@ -29,7 +29,7 @@ For instance, if you need an alignment on a multiple of `16` bytes with a buffer If you can't choose freely the values of `NR` and `NW` then you may need to do a copy inside your component to align the buffer (of course only if the overhead due to the lack of alignment is bigger than doing a copy.) -## Memory sharing +## Memory sharing example When the `memoryOptimization` is enabled, the memory may be reused for different FIFOs to minimize the memory usage. But the scheduling algorithm is not trying to optimize this. So depending on how the graph was scheduled, the level of sharing may be different. @@ -42,21 +42,23 @@ If you share memory, you are using reference semantic and it should be hidden fr One could define an audio buffer data type : ```c++ -template +template struct SharedAudioBuf { float32_t *buf; static int getNbSamples() {return nbSamples;}; }; -template +template using SharedBuf = struct SharedAudioBuf; ``` -The template tracks the number of samples and the reference count. +The template tracks the number of samples and the reference count statically. `refCount` is not a value of the struct. It is a template argument : a number at type level. 
-The FIFO are no more containing the float samples but only the shared buffers. +The FIFOs are no more containing the audio samples but only a pointer to a shared buffers of samples. In this example, instead of having a length of 128 `float` samples, a FIFO would have a length of one `SharedBuf<128,r>` samples. @@ -64,7 +66,7 @@ An example of compute graph could be: ![shared_buffer](documentation/shared_buffer.png) -The copy of a `SharedBuf` is copying a pointer to a buffer and not the buffer. It is reference semantic and the buffer should not be modified if the ref count if > 1. +A copy of the struct `SharedBuf` is copying a pointer to a buffer and not the buffer. It is reference semantic and the buffer should not be modified if the ref count is > 1. In the above graph, there is a processing node doing in-place modification of the buffer and it could have a template specialization defined as: @@ -84,7 +86,7 @@ public GenericNode,1, The meaning is: * The input and output FIFOs have a length of 1 sample -* The sample has a type `SharedBuf` +* The sample has a type `SharedBuf` for both input and output * The reference count is statically known to be 1 so it is safe to do in place modifications of the buffer and the output buffer is a pointer to the input one In case of duplication, the template specialization could look like: @@ -257,67 +259,133 @@ public: The `input` and `output` arrays, used in the sink / source, are defined as extern. The source is reading from `input` and the sink is writing to `output`. 
-If we look at the asm code generated with `-Ofast` with armclang `AC6` and for one iteration of the schedule, we get: +The generated scheduler is: -```txt -PUSH {r4-r6,lr} -MOVW r5,#0x220 -MOVW r1,#0x620 -MOVT r5,#0x3000 -MOV r4,r0 -MOVT r1,#0x3000 -MOV r0,r5 -MOV r2,#0x200 -BL __aeabi_memcpy4 ; 0x10000a94 -MOVW r6,#0x420 -MOV r0,r5 -MOVT r6,#0x3000 -MOVS r2,#0x80 -VMOV.F32 s0,#0.5 -MOV r1,r6 -BL arm_offset_f32 ; 0x10002cd0 -MOV r0,#0x942c -MOV r1,r6 -MOVT r0,#0x3000 -MOV r2,#0x200 -BL __aeabi_memcpy4 ; 0x10000a94 -MOVS r1,#0 -MOVS r0,#1 -STR r1,[r4,#0] -POP {r4-r6,pc} +```C++ +uint32_t scheduler(int *error) +{ + int cgStaticError=0; + uint32_t nbSchedule=0; + int32_t debugCounter=1; + + CG_BEFORE_FIFO_INIT; + /* + Create FIFOs objects + */ + FIFO fifo0(buf0); + FIFO fifo1(buf1); + + CG_BEFORE_NODE_INIT; + /* + Create node objects + */ + ProcessingNode proc(fifo0,fifo1); + Sink sink(fifo1); + Source source(fifo0); + + /* Run several schedule iterations */ + CG_BEFORE_SCHEDULE; + while((cgStaticError==0) && (debugCounter > 0)) + { + /* Run a schedule iteration */ + CG_BEFORE_ITERATION; + for(unsigned long id=0 ; id < 3; id++) + { + CG_BEFORE_NODE_EXECUTION; + + switch(schedule[id]) + { + case 0: + { + cgStaticError = proc.run(); + } + break; + + case 1: + { + cgStaticError = sink.run(); + } + break; + + case 2: + { + cgStaticError = source.run(); + } + break; + + default: + break; + } + CG_AFTER_NODE_EXECUTION; + CHECKERROR; + } + debugCounter--; + CG_AFTER_ITERATION; + nbSchedule++; + } + +errorHandling: + CG_AFTER_SCHEDULE; + *error=cgStaticError; + return(nbSchedule); +} ``` -It is the code you would get if you was manually writing a call to the corresponding CMSIS-DSP function. All the C++ templates have disappeared. The switch / case used to implement the scheduler has also been removed. - -The code was generated with `memoryOptimization` enabled and the Python script detected in this case that the FIFOs are used as arrays. 
As consequence, there is no FIFO update code. They are used as normal arrays. - -The generated code is as efficient as something manually coded. - -The sink and the sources have been replaced by a `memcpy`. The call to the CMSIS-DSP function is just loading the registers and branching to the CMSIS-DSP function. - -The input buffer `input` is at address `0x30000620`. - -The `output` buffer is at address `0x3000942c`. - -We can see in the code: +If we look at the asm of the scheduler generated for a Cortex-M7 with `-Ofast` with armclang `AC6.19` and for **one** iteration of the schedule, we get (disassembly is from uVision IDE): ```txt -MOVW r1,#0x620 -... -MOVT r1,#0x3000 +0x000004B0 B570 PUSH {r4-r6,lr} + 97: b[i] = input[i]; +0x000004B2 F2402518 MOVW r5,#0x218 +0x000004B6 F2406118 MOVW r1,#0x618 +0x000004BA F2C20500 MOVT r5,#0x2000 +0x000004BE 4604 MOV r4,r0 +0x000004C0 F2C20100 MOVT r1,#0x2000 +0x000004C4 F44F7200 MOV r2,#0x200 +0x000004C8 4628 MOV r0,r5 +0x000004CA F00BF8E6 BL.W 0x0000B69A __aeabi_memcpy4 +0x000004CE EEB60A00 VMOV.F32 s0,#0.5 + 131: arm_offset_f32(a,0.5,b,inputSize); +0x000004D2 F2404618 MOVW r6,#0x418 +0x000004D6 F2C20600 MOVT r6,#0x2000 +0x000004DA 2280 MOVS r2,#0x80 +0x000004DC 4628 MOV r0,r5 +0x000004DE 4631 MOV r1,r6 +0x000004E0 F002FC5E BL.W 0x00002DA0 arm_offset_f32 + 63: output[i] = b[i]; +0x000004E4 F648705C MOVW r0,#0x8F5C +0x000004E8 F44F7200 MOV r2,#0x200 +0x000004EC F2C20000 MOVT r0,#0x2000 +0x000004F0 4631 MOV r1,r6 +0x000004F2 F00BF8D2 BL.W 0x0000B69A __aeabi_memcpy4 + 163: CG_AFTER_ITERATION; + 164: nbSchedule++; + 165: } + 166: + 167: errorHandling: + 168: CG_AFTER_SCHEDULE; + 169: *error=cgStaticError; + 170: return(nbSchedule); +0x000004F6 F2402014 MOVW r0,#0x214 +0x000004FA F2C20000 MOVT r0,#0x2000 +0x000004FE 6801 LDR r1,[r0,#0x00] +0x00000500 3101 ADDS r1,r1,#0x01 +0x00000502 6001 STR r1,[r0,#0x00] + 171: } +0x00000504 2001 MOVS r0,#0x01 +0x00000506 2100 MOVS r1,#0x00 + 169: *error=cgStaticError; +0x00000508 6021 STR 
r1,[r4,#0x00] +0x0000050A BD70 POP {r4-r6,pc} ``` -or - -``` -MOV r0,#0x942c -... -MOVT r0,#0x3000 -``` +It is the code you would get if you were manually writing calls to the corresponding CMSIS-DSP functions. All the C++ templates have disappeared. The switch / case used to implement the scheduler has also been removed. -just before the `memcpy` +The code was generated with `memoryOptimization` enabled and the Python script detected in this case that the FIFOs are used as arrays. As a consequence, there is no FIFO update code. They are used as normal arrays. +The generated code is as efficient as something manually coded. +The sink and the sources have been replaced by a `memcpy`. The call to the CMSIS-DSP function is just loading the registers and branching to the CMSIS-DSP function. It is not always as ideal as in this example. But it demonstrates that the use of C++ templates and a Python code generator is enabling a low overhead solution to the problem of streaming and compute graph. diff --git a/ComputeGraph/Introduction.md b/ComputeGraph/Introduction.md new file mode 100644 index 00000000..f4bbdfaa --- /dev/null +++ b/ComputeGraph/Introduction.md @@ -0,0 +1,98 @@ +# Introduction + +Embedded systems are often used to implement streaming solutions : the software is processing and / or generating streams of samples. The software is made of components that have no concept of streams : they are working with buffers. As a consequence, implementing a streaming solution is forcing the developer to think about scheduling questions, FIFO sizing etc ... + +The CMSIS-DSP compute graph is a **low overhead** solution to this problem : it makes it easier to build streaming solutions by connecting components and computing a scheduling at **build time**. The use of C++ templates also enables the compiler to have more information about the components for better code generation. 
+ +A dataflow graph is a representation of how compute blocks are connected to implement a streaming processing. + +Here is an example with 3 nodes: + +- A source +- A filter +- A sink + +Each node is producing and consuming some amount of samples. For instance, the source node is producing 5 samples each time it is run. The filter node is consuming 7 samples each time it is run. + +The FIFOs lengths are represented on each edge of the graph : 11 samples for the leftmost FIFO and 5 for the other one. + +In blue, the amount of samples generated or consumed by a node each time it is called. + +graph1 + +When the processing is applied to a stream of samples then the problem to solve is : + +> **how the blocks must be scheduled and the FIFOs connecting the block dimensioned** + +The general problem can be very difficult. But, if some constraints are applied to the graph then some algorithms can compute a static schedule at build time. + +When the following constraints are satisfied we say we have a Synchronous / Static Dataflow Graph: + +- Each node is always consuming and producing the same number of samples (static / synchronous flow) + +The CMSIS-DSP Compute Graph Tools are a set of Python scripts and C++ classes with following features: + +- A compute graph and its static flow can be described in Python +- The Python script will compute a static schedule and the optimal FIFOs size +- A static schedule is: + - A periodic sequence of functions calls + - A periodic execution where the FIFOs remain bounded + - A periodic execution with no deadlock : when a node is run there is enough data available to run it +- The Python script will generate a [Graphviz](https://graphviz.org/) representation of the graph +- The Python script will generate a C++ implementation of the static schedule +- The Python script can also generate a Python implementation of the static schedule (for use with the CMSIS-DSP Python wrapper) + +There is no FIFO underflow or overflow due to the 
scheduling. If there are not enough cycles to run the processing, the real-time will be broken and the solution won't work. But this problem is independent from the scheduling itself. + +# Why it is useful + +Without any scheduling tool for a dataflow graph, there is a problem of modularity : a change on a node may impact other nodes in the graph. For instance, if the number of samples consumed by a node is changed: + +- You may need to change how many samples are produced by the predecessor blocks in the graph (assuming it is possible) +- You may need to change how many times the predecessor blocks must run +- You may have to change the FIFOs sizes + +With the CMSIS-DSP Compute Graph (CG) Tools you don't have to think about those details while you are still experimenting with your data processing pipeline. It makes it easier to experiment, add or remove blocks, change their parameters. + +The tools will generate a schedule and the FIFOs. Even if you don't use this at the end for a final implementation, the information could be useful : is the schedule too long ? Are the FIFOs too big ? Is there too much latency between the sources and the sinks ? + +Let's look at an (artificial) example: + +graph1 + +Without a tool, the user would probably try to modify the number of samples so that the number of samples produced is equal to the number of samples consumed. With the CG Tools we know that such a graph can be scheduled and that the FIFO sizes need to be 11 and 5. + +The periodic schedule generated for this graph has a length of 17. It is big for such a small graph because 5 and 7 are not very well chosen values. But, it is working even with those values. 
+ +The schedule is (the number of samples in the FIFOs after the execution of the nodes are displayed in the brackets): + +``` +source [ 5 0] +source [10 0] +filter [ 3 5] +sink [ 3 0] +source [ 8 0] +filter [ 1 5] +sink [ 1 0] +source [ 6 0] +source [11 0] +filter [ 4 5] +sink [ 4 0] +source [ 9 0] +filter [ 2 5] +sink [ 2 0] +source [ 7 0] +filter [ 0 5] +sink [ 0 0] +``` + +At the end, both FIFOs are empty so the schedule can be run again : it is periodic ! + +The compute graph is focusing on the synchronous / static case but some extensions have been introduced for more flexibility: + +* A [cyclo-static scheduling](CycloStatic.md) (nearly static) +* A [dynamic/asynchronous](Async.md) mode + +Here is a summary of the different configuration supported by the compute graph. The cyclo-static scheduling is part of the static flow mode. + +![supported_configs](documentation/supported_configs.png) \ No newline at end of file diff --git a/ComputeGraph/README.md b/ComputeGraph/README.md index 465f8b45..5651d48e 100644 --- a/ComputeGraph/README.md +++ b/ComputeGraph/README.md @@ -1,438 +1,34 @@ # Compute Graph for streaming with CMSIS-DSP -## Introduction +## Table of contents -Embedded systems are often used to implement streaming solutions : the software is processing and / or generating stream of samples. The software is made of components that have no concept of streams : they are working with buffers. As a consequence, implementing a streaming solution is forcing the developer to think about scheduling questions, FIFO sizing etc ... +1. ### [Introduction](Introduction.md) -The CMSIS-DSP compute graph is a **low overhead** solution to this problem : it makes it easier to build streaming solutions by connecting components and computing a scheduling at **build time**. The use of C++ template also enables the compiler to have more information about the components for better code generation. +2. 
### How to get started -A dataflow graph is a representation of how compute blocks are connected to implement a streaming processing. + 1. [Simple graph creation example](examples/simple/README.md) -Here is an example with 3 nodes: + 2. [Simple graph creation example with CMSIS-DSP](examples/simpledsp/README.md) -- A source -- A filter -- A sink +3. ### [Examples](examples/README.md) -Each node is producing and consuming some amount of samples. For instance, the source node is producing 5 samples each time it is run. The filter node is consuming 7 samples each time it is run. +4. ### [Python API](documentation/PythonAPI.md) -The FIFOs lengths are represented on each edge of the graph : 11 samples for the leftmost FIFO and 5 for the other one. +5. ### [C++ Default nodes](documentation/CPPNodes.md) -In blue, the amount of samples generated or consumed by a node each time it is called. +6. ### [Python default nodes](documentation/PythonNodes.md) -graph1 +7. ### Extensions -When the processing is applied to a stream of samples then the problem to solve is : + 1. #### [Memory optimizations](documentation/Memory.md) -> **how the blocks must be scheduled and the FIFOs connecting the block dimensioned** + 2. #### [Cyclo-static scheduling](CycloStatic.md) -The general problem can be very difficult. But, if some constraints are applied to the graph then some algorithms can compute a static schedule at build time. + 3. #### [Dynamic / Asynchronous mode](Async.md) -When the following constraints are satisfied we say we have a Synchronous / Static Dataflow Graph: +8. ### [Maths principles](MATHS.md) -- Static graph : graph topology is not changing -- Each node is always consuming and producing the same number of samples (static flow) +9. 
### [FAQ](FAQ.md) -The CMSIS-DSP Compute Graph Tools are a set of Python scripts and C++ classes with following features: -- A compute graph and its static flow can be described in Python -- The Python script will compute a static schedule and the FIFOs size -- A static schedule is: - - A periodic sequence of functions calls - - A periodic execution where the FIFOs remain bounded - - A periodic execution with no deadlock : when a node is run there is enough data available to run it -- The Python script will generate a [Graphviz](https://graphviz.org/) representation of the graph -- The Python script will generate a C++ implementation of the static schedule -- The Python script can also generate a Python implementation of the static schedule (for use with the CMSIS-DSP Python wrapper) - -There is no FIFO underflow or overflow due to the scheduling. If there are not enough cycles to run the processing, the real-time will be broken and the solution won't work But this problem is independent from the scheduling itself. - -## Why it is useful - -Without any scheduling tool for a dataflow graph, there is a problem of modularity : a change on a node may impact other nodes in the graph. For instance, if the number of samples consumed by a node is changed: - -- You may need to change how many samples are produced by the predecessor blocks in the graph (assuming it is possible) -- You may need to change how many times the predecessor blocks must run -- You may have to change the FIFOs sizes - -With the CMSIS-DSP Compute Graph (CG) Tools you don't have to think about those details while you are still experimenting with your data processing pipeline. It makes it easier to experiment, add or remove blocks, change their parameters. - -The tools will generate a schedule and the FIFOs. Even if you don't use this at the end for a final implementation, the information could be useful : is the schedule too long ? Are the FIFOs too big ? 
Is there too much latency between the sources and the sinks ? - -Let's look at an (artificial) example: - -graph1 - -Without a tool, the user would probably try to modify the number of samples so that the number of sample produced is equal to the number of samples consumed. With the CG Tools we know that such a graph can be scheduled and that the FIFO sizes need to be 11 and 5. - -The periodic schedule generated for this graph has a length of 19. It is big for such a small graph and it is because, indeed 5 and 7 are not very well chosen values. But, it is working even with those values. - -The schedule is (the size of the FIFOs after the execution of the node displayed in the brackets): - -``` -source [ 5 0] -source [10 0] -filter [ 3 5] -sink [ 3 0] -source [ 8 0] -filter [ 1 5] -sink [ 1 0] -source [ 6 0] -source [11 0] -filter [ 4 5] -sink [ 4 0] -source [ 9 0] -filter [ 2 5] -sink [ 2 0] -source [ 7 0] -filter [ 0 5] -sink [ 0 0] -``` - -At the end, both FIFOs are empty so the schedule can be run again : it is periodic ! - -The compute graph is focusing on the synchronous / static case but some extensions have been introduced for more flexibility: - -* A [cyclo-static scheduling](CycloStatic.md) (nearly static) -* A [dynamic/asynchronous](Dynamic.md) mode - -Here is a summary of the different configuration supported by the compute graph. The cyclo-static scheduling is part of the static flow mode. - -![supported_configs](documentation/supported_configs.png) - -More details about the maths behind the code generator are available in a [separate document](MATHS.md). - -## How to use the static scheduler generator - -First, you must install the `CMSIS-DSP` PythonWrapper: - -``` -pip install cmsisdsp -``` - -The functions and classes inside the cmsisdsp wrapper can be used to describe and generate the schedule. 
- -To start, you can create a `graph.py` file and include : - -```python -from cmsisdsp.cg.scheduler import * -``` - -In this file, you can describe new type of blocks that you need in the compute graph if they are not provided by the python package by default. - -Finally, you can execute `graph.py` to generate the C++ files. - -The generated files need to include the `ComputeGraph/cg/src/GenericNodes.h` and the nodes used in the graph and which can be found in `cg/nodes/cpp`. Those headers are part of the CMSIS-DSP Pack. They are optional so you'll need to select the compute graph extension in the pack. - -If you have declared new nodes in `graph.py` then you'll need to provide an implementation. - -More details and explanations can be found in the documentation for the examples. The first example is a deep dive giving all the details about the Python and C++ sides of the tool: - -* [Example 1 : how to describe a simple graph](documentation/example1.md) -* [Example 2 : More complex example with delay and CMSIS-DSP](documentation/example2.md) -* [Example 3 : Working example with CMSIS-DSP and FFT](documentation/example3.md) -* [Example 4 : Same as example 3 but with the CMSIS-DSP Python wrapper](documentation/example4.md) -* [Example 10 : The asynchronous mode](documentation/example10.md) - -Examples 5 and 6 are showing how to use the CMSIS-DSP MFCC with a synchronous data flow. - -Example 7 is communicating with OpenModelica. The Modelica model (PythonTest) in the example is implementing a Larsen effect. - -Example 8 is showing how to define a new custom datatype for the IOs of the nodes. Example 8 is also demonstrating a new feature where an IO can be connected up to 3 inputs and the static scheduler will automatically generate duplicate nodes. - -## Frequently asked questions: - -There is a [FAQ](FAQ.md) document. - -## Options - -Several options can be used in the Python to control the schedule generation. 
Some options are used by the scheduling algorithm and other options are used by the code generators or graphviz generator: - -### Options for the graph - -Those options needs to be used on the graph object created with `Graph()`. - -For instance : - -```python -g = Graph() -g.defaultFIFOClass = "FIFO" -``` - -#### defaultFIFOClass (default = "FIFO") - -Class used for FIFO by default. Can also be customized for each connection (`connect` of `connectWithDelay` call) with something like: - -`g.connect(src.o,b.i,fifoClass="FIFOClassNameForThisConnection")` - -#### duplicateNodeClassName(default="Duplicate") - -Prefix used to generate the duplicate node classes like `Duplicate2`, `Duplicate3` ... - -### Options for the scheduling - -Those options needs to be used on a configuration objects passed as argument of the scheduling function. For instance: - -```python -conf = Configuration() -conf.debugLimit = 10 -sched = g.computeSchedule(config = conf) -``` - -Note that the configuration object also contain options for the code generators. - -#### memoryOptimization (default = False) - -When the amount of data written to a FIFO and read from the FIFO is the same, the FIFO is just an array. In this case, depending on the scheduling, the memory used by different arrays may be reused if those arrays are not needed at the same time. - -This option is enabling an analysis to optimize the memory usage by merging some buffers when it is possible. - -#### sinkPriority (default = True) - -Try to prioritize the scheduling of the sinks to minimize the latency between sources and sinks. - -When this option is enabled, the tool may not be able to find a schedule in all cases. If it can't find a schedule, it will raise a `DeadLock` exception. - -#### displayFIFOSizes (default = False) - -During computation of the schedule, the evolution of the FIFO sizes is generated on `stdout`. 
- -#### dumpSchedule (default = False) - -During computation of the schedule, the human readable schedule is generated on `stdout`. - -### Options for the code generator - -#### debugLimit (default = 0) - -When `debugLimit` is > 0, the number of iterations of the scheduling is limited to `debugLimit`. Otherwise, the scheduling is running forever or until an error has occured. - -#### dumpFIFO (default = False) - -When true, generate some code to dump the FIFO content at runtime. Only useful for debug. - -In C++ code generation, it is only available when using the mode `codeArray == False`. - -When this mode is enabled, the first line of the scheduler file is : - -`#define DEBUGSCHED 1` - -and it also enable some debug code in `GenericNodes.h` - -#### schedName (default = "scheduler") - -Name of the scheduler function used in the generated code. - -#### prefix (default = "") - -Prefix to add before the FIFO buffer definitions. Those buffers are not static and are global. If you want to use several schedulers in your code, the buffer names used by each should be different. - -Another possibility would be to make the buffer static by redefining the macro `CG_BEFORE_BUFFER` - -#### Options for C Code Generation only - -##### cOptionalArgs (default = "") - -Optional arguments to pass to the C API of the scheduler function - -It can either use a `string` or a list of `string` where an element is an argument of the function (and should be valid `C`). - -##### codeArray (default = True) - -When true, the scheduling is defined as an array. Otherwise, a list of function calls is generated. - -A list of function call may be easier to read but if the schedule is long, it is not good for code size. In that case, it is better to encode the schedule as an array rather than a list of functions. - -When `codeArray` is True, the option `switchCase`can also be used. - -##### switchCase (default = True) - -`codeArray` must be true or this option is ignored. 
- -When the schedule is encoded as an array, it can either be an array of function pointers (`switchCase` false) or an array of indexes for a state machine (`switchCase` true) - -##### eventRecorder (default = False) - -Enable the generation of `CMSIS EventRecorder` intrumentation in the code. The CMSIS-DSP Pack is providing definition of 3 events: - -* Schedule iteration -* Node execution -* Error - -##### customCName (default = "custom.h") - -Name of custom header in generated C code. If you use several scheduler, you may want to use different headers for each one. - -##### postCustomCName (default = "") - -Name of custom header in generated C code coming after all of the other includes. - -##### genericNodeCName (default = "GenericNodes.h") - -Name of GenericNodes header in generated C code. If you use several scheduler, you may want to use different headers for each one. - -##### appNodesCName (default = "AppNodes.h") - -Name of AppNodes header in generated C code. If you use several scheduler, you may want to use different headers for each one. - -##### schedulerCFileName (default = "scheduler") - -Name of scheduler cpp and header in generated C code. If you use several scheduler, you may want to use different headers for each one. - -If the option is set to `xxx`, the names generated will be `xxx.cpp` and `xxx.h` - -##### CAPI (default = True) - -By default, the scheduler function is callable from C. When false, it is a standard C++ API. - -##### CMSISDSP (default = True) - -If you don't use any of the datatypes or functions of the CMSIS-DSP, you don't need to include the `arm_math.h` in the scheduler file. This option can thus be set to `False`. - -##### asynchronous (default = False) - -When true, the scheduling is for a dynamic / asynchronous flow. A node may not always produce or consume the same amount of data. As consequence, a scheduling can fail. 
Each node needs to implement a `prepareForRunning` function to identify and recover from FIFO underflows and overflows. - -A synchronous schedule is used as start and should describe the average case. - -This implies `codeArray` and `switchCase`. This disables `memoryOptimizations`. - -Synchronous FIFOs that are just buffers will be considered as FIFOs in asynchronous mode. - -More info are available in the documentation for [this mode](Dynamic.md). - -##### FIFOIncrease (default 0) - -In case of dynamic / asynchronous scheduling, the FIFOs may need to be bigger than what is computed assuming a static / synchronous scheduling. This option is used to increase the FIFO size. It represents a percent increase. - -For instance, a value of 10 means the FIFO will have their size updated from `oldSize` to `1.1 * oldSize` which is ` (1 + 10%)* oldSize` - -If the value is a `float` instead of an `int` it will be used as is. For instance, `1.1` would increase the size by `1.1` and be equivalent to the setting `10` (for 10 percent). - -##### asyncDefaultSkip (default True) - -Behavior of a pure function (like CMSIS-DSP) in asynchronous mode. When `True`, the execution is skipped if the function can't be executed. If `False`, an error is raised. - -If another error recovery is needed, the function must be packaged into a C++ class to implement a `prepareForRun` function. - -#### Options for Python code generation only - -##### pyOptionalArgs (default = "") - -Optional arguments to pass to the Python version of the scheduler function - -##### customPythonName (default = "custom") - -Name of custom header in generated Python code. If you use several scheduler, you may want to use different headers for each one. - -##### appNodesPythonName (default = "appnodes") - -Name of AppNodes header in generated Python code. If you use several scheduler, you may want to use different headers for each one. 
- -##### schedulerPythonFileName (default = "sched") - -Name of scheduler file in generated Python code. If you use several scheduler, you may want to use different headers for each one. - -If the option is set to `xxx`, the name generated will be `xxx.py` - -### Options for the graphviz generator - -#### horizontal (default = True) - -Horizontal or vertical layout for the graph. - -#### displayFIFOBuf (default = False) - -By default, the graph is displaying the FIFO sizes. If you want to know with FIFO variable is used in the code, you can set this option to true and the graph will display the FIFO variable names. - -### Options for connections - -It is now possible to write something like: - -```python -g.connect(src.o,b.i,fifoClass="FIFOSource") -``` - -The `fifoClass` argument allows to choose a specific FIFO class in the generated C++ or Python. - -Only the `FIFO` class is provided by default. Any new implementation must inherit from `FIFObase` - -There is also an option to set the scaling factor when used in asynchronous mode: - -```python -g.connect(odd.o,debug.i,fifoScale=3.0) -``` - -When this option is set, it will be used (instead of the global setting). This must be a float. - -## How to build the examples - -In folder `ComputeGraph/example/build`, type the `cmake` command: - -```bash -cmake -DHOST=YES \ - -DDOT="path to dot.EXE" \ - -DCMSISCORE="path to cmsis core include directory" \ - -G "Unix Makefiles" .. -``` - -The Graphviz dot tool is requiring a recent version supporting the HTML-like labels. - -If cmake is successful, you can type `make` to build the examples. It will also build CMSIS-DSP for the host. - -If you don't have graphviz, the option -DDOT can be removed. - -If for some reason it does not work, you can go into an example folder (for instance example1), and type the commands: - -```bash -python graph.py -dot -Tpdf -o test.pdf test.dot -``` - -It will generate the C++ files for the schedule and a pdf representation of the graph. 
- -Note that the Python code is relying on the CMSIS-DSP PythonWrapper which is now also containing the Python scripts for the Synchronous Data Flow. - -For `example3` which is using an input file, `cmake` should have copied the input test pattern `input_example3.txt` inside the build folder. The output file will also be generated in the build folder. - -`example4` is like `example3` but in pure Python and using the CMSIS-DSP Python wrapper (which must already be installed before trying the example). To run a Python example, you need to go into an example folder and type: - -```bash -python main.py -``` - -`example7` is communicating with `OpenModelica`. You need to install the VHTModelica blocks from the [VHT-SystemModeling](https://github.com/ARM-software/VHT-SystemModeling) project on our GitHub - -## Limitations - -- CMSIS-DSP integration must be improved to make it easier -- The code is requiring a lot more comments and cleaning -- A C version of the code generator is missing -- The code generation could provide more flexibility for memory allocation with a choice between: - - Global - - Stack - - Heap - -## Default nodes -Here is a list of the nodes supported by default. More can be easily added: - -- Unary: - - Unary function with header `void function(T* src, T* dst, int nbSamples)` -- Binary: - - Binary function with header `void function(T* srcA, T* srcB, T* dst, int nbSamples)` -- CMSIS-DSP function: - - It will detect if it is an unary or binary function. - - The name must not contain the prefix `arm` nor the the type suffix - - For instance, use `Dsp("mult",CType(F32),NBSAMPLES)` to use `arm_mult_f32` - - Other CMSIS-DSP function (with an instance variable) are requiring the creation of a Node if it is not already provided -- CFFT / ICFFT : Use of CMSIS-DSP CFFT. 
Currently only F32, F16 and Q15 -- Zip / Unzip : To zip / unzip streams -- ToComplex : Map a real stream onto a complex stream -- ToReal : Extract real part of a complex stream -- FileSource and FileSink : Read/write float to/from a file (Host only) -- NullSink : Do nothing. Useful for debug -- InterleavedStereoToMono : Interleaved stereo converted to mono with scaling to avoid saturation of the addition -- Python only nodes: - - WavSink and WavSource to use wav files for testing - - VHTSDF : To communicate with OpenModelica using VHTModelica blocks diff --git a/ComputeGraph/cg/nodes/cpp/CFFT.h b/ComputeGraph/cg/nodes/cpp/CFFT.h index 043c86a4..ccfc173b 100644 --- a/ComputeGraph/cg/nodes/cpp/CFFT.h +++ b/ComputeGraph/cg/nodes/cpp/CFFT.h @@ -3,13 +3,11 @@ * Title: CFFT.h * Description: Node for CMSIS-DSP cfft * - * $Date: 30 July 2021 - * $Revision: V1.10.0 * * Target Processor: Cortex-M and Cortex-A cores - * -------------------------------------------------------------------- */ -/* - * Copyright (C) 2010-2021 ARM Limited or its affiliates. All rights reserved. + * -------------------------------------------------------------------- + * + * Copyright (C) 2021-2023 ARM Limited or its affiliates. All rights reserved. 
* * SPDX-License-Identifier: Apache-2.0 * @@ -45,7 +43,7 @@ public: status=arm_cfft_init_f32(&sfft,inputSize>>1); }; - int prepareForRunning() override + int prepareForRunning() final { if (this->willOverflow() || this->willUnderflow() @@ -57,7 +55,7 @@ public: return(0); }; - int run() override + int run() final { float32_t *a=this->getReadBuffer(); float32_t *b=this->getWriteBuffer(); @@ -85,7 +83,7 @@ public: status=arm_cfft_init_f16(&sfft,inputSize>>1); }; - int prepareForRunning() override + int prepareForRunning() final { if (this->willOverflow() || this->willUnderflow() @@ -97,7 +95,7 @@ public: return(0); }; - int run() override + int run() final { float16_t *a=this->getReadBuffer(); float16_t *b=this->getWriteBuffer(); @@ -124,7 +122,7 @@ public: status=arm_cfft_init_q15(&sfft,inputSize>>1); }; - int prepareForRunning() override + int prepareForRunning() final { if (this->willOverflow() || this->willUnderflow() @@ -136,7 +134,7 @@ public: return(0); }; - int run() override + int run() final { q15_t *a=this->getReadBuffer(); q15_t *b=this->getWriteBuffer(); diff --git a/ComputeGraph/cg/nodes/cpp/ICFFT.h b/ComputeGraph/cg/nodes/cpp/ICFFT.h index 59def60e..1349b041 100644 --- a/ComputeGraph/cg/nodes/cpp/ICFFT.h +++ b/ComputeGraph/cg/nodes/cpp/ICFFT.h @@ -3,13 +3,11 @@ * Title: ICFFT.h * Description: Node for CMSIS-DSP icfft * - * $Date: 30 July 2021 - * $Revision: V1.10.0 * * Target Processor: Cortex-M and Cortex-A cores - * -------------------------------------------------------------------- */ -/* - * Copyright (C) 2010-2021 ARM Limited or its affiliates. All rights reserved. + * -------------------------------------------------------------------- + * + * Copyright (C) 2021-2023 ARM Limited or its affiliates. All rights reserved. 
* * SPDX-License-Identifier: Apache-2.0 * @@ -45,7 +43,7 @@ public: status=arm_cfft_init_f32(&sifft,inputSize>>1); }; - int prepareForRunning() override + int prepareForRunning() final { if (this->willOverflow() || this->willUnderflow() @@ -57,7 +55,7 @@ public: return(0); }; - int run() override + int run() final { float32_t *a=this->getReadBuffer(); float32_t *b=this->getWriteBuffer(); @@ -85,7 +83,7 @@ public: status=arm_cfft_init_f16(&sifft,inputSize>>1); }; - int prepareForRunning() override + int prepareForRunning() final { if (this->willOverflow() || this->willUnderflow() @@ -97,7 +95,7 @@ public: return(0); }; - int run() override + int run() final { float16_t *a=this->getReadBuffer(); float16_t *b=this->getWriteBuffer(); @@ -125,7 +123,7 @@ public: status=arm_cfft_init_q15(&sifft,inputSize>>1); }; - int prepareForRunning() override + int prepareForRunning() final { if (this->willOverflow() || this->willUnderflow() @@ -137,7 +135,7 @@ public: return(0); }; - int run() override + int run() final { q15_t *a=this->getReadBuffer(); q15_t *b=this->getWriteBuffer(); diff --git a/ComputeGraph/cg/nodes/cpp/InterleavedStereoToMono.h b/ComputeGraph/cg/nodes/cpp/InterleavedStereoToMono.h index 8192182f..ab1daa28 100644 --- a/ComputeGraph/cg/nodes/cpp/InterleavedStereoToMono.h +++ b/ComputeGraph/cg/nodes/cpp/InterleavedStereoToMono.h @@ -3,13 +3,11 @@ * Title: InterleavedStereoToMono.h * Description: Interleaved Stereo to mono stream in Q15 * - * $Date: 06 August 2021 - * $Revision: V1.10.0 * * Target Processor: Cortex-M and Cortex-A cores - * -------------------------------------------------------------------- */ -/* - * Copyright (C) 2010-2021 ARM Limited or its affiliates. All rights reserved. + * -------------------------------------------------------------------- + * + * Copyright (C) 2021-2023 ARM Limited or its affiliates. All rights reserved. 
* * SPDX-License-Identifier: Apache-2.0 * @@ -40,7 +38,7 @@ public: InterleavedStereoToMono(FIFOBase &src,FIFOBase &dst): GenericNode(src,dst){}; - int prepareForRunning() override + int prepareForRunning() final { if (this->willOverflow() || this->willUnderflow() @@ -52,7 +50,7 @@ public: return(0); }; - int run() override + int run() final { q15_t *a=this->getReadBuffer(); q15_t *b=this->getWriteBuffer(); @@ -72,7 +70,7 @@ public: InterleavedStereoToMono(FIFOBase &src,FIFOBase &dst): GenericNode(src,dst){}; - int prepareForRunning() override + int prepareForRunning() final { if (this->willOverflow() || this->willUnderflow() @@ -84,7 +82,7 @@ public: return(0); }; - int run() override + int run() final { q31_t *a=this->getReadBuffer(); q31_t *b=this->getWriteBuffer(); @@ -104,7 +102,7 @@ public: InterleavedStereoToMono(FIFOBase &src,FIFOBase &dst): GenericNode(src,dst){}; - int prepareForRunning() override + int prepareForRunning() final { if (this->willOverflow() || this->willUnderflow() @@ -116,7 +114,7 @@ public: return(0); }; - int run() override + int run() final { float32_t *a=this->getReadBuffer(); float32_t *b=this->getWriteBuffer(); diff --git a/ComputeGraph/cg/nodes/cpp/MFCC.h b/ComputeGraph/cg/nodes/cpp/MFCC.h index 197c09e2..c7f4327a 100644 --- a/ComputeGraph/cg/nodes/cpp/MFCC.h +++ b/ComputeGraph/cg/nodes/cpp/MFCC.h @@ -3,13 +3,11 @@ * Title: MFCC.h * Description: Node for CMSIS-DSP MFCC * - * $Date: 06 October 2021 - * $Revision: V1.10.0 * * Target Processor: Cortex-M and Cortex-A cores - * -------------------------------------------------------------------- */ -/* - * Copyright (C) 2010-2021 ARM Limited or its affiliates. All rights reserved. + * -------------------------------------------------------------------- + * + * Copyright (C) 2021-2023 ARM Limited or its affiliates. All rights reserved. 
* * SPDX-License-Identifier: Apache-2.0 * @@ -58,7 +56,7 @@ public: #endif }; - int prepareForRunning() override + int prepareForRunning() final { if (this->willOverflow() || this->willUnderflow() @@ -70,7 +68,7 @@ public: return(0); }; - int run() override + int run() final { float32_t *a=this->getReadBuffer(); float32_t *b=this->getWriteBuffer(); @@ -101,7 +99,7 @@ public: #endif }; - int prepareForRunning() override + int prepareForRunning() final { if (this->willOverflow() || this->willUnderflow() @@ -113,7 +111,7 @@ public: return(0); }; - int run() override + int run() final { float16_t *a=this->getReadBuffer(); float16_t *b=this->getWriteBuffer(); @@ -140,7 +138,7 @@ public: memory.resize(2*inputSize); }; - int prepareForRunning() override + int prepareForRunning() final { if (this->willOverflow() || this->willUnderflow() @@ -152,7 +150,7 @@ public: return(0); }; - int run() override + int run() final { q31_t *a=this->getReadBuffer(); q31_t *b=this->getWriteBuffer(); @@ -178,7 +176,7 @@ public: memory.resize(2*inputSize); }; - int prepareForRunning() override + int prepareForRunning() final { if (this->willOverflow() || this->willUnderflow() @@ -190,7 +188,7 @@ public: return(0); }; - int run() override + int run() final { q15_t *a=this->getReadBuffer(); q15_t *b=this->getWriteBuffer(); diff --git a/ComputeGraph/cg/nodes/cpp/NullSink.h b/ComputeGraph/cg/nodes/cpp/NullSink.h index cde56640..f3669db4 100644 --- a/ComputeGraph/cg/nodes/cpp/NullSink.h +++ b/ComputeGraph/cg/nodes/cpp/NullSink.h @@ -3,13 +3,11 @@ * Title: NullSink.h * Description: Sink doing nothing for debug * - * $Date: 08 August 2021 - * $Revision: V1.10.0 * * Target Processor: Cortex-M and Cortex-A cores - * -------------------------------------------------------------------- */ -/* - * Copyright (C) 2010-2021 ARM Limited or its affiliates. All rights reserved. 
+ * -------------------------------------------------------------------- + * + * Copyright (C) 2021-2023 ARM Limited or its affiliates. All rights reserved. * * SPDX-License-Identifier: Apache-2.0 * @@ -35,7 +33,7 @@ class NullSink: public GenericSink public: NullSink(FIFOBase &src):GenericSink(src){}; - int prepareForRunning() override + int prepareForRunning() final { if (this->willUnderflow() ) @@ -46,7 +44,7 @@ public: return(0); }; - int run() override + int run() final { IN *b=this->getReadBuffer(); diff --git a/ComputeGraph/cg/nodes/cpp/OverlapAndAdd.h b/ComputeGraph/cg/nodes/cpp/OverlapAndAdd.h index 79986b8e..2ab36f35 100644 --- a/ComputeGraph/cg/nodes/cpp/OverlapAndAdd.h +++ b/ComputeGraph/cg/nodes/cpp/OverlapAndAdd.h @@ -3,13 +3,11 @@ * Title: OverlapAndAdd.h * Description: Overlap And Add * - * $Date: 25 October 2022 - * $Revision: V1.10.0 * * Target Processor: Cortex-M and Cortex-A cores - * -------------------------------------------------------------------- */ -/* - * Copyright (C) 2010-2022 ARM Limited or its affiliates. All rights reserved. + * -------------------------------------------------------------------- + * + * Copyright (C) 2021-2023 ARM Limited or its affiliates. All rights reserved. 
* * SPDX-License-Identifier: Apache-2.0 * @@ -40,7 +38,7 @@ public: memory.resize(overlap); }; - int prepareForRunning() override + int prepareForRunning() final { if (this->willOverflow() || this->willUnderflow() @@ -52,7 +50,7 @@ public: return(0); }; - int run() override + int run() final { int i; IN *a=this->getReadBuffer(); diff --git a/ComputeGraph/cg/nodes/cpp/SlidingBuffer.h b/ComputeGraph/cg/nodes/cpp/SlidingBuffer.h index 0b7621ae..b782cd52 100644 --- a/ComputeGraph/cg/nodes/cpp/SlidingBuffer.h +++ b/ComputeGraph/cg/nodes/cpp/SlidingBuffer.h @@ -3,13 +3,11 @@ * Title: SlidingBuffer.h * Description: Sliding buffer * - * $Date: 25 October 2022 - * $Revision: V1.10.0 * * Target Processor: Cortex-M and Cortex-A cores - * -------------------------------------------------------------------- */ -/* - * Copyright (C) 2010-2022 ARM Limited or its affiliates. All rights reserved. + * -------------------------------------------------------------------- + * + * Copyright (C) 2021-2023 ARM Limited or its affiliates. All rights reserved. * * SPDX-License-Identifier: Apache-2.0 * @@ -40,7 +38,7 @@ public: memory.resize(overlap); }; - int prepareForRunning() override + int prepareForRunning() final { if (this->willOverflow() || this->willUnderflow() @@ -52,7 +50,7 @@ public: return(0); }; - int run() override + int run() final { IN *a=this->getReadBuffer(); IN *b=this->getWriteBuffer(); diff --git a/ComputeGraph/cg/nodes/cpp/ToComplex.h b/ComputeGraph/cg/nodes/cpp/ToComplex.h index 5498ef1d..e6b26da5 100644 --- a/ComputeGraph/cg/nodes/cpp/ToComplex.h +++ b/ComputeGraph/cg/nodes/cpp/ToComplex.h @@ -3,13 +3,11 @@ * Title: ToComplex.h * Description: Node to convert real to complex * - * $Date: 30 July 2021 - * $Revision: V1.10.0 * * Target Processor: Cortex-M and Cortex-A cores - * -------------------------------------------------------------------- */ -/* - * Copyright (C) 2010-2021 ARM Limited or its affiliates. All rights reserved. 
+ * -------------------------------------------------------------------- + * + * Copyright (C) 2021-2023 ARM Limited or its affiliates. All rights reserved. * * SPDX-License-Identifier: Apache-2.0 * @@ -44,7 +42,7 @@ public: ToComplex(FIFOBase &src,FIFOBase &dst):GenericNode(src,dst){ }; - int prepareForRunning() override + int prepareForRunning() final { if (this->willOverflow() || this->willUnderflow() @@ -56,7 +54,7 @@ public: return(0); }; - int run() override + int run() final { IN *a=this->getReadBuffer(); IN *b=this->getWriteBuffer(); diff --git a/ComputeGraph/cg/nodes/cpp/ToReal.h b/ComputeGraph/cg/nodes/cpp/ToReal.h index 18e19557..13f8e0ed 100644 --- a/ComputeGraph/cg/nodes/cpp/ToReal.h +++ b/ComputeGraph/cg/nodes/cpp/ToReal.h @@ -3,13 +3,11 @@ * Title: ToReal.h * Description: Node to convert complex to reals * - * $Date: 30 July 2021 - * $Revision: V1.10.0 * * Target Processor: Cortex-M and Cortex-A cores - * -------------------------------------------------------------------- */ -/* - * Copyright (C) 2010-2021 ARM Limited or its affiliates. All rights reserved. + * -------------------------------------------------------------------- + * + * Copyright (C) 2021-2023 ARM Limited or its affiliates. All rights reserved. 
* * SPDX-License-Identifier: Apache-2.0 * @@ -43,7 +41,7 @@ public: ToReal(FIFOBase &src,FIFOBase &dst):GenericNode(src,dst){ }; - int prepareForRunning() override + int prepareForRunning() final { if (this->willOverflow() || this->willUnderflow() @@ -55,7 +53,7 @@ public: return(0); }; - int run() override + int run() final { IN *a=this->getReadBuffer(); IN *b=this->getWriteBuffer(); diff --git a/ComputeGraph/cg/nodes/cpp/Unzip.h b/ComputeGraph/cg/nodes/cpp/Unzip.h index 20f10eaf..eeca9826 100644 --- a/ComputeGraph/cg/nodes/cpp/Unzip.h +++ b/ComputeGraph/cg/nodes/cpp/Unzip.h @@ -3,13 +3,11 @@ * Title: Unzip.h * Description: Node to unzip a stream of pair * - * $Date: 30 July 2021 - * $Revision: V1.10.0 * * Target Processor: Cortex-M and Cortex-A cores - * -------------------------------------------------------------------- */ -/* - * Copyright (C) 2010-2021 ARM Limited or its affiliates. All rights reserved. + * -------------------------------------------------------------------- + * + * Copyright (C) 2021-2023 ARM Limited or its affiliates. All rights reserved. 
* * SPDX-License-Identifier: Apache-2.0 * @@ -46,7 +44,7 @@ public: Unzip(FIFOBase &src,FIFOBase &dst1,FIFOBase &dst2): GenericNode12(src,dst1,dst2){}; - int prepareForRunning() override + int prepareForRunning() final { if (this->willOverflow1() || this->willOverflow2() || @@ -62,7 +60,7 @@ public: /* 2*outputSize1 == 2*outSize2 == inputSize */ - int run() override + int run() final { IN *a=this->getReadBuffer(); IN *b1=this->getWriteBuffer1(); diff --git a/ComputeGraph/cg/nodes/cpp/Zip.h b/ComputeGraph/cg/nodes/cpp/Zip.h index 33c2500b..bae6da48 100644 --- a/ComputeGraph/cg/nodes/cpp/Zip.h +++ b/ComputeGraph/cg/nodes/cpp/Zip.h @@ -3,13 +3,11 @@ * Title: Zip.h * Description: Node to zip a pair of stream * - * $Date: 06 August 2021 - * $Revision: V1.10.0 * * Target Processor: Cortex-M and Cortex-A cores - * -------------------------------------------------------------------- */ -/* - * Copyright (C) 2010-2021 ARM Limited or its affiliates. All rights reserved. + * -------------------------------------------------------------------- + * + * Copyright (C) 2021-2023 ARM Limited or its affiliates. All rights reserved. 
* * SPDX-License-Identifier: Apache-2.0 * @@ -39,7 +37,7 @@ public: Zip(FIFOBase &src1,FIFOBase &src2,FIFOBase &dst): GenericNode21(src1,src2,dst){}; - int prepareForRunning() override + int prepareForRunning() final { if (this->willOverflow() || this->willUnderflow1() || @@ -52,7 +50,7 @@ public: return(0); }; - int run() override + int run() final { IN *a1=this->getReadBuffer1(); IN *a2=this->getReadBuffer2(); diff --git a/ComputeGraph/cg/nodes/cpp/host/FileSink.h b/ComputeGraph/cg/nodes/cpp/host/FileSink.h index dd6ced15..4907176d 100644 --- a/ComputeGraph/cg/nodes/cpp/host/FileSink.h +++ b/ComputeGraph/cg/nodes/cpp/host/FileSink.h @@ -3,13 +3,11 @@ * Title: FileSink.h * Description: Node for creating File sinks * - * $Date: 30 July 2021 - * $Revision: V1.10.0 * * Target Processor: Cortex-M and Cortex-A cores - * -------------------------------------------------------------------- */ -/* - * Copyright (C) 2010-2021 ARM Limited or its affiliates. All rights reserved. + * -------------------------------------------------------------------- + * + * Copyright (C) 2021-2023 ARM Limited or its affiliates. All rights reserved. 
* * SPDX-License-Identifier: Apache-2.0 * @@ -35,7 +33,7 @@ class FileSink: public GenericSink public: FileSink(FIFOBase &src, std::string name):GenericSink(src),output(name){}; - int prepareForRunning() override + int prepareForRunning() final { if (this->willUnderflow() ) @@ -46,7 +44,7 @@ public: return(0); }; - int run() override + int run() final { IN *b=this->getReadBuffer(); diff --git a/ComputeGraph/cg/nodes/cpp/host/FileSource.h b/ComputeGraph/cg/nodes/cpp/host/FileSource.h index 5ebf29dc..dd112fd0 100644 --- a/ComputeGraph/cg/nodes/cpp/host/FileSource.h +++ b/ComputeGraph/cg/nodes/cpp/host/FileSource.h @@ -3,13 +3,11 @@ * Title: FileSource.h * Description: Node for creating File sources * - * $Date: 30 July 2021 - * $Revision: V1.10.0 * * Target Processor: Cortex-M and Cortex-A cores - * -------------------------------------------------------------------- */ -/* - * Copyright (C) 2010-2021 ARM Limited or its affiliates. All rights reserved. + * -------------------------------------------------------------------- + * + * Copyright (C) 2021-2023 ARM Limited or its affiliates. All rights reserved. * * SPDX-License-Identifier: Apache-2.0 * @@ -46,7 +44,7 @@ public: }; - int prepareForRunning() override + int prepareForRunning() final { if (this->willOverflow() ) @@ -57,7 +55,7 @@ public: return(0); }; - int run() override + int run() final { string str; int i; diff --git a/ComputeGraph/cg/src/GenericNodes.h b/ComputeGraph/cg/src/GenericNodes.h index 552d9977..166c94b9 100644 --- a/ComputeGraph/cg/src/GenericNodes.h +++ b/ComputeGraph/cg/src/GenericNodes.h @@ -3,13 +3,11 @@ * Title: GenericNodes.h * Description: C++ support templates for the compute graph with static scheduler * - * $Date: 29 July 2021 - * $Revision: V1.10.0 * * Target Processor: Cortex-M and Cortex-A cores - * -------------------------------------------------------------------- */ -/* - * Copyright (C) 2010-2022 ARM Limited or its affiliates. All rights reserved. 
+ * -------------------------------------------------------------------- + * + * Copyright (C) 2021-2023 ARM Limited or its affiliates. All rights reserved. * * SPDX-License-Identifier: Apache-2.0 * @@ -83,16 +81,23 @@ class FIFO: public FIFOBase FIFO(uint8_t *buffer,int delay=0):mBuffer((T*)buffer),readPos(0),writePos(delay) {}; /* Not used in synchronous mode */ - bool willUnderflowWith(int nb) const override {return false;}; - bool willOverflowWith(int nb) const override {return false;}; - int nbSamplesInFIFO() const override {return 0;}; + bool willUnderflowWith(int nb) const final {return false;}; + bool willOverflowWith(int nb) const final {return false;}; + int nbSamplesInFIFO() const final {return 0;}; - T * getWriteBuffer(int nb) override + T * getWriteBuffer(int nb) final { T *ret; if (readPos > 0) { + /* This is re-aligning the read buffer. + Aligning buffer is better for vectorized code. + But it has an impact since more memcpy are + executed than required. + This is likely to be not so useful in practice + so a future version will optimize the memcpy usage + */ memcpy((void*)mBuffer,(void*)(mBuffer+readPos),(writePos-readPos)*sizeof(T)); writePos -= readPos; readPos = 0; @@ -103,7 +108,7 @@ class FIFO: public FIFOBase return(ret); }; - T* getReadBuffer(int nb) override + T* getReadBuffer(int nb) final { T *ret = mBuffer + readPos; @@ -145,16 +150,16 @@ class FIFO: public FIFOBase FIFO(uint8_t *buffer,int delay=0):mBuffer((T*)buffer),readPos(0),writePos(delay) {}; /* Not used in synchronous mode */ - bool willUnderflowWith(int nb) const override {return false;}; - bool willOverflowWith(int nb) const override {return false;}; - int nbSamplesInFIFO() const override {return 0;}; + bool willUnderflowWith(int nb) const final {return false;}; + bool willOverflowWith(int nb) const final {return false;}; + int nbSamplesInFIFO() const final {return 0;}; - T * getWriteBuffer(int nb) override + T * getWriteBuffer(int nb) final { return(mBuffer); }; - T* 
getReadBuffer(int nb) override + T* getReadBuffer(int nb) final { return(mBuffer); } @@ -198,7 +203,7 @@ class FIFO: public FIFOBase before using this function */ - T * getWriteBuffer(int nb) override + T * getWriteBuffer(int nb) final { T *ret; @@ -221,7 +226,7 @@ class FIFO: public FIFOBase before using this function */ - T* getReadBuffer(int nb) override + T* getReadBuffer(int nb) final { T *ret = mBuffer + readPos; @@ -230,17 +235,17 @@ class FIFO: public FIFOBase return(ret); } - bool willUnderflowWith(int nb) const override + bool willUnderflowWith(int nb) const final { return((nbSamples - nb)<0); } - bool willOverflowWith(int nb) const override + bool willOverflowWith(int nb) const final { return((nbSamples + nb)>length); } - int nbSamplesInFIFO() const override {return nbSamples;}; + int nbSamplesInFIFO() const final {return nbSamples;}; #ifdef DEBUGSCHED void dump() @@ -423,7 +428,7 @@ public: Duplicate2(FIFOBase &src,FIFOBase &dst1,FIFOBase &dst2): GenericNode12(src,dst1,dst2){}; - int prepareForRunning() override + int prepareForRunning() final { if (this->willUnderflow() || this->willOverflow1() || @@ -435,7 +440,7 @@ public: return(0); }; - int run() override { + int run() final { IN *a=this->getReadBuffer(); IN *b1=this->getWriteBuffer1(); IN *b2=this->getWriteBuffer2(); @@ -475,7 +480,7 @@ public: IN,inputSize, IN,inputSize>(src,dst1,dst2,dst3){}; - int prepareForRunning() override + int prepareForRunning() final { if (this->willUnderflow() || this->willOverflow1() || @@ -489,7 +494,7 @@ public: return(0); }; - int run() override { + int run() final { IN *a=this->getReadBuffer(); IN *b1=this->getWriteBuffer1(); IN *b2=this->getWriteBuffer2(); diff --git a/ComputeGraph/cg/src/cg_status.h b/ComputeGraph/cg/src/cg_status.h index 995b117b..dbf271db 100644 --- a/ComputeGraph/cg/src/cg_status.h +++ b/ComputeGraph/cg/src/cg_status.h @@ -1,3 +1,29 @@ +/* ---------------------------------------------------------------------- + * Project: CMSIS DSP Library + * 
Title: cg_status.h + * Description: Error code for the Compute Graph + * + * + * Target Processor: Cortex-M and Cortex-A cores + * -------------------------------------------------------------------- + * + * Copyright (C) 2021-2023 ARM Limited or its affiliates. All rights reserved. + * + * SPDX-License-Identifier: Apache-2.0 + * + * Licensed under the Apache License, Version 2.0 (the License); you may + * not use this file except in compliance with the License. + * You may obtain a copy of the License at + * + * www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an AS IS BASIS, WITHOUT + * WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + #ifndef _CG_STATUS_H_ diff --git a/ComputeGraph/documentation/CCodeGen.md b/ComputeGraph/documentation/CCodeGen.md new file mode 100644 index 00000000..8d13b978 --- /dev/null +++ b/ComputeGraph/documentation/CCodeGen.md @@ -0,0 +1,106 @@ +# C Code generation + +## API + +```python +def ccode(self,directory,config=Configuration()) +``` + +It is a method of the `Schedule` object returned by `computeSchedule`. + +It generates C++ code implementing the static schedule. + +* `directory` : The directory in which to generate the C++ files +* `config` : An optional configuration object + +## Options for C Code Generation + +### cOptionalArgs (default = "") + +Optional arguments to pass to the C API of the scheduler function. + +It can be either a `string` or a list of `string` where each element is an argument of the function (and should be valid `C`). + +For instance: + +```Python +conf.cOptionalArgs=["int someVariable"] +``` + +### codeArray (default = True) + +When true, the scheduling is defined as an array. Otherwise, a list of function calls is generated.
+ +A list of function calls may be easier to read but if the schedule is long, it is not good for code size. In that case, it is better to encode the schedule as an array rather than a list of functions. + +When `codeArray` is True, the option `switchCase` can also be used. + +### switchCase (default = True) + +`codeArray` must be true or this option is ignored. + +When the schedule is encoded as an array, it can either be an array of function pointers (`switchCase` false) or an array of indexes for a state machine (`switchCase` true). + +### eventRecorder (default = False) + +Enable the generation of `CMSIS EventRecorder` instrumentation in the code. The CMSIS-DSP Pack provides definitions for 3 events: + +* Schedule iteration +* Node execution +* Error + +### customCName (default = "custom.h") + +Name of custom header in generated C code. If you use several schedulers, you may want to use different headers for each one. + +### postCustomCName (default = "") + +Name of custom header in generated C code coming after all of the other includes. By default none is used. + +### genericNodeCName (default = "GenericNodes.h") + +Name of GenericNodes header in generated C code. If you use several schedulers, you may want to use different headers for each one. + +### appNodesCName (default = "AppNodes.h") + +Name of AppNodes header in generated C code. If you use several schedulers, you may want to use different headers for each one. + +### schedulerCFileName (default = "scheduler") + +Name of scheduler `cpp` and header in generated C code. If you use several schedulers, you may want to use different headers for each one. + +If the option is set to `xxx`, the names generated will be `xxx.cpp` and `xxx.h`. + +### CAPI (default = True) + +By default, the scheduler function is callable from C. When false, it is a standard C++ API.
+ +### CMSISDSP (default = True) + +If you don't use any of the datatypes or functions of CMSIS-DSP, you don't need to include `arm_math.h` in the scheduler file. This option can thus be set to `False`. + +### asynchronous (default = False) + +When true, the scheduling is for a dynamic / asynchronous flow. A node may not always produce or consume the same amount of data. As a consequence, a scheduling can fail. Each node needs to implement a `prepareForRunning` function to identify and recover from FIFO underflows and overflows. + +A synchronous schedule is used as a starting point and should describe the average case. + +This implies `codeArray` and `switchCase`. This disables `memoryOptimizations`. + +Synchronous FIFOs that are just buffers will be considered as FIFOs in asynchronous mode. + +More information is available in the documentation for [this mode](../Async.md). + +### FIFOIncrease (default = 0) + +In case of dynamic / asynchronous scheduling, the FIFOs may need to be bigger than what is computed assuming a static / synchronous scheduling. This option is used to increase the FIFO size. It represents a percent increase. + +For instance, a value of `10` means the FIFOs will have their size updated from `oldSize` to `1.1 * oldSize` which is `(1 + 10%) * oldSize`. + +If the value is a `float` instead of an `int` it will be used as is, as a scaling factor. For instance, `1.1` would multiply the size by `1.1` and be equivalent to the setting `10` (for 10 percent). + +### asyncDefaultSkip (default = True) + +Behavior of a pure function (like CMSIS-DSP) in asynchronous mode. When `True`, the execution is skipped if the function can't be executed. If `False`, an error is raised. + +If another error recovery is needed, the function must be packaged into a C++ class to implement a `prepareForRunning` function.
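The `FIFOIncrease` rule described above can be sketched as follows. This is only an illustration under stated assumptions: the helper name `scaled_fifo_size` is hypothetical and is not part of the CMSIS-DSP Python wrapper, and rounding up to the next whole sample is an assumption since the text does not say how fractional sizes are handled.

```python
import math


def scaled_fifo_size(old_size, fifo_increase):
    """Hypothetical helper, not part of the CMSIS-DSP API.

    Returns the FIFO size (in samples) after applying FIFOIncrease.
    """
    if isinstance(fifo_increase, float):
        # A float is used as is: it is the scaling factor itself.
        # Rounding up is an assumption of this sketch.
        return math.ceil(old_size * fifo_increase)
    # An int is a percent increase: 10 means (1 + 10%) * old_size.
    # Integer arithmetic avoids floating point rounding surprises;
    # adding 99 before the division rounds up (also an assumption).
    return (old_size * (100 + fifo_increase) + 99) // 100
```

With these assumptions, an `int` setting of `10` applied to a FIFO of 100 samples gives 110 samples, and a `float` setting of `1.5` applied to 64 samples gives 96 samples.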
\ No newline at end of file diff --git a/ComputeGraph/documentation/CPPNodes.md b/ComputeGraph/documentation/CPPNodes.md new file mode 100644 index 00000000..08912f54 --- /dev/null +++ b/ComputeGraph/documentation/CPPNodes.md @@ -0,0 +1,455 @@ +# CPP Nodes and classes + +## Mandatory classes + +Those classes are defined in `GenericNodes.h` a header that is always included by the scheduler. + +As consequence, the definition for those classes is always included : that's the meaning of mandatory. + +### FIFO + +FIFO classes are inheriting from the virtual class `FIFOBase`: + +```C++ +template +class FIFOBase{ +public: + virtual T* getWriteBuffer(int nb)=0; + virtual T* getReadBuffer(int nb)=0; + virtual bool willUnderflowWith(int nb) const = 0; + virtual bool willOverflowWith(int nb) const = 0; + virtual int nbSamplesInFIFO() const = 0; + +}; +``` + +The functions `willUnderflowWith`, `willOverflowWith` and `nbSamplesInFIFO` are only used in asynchronous mode. + +If you implement a FIFO for synchronous mode you only need to implement `getWriteBuffer` and `getReadBuffer`. + +FIFO must be templates with a type defined as: + +```C++ +template +class FIFO; +``` + +* `T` is a C datatype that must have value semantic : standard C type like `float` or `struct` +* `length` is the length of the FIFO in **samples** +* `isArray` is set to 1 when the scheduler has identified that the FIFO is always used as a buffer. So it is possible to provide a more optimized implementation for this case +* `isAsync` is set to 1 for the asynchronous mode + +If you implement you own FIFO class, it should come from a template with the same arguments. For instance: + +```C++ +template +class MyCustomFIFO; +``` + +and it should inherit from `FIFOBase`. + +`GenericNodes.h` is providing 3 default implementations. 
Their are specialization of the FIFO template: + +#### FIFO for synchronous mode + +```C++ +template +class FIFO: public FIFOBase +``` + +#### Buffer for synchronous mode + +In some case a FIFO is just used as a buffer. An optimized implementation for this case is provided + +```C++ +template +class FIFO: public FIFOBase +``` + +In this mode, the FIFO implementation is very light. For instance, for `getWriteBuffer` we have: + +```C++ +T * getWriteBuffer(int nb) const final +{ + return(mBuffer); +}; +``` + +#### FIFO for asynchronous mode + +```C++ +template +class FIFO: public FIFOBase +``` + +This implementation is a bit more heavy and is providing implementations of following function that is doing something useful : + +```C++ +bool willUnderflowWith(int nb) const; +bool willOverflowWith(int nb) const; +int nbSamplesInFIFO() const; +``` + +### Nodes + +Nodes are inheriting from the virtual class: + +```C++ +class NodeBase +{ +public: + virtual int run()=0; + virtual int prepareForRunning()=0; +}; +``` + +`GenericNode`, `GenericSource` and `GenericSink` are providing accesses to the FIFOs for each IO. The goal of those wrappers is to define the IOs (number of IO, their type and length) and hide the API to the FIFOs. + +There are different versions depending on the number of inputs and/or output. Other nodes of that kind can be created by the user if different IO configurations are required: + +#### GenericNode + +The template is: + +```C++ +template +class GenericNode:public NodeBase +``` + +There is one input and one output. + +The constructor is: + +```C++ +GenericNode(FIFOBase &src,FIFOBase &dst); +``` + +It is taking the input and output FIFOs as argument. The real type of the FIFO is hidden since the type `FIFOBase` is used. So `GenericNode` can be used with any FIFO implementation. + +The main role of this `GenericNode` class is to provide functions to connect to the FIFOs. 
+ +The functions to access the FIFO buffers are: + +```C++ +OUT * getWriteBuffer(int nb = outputSize); +IN * getReadBuffer(int nb = inputSize); +``` + +`getWriteBuffer` is getting a pointer to a buffer of length `nb` to write the output samples. + +`getReadBuffer` is getting a pointer to a buffer of length `nb` to read the input samples. + +`nb` must be chosen so that there is no underflow / overflow. In synchronous mode, it will work by design if the length defined in the template argument is used. The template length is thus chosen as default value for `nb`. + +This value may be changed in cyclo-static or asynchronous mode. In asynchronous mode, additional functions are provided to test for a possibility of underflow / overflow **before** getting a pointer to the buffer. + +It is done with following function that are also provided by `GenericNode`: + +```C++ +bool willOverflow(int nb = outputSize); +bool willUnderflow(int nb = inputSize); +``` + +All of those functions introduced by `GenericNode` are doing nothing more than calling the underlying FIFO methods. But they hide those FIFOs from the user code. The FIFO can only be accessed through those APIs. + +#### GenericNode12 + +Same as `GenericNode` but with two outputs. + +```C++ +template +class GenericNode12:public NodeBase +``` + +It provides: + +```C++ +IN * getReadBuffer(int nb=inputSize); +OUT1 * getWriteBuffer1(int nb=output1Size); +OUT2 * getWriteBuffer2(int nb=output2Size); + +bool willUnderflow(int nb = inputSize); +bool willOverflow1(int nb = output1Size); +bool willOverflow2(int nb = output2Size); +``` + +#### GenericNode13 + +Same but with 3 outputs. + +#### GenericNode21 + +Same but with 2 inputs and 1 output. + +#### GenericSource + +Similar to a `GenericNode` but there is no inputs. + +#### GenericSink + +Similar to a `GenericNode` but there is no outputs. + +#### Duplicate2 + +This node is duplicating its input to 2 outputs. 
+ +The template is: + +```C++ +template +class Duplicate2; +``` + +Only one specialization of this template makes sense : the output must have same type and same length as the input. + +```C++ +template +class Duplicate2 : +public GenericNode12 +``` + + + +#### Duplicate3 + +Similar to `Duplicate2` but with 3 outputs. + +## Optional nodes + +Those nodes are not included by default. They can be found in `ComputeGraph/cg/nodes/cpp` + +To use any of them you just need to include the header (for instance in your `AppNodes.h` file): + +```C++ +#include "CFFT.h" +``` + +### CFFT / CIFFT + +Those nodes are for using the CMSIS-DSP FFT. + +Template: + +```C++ +template +class CFFT; +``` + +Specialization provided only for `float32_t`, `float16_t`,`q15_t`. + +The wrapper is copying the input buffer before doing the FFT (since CMSIS-DSP FFT is modifying the input buffer). It is normally possible to modify the input buffer even if it is in the input FIFO. + +This implementation has made the choice of not touching the input FIFO with the cost of an additional copy. + +Other data types can be easily added based on the current provided example. The user can just implement other specializations. + +`CIFFT` is defined with class `CIFFT`. + +### InterleavedStereoToMono + +Deinterleave a stream of stereo samples to **one** stream of mono samples. + +Template: + +```C++ +template +class InterleavedStereoToMono; +``` + +For specialization `q15_t` and `q31_t`, the inputs are divided by 2 before being added to avoid any overflow. + +For specialization `float32_t` : The output is multiplied by `0.5f` for consistency for the fixed point version. + +### MFCC + +Those nodes are for using the CMSIS-DSP MFCC. + +Template: + +```C++ +template +class MFCC; +``` + +Specializations provided for `float32_t`, `float16_t`, `q31_t` and `q15_t`. + +The MFCC is requiring a temporary buffer. The wrappers are thus allocating a memory buffer during initialization of the node. 
+ +The buffer is allocated as a C++ vector. See the documentation of the MFCC in CMSIS-DSP to know more about the size of this buffer. + +### NullSink + +Template: + +```C++ +template +class NullSink: public GenericSink +``` + +It is useful for development and debug. This node is doing nothing and just consuming its input. + +### OverlapAndAdd + +Template: + +```c++ +template +class OverlapAdd: public GenericNode +``` + +There are two sizes in the template arguments : `windowSize` and `overlap`. + +From those size, the template is computing the number of samples consumed and produced by the node. + +The implementation is generic but will only build for a type `IN` having an addition operator. + +This node is using a little memory (C++ vector) of size `overlap` that is allocated during creation of the node. + +This node will overlap input data by `overlap` samples and add the common overlapping samples. + +### SlidingBuffer + +Template: + +```C++ +template +class SlidingBuffer: public GenericNode + +``` + +There are two sizes in the template arguments : `windowSize` and `overlap`. + +For those size, the template is computing the number of samples consumed and produced by the node. + +The implementation is generic and will work with all types. + +This node is using a little memory (C++ vector) of size `overlap` allocated during creation of the node. + +This node is moving a window on the input data with an overlap. The output data is the content of the window. + +Note that this node is not doing any multiplication with window functions that can be found in signal processing literature. This multiplication has to be implemented in the compute graph in a separate node. + +### ToComplex + +Template: + +```C++ +template +class ToComplex; +``` + +Convert a stream of reals a b c d ... to complexes a 0 b 0 c 0 d 0 ... + +The implementation is generic and does not enforce the required size constraints. 
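The conversion performed by `ToComplex` can be illustrated with a few lines of Python. This is a sketch of the data layout only, not the C++ template: the function name `to_complex` is hypothetical, and it assumes that each real sample is followed by a zero imaginary part, so the output holds twice as many values as the input.

```python
def to_complex(reals):
    """Interleave each real sample with a zero imaginary part (illustrative only)."""
    out = []
    for value in reals:
        # Real part first, then an imaginary part equal to zero.
        out.extend((value, 0.0))
    return out


interleaved = to_complex([1.0, 2.0, 3.0])
```

Since the node itself does not enforce size constraints, the lengths declared on its input and output in the graph have to be kept consistent with this layout by the user.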
+ +### ToReal + +Template: + +```C++ +template +class ToReal; +``` + +Convert a stream of complex a 0 b 0 c 0 ... to reals a b c ... + +The implementation is generic and does not enforce the required size constraints. + +### Unzip + +Template: + +```C++ +template +class Unzip; +``` + +Unzip a stream a1 a2 b1 b2 c1 c2 ... + +Into 2 streams: +a1 b1 c1 ... +a2 b2 c2 ... + +The implementation is generic and does not enforce the required size constraints. + +### Zip + +Template: + +```C++ +template +class Zip; +``` + +Transform two input streams: + +a1 b1 c1 ... + +a2 b2 c2 ... + +into one output stream: + +a1 a2 b1 b2 c1 c2 ... + +The implementation is generic and does not enforce the required size constraints + +### Host + +Those nodes are for host (Windows, Linux, Mac). They can be useful to experiment with a compute graph. + +By default there is no nodes to read / write `.wav` files but you can easily add some if needed (`dr_wav.h` is a simple way to add `.wav` reading / writing and is freely available from the web). + +#### FileSink + +Template + +```C++ +template +class FileSink: public GenericSink +``` + +Write the input samples to a file. The implementation is generic and use iostream for writing the datatype. + +The constructor has an additional argument : the name/path of the output file: + +```C++ +FileSink(FIFOBase &src, std::string name) +``` + +#### FileSource + +Template: + +```C++ +template class FileSource; +``` + +There is only one specialization for the `float32_t` type. + +It is reading text file with one float per file and generating a stream of float. + +At the end of file, 0s are generated on the output indefinitely. 
+ +The constructor has an additional argument : the name/path of the input file: + +```C++ +FileSource(FIFOBase &dst,std::string name) +``` + diff --git a/ComputeGraph/documentation/CodegenOptions.md b/ComputeGraph/documentation/CodegenOptions.md new file mode 100644 index 00000000..b54e6dd9 --- /dev/null +++ b/ComputeGraph/documentation/CodegenOptions.md @@ -0,0 +1,29 @@ +# Common options for the code generators + +Global options for the code generators. There are specific options for the C, Python and Graphviz generators. They are described in different parts of the documentation. + +## debugLimit (default = 0) + +When `debugLimit` is > 0, the number of iterations of the scheduling is limited to `debugLimit`. Otherwise, the scheduling is running forever or until an error has occurred. + +## dumpFIFO (default = False) + +When true, generate some code to dump the FIFO content at **runtime**. Only useful for debug. + +In C++ code generation, it is only available when using the mode `codeArray == False`. + +When this mode is enabled, the first line of the scheduler file is : + +`#define DEBUGSCHED 1` + +and it also enables some debug code in `GenericNodes.h` + +## schedName (default = "scheduler") + +Name of the scheduler function used in the generated code. + +## prefix (default = "") + +Prefix to add before the FIFO buffer definitions. Those buffers are not static and are global. If you want to use several schedulers in your code, the buffer names used by each should be different. + +Another possibility would be to make the buffer static by redefining the macro `CG_BEFORE_BUFFER` \ No newline at end of file diff --git a/ComputeGraph/documentation/Generic.md b/ComputeGraph/documentation/Generic.md new file mode 100644 index 00000000..f867c7d3 --- /dev/null +++ b/ComputeGraph/documentation/Generic.md @@ -0,0 +1,129 @@ +# Generic and function nodes + +The generic and function nodes are the basic nodes that you use to create other kinds of nodes in the graph. 
+ +There are 3 generic classes provided by the framework to be used to create new nodes : + +* `GenericSource` +* `GenericNode` +* `GenericSink` + +They are defined in `cmsisdsp.cg.scheduler` + +There are 3 other classes that can be used to create new nodes from functions: + +* `Unary` +* `Binary` +* `Dsp` + +## Generic Nodes + +Any new kind of node must inherit from one of those classes. Those classes are providing the methods `addInput` and/or `addOutput` to define new IOs. + +The method `typeName` from the parent class must be overridden. + +A new kind of node is generally defined as: + +```python +class ProcessingNode(GenericNode): + def __init__(self,name,theType,inLength,outLength): + GenericNode.__init__(self,name) + self.addInput("i",theType,inLength) + self.addOutput("o",theType,outLength) + + @property + def typeName(self): + return "ProcessingNode" +``` + +See the [simple](../examples/simple/README.md) example for more explanation about how to define a new node. + +### Methods + +The constructor of the node is using the `addInput` and/or `addOutput` to define new IOs. + +```python +def addInput(self,name,theType,theLength): +``` + +* `name` is the name of the input. It will becomes a property of the Python object so it must not conflict with existing properties. If `name` is, for instance, "i" then it can be accessed with `node.i` in the code +* `theType` is the datatype of the IO. It must inherit from `CGStaticType` (see below for more details about defining the types) +* `theLength` is the amount of **samples** consumed by this IO at each execution of the node + +```python +def addOutput(self,name,theType,theLength): +``` + +* `name` is the name of the input. It will becomes a property of the Python object so it must not conflict with existing properties. If `name` is, for instance, "o" then it can be accessed with `node.o` in the code +* `theType` is the datatype of the IO. 
It must inherit from `CGStaticType` (see below for more details about defining the types) +* `theLength` is the amount of **samples** produced by this IO at each execution of the node + +```python +@property +def typeName(self): + return "ProcessingNode" +``` + +This method defines the name of the C++ class implementing the wrapper for this node. + +### Datatypes + +Datatypes for the IOs are inheriting from `CGStaticType`. + +Currently there are two classes defined: + +* `CType` for the standard CMSIS-DSP types +* `CStructType` for a C struct + +#### CType + +You create such a type with `CType(id)` where `id` is one of the constant coming from the Python wrapper: + +* F64 +* F32 +* F16 +* Q31 +* Q15 +* Q7 +* UINT32 +* UINT16 +* UINT8 +* SINT32 +* SINT16 +* SINT8 + +For instance, to define a `float32_t` type for an IO you can use `CType(F32)` + +#### CStructType + +The constructor has the following definition + +```python +def __init__(self,name,python_name,size_in_bytes): +``` + +* `name` is the name of the C struct +* `python_name` is the name of the Python class implementing this type (when you generate a Python schedule) +* `size_in_bytes` is the size of the struct. It should take into account padding. It is used in case of buffer sharing since the datatype of the shared buffer is `int8_t`. The Python script must be able to compute the size of those buffers and needs to know the size of the structure. + +In Python, there is no `struct`. This datatype is mapped to an object. Object have reference type. Compute graph FIFOs are assuming a value type semantic. + +As consequence, in Python side you should never copy those structs since it would copy the reference. You should instead copy the members of the struct. + +If you don't plan on generating a Python scheduler, you can just use whatever name you want for the `python_name`. It will be ignored by the C++ code generation. 
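The reference versus value issue described for `CStructType` can be made concrete with a small Python sketch. The class `MyComplex` and the helper `copy_members` are hypothetical names chosen for illustration, and the two lists stand in for the read and write buffers of a FIFO; none of this is taken from the generated scheduler.

```python
class MyComplex:
    """Hypothetical Python class standing in for a C struct with two floats."""

    def __init__(self, re=0.0, im=0.0):
        self.re = re
        self.im = im


def copy_members(dst, src):
    """Copy field by field so that dst stays a distinct object."""
    dst.re = src.re
    dst.im = src.im


read_buffer = [MyComplex(1.0, 2.0)]
write_buffer = [MyComplex()]

# Wrong: this would store a reference, so both buffers would share one object.
# write_buffer[0] = read_buffer[0]

# Right: copy the members into the object already owned by the write buffer.
copy_members(write_buffer[0], read_buffer[0])
write_buffer[0].re = 10.0  # only the write buffer changes
```

Copying the members keeps the two buffers independent, which is the behavior the C++ side gets for free from value semantics.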
+ +## Function and constant nodes + +A Compute graph C++ wrapper is useful when the software components you use have a state that needs to be initialized in the C++ constructor, and preserved between successive calls to the `run` method of the wrapper. + +Most CMSIS-DSP functions have no state. The compute graph framework is providing some ways to easily use functions in the graph without having to write a wrapper. + +This feature is relying on the nodes: + +* `Unary` +* `Binary` +* `Dsp` +* `Constant` + +All of this is explained in detail in the [simple example with CMSIS-DSP](../examples/simpledsp/README.md). + diff --git a/ComputeGraph/documentation/Graph.md b/ComputeGraph/documentation/Graph.md new file mode 100644 index 00000000..9ddcd4a8 --- /dev/null +++ b/ComputeGraph/documentation/Graph.md @@ -0,0 +1,57 @@ +# API of the Graph Class + +## Creating a connection + +Those methods must be applied to a graph object created with `Graph()`. The `Graph` class is defined inside `cmsisdsp.cg.scheduler` from the CMSIS-DSP Python wrapper. + +```python +def connect(self,input_io,output_io,fifoClass=None,fifoScale = 1.0): +``` + +Typically this method is used as: + +```python +the_graph = Graph() + +# Connect the source output to the processing node input +the_graph.connect(src.o,processing.i) +``` + +There are two optional arguments: + +* `fifoClass` : To use a different C++ class for implementing the connection between the two IOs (it is also possible to change the FIFO class globally by setting an option on the graph, see below). Only the `FIFO` class is provided by default. Any new implementation must inherit from `FIFOBase` +* `fifoScale` : In asynchronous mode, it is a scaling factor to increase the length of the FIFO compared to what has been computed by the synchronous approximation. This setting can also be set globally using the scheduler options. `fifoScale` is overriding the global setting. It must be a `float` (not an `int`). 
+ +```python +def connectWithDelay(self,input_io,output_io,delay,fifoClass=None,fifoScale=1.0): +``` + +The only difference with the previous function is the `delay` argument. It could be used like: + +```python +the_graph.connectWithDelay(src.o,processing.i, 10) +``` + +The `delay` is the number of samples contained in the FIFO at start (initialized to zero). The FIFO length (computed by the scheduling) is generally bigger by this number of samples. The result is that the output is delayed by `delay` samples. + +It is generally useful when the graph has some loops to make it schedulable. + +## Options for the graph + +Those options need to be set on the graph object created with `Graph()`. + +For instance : + +```python +g = Graph() +g.defaultFIFOClass = "FIFO" +``` + +### defaultFIFOClass (default = "FIFO") + +Class used for FIFO by default. Can also be customized for each connection (`connect` or `connectWithDelay` call). + +### duplicateNodeClassName(default="Duplicate") + +Prefix used to generate the duplicate node classes like `Duplicate2`, `Duplicate3` ... + diff --git a/ComputeGraph/documentation/GraphvizGen.md b/ComputeGraph/documentation/GraphvizGen.md new file mode 100644 index 00000000..0859a039 --- /dev/null +++ b/ComputeGraph/documentation/GraphvizGen.md @@ -0,0 +1,24 @@ + + +# Graphviz generation + +## API + +```python +def graphviz(self,f,config=Configuration()) +``` + +It is a method of the `Schedule` object returned by `computeSchedule`. + +* `f` : Opened file where to write the graphviz description +* `config` : An optional configuration object + +## Options for the graphviz generator + +### horizontal (default = True) + +Horizontal or vertical layout for the graph. + +### displayFIFOBuf (default = False) + +By default, the graph is displaying the FIFO sizes computed as a result of the scheduling. If you want to know the FIFO variable names used in the code, you can set this option to true and the graph will display the FIFO variable names. 
\ No newline at end of file diff --git a/ComputeGraph/documentation/Memory.md b/ComputeGraph/documentation/Memory.md new file mode 100644 index 00000000..af24d8d5 --- /dev/null +++ b/ComputeGraph/documentation/Memory.md @@ -0,0 +1,79 @@ +# Memory optimizations + +## Buffers + +Sometimes, a FIFO is in fact a buffer. In below graph, the source is writing 5 samples and the sink is reading 5 samples. + +![buffer](buffer.png) + +The scheduling will obviously be something like: + +`Source, Sink, Source, Sink ...` + +In this case, the FIFO is used as a simple buffer. The read and the write are always taking place from the start of the buffer. + +The schedule generator will detect FIFOs that are used as buffer and the FIFO implementation will be replaced by buffers : the third argument of the template (`isArray`) is set to one: + +```C++ +FIFO fifo0(buf1); +``` + +## Buffer sharing + +When several FIFOs are used as buffers then it may be possible to share the underlying memory for all of those buffers. This optimization is enabled by setting `memoryOptimization` to `true` in the configuration object: + +```python +conf.memoryOptimization=True +``` + +The optimization depends on how the graph has been scheduled. + +With the following graph there is a possibility for buffer sharing: + +![memory](memory.png) + +Without `memoryOptimization`, the FIFO are consuming 60 bytes (4*5 * 3 FIFOs). With `memoryOptimization`, only 40 bytes are needed. + +You cannot share memory for the input / output of a node since a node needs both to read and write for its execution. This imposes some constraints on the graph. + +The constraints are internally represented by a different graph that represents when buffers are live at the same time : the interference graph. The input / output buffers of a node are live at the same time. Graph coloring is used to identify, from this graph of interferences, when memory for buffers can be shared. 
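The idea of sharing buffers by coloring the interference graph can be sketched in Python. This is a minimal greedy coloring written for illustration; the actual coloring algorithm used by the scheduler is not described here, so the strategy, the function name `color_buffers` and the data layout are all assumptions. The example assumes a chain of three 20-byte FIFOs where only consecutive FIFOs (the input and output of a same node) interfere.

```python
def color_buffers(edge_sizes, interferences):
    """Greedy coloring: map each FIFO edge to a shared buffer index.

    edge_sizes: dict mapping an edge id to its size in bytes.
    interferences: set of frozenset({a, b}) for edges live at the same time.
    """
    assignment = {}
    for edge in sorted(edge_sizes):
        # Colors already taken by edges that interfere with this one.
        used = {
            assignment[other]
            for other in assignment
            if frozenset((edge, other)) in interferences
        }
        color = 0
        while color in used:
            color += 1
        assignment[edge] = color
    # Each shared buffer must be large enough for the biggest edge mapped to it.
    buffer_sizes = {}
    for edge, color in assignment.items():
        buffer_sizes[color] = max(buffer_sizes.get(color, 0), edge_sizes[edge])
    return assignment, buffer_sizes


# Assumed topology: source -> node -> node -> sink, 5 float32 samples per FIFO.
edge_sizes = {0: 20, 1: 20, 2: 20}
interferences = {frozenset((0, 1)), frozenset((1, 2))}
assignment, buffer_sizes = color_buffers(edge_sizes, interferences)
total_bytes = sum(buffer_sizes.values())
```

Under these assumptions the first and last FIFO end up in the same buffer, so two 20-byte buffers are enough instead of three, which is consistent with the 60 bytes versus 40 bytes figures quoted above.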
+ +The interference graph is highly depend on how the compute graph is scheduled : a buffer is live when a write has taken place but no read has yet read the full content. + +For the above compute graph and its computed schedule, the interference graph would be: + +![inter](inter.png) + + + +Adjacent vertices in the graph should use different colors. A coloring of this graph is equivalent to assigning memory areas. Graph coloring of the previous interference graph is giving the following buffer sharing: + +![fifos](fifos.png) + +The dimension of the buffer is the maximum for all the edges using this buffers. + +In the C++ code it is represented as: + +```C++ +#define BUFFERSIZE0 20 +CG_BEFORE_BUFFER +uint8_t buf0[BUFFERSIZE0]={0}; +``` + +`uint8_t` is used (instead of the `float32_t` of this example) because different edges of the graph may use different datatypes. + +It is really important that you use the macro `CG_BEFORE_BUFFER` to align this buffer so that the alignment is coherent with the datatype used on all the FIFOs. + +### Shared buffer sizing + +Let's look at a more complex example to see how the size of the shared buffer is computed: + +![shared_complex](shared_complex.png) + +The source is generating 10 samples instead of 5. The FIFOs are using 80 bytes without buffer sharing. + +With buffer sharing, 60 bytes are used. The buffer sharing is: + +![shared_complex_buffer](shared_complex_buffer.png) + +Buffer 1 is used by first and last edge in the graph. The dimension of this buffer is 40 bytes : big enough to be usable by edge 0 and edge 3 in the graph. \ No newline at end of file diff --git a/ComputeGraph/documentation/PythonAPI.md b/ComputeGraph/documentation/PythonAPI.md new file mode 100644 index 00000000..3d047f50 --- /dev/null +++ b/ComputeGraph/documentation/PythonAPI.md @@ -0,0 +1,30 @@ +# Python API + +Python APIs to describe the nodes and graph and generate the C++, Python or Graphviz code. + +1. ## [Graph class](Graph.md) + +2. 
## [Generic and function nodes](Generic.md) + +3. ## Scheduler + + 1. ### [Schedule computation](SchedOptions.md) + + 2. ### Code generation + + 1. #### [C++ Code generation](CCodeGen.md) + + 2. #### [Python code generation](PythonGen.md) + + 3. #### [Graphviz representation](GraphvizGen.md) + + 4. #### [Common options](CodegenOptions.md) + + + + + + + + + diff --git a/ComputeGraph/documentation/PythonGen.md b/ComputeGraph/documentation/PythonGen.md new file mode 100644 index 00000000..9a65d6b8 --- /dev/null +++ b/ComputeGraph/documentation/PythonGen.md @@ -0,0 +1,34 @@ +# Python code generation + +## API + +```python +def pythoncode(self,directory,config=Configuration()) +``` + +It is a method of the `Schedule` object returned by `computeSchedule`. + +It generates Python code to implement the static schedule. + +* `directory` : The directory where to generate the Python files +* `config` : An optional configuration object + +## Options for Python code generation + +### pyOptionalArgs (default = "") + +Optional arguments to pass to the Python version of the scheduler function + +### customPythonName (default = "custom") + +Name of custom module in generated Python code. If you use several schedulers, you may want to use different names for each one. + +### appNodesPythonName (default = "appnodes") + +Name of AppNodes module in generated Python code. If you use several schedulers, you may want to use different names for each one. + +### schedulerPythonFileName (default = "sched") + +Name of scheduler file in generated Python code. If you use several schedulers, you may want to use different names for each one. 
+ +If the option is set to `xxx`, the name generated will be `xxx.py` \ No newline at end of file diff --git a/ComputeGraph/documentation/PythonNodes.md b/ComputeGraph/documentation/PythonNodes.md new file mode 100644 index 00000000..c5f29868 --- /dev/null +++ b/ComputeGraph/documentation/PythonNodes.md @@ -0,0 +1,65 @@ +# Python Nodes and classes + +(DOCUMENTATION TO BE WRITTEN) + +## Mandatory classes + +FIFO + +GenericNode + +GenericNode12 + +GenericNode13 + +GenericNode21 + +GenericSource + +GenericSink + +OverlapAdd + +SlidingBuffer + +## Optional nodes + +CFFT + +CIFFT + +InterleavedStereoToMono + +MFCC + +NullSink + +ToComplex + +ToReal + +Unzip + +Zip + +Duplicate + +Duplicate2 + +Duplicate3 + +### Host + +FileSink + +FileSource + +WavSource + +WavSink + +NumpySink + +VHTSource + +VHTSink \ No newline at end of file diff --git a/ComputeGraph/documentation/SchedOptions.md b/ComputeGraph/documentation/SchedOptions.md new file mode 100644 index 00000000..6a22967a --- /dev/null +++ b/ComputeGraph/documentation/SchedOptions.md @@ -0,0 +1,49 @@ +# Schedule computation + +## API + +```python +def computeSchedule(self,config=Configuration()): +``` + +This is a method on the `Graph` object. It can take an optional `Configuration` object. + +It returns a `Schedule` object. This object contains: + +* A description of the static schedule +* The computed size of the FIFOs +* The FIFOs +* The buffers for the FIFOs (with sharing when possible if memory optimizations were enabled) +* A rewritten graph with `Duplicate` nodes inserted + +## Options for the scheduling + +Those options needs to be used on a configuration objects passed as argument of the scheduling function. For instance: + +```python +conf = Configuration() +conf.debugLimit = 10 +sched = g.computeSchedule(config = conf) +``` + +Note that the configuration object also contain options for the code generators. They are described in different part of the documentation. 
+ +### memoryOptimization (default = False) + +When the amount of data written to a FIFO and read from the FIFO is the same, the FIFO is just an array. In this case, depending on the scheduling, the memory used by different arrays may be reused if those arrays are not needed at the same time. + +This option is enabling an analysis to optimize the memory usage by merging some buffers when it is possible. + +### sinkPriority (default = True) + +Try to prioritize the scheduling of the sinks to minimize the latency between sources and sinks. + +When this option is enabled, the tool may not be able to find a schedule in all cases. If it can't find a schedule, it will raise a `DeadLock` exception. + +### displayFIFOSizes (default = False) + +During computation of the schedule, the evolution of the FIFO sizes is generated on `stdout`. + +### dumpSchedule (default = False) + +During computation of the schedule, the human readable schedule is generated on `stdout`. \ No newline at end of file diff --git a/ComputeGraph/documentation/buffer.png b/ComputeGraph/documentation/buffer.png new file mode 100644 index 00000000..5760cbf3 Binary files /dev/null and b/ComputeGraph/documentation/buffer.png differ diff --git a/ComputeGraph/documentation/example1.md b/ComputeGraph/documentation/example1.md deleted file mode 100644 index f677b17e..00000000 --- a/ComputeGraph/documentation/example1.md +++ /dev/null @@ -1,483 +0,0 @@ -# Example 1 - -In this example we will see how to describe the following graph: - -graph1 - -The framework is coming with some default blocks. But for this example, we will create new blocks. The blocks that you to create need must be described with a simple Python class and a corresponding simple C++ class. - -## The steps - -It looks complex because there is a lot of information but the process is always the same: - -1. You define new kind of nodes in the Python. They define the IOs, sample types and amount of data read/written on each IO -2. 
You create instance of those new kind of Nodes -3. You connect them in a graph and generate a schedule -4. In your AppNodes.h file , you implement the new kind of nodes with C++ templates: - 1. The class is generally not doing a lot : defining the IOs and the function to call when run -5. If you need more control on the initialization, it is possible to pass additional arguments to the node constructors and to the scheduler function. - -## Python code - -Let's analyze the file `graph.py` in the `example1` folder. This file is describing the graph and the node and is calling the Python functions to generate the dot and C++ files. - - - -First, we add some path so that the example can find the CG static packages when run from example1 folder. - -```python -from cmsisdsp.cg.scheduler import * -``` - - - -Then, we describe the new kind of blocks that we need : Source, ProcessingNode and Sink. - -```python -class Sink(GenericSink): - def __init__(self,name,theType,inLength): - GenericSink.__init__(self,name) - self.addInput("i",theType,inLength) - - @property - def typeName(self): - return "Sink" -``` - -When creating a new kind of node (here a sink) we always need to do 2 things: - -- Add a type in typeName. It will be used to create objects in C++ or Python. So it must be a valid C++ or Python class name ; -- Add inputs and outputs. The convention is that an input is named "i" and output "o". When there are several inputs they are named "ia", "ib" etc ... -- For a sink you can only add an input. So the function addOutput is not available. -- The constructor is taking a length and a type. It is used to create the io -- When there are several inputs or outputs, they are ordered using alphabetical order. -It is important to know what is the ID of the corresponding IO in the C code. 
- -The definition of a new kind of Source is very similar: - -```python -class Source(GenericSource): - def __init__(self,name,theType,inLength): - GenericSource.__init__(self,name) - self.addOutput("o",theType,inLength) - - @property - def typeName(self): - return "Source" -``` - - - -Then for the processing node, we could define it directly. But, often there will be several Nodes in a graph, so it is useful to create a new Node blocks and inherit from it. - -```python -class Node(GenericNode): - def __init__(self,name,theType,inLength,outLength): - GenericNode.__init__(self,name) - self.addInput("i",theType,inLength) - self.addOutput("o",theType,outLength) -``` - - - -Note that this new kind of block has no type. It just has an input and an output. - -Now we can define the Processing node: - -```python -class ProcessingNode(Node): - @property - def typeName(self): - return "ProcessingNode" -``` - -We just define its type. - -Once it is done, we can start creating instance of those nodes. We will also need to define the type for the samples (float32 in this example). The functions and constants are defined in `cg.types`. - -```python -floatType=CType(F32) -``` - -It is also possible to use a custom datatype, the `example8` is giving an example: - -```python -complexType=CStructType("complex","MyComplex",8) -``` - -This is defining a new datatype that is mapped to the type `complex` in C/C++ and the class `MyComplex` in Python. The last argument is the size in bytes of the struct in C. - -The type complex may be defined with: - -```c -typedef struct { - float re; - float im; -} complex; -``` - -**Note that:** - -- The value **must have** value semantic in C/C++. So avoid classes -- In Python, the classes have reference semantic which implies some constraints: - - You should never modify an object from the read buffer - - You should change the field of an object in the write buffer - - If you need a new object : copy or create a new object. 
Never use an object from the read buffer as it is if you intend to customize it - -Once a datatype has been defined and chosen, we can define the nodes for the graph: - -```python -src=Source("source",floatType,5) -b=ProcessingNode("filter",floatType,7,5) -sink=Sink("sink",floatType,5) -``` - -For each node, we define : - -- The name (name of variable in C++ or Python generated code) -- The type for the inputs and outputs -- The numbers of samples consumed / produced on the io -- Inputs are listed first for the number of samples - -For `ProcessingNode` we are adding additional arguments to show how it is possible to add other arguments for initializing a node in the generated code: - -```python -b.addLiteralArg(4) -b.addLiteralArg("Test") -b.addVariableArg("someVariable") -``` - - - -The C++ for object of type `ProcessingNode` are taking 3 arguments in addition to the io. For those, arguments we are passing an int, a string and a variable name. - -Now that the nodes have been created, we can create the graph and connect the nodes: - -```python -g = Graph() - -g.connect(src.o,b.i) -g.connect(b.o,sink.i) -``` - -Then, before we generate a schedule, we can define some configuration: - -```python -conf=Configuration() -conf.debugLimit=1 -``` - -Since it is streamed based processing, the schedule should run forever. For testing, we can limit the number of iterations. Here the generated code will run just one iteration of the schedule. - -This configuration object can be used as argument of the scheduling function (named parameter config) and must be used as argument of the code generating functions. - -There are other fields for the configuration: - -- `dumpFIFO` : Will dump the output FIFOs content after each execution of the node (the code generator is inserting calls to the FIFO dump function) -- `displayFIFOSizes` : During the computation of the schedule, the Python script is displaying the evolution of the FIFO lengths. 
-- `schedName` : The name of the scheduler function (`scheduler` by default) -- `cOptionalArgs` and pyOptionalArgs for passing additional arguments to the scheduling function -- `prefix` to prefix the same of the global buffers -- `memoryOptimization` : Experimental. It is attempting to reuse buffer memory and share it between several FIFOs -- `codeArray` : Experimental. When a schedule is very long, representing it as a sequence of function calls is not good for the code size of the generated solution. When this option is enabled, the schedule is described with an array. It implies that the pure function calls cannot be inlined any more and are replaced by new nodes which are automatically generated. -- `eventRecorder` : Enable the support for the CMSIS Event Recorder. - -In the example 1, we are passing a variable to initialize the node of type ProcessingNode. So, it would be great if this variable was an argument of the scheduler function. So we define: - -```python -conf.cOptionalArgs="int someVariable" -``` - -This will be added after the error argument of the scheduling function. - -Once we have a configuration object, we can start to compute the schedule and generate the code: - -```python -sched = g.computeSchedule() -print("Schedule length = %d" % sched.scheduleLength) -print("Memory usage %d bytes" % sched.memory) -``` - -A schedule is computed. We also display: - -- The length of the schedule -- The total amount of memory used by all the FIFOs - -We could also have used: - -```python -sched = g.computeSchedule(config=conf) -``` - -to use the configuration object if we needed to dump the FIFOs lengths. 
- -Now, that we have a schedule, we can generate the graphviz and the C++ code: - -```python -with open("test.dot","w") as f: - sched.graphviz(f) - -sched.ccode("generated",conf) -``` - -The C++ code will be generated in the `example1` folder `generated` : sched.cpp - -## The C++ code - -The C++ code generated in`scheduler.cpp` and `scheduler.h` in `generated` folder is relying on some additional files which must be provided by the developer: - -- custom.h : to define some custom initialization or `#define` used by the code -- AppNodes.h to define the new C++ blocks - -Let's look at custom.h first: - -### custom.h - -```c++ -#ifndef _CUSTOM_H_ - - -#endif _CUSTOM_H_ -``` - -It is empty in `example1`. This file can be used to include or define some variables and constants used by the network. - -### AppNodes.h - -All the new nodes defined in the Python script must also be defined in the C++ code. They are very similar to the Python code but a bit more verbose. - -```c++ -template -class Sink: public GenericSink -{ -public: - Sink(FIFOBase &src):GenericSink(src){}; - - int prepareForRunning() override - { - if (this->willUnderflow()) - { - return(CG_SKIP_EXECUTION_ID_CODE); // Skip execution - } - - return(0); - }; - - int run() override - { - IN *b=this->getReadBuffer(); - printf("Sink\n"); - for(int i=0;i -class Source: GenericSource -{ -public: - Source(FIFOBase &dst):GenericSource(dst),mCounter(0){}; - - int prepareForRunning() override - { - if (this->willOverflow()) - { - return(CG_SKIP_EXECUTION_ID_CODE); // Skip execution - } - - return(0); - }; - - int run() override - { - OUT *b=this->getWriteBuffer(); - - printf("Source\n"); - for(int i=0;i -class ProcessingNode: public GenericNode -{ -public: - ProcessingNode(FIFOBase &src,FIFOBase &dst,int,const char*,int):GenericNode(src,dst){}; - - int prepareForRunning() override - { - if (this->willOverflow() || - this->willUnderflow() - ) - { - return(CG_SKIP_EXECUTION_ID_CODE); // Skip execution - } - - return(0); 
- }; - - int run() override - { - printf("ProcessingNode\n"); - IN *a=this->getReadBuffer(); - OUT *b=this->getWriteBuffer(); - b[0] =(OUT)a[3]; - return(0); - }; - -}; -``` - -The processing node is (very arbitrary) copying the value at index 3 to index 0 of the output. - -The processing node is taking 3 arguments after the FIFOs in the constructor because the Python script is defining 3 additional arguments for this node : `int`, `string` and another `int` but passed trough a variable in the scheduler. - -### scheduler.cpp - -The generated code is first including the needed headers: - -```C++ -#include "arm_math.h" -#include "custom.h" -#include "GenericNodes.h" -#include "AppNodes.h" -#include "scheduler.h" -``` - -- CMSIS-DSP header -- Custom definitions -- Generic nodes from `GenericNodes.h` -- Application nodes -- scheduler API - -Then, the generated code is defining the buffers for the FIFOs: - -```C++ -/*********** - -FIFO buffers - -************/ - -#define FIFOSIZE0 11 -float32_t buf0[FIFOSIZE0]={0}; - -#define FIFOSIZE1 5 -float32_t buf1[FIFOSIZE1]={0}; -``` - -Then, the scheduling function is generated: - -```C++ -uint32_t scheduler(int *error,int someVariable) { -``` - -A value `<0` in `error` means there was an error during the execution. - -The returned valued is the number of schedules fully executed when the error occurred. - -The `someVariable` is defined in the Python script. The Python script can add as many arguments as needed with whatever type is needed. 
- -The scheduling function is starting with a definition of some variables used for debug and statistics: - -```C++ -int cgStaticError=0; -uint32_t nbSchedule=0; -int32_t debugCounter=1; -``` - -Then, it is followed with a definition of the FIFOs: - -```C++ -/* -Create FIFOs objects -*/ -FIFO fifo0(buf0); -FIFO fifo1(buf1); -``` - -Then, the nodes are created and connected to the FIFOs: - -```C++ -/* -Create node objects -*/ -ProcessingNode filter(fifo0,fifo1,4,"Test",someVariable); -Sink sink(fifo1); -Source source(fifo0); -``` - -One can see that the processing nodes has 3 additional arguments in addition to the FIFOs. Those arguments are defined in the Python script. The third argument is `someVariable` and this variable must be in the scope. That's why the Python script is adding an argument `someVariable` to the scheduler API. So, one can pass information to nay node from the outside of the scheduler using those additional arguments. - -And finally, the function is entering the scheduling loop: - -```C++ - while((cgStaticError==0) && (debugCounter > 0)) - { - nbSchedule++; - - cgStaticError = source.run(); - CHECKERROR; -``` - -`CHECKERROR` is a macro defined in `Sched.h`. It is just testing if `cgStaticError< 0` and breaking out of the loop if it is the case. This can be redefined by the user. - -Since an application may want to use several SDF graphs, the name of the `sched` and `customInit` functions can be customized in the `configuration` object on the Python side: - -```python -config.schedName = "sched" -``` - -A prefix can also be added before the name of the global FIFO buffers: - -```python -config.prefix="bufferPrefix" -``` - -## Summary - -It looks complex because there is a lot of information but the process is always the same: - -1. You define new kind of nodes in the Python. They define the IOs, type and amount of data read/written on each IO -2. You create Python instance of those new kind of Nodes -3. 
You connect them in a graph and generate a schedule -4. In you AppNodes.h, you implement the new kind of nodes with a C++ template: - 1. The template is generally defining the IO and the function to call when run - 1. It should be minimal. The template is just a wrapper. Don't forget those nodes are created on the stack in the scheduler function. So they should not be too big. They should just be simple wrappers -5. If you need more control on the initialization, it is possible to pass additional arguments to the nodes constructors and to the scheduler function. diff --git a/ComputeGraph/documentation/example10.md b/ComputeGraph/documentation/example10.md deleted file mode 100644 index 015351d6..00000000 --- a/ComputeGraph/documentation/example10.md +++ /dev/null @@ -1,27 +0,0 @@ -# Example 10 - -This example is implementing a dynamic / asynchronous mode. - -It is enabled in `graph.py` with: - -`conf.asynchronous = True` - -The FIFO sizes are doubled with: - -`conf.FIFOIncrease = 100` - -The graph implemented in this example is: - -![graph10](graph10.png) - -There is a global iteration count corresponding to one execution of the schedule. - -The odd source is generating a value only when the count is odd. - -The even source is generating a value only when the count is even. - -The processing is adding its inputs. If no data is available on an input, 0 is used. - -In case of fifo overflow or underflow, any node will slip its execution. - -All nodes are generating or consuming one sample but the FIFOs have a size of 2 because of the 100% increase requested in the configuration settings. \ No newline at end of file diff --git a/ComputeGraph/documentation/example3.md b/ComputeGraph/documentation/example3.md deleted file mode 100644 index 84293c69..00000000 --- a/ComputeGraph/documentation/example3.md +++ /dev/null @@ -1,127 +0,0 @@ -# Example 3 - -This example is implementing a working example with FFT. 
The graph is: - -![graph3](graph3.PNG) - -The example is: - -- Providing a file source which is reading a source file and then padding with zero -- A sliding window -- A multiplication with a Hann window -- A conversion to/from complex -- Use of CMSIS-DSP FFT/IFFT -- Overlap and add -- File sink writing the result into a file - -The new feature s compared to previous examples are: - -- The constant array HANN -- The CMSIS-DSP FFT - -## Constant array - -It is like in example 2 where the constant was a float. - -Now, the constant is an array: - -```python -hann=Constant("HANN") -``` - - - -In custom.h, this array is defined as: - -```C++ -extern const float32_t HANN[256]; -``` - - - -## CMSIS-DSP FFT - -The FFT node cannot be created using a `Dsp` node in Python because FFT is requiring specific initializations. So, a Python class and C++ class must be created : - - - -```python -class CFFT(GenericNode): - def __init__(self,name,inLength): - GenericNode.__init__(self,name) - - self.addInput("i",floatType,2*inLength) - self.addOutput("o",floatType,2*inLength) - - @property - def typeName(self): - return "CFFT" -``` - -Look at the definition of the inputs and outputs : The FFT is using complex number so the ports have twice the number of float samples. The argument of the constructor is the FFT length in complex sample. - -We suggest to use as arguments of the blocks a number of samples which is meaningful for the blocks and use the lengths in standard data type (f32, q31 ...) when defining the IO. - -So here, the number of complex samples is used as arguments. But the IO are using the number of floats required to encode those complex numbers. 
- -The corresponding C++ class is: - -```C++ -template -class CFFT: public GenericNode -{ -public: - CFFT(FIFOBase &src,FIFOBase &dst): - GenericNode(src,dst){ - arm_status status; - status=arm_cfft_init_f32(&sfft,inputSize>>1); - }; - - int prepareForRunning() override - { - if (this->willOverflow() || - this->willUnderflow() - ) - { - return(CG_SKIP_EXECUTION_ID_CODE); // Skip execution - } - - return(0); - }; - - int run() override { - IN *a=this->getReadBuffer(); - OUT *b=this->getWriteBuffer(); - memcpy((void*)b,(void*)a,outputSize*sizeof(IN)); - arm_cfft_f32(&sfft,b,0,1); - return(0); - }; - - arm_cfft_instance_f32 sfft; - -}; -``` - -It is verbose but not difficult. The constructor is initializing the CMSIS-DSP FFT instance and connecting to the FIFO (through GenericNode). - - - -The run function is applying the `arm_cfft_f32`. Since this function is modifying the input buffer, there is a `memcpy`. It is not really needed here. The read buffer can be modified by the CFFT. It will just make it more difficult to debug if you'd like to inspect the content of the FIFOs. - - - -This node is provided in `cg/nodes/cpp` so no need to define it. You can just use it by including the right headers. - -It can be used by just doing in your `AppNodes.h` file : - -```c++ -#include "CFFT.h" -``` - -From Python side it would be: - -```python -from cmsisdsp.cg.scheduler import * -``` - -The scheduler module is automatically including the default nodes. 
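The port-length convention described for the `CFFT` node (FFT length given in complex samples, IO lengths given in floats) can be sketched in plain Python. This is only an illustration under stated assumptions: the helper names `complex_to_interleaved`, `port_length_in_floats` and `fft_length_from_port` are hypothetical and are not part of the `cmsisdsp` package. The interleaved `[re, im, re, im, ...]` layout is the one expected by `arm_cfft_f32`.

```python
def complex_to_interleaved(samples):
    """Flatten complex samples into [re0, im0, re1, im1, ...]."""
    flat = []
    for z in samples:
        flat.append(float(z.real))
        flat.append(float(z.imag))
    return flat


def port_length_in_floats(fft_length):
    """Float samples a port must carry for fft_length complex samples."""
    return 2 * fft_length


def fft_length_from_port(port_length):
    """Inverse mapping, mirroring the `inputSize>>1` of the C++ constructor."""
    return port_length >> 1


if __name__ == "__main__":
    block = [1 + 2j, 3 - 4j]
    flat = complex_to_interleaved(block)
    print(flat)  # [1.0, 2.0, 3.0, -4.0]
    print(port_length_in_floats(len(block)), fft_length_from_port(len(flat)))  # 4 2
```

With this convention, a 256-point complex FFT node declares 512 float samples on each port, and the C++ side recovers 256 by halving the template parameter.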
diff --git a/ComputeGraph/documentation/fifos.png b/ComputeGraph/documentation/fifos.png new file mode 100644 index 00000000..aa6fb3d5 Binary files /dev/null and b/ComputeGraph/documentation/fifos.png differ diff --git a/ComputeGraph/documentation/graph1.PNG b/ComputeGraph/documentation/graph1.PNG deleted file mode 100644 index ce1e3782..00000000 Binary files a/ComputeGraph/documentation/graph1.PNG and /dev/null differ diff --git a/ComputeGraph/documentation/inter.png b/ComputeGraph/documentation/inter.png new file mode 100644 index 00000000..57195ed7 Binary files /dev/null and b/ComputeGraph/documentation/inter.png differ diff --git a/ComputeGraph/documentation/memory.png b/ComputeGraph/documentation/memory.png new file mode 100644 index 00000000..784122d3 Binary files /dev/null and b/ComputeGraph/documentation/memory.png differ diff --git a/ComputeGraph/documentation/shared_complex.png b/ComputeGraph/documentation/shared_complex.png new file mode 100644 index 00000000..f196a11d Binary files /dev/null and b/ComputeGraph/documentation/shared_complex.png differ diff --git a/ComputeGraph/documentation/shared_complex_buffer.png b/ComputeGraph/documentation/shared_complex_buffer.png new file mode 100644 index 00000000..97b48e69 Binary files /dev/null and b/ComputeGraph/documentation/shared_complex_buffer.png differ diff --git a/ComputeGraph/examples/CMakeLists.txt b/ComputeGraph/examples/CMakeLists.txt index bf876a19..3d43aa5e 100644 --- a/ComputeGraph/examples/CMakeLists.txt +++ b/ComputeGraph/examples/CMakeLists.txt @@ -5,22 +5,22 @@ set(Python_FIND_REGISTRY "LAST") find_package (Python COMPONENTS Interpreter) -function(sdf TARGET) +function(sdf TARGET SCRIPT DOTNAME) if (DOT) add_custom_command(TARGET ${TARGET} PRE_BUILD - BYPRODUCTS ${CMAKE_CURRENT_SOURCE_DIR}/test.pdf - COMMAND ${DOT} -Tpdf -o ${CMAKE_CURRENT_SOURCE_DIR}/test.pdf ${CMAKE_CURRENT_SOURCE_DIR}/test.dot + BYPRODUCTS ${CMAKE_CURRENT_SOURCE_DIR}/${DOTNAME}.pdf + COMMAND ${DOT} -Tpdf -o 
${CMAKE_CURRENT_SOURCE_DIR}/${DOTNAME}.pdf ${CMAKE_CURRENT_SOURCE_DIR}/${DOTNAME}.dot WORKING_DIRECTORY ${CMAKE_CURRENT_SOURCE_DIR} - DEPENDS ${CMAKE_CURRENT_SOURCE_DIR}/test.dot + DEPENDS ${CMAKE_CURRENT_SOURCE_DIR}/${DOTNAME}.dot VERBATIM ) endif() add_custom_command(OUTPUT ${CMAKE_CURRENT_SOURCE_DIR}/generated/scheduler.cpp - ${CMAKE_CURRENT_SOURCE_DIR}/test.dot - COMMAND ${Python_EXECUTABLE} ${CMAKE_CURRENT_SOURCE_DIR}/graph.py + ${CMAKE_CURRENT_SOURCE_DIR}/${DOTNAME}.dot + COMMAND ${Python_EXECUTABLE} ${CMAKE_CURRENT_SOURCE_DIR}/${SCRIPT} WORKING_DIRECTORY ${CMAKE_CURRENT_SOURCE_DIR} - DEPENDS ${CMAKE_CURRENT_SOURCE_DIR}/graph.py + DEPENDS ${CMAKE_CURRENT_SOURCE_DIR}/${SCRIPT} VERBATIM ) target_sources(${TARGET} PRIVATE ${CMAKE_CURRENT_SOURCE_DIR}/generated/scheduler.cpp) @@ -73,6 +73,9 @@ add_subdirectory(example6 bin_example6) add_subdirectory(example8 bin_example8) add_subdirectory(example9 bin_example9) add_subdirectory(example10 bin_example10) +add_subdirectory(simple bin_simple) +add_subdirectory(simpledsp bin_simpledsp) +add_subdirectory(cyclo bin_cyclo) # Python examples add_subdirectory(example4 bin_example4) diff --git a/ComputeGraph/examples/README.md b/ComputeGraph/examples/README.md new file mode 100644 index 00000000..064beae3 --- /dev/null +++ b/ComputeGraph/examples/README.md @@ -0,0 +1,64 @@ +## How to build the examples + +First, you must install the `CMSIS-DSP` PythonWrapper: + +``` +pip install cmsisdsp +``` + +The functions and classes inside the cmsisdsp wrapper can be used to describe and generate the schedule. + +You need a recent Graphviz dot tool supporting the HTML-like labels. You'll need `cmake` and `make` + +In folder `ComputeGraph/example/build`, type the `cmake` command: + +```bash +cmake -DHOST=YES \ + -DDOT="path to dot.EXE" \ + -DCMSISCORE="path to cmsis core include directory" \ + -G "Unix Makefiles" .. +``` + +The core include directory is something like `CMSIS_5/Core` ... 
+ +If cmake is successful, you can type `make` to build the examples. It will also build CMSIS-DSP for the host. + +If you don't have graphviz, the option -DDOT can be removed. + +If for some reason it does not work, you can go into an example folder (for instance example1), and type the commands: + +```bash +python graph.py +dot -Tpdf -o test.pdf test.dot +``` + +It will generate the C++ files for the schedule and a pdf representation of the graph. + +Note that the Python code is relying on the CMSIS-DSP PythonWrapper which is now also containing the Python scripts for the Synchronous Data Flow. + +For `example3` which is using an input file, `cmake` should have copied the input test pattern `input_example3.txt` inside the build folder. The output file will also be generated in the build folder. + +`example4` is like `example3` but in pure Python and using the CMSIS-DSP Python wrapper (which must already be installed before trying the example). To run a Python example, you need to go into an example folder and type: + +```bash +python main.py +``` + +`example7` is communicating with `OpenModelica`. You need to install the VHTModelica blocks from the [AVH-SystemModeling](https://github.com/ARM-software/VHT-SystemModeling) project on our GitHub + +# List of examples + +* [Simple example without CMSIS-DSP](simple/README.md) : **How to get started** +* [Simple example with CMSIS-DSP](simpledsp/README.md) : **How to get started with CMSIS-DSP** +* [Example 1](example1/README.md) : Same as the simple example but explaining how to add arguments to the scheduler API and node constructors. This example is also giving a **detailed explanation of the C++ code** generated for the scheduler +* [Example 2](example2/README.md) : Explain how to use CMSIS-DSP pure functions (no state) and add delay on the arcs of the graph. Explain some configuration options for the schedule generation. 
+* [Example 3](example3/README.md) : A full signal processing example with CMSIS-DSP using FFT and sliding windows and overlap and add node +* [Example 4](example4/README.md) : Same as example 3 but where we generate a Python implementation rather than a C++ implementation. The resulting graph can be executed thanks to the CMSIS-DSP Python wrapper +* [Example 5](example5/README.md) : Another pure Python example showing how to compute a sequence of Q15 MFCC and generate an animation (using also the CMSIS-DSP Python wrapper) +* [Example 6](example6/README.md) : Same as example 5 but with C++ code generation +* [Example 7](example7/README.md) : Pure Python example demonstrating a communication between the compute graph and OpenModelica to generate a Larsen effect +* [Example 8](example8/README.md) : Introduce structured datatype for the samples and implicit `Duplicate` nodes for the graph +* [Example 9](example9/README.md) : Check that duplicate nodes and arc delays are working together and a scheduling is generated +* [Example 10 : The dynamic dataflow mode](example10/README.md) +* [Cyclo-static scheduling](cyclo/README.md) + diff --git a/ComputeGraph/examples/References.md b/ComputeGraph/examples/References.md deleted file mode 100644 index 498b8dca..00000000 --- a/ComputeGraph/examples/References.md +++ /dev/null @@ -1,36 +0,0 @@ -# Reference statistics - -The different examples should return following schedule statistics: - - -## Example 1 - Schedule length = 17 - Memory usage 64 bytes - -## Example 2 - Schedule length = 302 - Memory usage 10720 bytes - -## Example 3 - Schedule length = 25 - Memory usage 11264 bytes - -## Example 4 - Schedule length = 25 - Memory usage 11264 bytes - -## Example 5 - Schedule length = 292 - Memory usage 6614 bytes - -## Example 6 - Schedule length = 17 - Memory usage 2204 bytes - -## Example 7 - Schedule length = 3 - Memory usage 512 bytes - -## Example 8 - Schedule length = 37 - Memory usage 288 bytes diff --git 
a/ComputeGraph/examples/cyclo/AppNodes.h b/ComputeGraph/examples/cyclo/AppNodes.h new file mode 100644 index 00000000..e8ab0ed8 --- /dev/null +++ b/ComputeGraph/examples/cyclo/AppNodes.h @@ -0,0 +1,155 @@ +/* ---------------------------------------------------------------------- + * Project: CMSIS DSP Library + * Title: AppNodes.h + * Description: Application nodes for Example cyclo + * + * Target Processor: Cortex-M and Cortex-A cores + * -------------------------------------------------------------------- */ +/* + * Copyright (C) 2021-2023 ARM Limited or its affiliates. All rights reserved. + * + * SPDX-License-Identifier: Apache-2.0 + * + * Licensed under the Apache License, Version 2.0 (the License); you may + * not use this file except in compliance with the License. + * You may obtain a copy of the License at + * + * www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an AS IS BASIS, WITHOUT + * WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ */ +#ifndef _APPNODES_H_ +#define _APPNODES_H_ + +#include + +template +class Sink: public GenericSink +{ +public: + Sink(FIFOBase &src):GenericSink(src){}; + + int prepareForRunning() final + { + if (this->willUnderflow()) + { + return(CG_SKIP_EXECUTION_ID_CODE); // Skip execution + } + + return(0); + }; + + int run() final + { + IN *b=this->getReadBuffer(); + printf("Sink\n"); + for(int i=0;i +class Source: public GenericSource +{ +public: + Source(FIFOBase &dst):GenericSource(dst), + mPeriod(0),mValuePeriodStart(0){}; + + int getSamplesForPeriod() const + { + if (mPeriod == 0) + { + return(3); + } + return(2); + } + + void updatePeriod(){ + mPeriod++; + mValuePeriodStart = 3; + if (mPeriod == 2) + { + mPeriod = 0; + mValuePeriodStart = 0; + } + } + + int prepareForRunning() final + { + /* Cyclo static scheduling do not make sense in + asynchronous mode so the default outputSize is used. + This function is never used in cyclo-static scheduling + */ + if (this->willOverflow()) + { + return(CG_SKIP_EXECUTION_ID_CODE); // Skip execution + } + + return(0); + }; + + int run() final{ + OUT *b=this->getWriteBuffer(getSamplesForPeriod()); + + printf("Source\n"); + for(int i=0;i +class ProcessingNode; + + +template +class ProcessingNode: + public GenericNode +{ +public: + ProcessingNode(FIFOBase &src, + FIFOBase &dst):GenericNode(src,dst){}; + + int prepareForRunning() final + { + if (this->willOverflow() || + this->willUnderflow()) + { + return(CG_SKIP_EXECUTION_ID_CODE); // Skip execution + } + + return(0); + }; + + int run() final{ + printf("ProcessingNode\n"); + IN *a=this->getReadBuffer(); + IN *b=this->getWriteBuffer(); + for(int i=0;igetWriteBuffer(getSamplesForPeriod()); + +printf("Source\n"); +for(int i=0;i +class Source: public GenericSource +``` + +`outputSize` cannot be the list `[3,2]`. 
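The `[3,2]` pattern means the source produces 5 samples every 2 executions. A rough way to see what this does to the schedule is to solve the balance equations by hand, as in the sketch below. It assumes the simple chain source → processing → sink, with the rates 7 (processing input), 7 (processing output) and 5 (sink input) read off the `cyclo.dot` arc labels in this diff, and it assumes the scheduler emits one minimal iteration. The functions `chain_repetitions` and `schedule_length` are hypothetical helpers written for this illustration, not `cmsisdsp` APIs.

```python
from fractions import Fraction
from math import lcm  # available in Python 3.9 and later


def chain_repetitions(edges):
    """Smallest integer firing counts for a chain of nodes.

    edges[i] = (produced, consumed) on the arc between node i and node i+1.
    """
    counts = [Fraction(1)]
    for produced, consumed in edges:
        # Balance equation: counts[i] * produced == counts[i+1] * consumed
        counts.append(counts[-1] * produced / consumed)
    # Scale the rational solution to the smallest all-integer one
    scale = lcm(*(c.denominator for c in counts))
    return [int(c * scale) for c in counts]


def schedule_length(source_pattern, proc_in, proc_out, sink_in):
    """Node executions in one iteration of source -> processing -> sink."""
    # One source "period" is len(pattern) executions producing sum(pattern) samples
    periods, proc, sink = chain_repetitions(
        [(sum(source_pattern), proc_in), (proc_out, sink_in)]
    )
    return periods * len(source_pattern) + proc + sink


if __name__ == "__main__":
    print(schedule_length([3, 2], 7, 7, 5))  # cyclo-static source: 26
    print(schedule_length([5], 7, 7, 5))     # constant source of 5 samples: 19
```

The two results agree with the `Schedule length = 26` reported for this example and the `19` quoted for the simple example: the cyclo-static source needs 7 full periods, hence 14 executions instead of 7.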
+ +The generated code is using the max of the values, so here `3`: + +```C++ +Source source(fifo0); +``` + +## Expected output: + +``` +Schedule length = 26 +Memory usage 88 bytes +``` + +The schedule length is `26` compared to `19` for the simple example where source is generating samples by packet of 5. The source node executions must be a multiple of 2 in this graph because the period of sample generation has length 2. In the original graph, the number of executions could be an odd number. That's why there are more executions in this cyclo-static scheduling. + +The memory usage (FIFO) is the same as the one for the simple example without cyclo-static scheduling. + +The expected output of the execution is still 1,2,3,4,5,1,2,3,4,5 ... but the scheduling is different. There are more source executions. + +``` +Start +Source +Source +Source +ProcessingNode +Sink +1 +2 +3 +4 +5 +Source +Source +Source +ProcessingNode +Sink +1 +2 +3 +4 +5 +Source +Source +Source +ProcessingNode +Sink +1 +2 +3 +4 +5 +Sink +1 +2 +3 +4 +5 +Source +Source +ProcessingNode +Sink +1 +2 +3 +4 +5 +Source +Source +Source +ProcessingNode +Sink +1 +2 +3 +4 +5 +Sink +1 +2 +3 +4 +5 +``` + diff --git a/ComputeGraph/examples/cyclo/create.py b/ComputeGraph/examples/cyclo/create.py new file mode 100644 index 00000000..edbf5676 --- /dev/null +++ b/ComputeGraph/examples/cyclo/create.py @@ -0,0 +1,31 @@ +# Include definition of the nodes +from nodes import * +# Include definition of the graph +from graph import * + +# Create a configuration object +conf=Configuration() +# The number of schedule iteration is limited to 1 +# to prevent the scheduling from running forever +# (which should be the case for a stream computation) +conf.debugLimit=1 +# Disable inclusion of CMSIS-DSP headers so that we don't have +# to recompile CMSIS-DSP for such a simple example +conf.CMSISDSP = False + +# Compute a static scheduling of the graph +# The size of FIFO is also computed +scheduling = 
the_graph.computeSchedule(config=conf) + +# Print some statistics about the compute schedule +# and the memory usage +print("Schedule length = %d" % scheduling.scheduleLength) +print("Memory usage %d bytes" % scheduling.memory) + +# Generate the C++ code for the static scheduler +scheduling.ccode("generated",conf) + +# Generate a graphviz representation of the graph +with open("cyclo.dot","w") as f: + scheduling.graphviz(f) + diff --git a/ComputeGraph/examples/cyclo/custom.h b/ComputeGraph/examples/cyclo/custom.h new file mode 100644 index 00000000..41b6c6ee --- /dev/null +++ b/ComputeGraph/examples/cyclo/custom.h @@ -0,0 +1,5 @@ +#ifndef _CUSTOM_H_ + +typedef float float32_t; + +#endif \ No newline at end of file diff --git a/ComputeGraph/examples/cyclo/cyclo.dot b/ComputeGraph/examples/cyclo/cyclo.dot new file mode 100644 index 00000000..cefab4f4 --- /dev/null +++ b/ComputeGraph/examples/cyclo/cyclo.dot @@ -0,0 +1,48 @@ + + + + +digraph structs { + node [shape=plaintext] + rankdir=LR + edge [arrowsize=0.5] + fontname="times" + + +processing [label=< + + + + +
processing
(ProcessingNode)
>]; + +sink [label=< + + + + +
sink
(Sink)
>]; + +source [label=< + + + + +
source
(Source)
>]; + + + +source:i -> processing:i [label="f32(11)" +,headlabel=<
7 +
> +,taillabel=<
[3, 2] +
>] + +processing:i -> sink:i [label="f32(11)" +,headlabel=<
5 +
> +,taillabel=<
7 +
>] + + +} diff --git a/ComputeGraph/examples/cyclo/cyclo.exe b/ComputeGraph/examples/cyclo/cyclo.exe new file mode 100644 index 00000000..cfa0d79a Binary files /dev/null and b/ComputeGraph/examples/cyclo/cyclo.exe differ diff --git a/ComputeGraph/examples/cyclo/cyclo.ilk b/ComputeGraph/examples/cyclo/cyclo.ilk new file mode 100644 index 00000000..624ef808 Binary files /dev/null and b/ComputeGraph/examples/cyclo/cyclo.ilk differ diff --git a/ComputeGraph/examples/cyclo/cyclo.pdb b/ComputeGraph/examples/cyclo/cyclo.pdb new file mode 100644 index 00000000..5d4eb75d Binary files /dev/null and b/ComputeGraph/examples/cyclo/cyclo.pdb differ diff --git a/ComputeGraph/examples/cyclo/cyclo.pdf b/ComputeGraph/examples/cyclo/cyclo.pdf new file mode 100644 index 00000000..e8a51280 Binary files /dev/null and b/ComputeGraph/examples/cyclo/cyclo.pdf differ diff --git a/ComputeGraph/examples/cyclo/docassets/cyclo.png b/ComputeGraph/examples/cyclo/docassets/cyclo.png new file mode 100644 index 00000000..97b1f740 Binary files /dev/null and b/ComputeGraph/examples/cyclo/docassets/cyclo.png differ diff --git a/ComputeGraph/examples/cyclo/generated/scheduler.cpp b/ComputeGraph/examples/cyclo/generated/scheduler.cpp new file mode 100644 index 00000000..5eb21ab0 --- /dev/null +++ b/ComputeGraph/examples/cyclo/generated/scheduler.cpp @@ -0,0 +1,170 @@ +/* + +Generated with CMSIS-DSP Compute Graph Scripts. +The generated code is not covered by CMSIS-DSP license. + +The support classes and code is covered by CMSIS-DSP license. 
+ +*/ + + +#include "custom.h" +#include "GenericNodes.h" +#include "AppNodes.h" +#include "scheduler.h" + +#if !defined(CHECKERROR) +#define CHECKERROR if (cgStaticError < 0) \ + {\ + goto errorHandling;\ + } + +#endif + +#if !defined(CG_BEFORE_ITERATION) +#define CG_BEFORE_ITERATION +#endif + +#if !defined(CG_AFTER_ITERATION) +#define CG_AFTER_ITERATION +#endif + +#if !defined(CG_BEFORE_SCHEDULE) +#define CG_BEFORE_SCHEDULE +#endif + +#if !defined(CG_AFTER_SCHEDULE) +#define CG_AFTER_SCHEDULE +#endif + +#if !defined(CG_BEFORE_BUFFER) +#define CG_BEFORE_BUFFER +#endif + +#if !defined(CG_BEFORE_FIFO_BUFFERS) +#define CG_BEFORE_FIFO_BUFFERS +#endif + +#if !defined(CG_BEFORE_FIFO_INIT) +#define CG_BEFORE_FIFO_INIT +#endif + +#if !defined(CG_BEFORE_NODE_INIT) +#define CG_BEFORE_NODE_INIT +#endif + +#if !defined(CG_AFTER_INCLUDES) +#define CG_AFTER_INCLUDES +#endif + +#if !defined(CG_BEFORE_SCHEDULER_FUNCTION) +#define CG_BEFORE_SCHEDULER_FUNCTION +#endif + +#if !defined(CG_BEFORE_NODE_EXECUTION) +#define CG_BEFORE_NODE_EXECUTION +#endif + +#if !defined(CG_AFTER_NODE_EXECUTION) +#define CG_AFTER_NODE_EXECUTION +#endif + +CG_AFTER_INCLUDES + + +/* + +Description of the scheduling. 
+ +*/ +static unsigned int schedule[26]= +{ +2,2,2,0,1,2,2,2,0,1,2,2,2,0,1,1,2,2,0,1,2,2,2,0,1,1, +}; + +CG_BEFORE_FIFO_BUFFERS +/*********** + +FIFO buffers + +************/ +#define FIFOSIZE0 11 +#define FIFOSIZE1 11 + +#define BUFFERSIZE1 11 +CG_BEFORE_BUFFER +float32_t buf1[BUFFERSIZE1]={0}; + +#define BUFFERSIZE2 11 +CG_BEFORE_BUFFER +float32_t buf2[BUFFERSIZE2]={0}; + + +CG_BEFORE_SCHEDULER_FUNCTION +uint32_t scheduler(int *error) +{ + int cgStaticError=0; + uint32_t nbSchedule=0; + int32_t debugCounter=1; + + CG_BEFORE_FIFO_INIT; + /* + Create FIFOs objects + */ + FIFO fifo0(buf1); + FIFO fifo1(buf2); + + CG_BEFORE_NODE_INIT; + /* + Create node objects + */ + ProcessingNode processing(fifo0,fifo1); + Sink sink(fifo1); + Source source(fifo0); + + /* Run several schedule iterations */ + CG_BEFORE_SCHEDULE; + while((cgStaticError==0) && (debugCounter > 0)) + { + /* Run a schedule iteration */ + CG_BEFORE_ITERATION; + for(unsigned long id=0 ; id < 26; id++) + { + CG_BEFORE_NODE_EXECUTION; + + switch(schedule[id]) + { + case 0: + { + cgStaticError = processing.run(); + } + break; + + case 1: + { + cgStaticError = sink.run(); + } + break; + + case 2: + { + cgStaticError = source.run(); + } + break; + + default: + break; + } + CG_AFTER_NODE_EXECUTION; + CHECKERROR; + } + debugCounter--; + CG_AFTER_ITERATION; + nbSchedule++; + } + +errorHandling: + CG_AFTER_SCHEDULE; + *error=cgStaticError; + return(nbSchedule); +} diff --git a/ComputeGraph/examples/cyclo/generated/scheduler.h b/ComputeGraph/examples/cyclo/generated/scheduler.h new file mode 100644 index 00000000..d98d9e63 --- /dev/null +++ b/ComputeGraph/examples/cyclo/generated/scheduler.h @@ -0,0 +1,26 @@ +/* + +Generated with CMSIS-DSP Compute Graph Scripts. +The generated code is not covered by CMSIS-DSP license. + +The support classes and code is covered by CMSIS-DSP license. 
+ +*/ + +#ifndef _SCHEDULER_H_ +#define _SCHEDULER_H_ + +#ifdef __cplusplus +extern "C" +{ +#endif + + +extern uint32_t scheduler(int *error); + +#ifdef __cplusplus +} +#endif + +#endif + diff --git a/ComputeGraph/examples/cyclo/graph.py b/ComputeGraph/examples/cyclo/graph.py new file mode 100644 index 00000000..0cd811e7 --- /dev/null +++ b/ComputeGraph/examples/cyclo/graph.py @@ -0,0 +1,39 @@ +# Include definitions from the Python package to +# define datatype for the IOs and to have access to the +# Graph class +from cmsisdsp.cg.scheduler import * +# Include definition of the nodes +from nodes import * + +# Define the datatype we are using for all the IOs in this +# example +floatType=CType(F32) + +# Instantiate a Source node with a float datatype and +# working with packet of 5 samples (each execution of the +# source in the C code will generate 5 samples) +# "source" is the name of the C variable that will identify +# this node +src=Source("source",floatType,[3,2]) +# Instantiate a Processing node using a float data type for +# both the input and output. 
The number of samples consumed +# on the input and produced on the output is 7 each time +# the node is executed in the C code +# "processing" is the name of the C variable that will identify +# this node +processing=ProcessingNode("processing",floatType,7,7) +# Instantiate a Sink node with a float datatype and consuming +# 5 samples each time the node is executed in the C code +# "sink" is the name of the C variable that will identify +# this node +sink=Sink("sink",floatType,5) + +# Create a Graph object +the_graph = Graph() + +# Connect the source to the processing node +the_graph.connect(src.o,processing.i) +# Connect the processing node to the sink +the_graph.connect(processing.o,sink.i) + + diff --git a/ComputeGraph/examples/cyclo/main.cpp b/ComputeGraph/examples/cyclo/main.cpp new file mode 100644 index 00000000..a1fd4028 --- /dev/null +++ b/ComputeGraph/examples/cyclo/main.cpp @@ -0,0 +1,11 @@ +#include +#include +#include "scheduler.h" + +int main(int argc, char const *argv[]) +{ + int error; + printf("Start\n"); + uint32_t nbSched=scheduler(&error); + return 0; +} \ No newline at end of file diff --git a/ComputeGraph/examples/cyclo/main.obj b/ComputeGraph/examples/cyclo/main.obj new file mode 100644 index 00000000..281b094c Binary files /dev/null and b/ComputeGraph/examples/cyclo/main.obj differ diff --git a/ComputeGraph/examples/cyclo/nodes.py b/ComputeGraph/examples/cyclo/nodes.py new file mode 100644 index 00000000..6f5a5987 --- /dev/null +++ b/ComputeGraph/examples/cyclo/nodes.py @@ -0,0 +1,77 @@ +# Include definitions from the Python package +from cmsisdsp.cg.scheduler import GenericNode,GenericSink,GenericSource + +### Define new types of Nodes + +class ProcessingNode(GenericNode): + """ + Definition of a ProcessingNode for the graph + + Parameters + ---------- + name : str + Name of the C variable identifying this node + in the C code + theType : CGStaticType + The datatype for the input and output + inLength : int + The number of samples consumed 
by input + outLength : int + The number of samples produced on output + """ + def __init__(self,name,theType,inLength,outLength): + GenericNode.__init__(self,name) + self.addInput("i",theType,inLength) + self.addOutput("o",theType,outLength) + + @property + def typeName(self): + """The name of the C++ class implementing this node""" + return "ProcessingNode" + +class Sink(GenericSink): + """ + Definition of a Sink node for the graph + + Parameters + ---------- + name : str + Name of the C variable identifying this node + in the C code + theType : CGStaticType + The datatype for the input + inLength : int + The number of samples consumed by input + """ + def __init__(self,name,theType,inLength): + GenericSink.__init__(self,name) + self.addInput("i",theType,inLength) + + @property + def typeName(self): + """The name of the C++ class implementing this node""" + return "Sink" + +class Source(GenericSource): + """ + Definition of a Source node for the graph + + Parameters + ---------- + name : str + Name of the C variable identifying this node + in the C code + theType : CGStaticType + The datatype for the output + outLength : int + The number of samples produced on output + """ + def __init__(self,name,theType,outLength): + GenericSource.__init__(self,name) + self.addOutput("o",theType,outLength) + + @property + def typeName(self): + """The name of the C++ class implementing this node""" + return "Source" + diff --git a/ComputeGraph/examples/cyclo/scheduler.obj b/ComputeGraph/examples/cyclo/scheduler.obj new file mode 100644 index 00000000..e492dcc1 Binary files /dev/null and b/ComputeGraph/examples/cyclo/scheduler.obj differ diff --git a/ComputeGraph/examples/cyclo/vc140.pdb b/ComputeGraph/examples/cyclo/vc140.pdb new file mode 100644 index 00000000..2d98e8d6 Binary files /dev/null and b/ComputeGraph/examples/cyclo/vc140.pdb differ diff --git a/ComputeGraph/examples/example1/AppNodes.h b/ComputeGraph/examples/example1/AppNodes.h index 653d5e82..0d1f1b95 100644 --- 
a/ComputeGraph/examples/example1/AppNodes.h +++ b/ComputeGraph/examples/example1/AppNodes.h @@ -3,13 +3,10 @@ * Title: AppNodes.h * Description: Application nodes for Example 1 * - * $Date: 29 July 2021 - * $Revision: V1.10.0 - * * Target Processor: Cortex-M and Cortex-A cores * -------------------------------------------------------------------- */ /* - * Copyright (C) 2010-2021 ARM Limited or its affiliates. All rights reserved. + * Copyright (C) 2021-2023 ARM Limited or its affiliates. All rights reserved. * * SPDX-License-Identifier: Apache-2.0 * diff --git a/ComputeGraph/examples/example1/CMakeLists.txt b/ComputeGraph/examples/example1/CMakeLists.txt index 6a37dd49..5920cc73 100644 --- a/ComputeGraph/examples/example1/CMakeLists.txt +++ b/ComputeGraph/examples/example1/CMakeLists.txt @@ -6,7 +6,7 @@ project(Example1) add_executable(example1 main.cpp) -sdf(example1) +sdf(example1 graph.py test) add_sdf_dir(example1) target_include_directories(example1 PRIVATE ${CMAKE_CURRENT_SOURCE_DIR}) diff --git a/ComputeGraph/examples/example1/README.md b/ComputeGraph/examples/example1/README.md new file mode 100644 index 00000000..679eed41 --- /dev/null +++ b/ComputeGraph/examples/example1/README.md @@ -0,0 +1,320 @@ +# Example 1 + +Please refer to the [simple example](../simple/README.md) to have an overview of how to define a graph and it nodes and how to generate the C++ code for the static scheduler. 
This document is only explaining additional details: + +* How to define new arguments for the C implementation of the nodes +* How to define new arguments for the C API of the scheduler function +* Detailed description of the generated C++ scheduler + +The graph is nearly the same as the one in the [simple example](../simple/README.md) but the processing node is just generating 5 samples in this example: + +graph1 + +Contrary to the [simple example](../simple/README.md), there is only one Python script `graph.py` and it is containing everything : nodes, graph description and C++ code generation. + +## Defining new arguments for a node and the scheduler + +For `ProcessingNode`, we are adding additional arguments in this example to show how it is possible to do it for initializing a node in the generated code. + +If `processing` is the node, we can add arguments with the APIs `addLiteralArg` and `addVariableArg`. + +```python +processing.addLiteralArg(4,"testString") +processing.addVariableArg("someVariable") +``` + +* `addLiteralArg(4,"testString")` will pass the value `4` as first additional argument of the C++ constructor (after the FIFOs) and the string `"testString"` as second additional argument of the C++ constructor (after the FIFOs) +* `addVariableArg("someVariable")` will pass the variable `someVariable` as third additional argument of the C++ constructor (after the FIFOs) + +The constructor API will look like: + +```C++ +ProcessingNode(FIFOBase &src,FIFOBase &dst,int,const char*,int) +``` + +This API is defined in `AppNodes.h` by the developer. The types are not generated by the scripts. Here the variable `someVariable` is chosen to have type `int` hence the last argument of the constructor has type `int`. But it is not imposed by the Python script that is just declaring the existence of a variable.
+ +In the generated scheduler, the constructor is used as: + +```C++ +ProcessingNode processing(fifo0,fifo1,4,"testString",someVariable); +``` + +This variable `someVariable` must come from somewhere. The API of the scheduler is: + +```C++ +extern uint32_t scheduler(int *error,int someVariable); +``` + +This new argument to the scheduler is defined in the Python script: + +```python +conf.cOptionalArgs=["int someVariable"] +``` + +## The C++ code + +The C++ code is generated in `scheduler.cpp` and `scheduler.h` in the `generated` folder. + +### scheduler.cpp + +#### Included headers + +The generated code is first including the needed headers: + +```C++ +#include "arm_math.h" +#include "custom.h" +#include "GenericNodes.h" +#include "AppNodes.h" +#include "scheduler.h" +``` + +- CMSIS-DSP header +- Custom definitions +- Generic nodes from `GenericNodes.h` +- Application nodes +- scheduler API + +#### Macros + +The generated code is then including some macro definitions that can all be redefined to customize some aspects of the generated scheduler. By default those macros, except `CHECKERROR`, are doing nothing: + +* CHECKERROR + * Check for an error after each node execution. Default action is to branch out of the scheduler loop and return an error +* CG_BEFORE_ITERATION + * Code to execute before each iteration of the scheduler +* CG_AFTER_ITERATION + * Code to execute after each iteration of the scheduler +* CG_BEFORE_SCHEDULE + * Code to execute before starting the scheduler loop +* CG_AFTER_SCHEDULE + * Code to execute after the end of the scheduler loop +* CG_BEFORE_BUFFER + * Code before any buffer definition.
Can be used, for instance, to align a buffer or to put this buffer in a specific memory section +* CG_BEFORE_FIFO_BUFFERS + * Code included before the definitions of the globals FIFO buffers +* CG_BEFORE_FIFO_INIT + * Code to execute before the creation of the FIFO C++ objects +* CG_BEFORE_NODE_INIT + * Code to execute before the creation of the node C++ objects +* CG_AFTER_INCLUDES + * Code coming after the include files (useful to add other include files after the default ones) +* CG_BEFORE_SCHEDULER_FUNCTION + * Code defined before the scheduler function +* CG_BEFORE_NODE_EXECUTION + * Code executed before a node execution +* CG_AFTER_NODE_EXECUTION + * Code executed after a node execution and before the error checking + +#### Memory buffers and FIFOs + +Then, the generated code is defining the buffers for the FIFOs. First the size are defined: + +```C++ +CG_BEFORE_FIFO_BUFFERS +/*********** + +FIFO buffers + +************/ +#define FIFOSIZE0 11 +#define FIFOSIZE1 5 +``` + +The FIFOs may have size different from the buffer when a buffer is shared between different FIFOs. So, there are different defines for the buffer sizes: + +```C++ +#define BUFFERSIZE1 11 +CG_BEFORE_BUFFER +float32_t buf1[BUFFERSIZE1]={0}; + +#define BUFFERSIZE2 5 +CG_BEFORE_BUFFER +float32_t buf2[BUFFERSIZE2]={0}; +``` + +In case of buffer sharing, a shared buffer will be defined with `int8_t` type. It is **very important** to align such a buffer by defining `CG_BEFORE_BUFFER` See the [FAQ](../../FAQ.md) for more information about alignment issues. + +#### Description of the schedule + +```C++ +static unsigned int schedule[17]= +{ +2,2,0,1,2,0,1,2,2,0,1,2,0,1,2,0,1, +}; +``` + +There are different code generation modes in the compute graph. By default, the schedule is encoded as a list of numbers and a `switch/case` is used to execute the node corresponding to an identification number. 
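
The schedule table above can be checked by hand. The following Python sketch is not part of the generated code: the names `SCHEDULE`, `RATES` and `replay` are made up for this illustration. It replays the 17 entries with the rates of this example (source producing 5 samples, processing consuming 7 and producing 5, sink consuming 5) and records the peak fill level of each FIFO. Under those assumptions, the peaks can be compared with `FIFOSIZE0` and `FIFOSIZE1`, and the byte count with the memory usage reported by the Python script.

```python
# Schedule copied from the generated scheduler.cpp of example 1.
SCHEDULE = [2, 2, 0, 1, 2, 0, 1, 2, 2, 0, 1, 2, 0, 1, 2, 0, 1]

# node id -> (samples read from fifo0, written to fifo0,
#             samples read from fifo1, written to fifo1)
RATES = {
    0: (7, 0, 0, 5),  # processing: consumes 7 from fifo0, produces 5 in fifo1
    1: (0, 0, 5, 0),  # sink: consumes 5 from fifo1
    2: (0, 5, 0, 0),  # source: produces 5 in fifo0
}


def replay(schedule, rates):
    """Return (peak fill levels, final fill levels) of fifo0 and fifo1."""
    level = [0, 0]
    peak = [0, 0]
    for node in schedule:
        r0, w0, r1, w1 = rates[node]
        # A static schedule must never read more than what is available.
        assert level[0] >= r0 and level[1] >= r1, "underflow"
        level[0] += w0 - r0
        level[1] += w1 - r1
        peak = [max(p, l) for p, l in zip(peak, level)]
    return peak, level


peak, final = replay(SCHEDULE, RATES)
print("peak levels :", peak)           # compare with FIFOSIZE0 / FIFOSIZE1
print("final levels:", final)          # state after one schedule iteration
print("bytes       :", sum(peak) * 4)  # 4 bytes per float32_t sample
```

The same replay can be used on other schedules of this kind by changing the table of rates.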
+ +#### Scheduler API + +Then, the scheduling function is generated: + +```C++ +uint32_t scheduler(int *error,int someVariable) { +``` + +A value `<0` in `error` means there was an error during the execution. + +The returned value is the number of schedules fully executed when the error occurred. + +The `someVariable` is defined in the Python script. The Python script can add as many arguments as needed with whatever type is needed. + +#### Scheduler locals + +The scheduling function is starting with a definition of some variables used for debug and statistics: + +```C++ +int cgStaticError=0; +uint32_t nbSchedule=0; +int32_t debugCounter=1; +``` + +Then, it is followed with a definition of the FIFOs: + +```C++ +CG_BEFORE_FIFO_INIT; +/* +Create FIFOs objects +*/ +FIFO fifo0(buf1); +FIFO fifo1(buf2); +``` + +The FIFO template has type: + +```C++ +template +class FIFO; +``` + +`isArray` is set to `1` when the Python code can deduce that the FIFO is always used as an array. In this case, the memory buffer may be shared with other FIFOs depending on the data flow dependencies of the graph. + +`isAsync` is set to `1` when the graph is an asynchronous one.
+ +Then, the nodes are created and connected to the FIFOs: + +```C++ +/* +Create node objects +*/ +ProcessingNode processing(fifo0,fifo1,4,"testString",someVariable); +Sink sink(fifo1); +Source source(fifo0); +``` + +And finally, the function is entering the scheduling loop: + +```C++ + /* Run several schedule iterations */ + CG_BEFORE_SCHEDULE; + while((cgStaticError==0) && (debugCounter > 0)) + { +``` + +The content of the loop is a `switch / case`: + +```C++ +CG_BEFORE_NODE_EXECUTION; + +switch(schedule[id]) +{ + case 0: + { + cgStaticError = processing.run(); + } + break; + + case 1: + { + cgStaticError = sink.run(); + } + break; + + case 2: + { + cgStaticError = source.run(); + } + break; + + default: + break; +} +CG_AFTER_NODE_EXECUTION; +CHECKERROR; +``` + +#### Error handling + +In case of error, the code is branching out to the end of the function: + +```C++ +errorHandling: + CG_AFTER_SCHEDULE; + *error=cgStaticError; + return(nbSchedule); +``` + +## Expected output + +Output of the Python script: + +``` +Schedule length = 17 +Memory usage 64 bytes +``` + +Output of the execution: + +``` +Start +Source +Source +ProcessingNode +Sink +3 +0 +0 +0 +0 +Source +ProcessingNode +Sink +10 +0 +0 +0 +0 +Source +Source +ProcessingNode +Sink +17 +0 +0 +0 +0 +Source +ProcessingNode +Sink +24 +0 +0 +0 +0 +Source +ProcessingNode +Sink +31 +0 +0 +0 +0 +``` + +The source is incrementing a counter and generate 0,1,2,3 ... + +The processing node is copying the 4th sample of the input to the first sample of the output. So there is a delta of 7 between each new value written to the output. + +The sink is displaying the 5 samples at the input. 
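
The values displayed by the sink can be reproduced with a small model of the three nodes. This is only a sketch in Python, not the C++ code of `AppNodes.h`: the helper names `source_stream`, `processing` and `run` are made up here, and it is assumed that the processing node leaves the 4 other output samples at 0, as the printed output suggests.

```python
def source_stream():
    """Counter source generating 0, 1, 2, 3 ..."""
    n = 0
    while True:
        yield n
        n += 1


def processing(block):
    """Copy the 4th input sample to the first output sample.

    Assumption: the other output samples stay at 0.
    """
    out = [0] * 5
    out[0] = block[3]
    return out


def run(nb_blocks):
    """Return what the sink would display for nb_blocks executions."""
    src = source_stream()
    results = []
    for _ in range(nb_blocks):
        # The processing node consumes 7 samples at each execution
        block = [next(src) for _ in range(7)]
        results.append(processing(block))
    return results


for out in run(5):
    print(out)
```

Since each execution of the processing node consumes 7 new samples of the counter, the first sample of two consecutive outputs differs by 7.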
diff --git a/ComputeGraph/examples/example1/docassets/graph1.PNG b/ComputeGraph/examples/example1/docassets/graph1.PNG new file mode 100644 index 00000000..0df797af Binary files /dev/null and b/ComputeGraph/examples/example1/docassets/graph1.PNG differ diff --git a/ComputeGraph/examples/example1/generated/scheduler.cpp b/ComputeGraph/examples/example1/generated/scheduler.cpp index fa8918bd..5f7d6724 100644 --- a/ComputeGraph/examples/example1/generated/scheduler.cpp +++ b/ComputeGraph/examples/example1/generated/scheduler.cpp @@ -102,8 +102,7 @@ float32_t buf2[BUFFERSIZE2]={0}; CG_BEFORE_SCHEDULER_FUNCTION -uint32_t scheduler(int *error,const char *testString, - int someVariable) +uint32_t scheduler(int *error,int someVariable) { int cgStaticError=0; uint32_t nbSchedule=0; @@ -120,7 +119,7 @@ uint32_t scheduler(int *error,const char *testString, /* Create node objects */ - ProcessingNode filter(fifo0,fifo1,4,testString,someVariable); + ProcessingNode processing(fifo0,fifo1,4,"testString",someVariable); Sink sink(fifo1); Source source(fifo0); @@ -138,7 +137,7 @@ uint32_t scheduler(int *error,const char *testString, { case 0: { - cgStaticError = filter.run(); + cgStaticError = processing.run(); } break; diff --git a/ComputeGraph/examples/example1/generated/scheduler.h b/ComputeGraph/examples/example1/generated/scheduler.h index f595e01f..c1d5cb0d 100644 --- a/ComputeGraph/examples/example1/generated/scheduler.h +++ b/ComputeGraph/examples/example1/generated/scheduler.h @@ -16,8 +16,7 @@ extern "C" #endif -extern uint32_t scheduler(int *error,const char *testString, - int someVariable); +extern uint32_t scheduler(int *error,int someVariable); #ifdef __cplusplus } diff --git a/ComputeGraph/examples/example1/graph.py b/ComputeGraph/examples/example1/graph.py index e6d33116..048ca19b 100644 --- a/ComputeGraph/examples/example1/graph.py +++ b/ComputeGraph/examples/example1/graph.py @@ -36,42 +36,29 @@ class ProcessingNode(Node): ### Define nodes floatType=CType(F32) 
src=Source("source",floatType,5) -b=ProcessingNode("filter",floatType,7,5) -b.addLiteralArg(4) -b.addVariableArg("testString","someVariable") +processing=ProcessingNode("processing",floatType,7,5) +processing.addLiteralArg(4,"testString") +processing.addVariableArg("someVariable") sink=Sink("sink",floatType,5) g = Graph() -g.connect(src.o,b.i) -g.connect(b.o,sink.i) +g.connect(src.o,processing.i) +g.connect(processing.o,sink.i) print("Generate graphviz and code") conf=Configuration() conf.debugLimit=1 -conf.cOptionalArgs=["const char *testString" - ,"int someVariable" +conf.cOptionalArgs=["int someVariable" ] -#conf.displayFIFOSizes=True -# Prefix for global FIFO buffers -#conf.prefix="sched1" -#conf.dumpSchedule = True sched = g.computeSchedule(config=conf) -#print(sched.schedule) + print("Schedule length = %d" % sched.scheduleLength) print("Memory usage %d bytes" % sched.memory) -# - -#conf.postCustomCName = "post.h" -#conf.CAPI = True -#conf.prefix="global" -#conf.dumpFIFO = True -#conf.CMSISDSP = False -#conf.switchCase = False sched.ccode("generated",conf) with open("test.dot","w") as f: diff --git a/ComputeGraph/examples/example1/main.cpp b/ComputeGraph/examples/example1/main.cpp index 8e26aa31..b0cb113f 100644 --- a/ComputeGraph/examples/example1/main.cpp +++ b/ComputeGraph/examples/example1/main.cpp @@ -6,6 +6,6 @@ int main(int argc, char const *argv[]) { int error; printf("Start\n"); - uint32_t nbSched=scheduler(&error,"Test",1); + uint32_t nbSched=scheduler(&error,1); return 0; } \ No newline at end of file diff --git a/ComputeGraph/examples/example1/test.dot b/ComputeGraph/examples/example1/test.dot index 96de529c..a481dc2c 100644 --- a/ComputeGraph/examples/example1/test.dot +++ b/ComputeGraph/examples/example1/test.dot @@ -9,10 +9,10 @@ digraph structs { fontname="times" -filter [label=< +processing [label=< - +
filter
(ProcessingNode)
processing
(ProcessingNode)
>]; @@ -32,13 +32,13 @@ source [label=< -source:i -> filter:i [label="f32(11)" +source:i -> processing:i [label="f32(11)" ,headlabel=<
7
> ,taillabel=<
5
>] -filter:i -> sink:i [label="f32(5)" +processing:i -> sink:i [label="f32(5)" ,headlabel=<
5
> ,taillabel=<
5 diff --git a/ComputeGraph/examples/example1/test.pdf b/ComputeGraph/examples/example1/test.pdf index 82166212..e8c85243 100644 Binary files a/ComputeGraph/examples/example1/test.pdf and b/ComputeGraph/examples/example1/test.pdf differ diff --git a/ComputeGraph/examples/example10/AppNodes.h b/ComputeGraph/examples/example10/AppNodes.h index aa3111d8..47e21fdc 100644 --- a/ComputeGraph/examples/example10/AppNodes.h +++ b/ComputeGraph/examples/example10/AppNodes.h @@ -1,15 +1,12 @@ /* ---------------------------------------------------------------------- * Project: CMSIS DSP Library * Title: AppNodes.h - * Description: Application nodes for Example 1 - * - * $Date: 29 July 2021 - * $Revision: V1.10.0 + * Description: Application nodes for Example 10 * * Target Processor: Cortex-M and Cortex-A cores - * -------------------------------------------------------------------- */ -/* - * Copyright (C) 2010-2021 ARM Limited or its affiliates. All rights reserved. + * -------------------------------------------------------------------- + * + * Copyright (C) 2021-2023 ARM Limited or its affiliates. All rights reserved. 
* * SPDX-License-Identifier: Apache-2.0 * diff --git a/ComputeGraph/examples/example10/CMakeLists.txt b/ComputeGraph/examples/example10/CMakeLists.txt index 71e9157f..e138c3a2 100644 --- a/ComputeGraph/examples/example10/CMakeLists.txt +++ b/ComputeGraph/examples/example10/CMakeLists.txt @@ -6,7 +6,7 @@ project(Example10) add_executable(example10 main.cpp) -sdf(example10) +sdf(example10 graph.py test) add_sdf_dir(example10) target_include_directories(example10 PRIVATE ${CMAKE_CURRENT_SOURCE_DIR}) diff --git a/ComputeGraph/examples/example10/README.md b/ComputeGraph/examples/example10/README.md new file mode 100644 index 00000000..0abe65ee --- /dev/null +++ b/ComputeGraph/examples/example10/README.md @@ -0,0 +1,68 @@ +# Example 10 + +Please refer to the [simple example](../simple/README.md) to have an overview of how to define a graph and it nodes and how to generate the C++ code for the static scheduler. This document is only explaining additional details + +This example is implementing a [dynamic / asynchronous mode](../../Async.md). + +It is enabled in `graph.py` with: + +`conf.asynchronous = True` + +There is an option to increase the FIFO size compared to their synchronous values. To double the value (increase by `100%`) we write: + +`conf.FIFOIncrease = 100` + +The graph implemented in this example is: + +![graph10](docassets/graph10.png) + +There is a global iteration count corresponding to one execution of the schedule. + +The odd source is generating a value only when the count is odd. + +The even source is generating a value only when the count is even. + +The processing is adding its inputs. If no data is available on an input, 0 is used. + +In case of FIFO overflow or underflow, any node will skip its execution. + +All nodes are generating or consuming one sample but the FIFOs have a size of 2 because of the 100% increase requested in the configuration settings. 
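
The effect of `FIFOIncrease` on a FIFO size can be sketched as follows. The function `increased_fifo_size` is hypothetical and is not the code of the scheduling scripts. In particular, rounding up to a whole number of samples is an assumption: the rounding really used is not described in this README.

```python
import math


def increased_fifo_size(static_size, fifo_increase_percent):
    """Size of a FIFO after applying a percent increase.

    Assumption: the result is rounded up to a whole number of samples.
    """
    return math.ceil(static_size * (100 + fifo_increase_percent) / 100)


# One-sample FIFO of this example with conf.FIFOIncrease = 100
print(increased_fifo_size(1, 100))
```

With a static size of one sample and an increase of `100%`, this gives the size of 2 mentioned above.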
+ +Thus in this example : + +* A sample is not always generated on an edge +* A sample is not always available on an edge + +The dataflow on each edge is thus not static and varies between iterations of the schedule. + +## Expected outputs + +``` +Schedule length = 9 +Memory usage 34 bytes +``` + +``` +Start +0 +0 +1 +1 +2 +2 +3 +3 +4 +4 +5 +5 +6 +6 +7 +7 +8 +8 +9 +9 +``` + diff --git a/ComputeGraph/documentation/graph10.png b/ComputeGraph/examples/example10/docassets/graph10.png similarity index 100% rename from ComputeGraph/documentation/graph10.png rename to ComputeGraph/examples/example10/docassets/graph10.png diff --git a/ComputeGraph/examples/example10/graph.py b/ComputeGraph/examples/example10/graph.py index 550ce59d..95dee4d0 100644 --- a/ComputeGraph/examples/example10/graph.py +++ b/ComputeGraph/examples/example10/graph.py @@ -2,7 +2,7 @@ from cmsisdsp.cg.scheduler import * ### Define new types of Nodes - + class SinkAsync(GenericSink): def __init__(self,name,theType,inLength): diff --git a/ComputeGraph/examples/example2/AppNodes.h b/ComputeGraph/examples/example2/AppNodes.h index 28f1a430..5481d572 100644 --- a/ComputeGraph/examples/example2/AppNodes.h +++ b/ComputeGraph/examples/example2/AppNodes.h @@ -3,13 +3,11 @@ * Title: AppNodes.h * Description: Application nodes for Example 2 * - * $Date: 29 July 2021 - * $Revision: V1.10.0 * * Target Processor: Cortex-M and Cortex-A cores - * -------------------------------------------------------------------- */ -/* - * Copyright (C) 2010-2021 ARM Limited or its affiliates. All rights reserved. + * -------------------------------------------------------------------- + * + * Copyright (C) 2021-2023 ARM Limited or its affiliates. All rights reserved.
* * SPDX-License-Identifier: Apache-2.0 * diff --git a/ComputeGraph/examples/example2/CMakeLists.txt b/ComputeGraph/examples/example2/CMakeLists.txt index 8f9fb524..d5167dd5 100644 --- a/ComputeGraph/examples/example2/CMakeLists.txt +++ b/ComputeGraph/examples/example2/CMakeLists.txt @@ -6,7 +6,7 @@ project(Example2) add_executable(example2 main.cpp) -sdf(example2) +sdf(example2 graph.py test) add_sdf_dir(example2) target_include_directories(example2 PRIVATE ${CMAKE_CURRENT_SOURCE_DIR}) diff --git a/ComputeGraph/documentation/example2.md b/ComputeGraph/examples/example2/README.md similarity index 59% rename from ComputeGraph/documentation/example2.md rename to ComputeGraph/examples/example2/README.md index 60a01830..6cfc1cc8 100644 --- a/ComputeGraph/documentation/example2.md +++ b/ComputeGraph/examples/example2/README.md @@ -1,20 +1,23 @@ # Example 2 -Please refer to [Example 1](example1.md) for the details about how to create a graph and the C++ support classes. +Please refer to the [simple example](../simple/README.md) to have an overview of how to define a graph and it nodes and how to generate the C++ code for the static scheduler. + +The [simple example with CMSIS-DSP](../simpledsp/README.md) is giving more details about `Constant` nodes and CMSIS-DSP functions in the compute graph. In this example. we are just analyzing a much more complex example to see some new features: - Delay -- CMSIS-DSP functions -- Some default nodes : sliding buffer +- SlidingBuffer + +This example is not really using a MFCC or a TensorFlow Lite node. 
It is just providing some wrappers to show how such a nodes could be included in a graph: The graph is: -![graph2](graph2.PNG) +![graph2](docassets/graph2.PNG) It is much more complex: -- First we have a source delayed by 10 samples ; +- First we have a stereo source delayed by 10 samples ; - Then this stereo source is split into left/right samples using the default block Unzip - The samples are divided by 2 using a CMSIS-DSP function - The node HALF representing a constant is introduced (constant arrays are also supported) @@ -24,18 +27,11 @@ It is much more complex: - Another sliding buffer - An a block representing TensorFlow Lite for Micro (a fake TFLite node) -Note that those blocks (MFCC, TFLite) are doing nothing in this example. It is just to illustrate a more complex example that someone may want to experiment with for keyword spotting. +Note that those blocks (MFCC, TFLite) are doing nothing in this example. It is just to illustrate a more complex example typical of keyword spotting applications. Examples 5 and 6 are showing how to use the CMSIS-DSP MFCC. -The new features compared to `example1` are: - -- Delay -- CMSIS-DSP function -- Constant node -- SlidingBuffer - -Let's look at all of this: +Let's look at the new features compared to example 1: ## Delay @@ -43,9 +39,7 @@ Let's look at all of this: g.connectWithDelay(src.o, toMono.i,10) ``` - - -To add a delay on a link between 2 nodes, you just use the `connectWithDelay` function. Delays can be useful for some graphs which are not schedulable. They are implemented by starting the schedule with a FIFO which is not empty but contain 0 samples. +To add a delay on a link between 2 nodes, you just use the `connectWithDelay` function. Delays can be useful for some graphs which are not schedulable. They are implemented by starting the schedule with a FIFO which is not empty but contain some 0 samples. 
## CMSIS-DSP function @@ -59,16 +53,18 @@ sa=Dsp("scale",floatType,blockSize) The corresponding CMSIS-DSP function will be named: `arm_scale_f32` -The code generated in `sched.cpp` will not require any C++ class, It will look like: +The code generated in `scheduler.cpp` will not require any C++ class, It will look like: ```C++ { - float32_t* i0; - float32_t* o2; - i0=fifo2.getReadBuffer(160); - o2=fifo4.getWriteBuffer(160); - arm_scale_f32(i0,HALF,o2,160); - cgStaticError = 0; + float32_t* i0; + float32_t* i1; + float32_t* o2; + i0=fifo3.getReadBuffer(160); + i1=fifo4.getReadBuffer(160); + o2=fifo5.getWriteBuffer(160); + arm_add_f32(i0,i1,o2,160); + cgStaticError = 0; } ``` @@ -84,23 +80,21 @@ A constant node is defined as: half=Constant("HALF") ``` +In the C++ code, `HALF` is expected to be a value defined in `custom.h` - -In the C++ code, HALF is expected to be a value defined in custom.h - -In the Python generated code, it would be in custom.py - -Constant values are not involved in the scheduling (they are ignored) and they have no io. So, to connect to a constant node we do: +Constant values are not involved in the scheduling (they are ignored) and they have no IO. So, to connect to a constant node we do: ```python g.connect(half,sa.ib) ``` -There is no "o", "oa" suffixes for the constant node half. +There is no "o", "oa" suffixes for the constant node `half`. + +Constant nodes are just here to make it easier to use CMSIS-DSP functions. ## SlidingBuffer -Sliding buffers and OverlapAndAdd are used a lot so they are provided by default. +Sliding buffers and OverlapAndAdd are used a lot so they are provided in the `cg/nodes/cpp`folder of the `ComputeGraph` folder. In Python, it can be used with: @@ -114,3 +108,18 @@ There is no C++ class to write for this since it is provided by default by the f It is named `SlidingBuffer` but not `SlidingWindow` because no multiplication with a window is done. 
It must be implemented with another block as will be demonstrated in the [example 3](example3.md) +## Expected outputs + +``` +Schedule length = 302 +Memory usage 10720 bytes +``` + +And when executed: + +``` +Start +Nb = 40 +``` + +Execution is running for 40 iterations without errors. diff --git a/ComputeGraph/documentation/graph2.PNG b/ComputeGraph/examples/example2/docassets/graph2.PNG similarity index 100% rename from ComputeGraph/documentation/graph2.PNG rename to ComputeGraph/examples/example2/docassets/graph2.PNG diff --git a/ComputeGraph/examples/example3/AppNodes.h b/ComputeGraph/examples/example3/AppNodes.h index 6136701b..7df66010 100644 --- a/ComputeGraph/examples/example3/AppNodes.h +++ b/ComputeGraph/examples/example3/AppNodes.h @@ -3,13 +3,10 @@ * Title: AppNodes.h * Description: Application nodes for Example 3 * - * $Date: 29 July 2021 - * $Revision: V1.10.0 - * * Target Processor: Cortex-M and Cortex-A cores - * -------------------------------------------------------------------- */ -/* - * Copyright (C) 2010-2021 ARM Limited or its affiliates. All rights reserved. + * -------------------------------------------------------------------- +* + * Copyright (C) 2021-2023 ARM Limited or its affiliates. All rights reserved. 
* * SPDX-License-Identifier: Apache-2.0 * diff --git a/ComputeGraph/examples/example3/CMakeLists.txt b/ComputeGraph/examples/example3/CMakeLists.txt index 5b955c2b..9c647583 100644 --- a/ComputeGraph/examples/example3/CMakeLists.txt +++ b/ComputeGraph/examples/example3/CMakeLists.txt @@ -6,7 +6,7 @@ project(Example3) add_executable(example3 main.cpp custom.cpp) -sdf(example3) +sdf(example3 graph.py test) add_sdf_dir(example3) target_include_directories(example3 PRIVATE ${CMAKE_CURRENT_SOURCE_DIR}) diff --git a/ComputeGraph/examples/example3/README.md b/ComputeGraph/examples/example3/README.md new file mode 100644 index 00000000..0a78b2ac --- /dev/null +++ b/ComputeGraph/examples/example3/README.md @@ -0,0 +1,172 @@ +# Example 3 + +Please refer to the [simple example](../simple/README.md) to have an overview of how to define a graph and its nodes and how to generate the C++ code for the static scheduler. This document is only explaining additional details. + +This example is implementing a working example with FFT. The graph is: + +![graph3](docassets/graph3.PNG) + +The example is: + +- Providing a file source which is reading a source file and then padding with zero +- A sliding window +- A multiplication with a Hann window +- A conversion to/from complex +- Use of CMSIS-DSP FFT/IFFT +- Overlap and add +- File sink writing the result into a file + +The new features compared to previous examples are: + +- The constant array HANN +- The CMSIS-DSP FFT + +## Constant array + +It is like in example 2 where the constant was a float. + +Now, the constant is an array: + +```python +hann=Constant("HANN") +``` + +In `custom.h`, this array is defined as: + +```C++ +extern const float32_t HANN[256]; +``` + + + +## CMSIS-DSP FFT + +The FFT node cannot be created using a `Dsp` node in Python because FFT is requiring specific initializations. So, a Python class and C++ class must be created. 
They are provided by default in the framework but let's look at how they are implemented: + +```python +class CFFT(GenericNode): + def __init__(self,name,theType,inLength): + GenericNode.__init__(self,name) + + self.addInput("i",theType,2*inLength) + self.addOutput("o",theType,2*inLength) + + @property + def typeName(self): + return "CFFT" +``` + +Look at the definition of the inputs and outputs : The FFT is using complex numbers so the ports have twice the number of float samples. The argument of the constructor is the FFT length in **complex** samples but `addInput` and `addOutput` require the number of samples of the base type : here float. + +We suggest using, as arguments of the blocks, a number of samples which is meaningful for the blocks and using the lengths in standard data type (f32, q31 ...) when defining the IO. + +So here, the number of complex samples is used as argument. But the IOs are using the number of floats required to encode those complex numbers hence a factor of 2. + +The C++ template is: + +```C++ +template +class CFFT; +``` + +There are only specific implementations for specific datatypes. No generic implementation is provided. + +For float, we have: + +```C++ +template +class CFFT: public GenericNode +{ +public: + CFFT(FIFOBase &src,FIFOBase &dst):GenericNode(src,dst) + { + arm_status status; + status=arm_cfft_init_f32(&sfft,inputSize>>1); + }; + + int prepareForRunning() override + { + if (this->willOverflow() || + this->willUnderflow()) + { + return(CG_SKIP_EXECUTION_ID_CODE); // Skip execution + } + + return(0); + }; + + int run() override + { + float32_t *a=this->getReadBuffer(); + float32_t *b=this->getWriteBuffer(); + memcpy((void*)b,(void*)a,inputSize*sizeof(float32_t)); + arm_cfft_f32(&sfft,b,0,1); + return(0); + }; + + arm_cfft_instance_f32 sfft; + +}; +``` + +It is verbose but not difficult. The constructor is initializing the CMSIS-DSP FFT instance and connecting to the FIFO (through GenericNode). 
+ +The run function is applying the `arm_cfft_f32`. Since this function is modifying the input buffer, there is a `memcpy`. It is not really needed here. The read buffer can be modified by the CFFT. It will just make it more difficult to debug if you'd like to inspect the content of the FIFOs. + +The function `prepareForRunning` is only used in asynchronous mode. Please refer to the documentation for the asynchronous mode. + +This node is provided in `cg/nodes/cpp` so no need to define it. You can just use it by including the right headers. + +It can be used by just doing in your `AppNodes.h` file : + +```c++ +#include "CFFT.h" +``` + +From Python side it would be: + +```python +from cmsisdsp.cg.scheduler import * +``` + +The scheduler module is automatically including the default nodes. + +## Expected output + +Output of Python script: + +``` +Schedule length = 25 +Memory usage 11264 bytes +``` + +Output of execution: + +``` +Start +Nb = 40 +``` + +It is running for 40 iterations of the scheduler without errors. + +The Python script `debug.py` can be used to display the content of `input_example3.txt` and `../build/output_example3.txt` + +It should display the same sinusoid but it is delayed in `output_example3.txt` by a few samples because of the sliding buffer. The sliding buffer will generate 256 samples in output each time 128 samples are received in input. As a consequence, at start, 256 samples with the half set to zero are generated. + +We can check it in the debug script by comparing a delayed version of the original to the output. + +You should get something like: + +![sine](docassets/sine.png) + +We have 40 executions of the schedule iteration. In each schedule iteration we have two sinks. A sink is producing 192 samples. + +So, the execution is producing `40 * 2 * 192 == 15360` samples, so a bit less than the `16000` samples in input. 
+ +If we compare the input and output taking into account this length difference and the delay of 128 samples, we get (by running `debug.py`): + +``` +Comparison of input and output : max absolute error +6.59404862823898e-07 +``` + diff --git a/ComputeGraph/examples/example3/debug.py b/ComputeGraph/examples/example3/debug.py new file mode 100644 index 00000000..5cd13e05 --- /dev/null +++ b/ComputeGraph/examples/example3/debug.py @@ -0,0 +1,20 @@ +import numpy as np +from pylab import figure, clf, plot, xlabel, ylabel, xlim, ylim, title, grid, axes, show,semilogx, semilogy +from numpy import genfromtxt +ref_data = genfromtxt('input_example3.txt', delimiter=',') + +figure() +plot(ref_data) + +output_data = genfromtxt('../build/output_example3.txt', delimiter=',') + +plot(output_data) +show() + +print(ref_data.shape) +print(output_data.shape) +nb = output_data.shape[0] - 128 + +print("Comparison of input and output : max absolute error") +diff = output_data[128:] - ref_data[:nb] +print(np.max(np.abs(diff))) diff --git a/ComputeGraph/documentation/graph3.PNG b/ComputeGraph/examples/example3/docassets/graph3.PNG similarity index 100% rename from ComputeGraph/documentation/graph3.PNG rename to ComputeGraph/examples/example3/docassets/graph3.PNG diff --git a/ComputeGraph/examples/example3/docassets/sine.png b/ComputeGraph/examples/example3/docassets/sine.png new file mode 100644 index 00000000..5f2b5d72 Binary files /dev/null and b/ComputeGraph/examples/example3/docassets/sine.png differ diff --git a/ComputeGraph/documentation/example4.md b/ComputeGraph/examples/example4/README.md similarity index 82% rename from ComputeGraph/documentation/example4.md rename to ComputeGraph/examples/example4/README.md index 91bde327..0921aee8 100644 --- a/ComputeGraph/documentation/example4.md +++ b/ComputeGraph/examples/example4/README.md @@ -2,6 +2,8 @@ It is exactly the same example as example 3 but the code generation is generating Python code instead of C++. 
+![graph4](docassets/graph4.png) + The Python code is generated with: ```python @@ -12,6 +14,12 @@ and it will generate a `sched.py` file. A file `custom.py` and `appnodes.py` are also required. +The example can be run with: + +`python main.py` + +Do not confuse `graph.py,` which is used to describe the graph, with the other Python files that are used to execute the graph. + ## custom.py ```python @@ -25,7 +33,7 @@ An array HANN is defined for the Hann window. ## appnodes.py -This file is defining the new nodes which were used in `graph.py`. In `graph.py` which are just defining new kind of nodes for scheduling purpose : type and sizes. +This file is defining the new nodes which were used in `graph.py`. In `appnodes.py` we including new kind of nodes for simulation purpose: @@ -33,8 +41,6 @@ In `appnodes.py` we including new kind of nodes for simulation purpose: from cmsisdsp.cg.scheduler import * ``` - - The CFFT is very similar to the C++ version of example 3. But there is no `prepareForRunning`. Dynamic / asynchronous mode is not implemented for Python. ```python @@ -110,3 +116,22 @@ DISPBUF = np.zeros(16000) nb,error = s.scheduler(DISPBUF) ``` +The example can be run with: + +`python main.py` + + + +## Expected outputs + +``` +Generate graphviz and code +Schedule length = 25 +Memory usage 11264 bytes +``` + +And when executed: + +![sine](docassets/sine.png) + +As you can see at the beginning, there is a small delay during which the output signal is zero. 
diff --git a/ComputeGraph/examples/example4/appnodes.py b/ComputeGraph/examples/example4/appnodes.py index 2e2a85fb..7c1d6da8 100644 --- a/ComputeGraph/examples/example4/appnodes.py +++ b/ComputeGraph/examples/example4/appnodes.py @@ -3,13 +3,11 @@ # Title: appnodes.py # Description: Application nodes for Example 4 # -# $Date: 29 July 2021 -# $Revision: V1.10.0 # # Target Processor: Cortex-M and Cortex-A cores # -------------------------------------------------------------------- */ # -# Copyright (C) 2010-2021 ARM Limited or its affiliates. All rights reserved. +# Copyright (C) 2021-2023 ARM Limited or its affiliates. All rights reserved. # # SPDX-License-Identifier: Apache-2.0 # diff --git a/ComputeGraph/examples/example4/debug.py b/ComputeGraph/examples/example4/debug.py deleted file mode 100644 index becd1195..00000000 --- a/ComputeGraph/examples/example4/debug.py +++ /dev/null @@ -1,20 +0,0 @@ -import numpy as np -from cmsisdsp.cg.static.nodes.simu import * - -a=np.zeros(10) -f=FIFO(10,a) - -f.dump() - -nb = 1 -for i in range(4): - w=f.getWriteBuffer(2) - w[0:2]=nb*np.ones(2) - nb = nb + 1 - f.dump() - -print(a) - -for i in range(4): - w=f.getReadBuffer(2) - print(w) \ No newline at end of file diff --git a/ComputeGraph/examples/example4/docassets/graph4.png b/ComputeGraph/examples/example4/docassets/graph4.png new file mode 100644 index 00000000..815ad3f1 Binary files /dev/null and b/ComputeGraph/examples/example4/docassets/graph4.png differ diff --git a/ComputeGraph/examples/example4/docassets/sine.png b/ComputeGraph/examples/example4/docassets/sine.png new file mode 100644 index 00000000..6b2bc8bc Binary files /dev/null and b/ComputeGraph/examples/example4/docassets/sine.png differ diff --git a/ComputeGraph/examples/example5/README.md b/ComputeGraph/examples/example5/README.md new file mode 100644 index 00000000..979bd9c4 --- /dev/null +++ b/ComputeGraph/examples/example5/README.md @@ -0,0 +1,25 @@ +# Example 5 + +This is a pure python example. 
It is computing a sequence of MFCC with an overlap of 0.5 s and it is creating an animation. + +It can be run with: + +`python main.py` + +The `NumPy` sink at the end is just recording all the MFCC outputs as a list of buffers. This list is used to create an animation. + +graph5 + +## Expected output + +``` +Generate graphviz and code +Schedule length = 292 +Memory usage 6614 bytes +``` + +And when executed you should get an animation looking like this: + +![mfcc](docassets/mfcc.png) + +The Python `main.py` contains a line which can be uncommented to record the animation as a `.mp4` video. \ No newline at end of file diff --git a/ComputeGraph/examples/example5/appnodes.py b/ComputeGraph/examples/example5/appnodes.py index c7f7667b..bbc5cc04 100644 --- a/ComputeGraph/examples/example5/appnodes.py +++ b/ComputeGraph/examples/example5/appnodes.py @@ -1,15 +1,12 @@ ########################################### # Project: CMSIS DSP Library # Title: appnodes.py -# Description: Application nodes for Example 4 -# -# $Date: 29 July 2021 -# $Revision: V1.10.0 +# Description: Application nodes for Example 5 # # Target Processor: Cortex-M and Cortex-A cores # -------------------------------------------------------------------- */ # -# Copyright (C) 2010-2021 ARM Limited or its affiliates. All rights reserved. +# Copyright (C) 2021-2023 ARM Limited or its affiliates. All rights reserved. 
# # SPDX-License-Identifier: Apache-2.0 # diff --git a/ComputeGraph/examples/example5/docassets/graph5.png b/ComputeGraph/examples/example5/docassets/graph5.png new file mode 100644 index 00000000..fd0eb769 Binary files /dev/null and b/ComputeGraph/examples/example5/docassets/graph5.png differ diff --git a/ComputeGraph/examples/example5/docassets/mfcc.png b/ComputeGraph/examples/example5/docassets/mfcc.png new file mode 100644 index 00000000..f1890194 Binary files /dev/null and b/ComputeGraph/examples/example5/docassets/mfcc.png differ diff --git a/ComputeGraph/examples/example6/AppNodes.h b/ComputeGraph/examples/example6/AppNodes.h index e7c58343..b989dc14 100644 --- a/ComputeGraph/examples/example6/AppNodes.h +++ b/ComputeGraph/examples/example6/AppNodes.h @@ -3,13 +3,10 @@ * Title: AppNodes.h * Description: Application nodes for Example 6 * - * $Date: 29 July 2021 - * $Revision: V1.10.0 - * * Target Processor: Cortex-M and Cortex-A cores - * -------------------------------------------------------------------- */ -/* - * Copyright (C) 2010-2021 ARM Limited or its affiliates. All rights reserved. + * -------------------------------------------------------------------- + * + * Copyright (C) 2021-2023 ARM Limited or its affiliates. All rights reserved. 
* * SPDX-License-Identifier: Apache-2.0 * diff --git a/ComputeGraph/examples/example6/CMakeLists.txt b/ComputeGraph/examples/example6/CMakeLists.txt index 1591cd23..9d53513a 100644 --- a/ComputeGraph/examples/example6/CMakeLists.txt +++ b/ComputeGraph/examples/example6/CMakeLists.txt @@ -6,7 +6,7 @@ project(Example6) add_executable(example6 main.cpp mfccConfigData.c) -sdf(example6) +sdf(example6 graph.py test) add_sdf_dir(example6) target_include_directories(example6 PRIVATE ${CMAKE_CURRENT_SOURCE_DIR}) diff --git a/ComputeGraph/examples/example6/README.md b/ComputeGraph/examples/example6/README.md new file mode 100644 index 00000000..74b4fb82 --- /dev/null +++ b/ComputeGraph/examples/example6/README.md @@ -0,0 +1,15 @@ +# Example 6 + +This example is similar to example 5 but with C code generation instead of Python. + +![graph6](docassets/graph6.png) + +## Expected output + +``` +nbMFCCOutputs = 126 +Generate graphviz and code +Schedule length = 17 +Memory usage 2204 bytes +``` + diff --git a/ComputeGraph/examples/example6/docassets/graph6.png b/ComputeGraph/examples/example6/docassets/graph6.png new file mode 100644 index 00000000..caf7ea9c Binary files /dev/null and b/ComputeGraph/examples/example6/docassets/graph6.png differ diff --git a/ComputeGraph/examples/example7/PythonTest.mo b/ComputeGraph/examples/example7/PythonTest.mo index 778cdf66..0611783b 100644 --- a/ComputeGraph/examples/example7/PythonTest.mo +++ b/ComputeGraph/examples/example7/PythonTest.mo @@ -13,7 +13,7 @@ model PythonTest Placement(visible = true, transformation(origin = {-82, 8}, extent = {{-10, -10}, {10, 10}}, rotation = 0))); inner Modelica.Blocks.Noise.GlobalSeed globalSeed annotation( Placement(visible = true, transformation(origin = {-86, -28}, extent = {{-10, -10}, {10, 10}}, rotation = 0))); - ARM.Sound.WaveOutput waveOutput annotation( + ARM.Sound.WaveOutput waveOutput(path = "C:\\benchresults\\cmsis\\CMSIS-DSP\\ComputeGraph\\examples\\example7\\output.wav") annotation( 
Placement(visible = true, transformation(origin = {24, -32}, extent = {{-10, -10}, {10, 10}}, rotation = 0))); equation connect(vht.y, transferFunction.u) annotation( diff --git a/ComputeGraph/examples/example7/README.md b/ComputeGraph/examples/example7/README.md new file mode 100644 index 00000000..574046c2 --- /dev/null +++ b/ComputeGraph/examples/example7/README.md @@ -0,0 +1,62 @@ +# Example 7 + +This is an example showing how a graph in Python (not C) can interact with an [OpenModelica](https://openmodelica.org/) model. + +![graph7](docassets/graph7.png) + +First you need to get the project [AVH-SystemModeling](https://github.com/ARM-software/AVH-SystemModeling) from our ARM-Software repository. + +Then, you need to launch `OpenModelica` and choose `Open Model`. + +Select `AVH-SystemModeling/VHTModelicaBlock/ARM/package.mo` + +Then choose `Open Model` again and select `PythonTest.mo`. + +You should see something like that in `Open Modelica`: + +![modelica](docassets/modelica.png) + +Customize the output path in the `Wave` node. + +Refer to the `Open Modelica` documentation to know how to build and run this simulation. Once it is started in Modelica, launch the Python script in `example7`: + +`python main.py` + +You should see : + +``` +Connecting as INPUT +Connecting as OUTPUT +``` + +In the Modelica window, the simulation should continue to `100%`. + +In the simulation window, you should be able to plot the output wav and get something like: + +![waveoutput](docassets/waveoutput.png) + +A `.wav` should have been generated so that you can listen to the result : A Larsen effect ! 
+ +The `Processing` node in the compute graph is implemented in `custom.py` and is a gain computed with `CMSIS-DSP` Python wrapper + +```python +class Processing(GenericNode): + def __init__(self,inputSize,outputSize,fifoin,fifoout): + GenericNode.__init__(self,inputSize,outputSize,fifoin,fifoout) + + def run(self): + + i=self.getReadBuffer() + o=self.getWriteBuffer() + + b=dsp.arm_scale_q15(i,0x6000,1) + + o[:]=b[:] + + return(0) +``` + + + +The gain has been chosen to create an instability. + diff --git a/ComputeGraph/examples/example7/appnodes.py b/ComputeGraph/examples/example7/appnodes.py index 42fd62ea..f21604c0 100644 --- a/ComputeGraph/examples/example7/appnodes.py +++ b/ComputeGraph/examples/example7/appnodes.py @@ -1,15 +1,12 @@ ########################################### # Project: CMSIS DSP Library # Title: appnodes.py -# Description: Application nodes for Example 4 -# -# $Date: 29 July 2021 -# $Revision: V1.10.0 +# Description: Application nodes for Example 7 # # Target Processor: Cortex-M and Cortex-A cores # -------------------------------------------------------------------- */ # -# Copyright (C) 2010-2022 ARM Limited or its affiliates. All rights reserved. +# Copyright (C) 2021-2023 ARM Limited or its affiliates. All rights reserved. 
# # SPDX-License-Identifier: Apache-2.0 # diff --git a/ComputeGraph/examples/example7/docassets/graph7.png b/ComputeGraph/examples/example7/docassets/graph7.png new file mode 100644 index 00000000..83ee74c5 Binary files /dev/null and b/ComputeGraph/examples/example7/docassets/graph7.png differ diff --git a/ComputeGraph/examples/example7/docassets/modelica.png b/ComputeGraph/examples/example7/docassets/modelica.png new file mode 100644 index 00000000..616707c4 Binary files /dev/null and b/ComputeGraph/examples/example7/docassets/modelica.png differ diff --git a/ComputeGraph/examples/example7/docassets/waveoutput.png b/ComputeGraph/examples/example7/docassets/waveoutput.png new file mode 100644 index 00000000..5b59e5dd Binary files /dev/null and b/ComputeGraph/examples/example7/docassets/waveoutput.png differ diff --git a/ComputeGraph/examples/example7/graph.py b/ComputeGraph/examples/example7/graph.py index 93f45b9b..f546a684 100644 --- a/ComputeGraph/examples/example7/graph.py +++ b/ComputeGraph/examples/example7/graph.py @@ -33,18 +33,11 @@ print("Generate graphviz and code") conf=Configuration() -#conf.dumpSchedule = True sched = g.computeSchedule(conf) -#print(sched.schedule) print("Schedule length = %d" % sched.scheduleLength) print("Memory usage %d bytes" % sched.memory) -# -# Pass the source and sink objects used to communicate with the VHT Modelica block -#conf.pyOptionalArgs="" -conf.pathToSDFModule="C:\\\\benchresults\\\\cmsis_docker\\\\CMSIS\\\\DSP\\\\SDFTools" -#conf.dumpFIFO=True -#conf.prefix="sched1" + sched.pythoncode(".",config=conf) with open("test.dot","w") as f: diff --git a/ComputeGraph/examples/example7/output.wav b/ComputeGraph/examples/example7/output.wav new file mode 100644 index 00000000..1485d01b Binary files /dev/null and b/ComputeGraph/examples/example7/output.wav differ diff --git a/ComputeGraph/examples/example8/AppNodes.h b/ComputeGraph/examples/example8/AppNodes.h index d818ba65..f115b36f 100644 --- 
a/ComputeGraph/examples/example8/AppNodes.h +++ b/ComputeGraph/examples/example8/AppNodes.h @@ -1,14 +1,11 @@ /* ---------------------------------------------------------------------- * Project: CMSIS DSP Library * Title: AppNodes.h - * Description: Application nodes for Example 1 - * - * $Date: 29 July 2021 - * $Revision: V1.10.0 + * Description: Application nodes for Example 8 * * Target Processor: Cortex-M and Cortex-A cores - * -------------------------------------------------------------------- */ -/* + * -------------------------------------------------------------------- +* * Copyright (C) 2010-2021 ARM Limited or its affiliates. All rights reserved. * * SPDX-License-Identifier: Apache-2.0 diff --git a/ComputeGraph/examples/example8/CMakeLists.txt b/ComputeGraph/examples/example8/CMakeLists.txt index 83f7c875..8a9351ee 100644 --- a/ComputeGraph/examples/example8/CMakeLists.txt +++ b/ComputeGraph/examples/example8/CMakeLists.txt @@ -6,7 +6,7 @@ project(Example8) add_executable(example8 main.cpp) -sdf(example8) +sdf(example8 graph.py test) add_sdf_dir(example8) target_include_directories(example8 PRIVATE ${CMAKE_CURRENT_SOURCE_DIR}) diff --git a/ComputeGraph/examples/example8/README.md b/ComputeGraph/examples/example8/README.md new file mode 100644 index 00000000..4b995a64 --- /dev/null +++ b/ComputeGraph/examples/example8/README.md @@ -0,0 +1,54 @@ +# Example 8 + +This example is illustrating : + +* The `Duplicate` node to have a one-to-many connection at an output +* A structured datatype for the samples in the connections + +![graph8](docassets/graph8.png) + +## Structured datatype + +It is possible to use a custom datatype: + +```python +complexType=CStructType("complex","MyComplex",8) +``` + +This is defining a new datatype that is mapped to the type `complex` in C/C++ and the class `MyComplex` in Python. The last argument is the size in bytes of the struct in C. 
+ +The type complex may be defined with: + +```c +typedef struct { + float re; + float im; +} complex; +``` + +**Note that:** + +- The value **must have** value semantics in C/C++. So avoid classes +- In Python, the classes have reference semantics which implies some constraints: + - You should never modify an object from the read buffer + - You should change the field of an object in the write buffer but not the object itself + - If you need a new object : copy or create a new object. Never use an object from the read buffer as it is if you intend to customize it + +The size of the C structure should take into account the padding that may be added to the struct. + +When no buffer sharing is used, the size of buffers is always expressed in number of samples. + +But in case of buffer sharing, the datatype of the buffer is `int8_t` and the size of the buffer must be computed by the Compute Graph taking into account any padding that may exist. + +## Duplicate node + +In case of one-to-many connections, the Python code will automatically add `Duplicate` nodes in the graph. Those `Duplicate` nodes do not appear directly in the graphviz but only in a stylized way : a dot. + +Currently it is limited to 3. If you need more than 3 outputs on an IO you'll have to insert the `Duplicate` nodes explicitly in the graph. + +In the generated code, you'll see the `Duplicate` nodes. 
For instance, in this example: + +```C++ +Duplicate3 dup0(fifo2,fifo3,fifo4,fifo5); +``` + diff --git a/ComputeGraph/examples/example8/appnodes.py b/ComputeGraph/examples/example8/appnodes.py index 67125505..992a326e 100644 --- a/ComputeGraph/examples/example8/appnodes.py +++ b/ComputeGraph/examples/example8/appnodes.py @@ -1,15 +1,13 @@ ########################################### # Project: CMSIS DSP Library # Title: appnodes.py -# Description: Application nodes for Example 4 +# Description: Application nodes for Example 8 # -# $Date: 29 July 2021 -# $Revision: V1.10.0 # # Target Processor: Cortex-M and Cortex-A cores # -------------------------------------------------------------------- */ # -# Copyright (C) 2010-2021 ARM Limited or its affiliates. All rights reserved. +# Copyright (C) 2021-2023 ARM Limited or its affiliates. All rights reserved. # # SPDX-License-Identifier: Apache-2.0 # diff --git a/ComputeGraph/examples/example8/docassets/graph8.png b/ComputeGraph/examples/example8/docassets/graph8.png new file mode 100644 index 00000000..db3dd4cd Binary files /dev/null and b/ComputeGraph/examples/example8/docassets/graph8.png differ diff --git a/ComputeGraph/examples/example9/AppNodes.h b/ComputeGraph/examples/example9/AppNodes.h index b90c1a76..50091812 100644 --- a/ComputeGraph/examples/example9/AppNodes.h +++ b/ComputeGraph/examples/example9/AppNodes.h @@ -1,15 +1,12 @@ /* ---------------------------------------------------------------------- * Project: CMSIS DSP Library * Title: AppNodes.h - * Description: Application nodes for Example 1 - * - * $Date: 29 July 2021 - * $Revision: V1.10.0 + * Description: Application nodes for Example 9 * * Target Processor: Cortex-M and Cortex-A cores - * -------------------------------------------------------------------- */ -/* - * Copyright (C) 2010-2021 ARM Limited or its affiliates. All rights reserved. 
+ * -------------------------------------------------------------------- +* + * Copyright (C) 2021-2023 ARM Limited or its affiliates. All rights reserved. * * SPDX-License-Identifier: Apache-2.0 * diff --git a/ComputeGraph/examples/example9/CMakeLists.txt b/ComputeGraph/examples/example9/CMakeLists.txt index 85d8e90b..25d61df2 100644 --- a/ComputeGraph/examples/example9/CMakeLists.txt +++ b/ComputeGraph/examples/example9/CMakeLists.txt @@ -6,7 +6,7 @@ project(Example9) add_executable(example9 main.cpp) -sdf(example9) +sdf(example9 graph.py test) add_sdf_dir(example9) target_include_directories(example9 PRIVATE ${CMAKE_CURRENT_SOURCE_DIR}) diff --git a/ComputeGraph/examples/example9/README.md b/ComputeGraph/examples/example9/README.md new file mode 100644 index 00000000..0e9ae908 --- /dev/null +++ b/ComputeGraph/examples/example9/README.md @@ -0,0 +1,7 @@ +# Example 9 + +This example is just checking that duplicate node insertion and delay on a connection are working well together. + +The Python script is able to schedule the graph. + +![graph9](docassets/graph9.png) \ No newline at end of file diff --git a/ComputeGraph/examples/example9/docassets/graph9.png b/ComputeGraph/examples/example9/docassets/graph9.png new file mode 100644 index 00000000..56f92fb0 Binary files /dev/null and b/ComputeGraph/examples/example9/docassets/graph9.png differ diff --git a/ComputeGraph/examples/simple/AppNodes.h b/ComputeGraph/examples/simple/AppNodes.h new file mode 100644 index 00000000..7edc0181 --- /dev/null +++ b/ComputeGraph/examples/simple/AppNodes.h @@ -0,0 +1,125 @@ +/* ---------------------------------------------------------------------- + * Project: CMSIS DSP Library + * Title: AppNodes.h + * Description: Application nodes for Example simple + * + * Target Processor: Cortex-M and Cortex-A cores + * -------------------------------------------------------------------- +* + * Copyright (C) 2021-2023 ARM Limited or its affiliates. All rights reserved. 
+ * + * SPDX-License-Identifier: Apache-2.0 + * + * Licensed under the Apache License, Version 2.0 (the License); you may + * not use this file except in compliance with the License. + * You may obtain a copy of the License at + * + * www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an AS IS BASIS, WITHOUT + * WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ +#ifndef _APPNODES_H_ +#define _APPNODES_H_ + +#include + +template +class Sink: public GenericSink +{ +public: + Sink(FIFOBase &src):GenericSink(src){}; + + int prepareForRunning() final + { + if (this->willUnderflow()) + { + return(CG_SKIP_EXECUTION_ID_CODE); // Skip execution + } + + return(0); + }; + + int run() final + { + IN *b=this->getReadBuffer(); + printf("Sink\n"); + for(int i=0;i +class Source: public GenericSource +{ +public: + Source(FIFOBase &dst):GenericSource(dst){}; + + int prepareForRunning() final + { + if (this->willOverflow()) + { + return(CG_SKIP_EXECUTION_ID_CODE); // Skip execution + } + + return(0); + }; + + int run() final{ + OUT *b=this->getWriteBuffer(); + + printf("Source\n"); + for(int i=0;i +class ProcessingNode; + + +template +class ProcessingNode: + public GenericNode +{ +public: + ProcessingNode(FIFOBase &src, + FIFOBase &dst):GenericNode(src,dst){}; + + int prepareForRunning() final + { + if (this->willOverflow() || + this->willUnderflow()) + { + return(CG_SKIP_EXECUTION_ID_CODE); // Skip execution + } + + return(0); + }; + + int run() final{ + printf("ProcessingNode\n"); + IN *a=this->getReadBuffer(); + IN *b=this->getWriteBuffer(); + for(int i=0;i +class Source; +``` + +The previous line is defining a new class template with two arguments: + +* A datatype `OUT` +* The number of samples `outputSize` + +This template can be used to 
implement different kinds of `Source` classes : with different datatypes or number of samples. We can also (when it makes sense) define a `Source` implementation that can work with any datatype and any number of samples. + +You don't need to be knowledgeable in C++ templates to start using them in the context of the compute graph. They are just here to define the plumbing. + +The only thing to understand is that: + +* `Source<X,Y>` is the datatype where the template arguments have been replaced by the types `X` and `Y`. +* `Source<X,Y>` is a different datatype than `Source<X',Y>` if `X` and `X'` are for instance different types +* `X` and `Y` may be numbers (so a number is considered as a type in this context) + +When you have declared a C++ template, you need to implement it. There are two ways to do it: + +* You can define a generic implementation for `Source` +* And/or you can define specialized implementations for specific types (`Source<X,Y>`). + +For the `Source` we have defined a generic implementation so we need (like in Python case) to inherit from `GenericSource`: + +```C++ +template<typename OUT,int outputSize> +class Source: public GenericSource<OUT,outputSize> +``` + +Then, like in the Python case, we need to define a constructor. But contrary to the Python case, here we are defining an implementation. The constructor is not defining the IOs. The IOs are coming from the `GenericSource` template and its arguments. + +```C++ +public: + Source(FIFOBase<OUT> &dst):GenericSource<OUT,outputSize>(dst){}; +``` + +Our `Source` has only one IO : the output. It needs the FIFO for this output. The first argument, `dst`, of the `Source` constructor is the FIFO. This FIFO is coming from the scheduler. + +We also need to initialize the `GenericSource` parent since we are inheriting from it. `GenericSource` constructor is called with the `FIFO` argument `dst`. 
+ +The constructor is here doing nothing more than initializing the parent and the implementation is empty `{}` + +The implementation of `Source` needs to provide an entry point to be usable from the scheduler. It is the `run` function. As said before, since the algorithm is very simple it has been implemented in `run`. In general, `run` is just calling an external function with the buffers coming from the FIFOs. + +```C++ +int run() final { + OUT *b=this->getWriteBuffer(); + + printf("Source\n"); + for(int i=0;igetWriteBuffer(); +``` + +We get a pointer to be able to write in the output FIFO. This pointer has the datatype OUT coming from the template so can be anything. + +**Those functions (`getWriteBuffer` and/or `getReadBuffer`) must always be used even if the node is doing nothing because FIFOs are only updated when those functions are used.** + +So for each IO, the corresponding function must be called even if nothing is read or written on this IO. Of course, in a synchronous mode it would not make sense to do nothing with an IO. But, sometimes, for debug, it can be interesting to have nodes like a `NullSink` that would just consume everything but do nothing. + +The code in the loop is casting an `int` (the loop index) into the `OUT` datatype. If it is not possible it won't typecheck and build. + +```C++ +for(int i=0;i +class ProcessingNode; +``` + +In this example we have decided to implement only a specific version of the processing node. We want to enforce the constraint that the output datatype must be equal to the input datatype and that the number of sample produced must be equal to the number of sample consumed. If it is not the case, it won't type check and the solution won't build. + +Remember from the Python definition that this constraint has not been enforced in the Python description of the processing node. + +Here is how we implement a specialized version of the template. + +First we define the arguments of the template. It is no more generic. 
We have to give all the arguments:
+
+```C++
+class ProcessingNode<IN,inputOutputSize,IN,inputOutputSize>
+```
+
+This enforces that the `OUT` datatype is equal to the `IN` datatype since `IN` is used in both arguments.
+
+It also enforces that the input and output sizes are the same since `inputOutputSize` is used in the two arguments for the size.
+
+Since the arguments of the template are still not fully specified and there is some remaining degree of freedom, we need to continue to define some template parameters:
+
+```C++
+template<typename IN, int inputOutputSize>
+class ProcessingNode<IN,inputOutputSize,IN,inputOutputSize>
+```
+
+And finally, like before, we inherit from `GenericNode` using the same template arguments:
+
+```C++
+template<typename IN, int inputOutputSize>
+class ProcessingNode<IN,inputOutputSize,IN,inputOutputSize>:
+      public GenericNode<IN,inputOutputSize,IN,inputOutputSize>
+```
+
+To be compared with the generic implementation:
+
+```C++
+template<typename IN, int inputSize, typename OUT, int outputSize>
+class ProcessingNode:
+      public GenericNode<IN,inputSize,OUT,outputSize>
+```
+
+In a generic implementation, we do not use `<>` after `ProcessingNode` since we do not specify specific values of the template arguments.
+
+It is possible to have several specializations of the same class.
+
+One could also have another specialization like:
+
+```C++
+template<int inputOutputSize>
+class ProcessingNode<q15_t,inputOutputSize,q15_t,inputOutputSize>:
+      public GenericNode<q15_t,inputOutputSize,q15_t,inputOutputSize>
+```
+
+This one works only with the `q15_t` datatype.
+
+The `run` function of the processing node has access to `getReadBuffer` and `getWriteBuffer` to access the FIFO buffers.
+
+### The C++ wrapper for the Sink
+
+The definition of the `Sink` should be clear now:
+
+```C++
+template<typename IN, int inputSize>
+class Sink: public GenericSink<IN, inputSize>
+{
+public:
+    Sink(FIFOBase<IN> &src):GenericSink<IN,inputSize>(src){};
+```
+
+## How to call the C++ scheduler
+
+The API to the scheduler is:
+
+```C
+extern uint32_t scheduler(int *error);
+```
+
+It is a C API that can be used from C code.
+
+In case of error, the function is returning:
+
+* the number of schedule iterations computed since the beginning (as the return value)
+* an error code (written through the `error` pointer).
+
+It is possible, from the Python script, to add arguments to this API when there is the need to pass additional information to the nodes.
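The calling convention described above can be sketched as follows. This is an illustration under assumptions, not the generated code: the `scheduler` function here is a stub standing in for the one from `generated/scheduler.cpp` so that the sketch builds on its own, and `runGraph` is a hypothetical helper name. The sketch assumes, as in the generated scheduler loop, that an error code of `0` means no error.

```cpp
#include <cstdint>
#include <cstdio>

// Stub standing in for the generated scheduler (assumption for this
// sketch). In a real project, include "scheduler.h" and link with
// generated/scheduler.cpp instead of defining this function.
extern "C" uint32_t scheduler(int *error)
{
    *error = 0;  // 0 is treated as "no error" in this sketch
    return 1;    // pretend that one schedule iteration was executed
}

// Hypothetical helper: run the graph and report what happened.
// Returns true when the scheduler finished without error.
bool runGraph(uint32_t &nbSched)
{
    int error = 0;

    // The return value is the number of schedule iterations executed,
    // the error code comes back through the pointer argument.
    nbSched = scheduler(&error);

    if (error != 0)
    {
        printf("Scheduler stopped with error %d after %lu iteration(s)\n",
               error, static_cast<unsigned long>(nbSched));
        return false;
    }

    printf("Scheduler executed %lu iteration(s)\n",
           static_cast<unsigned long>(nbSched));
    return true;
}
```

Checking the error code after the call is a design choice of this sketch: it makes a failing node visible instead of silently returning from `main`.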
+
+## How to build and run the example
+
+There is a very simple `Makefile` in the folder. It is for the `MSVC` compiler on Windows but can be easily adapted. There are only 2 files to compile:
+
+* `generated/scheduler.cpp`
+* `main.cpp`
+
+The directories to use for headers are:
+
+* `generated`
+* `../../cg/src`
+* `.` the current directory
+
+The headers required by the software are:
+
+* `generated/scheduler.h`
+
+  * This is the C API to the scheduler function
+
+* `AppNodes.h`
+
+  * `AppNodes.h` is where the implementation of the nodes is defined. This file could also just include nodes from a standard library.
+
+* `custom.h`
+
+  * This is the first include in `scheduler.cpp` and this file can contain whatever is needed or just be empty
+  * In this example, the datatype `float32_t` is defined in `custom.h` so that we don't need to build CMSIS-DSP for such a simple example
+
+* `GenericNodes.h`
+
+  * It is coming from the `../../cg/src` folder.
+  * It provides the basic definitions needed by the framework like `GenericNode`, `GenericSink`, `GenericSource`, `FIFO` ...
+
+
+### Expected output
+
+There are 7 executions of the `Sink` and `Source` and 5 executions of the `ProcessingNode`.
+ +``` +Start +Source +Source +ProcessingNode +Sink +1 +2 +3 +4 +5 +Source +ProcessingNode +Sink +1 +2 +3 +4 +5 +Source +Source +ProcessingNode +Sink +1 +2 +3 +4 +5 +Sink +1 +2 +3 +4 +5 +Source +ProcessingNode +Sink +1 +2 +3 +4 +5 +Source +ProcessingNode +Sink +1 +2 +3 +4 +5 +Sink +1 +2 +3 +4 +5 +``` + + + + + + + diff --git a/ComputeGraph/examples/simple/create.py b/ComputeGraph/examples/simple/create.py new file mode 100644 index 00000000..a21923b9 --- /dev/null +++ b/ComputeGraph/examples/simple/create.py @@ -0,0 +1,31 @@ +# Include definition of the nodes +from nodes import * +# Include definition of the graph +from graph import * + +# Create a configuration object +conf=Configuration() +# The number of schedule iteration is limited to 1 +# to prevent the scheduling from running forever +# (which should be the case for a stream computation) +conf.debugLimit=1 +# Disable inclusion of CMSIS-DSP headers so that we don't have +# to recompile CMSIS-DSP for such a simple example +conf.CMSISDSP = False + +# Compute a static scheduling of the graph +# The size of FIFO is also computed +scheduling = the_graph.computeSchedule(config=conf) + +# Print some statistics about the compute schedule +# and the memory usage +print("Schedule length = %d" % scheduling.scheduleLength) +print("Memory usage %d bytes" % scheduling.memory) + +# Generate the C++ code for the static scheduler +scheduling.ccode("generated",conf) + +# Generate a graphviz representation of the graph +with open("simple.dot","w") as f: + scheduling.graphviz(f) + diff --git a/ComputeGraph/examples/simple/custom.h b/ComputeGraph/examples/simple/custom.h new file mode 100644 index 00000000..41b6c6ee --- /dev/null +++ b/ComputeGraph/examples/simple/custom.h @@ -0,0 +1,5 @@ +#ifndef _CUSTOM_H_ + +typedef float float32_t; + +#endif \ No newline at end of file diff --git a/ComputeGraph/examples/simple/docassets/simple.png b/ComputeGraph/examples/simple/docassets/simple.png new file mode 100644 index 
00000000..eaa33fb4 Binary files /dev/null and b/ComputeGraph/examples/simple/docassets/simple.png differ diff --git a/ComputeGraph/examples/simple/generated/scheduler.cpp b/ComputeGraph/examples/simple/generated/scheduler.cpp new file mode 100644 index 00000000..139e2f96 --- /dev/null +++ b/ComputeGraph/examples/simple/generated/scheduler.cpp @@ -0,0 +1,170 @@ +/* + +Generated with CMSIS-DSP Compute Graph Scripts. +The generated code is not covered by CMSIS-DSP license. + +The support classes and code is covered by CMSIS-DSP license. + +*/ + + +#include "custom.h" +#include "GenericNodes.h" +#include "AppNodes.h" +#include "scheduler.h" + +#if !defined(CHECKERROR) +#define CHECKERROR if (cgStaticError < 0) \ + {\ + goto errorHandling;\ + } + +#endif + +#if !defined(CG_BEFORE_ITERATION) +#define CG_BEFORE_ITERATION +#endif + +#if !defined(CG_AFTER_ITERATION) +#define CG_AFTER_ITERATION +#endif + +#if !defined(CG_BEFORE_SCHEDULE) +#define CG_BEFORE_SCHEDULE +#endif + +#if !defined(CG_AFTER_SCHEDULE) +#define CG_AFTER_SCHEDULE +#endif + +#if !defined(CG_BEFORE_BUFFER) +#define CG_BEFORE_BUFFER +#endif + +#if !defined(CG_BEFORE_FIFO_BUFFERS) +#define CG_BEFORE_FIFO_BUFFERS +#endif + +#if !defined(CG_BEFORE_FIFO_INIT) +#define CG_BEFORE_FIFO_INIT +#endif + +#if !defined(CG_BEFORE_NODE_INIT) +#define CG_BEFORE_NODE_INIT +#endif + +#if !defined(CG_AFTER_INCLUDES) +#define CG_AFTER_INCLUDES +#endif + +#if !defined(CG_BEFORE_SCHEDULER_FUNCTION) +#define CG_BEFORE_SCHEDULER_FUNCTION +#endif + +#if !defined(CG_BEFORE_NODE_EXECUTION) +#define CG_BEFORE_NODE_EXECUTION +#endif + +#if !defined(CG_AFTER_NODE_EXECUTION) +#define CG_AFTER_NODE_EXECUTION +#endif + +CG_AFTER_INCLUDES + + +/* + +Description of the scheduling. 
+ +*/ +static unsigned int schedule[19]= +{ +2,2,0,1,2,0,1,2,2,0,1,1,2,0,1,2,0,1,1, +}; + +CG_BEFORE_FIFO_BUFFERS +/*********** + +FIFO buffers + +************/ +#define FIFOSIZE0 11 +#define FIFOSIZE1 11 + +#define BUFFERSIZE1 11 +CG_BEFORE_BUFFER +float32_t buf1[BUFFERSIZE1]={0}; + +#define BUFFERSIZE2 11 +CG_BEFORE_BUFFER +float32_t buf2[BUFFERSIZE2]={0}; + + +CG_BEFORE_SCHEDULER_FUNCTION +uint32_t scheduler(int *error) +{ + int cgStaticError=0; + uint32_t nbSchedule=0; + int32_t debugCounter=1; + + CG_BEFORE_FIFO_INIT; + /* + Create FIFOs objects + */ + FIFO fifo0(buf1); + FIFO fifo1(buf2); + + CG_BEFORE_NODE_INIT; + /* + Create node objects + */ + ProcessingNode processing(fifo0,fifo1); + Sink sink(fifo1); + Source source(fifo0); + + /* Run several schedule iterations */ + CG_BEFORE_SCHEDULE; + while((cgStaticError==0) && (debugCounter > 0)) + { + /* Run a schedule iteration */ + CG_BEFORE_ITERATION; + for(unsigned long id=0 ; id < 19; id++) + { + CG_BEFORE_NODE_EXECUTION; + + switch(schedule[id]) + { + case 0: + { + cgStaticError = processing.run(); + } + break; + + case 1: + { + cgStaticError = sink.run(); + } + break; + + case 2: + { + cgStaticError = source.run(); + } + break; + + default: + break; + } + CG_AFTER_NODE_EXECUTION; + CHECKERROR; + } + debugCounter--; + CG_AFTER_ITERATION; + nbSchedule++; + } + +errorHandling: + CG_AFTER_SCHEDULE; + *error=cgStaticError; + return(nbSchedule); +} diff --git a/ComputeGraph/examples/simple/generated/scheduler.h b/ComputeGraph/examples/simple/generated/scheduler.h new file mode 100644 index 00000000..d98d9e63 --- /dev/null +++ b/ComputeGraph/examples/simple/generated/scheduler.h @@ -0,0 +1,26 @@ +/* + +Generated with CMSIS-DSP Compute Graph Scripts. +The generated code is not covered by CMSIS-DSP license. + +The support classes and code is covered by CMSIS-DSP license. 
+ +*/ + +#ifndef _SCHEDULER_H_ +#define _SCHEDULER_H_ + +#ifdef __cplusplus +extern "C" +{ +#endif + + +extern uint32_t scheduler(int *error); + +#ifdef __cplusplus +} +#endif + +#endif + diff --git a/ComputeGraph/examples/simple/graph.py b/ComputeGraph/examples/simple/graph.py new file mode 100644 index 00000000..ad116e11 --- /dev/null +++ b/ComputeGraph/examples/simple/graph.py @@ -0,0 +1,39 @@ +# Include definitions from the Python package to +# define datatype for the IOs and to have access to the +# Graph class +from cmsisdsp.cg.scheduler import * +# Include definition of the nodes +from nodes import * + +# Define the datatype we are using for all the IOs in this +# example +floatType=CType(F32) + +# Instantiate a Source node with a float datatype and +# working with packet of 5 samples (each execution of the +# source in the C code will generate 5 samples) +# "source" is the name of the C variable that will identify +# this node +src=Source("source",floatType,5) +# Instantiate a Processing node using a float data type for +# both the input and output. 
The number of samples consumed +# on the input and produced on the output is 7 each time +# the node is executed in the C code +# "processing" is the name of the C variable that will identify +# this node +processing=ProcessingNode("processing",floatType,7,7) +# Instantiate a Sink node with a float datatype and consuming +# 5 samples each time the node is executed in the C code +# "sink" is the name of the C variable that will identify +# this node +sink=Sink("sink",floatType,5) + +# Create a Graph object +the_graph = Graph() + +# Connect the source to the processing node +the_graph.connect(src.o,processing.i) +# Connect the processing node to the sink +the_graph.connect(processing.o,sink.i) + + diff --git a/ComputeGraph/examples/simple/main.cpp b/ComputeGraph/examples/simple/main.cpp new file mode 100644 index 00000000..a1fd4028 --- /dev/null +++ b/ComputeGraph/examples/simple/main.cpp @@ -0,0 +1,11 @@ +#include +#include +#include "scheduler.h" + +int main(int argc, char const *argv[]) +{ + int error; + printf("Start\n"); + uint32_t nbSched=scheduler(&error); + return 0; +} \ No newline at end of file diff --git a/ComputeGraph/examples/simple/main.obj b/ComputeGraph/examples/simple/main.obj new file mode 100644 index 00000000..05de417d Binary files /dev/null and b/ComputeGraph/examples/simple/main.obj differ diff --git a/ComputeGraph/examples/simple/nodes.py b/ComputeGraph/examples/simple/nodes.py new file mode 100644 index 00000000..6f5a5987 --- /dev/null +++ b/ComputeGraph/examples/simple/nodes.py @@ -0,0 +1,77 @@ +# Include definitions from the Python package +from cmsisdsp.cg.scheduler import GenericNode,GenericSink,GenericSource + +### Define new types of Nodes + +class ProcessingNode(GenericNode): + """ + Definition of a ProcessingNode for the graph + + Parameters + ---------- + name : str + Name of the C variable identifying this node + in the C code + theType : CGStaticType + The datatype for the input and output + inLength : int + The number of samples 
consumed by input + outLength : int + The number of samples produced on output + """ + def __init__(self,name,theType,inLength,outLength): + GenericNode.__init__(self,name) + self.addInput("i",theType,inLength) + self.addOutput("o",theType,outLength) + + @property + def typeName(self): + """The name of the C++ class implementing this node""" + return "ProcessingNode" + +class Sink(GenericSink): + """ + Definition of a Sink node for the graph + + Parameters + ---------- + name : str + Name of the C variable identifying this node + in the C code + theType : CGStaticType + The datatype for the input + inLength : int + The number of samples consumed by input + """ + def __init__(self,name,theType,inLength): + GenericSink.__init__(self,name) + self.addInput("i",theType,inLength) + + @property + def typeName(self): + """The name of the C++ class implementing this node""" + return "Sink" + +class Source(GenericSource): + """ + Definition of a Source node for the graph + + Parameters + ---------- + name : str + Name of the C variable identifying this node + in the C code + theType : CGStaticType + The datatype for the output + outLength : int + The number of samples produced on output + """ + def __init__(self,name,theType,outLength): + GenericSource.__init__(self,name) + self.addOutput("o",theType,outLength) + + @property + def typeName(self): + """The name of the C++ class implementing this node""" + return "Source" + diff --git a/ComputeGraph/examples/simple/scheduler.obj b/ComputeGraph/examples/simple/scheduler.obj new file mode 100644 index 00000000..1b429716 Binary files /dev/null and b/ComputeGraph/examples/simple/scheduler.obj differ diff --git a/ComputeGraph/examples/simple/simple.dot b/ComputeGraph/examples/simple/simple.dot new file mode 100644 index 00000000..031c4181 --- /dev/null +++ b/ComputeGraph/examples/simple/simple.dot @@ -0,0 +1,48 @@ + + + + +digraph structs { + node [shape=plaintext] + rankdir=LR + edge [arrowsize=0.5] + fontname="times" + + 
+processing [label=< + + + + +
processing
(ProcessingNode)
>]; + +sink [label=< + + + + +
sink
(Sink)
>]; + +source [label=< + + + + +
source
(Source)
>]; + + + +source:i -> processing:i [label="f32(11)" +,headlabel=<
7 +
> +,taillabel=<
5 +
>] + +processing:i -> sink:i [label="f32(11)" +,headlabel=<
5 +
> +,taillabel=<
7 +
>] + + +} diff --git a/ComputeGraph/examples/simple/simple.exe b/ComputeGraph/examples/simple/simple.exe new file mode 100644 index 00000000..e4712075 Binary files /dev/null and b/ComputeGraph/examples/simple/simple.exe differ diff --git a/ComputeGraph/examples/simple/simple.ilk b/ComputeGraph/examples/simple/simple.ilk new file mode 100644 index 00000000..23f7f194 Binary files /dev/null and b/ComputeGraph/examples/simple/simple.ilk differ diff --git a/ComputeGraph/examples/simple/simple.pdb b/ComputeGraph/examples/simple/simple.pdb new file mode 100644 index 00000000..5bc7a223 Binary files /dev/null and b/ComputeGraph/examples/simple/simple.pdb differ diff --git a/ComputeGraph/examples/simple/simple.pdf b/ComputeGraph/examples/simple/simple.pdf new file mode 100644 index 00000000..19c89724 Binary files /dev/null and b/ComputeGraph/examples/simple/simple.pdf differ diff --git a/ComputeGraph/examples/simple/simple.png b/ComputeGraph/examples/simple/simple.png new file mode 100644 index 00000000..eaa33fb4 Binary files /dev/null and b/ComputeGraph/examples/simple/simple.png differ diff --git a/ComputeGraph/examples/simple/vc140.pdb b/ComputeGraph/examples/simple/vc140.pdb new file mode 100644 index 00000000..142c4bf9 Binary files /dev/null and b/ComputeGraph/examples/simple/vc140.pdb differ diff --git a/ComputeGraph/examples/simpledsp/AppNodes.h b/ComputeGraph/examples/simpledsp/AppNodes.h new file mode 100644 index 00000000..728310d3 --- /dev/null +++ b/ComputeGraph/examples/simpledsp/AppNodes.h @@ -0,0 +1,125 @@ +/* ---------------------------------------------------------------------- + * Project: CMSIS DSP Library + * Title: AppNodes.h + * Description: Application nodes for Example simpledsp + * + * Target Processor: Cortex-M and Cortex-A cores + * -------------------------------------------------------------------- + * + * Copyright (C) 2021-2023 ARM Limited or its affiliates. All rights reserved. 
+ * + * SPDX-License-Identifier: Apache-2.0 + * + * Licensed under the Apache License, Version 2.0 (the License); you may + * not use this file except in compliance with the License. + * You may obtain a copy of the License at + * + * www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an AS IS BASIS, WITHOUT + * WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ +#ifndef _APPNODES_H_ +#define _APPNODES_H_ + +#include + +template +class Sink: public GenericSink +{ +public: + Sink(FIFOBase &src):GenericSink(src){}; + + int prepareForRunning() final + { + if (this->willUnderflow()) + { + return(CG_SKIP_EXECUTION_ID_CODE); // Skip execution + } + + return(0); + }; + + int run() final + { + IN *b=this->getReadBuffer(); + printf("Sink\n"); + for(int i=0;i +class Source: public GenericSource +{ +public: + Source(FIFOBase &dst):GenericSource(dst){}; + + int prepareForRunning() final + { + if (this->willOverflow()) + { + return(CG_SKIP_EXECUTION_ID_CODE); // Skip execution + } + + return(0); + }; + + int run() final{ + OUT *b=this->getWriteBuffer(); + + printf("Source\n"); + for(int i=0;i +class ProcessingNode; + + +template +class ProcessingNode: + public GenericNode +{ +public: + ProcessingNode(FIFOBase &src, + FIFOBase &dst):GenericNode(src,dst){}; + + int prepareForRunning() final + { + if (this->willOverflow() || + this->willUnderflow()) + { + return(CG_SKIP_EXECUTION_ID_CODE); // Skip execution + } + + return(0); + }; + + int run() final{ + printf("ProcessingNode\n"); + IN *a=this->getReadBuffer(); + IN *b=this->getWriteBuffer(); + for(int i=0;i fifo0(buf1); + FIFO fifo1(buf2); + + CG_BEFORE_NODE_INIT; + /* + Create node objects + */ + Sink sink(fifo1); + Source source(fifo0); + + /* Run several schedule iterations */ + 
CG_BEFORE_SCHEDULE; + while((cgStaticError==0) && (debugCounter > 0)) + { + /* Run a schedule iteration */ + CG_BEFORE_ITERATION; + for(unsigned long id=0 ; id < 19; id++) + { + CG_BEFORE_NODE_EXECUTION; + + switch(schedule[id]) + { + case 0: + { + + { + + float32_t* i0; + float32_t* o2; + i0=fifo0.getReadBuffer(7); + o2=fifo1.getWriteBuffer(7); + arm_offset_f32(i0,OFFSET_VALUE,o2,7); + cgStaticError = 0; + } + } + break; + + case 1: + { + cgStaticError = sink.run(); + } + break; + + case 2: + { + cgStaticError = source.run(); + } + break; + + default: + break; + } + CG_AFTER_NODE_EXECUTION; + CHECKERROR; + } + debugCounter--; + CG_AFTER_ITERATION; + nbSchedule++; + } + +errorHandling: + CG_AFTER_SCHEDULE; + *error=cgStaticError; + return(nbSchedule); +} diff --git a/ComputeGraph/examples/simpledsp/generated/scheduler.h b/ComputeGraph/examples/simpledsp/generated/scheduler.h new file mode 100644 index 00000000..d98d9e63 --- /dev/null +++ b/ComputeGraph/examples/simpledsp/generated/scheduler.h @@ -0,0 +1,26 @@ +/* + +Generated with CMSIS-DSP Compute Graph Scripts. +The generated code is not covered by CMSIS-DSP license. + +The support classes and code is covered by CMSIS-DSP license. 
+ +*/ + +#ifndef _SCHEDULER_H_ +#define _SCHEDULER_H_ + +#ifdef __cplusplus +extern "C" +{ +#endif + + +extern uint32_t scheduler(int *error); + +#ifdef __cplusplus +} +#endif + +#endif + diff --git a/ComputeGraph/examples/simpledsp/graph.py b/ComputeGraph/examples/simpledsp/graph.py new file mode 100644 index 00000000..e6e9c5dc --- /dev/null +++ b/ComputeGraph/examples/simpledsp/graph.py @@ -0,0 +1,41 @@ +# Include definitions from the Python package to +# define datatype for the IOs and to have access to the +# Graph class +from cmsisdsp.cg.scheduler import * +# Include definition of the nodes +from nodes import * + +# Define the datatype we are using for all the IOs in this +# example +floatType=CType(F32) + +# Instantiate a Source node with a float datatype and +# working with packet of 5 samples (each execution of the +# source in the C code will generate 5 samples) +# "source" is the name of the C variable that will identify +# this node +src=Source("source",floatType,5) +# Instantiate a Processing node using a float data type for +# both the input and output. 
The number of samples consumed +# on the input and produced on the output is 7 each time +# the node is executed in the C code +# "processing" is the name of the C variable that will identify +# this node +processing=Binary("arm_offset_f32",floatType,7) +offsetValue=Constant("OFFSET_VALUE") +# Instantiate a Sink node with a float datatype and consuming +# 5 samples each time the node is executed in the C code +# "sink" is the name of the C variable that will identify +# this node +sink=Sink("sink",floatType,5) + +# Create a Graph object +the_graph = Graph() + +# Connect the source to the processing node +the_graph.connect(src.o,processing.ia) +the_graph.connect(offsetValue,processing.ib) +# Connect the processing node to the sink +the_graph.connect(processing.o,sink.i) + + diff --git a/ComputeGraph/examples/simpledsp/main.cpp b/ComputeGraph/examples/simpledsp/main.cpp new file mode 100644 index 00000000..a1fd4028 --- /dev/null +++ b/ComputeGraph/examples/simpledsp/main.cpp @@ -0,0 +1,11 @@ +#include +#include +#include "scheduler.h" + +int main(int argc, char const *argv[]) +{ + int error; + printf("Start\n"); + uint32_t nbSched=scheduler(&error); + return 0; +} \ No newline at end of file diff --git a/ComputeGraph/examples/simpledsp/nodes.py b/ComputeGraph/examples/simpledsp/nodes.py new file mode 100644 index 00000000..bb82c160 --- /dev/null +++ b/ComputeGraph/examples/simpledsp/nodes.py @@ -0,0 +1,49 @@ +# Include definitions from the Python package +from cmsisdsp.cg.scheduler import GenericNode,GenericSink,GenericSource + +class Sink(GenericSink): + """ + Definition of a Sink node for the graph + + Parameters + ---------- + name : str + Name of the C variable identifying this node + in the C code + theType : CGStaticType + The datatype for the input + inLength : int + The number of samples consumed by input + """ + def __init__(self,name,theType,inLength): + GenericSink.__init__(self,name) + self.addInput("i",theType,inLength) + + @property + def 
typeName(self): + """The name of the C++ class implementing this node""" + return "Sink" + +class Source(GenericSource): + """ + Definition of a Source node for the graph + + Parameters + ---------- + name : str + Name of the C variable identifying this node + in the C code + theType : CGStaticType + The datatype for the output + outLength : int + The number of samples produced on output + """ + def __init__(self,name,theType,outLength): + GenericSource.__init__(self,name) + self.addOutput("o",theType,outLength) + + @property + def typeName(self): + """The name of the C++ class implementing this node""" + return "Source" + diff --git a/ComputeGraph/examples/simpledsp/simpledsp.dot b/ComputeGraph/examples/simpledsp/simpledsp.dot new file mode 100644 index 00000000..a268450a --- /dev/null +++ b/ComputeGraph/examples/simpledsp/simpledsp.dot @@ -0,0 +1,65 @@ + + + + +digraph structs { + node [shape=plaintext] + rankdir=LR + edge [arrowsize=0.5] + fontname="times" + + + +arm_offset_f321 [label=< + + + + + + + + + + + + +
iaarm_offset_f32
(Function)
o
ib
>]; + +sink [label=< + + + + +
sink
(Sink)
>]; + +source [label=< + + + + +
source
(Source)
>]; + + + +source:i -> arm_offset_f321:ia [label="f32(11)" +,headlabel=<
7 +
> +,taillabel=<
5 +
>] + +arm_offset_f321:o -> sink:i [label="f32(11)" +,headlabel=<
5 +
> +,taillabel=<
7 +
>] + +OFFSET_VALUE [label=< + + + + +
OFFSET_VALUE
>]; + +OFFSET_VALUE:i -> arm_offset_f321:ib + +} diff --git a/ComputeGraph/examples/simpledsp/simpledsp.pdf b/ComputeGraph/examples/simpledsp/simpledsp.pdf new file mode 100644 index 00000000..1a519ed2 Binary files /dev/null and b/ComputeGraph/examples/simpledsp/simpledsp.pdf differ diff --git a/cmsisdsp/cg/nodes/CFFT.py b/cmsisdsp/cg/nodes/CFFT.py index fc710b03..31d94e6b 100644 --- a/cmsisdsp/cg/nodes/CFFT.py +++ b/cmsisdsp/cg/nodes/CFFT.py @@ -3,13 +3,11 @@ # Title: CFFTF.py # Description: Node for CMSIS-DSP cfft # -# $Date: 30 July 2021 -# $Revision: V1.10.0 # # Target Processor: Cortex-M and Cortex-A cores # -------------------------------------------------------------------- */ # -# Copyright (C) 2010-2023 ARM Limited or its affiliates. All rights reserved. +# Copyright (C) 2021-2023 ARM Limited or its affiliates. All rights reserved. # # SPDX-License-Identifier: Apache-2.0 # diff --git a/cmsisdsp/cg/nodes/Duplicate.py b/cmsisdsp/cg/nodes/Duplicate.py index aa977999..063bd4ef 100644 --- a/cmsisdsp/cg/nodes/Duplicate.py +++ b/cmsisdsp/cg/nodes/Duplicate.py @@ -3,12 +3,11 @@ # Title: Duplicate.py # Description: Duplicate nodes # -# $Date: 08 September 2022 # # Target Processor: Cortex-M and Cortex-A cores # -------------------------------------------------------------------- */ # -# Copyright (C) 2010-2023 ARM Limited or its affiliates. All rights reserved. +# Copyright (C) 2021-2023 ARM Limited or its affiliates. All rights reserved. # # SPDX-License-Identifier: Apache-2.0 # diff --git a/cmsisdsp/cg/nodes/ICFFT.py b/cmsisdsp/cg/nodes/ICFFT.py index 575b1b22..95b243bc 100644 --- a/cmsisdsp/cg/nodes/ICFFT.py +++ b/cmsisdsp/cg/nodes/ICFFT.py @@ -3,13 +3,11 @@ # Title: ICFFT.py # Description: Node for CMSIS-DSP icfft f32 # -# $Date: 30 July 2021 -# $Revision: V1.10.0 # # Target Processor: Cortex-M and Cortex-A cores # -------------------------------------------------------------------- */ # -# Copyright (C) 2010-2023 ARM Limited or its affiliates. 
All rights reserved. +# Copyright (C) 2021-2023 ARM Limited or its affiliates. All rights reserved. # # SPDX-License-Identifier: Apache-2.0 # diff --git a/cmsisdsp/cg/nodes/InterleavedStereoToMono.py b/cmsisdsp/cg/nodes/InterleavedStereoToMono.py index 10e9d126..14d5f6b6 100644 --- a/cmsisdsp/cg/nodes/InterleavedStereoToMono.py +++ b/cmsisdsp/cg/nodes/InterleavedStereoToMono.py @@ -3,13 +3,11 @@ # Title: InterleavedStereoToMono.py # Description: Interleaved Stereo to mono in Q15 # -# $Date: 06 August 2021 -# $Revision: V1.10.0 # # Target Processor: Cortex-M and Cortex-A cores # -------------------------------------------------------------------- */ # -# Copyright (C) 2010-2023 ARM Limited or its affiliates. All rights reserved. +# Copyright (C) 2021-2023 ARM Limited or its affiliates. All rights reserved. # # SPDX-License-Identifier: Apache-2.0 # diff --git a/cmsisdsp/cg/nodes/MFCC.py b/cmsisdsp/cg/nodes/MFCC.py index e90f2345..e424835b 100644 --- a/cmsisdsp/cg/nodes/MFCC.py +++ b/cmsisdsp/cg/nodes/MFCC.py @@ -3,13 +3,11 @@ # Title: MFCC.py # Description: Node for CMSIS-DSP MFCC # -# $Date: 30 July 2021 -# $Revision: V1.10.0 # # Target Processor: Cortex-M and Cortex-A cores # -------------------------------------------------------------------- */ # -# Copyright (C) 2010-2023 ARM Limited or its affiliates. All rights reserved. +# Copyright (C) 2021-2023 ARM Limited or its affiliates. All rights reserved. # # SPDX-License-Identifier: Apache-2.0 # diff --git a/cmsisdsp/cg/nodes/NullSink.py b/cmsisdsp/cg/nodes/NullSink.py index 27133448..d5d23aae 100644 --- a/cmsisdsp/cg/nodes/NullSink.py +++ b/cmsisdsp/cg/nodes/NullSink.py @@ -3,13 +3,11 @@ # Title: NullSink.py # Description: Null sink doing nothing for debug # -# $Date: 06 August 2021 -# $Revision: V1.10.0 # # Target Processor: Cortex-M and Cortex-A cores # -------------------------------------------------------------------- */ # -# Copyright (C) 2010-2023 ARM Limited or its affiliates. All rights reserved. 
+# Copyright (C) 2021-2023 ARM Limited or its affiliates. All rights reserved. # # SPDX-License-Identifier: Apache-2.0 # diff --git a/cmsisdsp/cg/nodes/ToComplex.py b/cmsisdsp/cg/nodes/ToComplex.py index d8d696e1..b4b9e68a 100644 --- a/cmsisdsp/cg/nodes/ToComplex.py +++ b/cmsisdsp/cg/nodes/ToComplex.py @@ -3,13 +3,11 @@ # Title: ToComplex.py # Description: Node to convert real to complex # -# $Date: 30 July 2021 -# $Revision: V1.10.0 # # Target Processor: Cortex-M and Cortex-A cores # -------------------------------------------------------------------- */ # -# Copyright (C) 2010-2023 ARM Limited or its affiliates. All rights reserved. +# Copyright (C) 2021-2023 ARM Limited or its affiliates. All rights reserved. # # SPDX-License-Identifier: Apache-2.0 # diff --git a/cmsisdsp/cg/nodes/ToReal.py b/cmsisdsp/cg/nodes/ToReal.py index 9a65ac61..a83adc70 100644 --- a/cmsisdsp/cg/nodes/ToReal.py +++ b/cmsisdsp/cg/nodes/ToReal.py @@ -3,13 +3,11 @@ # Title: ToReal.py # Description: Node to convert complex to real # -# $Date: 30 July 2021 -# $Revision: V1.10.0 # # Target Processor: Cortex-M and Cortex-A cores # -------------------------------------------------------------------- */ # -# Copyright (C) 2010-2023 ARM Limited or its affiliates. All rights reserved. +# Copyright (C) 2021-2023 ARM Limited or its affiliates. All rights reserved. # # SPDX-License-Identifier: Apache-2.0 # diff --git a/cmsisdsp/cg/nodes/Unzip.py b/cmsisdsp/cg/nodes/Unzip.py index f3f98395..062e8003 100644 --- a/cmsisdsp/cg/nodes/Unzip.py +++ b/cmsisdsp/cg/nodes/Unzip.py @@ -3,13 +3,11 @@ # Title: Unzip.py # Description: Unzip streams # -# $Date: 06 August 2021 -# $Revision: V1.10.0 # # Target Processor: Cortex-M and Cortex-A cores # -------------------------------------------------------------------- */ # -# Copyright (C) 2010-2023 ARM Limited or its affiliates. All rights reserved. +# Copyright (C) 2021-2023 ARM Limited or its affiliates. All rights reserved. 
# # SPDX-License-Identifier: Apache-2.0 # diff --git a/cmsisdsp/cg/nodes/Zip.py b/cmsisdsp/cg/nodes/Zip.py index dedfa023..40e620e6 100644 --- a/cmsisdsp/cg/nodes/Zip.py +++ b/cmsisdsp/cg/nodes/Zip.py @@ -3,13 +3,11 @@ # Title: Zip.py # Description: Zip two streams # -# $Date: 06 August 2021 -# $Revision: V1.10.0 # # Target Processor: Cortex-M and Cortex-A cores # -------------------------------------------------------------------- */ # -# Copyright (C) 2010-2023 ARM Limited or its affiliates. All rights reserved. +# Copyright (C) 2021-2023 ARM Limited or its affiliates. All rights reserved. # # SPDX-License-Identifier: Apache-2.0 # diff --git a/cmsisdsp/cg/nodes/__init__.py b/cmsisdsp/cg/nodes/__init__.py index a6459239..60cbc121 100644 --- a/cmsisdsp/cg/nodes/__init__.py +++ b/cmsisdsp/cg/nodes/__init__.py @@ -3,13 +3,11 @@ # Title: __init__.py # Description: CG default nodes # -# $Date: 30 August 2021 -# $Revision: V1.10.0 # # Target Processor: Cortex-M and Cortex-A cores # -------------------------------------------------------------------- */ # -# Copyright (C) 2010-2023 ARM Limited or its affiliates. All rights reserved. +# Copyright (C) 2021-2023 ARM Limited or its affiliates. All rights reserved. # # SPDX-License-Identifier: Apache-2.0 # diff --git a/cmsisdsp/cg/nodes/host/FileSink.py b/cmsisdsp/cg/nodes/host/FileSink.py index 612fd282..45477eeb 100644 --- a/cmsisdsp/cg/nodes/host/FileSink.py +++ b/cmsisdsp/cg/nodes/host/FileSink.py @@ -3,13 +3,11 @@ # Title: FileSink.py # Description: Node for creating file sinks # -# $Date: 30 July 2021 -# $Revision: V1.10.0 # # Target Processor: Cortex-M and Cortex-A cores # -------------------------------------------------------------------- */ # -# Copyright (C) 2010-2023 ARM Limited or its affiliates. All rights reserved. +# Copyright (C) 2021-2023 ARM Limited or its affiliates. All rights reserved. 
# # SPDX-License-Identifier: Apache-2.0 # diff --git a/cmsisdsp/cg/nodes/host/FileSource.py b/cmsisdsp/cg/nodes/host/FileSource.py index 348684e3..0a1288d0 100644 --- a/cmsisdsp/cg/nodes/host/FileSource.py +++ b/cmsisdsp/cg/nodes/host/FileSource.py @@ -3,13 +3,11 @@ # Title: FileSource.py # Description: Node for creating file source # -# $Date: 30 July 2021 -# $Revision: V1.10.0 # # Target Processor: Cortex-M and Cortex-A cores # -------------------------------------------------------------------- */ # -# Copyright (C) 2010-2023 ARM Limited or its affiliates. All rights reserved. +# Copyright (C) 2021-2023 ARM Limited or its affiliates. All rights reserved. # # SPDX-License-Identifier: Apache-2.0 # diff --git a/cmsisdsp/cg/nodes/host/NumpySink.py b/cmsisdsp/cg/nodes/host/NumpySink.py index c69eacde..e8f1408b 100644 --- a/cmsisdsp/cg/nodes/host/NumpySink.py +++ b/cmsisdsp/cg/nodes/host/NumpySink.py @@ -3,13 +3,11 @@ # Title: NumpySink.py # Description: Sink node for displaying a buffer in scipy # -# $Date: 06 August 2021 -# $Revision: V1.10.0 # # Target Processor: Cortex-M and Cortex-A cores # -------------------------------------------------------------------- */ # -# Copyright (C) 2010-2023 ARM Limited or its affiliates. All rights reserved. +# Copyright (C) 2021-2023 ARM Limited or its affiliates. All rights reserved. # # SPDX-License-Identifier: Apache-2.0 # diff --git a/cmsisdsp/cg/nodes/host/WavSink.py b/cmsisdsp/cg/nodes/host/WavSink.py index 2ba6519c..f0a6df93 100644 --- a/cmsisdsp/cg/nodes/host/WavSink.py +++ b/cmsisdsp/cg/nodes/host/WavSink.py @@ -3,13 +3,11 @@ # Title: WavSink.py # Description: Sink node for creating a wav # -# $Date: 06 August 2021 -# $Revision: V1.10.0 # # Target Processor: Cortex-M and Cortex-A cores # -------------------------------------------------------------------- */ # -# Copyright (C) 2010-2023 ARM Limited or its affiliates. All rights reserved. +# Copyright (C) 2021-2023 ARM Limited or its affiliates. All rights reserved. 
 #
 # SPDX-License-Identifier: Apache-2.0
 #
diff --git a/cmsisdsp/cg/nodes/host/WavSource.py b/cmsisdsp/cg/nodes/host/WavSource.py
index 271bdf3d..70dbe548 100644
--- a/cmsisdsp/cg/nodes/host/WavSource.py
+++ b/cmsisdsp/cg/nodes/host/WavSource.py
@@ -3,13 +3,11 @@
 # Title: WavSource.py
 # Description: Source node for reading wave files
 #
-# $Date: 06 August 2021
-# $Revision: V1.10.0
 #
 # Target Processor: Cortex-M and Cortex-A cores
 # -------------------------------------------------------------------- */
 #
-# Copyright (C) 2010-2021 ARM Limited or its affiliates. All rights reserved.
+# Copyright (C) 2021-2023 ARM Limited or its affiliates. All rights reserved.
 #
 # SPDX-License-Identifier: Apache-2.0
 #
diff --git a/cmsisdsp/cg/nodes/host/message.py b/cmsisdsp/cg/nodes/host/message.py
index f7c77704..317cdc84 100644
--- a/cmsisdsp/cg/nodes/host/message.py
+++ b/cmsisdsp/cg/nodes/host/message.py
@@ -1,5 +1,5 @@
 # --------------------------------------------------------------------------
-# Copyright (c) 2020-2022 Arm Limited (or its affiliates). All rights reserved.
+# Copyright (c) 2021-2023 Arm Limited (or its affiliates). All rights reserved.
 #
 # SPDX-License-Identifier: Apache-2.0
 #
diff --git a/cmsisdsp/cg/nodes/simu.py b/cmsisdsp/cg/nodes/simu.py
index 5afef587..5b91fd7a 100644
--- a/cmsisdsp/cg/nodes/simu.py
+++ b/cmsisdsp/cg/nodes/simu.py
@@ -3,13 +3,11 @@
 # Title: simu.py
 # Description: Support Python classes for the Python static scheduler
 #
-# $Date: 29 July 2021
-# $Revision: V1.10.0
 #
 # Target Processor: Cortex-M and Cortex-A cores
 # -------------------------------------------------------------------- */
 #
-# Copyright (C) 2010-2023 ARM Limited or its affiliates. All rights reserved.
+# Copyright (C) 2021-2023 ARM Limited or its affiliates. All rights reserved.
 #
 # SPDX-License-Identifier: Apache-2.0
 #
diff --git a/cmsisdsp/cg/scheduler/config.py b/cmsisdsp/cg/scheduler/config.py
index 04558a02..41bd2eef 100644
--- a/cmsisdsp/cg/scheduler/config.py
+++ b/cmsisdsp/cg/scheduler/config.py
@@ -3,13 +3,11 @@
 # Title: config.py
 # Description: Configuration of the code generator
 #
-# $Date: 29 July 2021
-# $Revision: V1.10.0
 #
 # Target Processor: Cortex-M and Cortex-A cores
 # -------------------------------------------------------------------- */
 #
-# Copyright (C) 2010-2021 ARM Limited or its affiliates. All rights reserved.
+# Copyright (C) 2021-2023 ARM Limited or its affiliates. All rights reserved.
 #
 # SPDX-License-Identifier: Apache-2.0
 #
diff --git a/cmsisdsp/cg/scheduler/description.py b/cmsisdsp/cg/scheduler/description.py
index 24cd3a25..90f71fd7 100644
--- a/cmsisdsp/cg/scheduler/description.py
+++ b/cmsisdsp/cg/scheduler/description.py
@@ -3,13 +3,11 @@
 # Title: description.py
 # Description: Schedule generation
 #
-# $Date: 29 July 2021
-# $Revision: V1.10.0
 #
 # Target Processor: Cortex-M and Cortex-A cores
 # -------------------------------------------------------------------- */
 #
-# Copyright (C) 2010-2023 ARM Limited or its affiliates. All rights reserved.
+# Copyright (C) 2021-2023 ARM Limited or its affiliates. All rights reserved.
 #
 # SPDX-License-Identifier: Apache-2.0
 #
diff --git a/cmsisdsp/cg/scheduler/graphviz.py b/cmsisdsp/cg/scheduler/graphviz.py
index 242b47f3..f4a268d3 100644
--- a/cmsisdsp/cg/scheduler/graphviz.py
+++ b/cmsisdsp/cg/scheduler/graphviz.py
@@ -3,13 +3,11 @@
 # Title: graphviz.py
 # Description: Graphviz generation for the CG Static scheduler
 #
-# $Date: 29 July 2021
-# $Revision: V1.10.0
 #
 # Target Processor: Cortex-M and Cortex-A cores
 # -------------------------------------------------------------------- */
 #
-# Copyright (C) 2010-2021 ARM Limited or its affiliates. All rights reserved.
+# Copyright (C) 2021-2023 ARM Limited or its affiliates. All rights reserved.
 #
 # SPDX-License-Identifier: Apache-2.0
 #
diff --git a/cmsisdsp/cg/scheduler/node.py b/cmsisdsp/cg/scheduler/node.py
index 117a08e0..0889f779 100644
--- a/cmsisdsp/cg/scheduler/node.py
+++ b/cmsisdsp/cg/scheduler/node.py
@@ -3,13 +3,11 @@
 # Title: node.py
 # Description: Node class for description of dataflow graph
 #
-# $Date: 29 July 2021
-# $Revision: V1.10.0
 #
 # Target Processor: Cortex-M and Cortex-A cores
 # -------------------------------------------------------------------- */
 #
-# Copyright (C) 2010-2021 ARM Limited or its affiliates. All rights reserved.
+# Copyright (C) 2021-2023 ARM Limited or its affiliates. All rights reserved.
 #
 # SPDX-License-Identifier: Apache-2.0
 #
diff --git a/cmsisdsp/cg/scheduler/pythoncode.py b/cmsisdsp/cg/scheduler/pythoncode.py
index 9d0dddb0..544b1122 100644
--- a/cmsisdsp/cg/scheduler/pythoncode.py
+++ b/cmsisdsp/cg/scheduler/pythoncode.py
@@ -3,13 +3,11 @@
 # Title: pythoncode.py
 # Description: Generation of Python code for the static scheduler
 #
-# $Date: 29 July 2021
-# $Revision: V1.10.0
 #
 # Target Processor: Cortex-M and Cortex-A cores
 # -------------------------------------------------------------------- */
 #
-# Copyright (C) 2010-2021 ARM Limited or its affiliates. All rights reserved.
+# Copyright (C) 2021-2023 ARM Limited or its affiliates. All rights reserved.
 #
 # SPDX-License-Identifier: Apache-2.0
 #
diff --git a/cmsisdsp/cg/scheduler/standard.py b/cmsisdsp/cg/scheduler/standard.py
index 844800c6..20556ae1 100644
--- a/cmsisdsp/cg/scheduler/standard.py
+++ b/cmsisdsp/cg/scheduler/standard.py
@@ -3,13 +3,11 @@
 # Title: standard.py
 # Description: Standard nodes to describe a network
 #
-# $Date: 02 August 2021
-# $Revision: V1.10.0
 #
 # Target Processor: Cortex-M and Cortex-A cores
 # -------------------------------------------------------------------- */
 #
-# Copyright (C) 2010-2021 ARM Limited or its affiliates. All rights reserved.
+# Copyright (C) 2021-2023 ARM Limited or its affiliates. All rights reserved.
 #
 # SPDX-License-Identifier: Apache-2.0
 #
diff --git a/cmsisdsp/cg/types.py b/cmsisdsp/cg/types.py
index d18fe052..a5041639 100644
--- a/cmsisdsp/cg/types.py
+++ b/cmsisdsp/cg/types.py
@@ -3,13 +3,11 @@
 # Title: types.py
 # Description: Description of the basic CMSIS-DSP types
 #
-# $Date: 29 July 2021
-# $Revision: V1.10.0
 #
 # Target Processor: Cortex-M and Cortex-A cores
 # -------------------------------------------------------------------- */
 #
-# Copyright (C) 2010-2021 ARM Limited or its affiliates. All rights reserved.
+# Copyright (C) 2021-2023 ARM Limited or its affiliates. All rights reserved.
 #
 # SPDX-License-Identifier: Apache-2.0
 #