[fiber] Stackful cooperative scheduling via Fibers #743

salkinium · 2021-10-04T11:51:24Z

This continues the work in #439. The changes/additions so far:

Main function

The main function is no longer a fiber; the developer has to start the scheduler manually. modm::fiber::yield() returns immediately if the scheduler is not running, so code that waits on it simply loops in the main function. This will most likely work fine for interrupt- or flag-driven peripheral access like I2C, SPI and UART. However, it won't work for any class that requires an update() function to be polled (but that didn't work with RF_CALL_BLOCKING either).

The scheduler::run() function will return to the main function if all fibers stop running (or if all functions have yielded). This can be used for sleeping and idle thread functionality.

This also treats the main function consistently across all platforms and carries no performance overhead if not using fibers.

I've also moved all classes into the modm::fiber namespace, since I'm worried about compatibility with FreeRTOS and I don't want people believing that modm::yield() makes their code magically reentrant or something stupid like that. In general, users should not have to yield manually, but rather use some higher-level primitive or modm API anyway.
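To make the semantics above concrete, here is a rough hosted sketch of a scheduler whose run() returns to main once every fiber has finished and whose yield() is a no-op while the scheduler is not running. It is not modm's implementation: it assumes POSIX ucontext on Linux, always switches through the main context instead of directly from fiber to fiber, and every name in the toy namespace is made up for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <ucontext.h>
#include <vector>

namespace toy  // hypothetical stand-in, not the modm::fiber API
{
struct Fiber
{
    ucontext_t context{};
    std::vector<char> stack;   // each fiber owns its own stack (stackful)
    void (*entry)() = nullptr;
    bool done = false;
};

struct State
{
    std::vector<Fiber*> fibers;
    ucontext_t mainContext{};
    Fiber* current = nullptr;  // nullptr means: scheduler is not running a fiber
};

inline State& state() { static State s; return s; }

// First function executed on a new fiber stack.
inline void trampoline()
{
    state().current->entry();
    state().current->done = true;
    // Returning here resumes uc_link, i.e. the main context inside run().
}

// Registers a fiber. The Fiber object must not be moved afterwards.
inline void start(Fiber& fiber, void (*entry)(), std::size_t stackSize = 64 * 1024)
{
    assert(entry != nullptr);
    fiber.entry = entry;
    fiber.stack.resize(stackSize);
    getcontext(&fiber.context);
    fiber.context.uc_stack.ss_sp = fiber.stack.data();
    fiber.context.uc_stack.ss_size = fiber.stack.size();
    fiber.context.uc_link = &state().mainContext;
    makecontext(&fiber.context, trampoline, 0);
    state().fibers.push_back(&fiber);
}

// Gives up the CPU. Outside of run() there is nothing to switch to, so it just returns.
inline void yield()
{
    Fiber* self = state().current;
    if (self == nullptr) return;
    swapcontext(&self->context, &state().mainContext);
}

// Round-robin over all fibers; returns to the caller once all of them have finished.
inline void run()
{
    bool anyAlive = true;
    while (anyAlive)
    {
        anyAlive = false;
        for (Fiber* fiber : state().fibers)
        {
            if (fiber->done) continue;
            anyAlive = true;
            state().current = fiber;
            swapcontext(&state().mainContext, &fiber->context);
            state().current = nullptr;
        }
    }
    state().fibers.clear();
}
}  // namespace toy
```

With two fibers that each do some work, yield once and do some more work, the work interleaves in round-robin order and run() then returns, which is the point where main could put the device to sleep.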

Cortex-M Context Switch

I've optimized the context switch assembly by using the push LR, pop PC pair correctly, which significantly speeds up the context switch. For FPU-enabled devices, we only need to store the upper 16 floating point registers (s16-s31), since the lower ones are not preserved across subfunction calls according to the EABI. Saving them does double the switching time; perhaps in the future we can investigate optimizations via the FPU flags register.
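As a back-of-the-envelope check of why the FPU case costs about twice as much, the sketch below counts the words a switch has to store. It assumes the switch pushes the callee-saved core registers r4-r11 plus LR, and s16-s31 on FPU devices; the function names are made up for this example and the exact register list in the assembly may differ.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Callee-saved core registers r4-r11, plus LR so that a later `pop {pc}`
// resumes the fiber where it left off.
constexpr std::size_t coreWords = 8 + 1;
// s16-s31 are callee-saved. s0-s15 are caller-saved, so the compiler has
// already preserved whatever it needs around the call into the switch.
constexpr std::size_t fpuWords = 16;

// Bytes of stack one context switch stores for the outgoing fiber.
constexpr std::size_t contextBytes(bool hasFpu)
{
    return (coreWords + (hasFpu ? fpuWords : 0)) * sizeof(std::uint32_t);
}

static_assert(contextBytes(false) == 36, "9 words without FPU");
static_assert(contextBytes(true) == 100, "25 words with FPU");

// Hypothetical helper: stack pointer after the context has been pushed
// onto a full-descending stack.
inline std::uintptr_t afterContextPush(std::uintptr_t sp, bool hasFpu)
{
    assert(sp % 4 == 0);  // pushes operate on word-aligned stack pointers
    return sp - contextBytes(hasFpu);
}
```

So the FPU variant moves 25 words instead of 9, and this is also the minimum headroom every fiber stack needs on top of what the fiber itself uses.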

Fibers run on the PSP, while the main function and the interrupts use the MSP whose size is defined by the existing lbuild option modm:platform:cortex-m:main_stack_size.

Guard Option

There is an invisible guard option modm:__fibers which prevents the fibers from being shown in the modm.io modules and the doxygen docs until they are ready. This is particularly important for adding new peripheral drivers that may use a completely different API without resumable functions and that shouldn't be exposed to users right now.

Stack Placement

On STM32F4 with CCM, the main stack has been moved into RAM, since the CCM is not DMA-able. On all other STM32 the main stack and modm_faststack are placed in the fastest DMA-able memory. We will have to see how well the Cortex-M7 DTCM is really accessible for DMA, but we'll deal with that later.

TODO:

rleh · 2021-10-04T23:54:37Z

The main stack will be reused for the IRQ stack, and the scheduler::start() function will return to the main function if all fibers stop running (or if all functions have yielded). This can be used for sleeping and idle thread functionality.

Shouldn't the function rather be named scheduler::run() (or scheduler::update())?

salkinium · 2021-10-05T07:39:01Z

Yes, but I'm still working to port fibers to AVR, and haven't gotten to the API yet.

salkinium · 2021-10-06T07:52:37Z

I've added a context switch based on avr-fibers and the GCC calling convention and register layout, however, it doesn't work entirely and as I cannot debug my AVR hardware, I've only been poking around in darkness as to why. In my mind it should just work, but I cannot even jump into the beginning of a fiber call. I'm going to use simavr to try and debug it.

salkinium · 2021-10-10T23:01:57Z

I fixed the AVR fibers, turns out the ATmega2560 is using a 3 byte PC. 🙄
Validated the example and unit tests in hardware on ATmega2560 (3B PC) and ATmega328 (2B PC).
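For anyone hitting the same thing: the initial stack of a fiber has to contain the entry address exactly as ret expects to pop it, and the number of bytes depends on the PC width of the device. Below is a small host-side sketch of that layout. pushReturnAddress is a made-up helper, not the code in this PR, and it assumes the usual AVR convention that the low byte of the word address is pushed first and therefore ends up at the highest address.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical helper: writes a return address onto an empty fiber stack.
// `top` is the index of the highest byte of the stack. The return value is
// the index where the next push would land (AVR push is post-decrement).
inline std::size_t pushReturnAddress(std::uint8_t* stack, std::size_t top,
                                     std::uint32_t wordAddress, unsigned pcBytes)
{
    assert(pcBytes == 2 || pcBytes == 3);  // 2: e.g. ATmega328, 3: e.g. ATmega2560
    assert(top >= pcBytes);
    for (unsigned i = 0; i < pcBytes; ++i)
    {
        // Low byte first, so the address ends up big-endian in memory.
        stack[top--] = static_cast<std::uint8_t>(wordAddress >> (8 * i));
    }
    return top;
}
```

With only two bytes pushed on a 3 byte PC device, ret pops one byte too many and jumps somewhere unrelated, which matches the "cannot even jump into the beginning of a fiber" symptom.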

salkinium · 2021-10-13T23:59:56Z

I'm not sure the x86_64 context switch is working correctly on Windows, I didn't understand what the previous implementation was doing with %rax. The examples and unittests work locally on macOS and in the Linux CI, but Windows seems to not like it? x86 seems to be a bit of a train wreck tho.

chris-durand · 2021-10-14T17:19:07Z

I compared the x86_64 context switch to the implementation in boost context and found big differences for the Microsoft ABI. They also save XMM6-XMM15 floating point registers and other things. It looks like we are also missing some x87 floating point state for Unix x86_64 platforms.

There is some info in the boost context docs.
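To give a feeling for the size of the gap, here is a small sketch that tallies the callee-saved registers a switch must preserve under each ABI. The struct and function names are made up for this comparison. It only counts full registers and leaves out the stack pointer itself as well as the MXCSR and x87 control words, which both ABIs also expect to be preserved.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical summary of what a context switch has to save per ABI.
struct AbiSaveSet
{
    const char* name;
    std::size_t gprs;          // callee-saved 64-bit general purpose registers
    std::size_t xmmRegisters;  // callee-saved 128-bit XMM registers
};

// System V AMD64 (Linux, macOS): rbx, rbp, r12-r15.
constexpr AbiSaveSet systemV{"System V AMD64", 6, 0};
// Microsoft x64: rbx, rbp, rdi, rsi, r12-r15 and xmm6-xmm15.
constexpr AbiSaveSet microsoft{"Microsoft x64", 8, 10};

// Bytes of register state that must survive a switch.
inline std::size_t registerBytes(const AbiSaveSet& abi)
{
    assert(abi.xmmRegisters <= 16);  // x86_64 has xmm0-xmm15 without AVX-512
    return abi.gprs * 8 + abi.xmmRegisters * 16;
}
```

That is 48 bytes on System V against 224 bytes on Windows, so a switch written for the System V set silently clobbers rdi, rsi and xmm6-xmm15 for Windows callers.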

salkinium · 2021-10-14T17:33:30Z

Oh boy… I'll check out if we can use setjmp/longjmp instead on Hosted. That would also give us ARM64 Linux support.

chris-durand · 2021-10-19T23:17:10Z

I'll review tomorrow.

salkinium · 2021-10-19T23:45:07Z

I've moved all stacks into the fastest DMA-able memory and added a guard option modm:__fibers so that we can build a parallel, very much not backwards compatible API of peripheral drivers without confusing everyone. This also prevents the docs scripts from including the unfinished fibers docs.

We'll have to see how annoying it is to have two incompatible APIs next to each other…

I will also restructure the one commit into a few different ones; they were all squashed together until I got things working.

chris-durand · 2021-10-20T23:01:44Z

There is still something wrong with the Waitable implementation. Having two fibers block on an acquired mutex leads to weird behaviour and a segfault. I tested this code:

#include <modm/debug.hpp>
#include <modm/processing.hpp>

modm::fiber::Mutex m;

template<int index>
void test()
{
        for(int ii=0; ii<10; ii++)
        {
                m.acquire();
                MODM_LOG_INFO << "test" << index << "\n";
                modm::fiber::yield();
                m.release();
        }
}

modm::fiber::Stack<1024> stack[3];
modm::Fiber fiber1(stack[0], test<1>);
modm::Fiber fiber2(stack[1], test<2>);
modm::Fiber fiber3(stack[2], test<3>);

int
main(void)
{
        MODM_LOG_INFO << "Start" << modm::endl;
        modm::fiber::Scheduler::run();
        MODM_LOG_INFO << "End" << modm::endl;

        return 0;
}

It also crashes when I correctly initialize Waitable::lastWaiter as nullptr.

rleh

Nice!

salkinium · 2021-10-28T07:53:59Z

I think the waitable implementation is too clever for me, it tries to reorder the list of fibers for low latency and I think it does it wrongly?

I would prefer to have a dumber polling based implementation first to explore the usefulness of the API. The current implementation also doesn't work from an ISR context, which is what is needed to implement interrupt driven peripheral drivers.

I've removed the Waitables for now, since we also don't have a Protothread/Resumables version of those, so they aren't necessary to replace them.
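A dumber polling-based mutex could look roughly like the sketch below. PollingMutex is a made-up name and the yield function is passed in as a callable so the sketch stands alone; in modm it would simply call modm::fiber::yield(). There is no waiter list, so there is nothing to reorder or corrupt. The price is that every waiting fiber gets resumed just to poll again.

```cpp
#include <cassert>

// Hypothetical polling mutex for cooperative fibers. No atomics are needed
// as long as it is only touched from fibers, since a fiber can only lose
// the CPU at an explicit yield.
class PollingMutex
{
    bool locked = false;

public:
    // Takes the mutex if it is free, never blocks.
    bool tryAcquire()
    {
        if (locked) return false;
        locked = true;
        return true;
    }

    // Polls until the mutex is free, yielding to other fibers in between.
    template<typename Yield>
    void acquire(Yield&& yield)
    {
        while (!tryAcquire()) yield();
    }

    void release()
    {
        assert(locked);  // releasing a free mutex is a usage error
        locked = false;
    }
};
```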

chris-durand

Let's merge it without the waitables. They can be added later if required.

ghost · 2021-10-29T14:45:16Z

Just an idea. If the fiber knows the stack size it only costs a few instructions extra to check for overflow during stack swap. It would be a nice option to have.
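Something like the following is what I have in mind, written as plain C++ rather than inside the switch assembly. StackBounds and stackPointerInBounds are made-up names, and the sketch assumes a downward-growing stack. Note that it only catches an overflow that is still present at the moment of the switch, not one that happened and unwound in between.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical description of a fiber stack: lowest address and size in bytes.
struct StackBounds
{
    std::uintptr_t bottom;
    std::size_t size;
};

// True if the stack pointer saved during a switch still lies inside the
// stack. The stack grows down from bottom + size towards bottom.
inline bool stackPointerInBounds(const StackBounds& stack, std::uintptr_t sp)
{
    assert(stack.size > 0);
    return sp >= stack.bottom && sp <= stack.bottom + stack.size;
}
```

That is two compares and a branch per switch, which is why it would fit well as an optional check.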

salkinium added advanced 🤯 feature 🚧 labels Oct 4, 2021

salkinium force-pushed the feature/fiber branch 3 times, most recently from cb86910 to 403a232 on October 4, 2021 20:43

salkinium force-pushed the feature/fiber branch from 403a232 to 267fbea on October 5, 2021 20:09

salkinium force-pushed the feature/fiber branch 3 times, most recently from f5d372f to 55c7da2 on October 10, 2021 23:00

salkinium force-pushed the feature/fiber branch 8 times, most recently from 9fb5836 to 5b3b722 on October 13, 2021 23:53

salkinium marked this pull request as ready for review October 13, 2021 23:54

salkinium requested review from rleh and chris-durand October 13, 2021 23:57

salkinium force-pushed the feature/fiber branch from 5b3b722 to 6cbc12e on October 14, 2021 12:59

salkinium force-pushed the feature/fiber branch from 6cbc12e to e0039f8 on October 15, 2021 18:22

salkinium requested a review from chris-durand October 19, 2021 22:10

salkinium mentioned this pull request Oct 20, 2021

[fiber] Adds a fiber processing library #439

Closed

16 tasks

salkinium added this to the 2021q4 milestone Oct 20, 2021

chris-durand requested changes Oct 20, 2021

rleh mentioned this pull request Oct 22, 2021

Ethernet for STM32F4 #615

Merged

rleh approved these changes Oct 22, 2021

salkinium force-pushed the feature/fiber branch 2 times, most recently from f0719bd to a8e5c18 on October 28, 2021 07:53

salkinium force-pushed the feature/fiber branch 2 times, most recently from baa8246 to dfdddc9 on October 28, 2021 09:09

salkinium requested a review from chris-durand October 28, 2021 22:50

chris-durand approved these changes Oct 29, 2021

salkinium added 7 commits October 29, 2021 02:56

[ci] Always execute all steps on macOS and Windows

e945f35

[hosted] Unify core peripheral type naming

52e0d64

[test] Split ATmega2560 tests into two parts

cf8dd49

[cortex-m] Remove unused process stack pointer

09ef00e

[cortex-m] Add .faststack section

def26a4

[fiber] Add cooperative, stackful fibers

b22c89a

[examples] Add fiber examples

3936a28

salkinium force-pushed the feature/fiber branch from dfdddc9 to 3936a28 on October 29, 2021 00:57

salkinium added the ci:hal label (triggers the exhaustive HAL compile CI jobs) Oct 29, 2021

salkinium merged commit 3936a28 into modm-io:develop Oct 29, 2021

salkinium deleted the feature/fiber branch October 29, 2021 03:29

salkinium mentioned this pull request Jan 19, 2022

Add support for RP2040 #806

Merged

2 tasks

hshose mentioned this pull request Mar 13, 2024

Implement Protothreads as Fibers #1001

Merged

13 tasks
