
This is the 4th post in this series on the 0.3 version of the Merg-E language spec. The previous posts are available here:
- part 1 : coding style, files, merging, scoping, name resolution and synchronisation
- part 2 : reverse markdown for documentation
- part 3 : actors and pools
In this post we look at semantic locks, scheduling (superficially) and hazardous blocker usage.
For context on the why of Merg-E and how it fits into the bigger picture of the Innuendo Stack:
- My Insane 20-Project Pet Stack : An outline of my blue-sky Innuendo Stack
Continuation points
As we discussed in part 1, main, function, def/merge and "lock" are all defined by their execution scope. They are callables for which the execution scope is the only scope. In contrast, an actor also has an execution scope, but it is an embedded, overlaid part of the long-lived actor scope. The executable body bound to the execution scope can in many cases be seen as a task that gets queued for scheduling and execution. When it's done it's done, and the only parts of it that might remain are the fact that it has ended and possibly any uncaught exception, but both of these only if on s-invocation the execution scope was added to a blocker. No blocker link? Then end of execution implies end of lifetime for all of the execution scope.
There is however a situation where an execution scope isn't just as simple as that, where it isn't just a simple task on a queue waiting to be scheduled and executed, that then gets executed till the end.
Sometimes an execution body itself interacts with a blocker that has ties to execution scopes still scheduled for completion, or with awaitable lang or ambient operations. In such cases the execution scope will contain a so-called continuation point. While the user normally shouldn't care about these, there are situations where they become relevant. A continuation point is basically a collection of parameters that allows execution of the body to be halted and the execution context to be re-submitted to a scheduling queue for later continuation.
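To make the idea a bit more concrete, here is a minimal sketch in Python (not Merg-E), assuming a toy round-robin scheduler. A generator's `yield` plays the role of a continuation point and a plain set stands in for a blocker. The names `worker`, `parent` and `run` are hypothetical and only exist for this illustration; the real Merg-E scheduler may work quite differently.

```python
from collections import deque

def worker(name, blocker, log):
    # A simple task: no continuation point, it runs to the end in one go.
    log.append(f"{name} ran")
    blocker.discard(name)  # completing resolves this entry in the blocker
    return
    yield  # unreachable; only makes this function a generator

def parent(blocker, log):
    log.append("parent: before await")
    while blocker:
        # Continuation point: halt here and get re-submitted to the queue.
        yield
    log.append("parent: after await")

def run(queue):
    # Toy scheduler: run each task until it ends or hits a continuation point.
    while queue:
        task = queue.popleft()
        try:
            next(task)
            queue.append(task)  # halted at a continuation point: re-queue
        except StopIteration:
            pass  # finished: the task's scope is gone

log = []
blocker = {"w1", "w2"}
run(deque([parent(blocker, log),
           worker("w1", blocker, log),
           worker("w2", blocker, log)]))
print(log)
```

The parent is halted once, both workers run to completion, and only then does the parent continue from where it stopped.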
In many cases continuation points are simple:
int max_prime = 1000;
mutable int counter = 0;
shared mutable int ok_count = 0;
mutable blocker all_primes;
mutable merge utils.is_prime as is_prime(x int)::{ok_count: ok_count}{max_prime: max_prime};
while counter < 100 {
    counter += 1;
    all_primes += is_prime(counter);
}
-> await_all all_primes;
cout "counted " ok_count " prime numbers" endl;
In this example the scheduler is free to do a rest-of-body reschedule with blocker erasure. This makes for more efficient re-scheduling.
Or it can be deeper:
int max_prime = 1000;
mutable int counter = 0;
borrowed mutable int ok_count = 0;
mutable blocker all_primes;
mutable merge utils.is_prime as is_prime(x int, c int)::{}{max_prime: max_prime};
while counter < 100 {
    counter += 1;
    all_primes += is_prime(counter, ok_count);
    --> lang.await.reclaim.all all_primes;
}
cout "counted " ok_count " prime numbers" endl;
In this case no rest-of-body reschedule is possible, as the scheduler isn't expected to look deep enough into the loop to know how to cut a rest-of-body, so the continuation point happens inside of the loop.
Blockers
As we discussed in part 1, blockers are an abstraction for a set of awaitables. There is no user-space awaitable type that can ever be assigned to. For the user, awaitables are ignored by default, which makes whatever could be awaited completely ephemeral, but an awaitable can be added to a blocker so that we can actually await its completion. It is important to realize that the very act of defining a blocker changes how the scheduler handles an execution scope. If there is no blocker defined, the scheduler will interpret the scope as a simple task that gets completed in one go. If there is a blocker defined in the execution scope body, an empty continuation point gets added to the scope and the scheduler will handle it appropriately.
int max_prime = 1000;
mutable int counter = 0;
borrowed mutable int ok_count = 0;
-> mutable blocker all_primes;
mutable merge utils.is_prime as is_prime(x int, c int)::{}{max_prime: max_prime};
while counter < 100 {
    counter += 1;
    all_primes += is_prime(counter, ok_count);
    lang.await.reclaim.all all_primes;
}
cout "counted " ok_count " prime numbers" endl;
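For readers coming from mainstream runtimes, a loose analogy (and no more than that) is collecting asyncio tasks in a set and waiting on the set. The sketch below is Python, not Merg-E, and it assumes nothing about how Merg-E implements blockers; `is_prime` here is a hypothetical stand-in, and `blocker` is just an ordinary Python set.

```python
import asyncio

async def is_prime(n, found):
    # Hypothetical stand-in for utils.is_prime: records n if it is prime.
    await asyncio.sleep(0)  # give the event loop a chance to interleave
    if n > 1 and all(n % d for d in range(2, int(n ** 0.5) + 1)):
        found.append(n)

async def main():
    found = []
    blocker = set()  # plays the role of the blocker: a set of awaitables
    for n in range(1, 101):
        # Without adding the task to the set we would have nothing to await.
        blocker.add(asyncio.create_task(is_prime(n, found)))
    await asyncio.wait(blocker)  # roughly what await_all expresses
    return len(found)

prime_count = asyncio.run(main())
print("counted", prime_count, "prime numbers")
```

The analogy breaks down quickly: asyncio tasks are ordinary objects you can hold and assign, while the whole point above is that Merg-E awaitables are not.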
Hazardous blockers
Before we get to discussing semantic locks, we need to discuss hazardous blockers. The need for these will remain a bit hazy until we get to the semantic locks, but please hang in there for a bit; things will fall into place in the next section.
There is a modifier possible on blockers that changes a number of things:
- It allows an executable body (function, def/merge, etc.) with a blocker in it to be awaited from a lock.
- It aggressively changes scheduling both before and after the continuation points.
- It makes instantiating a non-hazardous blocker in the same code body prohibited.
- It disables the ability to declare a lock within the same scope where it is defined.
int max_prime = 1000;
mutable int counter = 0;
borrowed mutable int ok_count = 0;
-> mutable hazardous blocker all_primes;
mutable merge utils.is_prime as is_prime(x int, c int)::{}{max_prime: max_prime};
while counter < 100 {
    counter += 1;
    all_primes += is_prime(counter, ok_count);
    lang.await.reclaim.all all_primes;
}
cout "counted " ok_count " prime numbers" endl;
We name the modifier hazardous to make usage of the modifier stand out in a negative way. Why? Because while it might sometimes be needed because of locks (as we will discuss next), its usage is discouraged because of its aggressive re-prioritization in scheduling.
Semantic locks
In Merg-E, locks are semantic in nature. In normal usage they act exactly the way real locks would, but rather than being real locks they are scheduling conditions for the lock's body.
Let's re-examine the example from part 1:
reentrant mutable function report (count int)::{cout: cout, endl: endl}{
    lock(cout) {
        cout "counted " count " prime numbers" endl;
    };
};
In this case it is important that cout, a shared ambient authority functionality, isn't used by any other scope concurrently. What this code appears to say isn't quite what it means semantically. It doesn't say: "lock cout and then run the code". Instead it says: "If cout is available, schedule the body with high priority; if it isn't, delay scheduling."
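A small Python model (again, not Merg-E) may help to see the difference between taking a lock and treating availability as a scheduling condition. It assumes a toy scheduler where each lock body needs a fixed number of time slices; `schedule`, the body names and the slice counts are all hypothetical, and the "high priority" part is left out to keep the sketch short.

```python
from collections import deque

def schedule(lock_bodies):
    """lock_bodies: list of (name, resources, slices), with slices >= 1."""
    busy = set()                  # resources used by bodies in flight
    waiting = deque(lock_bodies)  # bodies not yet scheduled
    running = deque()             # bodies scheduled but not finished
    order = []
    while waiting or running:
        # Scheduling condition: admit a body only if all its resources are free.
        for _ in range(len(waiting)):
            name, resources, slices = waiting.popleft()
            if busy.isdisjoint(resources):
                busy |= resources
                running.append([name, resources, slices])
                order.append(f"start {name}")
            else:
                # Nobody blocks on a lock here; scheduling is simply delayed.
                waiting.append((name, resources, slices))
        # Give every in-flight body one time slice.
        for _ in range(len(running)):
            entry = running.popleft()
            entry[2] -= 1
            if entry[2] == 0:
                busy -= entry[1]
                order.append(f"end {entry[0]}")
            else:
                running.append(entry)
    return order

order = schedule([
    ("a", {"cout"}, 2),
    ("b", {"cout"}, 1),  # needs cout, so it is delayed until "a" ends
    ("c", {"cerr"}, 1),  # independent resource, scheduled right away
])
print(order)
```

Body "b" never sits on a queue holding or waiting for a lock; it just doesn't get scheduled until cout is free, which is the point of calling these locks semantic.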
This is all straightforward so far, but what if we want to await something within the body:
reentrant mutable function report (count int)::{cout: cout, endl: endl, cerr: cerr}{
    lock(cout, cerr) {
        mutable hazardous blocker flushed;
        if count < 0 {
            cerr "ERROR: negative count" endl;
            flushed += cerr.flush();
            await_all flushed;
        };
        cout "counted " count " prime numbers" endl;
    };
};
In this sample we are calling flush on cerr and want it to finish before writing anything to cout, so we do an await inside of the lock. We aren't allowed to define a regular blocker inside of a lock, but we can define a hazardous blocker, denoting that we know what we are doing is needed but hazardous from a scheduling perspective.
We can do the same with our own callables, but if the invoked callable in turn holds a blocker, that blocker needs to be marked as hazardous too, and so on. As a developer, consider high numbers of hazardous blockers in your code a code smell at best and a real hazard to scheduling and concurrency at worst. Remember that what hazardous tells us is: all of this is likely to happen within a lock that should ideally be resolved as soon as possible, and it likely won't be!
The hazardous modifier tells the scheduler exactly the same thing. The scheduler is going to cut corners to get rid of the lock as soon as possible, and this means prioritizing the continuation points of execution scopes that hold a hazardous blocker, with all the unfair scheduling that flows from that.
Merg-E basically puts the scheduler in panic mode whenever an await is done on a hazardous blocker, prioritizing any scheduling that might help hazardous blockers resolve. While there should be no risk of deadlocks, critical resources can still hold up progress for a long time by prohibiting other lock scopes from getting scheduled in the first place, but the panic mode itself that tries to fight this might in rare cases end up starving resolved non-hazardous blockers of continuation.
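The unfairness is easy to see in a toy model. The Python sketch below (not Merg-E, and not the actual scheduler) assumes a single priority queue in which anything tied to a hazardous blocker jumps ahead of everything else; the `Scheduler` class and its `submit` and `drain` methods are hypothetical names for this illustration.

```python
import heapq
from itertools import count

class Scheduler:
    """Toy priority queue: hazardous continuations always go first."""

    def __init__(self):
        self._heap = []
        self._seq = count()  # tie-breaker keeps submission order per class

    def submit(self, name, hazardous=False):
        priority = 0 if hazardous else 1  # lower number is served first
        heapq.heappush(self._heap, (priority, next(self._seq), name))

    def drain(self):
        order = []
        while self._heap:
            _, _, name = heapq.heappop(self._heap)
            order.append(name)
        return order

sched = Scheduler()
sched.submit("normal-1")
sched.submit("hazardous-1", hazardous=True)
sched.submit("normal-2")
sched.submit("hazardous-2", hazardous=True)
served = sched.drain()
print(served)
```

Even though "normal-1" was submitted first, it is served after both hazardous entries. With a steady stream of hazardous work, the normal entries in this model would never get their turn, which is the kind of starvation described above.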
Coming up
In this post we discussed continuation points, blockers, hazardous blockers and locks. Semantic locks without awaits feel and act just like real locks would, but when mixed with awaits, things start interacting with scheduling in a way that requires the user to be aware that the locks are only locks semantically. This is possibly one of the cognitively hardest parts of the Merg-E language to understand. I'll need a few more posts to talk about things like freezing, attenuation and decomposition, parallelism models, operators, capability patterns, and a few more.