Epigraph:
Homeless idea will always find refuge in the home of some human.
Stanisław Jerzy Lec
Introduction
This is the first article in a short series devoted to thread wrappers:
- The present article.
- Conveyor Thread Wrapper for Modern C++.
Both articles share the same downloadable source code.
Motivation
By mentioning “modern C++” in the article abstract, I mean that C++11 or a later version is required, as the article relies heavily on the threading and thread synchronization features of the standard. The standardization of threading is great progress: it helps make C++ a truly cross-platform technology. However, this standardization work is still not close to completion.
I put forward the thread wrapper concept and basic design years ago and first published some relevant code samples on CodeProject, in response to some Q&A questions. First of all, I answered the following questions on .NET: How to pass ref parameter to the thread, Change parameters of thread (producer) after it is started, Multithreading in C#, Passing arguments to a threaded Long-Running Process; later I answered many other related questions. The CodeProject member VSNetVbHarry was kind enough to translate one of my code fragments into VB.NET. In these answers, I covered passing the reference to the wrapper object, and hence all its instance members, to make them accessible from the thread code; encapsulation of thread interlocking used for data exchange between threads; throttling of the thread; joining; thread abortion; and related issues. This clearly demonstrated the benefits of the concept.
Some main ideas date back to the design of the Ada tasks mechanism.
Now we have C++11 std::thread and other standard library features related to multithreading. GCC 4.8.1 was the first feature-complete implementation of C++11 in 2013, Clang was also ready in 2013, and Microsoft started to support C++11 only in 2015. Since then, developing an analogous thread wrapper in C++ really makes a lot of sense. Other factors stimulating this publication are the maturing of the concept and more experience with it and, importantly, the many related questions CodeProject members keep asking, as well as my old idea to write articles and blog posts that answer many questions at once.
Why Thread Wrapper?
The thread wrapper class is based on std::thread. This class is standard and is perfectly fine. Another, more flexible option could be boost::thread, found in the Boost set of libraries, but this is a big library beyond the official C++ standard.
As to std::thread, it is a raw standard class which allows doing all the threading the standard threading library is capable of. Naturally, it provides the maximum flexibility achievable with this library, but there are tons of subtleties and non-obvious usage techniques. In contrast to that, the thread wrapper offers extreme simplicity while covering the majority of typical and not so typical applications. On the other hand, the thread wrapper design is fundamentally based on programming by extension, so any more advanced behavior can be naturally added to the wrapper in derived classes.
Thread Wrapper Usage
The idea is this: the user creates a derived class and overrides the virtual function ThreadWrapper::Body. When the instance is constructed, the thread is not yet started. It can be started at any time later by calling a separate function, ThreadWrapper::Start(bool). The parameter offers the option to specify detached (background) mode, which is explained in detail in the section Joining Non-Joinable Thread. A separate start is a very important flexibility feature badly missing in raw std::thread.
There are more virtual functions to override, but this is optional, because the rest of them are pseudo-abstract: Aborted(), Terminated(), ExceptionCaught(std::exception&) and UnknownExceptionCaught(). I don’t show their usage, because it’s quite trivial; their calls are shown in the code of BodyWrapper below.
The complete usage sample is shown as the very last code sample.
Derived Thread Wrapper Classes and Thread Body
There is one little problem in implementing the separate method ThreadWrapper::Start: the instance of std::thread has to be a member of ThreadWrapper, so it is constructed with the default constructor, without parameters. The thread body comes into play only later.
One way to implement such behavior is to use the move assignment operator of std::thread, thread& operator=(thread&& other):
class ThreadWrapper {
public:
    void Start(bool background = false) {
        // Move-assign a temporary std::thread to the default-constructed member.
        thread = std::thread(BodyWrapper, this);
        if (background)
            thread.detach();
    }
};
This is the idea behind the thread operator= with move semantics: the constructor-call expression std::thread(BodyWrapper, this) is recognized by the compiler as an r-value expression, a temporary, so the overload taking an r-value reference (&&) is selected; an l-value cannot be assigned, because the copy assignment operator is deleted. Therefore, the implementation of the operator can safely modify the temporary thread instance created on the right of the assignment: it is guaranteed that it won’t be used later. The thread handle and all properties of the newly constructed thread instance are moved to the left operand, the instance member ThreadWrapper::thread.
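The effect of the move assignment can be observed through joinable(). The following is only a minimal sketch, not the article's ThreadWrapper: the class DeferredThread and the function DemoDeferredStart are hypothetical names introduced for illustration.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// Hypothetical holder used only for this illustration.
class DeferredThread {
public:
    // A default-constructed std::thread represents no thread, so it is not joinable.
    bool HasThread() const { return worker.joinable(); }
    void Start(std::atomic<int>& counter) {
        // The temporary on the right is an r-value, so move assignment is selected.
        worker = std::thread([&counter] { ++counter; });
    }
    void Join() {
        if (worker.joinable())
            worker.join();
    }
private:
    std::thread worker;
};

// Returns true if the observed states match the explanation above.
bool DemoDeferredStart() {
    std::atomic<int> counter{0};
    DeferredThread t;
    const bool before = t.HasThread(); // false: nothing was started yet
    t.Start(counter);
    const bool after = t.HasThread();  // true: the member now owns the thread
    t.Join();
    assert(counter == 1);              // the body ran exactly once
    return !before && after && !t.HasThread();
}
```

One caveat worth keeping in mind: move-assigning to a std::thread object that is still joinable calls std::terminate, so a second start is only safe after the first thread has been joined or detached.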
Similar behavior could be achieved in a different way, via the function swap:
class ThreadWrapper {
public:
    void Start(bool background = false) {
        std::thread tmp(BodyWrapper, this);
        // After the swap, the member owns the new thread and tmp owns nothing.
        thread.swap(tmp);
        if (background)
            thread.detach();
    }
};
I wonder if anyone was confused by the seemingly weird function swap: why would a developer ever need to swap two thread handles? The code sample above is just one example of its use.
The behavior of detached thread is a very special problem; dealing with it is described in the section Joining Non-Joinable Thread.
The function BodyWrapper is needed to handle exceptions and the join. This is how it is done:
class ThreadWrapper {
private:
    static void BodyWrapper(ThreadWrapper* instance) {
        try {
            instance->Body();
        } catch (ThreadAbortException&) {
            instance->Aborted();
        } catch (std::exception& ex) {
            instance->ExceptionCaught(ex);
        } catch (...) {
            instance->UnknownExceptionCaught();
        }
        // Signal termination to every thread waiting in Join.
        std::lock_guard<std::mutex> lock(instance->joinMutex);
        instance->done = true;
        instance->joinEvent.notify_all();
        instance->Terminated();
    }
    std::thread thread;
    std::condition_variable stateEvent, joinEvent;
    std::mutex stateMutex, joinMutex;
    Action state = Action::wakeup;
    bool done = false;
};
This way of binding the thread with the thread wrapper instance is actually the main idea of the wrapper. Once we have this binding, we can safely encapsulate thread control (throttling, abortion, exception handling) and synchronized data exchange between threads.
Let’s start with thread throttling.
Put to Sleep and Wake Up
In some threading APIs, one can find functions for suspending and resuming a thread. An example of such an API is Windows SuspendThread and ResumeThread. In modern thread APIs, such as C++ std::thread or CLI, such functions are never included. They are no less dangerous than TerminateThread (we will discuss thread termination below).
The problem with such APIs is that they are totally asynchronous to the thread execution. One possible trouble is suspending a thread when it owns a mutual exclusion object. In this case, suspension of one thread will indirectly suspend all threads trying to acquire the same mutual exclusion object: those threads will be put in a wait state until the mutex is released by the suspended thread. But this is not the worst situation. Worse, the very same thread which is supposed to eventually resume the thread holding the mutex can try to acquire the same mutex after the suspension but before the resumption. This creates a deadlock, where two (or more) threads are in a wait state, waiting for each other. And even worse, it can happen with some very low probability, so testing may not reveal the problem, and the deadlock could happen to the customer in the most embarrassing situation, in perfect agreement with Murphy’s law.
At the same time, suspending and resuming threads is a very important feature. In some applications, it is critically important. And it can be perfectly safe. It’s enough to synchronize suspension with the thread execution. A thread to be suspended should repeatedly call some function used to put it to the wait state, conditionally. In the present thread wrapper implementation, this function is the protected function ThreadWrapper::SyncPoint. It uses the mechanism based on std::condition_variable. In combination with std::mutex, it provides a thread throttling effect in a way similar to the CLI System.Threading.EventWaitHandle.
This is how it works:
class ThreadWrapper {
protected:
    void SyncPoint(bool autoReset = false) {
        std::unique_lock<std::mutex> ul(stateMutex);
        // The predicate is evaluated with stateMutex locked.
        stateEvent.wait(ul, [=] {
            auto result = ((int)state & (int)Action::wakeup) > 0;
            if (state == Action::deepAbort)
                throw ThreadAbortException();
            else if (state == Action::shallowAbort)
                throw ShallowThreadAbortException();
            if (autoReset)
                state = Action::sleep;
            return result;
        });
    }
};
This function is also used for thread termination, discussed below. The declarations of the relevant thread synchronization fields (stateEvent, stateMutex, state) were already shown in the code sample showing BodyWrapper.
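Stripped of the abort states, the synchronization point is a gate built on a condition variable. The sketch below is a simplified stand-in and not the article's SyncPoint; the names Gate and DemoGate are hypothetical.

```cpp
#include <cassert>
#include <condition_variable>
#include <mutex>
#include <thread>

// Hypothetical, simplified gate: no abort states and no auto-reset.
class Gate {
public:
    void Close() {
        std::lock_guard<std::mutex> lock(mutex);
        open = false;
    }
    void Open() {
        {
            std::lock_guard<std::mutex> lock(mutex);
            open = true;
        }
        event.notify_all();
    }
    // Called by the worker at a point where it is safe to be put to sleep.
    void SyncPoint() {
        std::unique_lock<std::mutex> lock(mutex);
        // No polling: the thread is descheduled until notified and the gate is open.
        event.wait(lock, [this] { return open; });
    }
private:
    std::mutex mutex;
    std::condition_variable event;
    bool open = true;
};

// Starts a worker behind a closed gate, opens the gate, returns the steps performed.
int DemoGate() {
    Gate gate;
    gate.Close();
    int steps = 0;
    std::thread worker([&gate, &steps] {
        for (int i = 0; i < 3; ++i) {
            gate.SyncPoint(); // blocks while the gate is closed
            ++steps;
        }
    });
    gate.Open();
    worker.join(); // join makes the final value of steps visible here
    assert(steps == 3);
    return steps;
}
```

Because the worker waits only at a point it chooses itself, it can never be stopped while holding a lock it has taken elsewhere, which is the property that asynchronous suspension lacks.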
The thread wrapper methods used to throttle a thread are intentionally named differently: PutToSleep and WakeUp. Even if the naming is a bit ugly, it emphasizes that the functions are not associated with the dreaded “suspend” and “resume”. On the other hand, this naming suggests what really happens to the thread: when a thread calls SyncPoint while its state is set to sleep, it is put to the wait state; that is, it is switched off and not scheduled back to execution until it is woken up. Wait state does not mean spin wait; there is no polling. The thread is really woken up by the event notification mechanism.
This is how the thread is throttled:
class ThreadWrapper {
public:
    void PutToSleep() {
        std::lock_guard<std::mutex> lock(stateMutex);
        state = Action::sleep;
        stateEvent.notify_one();
    }
    void WakeUp() {
        std::lock_guard<std::mutex> lock(stateMutex);
        state = Action::wakeup;
        stateEvent.notify_one();
    }
};
Thread Termination
With std::thread, there is nothing similar to the CLI System.Threading.Thread.Abort, which is a thread termination mechanism fully asynchronous to the thread execution. The use of this mechanism sometimes sparks flame wars, so I don’t discuss it here.
In contrast to that, the thread wrapper’s thread can be aborted by another thread in sync with the thread execution, through the same SyncPoint function shown above. Note that there are two levels of abort: deep and shallow. In the class ThreadWrapper only the deep abort is used, but the shallow abort is reserved for use in descendant classes and is actually used in the class ConveyorThreadWrapper described in a separate article. As to the class ThreadWrapper, it is important that the two exceptions thrown by SyncPoint are derived one from another:
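class ThreadWrapper {
private:
    // Public inheritance is required: a handler for a base class
    // matches a derived exception only through a public base.
    class ThreadAbortException : public std::exception {};
protected:
    class ShallowThreadAbortException : public ThreadAbortException {};
};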
As ThreadAbortException is caught in the function BodyWrapper, that handler also catches ShallowThreadAbortException. It means that if some descendant class throws this exception (indirectly, as shown below) and does not handle it, it will be caught anyway. See also the section Exceptions.
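The "caught anyway" behavior relies on the derived exception having the base as a public base class; with the default (private) inheritance of class, the base handler is skipped. This can be checked with a small sketch; all names below are hypothetical stand-ins, not the article's classes.

```cpp
#include <cassert>
#include <exception>
#include <string>

// Hypothetical stand-ins for the abort exception types.
class AbortException : public std::exception {};
class ShallowAbortException : public AbortException {}; // public base: matched by the base handler
class HiddenAbortException : AbortException {};         // private base (class default): not matched

// Throws an exception of type E and reports which handler caught it.
template <typename E>
std::string Classify() {
    try {
        throw E();
    } catch (AbortException&) {
        return "abort";
    } catch (std::exception&) {
        return "std";
    } catch (...) {
        return "unknown";
    }
}

// Runtime checks of the handler selection described above.
void CheckHandlers() {
    assert(Classify<AbortException>() == "abort");
    assert(Classify<ShallowAbortException>() == "abort");
    assert(Classify<HiddenAbortException>() == "unknown");
    assert(Classify<std::exception>() == "std");
}
```

With private inheritance the derived exception would fall through to the catch-all handler, so the hook for unknown exceptions would be called instead of the abort hook.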
The function ThreadWrapper::Abort is quite similar to PutToSleep/WakeUp (shown above):
class ThreadWrapper {
public:
    void Abort() {
        // Deep abort: the only level used by ThreadWrapper itself.
        SetAbort(true, false);
    }
private:
    enum class Action {
        sleep = 0, wakeup = 1,
        shallowAbort = wakeup | 2, deepAbort = wakeup | 4
    };
    void SetAbort(bool set, bool shallow = true) {
        // Same mutex and condition variable as in SyncPoint, PutToSleep and WakeUp.
        std::lock_guard<std::mutex> lock(stateMutex);
        if (set) {
            if (shallow)
                state = Action::shallowAbort;
            else
                state = Action::deepAbort;
        } else
            state = Action::wakeup;
        stateEvent.notify_one();
    }
};
Note that the abort status is a bit-mapped value combined with wakeup. The implementation of SyncPoint shows that the thread is woken up based on the wakeup bit extracted from the status. This is done to make sure that a thread can be aborted when it is in a wait (sleep) state. Further execution with the abort status throws the exception of the type ThreadWrapper::ThreadAbortException, which is caught on the top stack frame of the thread body.
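The bit layout can be checked in isolation. In the sketch below the enumeration values are copied from the article, while the helper functions ReleasesWaitingThread and CheckStates are hypothetical names introduced for illustration.

```cpp
#include <cassert>

// Same values as the Action enumeration shown above.
enum class Action {
    sleep = 0, wakeup = 1,
    shallowAbort = wakeup | 2, deepAbort = wakeup | 4
};

// True if a thread waiting at the synchronization point must be released for this state.
constexpr bool ReleasesWaitingThread(Action state) {
    return (static_cast<int>(state) & static_cast<int>(Action::wakeup)) != 0;
}

// Compile-time checks: both abort levels carry the wakeup bit.
static_assert(!ReleasesWaitingThread(Action::sleep), "sleep keeps the thread waiting");
static_assert(ReleasesWaitingThread(Action::wakeup), "wakeup releases the thread");
static_assert(ReleasesWaitingThread(Action::shallowAbort), "shallow abort includes wakeup");
static_assert(ReleasesWaitingThread(Action::deepAbort), "deep abort includes wakeup");

// The same facts as runtime checks, plus the numeric values of the abort states.
void CheckStates() {
    assert(static_cast<int>(Action::shallowAbort) == 3);
    assert(static_cast<int>(Action::deepAbort) == 5);
    assert(!ReleasesWaitingThread(Action::sleep));
}
```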
Using exception handling for thread abortion is perfectly safe. It is important to bring the thread execution to the top stack frame, where it will exit its body function unconditionally. If the body function is written correctly, the stack unwinding mechanism involves proper destruction of all the objects constructed up to the point of abortion and complete clean-up.
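The clean-up claim can be illustrated with a small sketch in which an abort exception is thrown two frames deep and caught on the top frame of the thread. The names AbortRequest, Tracer, Inner, Body and DemoUnwinding are hypothetical and belong to this sketch only.

```cpp
#include <cassert>
#include <exception>
#include <string>
#include <thread>
#include <utility>
#include <vector>

struct AbortRequest : public std::exception {}; // hypothetical stand-in for the abort exception

// RAII object that records its own destruction.
class Tracer {
public:
    Tracer(std::vector<std::string>& log, std::string name)
        : log(log), name(std::move(name)) {}
    ~Tracer() { log.push_back("released " + name); }
private:
    std::vector<std::string>& log;
    std::string name;
};

void Inner(std::vector<std::string>& log) {
    Tracer inner(log, "inner");
    throw AbortRequest(); // simulates an abort detected at a synchronization point
}

void Body(std::vector<std::string>& log) {
    Tracer outer(log, "outer");
    Inner(log);
    log.push_back("unreachable"); // never executed
}

// Runs Body in a thread whose top frame catches the abort; returns the recorded events.
std::vector<std::string> DemoUnwinding() {
    std::vector<std::string> log;
    std::thread worker([&log] {
        try {
            Body(log);
        } catch (AbortRequest&) {
            log.push_back("aborted");
        }
    });
    worker.join();
    assert(log.size() == 3);
    return log;
}
```

Both destructors run, innermost first, before the handler on the top frame is entered, so resources held through RAII objects are released without any abort-specific code in the body.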
Exceptions
All thread exceptions should be handled on the very top stack frame of the thread stack, at the latest. If this is not done, std::terminate is called, and eventually the whole process will be terminated.
C++ does not have a single base class for all exception types, but there is one standard base class for the standard exceptions, std::exception. All exceptions of the types derived from std::exception can be caught by handling this exception. Therefore, this exception type should be handled first, and all other exceptions should be handled at the end of the try block. This is what is done in the method BodyWrapper shown above; this method makes the very top stack frame of the thread, as can be seen from the first code sample.
For the same reasons, there are three different hook functions for the exception handlers: Aborted(), ExceptionCaught(std::exception&) and UnknownExceptionCaught(). Even though ThreadAbortException is technically an exception class, throwing and propagation of this exception should not be considered an abnormal situation.
Joining Non-Joinable Thread
One problem of raw std::thread which causes a lot of mistakes is its join function. It simply blocks the calling thread until the target thread terminates. It can be used only when the thread is joinable, which is true unless the thread is detached. The detached thread is analogous to the CLR background thread: a detached or background thread does not keep a process running if all foreground threads have terminated. When join is used just to make sure a thread is terminated before the application is finally closed, it might not be needed for a background thread.
But what to do if we still need to synchronize the calling thread with the termination of a given thread? The status of the std::thread object (or of its wrapper) is still accessible, so the first thing which can come to one’s mind is polling this status in a spin wait. Of course, such a solution could not be considered acceptable.
This problem can be solved with the same mechanism of std::condition_variable:
class ThreadWrapper {
public:
    void Join() {
        if (thread.joinable())
            thread.join();
        else {
            // Detached thread: wait for the notification sent by BodyWrapper.
            std::unique_lock<std::mutex> ul(joinMutex);
            joinEvent.wait(ul, [=] { return done; });
        }
    }
};
Naturally, on the top stack frame of the thread body, the notification is sent on the thread termination. Importantly, in contrast to the thread throttling methods, all threads which could possibly synchronize themselves with thread termination are notified. This is shown above in the code of BodyWrapper.
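The same technique can be sketched outside of the wrapper. In the following illustration, which is an assumption-laden sketch and not the article's implementation, the completion state is kept in a shared_ptr so that it outlives whichever side finishes last; Completion and DemoJoinDetached are hypothetical names.

```cpp
#include <cassert>
#include <condition_variable>
#include <memory>
#include <mutex>
#include <thread>

// Hypothetical completion state shared by the detached thread and the waiting thread.
struct Completion {
    std::mutex mutex;
    std::condition_variable event;
    bool done = false;
    int result = 0;
};

// Starts a detached thread and waits for its completion without join and without polling.
int DemoJoinDetached() {
    auto state = std::make_shared<Completion>();
    std::thread worker([state] { // the thread keeps its own reference to the state
        const int value = 6 * 7;
        {
            std::lock_guard<std::mutex> lock(state->mutex);
            state->result = value;
            state->done = true;
        }
        state->event.notify_all(); // release every waiting thread
    });
    worker.detach();
    assert(!worker.joinable()); // join() is no longer available
    std::unique_lock<std::mutex> lock(state->mutex);
    state->event.wait(lock, [&state] { return state->done; });
    return state->result;
}
```

The predicate form of wait covers the case where the thread finishes before the wait starts: the flag is checked under the mutex first, so the notification cannot be missed.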
Interlocked Properties
In the general case, the developer of a class derived from ThreadWrapper can easily organize the interlocking between threads in an encapsulated manner; the wrapper class makes it quite convenient. All public members of such a class can potentially be accessed from any other thread. If they are also used inside the wrapped thread, thread synchronization should apply. The derived wrapper can add all the thread synchronization primitives needed, in the form of private wrapper members, and use them for interlocking in the implementation of the public members.
However, on top of that, I wanted to add a supplementary template class to cover the most typical mutually exclusive execution patterns, which would cover the vast majority of applications. The simplest way to start describing these techniques is probably showing the whole class:
#include <mutex>

template<typename T>
class InterlockedProperty {
public:
    InterlockedProperty() : InterlockedProperty(nullptr, nullptr) { }
    InterlockedProperty(const T &value) : InterlockedProperty(nullptr, &value) { }
    InterlockedProperty(std::mutex& sharedMutex) : InterlockedProperty(&sharedMutex, nullptr) { }
    InterlockedProperty(std::mutex& sharedMutex, const T &value) : InterlockedProperty(&sharedMutex, &value) { }
    InterlockedProperty& operator=(InterlockedProperty& other) {
        this->mutex = other.mutex;
        this->value = other.value;
        return *this;
    }
    void UseSharedMutex(std::mutex& mutex) {
        // Store the address of the shared mutex; it is locked only on later access.
        this->mutex = &mutex;
    }
    operator T() const {
        std::lock_guard<std::mutex> lock(*mutex);
        return value;
    }
    T operator=(const T &value) {
        std::lock_guard<std::mutex> lock(*mutex);
        return this->value = value;
    }
private:
    InterlockedProperty(std::mutex * sharedMutex, const T * value) {
        // No locking here: see the section on initialization.
        if (sharedMutex == nullptr)
            mutex = &uniqueMutex;
        else
            mutex = sharedMutex;
        if (value != nullptr) this->value = *value;
    }
    std::mutex uniqueMutex;
    std::mutex * mutex;
    T value;
};
The template class InterlockedProperty simply defines two operators for reading and writing the property value and wraps the access to the value in the same mutex object.
Of course, not all types can be used as a template parameter; for example, a class with a deleted default constructor cannot be used.
The initialization of the InterlockedProperty instance requires some understanding.
Initialization of InterlockedProperty
Look at the set of the public InterlockedProperty constructors. Two of them use the mutex object uniqueMutex constructed in the class; the other two use some external mutex instance supplied to the constructor. The simpler constructors, those without the external mutex parameter, implement the simplest case, when the access to the property is interlocked between the thread wrapper’s thread and other threads. It probably covers most typical applications.
However, for a different sort of application, also a very typical one, this is not enough. The thread wrapper may have more than one property, and some properties have to be synchronized together. For example, a thread wrapper may operate on two properties whose values are not arbitrary but obey some invariant; that is, not all combinations of the two property values are valid. In other words, it can be incorrect to modify one property in one thread and another one in another thread, because an intermediate state of the thread wrapper may become incorrect. The modification of the whole set of such properties should be mutually exclusive. Actually, this is the general purpose of a mutex.
That said, all such properties should simply share the same instance of the mutex, which can be done by using the constructors with the sharedMutex parameter. Other ways to share a mutex are using InterlockedProperty& operator=(InterlockedProperty&) (only works for two instances of InterlockedProperty of the same template parameter type) or UseSharedMutex(std::mutex&) (can be called, for example, from the body of a thread wrapper constructor).
Now, two of the constructors can be used to initialize the property value. Note that the modification of the property value in the constructors is not interlocked. First, it is not needed. More importantly, it won’t work in all cases. One such case is shown in the usage sample below. In this sample, the shared mutex is a member of the thread wrapper class, and it is passed to the property instances in a constructor’s initializer list, where the InterlockedProperty constructors are called, but the mutex object is not yet fully constructed at that time. It can be easily observed under the debugger, and the class std::mutex is one which cannot be used for locking in this state. In general, take extra care when calling the functions of objects passed in a constructor’s initializer list.
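The initialization order issue can be sketched with a simplified property that only stores the address of the mutex in its constructor and locks it on later access. This is an illustration under stated assumptions, not the article's InterlockedProperty; GuardedInt, Owner and DemoSharedMutex are hypothetical names.

```cpp
#include <cassert>
#include <mutex>

// Hypothetical simplified property: the constructor stores the mutex address, never locks it.
class GuardedInt {
public:
    GuardedInt(std::mutex& shared, int initial) : mutex(&shared), value(initial) {}
    int Get() const {
        std::lock_guard<std::mutex> lock(*mutex);
        return value;
    }
    void Set(int newValue) {
        std::lock_guard<std::mutex> lock(*mutex);
        value = newValue;
    }
private:
    std::mutex* mutex;
    int value;
};

class Owner {
public:
    // Members are constructed in declaration order, not in initializer-list order:
    // low and high are constructed before the mutex they refer to.
    Owner() : low(mutex, 0), high(mutex, 10) {}
    GuardedInt low, high;
private:
    std::mutex mutex; // declared last, so constructed last
};

// Locks happen only after Owner is fully constructed.
int DemoSharedMutex() {
    Owner owner;
    owner.low.Set(3);
    const int sum = owner.low.Get() + owner.high.Get();
    assert(sum == 13);
    return sum;
}
```

An alternative that sidesteps the issue is to declare the mutex before the properties that share it, so that it is constructed first; storing the address without locking works with either order.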
Usage, All Put Together
Now that all parts are explained, I can show a more or less comprehensive usage sample, complete with interlocked properties synchronized together:
using natural = unsigned long long int;

class MyThread : public ThreadWrapper {
public:
    // The property constructors only store the mutex address; they do not lock it.
    MyThread() : id(mutex, 2), name(mutex) {}
    InterlockedProperty<int> id;
    InterlockedProperty<const char*> name, help;
    InterlockedProperty<natural> delayMs;
protected:
    void Body() override {
        auto sleep = [=] {
            std::this_thread::sleep_for(std::chrono::milliseconds(delayMs));
        };
        int count = 0;
        name = oldName;
        while (true) {
            this->SyncPoint(); // throttling and abort take effect here
            std::cout << count++ << help;
            std::cout << "id: " << id << ", name: " << name << std::endl;
            sleep();
        }
    }
private:
    const char* oldName = "none";
    std::mutex mutex;
};

class ThreadWrapperDemo {
    enum class command {
        abort = 'a', quit = 'q', sleep = 's',
        wakeUp = 'w',
    };
    static const char* help() { return " a, q: quit, s: sleep, w: wake up, else: change property; "; }
    static bool commandIs(char c, command cmd) { return (int)cmd == (int)c; }
public:
    static void Run(natural delayMs) {
        const char* newName = "new";
        MyThread thread;
        thread.help = help();
        thread.delayMs = delayMs;
        // thread.PutToSleep(); // option: start the thread in the wait (sleep) state
        thread.Start(); // option: thread.Start(true) for background (detached) mode
        char cmd;
        while (true) {
            std::cin >> cmd;
            if (commandIs(cmd, command::abort) || commandIs(cmd, command::quit)) {
                thread.Abort();
                break;
            } else if (commandIs(cmd, command::sleep))
                thread.PutToSleep();
            else if (commandIs(cmd, command::wakeUp))
                thread.WakeUp();
            else {
                thread.id = thread.id + 1;
                thread.name = newName;
            }
        }
        thread.Join();
    }
};
This code fragment also shows thread termination and the access to thread members from two threads. Two comments show the option of starting a thread in the wait (sleep) state and the option of using the thread in background (detached) mode, still being able to join it.
Compatibility and Build
The whole thread wrapper solution is contained in just two files:
- “ThreadWrapper.h”,
- “InterlockedProperty.h”.
They can be added to any project.
The compiler should support the C++11 standard or later. For GCC, this is an option which should be set to -std=c++11 or, say, -std=c++14.
The demo project is provided in two forms: 1) a Visual Studio 2015 solution and project using the Microsoft C++ compiler and Clang, see “ThreadWrapper.sln”, and 2) a Code::Blocks project using GCC, “ThreadWrapper.cbp”. For all other options, one can assemble a project or a makefile by adding all “*.h” and “*.cpp” files in the code directory “Cpp”, a subdirectory of the directory of the solution file.
I tested the code with Visual Studio 2015, Clang 4.0.0, GCC 5.1.0.
The C++ options included “disable language extensions” (/Za for Microsoft and Clang), which seems to be essential for Microsoft.
Versions
- Initial version, March 20, 2017.
- 1.0, March 21, 2017: Added ConveyorThreadWrapper derived from ThreadWrapper. This class is described in detail in a separate article, Conveyor Thread Wrapper for Modern C++. Re-designed demo application.
- 2.0, March 24, 2017: Changed design of ConveyorThreadWrapper. In ThreadWrapper, extended the set of protected members, in order to support ConveyorThreadWrapper. Thread state synchronization separated from ConveyorThreadWrapper blocking queue synchronization.
- 2.1, October 29, 2017: Fixed a bug in the ThreadWrapper::ExceptionCaught function signature. Must be: virtual void ExceptionCaught(std::exception& exception) {}
Final Notes
A C# project with the Threading.ThreadWrapper class is added for reference purposes. I may decide to write more articles on related topics. In this case, I’ll probably upgrade and share the source code downloadable from the present article page.
I hope for informative feedback, criticism and suggestions on this article. Thank you for your time and patience.