Concurrency, Part 2 - Avoiding the problem
[Yesterday's article on concurrency](https://weblogs.asp.net/larryosterman/archive/2005/02/14/372508.aspx) discussed the basic concepts of concurrency. Now I'd like to start talking about how you deal with concurrency...
The first, and most important, thing to realize about concurrent programming is that it's all about two things: your data and your threads. If you only have one thread, then you don't have to worry about concurrency issues. If you have more than one thread, then you only have to worry about concurrency issues if more than one thread can simultaneously access the same data. And that's my first principle of concurrent programming: If your data is never accessed on more than one thread, then you don't have to worry about concurrency. Again, the guys who get concurrency are cringing at this principle - the reality is (of course) more complicated than that. I'll get back to why it's more complicated later (I need to introduce some more concepts beforehand).
In Win32, in general, there are three ways that you can guarantee that your thread is the only one accessing the data.
The first is your stack. On Win32, the data on your stack is owned by the thread (this might not be true for other architectures, I don't know :(). Unless you explicitly pass pointers to your stack to another thread, you don't have to worry about other threads messing with your stack data, so you don't need to worry about protecting the data.
The second way of ensuring that only one thread can access your data is to use Thread Local Storage, or TLS. The idea behind TLS is that when your process starts, it allocates a "slot" in TLS. That allocation returns you an index into a table, and you can stick whatever value you want into that table. When your thread starts up, you can allocate a block of memory, stick it into the table, and then, later on during the execution of the thread, you can go back and query the value of that block. The block remains per-thread, and can be accessed without protecting the data. This allows you to maintain per-thread context blocks which can be used to hold state that's more global than the stack. Btw, the C runtime library allows you to declare variables in TLS by simply decorating them with __declspec(thread) - there are some caveats about using this, but the facility is available...
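To make the per-thread idea concrete, here is a minimal sketch. It is an assumption-laden stand-in, not the Win32 TLS API: it uses the standard C++11 `thread_local` keyword (roughly the portable cousin of `__declspec(thread)`) and `std::thread`, and the names `t_counter`, `bump` and `run_workers` are made up for illustration.

```cpp
#include <cassert>
#include <thread>
#include <vector>

// Hypothetical per-thread state: every thread gets its own copy of t_counter.
thread_local int t_counter = 0;

// Increments the calling thread's private counter. No lock is needed,
// because no other thread can see this thread's copy.
int bump(int times) {
    for (int i = 0; i < times; ++i) {
        ++t_counter;
    }
    return t_counter;
}

// Starts thread_count workers and records what each one counted.
// Each worker writes only its own element of results, and the caller
// reads them only after join(), so results needs no protection either.
std::vector<int> run_workers(int thread_count, int times) {
    std::vector<int> results(thread_count, 0);
    std::vector<std::thread> workers;
    for (int i = 0; i < thread_count; ++i) {
        workers.emplace_back([&results, i, times] { results[i] = bump(times); });
    }
    for (auto& worker : workers) {
        worker.join();
    }
    return results;
}
```

Calling `run_workers(4, 1000)` should give every worker a count of exactly 1000, while the calling thread's own `t_counter` stays at 0: the workers never touched it.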
The third way of ensuring that only one thread can access your data is simply to be careful in how you write your code. As an example, in my last "What's wrong with this code" article, I purposely allocated the FileCopyBlock structures in one thread, put them on a queue, and executed them in worker threads. As a result, I didn't have to protect the FileCopyBlock fields: since only one thread could ever access the data at a time, they didn't need to be protected. Now, more than one thread did access the data (the block was constructed on the main thread and destructed on the worker threads), but at any given time, no block was accessed by more than one thread. This principle can be applied in a number of different ways. My example was quite simple, but it wouldn't be difficult to imagine an FSM where the state is kept in a block that is enqueued and dequeued based on state transitions; the block would only ever be accessed by one thread at a time and thus wouldn't have to be protected.
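The hand-off pattern can be sketched roughly as follows. This is not the code from the earlier article: `CopyBlock`, `HandoffQueue` and `process_blocks` are hypothetical names, and the small locked queue is only a portable stand-in for whatever queuing facility you actually use. The point to notice is that the queue itself is synchronized, but the fields of the block never are, because ownership moves from one thread to the other.

```cpp
#include <cassert>
#include <condition_variable>
#include <memory>
#include <mutex>
#include <queue>
#include <thread>
#include <utility>

// Hypothetical work item, loosely modeled on the article's FileCopyBlock.
struct CopyBlock {
    int bytes = 0;
    bool processed = false;
};

// Minimal hand-off queue. Only the queue is locked; the blocks are not.
class HandoffQueue {
public:
    void push(std::unique_ptr<CopyBlock> block) {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            items_.push(std::move(block));
        }
        ready_.notify_one();
    }

    // Blocks until an item is available, then transfers ownership to the caller.
    std::unique_ptr<CopyBlock> pop() {
        std::unique_lock<std::mutex> lock(mutex_);
        ready_.wait(lock, [this] { return !items_.empty(); });
        std::unique_ptr<CopyBlock> block = std::move(items_.front());
        items_.pop();
        return block;
    }

private:
    std::mutex mutex_;
    std::condition_variable ready_;
    std::queue<std::unique_ptr<CopyBlock>> items_;
};

// Builds blocks on the calling thread, processes and destroys them on a
// worker thread, and returns the total number of bytes the worker saw.
int process_blocks(int count) {
    HandoffQueue to_worker;
    int total = 0;  // written only by the worker, read only after join()

    std::thread worker([&] {
        for (int i = 0; i < count; ++i) {
            std::unique_ptr<CopyBlock> block = to_worker.pop();
            block->processed = true;  // unprotected: this thread owns the block now
            total += block->bytes;
        }                             // each block is destroyed on the worker
    });

    for (int i = 0; i < count; ++i) {
        std::unique_ptr<CopyBlock> block(new CopyBlock());
        block->bytes = 10 * (i + 1);       // unprotected: not yet handed off
        to_worker.push(std::move(block));  // after this, the producer lets go
    }

    worker.join();
    return total;
}
```

Using `std::unique_ptr` here is a design choice rather than a requirement: once the producer has moved the pointer into the queue it no longer has a way to touch the block, so the "one thread at a time" rule is enforced by the type rather than by discipline alone.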
It turns out that you can write some fairly sophisticated multithreaded code without ever having to worry about synchronizing your shared data. Just by being careful and setting up your data structures appropriately, you can do pretty amazing things.
But, of course, there are times that you can't avoid having more than one thread accessing your data. Tomorrow, I'll talk about some of the ways around that problem.
Edit: Principal->Principle (thanks Mike :))
Comments
Anonymous
February 15, 2005
"the C runtime library allows you to declare variables in TLS by simply decorating them with __declspec(thread)"
I think you mean the Microsoft C Compiler, rather than the "C runtime library", as it is not a function that you can call, but rather a compiler attribute that generates code to call TLS functions automatically.
Anonymous
February 15, 2005
The comment has been removed
Anonymous
February 15, 2005
Being picky - you're getting your principals and your principles mixed up again. :-)
Contributing to the conversation: I'm listening pretty hard. I don't get to do a lot of concurrent work and haven't had much training in it. Most of what I know I've learned from Jeff Richter's excellent books "Programming Applications for Microsoft Windows" and "Programming Server-Side Applications for Microsoft Windows". They were written around 2000-2001, but are still hugely relevant.
This is happening around the right time as I'm currently working on a service to adapt from a narrow-band RF hand-held computer (accessed from a base station over TCP/IP or serial cable) to our custom UDP-based application server. I aim to write it correctly this time, with much use made of asynchronous I/Os and thread pooling where possible.
Anonymous
February 15, 2005
Wouldn't the notes previously linked to about the double-check locking paradigm in Java possibly also apply to the create-enqueue-dequeue paradigm you talk about? I.e., in an architecture with non-ordered writes (like x64?), you can't be sure that the pointer you put into the queue is really filled with appropriate data.
Anonymous
February 15, 2005
CN: Ah - Did you notice how I queued the request? I called QueueUserWorkItem - that's a Win32 API call that handles all the concurrency issues for me.
I'm a firm believer in letting the OS get the concurrency issues right - they're almost certainly more likely to get them right than I am (more on this one in a later article in the series).
Anonymous
February 15, 2005
The comment has been removed
Anonymous
February 15, 2005
Good point Andrew.
In fact, IIRC the original UPnP bug was a result of code that attempted to solve a concurrency issue.
Anonymous
February 15, 2005
Most of the points you make in this article are true in the general case, and not just C/C++ under win32. Of note: While you can't rely on TLS existing in any specific way, most C runtimes have some sort of TLS. Most, I think, make you manage it somewhat more explicitly, though.
I think your stack is always private -- at least from the point your thread is started onward.
Of course, this only applies at all to languages of the same basic sort as C/C++ -- for example, in current Perl, /everything/ is thread-local, unless you explicitly make it otherwise. This means less worrying about concurrency issues, at the price of making starting threads very slow, and communication between threads somewhat cumbersome.
Anonymous
February 15, 2005
> On Win32, the data on your stack is owned by the thread (this might not be true for other architectures, I don't know :().
Me neither, but, yikes, how would this work? Assuming the stack stores return addresses, each thread needs to have a private stack or you lose coherent function calls. I think this is reliably true everywhere.
Anonymous
February 15, 2005
Tom,
I was thinking of some RISC or mainframe-like architectures where the "stack" wasn't really a stack in the conventional sense of the word, but instead a pointer to some kind of per task memory, and thus wasn't necessarily shared.
There are some really weird computer architectures out there.
Anonymous
February 15, 2005
The comment has been removed
Anonymous
February 15, 2005
The comment has been removed
Anonymous
February 15, 2005
Larry,
Sometime in your series on concurrency, could you also cover the following:
1) Techniques to partition your code so that concurrency issues become more apparent. It is easy to get confused between the "object view" and the "thread view" in a program -- threads weave paths through objects in a way that is not obvious at first sight when reading the code.
2) Coding and commenting conventions that might help highlight concurrency issues. An obvious thing is to explicitly mark all shared variables in comments (variables shared across threads, that is). Perhaps also use a naming convention for these variables.
3) Deadlock prevention techniques. (Years ago, Ruediger Asche wrote an article for MSDN that used Petri Nets to detect deadlocks. It was a bit over the top for me at the time though. Note to self: Read and understand it sometime.)
http://msdn.microsoft.com/library/en-us/dndllpro/html/msdn_deadlock.asp
Thanks,
-K
Anonymous
February 15, 2005
Larry, please publicize this. Apologies for offtopic, but this is VERY IMPORTANT.

[Bruce Schneier](http://schneier.com/) reports that SHA-1, a commonly used cryptographic hashing protocol, [has reportedly been broken](http://www.schneier.com/blog/archives/2005/02/sha1_broken.html) by a prestigious research team from Shanghai University. Together with recent attacks on MD5, as [previously covered by /.](http://developers.slashdot.org/developers/04/12/07/2019244.shtml?tid=93&tid=172&tid=8), we need new hashing functions as a matter of urgency, and we need them now.
Anonymous
February 15, 2005
The comment has been removed
Anonymous
February 15, 2005
LocalService and NetworkService have their own profiles so they shouldn't use %systemroot%\temp.
A lot of potential issues with %systemroot%\temp are mitigated by the strong DACLs that are by default assigned to files created there. So as long as you specify CREATE_NEW when creating the file you should be fine.
Anonymous
February 15, 2005
Hi, Larry
IMHO, TLS in Win32 is nearly unusable. At least in EXEs (as opposed to DLLs).
The problem is that in an EXE it's impossible to free the data referred to by a TLS slot.
TLS provides neither "destructors" similar to pthread's, nor any other means to intercept thread exit. So, when, say, ExitThread() is called, there is no way to free data stored in TLS.
Anonymous
February 15, 2005
Good book for other platforms:
http://www.amazon.com/exec/obidos/tg/detail/-/0131900676/qid=1108564092/sr=8-8/ref=sr_8_xs_ap_i8_xgl14/104-2578695-3800761?v=glance&s=books&n=507846
Anonymous
February 15, 2005
Vassili,
If it's an EXE, then you wrote the code to create the threads, you wrote the code to signal that the threads are going to terminate, and you wrote the code that cleans up for the threads.
Since you wrote the thread routine in the first place, why can't you free the memory? That's what the C runtime library does (that's why the C runtime library recommends/requires that you use _beginthread() - it's to do per-thread initialization and tear down of data).
In a DLL, you don't get to control the threads, but in an EXE, you do.
Anonymous
February 16, 2005
Actually you can register callbacks to be run during thread startup or termination with EXEs. It just so happens that VC++ doesn't support this, but it's in the PE spec and (IIRC) LDR supports calling the callbacks at the appropriate times.
Anonymous
February 16, 2005
Yes, in an EXE I have control over the thread entry point. But when I call ExitThread(), the thread dies "here and now" - without getting to the point where the TLS stuff gets freed.
Comparing PTHREADS and Win32 threading makes Win32 threading look slightly "underdone". That goes for TLS destructors and cleanup handlers - stuff obviously implemented via some sort of "thread exit callback". On Win32, the presence of "DLL_THREAD_ATTACH" in DllMain hints that some such callback is available for "private use". And I wonder why this callback wasn't made public...
Anonymous
February 16, 2005
From the doc for ExitThread:
> However, in C++ code, the thread is
> exited before any destructors can be
> called or any other automatic cleanup
> can be performed. Therefore, in C++
> code, you should return from your
> thread function.
It sounds like what you really want to do is have some kind of flag or IPC that tells your thread when to return from the thread proc. Then you would be able to do all your cleanup prior to dying, and you wouldn't even need to call ExitThread.
Anonymous
February 19, 2005
The comment has been removed