Jaa


Putting your synchronization at the correct level -- solution

Of course when I say "solution", I really mean "my solution" because there's not really just one way to do this.  But this is an approach I like and I'll go over why I like it.

Recall the code from the original problem: (there was some very interesting discussion so it's worth clicking to see what people thought) 

     public class Counter
    {
        private string name;
        private int reads;
        private int writes;

        public Counter(string _name)
        {
            name = _name;
            reads = writes = 0;
        }

        public void AddRead()
        {
            lock(this) // point #1 -- do we like the three lines marked point #1 ?
            {
                reads++;
            }
        }

        public void AddWrite()
        {
            lock(this) // point #1 -- do we like the three lines marked point #1 ?
            {
                writes++;
            }
        }

        public void WriteCounts(TextWriter tw)
        {
            lock(this) // point #1 -- do we like the three lines marked point #1 ?
            {
                // point #4 -- do we even like this method for this class?
                tw.WriteLine("{0} {1} {2}", name, reads, writes);
            }
        }
    }

    public class Updater
    {
        private static Counter c1 = new Counter("Category 1");
        private static Counter c2 = new Counter("Category 2");
        private static Object myLock = new Object();

        // this method is called by many threads
        public static void DoWork(bool br1, bool bw1, bool br2, bool bw2)
        {
            lock (myLock)  //  point #2, do we need this? does it even belong?
            {
                if (br1) c1.AddRead();
                if (br2) c2.AddRead();
                if (bw1) c1.AddWrite();
                if (bw2) c2.AddWrite();

                // point #3 could we change the lock so that it didn't include these two slowish statements?

                c1.WriteCounts(Console.Out);
                c2.WriteCounts(Console.Out);
            }
        }
    }

Point #1

I think it's a very bad idea to introduce locking into a low level class like this counter.  It has no awareness of what is going on in the overall scheme of things.  Generally classes like this with synchronization end up having redundant features that add complexity and cost for no benefit.  This is exactly what is happening here.   We have every expectation of having to use several of these counters in tandem and therefore synchronization within any one counter is worthless.

Point #2

The lock here is inescapable.  Indeed it is fundamentally the purpose of this class to coordinate three resources -- the two counters and the output stream.  Removing this lock would put the onus on some higher level code which might otherwise remain blissfully ignorant of the coordination.   Now the opposite could be true: if the higher level code has to do a variety of its own synchronization then perhaps even this level is "too low" but it seems likely that this is a good fit.  Note the static entry point and the presence of high level actions such as logging.

Point #3

As it is written the code is trying to coordinate three resources.  Without changing the format we would be unable to decode the output in any meaningful way.  Furthermore we need some form of synchronization on the output because the API itself is not threadsafe.  We could lament that the writing might be slow, but since we are fundamentally required to write and it must be coordinated, just splitting the lock won't help us -- we'd simply move the blocking elsewhere.  However perhaps there something that can be done to mitigate the cost.  I offer one such suggestion in my solution below.

Point #4

I certainly do not like the Writing functionality at this level -- it seems rather orthogonal to the problem of keeping read and write counts.  Does this really fit?  See this other article for more thinking on these lines.

So what would I suggest instead?  Perhaps something like this (though there are other good options)

 

     public class Counter
    {
        private string name;
        private int reads;
        private int writes;

        public Counter(string _name)
        {
            name = _name;
            reads = writes = 0;
        }

        public void AddRead()
        {
            reads++;
        }

        public void AddWrite()
        {
            writes++;
        }

        public void FormatCounts(StringBuilder sb)
        {
            sb.AppendFormat("{0} {1} {2}", name, reads, writes);
        }
    }

    public class Updater
    {
        private static Counter c1 = new Counter("Category 1");
        private static Counter c2 = new Counter("Category 2");
        private static Object myLock = new Object();
        private static Object myOutputLock = new Object();
        private static StringBuilder sb = new StringBuilder(4096);
        private static long sequenceNumber = 0;

        // this method is called by many threads
        public static void DoWork(bool br1, bool bw1, bool br2, bool bw2)
        {
            StringBuilder sbOut = null;

            lock (myLock)
            {
                if (br1) c1.AddRead();
                if (br2) c2.AddRead();
                if (bw1) c1.AddWrite();
                if (bw2) c2.AddWrite();

                sb.AppendFormat("{0} ", sequenceNumber++);

                c1.FormatCounts(sb);
                sb.Append(" ");
                c2.FormatCounts(sb);

                sb.Append("\r\n");

                if (sb.Length >= 4000)
                {
                    sbOut = sb;
                    sb = new StringBuilder(4096);
                }
            }

            // note other threads can keep doing the main job
            // this thread becomes the output worker temporarily
            // you could do real async i/o here if you wanted
            // output could be out of order but sequence number allows decoding
            if (sbOut != null)
            {
                lock(myOutputLock)
                {
                    Console.WriteLine(sbOut.ToString());
                }
            }
        }
    }

And can you think of three reasons why I like that better? :)

(Also it needs a simple flush method to handle shutdown -- left as an exercise for the reader :))

Comments

  • Anonymous
    May 03, 2006
    I was just thinking, why not make the "FormatCounts" method an overload of "ToString()", or are we not assuming .NET here?

    The most immediate reason I see for you liking this option better is that the counting operation does not get slowed down by the output operation. Other threads can come in one after the other and increment the counter as the output operation may not be as critical as counting.

    Still pondering additional reasons...
  • Anonymous
    May 03, 2006
    Before I get to Rico's Three Reasons, isn't there a bug here?

    After sbOut is set, it's never nulled again. So every subsequent call to DoWork() will result in a (possibly redundant) write. Or am I missing something?

    3 reasons...
    1) Writing is not inside to lock, so you'll have less lock contention
    2) The output is buffered, so there are less writes overall
    3) The locking is pushed up to a higher, more reasonable level

  • Anonymous
    May 03, 2006
    Ack, I just noticed that sbOut is nulled at the top of the function, and it's local to the method anyway. So my bad.

    Why do I always notice this stuff right AFTER hitting submit? :)
  • Anonymous
    May 03, 2006
    I gather from this example that local StringBuilder variables are thread-safe.  Is that correct?

    Are local variables in general thread-safe?

    If one thread could read/write from another's sbOut, you could have the following behavior:

    INIT: sbOut = null;
    PULL/RESET: sbOut = sb; sb = new StringBuilder(...);
    PRINT: sbOut.ToString();

    Thread A: INIT (sbOut == null)
    Thread A: PULL/RESET (sbOut = { blah blah }, sb = {})
    Thread B: INIT (sbOut = null)
    Thread A: PRINT (A prints null.ToString() -- NPE)
    Thread B: PULL/RESET (sbOut = {}, sb = {})
    Thread B: PRINT (B prints "")
  • Anonymous
    May 03, 2006
    >> If one thread could read/write from another's sbOut, you could have the following behavior:

    Lucky it can't then. A variable declared at method level scope can't be accessed from another thread (how would it get access to it?).
  • Anonymous
    May 04, 2006
    > how would it get access to it?

    Well, I was worried in particular about shared representations (reference counts and copy-on-write semantics) not being thread-safe.  But after some thought I've convinced myself that

    lock (...) { sbOut = sb; sb = new ...; }

    is sufficient to protect against that.

    I had to worry about sbOut falling out of scope and decrementing the reference count of its representation.  But I convinced myself that was OK too, for the same reason.
  • Anonymous
    May 04, 2006
    In case there is any doubt, CLR objects are not reference counted.  Although one could imagine a CLI implementation that used reference counts in some way, it would be problematic to say the least as there would be interlocked operations all over the place.
  • Anonymous
    May 04, 2006
    > CLR objects are not reference counted

    So "sbOut = sb;" is an en masse copy of all the stringlets in the StringBuilder?
  • Anonymous
    May 04, 2006
    It's a straight pointer assignment, nothing more.  Since StringBuilder is a reference type the contents of the stringbuilder are not copied.  Also Stringbuilder do not have "stringlets" in them it's only a char array that you append to internally.

    See http://blogs.msdn.com/ricom/archive/2003/12/15/43628.aspx and http://blogs.msdn.com/ricom/archive/2003/12/02/40778.aspx
  • Anonymous
    May 04, 2006
    > It's a straight pointer assignment, nothing more

    Ah, I get it (sorry to be obtuse)
    After "sbOut = sb;", both variables point to the same object.

    StringBuilder a = new StringBuilder("A");
    StringBuilder ab = a;

    ab.Append("B");

    a.ToString(); // "AB", not "A"

    I was overthinking the situation.

    > Stringbuilder ... is only a char array that you append to internally

    I feel silly, now.  I assumed that...

    StringBuilder was a wrapper around a list<char*> l;
    .Append(s) mapped to l.push_back(copy_of(s.c_str()));
    and that .ToString() added up the sizes of all the little buffers, made one big new one, and copied each little buffer into the big buffer sequentially.
  • Anonymous
    May 05, 2006
    Continuing in what's turned into my series on locking advice, I have a new example for you of a typical...
  • Anonymous
    May 08, 2006
    This solution has nice granularity - without introducing a deadlock opportunity.  

    You will always need to lock myLock but myOutputLock much less frequently.  This allows maximum concurrency (protecting only that which needs protected, when it needs protected).

    Deadlocks are avoided since both locks are never held - you are never hanging on to one as you try to acquire the other.  Many fine-grained schemes go too fine and run into this.

    There could be a vulnerable time between the two locks where a thread switch could allow another thread to fill and write the new StringBuilder before this one gets written to the console.  None of the object in memory would get trashed, but the console output might be confusing.