the blog for developers

Concurrency Rant: Different Types of Concurrency and Why Lots of People Already use ‘Erlang’ Concurrency

People talk a lot about concurrency. With the rise of multi-core processors, concurrency becomes more important. It’s sad that developers don’t know much about concurrency – and most of them just parrot what they have read in other blogs. I wanted to write this post for quite some time to shed more light into concurrency.

There are mainly two different types of concurrency

  • One Task, many workers: For example parallel Fibonacci Numbers
  • Many Tasks, many workers: For example web requests

They have two different characteristics:

  • First: Need to share data
  • Second: No need to share data (or “not to share in real time”)

Most people think of the first kind, when they discuss concurrency. Although most applications are of the second kind. I wish people would not confuse those two and try to fix the second problem with their solutions, because the second problem is solved sufficiently (think FaceBook). So this post will only discuss the first type of concurrency.

The breakthrough for concurrency of the first kind came with threads and the synchronizd keyword in Java 15 years ago. Before that concurrency was an esoteric topic for niches, with Java it became a topic for every developer. Today most people recognize that threads and synchronized are too low level though and create quite some problems if you don’t know what you do. One half of the blogosphere damns Java for this and favors Erlang style concurrency: Message passing between objects, each object has an inbox (a queue) of messages which it works through and objects send messages to the inbox of other objects, where messages are immutable. The benefits are clear: No shared state, no need to regulate the access to shared state and the freedom to implement the scheduler of the objects the way it works best (not being limited to thread libraries).

concurrency actor Concurrency Rant: Different Types of Concurrency and Why Lots of People Already use Erlang Concurrency

This half proposing Erlang over everything usually doesn’t know what the other half does and think they still use synchronized. But this is only the most basic technique (and not even the best basic one with the advent of atomics). Surprise, enterprise Java developers don’t use threads (directly) and synchronized. I’ll tell you what they do.

There are lots of better methods for concurrency nowadays like Futures, Executor Services and especially Doug Leas Fork/Join framework in Java JSR 166y. The FJ framework splits a task into subtasks, distributes them, solves them and joins the result (in a very clever way with queues which are written on one side and read on the other and tasks which can steal work from others). The algorithm for forking and joining looks like this:

Result solve(Problem problem) {
  if (problem is small)
    directly solve problem
  else {
    split problem into independent parts
    fork new subtasks to solve each part
    join all subtasks
    compose result from subresults
  }
}

As an example to calculate Fibonacci numbers:

class Fib extends FJTask {
  static final int threshold = 13;
  volatile int number; // arg/result

  Fib(int n) { number = n; }
  int getAnswer() {
    if (!isDone())
      throw new IllegalStateException();
    return number;
  }

  public void run() {
    int n = number;
    if (n <= threshold) // granularity ctl
      number = seqFib(n);
    else {
      Fib f1 = new Fib(n − 1);
      Fib f2 = new Fib(n − 2);
      coInvoke(f1, f2);
      number = f1.number + f2.number;
    }
  }
}

(you can use threads or a different scheduler for this to work)

A similar approach to concurrent work is MapReduce as implemented by Hadoop. It also splits work into sub tasks, does a mapping step for transforming data and then reduces the result. It works best for data crunching and reducing input data to output data.

Other techniques developers often use instead of concurrency primitives are communication abstractions, like communicating over concurrent access queues, for example java.util.concurrent.ArrayBlockingQueue (ah queues again! or call them inboxes).

concurrency queue Concurrency Rant: Different Types of Concurrency and Why Lots of People Already use Erlang Concurrency

Threads talk to each other by adding (hopefully immutable) messages to a queue. Sounds familiar? Those can even be distributed (also see Hazelcast for distributed queues). This is very similar to Erlang concurrency, just imagine input queues for all your workers, aka actors.

Scala has an Erlang like message passing actor concurrency implementation. When looking into the Scala code, Scala uses the Fork/Join Framework of Java, the cirlce closes. And Scala uses different schedulers for it's implementation with one thread per actor only being one option.

We can abstract concurrency even more. Jonas writes an excellent piece about abstractions on top of actors in a piece about fault tolerant, Asynchronous concurrency but misses one crucial point:

Actors can simplify concurrent programming and reasoning immensely and I believe that Scala Actors is a key piece in the future Java concurrency puzzle. However, programming with actors and with explicit message passing and message dispatch loops can feel a bit unnatural and unnecessary verbose for Java developers that are used to regular OO method invocations and synchronous control flow.

As pointed out before, people use queues and are used to work with asynchronous flows. Java enterprise developers in particular as they use Enterprise Service Buses (ESBs). There are many implementations like ServiceMix, OpenESB, Camel, OpenMQ, ActiveMQ, Mule and others. They range from pure message buses to routing and integration solutions. Because ESBs are fault-tolerant, asynchronous concurrency, the developers who use them know about asynchronous flows. Companies like LinkedIn uses ESBs to distribute tasks in a fault tolerant and parallel way. Nothing new there. Java developers think in asynchronous messaging already.

Stand back a thousand feet and ESBs look like the Erlang model. Worker cling to the bus and wait for work. Each is listening to different messages, the waiting messages for each worker are a virtual input queue similar to Erlang. There isn't as much difference between concurrency in different languages as people want you to believe there is!

As a final word: It's interesting that the first and second kind of concurrency start to merge. With applications like Twitter or identi.ca, the second type is sometimes becoming the first type of concurrency because different requests need to share data with each other (hopefully you don't use the database for this as Twitter did in the beginning). One can argue that more and more applications need to share data between sessions. You can use an actor model for this (Lift does this). You could use ESBs. Or you could go a very different way with distributed method calls and objects - Terracotta to the rescue.

Thanks for listening. Hope you've learned something. I did by writing this post. As ever, please do share your thoughts and additional tips in the comments below, or on your own blog (I have trackbacks enabled).

Update: Actors Guild for Java looks also interesting

Update 2: If you find this post offending, go back read it again without you favorite programming language in mind which you need to defend. There is nothing to defend. You've chosen the programming language to the best of your knowledge, as have others (and if they haven't but just chose the language because of some hype or others told them to, well, not a good idea). Spread your knowledge, don't feel offended. Be happy if others find solutions to their problems. This is not some kind of football game which you win if you gain enough points against a different language. If the other one is better, use it. If not then don't. Sounds awfully common sense, but we sometimes seem to forget this. Merry xmas.

You can leave a Reply here. Of course, you should follow me on twitter here.

You can share this post!
Do you want to tell others about this article? Use the social bookmark icons to submit this artice to the service of your choice. Thanks.

About the author: Stephan Schmidt is head of development at brands4friends. He has more than 15 years of internet technology experience and 10 years experience in agile. He was head of development, consultant and CTO and is a speaker, author and blog writer. He specializes in organizing and optimizing software development helping companies by increasing productivity with lean software development and agile methodologies. Want to know more? All views are only his own.
Leave a reply.

Comments

Sounds like some Java advocate has Erlang envy…

Yes, if you stand back far enough, you start to see common concurrency patterns in a variety of different technologies. That’s a no-brainer; there are only so many ways to do this sort of stuff. But to then draw the conclusion that language X concurrency primitives are therefore really no different than those of language Y is way off the mark. You’re incorrectly mixing very, very different levels of abstraction.

There are _lots_ of other kinds of parallelism, and you’ve completely missed the most important part of Erlang parallelism: the pi calculus. If you were to pick up a language like Ada or Fortress, you’d know about three more kinds of parallelism overnight. Offhand, I can name nine, and I bet there are quite a few I don’t know about.

It’d be nice if people saying “there are only N kinds” knew what kinds there were.

stephan

@Steve: Envy? As a Scala user? Probably not.

“[...] concurrency primitives are therefore really no different than those of language Y is way off the mark.”

Am I mixing message passing with threads? I don’t think so. I’ve been comparing communicating with immutable messages over queues to Erlang, and to me this looks comparable (scheduler and OTP aside).

stephan

“[...] and you’ve completely missed the most important part of Erlang parallelism: the pi calculus.”

As this is only a blog post and the Doug book is several hundreds of pages long (and the concurrency chapter in the Erlang book on my dessk is much longer than the post), obviously the post is not complete.

It would be nice if you would fill in the gaps and write a post on your own for all of us to learn. Thanks.

“It’d be nice if people saying “there are only N kinds” knew what kinds there were.”

Update: Compare this to “There are mainly two different types of concurrency”. I let the reader find the difference as an exercise.

@Stephan: FWIW the posting definitely comes across as being defensive of Java and its offspring, and as being envious of Erlang.

“I’ve been comparing communicating with immutable messages over queues to Erlang, and to me this looks comparable (scheduler and OTP aside).”

But why is that comparison useful? Message passing is message passing. Is the fact that message passing in one language is comparable to message passing in another language enough to declare that “there isn’t as much difference between concurrency in different languages as people want you to believe there is!” Of course not. And that’s the problem I have with this piece — it glosses over way too many details and as a result reaches conclusions that are inaccurate and unsound.

J. Betancourt

Though I am far from expert in this topic, the responses seem on point.

Yes there are similarities between message based asynchronous architectures and Actor-like systems, but they smell different. A while back I was looking at the JCSP library that puts PI-calculus primitives into Java, it did not look like many of the things you mentioned in Java, though SomnifugiJMS looks interesting as an alternate approach.

I hope you post a followup. I look forward to learning more.

It would be nice if you would fill in the gaps and write a post on your own for all of us to learn. Thanks.

I have been. For a long time. http://sc.tri-bit.com/ and http://fullof.bs/ are both me. I’ve written dozens of complete tutorials on half a dozen languages and serious topics.

Before you get smug about what other people haven’t done, be sure that they haven’t done it.

“It’d be nice if people saying “there are only N kinds” knew what kinds there were.”

Update: Compare this to “There are mainly two different types of concurrency”. I let the reader find the difference as an exercise.

Should it be the case that these were the main two, you might have a point. However, given that neither of the mechanisms you describe adequately explain the Windows model of threads and ropes inside processes, meaning that you’ve missed something like nine tenths of the real world market, you’re really only lampooning yourself by here.

stephan

@Steve: “[...] being defensive of Java and its offspring, and as being envious of Erlang.”

Of course it is defensive that was the goal. There have been dozens sof blogs posts 2008 from people using Erlang with no clue about what other people do.

“[...] them just parrot what they have read in other blogs.”

About Java? Well queues, ESBs, Fork/Join, MapReduce are concepts, not primarily language features. And some Erlang developers seem to forget that message passing, supervisor hierarchies et al are concepts too.

“Message passing is message passing. Is the fact that message passing in one language is comparable to message passing in another language enough to declare that “there isn’t as much difference between concurrency in different languages as people want you to believe there is!”

Yes.

I know it makes Erlang not that wonder language some people think it is. I’m sorry for that.

stephan

@Steve: Of course you’re entitled to have another opinion. That’s mine.

Coming back to the envy thing. I’ve never understood the concept. If I am envy, why shouldn’t I change to another language? Wouldn’t that be the best thing to do? Staying with one programming language and being envy about another doesn’t make sense to me.

stephan

“Should it be the case that these were the main two, you might have a point. However, given that neither of the mechanisms you describe adequately explain the Windows model of threads and ropes inside processes [...]”

I think you might have missed the point again. There are different types of concurrency, one where the parallel running processes/actors/objects/things share information and one where they don’t exchange information. If you know more, you could fill me in.

The other thing is mechanisms to solve the concurrency problems that arise from the first and second kind of concurrency.

But solutions or mechanisms or models and concurrency types are different things.

“I have been. For a long time. http://sc.tri-bit.com/ and http://fullof.bs/ are both me. I’ve written dozens of complete tutorials on half a dozen languages and serious topics.”

From a first look I can’t find relevant tutorials to concurrency using the search on both pages. Perhaps I don’t use the right key words.

@J: SomnifugiJMS is nice, especially if you already know JMS (used if for an intra application bus for Swing applications). I also liked Werx as a bus system – but the open source project is discontinued (It had a nice hierarchy system for messages an was fast).

If you’re interested in concurrency you need to read the Doug Lea book “Java Concurrency in Practice”.

Other than that if you need high performance concurreny you best use:

- Erlang (buy the Erlang book)
- If you have Java knowledge then it would be good to look into the Scala implementations of actors and OTP.
- If you need Erlang concurrency in Java you could take a look at Kilim

http://codemonkeyism.com/archives/2008/06/21/interesting-picture-benchmarking-erlang-versus-java-concurrency/

I’m not sure how Actors Guild performs, I need to look into that

http://actorsguildframework.org/

Write your own implementation :-) What people can learn from Erlang is the inbox model (and thinking in supervisors and restarting processes) and take it into the language of their choice.

stephan

As John wasn’t kind enough to provide some information on the pi calculus in Erlang, as a service to all readers:

http://lamp.epfl.ch/~cremet/publications/pilib.pdf

A library for the pi calculus. As a bonus: Implemented in Scala.

“Should it be the case that these were the main two, you might have a point. However, given that neither of the mechanisms you describe adequately explain the Windows model of threads and ropes inside processes [...]”

I think you might have missed the point again. There are different types of concurrency, one where the parallel running processes/actors/objects/things share information and one where they don’t exchange information. If you know more, you could fill me in.

I tried to; you were too busy feeling correct to look into the hints you were given. Windows parallelism fits neither model. Windows processes do not share memory (at least not without significant contrarian gymnastics, but the same can be said of every pure implementation the average programmer can name); Windows threads do. The reason I brought up the Windows system was supposed to be a counterexample, but you were too busy telling me I’d missed the point to realize that you were missing the point of said counterexample.

I realize it’s very comfortable to repeat what you’ve been told, but try this again. There’s more to the world than “processes can communicate” and “processes can’t communicate”. Parallel channels, identified channels, proxying, masquerading, voluntary yielding, continuations; there’re all sorts of approaches out there which do not map to your very black and white world, and the fact that the world’s dominant parallelism model is one of them should be something of a red flag to you.

It really isn’t as cut and dried as “can or cannot message”, even if you want it to be, and even if you don’t know about the alternatives. I wasn’t at all cryptic about my counterexample; you should have read up on it before assuming you were correct.

From a first look I can’t find relevant tutorials to concurrency using the search on both pages. Perhaps I don’t use the right key words.

Indeed.

http://lamp.epfl.ch/~cremet/publications/pilib.pdf

This was a legitimate mistake to make. That isn’t honestly the pi calculus, even though its authors present it as such. Many of the most important guarantees of the Pi Calculus derive from isolation, which cannot be provided on the Java virtual machine for reasons that I’m sure you’ll ask me to explain at length instead of just researching, since this is apparently my tutorial and my blog, and since you feel comfortable refuting essentially any comment by saying “you haven’t written a book hard enough, therefore you are wrong.”

This may help elucidate the point.

http://www.cs.cmu.edu/~wing/publications/Wing02a.pdf

Frankly, your “as John wasn’t kind enough” comment reeks of contempt; if you had been less standoffish about the help you were offered in the first place, you might have received more. As stands you’ve got a last read and a disinterested goodbye.

stephan

Ugh. Personal attacks.

“The reason I brought up the Windows system was supposed to be a counterexample, but you were too busy telling me I’d missed the point to realize that you were missing the point of said counterexample.”

I’ve read it again and again, but I can’t see how this is a counter example.

Windows process do share state, threads don’t. What am I missing? I’m sorry your responses are to vague to my level of understanding. There might be hints but I’m too stupid to see them.

On the other side the comments just look like name dropping,

“Parallel channels, identified channels, proxying, masquerading, voluntary yielding, continuations;”

It would me nice to elaborate one point instead. With

“can or cannot message”

it seems to me that you change my words all the time:

“* First: Need to share data
* Second: No need to share data (or “not to share in real time”)”

Instead of providing me with some key words to find concurrency examples on your site your only comment is:

“Indeed.”

Thanks for your comments, they might have helped some readers. They haven’t helped me.

We both have different views on the topic. I want to share knowledge, you want to prove I’m stupid. Well I am. Merry xmas.

Hello Stephan,
I am not a specialist of concurrency, just a mere mortal trying sometimes to use it :-) I am wondering where Software Transactional Memory fits, and what do you think about it. Iknow about the Haskell implementation, and it seems there exists JVM based implementations, but I did not try them.

STM based transactions do not exchange data per se, running in isolation, yet by virtue of retry and conditional, they can be synchronized with one another.

Regards,
Arnaud

PS: and don’t feed the troll ;-)

stephan

@Arnaud: I think STM is quite interesting, Clojure seems to have nailed it in it’s implementation. It looks clean and easy to understand.

STM is a mechanism to share state between processes and fits into the first category.

I haven’t tried STM in a real world project but would be eager to, perhaps 2009 I find a suitable project.

There have been some articles along the lines of STM not scaling though.

http://queue.acm.org/detail.cfm?id=1454466

Yes, I am aware of that article. I think however that the overhead observed by the authors and introduced by code instrumentation should be balanced by the ‘abstraction comfort’ STMs provide. FWIW, there is a good overview of STM in CACM.

As yourself, I have only toyed with STM and never used it on real projects, but I feel they are conceptually higher-level than other concurrency mechanisms and hence maybe easier to work with. Moreover, the fact that, at least in Haskell, you can compose STM computations together while keeping its properties is really interesting.

Of course, this is only a ‘feeling’, having dealt mostly with threads and message-passing style concurrency ni my past programming experience.

Merry christmas,
Arnaud

stephan

Merry christmas
Stephan

“completely missed the most important part of Erlang parallelism: the pi calculus”.

Comparing an implementation to a turing-complete formalism is strange. Just because Erlang has syntactic support for message passing doesn’t mean it is an embodiment of pi. Remote object refs perform the same function as erlang pids, FWIW.

Also, pi-calculus is about concurrency, not parallelism.

On a slightly related note, here’s Wil van der Aalst on why he thinks the term “pi-calculus” is thrown around a bit too liberally.
http://is.tm.tue.nl/staff/wvdaalst/publications/p257.pdf

Great post Stephan!
I fully agree.

@Steve V. Why do you think that the message passing feature has to be hardcoded into the language?

If we can get message passing implementing in Scala as a library, I think that could be an advantage over something that is hardcoded into the language.

qedisk

> Threads talk to each other by adding (hopefully immutable) messages to a queue.
Message put in concurrent collections/queues needn’t be immutable, because there is a happens-before relationship between putting and taking the message. It means that the state the message was in when put in queue is fully visible to receiving thread.

I’m getting the feeling that all web apps are going to head into the concurrent access / update class to facilitate gen y friendlyness. For example, Google Reader’s X people also liked this post link, which could easily morph into community discussion area…

Leave a Reply

What people wrote somewhere else:

Additional comments powered by BackType

Guide to CodeMonkeyism

Over the last 4 years I wrote many articles on this blog. To make it easier for you to find the relevant ones, I've organized them into topics.

Top 10

6 reasons why my VC funded startup did fail

Go Ahead: Next Generation Java Programming Style

Java Interview questions: Write a String Reverser

The dark side of NoSQL

7 Bad Signs not to Work for a Software Company or Startup

Is Java dead?

Scala vs. Clojure

Never, never, never use String in Java

No future for functional programming in 2008 – Scala, F# and Nu

Clojure vs Scala, Part 2

Job Seeker

Another Good (Java) Interview Question

7 Bad Signs not to Work for a Software Company or Startup

Java Interview questions: Write a String Reverser (and use Recursion!)

Java Interview questions: Multiple Inheritance

As a Manager: What I value in developers

Top 10 Tips (+1) to Get a Pay Raise

Java Developer

Is Java Dead?

Go Ahead: Next Generation Java Programming Style

Be careful with magical code

All variables in Java must be final

Never, never, never use String in Java

Bending Java: More readable code with methods that do nothing?

Startup/CTO

Development Dream Teams

6 reasons why my VC funded startup did fail

American vs. European style of Software Development

12 Things to Reduce Your Lead Time and Time to Market

The high cost of overhead when working in parallel

Essential storage tradeoff: Simple Reads vs. Simple Writes

Agilist

What Developers Need to Know About Agile

5 Practices Better to Change in Your Scrum Implementation

Scrum is not about engineering practices

ScrumMaster and ZenMaster: The joke of certification

What is Trans-Scrum?