Java Posts

The top posts on CodeMonkeyism about Java, the past, the future and best practices.

  1. Go Ahead: Next Generation Java Programming Style
  2. Is Java Dead?
  3. Be careful with magical code
  4. All variables in Java must be final
  5. Never, never, never use String in Java
  6. Bending Java: More readable code with methods that do nothing?

And some posts about Java interview questions:

  1. Java Interview questions: Write a String Reverser (and use Recursion!)
  2. Java Interview questions: Multiple Inheritance
  3. Another Good (Java) Interview Question

ActiveMQ vs. Jabber

If you have or plan an application with synchronous communications over an external API, it will sooner or later break. Why do we need asynchronous communications? Matt Tucker is clear about that:

Take, for example, Twitter. High Scalability recently covered the load stats on Twitter reporting that they average 200-300 connections per second with spikes that climb to 800 connections per second. Their MySQL server handles 2,400 requests per second! Recently, the [2008] Macworld keynote became the most recent culprit for causing Twitter to cut off its API, which has 10x the load of their website.

When one of my web pet projects needed a messaging backbone which extends to the browser. Whenever a resource did change on the server, all users watching the resource should get a notification without need to reload their browser. Two candidates are Javascript for ActiveMQ, which uses Comet

ActiveMQ supports Ajax which is an Asychronous Javascript And Xml mechanism for real time web applications. This means you can create highly real time web applications taking full advantage of the publish/subscribe nature of ActiveMQ.

ActiveMQ is a messaging bus, often used as an Enterprise Service bus as mentioned in my recent concurrency rant. Components can send messages to the bus and subscribe to topics.


smokin
Creative Commons License photo credit: mudpig

The other unsuspected contender is Javascript for Jabber. Jabber with the XMPP protocol is usally used for sending chat messages. Comparing these two and my thoughts:

ActiveMQ

  • Standard solution, JMS based
  • Routing solutions like Camel available
  • Easy access for different languages via Stomp
  • Attach Jabber as a service
  • Notification easily over topics

Jabber

  • Free OpenFire server
  • Messaging with only one user with UUID for resource which did change
  • Messaging with many users, who join one chat room
  • Chat rooms as topics
  • Server side filtering? How to make it secure, that people only get their own messages?

In the end I decided to go with Jabber/XMPP. The main points for me have been:

  • Server does scale to connections
  • Chat client can be used for debugging
  • Very easy to use with different programming languages
  • Presence protocol to detect services
  • Easy to implement additional chat solution

This worked quite well as a spike. I followed a similar mode as Adrian Sutton, who had good experiences with Jabber/XMPP too when spiking a cache solution:

We grabbed the Smack API and started playing with it and quickly discovered that sending and receiving messages was ridiculously easy. It turns out that the absolute simplest way you can minimize stale data in your caches is to simply have all the servers join a preconfigured chat room. Whenever they save a change to a resource they send a message to the room with the unique ID of that resource and whenever they receive a message from the room they assume it’s a unique ID and remove any cached versions of that resource.

Though I had some major problems accessing Jabber consistently from Javascript. With more on messaging in the backend, I would have went with ActiveMQ as a message bus. And perhaps I might move to ActiveMQ in the backend and then I’m still free to attach Jabber on top of that and keep the frontend code. Best from two worlds.

Think innovative, use technologies in a way to help you. Jabber/XMPP is more than a chat protocol.

Concurrency Rant: Different Types of Concurrency and Why Lots of People Already use ‘Erlang’ Concurrency

People talk a lot about concurrency. With the rise of multi-core processors, concurrency becomes more important. It’s sad that developers don’t know much about concurrency – and most of them just parrot what they have read in other blogs. I wanted to write this post for quite some time to shed more light into concurrency.

There are mainly two different types of concurrency

  • One Task, many workers: For example parallel Fibonacci Numbers
  • Many Tasks, many workers: For example web requests

They have two different characteristics:

  • First: Need to share data
  • Second: No need to share data (or “not to share in real time”)

Most people think of the first kind, when they discuss concurrency. Although most applications are of the second kind. I wish people would not confuse those two and try to fix the second problem with their solutions, because the second problem is solved sufficiently (think FaceBook). So this post will only discuss the first type of concurrency.

The breakthrough for concurrency of the first kind came with threads and the synchronizd keyword in Java 15 years ago. Before that concurrency was an esoteric topic for niches, with Java it became a topic for every developer. Today most people recognize that threads and synchronized are too low level though and create quite some problems if you don’t know what you do. One half of the blogosphere damns Java for this and favors Erlang style concurrency: Message passing between objects, each object has an inbox (a queue) of messages which it works through and objects send messages to the inbox of other objects, where messages are immutable. The benefits are clear: No shared state, no need to regulate the access to shared state and the freedom to implement the scheduler of the objects the way it works best (not being limited to thread libraries).

This half proposing Erlang over everything usually doesn’t know what the other half does and think they still use synchronized. But this is only the most basic technique (and not even the best basic one with the advent of atomics). Surprise, enterprise Java developers don’t use threads (directly) and synchronized. I’ll tell you what they do.

There are lots of better methods for concurrency nowadays like Futures, Executor Services and especially Doug Leas Fork/Join framework in Java JSR 166y. The FJ framework splits a task into subtasks, distributes them, solves them and joins the result (in a very clever way with queues which are written on one side and read on the other and tasks which can steal work from others). The algorithm for forking and joining looks like this:

As an example to calculate Fibonacci numbers:

(you can use threads or a different scheduler for this to work)

A similar approach to concurrent work is MapReduce as implemented by Hadoop. It also splits work into sub tasks, does a mapping step for transforming data and then reduces the result. It works best for data crunching and reducing input data to output data.

Other techniques developers often use instead of concurrency primitives are communication abstractions, like communicating over concurrent access queues, for example java.util.concurrent.ArrayBlockingQueue (ah queues again! or call them inboxes).

Threads talk to each other by adding (hopefully immutable) messages to a queue. Sounds familiar? Those can even be distributed (also see Hazelcast for distributed queues). This is very similar to Erlang concurrency, just imagine input queues for all your workers, aka actors.

Scala has an Erlang like message passing actor concurrency implementation. When looking into the Scala code, Scala uses the Fork/Join Framework of Java, the cirlce closes. And Scala uses different schedulers for it’s implementation with one thread per actor only being one option.

We can abstract concurrency even more. Jonas writes an excellent piece about abstractions on top of actors in a piece about fault tolerant, Asynchronous concurrency but misses one crucial point:

Actors can simplify concurrent programming and reasoning immensely and I believe that Scala Actors is a key piece in the future Java concurrency puzzle. However, programming with actors and with explicit message passing and message dispatch loops can feel a bit unnatural and unnecessary verbose for Java developers that are used to regular OO method invocations and synchronous control flow.

As pointed out before, people use queues and are used to work with asynchronous flows. Java enterprise developers in particular as they use Enterprise Service Buses (ESBs). There are many implementations like ServiceMix, OpenESB, Camel, OpenMQ, ActiveMQ, Mule and others. They range from pure message buses to routing and integration solutions. Because ESBs are fault-tolerant, asynchronous concurrency, the developers who use them know about asynchronous flows. Companies like LinkedIn uses ESBs to distribute tasks in a fault tolerant and parallel way. Nothing new there. Java developers think in asynchronous messaging already.

Stand back a thousand feet and ESBs look like the Erlang model. Worker cling to the bus and wait for work. Each is listening to different messages, the waiting messages for each worker are a virtual input queue similar to Erlang. There isn’t as much difference between concurrency in different languages as people want you to believe there is!

As a final word: It’s interesting that the first and second kind of concurrency start to merge. With applications like Twitter or identi.ca, the second type is sometimes becoming the first type of concurrency because different requests need to share data with each other (hopefully you don’t use the database for this as Twitter did in the beginning). One can argue that more and more applications need to share data between sessions. You can use an actor model for this (Lift does this). You could use ESBs. Or you could go a very different way with distributed method calls and objects – Terracotta to the rescue.

Thanks for listening. Hope you’ve learned something. I did by writing this post. As ever, please do share your thoughts and additional tips in the comments below, or on your own blog (I have trackbacks enabled).

Update: Actors Guild for Java looks also interesting

Update 2: If you find this post offending, go back read it again without you favorite programming language in mind which you need to defend. There is nothing to defend. You’ve chosen the programming language to the best of your knowledge, as have others (and if they haven’t but just chose the language because of some hype or others told them to, well, not a good idea). Spread your knowledge, don’t feel offended. Be happy if others find solutions to their problems. This is not some kind of football game which you win if you gain enough points against a different language. If the other one is better, use it. If not then don’t. Sounds awfully common sense, but we sometimes seem to forget this. Merry xmas.