Code Monkeyism

the blog for developers

Using Cassandra with Scala and Akka

With all this talk about NoSQL and new programming languages, I thought I’d try getting Cassandra to work with Scala. Always being interested in productivity, I wanted to know how easy and concise such an integration would be. One option was to use the Java client for Cassandra, since using Java libraries from Scala is rather easy. A library written for Scala promised to be more concise, so I tried Akka.

What is Cassandra? Taken from the homepage:

Cassandra is a highly scalable, eventually consistent, distributed, structured key-value store. Cassandra brings together the distributed systems technologies from Dynamo and the data model from Google’s BigTable. Like Dynamo, Cassandra is eventually consistent. Like BigTable, Cassandra provides a ColumnFamily-based data model richer than typical key/value systems.

Cassandra has a steep learning curve as it’s more than a key/value store. The article WTF is a SuperColumn helped me tremendously.

After installing Cassandra you need to configure your storage. For this short example only one entry is necessary.

<Keyspaces>
    <Keyspace Name="AweSomeApp">
      <ColumnFamily CompareWith="UTF8Type" Name="ShoppingList"/>
    </Keyspace>
</Keyspaces>
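
To make the data model a little more concrete, here is a rough in-memory picture of what the configuration above describes: a keyspace holds column families, a column family maps row keys to rows, and each row holds named columns that are kept sorted by name (CompareWith="UTF8Type"). This is only an illustration with plain Scala collections, not Cassandra's API. The type aliases, the "note" column and the sample values are my own assumptions, and String ordering is just an approximation of the UTF8Type comparator.

```scala
import scala.collection.immutable.SortedMap

object DataModelSketch {
  // Column name -> value; SortedMap approximates the UTF8Type comparator.
  type Row = SortedMap[String, Array[Byte]]
  // Row key -> row
  type ColumnFamily = Map[String, Row]
  // Column family name -> column family
  type Keyspace = Map[String, ColumnFamily]

  // Made-up sample data for the "ShoppingList" column family.
  val aweSomeApp: Keyspace = Map(
    "ShoppingList" -> Map(
      "3" -> SortedMap(
        "note" -> "sample".getBytes("UTF-8"),
        "list" -> "<shopping-list/>".getBytes("UTF-8")
      )
    )
  )

  // Look up a single column, similar in spirit to addressing it by path.
  def column(ks: Keyspace, family: String, key: String, name: String): Option[Array[Byte]] =
    for {
      cf    <- ks.get(family)
      row   <- cf.get(key)
      value <- row.get(name)
    } yield value
}
```

Every lookup step can fail, so the result is an Option: a missing column family, row key or column all yield None.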

What is Akka?

Akka is a framework stack for highly scalable, concurrent applications in Scala and Java. Akka supports different concurrency paradigms like actors and software transactional memory (STM), and offers easy, automatic persistence to storage backends, including Cassandra and MongoDB.

As I’m still using Maven (I know), I wanted to get Akka running with it, which turned out to be easy. I compiled Akka to get a 0.6 JAR and, taking a cue from the Akka Maven description, put it in an embedded repository. For Cassandra I’ve used the 0.4 version.

<repositories>
    <repository>
        <id>project.embedded.module</id>
        <name>Project Embedded Repository</name>
        <url>file://${basedir}/../embedded-repo</url>
    </repository>
</repositories>

Then add a dependency for Akka. The generated Akka JAR contains all of its dependencies, which is nice for prototyping but would need to change in a production environment.

<dependency>
    <artifactId>akka</artifactId>
    <groupId>se.scalablesolutions.akka</groupId>
    <version>0.6</version>
</dependency>

Now your application should build with an Akka dependency.

The Scala code

The example application I’ve written manages a shopping list. I’ve used this example before to illustrate AJAX, REST, JSON and XML. The core is a domain class called ShoppingList. It’s modeled after the domain classes in Lift and uses a composable, domain-driven design. The class is short and contains only a name, an id and a list of items.

class ShoppingList extends Entity
                with Nameable[ShoppingList]
                with Idable[ShoppingList, String] {
  object items extends OneToMany[ShoppingList, Item](this)
}
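
The traits behind this class are not shown here. As a rough idea of how such a composable design can work, the following is a minimal sketch of my own. The Field class and the bodies of Entity, Nameable, Idable, OneToMany and Item are all hypothetical and only support the calls used in this post (get, stringify, list). Each trait contributes one field object that knows its owner, so setters can be chained.

```scala
object DomainSketch {
  trait Entity

  // Hypothetical field type: holds a value and returns its owner on set.
  class Field[O, T](owner: O) {
    private var value: Option[T] = None
    def get: T = value.get // throws if the field was never set
    def apply(v: T): O = { value = Some(v); owner }
    def stringify: String = value.map(_.toString).getOrElse("")
    override def toString: String = stringify
  }

  // Each trait mixes one field into the entity that extends it.
  trait Nameable[O] { self: O =>
    object name extends Field[O, String](self)
  }

  trait Idable[O, T] { self: O =>
    object id extends Field[O, T](self)
  }

  // Hypothetical one-to-many relation, kept as a simple list.
  class OneToMany[O, I](owner: O) {
    private var elements: List[I] = Nil
    def list: List[I] = elements
    def add(item: I): O = { elements = elements :+ item; owner }
  }

  case class Item(description: String)

  class ShoppingList extends Entity
      with Nameable[ShoppingList]
      with Idable[ShoppingList, String] {
    object items extends OneToMany[ShoppingList, Item](this)
  }
}
```

With these definitions a list can be built fluently, for example `new ShoppingList().name("Steffis list").id("3")`.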

Now we want to store an instance of ShoppingList. In which format should the data be written to Cassandra? I decided to start with a plain key/value approach and not use Cassandra for structuring the data. This leaves the question of how to store the value. Besides binary formats there are two obvious contenders: JSON and XML. Both have pros and cons.

With JSON, data from AJAX calls can be stored directly (after checking it) and delivered back to AJAX calls without further processing. The downside of JSON is that it contains less semantic information, which could lead to problems down the road. XML is a better format for long-term storage and enterprise integration, as it contains more semantic information and is easily accessible. The downside: very few people use XML for AJAX, and it’s much more verbose than JSON.

Let’s try both approaches with Scala. Converting an object to XML is easy, as Scala has built-in XML support.

  def toNode(list: ShoppingList) = {
    import list._
    <shopping-list>
      <id>{id.stringify}</id>
      <name>{name}</name>
      <list:items>{items.list.map(Item.toNode(_))}</list:items>
    </shopping-list>
  }

With LiftJson (which is usable without Lift) it’s nearly as easy to generate JSON:

  def toJson(list: ShoppingList) = {
    import list._
    JsonAST.render(
      ("id" -> id.stringify) ~
      ("name" -> name.get) ~
      ("items" -> items.list.map(Item.toJson(_)))
    )
  }

which would result in

{
  "id":"3",
  "name":"Steffis list",
  "items":[{
    "description":"First"
  },{
    "description":"Really Second"
  },{
    "description":"Third"
  }]
}

Reading and writing to Cassandra with Akka

Now to the juicy bits: reading and writing to Cassandra with Akka.

val ENCODING = "UTF-8"
val columnName = "list".getBytes(ENCODING)

def find(id: String) = {
  val column: Option[ColumnOrSuperColumn] =
    CassandraStore.sessions.withSession {
      session =>
        session | (id, new ColumnPath("ShoppingList", null, columnName))
    }

  if (column.isDefined) {
    // Convert to ShoppingList
    ...
    Some(shoppingList)
  } else {
    None
  }
}
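
The isDefined check can also be written with Option.map, which applies the conversion only when a column was found. The sketch below shows the pattern on plain byte arrays. The decode function is a hypothetical stand-in for the conversion step that is elided above and simply turns the bytes back into a string.

```scala
object OptionSketch {
  val ENCODING = "UTF-8"

  // Hypothetical stand-in for the elided conversion: bytes back to a String.
  def decode(bytes: Array[Byte]): String = new String(bytes, ENCODING)

  // None stays None; Some(bytes) becomes Some(decoded string).
  def findValue(column: Option[Array[Byte]]): Option[String] =
    column.map(decode)
}
```

This keeps the "not found" case out of the conversion code entirely.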

Here CassandraStore is a static wrapper around the Akka Cassandra sessions. Obviously this would need to go into an injected dependency or a base class. Storing data in Cassandra is just as easy.

def store(list: ShoppingList) {
  // insert a column
  val value = ShoppingList.toNode(list).toString.getBytes(ENCODING)

  CassandraStore.sessions.withSession {
    session =>
      session ++| (list.id.stringify,
        new ColumnPath("ShoppingList", null, columnName),
        value,
        System.currentTimeMillis)
  }
}
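
One way to replace the static wrapper with an injected dependency is to hide the storage behind a small trait. The sketch below is only an assumption about how that might look: ColumnStore, InMemoryStore and ShoppingListRepository are names I made up, and none of this is Akka's API. The in-memory version is useful for tests and mimics the fact that Cassandra uses the timestamp to decide which write to a column wins.

```scala
import scala.collection.mutable

object StoreSketch {
  // Hypothetical abstraction over the storage backend.
  trait ColumnStore {
    def put(key: String, column: String, value: Array[Byte], timestamp: Long): Unit
    def get(key: String, column: String): Option[Array[Byte]]
  }

  // In-memory stand-in for tests. A write only replaces an existing column
  // if its timestamp is not older. Ties go to the later call here, which is
  // a simplification of my own.
  class InMemoryStore extends ColumnStore {
    private val data = mutable.Map.empty[(String, String), (Array[Byte], Long)]

    def put(key: String, column: String, value: Array[Byte], timestamp: Long): Unit = {
      val isNewer = data.get((key, column)).forall { case (_, ts) => timestamp >= ts }
      if (isNewer) data((key, column)) = (value, timestamp)
    }

    def get(key: String, column: String): Option[Array[Byte]] =
      data.get((key, column)).map(_._1)
  }

  // The repository receives its store instead of reaching for a static object.
  class ShoppingListRepository(store: ColumnStore) {
    private val Encoding = "UTF-8"

    def save(id: String, serialized: String): Unit =
      store.put(id, "list", serialized.getBytes(Encoding), System.currentTimeMillis)

    def load(id: String): Option[String] =
      store.get(id, "list").map(bytes => new String(bytes, Encoding))
  }
}
```

A Cassandra-backed implementation of the same trait could then wrap the session calls shown above, while tests use the in-memory one.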

So far I’m quite satisfied with the code and with how Akka works with Cassandra. Setting up Cassandra in VirtualBox with Ubuntu was easy and is a sane approach when working on Windows. With this code it should be easy for you to get Cassandra going with Scala. Any hints, opinions and suggestions are welcome, as usual, in the comments.
