David Pollak was right about XML and JSON

David Pollak was right about XML and JSON, but perhaps in a different way. XML cannot be converted to (clean) JSON.

Suppose we have a shopping cart in XML which we want to convert to JSON:

<cart>
  <items>
    <item><name>one</name></item>
    <item><name>two</name></item>
  </items>
</cart>

One representation in JSON would be (cart could be omitted):

{ "cart": { "items": [ { "name": "one" }, { "name": "two" } ] } }

We convert a list of nodes with the same name to a JSON array; xml2json-xslt does this, for example. What happens if we only have one item in our shopping cart?

<cart>
  <items>
    <item><name>one</name></item>
  </items>
</cart>

Then our converter cannot detect that items is a list and will convert the XML to:

{ "cart": { "items": { "name": "one" } } }

which is semantically something completely different. It is also very unpleasant for the receiver of our JSON, who sometimes gets an array and sometimes an object.
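To make the flip concrete, here is a minimal Java sketch of such a naive converter. It is not xml2json-xslt; the class name NaiveXmlToJson and its rule (repeated sibling names become an array, a single occurrence becomes a plain value) are assumptions for illustration only. Unlike the abbreviated output above, it keeps the item wrapper, and it does no JSON escaping.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.StringJoiner;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

/** Hypothetical naive converter: repeated sibling names become a JSON array. */
final class NaiveXmlToJson {

    static String convert(String xml) {
        try {
            Element root = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)))
                    .getDocumentElement();
            return "{\"" + root.getTagName() + "\":" + render(root) + "}";
        } catch (Exception e) {
            throw new IllegalArgumentException("could not parse XML", e);
        }
    }

    private static String render(Element element) {
        // Group child elements by tag name, keeping document order.
        Map<String, List<Element>> groups = new LinkedHashMap<>();
        for (Node n = element.getFirstChild(); n != null; n = n.getNextSibling()) {
            if (n instanceof Element) {
                Element child = (Element) n;
                groups.computeIfAbsent(child.getTagName(), k -> new ArrayList<>()).add(child);
            }
        }
        if (groups.isEmpty()) {
            // Leaf element: emit its text as a string (no escaping in this sketch).
            return "\"" + element.getTextContent().trim() + "\"";
        }
        StringJoiner object = new StringJoiner(",", "{", "}");
        for (Map.Entry<String, List<Element>> group : groups.entrySet()) {
            List<Element> sameName = group.getValue();
            String value;
            if (sameName.size() == 1) {
                // One occurrence: the converter cannot know it was meant as a list.
                value = render(sameName.get(0));
            } else {
                StringJoiner array = new StringJoiner(",", "[", "]");
                for (Element e : sameName) {
                    array.add(render(e));
                }
                value = array.toString();
            }
            object.add("\"" + group.getKey() + "\":" + value);
        }
        return object.toString();
    }
}
```

With two item elements the value under "item" is an array; with one it silently becomes an object, so the receiver has to handle both shapes.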

One way to solve the problem is to annotate the XML (looks ugly but works):

<cart>
  <items type="list">
    <item><name>one</name></item>
  </items>
</cart>

and to add an additional condition to the XSLT:

or ../@type[.='list']]

Another option is namespacing (this doesn't work yet: we get lots of namespaces, and it misuses XML namespaces):

<cart>
  <list:items>
    <item><name>one</name></item>
  </list:items>
</cart>
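The type="list" annotation can also be honored outside of XSLT. The following Java sketch is an assumption-laden illustration, not the XSLT condition from above: the class name AnnotatedXmlToJson is made up, it expects repeated names only below a type="list" parent, and it does no JSON escaping.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import java.util.StringJoiner;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

/** Hypothetical converter that treats type="list" as "always emit an array". */
final class AnnotatedXmlToJson {

    static String convert(String xml) {
        try {
            Element root = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)))
                    .getDocumentElement();
            return "{\"" + root.getTagName() + "\":" + render(root) + "}";
        } catch (Exception e) {
            throw new IllegalArgumentException("could not parse XML", e);
        }
    }

    private static String render(Element element) {
        List<Element> children = new ArrayList<>();
        for (Node n = element.getFirstChild(); n != null; n = n.getNextSibling()) {
            if (n instanceof Element) {
                children.add((Element) n);
            }
        }
        // Annotated parent: always an array, whether it has zero, one or many children.
        // The child wrapper name (item) is dropped, as in the lean JSON above.
        if ("list".equals(element.getAttribute("type"))) {
            StringJoiner array = new StringJoiner(",", "[", "]");
            for (Element child : children) {
                array.add(render(child));
            }
            return array.toString();
        }
        if (children.isEmpty()) {
            // Leaf element: emit its text as a string (no escaping in this sketch).
            return "\"" + element.getTextContent().trim() + "\"";
        }
        // Assumption: child names are unique here, so each one becomes a key.
        StringJoiner object = new StringJoiner(",", "{", "}");
        for (Element child : children) {
            object.add("\"" + child.getTagName() + "\":" + render(child));
        }
        return object.toString();
    }
}
```

A cart with a single item now yields an array with one element, so the receiver always sees the same shape.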

So David was right, though I'm not sure he knew why 😉

Thanks for listening.

David Pollak (from Lift): “There’s no way to convert from XML to JSON because XML contains sequences not expressible in JSON”

Hmm[*]. I'm not sure this is true (some converters handle CDATA, #Text and @attributes). For me the problem is rather that there are too many ways to convert XML to JSON: for example the Badgerfish convention, the Google and Yahoo versions, the XML.com way, and the Parker convention.

But the ways to convert XML to JSON in JavaScript are either slow, very basic, rely on XSLT, use nasty regexes, or cannot create simple JSON that feels JSON-like.

* Note to self: Should start using Twitter for this [**].
** Did start Twitter

Update: Any ideas for a good XML to JSON conversion that feels JSON-like (it doesn't need to be bidirectional)?

Update 2: I currently use XSLT with nice results. It doesn't work in Safari or Chrome yet. More to come.

scala.xml.Node, text/xml and Jersey: How to do REST with Scala

Scala has native support for XML in the language via scala.xml.Node:

val message = <message>Hello world</message>

There is an excellent book called scala.xml on XML and Scala so this post won't go into more detail.

Using the XML support in Scala makes it easy to write REST applications with Jersey. I've shown how to create JSON with a JsonBuilder in Scala before.

We need a Resource to handle our REST requests. As a response to a GET request on /helloWorld/xml we create a Scala XML node:

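import javax.ws.rs.{GET, Path, Produces}

@Path("/helloWorld")
class HelloWorld {
  @Path("/xml")
  @GET
  @Produces(Array("text/xml"))
  def helloWorld() = {
    // The XML literal is the response entity.
    <message>Hello world</message>
  }
}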

Jersey needs to know how to translate an object of type scala.xml.Node to an HTTP response. This is usually done by implementing a MessageBodyWriter that maps an object and a MIME type (scala.xml.Node and text/xml in this case) to a response.

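import java.io.{OutputStream, OutputStreamWriter, Writer}
import java.lang.annotation.Annotation
import java.lang.reflect.Type
import javax.ws.rs.Produces
import javax.ws.rs.core.{MediaType, MultivaluedMap}
import javax.ws.rs.ext.{MessageBodyWriter, Provider}

// Note: isWriteable and getSize are kept as in the original post; current
// JAX-RS versions of MessageBodyWriter pass further arguments (such as the
// MediaType) to both methods.
@Provider
@Produces(Array("text/xml"))
class ScalaNodeAdapter extends MessageBodyWriter[scala.xml.Node] {

  // Accept anything that is a scala.xml.Node (Elem, Text, ...).
  def isWriteable(dataType: java.lang.Class[_],
                  typ: Type,
                  annotations: Array[Annotation]) = {
    classOf[scala.xml.Node].isAssignableFrom(dataType)
  }

  // Serialize the node with its own toString.
  def writeTo(node: scala.xml.Node, writer: Writer): Unit = {
    writer.write(node.toString)
  }

  // -1 means the content length is not known in advance.
  def getSize(node: scala.xml.Node) = -1L

  def writeTo(node: scala.xml.Node, aClass: java.lang.Class[_],
              typ: Type,
              annotations: Array[Annotation],
              mediaType: MediaType,
              stringObjectMultivaluedMap: MultivaluedMap[String, Object],
              outputStream: OutputStream): Unit = {
    val writer = new OutputStreamWriter(outputStream)
    writeTo(node, writer)
    writer.close()
  }
}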

Voilà, we now get <message>Hello world</message> when calling /helloWorld/xml.

Thanks for listening.

REST: Lean JSON and XML from the same code

Generating JSON and XML with the same code is difficult. One can create the semantically richer XML and convert it to JSON, but JSON notations for XML like Badgerfish look quite ugly to JSON advocates.

The problem at the core is that XML is typed whereas JSON is not. Every node in XML needs a type, its name, for example <item><id>123</id></item>. JSON doesn't need such a type: { id: 123 } is fine for an item, and { item: { id: 123 } } looks too verbose. This shows especially when getting to the data in JavaScript: var id = item.item.id. The same goes for accessing arrays, with var id = items[0].item.id; instead of var id = items[0].id;. The problem exists with other dynamic languages and data structures too, see Cobra vs. Mongoose for Ruby.

As I am currently developing a REST-based Jersey application in Java, I needed a way to generate lean JSON and XML. Wouldn't it be best to have the same code for both? DRY. My previous solution for generating JSON worked fine. The $(...) method calls create a node tree with nodes and lists. With a JsonRenderer and the Visitor pattern I generate JSON from the node tree. The problem was that this Java code

$( 
  $("id", listId),
  $("items", 
    ...
   )
);

created nice JSON like { id: 123, items: [ ... ] }, but could not generate XML. As written above, the outer list has no type, and an XmlRenderer therefore cannot render <shoppinglist><id>123</id>...</shoppinglist>.

The solution I thought about is to add type information to nodes which have no names.

$( type("shoppinglist"),
  $("id", listId),
  $("items",
    ...
  )
);

The implementation uses a simple static method and a Type class.

public static Type type(String name) {
    return new Type(name);
}

The type is attached to the node, and if the node has no name but a type, the XmlRenderer uses the type instead of the name. The JsonRenderer doesn't use the type information and renders the same JSON as before. The piece of Java code now generates XML

<shoppinglist>
  <id>123</id>
  <items>
    <item><id>234</id><price></price><shop></shop>
      <description>Apple</description></item>
    <item><id>233</id><price></price><shop></shop>
      <description>Banana</description></item>
  </items>
</shoppinglist>

and lean JSON where neither shoppinglist nor item has a type

{ id: "123", items: [ { id: 234, price: "", shop: "", description: "Apple"}, { id: 233, price: "", shop: "", description: "Banana"} ]}
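The idea of one tree and two renderers can be sketched in a few lines of Java. This is not the code described above: the class LeanTree, the Kind enum, the list(...) helper and the direct toJson/toXml methods (instead of a Visitor) are all assumptions made for illustration, and the sketch does no escaping.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.StringJoiner;

/** Hypothetical sketch; none of these names are taken from the original code. */
final class LeanTree {
    enum Kind { LEAF, OBJECT, ARRAY }

    /** Element name that only the XML renderer looks at. */
    static final class Type {
        final String name;
        Type(String name) { this.name = name; }
    }

    static final class Node {
        final Kind kind;
        final String name;          // null for anonymous nodes
        final Type type;            // null when no type was attached
        final Object value;         // only set for leaves
        final List<Node> children;  // empty for leaves

        Node(Kind kind, String name, Type type, Object value, List<Node> children) {
            this.kind = kind;
            this.name = name;
            this.type = type;
            this.value = value;
            this.children = children;
        }
    }

    static Type type(String name) { return new Type(name); }

    /** Named leaf, e.g. $("id", 123). */
    static Node $(String name, Object value) {
        return new Node(Kind.LEAF, name, null, value, Collections.<Node>emptyList());
    }

    /** Anonymous object that carries a type for the XML renderer. */
    static Node $(Type type, Node... children) {
        return new Node(Kind.OBJECT, null, type, null, Arrays.asList(children));
    }

    /** Anonymous object without a type: fine for JSON, not renderable as XML. */
    static Node $(Node... children) {
        return new Node(Kind.OBJECT, null, null, null, Arrays.asList(children));
    }

    /** Named list (assumed helper; how the original marks lists is not shown). */
    static Node list(String name, Node... items) {
        return new Node(Kind.ARRAY, name, null, null, Arrays.asList(items));
    }

    /** JSON ignores the type: anonymous nodes stay anonymous. */
    static String toJson(Node node) {
        if (node.kind == Kind.LEAF) {
            return node.value instanceof Number ? node.value.toString() : "\"" + node.value + "\"";
        }
        boolean array = node.kind == Kind.ARRAY;
        StringJoiner out = array ? new StringJoiner(",", "[", "]") : new StringJoiner(",", "{", "}");
        for (Node child : node.children) {
            out.add(array ? toJson(child) : "\"" + child.name + "\":" + toJson(child));
        }
        return out.toString();
    }

    /** XML needs an element name: use the node name, else fall back to the type. */
    static String toXml(Node node) {
        String tag = node.name != null ? node.name : (node.type != null ? node.type.name : null);
        if (tag == null) {
            throw new IllegalStateException("anonymous node needs a type to be rendered as XML");
        }
        StringBuilder body = new StringBuilder();
        if (node.kind == Kind.LEAF) {
            body.append(node.value);
        } else {
            for (Node child : node.children) {
                body.append(toXml(child));
            }
        }
        return "<" + tag + ">" + body + "</" + tag + ">";
    }
}
```

The same tree, built with type("shoppinglist") on the root and type("item") on each list entry, can then be passed to either LeanTree.toJson or LeanTree.toXml. Only the XML side ever reads the type, which is why the JSON stays lean.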

The next step is to automatically apply the right renderer, toXml or toJson, from within Jersey. Content negotiation then chooses the format the client accepts. Attributes (meta-information?) are not solved yet, and I'm not sure whether they are needed or how to nicely add meta-information to the $(...) tree. There is some discussion in the context of markup builders and attributes on James's blog.
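To illustrate what choosing a renderer by content negotiation means, here is a small standalone Java sketch that picks a format from an Accept header. It is not how Jersey does it internally. The class FormatChooser, the supported media types and the JSON default for */* are assumptions, and the sketch only compares q-values, ignoring the specificity rules of full HTTP negotiation.

```java
import java.util.Locale;

/** Hypothetical helper: maps an Accept header to "json" or "xml". */
final class FormatChooser {

    /** Returns the format with the highest q-value, or null if nothing supported is accepted. */
    static String choose(String acceptHeader) {
        String best = null;
        double bestQ = 0.0;
        for (String part : acceptHeader.split(",")) {
            String[] pieces = part.trim().split(";");
            String mediaType = pieces[0].trim().toLowerCase(Locale.ROOT);
            double q = 1.0; // a missing q parameter means full preference
            for (int i = 1; i < pieces.length; i++) {
                String param = pieces[i].trim();
                if (param.startsWith("q=")) {
                    q = Double.parseDouble(param.substring(2));
                }
            }
            String format = formatFor(mediaType);
            // Strictly greater: on a tie the first entry in the header wins.
            if (format != null && q > bestQ) {
                best = format;
                bestQ = q;
            }
        }
        return best;
    }

    private static String formatFor(String mediaType) {
        switch (mediaType) {
            case "application/json":
                return "json";
            case "text/xml":
            case "application/xml":
                return "xml";
            case "*/*":
                return "json"; // assumption: JSON is the default format
            default:
                return null;
        }
    }
}
```

A resource could then call the JSON or the XML renderer depending on the returned format.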

Probably the code will be released as an open source RESTkit if someone is interested.

Thanks for listening.