
David Pollak was right about XML and JSON

David Pollak was right about XML and JSON, but perhaps in a different way. XML cannot be converted to clean JSON by a generic converter.

Suppose we have a shopping cart in XML which we want to convert to JSON:

<cart>
  <items>
    <item><name>one</name></item>
    <item><name>two</name></item>
  </items>
</cart>

One representation in JSON would be (cart could be omitted):

{ "cart": { "items": [ { "name": "one" }, { "name": "two" } ] } }

We convert a list of nodes with the same name to a JSON array; xml2json-xslt does this, for example. What happens if we only have one item in our shopping cart?

<cart>
  <items>
    <item><name>one</name></item>
  </items>
</cart>

Then our converter cannot detect that items is a list and will convert the XML to:

{ "cart": { "items": { "name": "one" } } }

which is semantically something completely different. It is also very unpleasant for the receiver of our JSON, who sometimes gets an array and sometimes an object.
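The flip can be reproduced with a small Python sketch. The function `naive_to_json` is hypothetical and is not how xml2json-xslt is implemented; it only applies the same kind of rule (child names that occur more than once become an array). It keeps the `item` key, so its output is a little more verbose than the JSON above, but the array/object flip is the same.

```python
import json
import xml.etree.ElementTree as ET


def naive_to_json(element):
    """Convert an element to plain data; repeated child names become lists."""
    children = list(element)
    if not children:
        # Leaf element: use its text content.
        return (element.text or "").strip()
    result = {}
    for child in children:
        value = naive_to_json(child)
        if child.tag in result:
            # Second or later occurrence of this name: promote to a list.
            if not isinstance(result[child.tag], list):
                result[child.tag] = [result[child.tag]]
            result[child.tag].append(value)
        else:
            result[child.tag] = value
    return result


two_items = ET.fromstring(
    "<cart><items>"
    "<item><name>one</name></item>"
    "<item><name>two</name></item>"
    "</items></cart>"
)
one_item = ET.fromstring(
    "<cart><items><item><name>one</name></item></items></cart>"
)

print(json.dumps(naive_to_json(two_items)))
# {"items": {"item": [{"name": "one"}, {"name": "two"}]}}
print(json.dumps(naive_to_json(one_item)))
# {"items": {"item": {"name": "one"}}}
```

The same key holds a list in the first result and an object in the second, purely because of the number of items.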

One way to solve the problem is to annotate the XML (looks ugly but works):

<cart>
  <items type="list">
    <item><name>one</name></item>
  </items>
</cart>

and add an additional condition to the XSLT:

or ../@type[.='list']]
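A rough Python equivalent of that rule, as a sketch rather than the actual XSLT change: the hypothetical function `to_json` treats every element carrying `type="list"` as an array, no matter how many children it has.

```python
import xml.etree.ElementTree as ET


def to_json(element):
    """Convert an element; type="list" forces an array regardless of child count."""
    children = list(element)
    if element.get("type") == "list":
        # Annotated as a list: always emit an array, even for zero or one child.
        return [to_json(child) for child in children]
    if not children:
        return (element.text or "").strip()
    # Simplification: repeated names outside an annotated list overwrite each other.
    return {child.tag: to_json(child) for child in children}


one_item = ET.fromstring(
    '<cart><items type="list"><item><name>one</name></item></items></cart>'
)
empty = ET.fromstring('<cart><items type="list"/></cart>')

print(to_json(one_item))  # {'items': [{'name': 'one'}]}
print(to_json(empty))     # {'items': []}
```

With the annotation in place, a cart with one item and even an empty cart both yield an array.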

Or use namespacing (this doesn’t work yet; we get lots of namespaces and misuse XML namespaces)?

<cart>
  <list:items>
    <item><name>one</name></item>
  </list:items>
</cart>

So David was right, though I’m not sure he knew why ;-)

Thanks for listening.

About the author

Stephan Schmidt has been working with internet technologies for the last 20 years. He was head of development, consultant and CTO and is a speaker, author and blog writer. He specializes in organizing and optimizing software development, helping companies increase productivity with lean software development and agile methodologies. All views are only his own.


Comments

Interesting mismatch. You could go the other way though, right? Since JSON does have the notion of an array/list, you would always be able to convert a JSON representation into an XML representation.

stephan

No, it won’t go the other way; see for example

http://stephan.reposita.org/archives/2008/04/04/rest-lean-json-and-xml-from-the-same-code/

{ items: [ 1, 2, 3 ] } cannot be converted to XML, because the type (the element name) of “1” etc. is missing.

One would need { items: { item: [1,2,3] } } to generate XML, which in itself is not nice JSON.
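A minimal Python sketch of this direction, aiming for the `<items><item>…</item></items>` shape. The function `to_xml` and its convention (a list stored under a key becomes repeated elements named after that key) are assumptions made for illustration only.

```python
import xml.etree.ElementTree as ET


def to_xml(name, value):
    """Build an element called `name` from nested dicts, lists and scalars."""
    element = ET.Element(name)
    if isinstance(value, dict):
        for key, child in value.items():
            # A list under a key becomes repeated siblings named after that key.
            members = child if isinstance(child, list) else [child]
            for member in members:
                element.append(to_xml(key, member))
    elif isinstance(value, list):
        # A bare array carries no element name for its members.
        raise ValueError(f"no element name for the members of <{name}>")
    else:
        element.text = str(value)
    return element


print(ET.tostring(to_xml("items", {"item": [1, 2, 3]}), encoding="unicode"))
# <items><item>1</item><item>2</item><item>3</item></items>

try:
    to_xml("items", [1, 2, 3])
except ValueError as error:
    print(error)  # no element name for the members of <items>
```

The verbose JSON converts cleanly, while the lean one leaves the converter without a name for the array members.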

André

Maybe you can solve this problem by using some conventions, e.g. if the name of the parent node is pretty much the same as the name of the child node (items/item), this has to be a list.
I think those conventions are necessary to describe and support the understanding of data structures. Otherwise there is no need for XML schema ;-)
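Such a convention could be sketched in Python as follows. The helper `looks_like_list` and its plural rule are assumptions for illustration; a real convention would have to be agreed on by both sides.

```python
def looks_like_list(parent_name, child_name):
    """Guess that the parent is a collection when its name is the child's name plus 's' or 'es'."""
    return parent_name in (child_name + "s", child_name + "es")


print(looks_like_list("items", "item"))      # True
print(looks_like_list("boxes", "box"))       # True
print(looks_like_list("cart", "items"))      # False
print(looks_like_list("children", "child"))  # False: irregular plurals are missed
```

The last call shows the limit of a purely name-based rule: irregular plurals are not recognized.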

Perhaps I’m missing something in this discussion, but I think the point this problem illustrates is that you can’t do a clean conversion without the XML schema. The schema will define the “maxOccurs” attribute of the “item” element in the “items” element. If it’s greater than one or unbounded, then it can make it an array. If not, then it won’t.
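A sketch of this schema-driven idea in Python, under stated assumptions: the schema fragment `SCHEMA` is hypothetical (the post does not include one), and the helpers `repeatable_names` and `to_json` ignore scoping, so an element name counts as repeatable everywhere once any declaration allows more than one occurrence.

```python
import xml.etree.ElementTree as ET

XS = "{http://www.w3.org/2001/XMLSchema}"

# Hypothetical schema fragment for the cart example; not part of the original post.
SCHEMA = """
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="items">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="item" type="xs:anyType" maxOccurs="unbounded"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>
"""


def repeatable_names(schema_text):
    """Return names of elements declared with maxOccurs greater than 1 or unbounded."""
    names = set()
    for decl in ET.fromstring(schema_text).iter(XS + "element"):
        max_occurs = decl.get("maxOccurs", "1")
        if max_occurs == "unbounded" or int(max_occurs) > 1:
            names.add(decl.get("name"))
    return names


def to_json(element, repeatable):
    """Convert an element; names listed in `repeatable` always become arrays."""
    children = list(element)
    if not children:
        return (element.text or "").strip()
    result = {}
    for child in children:
        value = to_json(child, repeatable)
        if child.tag in repeatable:
            result.setdefault(child.tag, []).append(value)
        else:
            result[child.tag] = value
    return result


one_item = ET.fromstring(
    "<cart><items><item><name>one</name></item></items></cart>"
)
print(to_json(one_item, repeatable_names(SCHEMA)))
# {'items': {'item': [{'name': 'one'}]}}
```

The single item now comes out as a one-element array, because the decision is taken from the schema and not from the number of children.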

stephan

@Andre: Yes, I thought about that; it looks very fragile though.

@David: Yes, a schema is the same as “type” or a namespace like list:items – some meta information about the structure.

But when transforming XML to JSON within the browser, using a schema

a.) is slow, needs another request and isn’t very easy to implement
b.) is not generic: item needs to be defined. If the solution is specific rather than generic, it would be easier to write a list_xml2json.xsl file

See my posting Converting Restricted XML to Good-Quality JSON for something useful that can be done, given some assumptions and some schema information.

stephan

@John: Nice post, informative, thanks

MrZ

Hi EveryOne! :-D
This post is very interesting and highlights a very important problem for those who, like me, want some kind of lightweight web-services protocol… It’s a very difficult discussion, so what follows are only ideas and may be wrong, so…

I think we must separate the problems:

1] XML is extensible because a document on its own has no real semantics. I mean that a text like this means nothing:

one
two

Once I say ‘this tag means that’, everything is OK, and this is the task of DTD and XSD.

2] Namespaces were born to avoid conflicts and to give a way to express some kind of semantic relations…

3] XML has a DOM, and this is important: it is the description of ‘HOW’ and not ‘WHAT’. This is also why XML is useful… Parsing… Parsing…

4] Given the other points, we can try to simulate the DOM rather than XML itself! But one point of attention: this is a way of making a translation… So it’s right to say: “there’s no simple way to translate XML to JSON”… Preserving the value of the X in XML is the real problem…

{
  "name": "cart",
  "type": "ROOT",
  "attributes": [],
  "value": "",
  "childs": [
    {
      "name": "items",
      "type": "NODE",
      "attributes": [],
      "value": "",
      "childs": [
        {
          "name": "item",
          "type": "NODE",
          "attributes": [],
          "value": "",
          "childs": [
            {
              "name": "name",
              "type": "NODE",
              "attributes": [],
              "value": "one",
              "childs": []
            }
          ]
        },
        {
          "name": "item",
          "type": "NODE",
          "attributes": [],
          "value": "",
          "childs": [
            {
              "name": "name",
              "type": "NODE",
              "attributes": [],
              "value": "two",
              "childs": []
            }
          ]
        }
      ]
    }
  ]
}
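A structure like the one above could be produced with a short Python sketch. The function `to_node` is hypothetical; it reuses the field names from the example and assumes that `childs` is always an array, so a single child and several children have the same shape.

```python
import xml.etree.ElementTree as ET


def to_node(element, is_root=True):
    """Describe an element as a DOM-like record; `childs` is always a list."""
    return {
        "name": element.tag,
        "type": "ROOT" if is_root else "NODE",
        "attributes": [
            {"name": key, "value": value} for key, value in element.attrib.items()
        ],
        "value": (element.text or "").strip(),
        "childs": [to_node(child, is_root=False) for child in element],
    }


cart = ET.fromstring(
    "<cart><items><item><name>one</name></item></items></cart>"
)
node = to_node(cart)
items = node["childs"][0]
print(items["name"], len(items["childs"]))  # items 1
```

The price of the uniform shape is the verbosity: every access goes through `childs` and a name lookup.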

Now I can also generate a schema or apply a schema…
For me, the summary is that XML is a textual specification with XSD as its meta-specification, and if I have software that uses it, the complexity can’t be eliminated and must percolate to JSON…
All The Best

MrZ

Hello,
I am surely being completely dumb and missing something very important, but I really don’t understand the problem. I would have converted

<cart>
  <items>
    <item><name>one</name></item>
  </items>
</cart>

to at least

{ "cart": { "items": { "item": { "name": "one" } } } }

Otherwise, you are losing some information in the translation and there is no more structural equivalence between the XML and JSON.

As other people pointed out, converting a one-element nested tag to a singleton list is a matter of 1) conventions or 2) typing. I think in common XML-based tools I use, like Ant or Maven, it is a common assumption that the pattern … denotes a collection.

Regards.

PS: thanks for your excellent articles!

stephan

@Arnaud:

“Otherwise, you are losing some information in the translation and there is no more structural equivalence between the XML and JSON.”

Yes.

There are two problems though: is the child of items an array or an object? If it depends on the number of children, this is very error-prone for the client.

People don’t want to write verbose code in JS to access data. For example, compare:

mycart = ...
mycart.cart.items.item[0].name

vs.

mycart = ....
mycart.items[0].name

Many people consider the second one more readable.
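On the first problem, this is the kind of defensive code a client ends up writing when the shape depends on the number of children. It is sketched in Python here; the helper `as_list` is hypothetical, and JavaScript code would need an equivalent check.

```python
def as_list(value):
    """Wrap a lone object in a list so callers can always iterate."""
    if value is None:
        return []
    return value if isinstance(value, list) else [value]


single = {"items": {"name": "one"}}
several = {"items": [{"name": "one"}, {"name": "two"}]}

for cart in (single, several):
    print([item["name"] for item in as_list(cart["items"])])
# ['one']
# ['one', 'two']
```

Every access to `items` has to go through such a helper, which is easy to forget.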
