the blog for developers

Problems with Jersey, REST, JSON and UTF-8 [Update]

UTF-8 is always a problem. Unbelievable. It’s 2008 and we still haven’t fixed this. One of my current projects is a JavaScript frontend with a REST backend. The backend stores its data in MySQL (a famous UTF-8 troublemaker) and returns JSON in response to REST calls. The problems start with non-ASCII characters. Somewhere in the call chain, as always, characters don’t get written correctly. MySQL and the JDBC driver should work, the JSP page is UTF-8 (both the @page directive and the meta http-equiv tag), jQuery (which does the AJAX) and JS both know UTF-8, and Jersey should be UTF-8 too. But after some experiments I’m now quite sure that Jersey (the JSR 311 REST framework) is to blame. I’m not sure how to specify UTF-8; this

  @ProduceMime("text/plain;charset=UTF-8")

doesn’t help. Funny, every major project with several frameworks along the call chain and several languages (JS, C, Java) runs into UTF-8 problems somehow. I’m so fed up with this; it’s 2008.

Update: Jersey uses InputStreams for all encodings; StringProvider in particular is relevant to me (see above). Does this work with Unicode?
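To make the symptom concrete, here is a minimal sketch of what happens when one link in the chain turns a String into bytes with one charset and the next link reads those bytes with another. It is plain Java and does not use Jersey at all; the class and method names are made up for illustration.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class, not part of Jersey or JSR 311.
class EncodingMismatchDemo {

    // Encodes text with one charset and decodes the bytes with another,
    // which is what happens when writer and reader disagree.
    static String roundTrip(String text, Charset writtenAs, Charset readAs) {
        byte[] bytes = text.getBytes(writtenAs);
        return new String(bytes, readAs);
    }

    public static void main(String[] args) {
        String umlaut = "\u00fc"; // the letter u with diaeresis

        // Both sides agree on UTF-8: the text survives.
        System.out.println(roundTrip(umlaut, StandardCharsets.UTF_8, StandardCharsets.UTF_8));

        // Written as UTF-8 but read as ISO-8859-1: two garbage characters.
        System.out.println(roundTrip(umlaut, StandardCharsets.UTF_8, StandardCharsets.ISO_8859_1));
    }
}
```

If the server writes bytes in one encoding while the Content-Type header (or the client’s default) says another, the browser sees exactly this kind of garbage, no matter how correct every other layer is.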


About the author: Stephan Schmidt is head of development at brands4friends. He has more than 15 years of internet technology experience and 10 years of experience in agile. He has been head of development, consultant and CTO, and is a speaker, author and blog writer. He specializes in organizing and optimizing software development, helping companies increase productivity with lean software development and agile methodologies. All views are only his own.

Comments

Hi Stephan,

Yes, I think this is a problem with Jersey, thanks for reporting it.

The EG recently found this issue as well and we have updated the JSR-311 specification, version 0.7 [1], to state:

When writing responses, implementations SHOULD respect application-supplied character set
metadata and SHOULD use UTF-8 if a character set is not specified by the application or if the
application specifies a character set that is unsupported.

and this will be implemented in the 0.7 release of Jersey (scheduled for April 18th).
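The rule quoted above can be sketched in plain Java as follows. This is only an illustration of the wording in the specification, not Jersey’s actual implementation; the class name `ResponseCharset` and the method `choose` are hypothetical, and the parsing of the media type string is deliberately simplistic.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of Jersey or JSR 311.
class ResponseCharset {

    // Returns the charset named in a media type string such as
    // "text/plain;charset=ISO-8859-1", or UTF-8 if none is given
    // or the named charset is not supported by this JVM.
    static Charset choose(String contentType) {
        if (contentType != null) {
            for (String part : contentType.split(";")) {
                String param = part.trim();
                if (param.regionMatches(true, 0, "charset=", 0, 8)) {
                    String name = param.substring(8).trim().replace("\"", "");
                    try {
                        if (Charset.isSupported(name)) {
                            return Charset.forName(name);
                        }
                    } catch (IllegalArgumentException e) {
                        // Illegal charset name: fall through to the default.
                    }
                }
            }
        }
        return StandardCharsets.UTF_8;
    }
}
```

A writer for String entities could then encode with `text.getBytes(ResponseCharset.choose(mediaType))` instead of relying on the platform default.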

I should be able to provide you with a specific solution for StringProvider fairly quickly if you are happy working with the latest builds or the trunk.

Paul.

[1] https://jsr311.dev.java.net/
