
Anatomy of a Flawed Clojure vs. Scala LOC Comparison

Update: The author of the post did the same biased LOC counting with Python vs. Clojure. Dhananjay Nene has a nice post about the Clojure/Python comparison.

Update 2: He did it with Java and PHP too. What a disservice to Clojure.

I came across an article comparing Clojure and Scala. As I wrote one too (Part 2), I’m naturally interested in the topic. But the article turned out to be FUD. Wikipedia says about FUD:

Fear, Uncertainty, and Doubt (FUD) is a tactic of rhetoric and fallacy used in sales, marketing, public relations, politics and propaganda. FUD is generally a strategic attempt to influence public perception by disseminating negative information designed to undermine the credibility of their beliefs.

Does the appearance of FUD articles mean the language war has started? I think Scala and Clojure appeal to different audiences, but both seem to be in the race to replace Java as the next mainstream thing (@headius: no, I don’t think JRuby is the next Java thing, although you do excellent work with JRuby. I believe it’s the next Ruby thing).

In the spirit of Hacknot’s essay Anecdotal Evidence and Other Fairy Tales, I try to give some thorough criticism and new facts concerning the original blog post. Hacknot writes:

“As software developers we place a lot of emphasis upon our own experiences. This is natural enough, given that we have no agreed upon body of knowledge to which we might turn to resolve disputes or inform our opinions. Nor do we have the benefit of empirical investigation and experiment to serve as the ultimate arbiter of truth, as is the case for the sciences and other branches of engineering – in part because of the infancy of Empirical Software Engineering as a field of study; in part because of the difficulty of conducting controlled experiments in our domain. Therefore much of the time we are forced to base our conclusions about the competing technologies and practices of software development upon our own (often limited) experiences and whatever extrapolations from those experiences we feel are justified. An unfortunate consequence is that personal opinion and ill-founded conjecture are allowed to masquerade as unbiased observation and reasoned inference.”

1. Are the lines of code really that different?

I’m very interested in Lines of Code (LOC) comparisons, as I’ve learned a lot from my LOC comparison of Java and Python (2x to 4x more LOC in Java). The numbers the author gives are:

  1. Scala 54
  2. Clojure 19

The difference in LOC looks to be due in large part to missing library calls in Scala. The Scala version has to define its own using method and its own code for writing and reading a file. While using is part of my standard toolkit in Scala projects, I must admit I do not know of a widely used library which provides such a method (perhaps scalaz or scalax?).
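As a minimal sketch of what such a using helper could look like (the object name Loan and the exact signature are my assumptions for illustration, not the code from the original post):

```scala
import java.io.Closeable

object Loan {
  // Loan pattern: hand the resource to `body` and close it afterwards,
  // whether `body` returns normally or throws.
  def using[A <: Closeable, B](resource: A)(body: A => B): B =
    try body(resource)
    finally resource.close()
}
```

With a helper like this in scope, a call such as using(new BufferedReader(new FileReader(file))) { reader => ... } closes the reader for you. It is a handful of lines that a library could provide, which is the point: they say little about the language itself.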

Reading files is a different matter. Reading a little about Guava would have made it clear that it doesn’t make sense, at least in Javaland, to compare LOC based on missing library functionality.

Using Guava

val lines = Files.readLines(file, Charsets.UTF_8) // returns a java.util.List[String]

instead of

implicit def file2String(file: File): String = {
  val builder = new StringBuilder
  using (new BufferedReader(new FileReader(file))) { reader =>
    var line = reader.readLine
    while (line != null) {
      builder.append(line).append('\n')
      line = reader.readLine
    }
  }
  builder.toString
}

results in 1 line of code compared to 11, so 10 lines too many in the Scala example. Or, as some have pointed out, using fromFile from scala.io.Source makes the example just as short.
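For illustration, here is a rough sketch of the scala.io.Source variant. The object and method names (ReadFile, slurp) and the UTF-8 default are my assumptions; the encoding is passed explicitly because, as noted in the comments, relying on the default caused trouble for at least one reader.

```scala
import java.io.File
import scala.io.Source

object ReadFile {
  // Reads the whole file into one String using an explicit character encoding.
  def slurp(file: File, encoding: String = "UTF-8"): String = {
    val source = Source.fromFile(file, encoding)
    try source.mkString
    finally source.close() // release the underlying file handle
  }
}
```

If you do not care about closing the source, this shrinks to the one-liner Source.fromFile(file, encoding).mkString, which is comparable in length to the Guava call.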

Besides that, there is plainly code missing in the Clojure example compared to the Scala one. I thought I might be wrong, since comparing LOC and then just leaving out lines looked too obvious a flaw, but the code isn’t the same, it seems. Time measurements, comments and println calls are simply left out: at least 7 lines of code.

2. Micro benchmarks are most often flawed

Everything that has been said before about micro benchmarking applies, and it seems it has to be said again to each new generation of JVM developers:

Toward the end of writing high-performance code, developers often write small benchmark programs to measure the relative performance of one approach against another. Unfortunately [...] assessing the performance of a given idiom or construct in the Java language is far harder than it is in other, statically compiled languages.

The Java VM is a strange beast, with startup times, HotSpot inlining, escape analysis and compiler optimizations. Writing the code a little differently or making loops a little larger might make the code much faster, or slower. So all micro benchmarks on the JVM must be sceptically analyzed and questioned.
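To make the warm-up problem concrete, here is a deliberately naive timing sketch. The names (NaiveBench, measure) are hypothetical, and this is not how the original posts measured anything. It runs the code a few times untimed before measuring, which reduces, but by no means removes, the effects described above.

```scala
object NaiveBench {
  // Runs `body` `warmup` times untimed, then returns one timing per run in nanoseconds.
  def measure[A](warmup: Int, runs: Int)(body: => A): Seq[Long] = {
    var sink: Any = null // hold on to results so the work is less likely to be optimized away
    for (_ <- 1 to warmup) sink = body
    (1 to runs).map { _ =>
      val start = System.nanoTime()
      sink = body
      System.nanoTime() - start
    }
  }
}
```

Even with warm-up the individual timings can vary a lot between runs and between JVM versions, so a single number from a loop like this proves little. A dedicated harness such as JMH is the safer choice for anything you want to publish.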

3. Too many type annotations in Scala?

The next claim about Scala is the excessive usage of type annotations, or said differently:

As a Scala developer you spent much of your time writing out variables types which must be tedious, [...]

When counting all type annotations (not the object constructors), I end up with 150 characters of source code.

File
Closeable
Unit
Closeable
Closeable
B
B
B
Map[K, V]
[K, V]
Tuple2[K, V]
Tuple2[K, V]
Boolean
Map
[String, Int]
File
File
Unit

Compared to the 2039 characters of the whole script, this results in 7.4% type annotations in Scala. Not something I would call “much of my time”: it is only 7.4% of my time writing code, which in itself is only a small part of development.
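The arithmetic behind that figure is easy to check. The small helper below is only a sketch (Shares and percent are names I made up); the two inputs are the character counts given above.

```scala
object Shares {
  // Percentage of `part` in `whole`, rounded to one decimal place.
  def percent(part: Int, whole: Int): Double =
    math.round(part * 1000.0 / whole) / 10.0
}
```

Shares.percent(150, 2039) evaluates to 7.4.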

4. Other criticisms

There are more lines of code missing in the comparison. As one comment noted:

More … the Scala code does som checking to see if the file exists or not. Your Clojure code does not do that. That will bump up the LOC by 4 – 5 lines i guess

to which the author replies

In regards to checking if the directory exists, its a matter of adding (when-let (dir (file-seq …)) :)

But don’t read this as the authoritative comparison between Ruby and Clojure, it’s just a fun little exercise.

Two remarks: Exactly, the line is missing and should have been added from the beginning. And the title, tone and content of the post, especially the insistence that Clojure is both faster and has fewer LOC, don’t reflect a “fun little exercise” but look quite serious.

Conclusion

The conclusion is simple. Comparing languages based on LOC is a dangerous thing and shouldn’t be taken lightly. The comparison is flawed: the Scala example could be reduced by 17 LOC (10 lines of file reading plus at least 7 lines that are missing from the Clojure version, an astonishing 31% of the 54 lines), and the type annotations are not “much of the time” but only 7.4%. Keep comparing languages, but based on facts and rigorous maths, not on FUD. That said, if you find flaws in my post, I would be glad to correct them.


About the author: Stephan Schmidt is head of development at brands4friends. He has more than 15 years of internet technology experience and 10 years experience in agile. He was head of development, consultant and CTO and is a speaker, author and blog writer. He specializes in organizing and optimizing software development helping companies by increasing productivity with lean software development and agile methodologies. Want to know more? All views are only his own.


Comments

[...] — the Scala version was intentionally obfuscated for chuckles. For a much more eloquent rebutal of flawed LOC comparisions read Stephan Schmidt’s latest post on the subject. ↩ No Comments, Comment or [...]

The problem wasn’t that the OP was FUD… it was that it wasn’t FUD *enough*! See my more frightening Scala version for the right way to make my favorite language look better.

http://blog.fogus.me/2010/01/04/comparing-lines-of-code-scala-and-clojure-fud-version/

-m

Jarkko Oranen

In Lau’s defense, I think calling his article FUD is a bit unfair. It might be biased, but I don’t think that’s “strategic” or intentional. These kinds of comparisons are easy to carry out, but it’s also easy to forget to be impartial and ensure that you’re actually comparing apples to apples.

The unfortunate side-effect (hah!) of “vs” articles is that they don’t reflect well on the Clojure community, as someone is bound to take them the wrong way.

As a Clojure person myself, I believe Clojure can stand on its own, and I hope it will, without any need to downplay the merits of other languages. I’m sure most of the community would agree with me.

@Jarkko: It might not be intentional, but then why compare? Why does a Clojure enthusiast compare some code to Scala in a deeply flawed way? Something which will stay in Google for a long time? Is there something like unintentional FUD? ;-)

Zach Cox

I agree that Scala and Clojure are both great Java-replacements and that they appeal to different developers. Regarding the missing library calls in my Scala version, I wanted to use only core libraries, nothing 3rd party. I just commented on my original post (that spawned Lau’s Clojure vs Ruby/Scala post) that gives some more background as well as revised Ruby and Scala versions:

http://blogs.sourceallies.com/2009/12/word-counts-example-in-ruby-and-scala/comment-page-1/#comment-280

Further improvements are always welcome! :)

As said, Scala.io reads a file into a String (via mkString) vs. file2String :-)

Lau hasn’t been so restricted,

“(note: spit and slurp* are from contrib.duck-streams and str-join are from str-utils)”

Zach Cox

When I originally wrote that Scala version, I tried using:

scala.io.Source.fromFile(file).mkString

but got java.nio.BufferUnderflowExceptions, I guess due to Source.DefaultBufSize=2048 maybe? But specifying the character encoding does seem to work:

scala.io.Source.fromFile(file, "ISO-8859-1").mkString

So knock a few more off the Scala LoC count. :)

I think the point well made in this post is that comparisons using LOC had better be made with a clear focus on keeping them apples to apples to the extent possible, and with a reasonably equal level of skill going into writing both (or all) versions of the code.

I agree the comparison is flawed. However, I am not sure it is tantamount to FUD. Perhaps in my mind the threshold for getting that moniker is a little higher. I suspect I would look for some clear, deliberate sales pitch, or alternatively an unnecessarily defensive approach or picking on some really silly, unimportant issues, before I felt comfortable calling it FUD.

I did look up other articles and there’s some evidence of similar flaws in there as well. eg. Python vs Clojure – Reloaded. If you take a look at the #3 Top Rank per group example, it talks of 12 lines of python to 6 lines of clojure, yet another link in the comments http://gist.github.com/214369 shows an alternative implementation in 6 lines with excellent readability. Just to push the envelope I was able to implement exactly the same solution in exactly 1 line with atrocious readability.

I think some programming styles require substantial tradeoffs between readability and brevity. Moreover, writing concise code with high readability may often require substantial skill and awareness of the language’s libraries and ecosystem, a capability the person conducting the analysis may not be equally blessed with in every language.

I think LOCs are important. Perhaps one needs to be very careful about using results of such comparisons, unless the results are consistently observed across many different code samples and across different programmers.

@Nene: Thanks, good insights.

I also think LOCs are important. I hope we will learn more about different languages, apply more scientific rigor and learn how to interpret facts and results (e.g. how fewer lines of code lead, perhaps, to denser lines which are, perhaps, less readable. Though K advocates claim to just “sense” the code from looking at it).

But this would start with an honest will to find the best example in each language, not making the “other” language look deliberately worse.

And I’d call that FUD – it seems I have lower standards than you :-)

[...] There is another nice post by Stephan Schmidt – Anatomy of a Flawed Clojure vs. Scala LOC Comparison which does reflect an opinion in the context of Scala and a post authored on the same blog as the [...]

Like Nene I think FUD is a little too strong a name here, but I think it could pass under “licentia poetica” here :-).

But seriously, I think Clojure will necessarily be less verbose than Scala, as Scala is a statically typed language and Clojure isn’t, so you’d sometimes need explicit type annotations. Moreover, Scala has all the OO baggage and Clojure doesn’t. But all that isn’t relevant when comparing languages; these are 2 different ideas about how to leverage the Java infrastructure, and they are both interesting.

That said, I like Clojure better…

@Marek: As the author did the same (with Python and Java) several times and didn’t improve his comparisons, but made the same errors over and over, I no longer believe it’s accidental.

