JavaRebel supports Guice now

Good news:

"Google Guice Plugin 1.0 M1. Supports discovering new Google Guice (implicit) components, adding/removing implicit setter and field dependencies and reconfiguring @Singleton’s."

6 reasons why my VC funded startup did fail

Sorry for the inconvenience, got slashdotted by reddit. Never thought so many were interested. And no, scaling was not one of the reasons the startup failed 😉 No ssh from my work so it took some time to fix it. Thanks for coming (back)

During the dot-com boom I founded a software startup with some friends - with me as the CTO. We developed software for knowledge management. It was a combination of blogs, wikis, a document management system, link management, skill management and more.

Our Product

We started in 1999, which was quite early for wikis and blogs (Movable Type was announced in 2001). The link management system was essentially the same as Delicious later. Besides all those new ideas (for 1999 at least) there were three great features:

  • Everything could be tagged: skills, people, links, documents, blog posts, wikis. Today this is called a folksonomy. Tags could refer to other tags to form ontologies. Tags could link to other documents, blog posts, persons.
  • Everything could be rated from 1 to 5
  • We had a clever fuzzy search based on tags and ratings. Searching for "people with oracle knowledge" would also reveal experts for SQL Server - for example to staff your project if an Oracle guru wasn't available
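A minimal Python 3 sketch of how such a tag-and-rating search could work. This is not the original implementation: the data, the names (RELATED_TAGS, RATINGS, fuzzy_search) and the 0.5 weight for related tags are all assumptions made up for illustration.

```python
# Hypothetical data: tags can refer to related tags (a tiny ontology).
RELATED_TAGS = {
    "oracle": {"sql-server", "sql"},
    "sql-server": {"oracle", "sql"},
}

# Hypothetical ratings: person -> {tag: rating from 1 to 5}.
RATINGS = {
    "alice": {"oracle": 5},
    "bob": {"sql-server": 4},
    "carol": {"java": 5},
}

RELATED_WEIGHT = 0.5  # assumed penalty for matching only a related tag


def fuzzy_search(tag):
    """Return (person, score) pairs, best match first."""
    results = []
    for person, ratings in RATINGS.items():
        score = ratings.get(tag, 0)  # an exact tag match counts fully
        for related in RELATED_TAGS.get(tag, ()):
            # a related tag counts with a reduced weight
            score = max(score, RELATED_WEIGHT * ratings.get(related, 0))
        if score > 0:
            results.append((person, score))
    return sorted(results, key=lambda pair: pair[1], reverse=True)


print(fuzzy_search("oracle"))
```

Searching for "oracle" lists the Oracle expert first and the SQL Server expert behind her with a lower score, which is the staffing scenario described above.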

We got quite some money as a seed investment from a VC and were quite happy and successfully developing our application. We showed it to many users and received very favorable feedback from big companies. So why did the startup fail, and why am I no millionaire? There are a myriad of reasons, but as I wrote in "Rules for a successful business", the rules for a successful business are easy:

  • The customer is the most important thing in your business
  • The best business plan is to sell people the things they want
  • Your business is successful if your earnings are higher than your spendings

So the most important thing is to sell - a fact lots of startups forget. And we did too. After much thought it comes down to these six reasons why we failed (besides the obvious one that the VC market imploded when we needed money and no one was able to get any funding):

  1. We didn't sell anything
  2. We didn't sell anything
  3. We didn't sell anything
  4. The market window was not yet open
  5. We focused too much on technology
  6. We had the wrong business model

In more detail:

We didn't sell anything, Part 1

We didn't sell anything because we didn't have a product to sell. As good engineers we wanted to wait until the product was finished and then start selling. Midway we started selling nevertheless with a nearly finished 1.0. This led to too much focus on development and not enough on sales. Without a finished product we thought we couldn't go to customers and try to sell it. We slowly learned two things:

  • You don't need a product to start selling if it's software. The first sales meetings with management were perfectly possible with screenshots, mockups and slides. As the topic was new to our customers, we first needed to convince them of the concepts (wikis, blogs, tagging) and that could be done without a finished product.
  • Start selling before you found the company. Start selling now! You do not need to have a company to sell new ideas to customers. Start selling now! When people really want to buy your product, start the company.

We didn't sell anything, Part 2

We didn't sell anything because we had no sales person. Bummer. Of course we were looking for one and the business plan said: High priority is finding a sales person for the founding team. This took time and resources and we didn't have that. If you want to sell something, get a sales founder or hire someone from the start.

We didn't sell anything, Part 3

We didn't sell anything because customers wouldn't buy. The product was great, the customers favorable, but they took too long to decide. We wanted to sell a knowledge management tool bottom-up to project managers and through them to companies. But every time a superior heard of knowledge management he decided it should be on his agenda. So the topic of knowledge management moved up the chain of command and we had no real decision maker. We talked to irrelevant people (because the topic became strategic fast for our customers) and lost lots of time. Ask people if they are allowed to buy your product. Get to the one who says "Yes" fast. We had several big companies in our sales queue and I'm sure they would have bought in the end, but as a startup we couldn't wait. SAP, for example, could have waited, unlike us, and sold the product after 12 months. Selling enterprise software takes a lot of time.

The market window was not yet open

The market window was not yet open. No one had heard of blogs, wikis and tagging. We had to educate our customers on the benefits of wikis (Everyone can edit! Everyone! How dare they!) and blogs (Everyone can have an opinion and write about it! Everyone!) and tagging (They can build ontologies! We need a committee to define an ontology for everyone! Otherwise chaos will eat us!) Several years later it would have been much easier to sell a blog, wiki and tagging platform.

We focused too much on technology

All the founders were interested in technology. We worked with EJBs (not mature back then), we made everything spit out XML and rendered that to HTML with XSLT (not fast enough), wrote our own OR-Mapper - what a stupid idea (Hibernate not available), tried CSS driven websites (not enough knowledge available back then). This led to rewrites and took lots of our time. Discussions about technology - we should have talked about customers - took time too and led to frustration.

We had the wrong business model

Plain and simple: we had the wrong business model. Selling software can eventually reap lots of money, but it takes time. We had upfront costs, making sales deals took a lot of time and we burned money without income.

The better model would have been: Do consulting on knowledge management and start with an open source product.

We did consulting for companies on how to do knowledge management, use wikis etc. But we didn't take any money for it, because it was part of our sales process. Focusing on consulting and billing people would have created a steady income.

I did get into open source later with SnipSnap. SnipSnap took (a small part of) the ideas from the startup (wiki and blog) and was released as an open source tool. Lots of people downloaded the software and installed it on their desktops. We really made installing SnipSnap easy, so it spread fast. I talked to a boss of a very big software and consulting shop and he told me wikis would never work for them: too chaotic and not structured enough. Well - in fact I knew that there were several SnipSnap installations in his company 🙂 As others are doing now, we could have gotten our foot in the door with an open source project and then sold support and enterprise features on top. Companies later paid us money to put features into SnipSnap, make it scale better and for other enterprise thingies. But in 1999 we didn't know as much about software business models as we know now.

What can you learn from my mistakes? Not sure, but start selling. I've learned a lot though about software management, products, business models, money and being a CTO.

Thanks for listening.

Scalaris?

Looks good on paper, my first try results in

Crash dump was written to: erl_crash.dump
init terminating in do_boot()

without any understandable error message. Should try CouchDB.

PS: Rebuilding everything on Debian from source. We'll see
PPS: Didn't help 🙁

PPPS: CouchDB won't compile, it's not finding an Erlang kernel header file. This is why I love Java and hate C. I forgot how painful compiling and installing builds with make and .h-files was.

PPPPS: Found a version which did compile. Java API doesn't work.

PPPPPS: Back to MySQL which works. There seems to be a lot of work needed to dethrone MySQL

PPPPPPS: Tried the same with Ubuntu 8 without success

Comparing Java and Python – is Java 10x more verbose than Python (LOC)? A modest empiric approach

In my last post about "50k lines of code considered large?" I wondered about large code bases and the different perceptions on what a large code base is. I came to the topic because of a blog post: "The Maintenance myth" by Ola Bini. One minor point he makes about maintenance is lines of code in dynamic languages. I know maintenance is mainly about technical debt. But I'm interested in how lines of code factor into maintenance problems. Ola says

"(very large code bases is 50K-100K lines of code in these languages)"

pointing to Ruby and Python. In a reply to a comment from me Ola writes "I would consider 50k-100k in Ruby to be very large, yes, definitely. I know of Python code bases between 100k and 200k, but that’s about the largest I’ve heard of." With my Java background - and some Ruby and Python background mainly from the 90s - I consider very large applications to be much bigger, perhaps 500k to 1M for very large - not 50k lines as for example SnipSnap has 😉 The Linux kernel contains between 6.4M and 10M lines of code depending on the way you count. There seems to be a huge difference in what people consider very large. There could be several reasons:

  • Python and Ruby are very difficult so smaller code bases are considered very large
  • Python and Ruby are more dense so the same amount of logic can be expressed in fewer lines of code

Considering the second hypothesis, the factor should be between 10x (50k compared to 500k) and 20x (50k to 1M) for things people consider very large - taking Ola and his coworkers and me (I didn't ask my team 😉) as a very small sample set.

Therefore I've expressed an example in code. The example is an application fragment for managing songs - the idea coming from the common Ruby introduction. I've chosen to compare Python and Java because Python is considered by some people a more mature language and used by larger projects than Ruby and because I did more and bigger projects in Python (my Ruby experience is only some years of coding web applications in Rails and writing an OR mapper and web component framework in Ruby by myself: "Convention over Configuration Framework in Ruby from 2002"). Someone could do a Ruby comparison 🙂

People may be surprised, but Java development and style - at least avantgarde - has changed over the last 13 years. So the example might not look like you think that Java should look. It reflects the style in which I would write green field Java code right now. It is inspired by Domain Driven Design and functional principles (for more about DDD and composite oriented programming in Java see Qi4J and real world Qi4j). In true DDD style I would prefer more objects like Name and Duration - see "Never, never, never use String in Java (or at least less often :-)" - but I've cut the example for brevity. Some people would not use a SongList domain object but a List directly. From my point of view, if SongList is a Domain Object and not an implementation detail, you should create a class and not use a List. So I've used a SongList class.

A note on formatting: Lately parts of the wave front surfing Java developers switched to one line formatting of small methods, something which helps code readability and understanding a lot (you'll see). IntelliJ IDEA does support this as a formatting option. It's a very good feature in IDEA; to my shame I only discovered it very late, but I'm glad I did as it's so much better this way.

For manipulating, filtering and transforming lists I currently use Google collections. For an introduction see here. Google collections make working with lists much easier.

The REST part of the application is missing in Python as I don't have enough knowledge to write the code with a state of the art REST framework. Perhaps someone could fill me in. In the Java example I've chosen to create the JSON and XML on my own without an automatic mapper. Automatic mappers do exist though, one could use JAXB. The code for the builder is explained in a previous post.

One cautionary note: The only Python books I have are "Internet Programming With Python" and "Programming Python", the first edition, both from 1996. Sorry that my Python is rusty, all correcting comments or comments on how to do it better are welcome. Please focus on better, not on shorter.

On to the examples:

Java

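// Entity, Property, One and read() are assumed base types and helpers, not shown here
public class Song extends Entity<SongId, Song> {
  public Property<String> name;
  public Property<Integer> duration;
  public One<Artist> artist;

  public Song(String name, int duration, Artist artist) {
    this.name = read(name);
    this.duration = read(duration);
    this.artist = read(artist);
  }
}

public class Artist extends Entity<ArtistId, Artist> {
  public Property<String> name;

  public Artist(String name) {
    this.name = read(name);
  }

  public String toString() { return name.get(); }

  // Static factory, so artist("A1") can be used via a static import
  public static Artist artist(String name) { return new Artist(name); }
}

// Imports (not counted); Predicate, Iterables and Lists are from Google collections
import java.util.Iterator;
import java.util.List;
import com.google.common.base.Predicate;
import com.google.common.collect.Iterables;
import static com.google.common.collect.Lists.newArrayList;

public class SongList implements Iterable<Song> {
  private List<Song> songs = newArrayList();

  public SongList add(Song song) { songs.add(song); return this; }

  public Iterator<Song> iterator() { return songs.iterator(); }

  // Returns an Iterable so the result can be used in a for-each loop
  public Iterable<Song> filter(Predicate<Song> p) { return Iterables.filter(songs, p); }
}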

Some example usage:

// Not counted, as usually the pattern is
// SongList list = ... from Database ... or
// SongList list = ... from UI ...
SongList list = new SongList()
  .add(new Song("S1", 5, artist("A1")))
  .add(new Song("S2", 8, artist("A2")))
  .add(new Song("S3", 13, artist("A3")));

// Print all songs (assuming One<T> offers get() like Property<T>)
for (Song song : list) {
  System.out.println(song.name.get() + " by " + song.artist.get() + " (" + song.duration.get() + ")");
}

// Print all songs with duration smaller than 10 (minutes)
final Predicate<Song> durationLowerThan10 = new Predicate<Song>() {
  public boolean apply(Song song) { return song.duration.get() < 10; }
};

for (Song song : list.filter(durationLowerThan10)) {
  System.out.println(song.name.get() + " by " + song.artist.get() + " (" + song.duration.get() + ")");
}

and a simple REST service which returns a JSON or XML (depending on the request) representation of the song list.

// Example REST Service
// Returning JSON and XML to a REST call, without automatic mappers like JAXB
public class SongListResource {
 @Inject ListService service;

 @GET @Path("/songs/{listId}")
 @Produces({"text/xml", "application/json"})
 public Node getList(@PathParam("listId") String listId) {
   SongList list = service.listForId(listId);

   // $ and List<Song> are the builder helpers from the previous post
   return $("songs", new List<Song>(list) {
     protected Node item(Song song) { return $("name", song.name.get()); }
   });
 }
}

The example in Python

class Song:
    def __init__(self, name, duration, artist):
        self.name = name
        self.duration = duration
        self.artist = artist

class Artist:
    def __init__(self, name):
        self.name = name

    def __str__(self):
        return self.name

class SongList:
    def __init__(self):
        self.songs = []

    def add(self, song):
        self.songs.append(song)
	

Some example usage:

	
# Not counted, as usually the pattern is
# songList = ... from Database ... or
# songList = ... from UI ...
# Not using SongList, the examples should be the same though
# or not?
songList = [ Song("S1", 5, Artist("A1")), 
	Song("S2", 8, Artist("A2")), 
	Song("S3", 13, Artist("A3")) ]

for song in songList:
    print("%s by %s (%d)" % (song.name, song.artist, song.duration))

# could provide print method to list
for song in (song for song in songList if song.duration < 10):
    print("%s by %s (%d)" % (song.name, song.artist, song.duration))
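The REST part is missing from the Python example. As a partial stand-in, here is a minimal sketch of just the rendering step, using only the Python 3 standard library (json and xml.etree.ElementTree). It mirrors the Java resource above by emitting only the song names under "songs". The function names songs_to_json and songs_to_xml are made up for this sketch, and HTTP routing and content negotiation are left out on purpose, since they depend on the framework.

```python
import json
import xml.etree.ElementTree as ET


class Artist:
    def __init__(self, name):
        self.name = name


class Song:
    def __init__(self, name, duration, artist):
        self.name = name
        self.duration = duration
        self.artist = artist


def songs_to_json(songs):
    """Render the song names as a JSON document (hypothetical helper)."""
    return json.dumps({"songs": [{"name": song.name} for song in songs]})


def songs_to_xml(songs):
    """Render the song names as <songs><name>...</name></songs> (hypothetical helper)."""
    root = ET.Element("songs")
    for song in songs:
        ET.SubElement(root, "name").text = song.name
    return ET.tostring(root, encoding="unicode")


song_list = [Song("S1", 5, Artist("A1")), Song("S2", 8, Artist("A2"))]
print(songs_to_json(song_list))
print(songs_to_xml(song_list))
```

A real service would pick one of the two functions based on the request's Accept header; how that is wired up is framework-specific and not shown here.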

A very preliminary conclusion

The example is very short and perhaps not very meaningful. One would need to do more empirical research (e.g. comparing FP to LOC in different languages). And perhaps some readers will provide additional information. So the conclusion is preliminary and will be updated. Counting the lines of code there are 33 NCSS in Java and 19 NCSS in Python. Java has around 1.7 times the LOC of Python from my example. Taking the hypothesis above this could mean several things:

  • I've written sub-par code and most applications differ significantly in style and are much shorter in Python
  • Code complexity and lines of code arise from frameworks not the language
  • Java is really only 1.7x more verbose than Python, not 10x to 20x
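For readers who want to repeat such a count on their own code, here is a rough Python 3 sketch. It is only a line-based approximation and not how the numbers above were produced: real NCSS tools count statements, while this hypothetical rough_ncss helper just skips blank lines, pure comment lines and lines that consist only of a closing brace.

```python
def rough_ncss(source, comment_prefix):
    """Count lines that are not blank, not pure comments and not lone braces."""
    count = 0
    for line in source.splitlines():
        stripped = line.strip()
        if not stripped or stripped.startswith(comment_prefix):
            continue  # blank line or comment-only line
        if stripped in ("{", "}", "});", "};"):
            continue  # brace-only lines carry no statement
        count += 1
    return count


java_snippet = """
// Print all songs
for (Song song : list) {
  System.out.println(song.name.get());
}
"""

python_snippet = """
# Print all songs
for song in song_list:
    print(song.name)
"""

print(rough_ncss(java_snippet, "//"), rough_ncss(python_snippet, "#"))
```

Block comments and trailing comments are not handled, so the result is only a ballpark figure.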

I can't comment on the first conclusion. The second conclusion means someone would need to compare two framework examples, say the song list in Seam and Django. The third conclusion is very interesting. It would mean that people consider applications written in Python very large although they contain (relatively) far fewer lines of code. Ola considers 50k to 100k very large; with a factor of 2x this would make 100k to 200k lines of Java. I can't speak for most Java enterprise/startup developers, but as I consider 500k to 1M very large, Ola and I differ by a factor of 5x on what very large is. I can only speculate about the reason for this.

  • This is a personal thing, and different developers have hugely different views on "very large" (perhaps depending on what they have seen)
  • Developers only write small applications in Python and consider everything else "very large"
  • Python is not maintainable above 50k to 100k lines of code and because of that people consider these code bases very large
  • Developers have trouble understanding and refactoring code bases bigger than 50k to 100k lines of code (perhaps because it's a dynamically typed language)

The first conclusion somehow fits with another quote from Ola's post: "And it’s interesting, the number one question everyone from the static “camp” has, the one thing that worries them the most is maintenance.". They may have seen "very large" applications, contrary to the "dynamic camp".

"Of course, this is totally anecdotal, and maybe these guys are above your average developer."

I'm glad to provide a step (small one) from the anecdotal to the empiric and from the empiric of this post I don't think people considering 100k of lines "very large" are "above your average developer".

Another side note: "But in that case, shouldn’t we hear these rumblings from all those Java developers who switched to Ruby? I haven’t heard anyone say they wish they had static typing in Ruby." Perhaps because they do green field (not brown field) development? And you need to develop for several years in one application to make it a brown field? And it takes several years to accumulate enough technical debt? Because most of them just started and don't do "very large" applications?

Other interesting stuff:

  • A paper (PDF) from 2000 about Scripting, C and Java comes to the conclusion: "Designing and writing the program in Perl, Python, Rexx, or Tcl takes no more than half as much time as writing it in C, C++, or Java and the resulting program is only half as long." matching the 1.7x factor of my short example
  • Dhananjay Nene wrote a performance post about Python and Java (and some other languages) and the LOC for Java is 86 and for Python 41, a factor of 2.1x
  • Dave rewrote a Java program in Python, going from 4700 lines of code to 700 (factor of 6.7x). This would fit more with Ola's impression. Not sure how this fits in: the developer can't show the source and it was a rewrite by a different developer. Also, the counts include comments and empty lines, and the styles of the developers could differ significantly.
  • Daveh did a comparison, with Python having 214 LOC (not NCSS) and Java 282 LOC (not NCSS). A factor of 1.3x
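The factors quoted in this post can be recomputed from the reported line counts with a few lines of Python 3. The numbers are taken from the text above; the labels and the REPORTED name are only there for this sketch, and the counts are not directly comparable (some are NCSS, some are raw LOC).

```python
# (source, Java lines, Python lines) as quoted in this post
REPORTED = [
    ("Song example (NCSS)", 33, 19),
    ("Dhananjay Nene", 86, 41),
    ("Dave's rewrite", 4700, 700),
    ("Daveh (LOC)", 282, 214),
]

# Java-to-Python ratio per source, rounded to one decimal place
ratios = {
    source: round(java_lines / python_lines, 1)
    for source, java_lines, python_lines in REPORTED
}

for source, factor in ratios.items():
    print("%-20s %.1fx" % (source, factor))
```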

Lots of open questions and I would be very interested in other opinions and other examples - and to explore the topic further.

Thanks for listening to this very long post.

Update: Ryan (see comments) supplied a version of a function in C and Python and after removing the hand memory allocation code and the Python interface code of the C version, the factor is 2.2x (38 to 17 NCSS). Thanks.

Update 2: Looking at Ohloh (see comments) the factor of Java and Python is 4x. Very large base of examples. One would need to check the types of programs.

Update 3: An old article I've found again: "7 reasons I switched back to PHP after 2 years on Rails". An interesting bit of info: after going to Rails and coming back, with the Rails knowledge the PHP app was reduced in size "- … and much more. In only 12,000 lines of code, including HTML templates. (Down from 90,000, before.)". Looks like rewrites or prior experience in the domain reduce code size. This could explain Ola's experience with Java developers who switched to Ruby. I came to this article again through a comment by Harry Pynn on Frank Carver's blog: "Point number 7 is that programming languages are like girlfriends: The new one is better because you are better. Could it be that people moving to dynamic languages from static languages find it easier to write maintainable code having honed their skills with a static language?"