Adding Hardware is not Always the Cheapest Option

If your application grows and you need to support more customers, the common wisdom is to buy more hardware, or to solve performance problems with more hardware. But simple math says there is a point where investing in software or OS optimization is much cheaper than buying more hardware. There are essentially three ways to optimize and scale your web application - assuming it can scale.

  1. Buy more hardware
  2. Optimize hardware - buy better optimized hardware
  3. Optimize your application - or your software stack

Suppose we have 10 servers and want to add 100% customers. Then we need to buy 10 more servers. Suppose we have 1000 servers and want to add 100% customers. Then, for the same growth, we need to add another 1000 servers. The cost of each percent of growth rises in proportion to the number of servers you already run, so this gets expensive very fast - not in the beginning of your startup but after some time - and you need to look harder and harder into software optimizations. This is also the reason why Facebook benchmarks all its new hardware to squeeze out every last bit of performance. If you have 10 servers and you squeeze out 1% more performance, this is equivalent to 0.1 servers. If you have 10000 servers and you squeeze out 1% more performance, that's equivalent to 100 servers.

See the example for costs for 10% customer growth:
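The arithmetic can be sketched in a few lines of Python. This is a minimal sketch that assumes perfectly linear scaling (every server handles the same number of customers); the function names and the fleet sizes are illustrative assumptions, not measured figures.

```python
def servers_to_add(current_servers: int, growth: float) -> float:
    """Extra servers needed for a relative customer growth (assumes linear scaling)."""
    return current_servers * growth


def servers_saved(current_servers: int, speedup: float) -> float:
    """Server equivalents freed by a relative performance gain."""
    return current_servers * speedup


# Hypothetical fleet sizes: 10% customer growth versus a 1% optimization.
for fleet in (10, 1000, 10000):
    print(fleet, servers_to_add(fleet, 0.10), servers_saved(fleet, 0.01))
```

The same percentage translates into very different absolute numbers, which is why a small optimization that is not worth the effort on a small installation can pay for itself on a large one.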

There are other factors to think of. Most often when talking about scaling, people say their application scales linearly. This corresponds to your application having O(n) scaling. And in theory this is fine. In reality the cost is f*n, and the O(n) notation hides the constant factor f. Different architectures, although all scaling with O(n), have different linear factors. It does matter if f is 2 or f is 20. What f does your web application have?
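To make the factor concrete, here is a small sketch comparing two hypothetical architectures that both scale linearly. Expressing f as "servers per 1000 customers" is an assumption made for illustration, as are the customer numbers.

```python
def servers_needed(customers: int, servers_per_1000_customers: int) -> int:
    """Linear scaling: servers = f * n, with n counted in units of 1000 customers."""
    return servers_per_1000_customers * customers // 1000


for customers in (100_000, 200_000):
    lean = servers_needed(customers, 2)    # architecture with f = 2
    heavy = servers_needed(customers, 20)  # architecture with f = 20
    print(customers, lean, heavy, heavy - lean)
```

Both architectures double their server count when the customers double, but the absolute gap between them doubles too.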

Actor Myths

Actors are the new concurrency. They are everywhere. People make bold claims about actors, and while I do not agree with many of them, two in particular I regard as myths. Here they are:

  1. Actors are a shared nothing architecture
  2. Actors are easier to get right because of their shared nothing architecture

I know I'm alone in calling those myths, but here we go. Consider the first myth. If actors shared nothing, they could not collaborate on data. Sharing immutable messages does not help with getting data synced up again, so this can't be the way to collaborate. Indeed, actors do share everything! Each actor can be considered a data structure with state and a write lock (synchronized in Java).
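A minimal sketch of that comparison in Python, using only the standard library. The class names `LockedCounter` and `CounterActor`, the message strings and the `hammer` helper are hypothetical and do not come from any actor framework; the mailbox is modelled with a plain `queue.Queue`.

```python
import queue
import threading


class LockedCounter:
    """Shared state guarded by a write lock: one writer at a time."""

    def __init__(self):
        self._lock = threading.Lock()
        self.value = 0

    def increment(self):
        with self._lock:
            self.value += 1


class CounterActor:
    """The same state, but writes arrive as messages handled one at a time."""

    def __init__(self):
        self.value = 0
        self._mailbox = queue.Queue()
        self._thread = threading.Thread(target=self._run, daemon=True)
        self._thread.start()

    def _run(self):
        while True:
            message = self._mailbox.get()
            if message == "stop":
                break
            if message == "increment":
                self.value += 1  # only the actor's own thread mutates the state

    def send(self, message):
        self._mailbox.put(message)  # asynchronous: returns immediately

    def stop(self):
        self.send("stop")
        self._thread.join()


def hammer(action, n_threads=4, n_calls=1000):
    """Call `action` n_calls times from each of n_threads threads."""
    def worker():
        for _ in range(n_calls):
            action()

    threads = [threading.Thread(target=worker) for _ in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()


locked = LockedCounter()
hammer(locked.increment)

actor = CounterActor()
hammer(lambda: actor.send("increment"))
actor.stop()  # queued behind all increments, so they are processed first

print(locked.value, actor.value)
```

In both variants every caller ends up changing the same piece of state, and in both variants the changes are serialized; the mailbox plays the role the lock plays in the first class.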

The second myth: actors are easier to get right because they share nothing and therefore do not have locks. As argued above, they have locks, so this can't be the reason why they are easier to get right. I consider two characteristics of actors to be the reason for them being easier:

  1. They can only hold one lock at a time (when sending messages synchronously). This is a common approach to prevent deadlocks in locking architectures.
  2. They mostly send asynchronous messages. This prevents the common problems that arise from holding locks and blocking. In locking architectures, when locks are accessed in an asynchronous way (e.g. with futures and timeouts), there are also no deadlocks for the calling thread.
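The timeout idea from the second point can be sketched with Python's `threading.Lock`, whose `acquire` method accepts a `timeout` argument. The helper name `try_update` and the 0.1 second timeout are assumptions chosen for illustration.

```python
import threading


def try_update(lock: threading.Lock, timeout_s: float = 0.1) -> bool:
    """Try to take the lock; give up after timeout_s instead of blocking forever."""
    if not lock.acquire(timeout=timeout_s):
        return False  # the caller can retry, back off, or report an error
    try:
        # ... the protected state would be modified here ...
        return True
    finally:
        lock.release()


resource_lock = threading.Lock()

resource_lock.acquire()  # simulate another party holding the lock
blocked_result = try_update(resource_lock)
resource_lock.release()

free_result = try_update(resource_lock)
print(blocked_result, free_result)
```

The calling thread never waits longer than the timeout, so it cannot end up permanently stuck, although it still has to decide what to do when the update did not happen.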

I would like to learn something from this. So please keep your flaming minimal and help me understand possible errors in my thinking instead.

Update: If you do not believe me concerning the shared-nothing part, read the example from James

This Function is Not Tail Recursive

Tail recursion seems to be an easy concept, but most people get it wrong - including me. Reading the latest German Java SPEKTRUM, I found an article about parallel multicore development by Kornelius Fuhrer. One paragraph was about functional development and tail recursion. First he claims tail recursion makes functions 100% parallelizable (I guess, broadly speaking, all compositions h(g,f) of side-effect-free functions g,f are 100% parallelizable in f and g, which has nothing to do with tail recursion), then he claims his example functions are tail recursive:

Wikipedia says about tail recursion:

In computer science, tail recursion (or tail-end recursion) is a special case of recursion in which any last operation performed by the function is a recursive call, the tail call, or returns a (usually simple) value without recursion.

In all three examples the last operation performed is the multiplication (*), not the function call to itself. First the function itself is called, then the return value is multiplied by n. Stack Overflow has a lot to say about tail recursion too.
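The pattern described here can be sketched in Python as follows. This is an illustration of the shape only, not the listing from the article, and the function name `factorial` is an assumption.

```python
def factorial(n: int) -> int:
    """Recursive factorial that is NOT tail recursive."""
    if n <= 1:
        return 1
    # The recursive call has to return first; only then is its result
    # multiplied by n. The multiplication is the last operation, not the call.
    return n * factorial(n - 1)


print(factorial(5))
```

Each pending multiplication keeps its stack frame alive until the inner call returns, which is exactly what a tail call avoids.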

A tail recursive version of factorial might look like this:
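A minimal sketch in Python, assuming a hypothetical accumulator parameter `acc` that carries the running product. Python itself does not eliminate tail calls, so this only illustrates the shape of the function, not the stack savings.

```python
def factorial_tail(n: int, acc: int = 1) -> int:
    """Tail-recursive factorial: the recursive call is the last operation."""
    if n <= 1:
        return acc
    # The multiplication happens before the call, while building the argument.
    # Nothing is left to do after the recursive call returns.
    return factorial_tail(n - 1, acc * n)


print(factorial_tail(5))
```

Because nothing remains to be done after the call, a language with tail-call elimination can reuse the current stack frame instead of creating a new one.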

So please, do not call every function tail recursive just because the last function call in its source code is a call to itself. A function is tail recursive if the last operation it performs is a function call to itself.

Want to Become a Startup CTO?

During the dot com bubble, I was a founder and startup CTO. At the beginning I wondered what to do as a CTO. There were many conflicting views on that position: from programming to vision, from technology to processes, from tools to people. After some time and more years of managing development teams and departments, my view on the CTO role is much clearer. I've distilled the (web) startup CTO job down to:

  1. Write code - often forgotten, but you need to be able to write code, and if you're one of the first technical hires, there is not enough work for you if you do not write code.
  2. Decide whether to hire or to outsource. There are reasons for both, but I'd say in 90% of cases one should hire. The CTO needs to decide which way to go, and - especially important - be prepared to defend that decision. Buying is usually not the way to go for a technology or web startup. This doesn't mean there can't be a mix; there are often projects which can and should be outsourced.
  3. Hire developers, testers, admins and operators. Several times I've been asked to be the CTO for a startup, as in Germany most web startups are founded by ex-consultants or economics majors. They usually have no clue - nor should they have - about good or bad developers. As a CTO you need to provide this knowledge and hire the best you can get for your budget.
  4. Know how to scale development and processes (e.g. from 1 to 10 developers). Development changes significantly if you go from one to five and then to ten developers. You need some kind of process (I, for example, prefer Lean and Scrum). As a startup CTO you need to provide this in the first years (certainly not in the first months).
  5. Know technologies, craft an architecture and technology strategy.
  6. Know how to scale an application (from 100 to millions of users).

What do others think about the CTO role? Werner Vogels, CTO of Amazon, defines four roles a CTO can have:

  1. Infrastructure Manager
  2. Technology Visionary and Operations Manager
  3. External Facing Technologist
  4. Big Thinker

Many people have contributed to shaping the CTO role in startups. Indus Khaitan thinks the "5 Bare Minimum Things A Web Startup CTO Must Worry About" are:

  1. Security
  2. Availability & Monitoring
  3. Application Errors
  4. Backup
  5. Source Control

I do agree with them, though I'm not sure Source Control is in my Top 5. Security depends on what you do and what frameworks you use, so this might not be an issue for some time into startup life. Availability gets more and more important as your customer numbers and the revenue over your platform increase. Monitoring is a must from the beginning; it's hard to add later, and without it you lose a lot of insight, insight which is dearly needed in your first incident. Backup is often forgotten or underrated - do not forget to check that you really can get your site going again from a backup.

Eric Ries writes on the role of the CTO:

The CTO's primary job is to make sure the company's technology strategy serves its business strategy.

with five specific skills:

  • Platform selection and technical design
  • Seeing the big picture (in graphic detail)
  • Provide options
  • Find the 80/20
  • Grow technical leaders

This is a great overview of the CTO responsibilities. There is much more juice in that article, so go ahead and read it.

Tony Karrer ponders the question "Startup CTO or Developer":

What worries me a bit is how often I read that startups should hire a developer / hands-on lead developer. I understand the desire for hiring someone who is going to product product. But often the result of a Founder hiring a developer or lead developer or even a VP engineering is a gap created between the founders and the developers. [...] By the way, I’m not suggesting that startups should hire a full-time Startup CTO who is not hands-on. Rather they should get a part-time Acting CTO who can help close the gap.

Then there is the open question: after you've been in a startup for some time, what is the difference between a CTO and a VP of Engineering? A very good answer - at least it was enlightening to me - can be found on Mark Suster's blog in "Want to Know the Difference Between a CTO and a VP Engineering?":

The CTO [...] So I believe that every great technology startup has the technology visionary inside the company. This is the person who lays the foundation of what should be built. [...] And first and foremost a VP of Engineering is a people manager.

The VP of engineering is basically managing people and processes, while the CTO is managing technology and the vision. My take on this:

  • In the beginning you need a CTO who also does the VP/E job.
  • Later you can split the position and hire a VP/E for processes - if needed.

Coming back to the title of the post: "Want to Become a Startup CTO?" - these are the skills to have and the role to play. I hope this made the CTO role clearer for technical and non-technical founders. See if this is really for you, or perhaps learn what your boss should do and focus on. What is your take on or experience with the CTO role?