TQdev.com

Blog about software development

Choosing a high performance web stack

17 Nov 2017 - by 'Maurits van der Schee'

In a previous post I told you that there is no such thing as the "right tool for the job". And this is true for most businesses. Nevertheless, there are companies that want to prepare for scaling up operations to "world domination" level. In that case there is one more factor to take into consideration when choosing a web development stack: performance.

Why performance matters

To understand why performance matters you first need to know that we are not talking about a few percent, or even tens of percent, of better performance. We are talking about factors and even orders of magnitude. We should also consider the costs of rewrites and the costs of switching stacks.

When scaling is needed, you are probably growing fast and the servers can't keep up. In such a situation you cannot afford to do a rewrite, because by the time you finish, your customers will have walked away. For a regular-sized software product a rewrite typically takes more than a year (during which no other features can be developed). Some companies may call this a "luxury" problem up front (because it is a sign of success), but I can assure you that that is not how it feels when you encounter it.

A rewrite in another stack is even more daunting, as the developers that have built the software (and have all the domain knowledge) have to be trained to program in the new programming language. The only way you see companies successfully switch stacks is by moving to a micro-service architecture in which rewrites can be done on a per micro-service basis and thus incrementally from a product perspective.

What performance buys you

A high performance stack may have a positive impact on your operational costs, as you have to run fewer servers when you are dominating the world. Also, any smart algorithm you implement will run faster when you choose the right stack. This means that you may initially spend less time optimizing your algorithms, allowing you to develop faster. It may also mean that you have more computing budget to spend on algorithms and can thus make your software "smarter".

Why you should not choose C

The two most popular web servers (Apache and Nginx) are written in C. This is not a coincidence, because (as many people know) the world's best performing code is written in C. So why wouldn't you write your web application in C? There are several reasons to think of:

  1. Error prone due to manual memory management.
  2. Expensive and hard to find developers for.
  3. Not cross platform (both for OS and hardware).

Or as people commonly say: In C it is easy to shoot yourself in the foot.

Which (other) languages perform well enough?

Well, that depends on your scaling aspirations. I have been doing some benchmarking myself on a (toy) data access API (exposing MySQL data over HTTP in JSON format) and I found:

  1. Java, 14000 req/sec (source code)
  2. Go, 12000 req/sec (source code)
  3. PHP 7, 6500 req/sec (source code)
  4. C# (.NET Core), 5000 req/sec (source code)
  5. Node.js, 4200 req/sec (source code)
  6. Python, 2600 req/sec (source code)

(source)
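To get a feeling for what these numbers mean for operational costs, here is a minimal sketch that turns them into a server count. The peak load of 100,000 requests per second is a made-up number, and the calculation assumes that throughput scales linearly with the number of servers, which a real deployment (with a shared database behind it) will not do exactly.

```python
import math

# Requests per second on one server, taken from the list above.
REQ_PER_SEC = {
    "Java": 14000,
    "Go": 12000,
    "PHP 7": 6500,
    "C# (.NET Core)": 5000,
    "Node.js": 4200,
    "Python": 2600,
}

# Assumption: a hypothetical peak load, not a number from the benchmark.
PEAK_LOAD = 100_000  # requests per second


def servers_needed(peak_load, req_per_sec):
    """Round up, since a partially used server still has to be paid for."""
    return math.ceil(peak_load / req_per_sec)


def relative_to_fastest(results):
    """Return how many times slower each stack is than the fastest one."""
    fastest = max(results.values())
    return {name: fastest / rps for name, rps in results.items()}


slowdown = relative_to_fastest(REQ_PER_SEC)
for name, rps in REQ_PER_SEC.items():
    print(f"{name:15} {slowdown[name]:4.1f}x slower, "
          f"{servers_needed(PEAK_LOAD, rps):3d} servers")
```

Under these assumptions the Java implementation needs 8 servers and the Python implementation 39, so the difference between stacks shows up directly in the hosting bill.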

Benchmarksgame is a website on which people implement toy programs in various programming languages and measure their performance. This benchmarking game shows roughly the following results:

(source)

The number before the "x" indicates how much slower a language is than the optimal implementation (often written in C). If you cross-reference that list with the 20 most popular languages (from TIOBE) you see something interesting:

  1. Java (~2x)
  2. C (~1x)
  3. C++ (~1x)
  4. Python (~40x)
  5. C# (~2x)
  6. JavaScript (~10x)
  7. Visual Basic .NET (?)
  8. PHP (~20x)
  9. Delphi/Object Pascal (~3x)
  10. Assembly language (?)
  11. R (?)
  12. MATLAB (?)
  13. Ruby (~40x)
  14. Go (~2x)
  15. Perl (~40x)
  16. Scratch (?)
  17. Visual Basic (?)
  18. PL/SQL (?)
  19. Objective-C (?)
  20. Swift (~2x)

(source)
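These rough factors can be put next to the req/sec figures from my own data access API benchmark earlier in this post. The sketch below expresses both relative to Java. Matching "PHP 7" to PHP and "Node.js" to JavaScript is an assumption on my side, and the factors are approximations, so treat the outcome as an indication only.

```python
# Approximate slowdown versus the optimal implementation (list above).
CPU_SLOWDOWN = {
    "Java": 2, "Go": 2, "PHP": 20, "C#": 2, "JavaScript": 10, "Python": 40,
}

# Requests per second from the data access API benchmark in this post.
# Assumption: "PHP 7" is matched to PHP and "Node.js" to JavaScript.
WEB_REQ_PER_SEC = {
    "Java": 14000, "Go": 12000, "PHP": 6500,
    "C#": 5000, "JavaScript": 4200, "Python": 2600,
}


def relative_to(baseline, cpu, web):
    """Return {language: (cpu_factor, web_factor)} relative to baseline.

    Both factors mean "times slower than the baseline language".
    """
    rows = {}
    for lang in cpu:
        cpu_factor = cpu[lang] / cpu[baseline]
        web_factor = web[baseline] / web[lang]
        rows[lang] = (cpu_factor, web_factor)
    return rows


rows = relative_to("Java", CPU_SLOWDOWN, WEB_REQ_PER_SEC)
for lang, (cpu_factor, web_factor) in rows.items():
    print(f"{lang:10} toy programs: {cpu_factor:4.1f}x   "
          f"web API: {web_factor:4.1f}x")
```

With these inputs PHP is about 10x slower than Java on the toy programs, but only about 2.2x slower on the web API. For Python the numbers are 20x and about 5.4x. So the gap between languages appears to be much smaller for database-backed web requests than for CPU-bound toy programs.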

As you can see, popularity and performance do not necessarily go hand in hand: some of the most popular languages are among the slowest. This can be explained by the fact that most companies do not need to scale up to a level where performance really matters. Now let's look at the most popular websites in the world and the back-end technologies that they use:

  1. Google: C, C++, Go, Java, Python with BigTable, MariaDB
  2. Facebook: Hack, PHP, Python, C++, Java, Erlang, D, Xhp, Haskell with Vitess, BigTable, MariaDB
  3. YouTube: C, C++, Python, Java, Go with MariaDB, MySQL, HBase, Cassandra
  4. Yahoo: PHP with MySQL, PostgreSQL
  5. Amazon: Java, C++, Perl with Oracle Database
  6. Wikipedia: PHP, Hack with MySQL, MariaDB
  7. Twitter: C++, Java, Scala, Ruby with MySQL
  8. eBay: Java, JavaScript, Scala with Oracle Database
  9. Bing: ASP.NET (C#) with Microsoft SQL Server
  10. MSN: ASP.NET (C#) with Microsoft SQL Server
  11. Microsoft: ASP.NET (C#) with Microsoft SQL Server
  12. LinkedIn: Java, JavaScript, Scala with Voldemort
  13. Pinterest: Django, Erlang with MySQL, Redis
  14. WordPress: PHP, JavaScript with MariaDB, MySQL

(source)

Interestingly, this list also shows that these companies do not exclusively choose high performance languages. One would think that this could be explained by the fact that companies combine slow languages with faster languages to solve their performance bottlenecks. I would have certainly expected that to be the case, but the data does not confirm this. Yahoo, Wikipedia, Pinterest and WordPress seem to be using relatively slow languages without a faster one next to them, which is something that puzzles me and that I cannot explain. Maybe PHP does not perform so badly in practice, as many of its library functions are implemented in C.

What you do see is that Java as a programming language and MySQL as a database (even though some have already migrated to MariaDB) are the most popular web technologies amongst these web giants. And with their overall popularity and their performance taken into account, I feel that choosing Java with MySQL would be a safe choice for a web start-up that wants to take over the world!
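The popularity claim can be checked with a simple tally over the list of websites above. This is a minimal sketch: the entries are copied from that list, and the parsing assumes every entry has the form "site: languages with databases".

```python
from collections import Counter

# Entries copied from the list of popular websites above.
SITES = [
    "Google: C, C++, Go, Java, Python with BigTable, MariaDB",
    "Facebook: Hack, PHP, Python, C++, Java, Erlang, D, Xhp, Haskell"
    " with Vitess, BigTable, MariaDB",
    "YouTube: C, C++, Python, Java, Go with MariaDB, MySQL, HBase, Cassandra",
    "Yahoo: PHP with MySQL, PostgreSQL",
    "Amazon: Java, C++, Perl with Oracle Database",
    "Wikipedia: PHP, Hack with MySQL, MariaDB",
    "Twitter: C++, Java, Scala, Ruby with MySQL",
    "eBay: Java, JavaScript, Scala with Oracle Database",
    "Bing: ASP.NET (C#) with Microsoft SQL Server",
    "MSN: ASP.NET (C#) with Microsoft SQL Server",
    "Microsoft: ASP.NET (C#) with Microsoft SQL Server",
    "LinkedIn: Java, JavaScript, Scala with Voldemort",
    "Pinterest: Django, Erlang with MySQL, Redis",
    "WordPress: PHP, JavaScript with MariaDB, MySQL",
]


def tally(entries):
    """Count on how many sites each language and each database appears."""
    languages, databases = Counter(), Counter()
    for entry in entries:
        _site, stack = entry.split(": ", 1)
        langs, dbs = stack.split(" with ", 1)
        languages.update(langs.split(", "))
        databases.update(dbs.split(", "))
    return languages, databases


languages, databases = tally(SITES)
print(languages.most_common(3))
print(databases.most_common(3))
```

In this tally Java is used by 7 of the 14 sites, followed by C++ (5) and PHP (4). On the database side MySQL appears 6 times and MariaDB 5 times.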

YMMV ;-)