14 Nov 2024 - by 'Maurits van der Schee'
A client asked me: how does one scale software to handle 1 million websocket connections? With 1 million connections, even when every client sends only one message per 30 seconds, you have to deal with 33333 websocket messages per second. Handling that many requests per second is well understood for HTTP, but unfortunately these are messages on a websocket. Nevertheless, let's assume for now that they are HTTP requests.
Laravel behind Nginx can do 2908 requests per second on a Dell R440 Xeon Gold + 10 GbE when the average API call does 20 database queries per request (see here). This means that in order to handle 33333 API requests per second for 1 million websockets you need a minimum of 12 of these machines (33333 / 2908 is about 11.5, rounded up). Also, the corresponding 666k queries per second (33333 x 20) may need horizontal scaling of the database (unless the queries are really cheap, see here).
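The arithmetic behind these estimates can be checked with a few lines of Go. This is only a back-of-the-envelope sketch: the function names are made up for illustration, and the input figures (1 million connections, one message per 30 seconds, 2908 requests per second per machine, 20 queries per request) are the ones quoted above.

```go
package main

import (
	"fmt"
	"math"
)

// messagesPerSecond returns the aggregate message rate when each of
// `connections` clients sends one message every `intervalSeconds`.
// Integer division truncates, matching the 33333 figure in the text.
func messagesPerSecond(connections, intervalSeconds int) int {
	return connections / intervalSeconds
}

// machinesNeeded rounds up: a fraction of a machine still means one more machine.
func machinesNeeded(requestsPerSecond int, perMachineRPS float64) int {
	return int(math.Ceil(float64(requestsPerSecond) / perMachineRPS))
}

func main() {
	rps := messagesPerSecond(1_000_000, 30)
	fmt.Println("messages per second:", rps)                // 33333
	fmt.Println("PHP machines:", machinesNeeded(rps, 2908)) // 12
	fmt.Println("queries per second:", rps*20)              // 666660
}
```

Changing the interval to 10 seconds, as in the stress test further down, triples the message rate and with it the number of machines needed.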
Approximately 2 dedicated machines with HAProxy behind DNS round-robin may be able to handle the reverse proxy load of stripping SSL and distributing about 33k HTTPS requests per second over multiple machines (see here).
DNS RR --> 2 x HaProxy --> 2 x WS2API --> 12 x PHP --> 2 x PgSQL
These 33k requests per second can be handled by 2 "WS2API" servers that convert websocket messages to HTTP requests.
So what does the WS2API project do so that websocket messages can be treated as HTTP requests? Here are the client-initiated flows:
WS client --[ws upgrade]--> WS server --[http get request]--> API server
WS client <--[ws connect]-- WS server <--[http response "ok"]-- API server
WS client --[message]--> WS server --[http post request]--> API server
WS client <--[message]-- WS server <--[http response]-- API server
WS client --[ws close]--> WS server --[http delete request]--> API server
WS client <--[ws disconnect]-- WS server <--[http response "ok"]-- API server
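From the API server's point of view, the flows above are plain HTTP calls: a GET when a client connects, a POST per message, and a DELETE when the client disconnects. A minimal sketch in Go of such an endpoint follows. It assumes that the client ID is the whole URI path; the names `apiHandler` and `roundTrip` and the echo reply are invented for this illustration and are not taken from the WS2API code.

```go
package main

import (
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
	"strings"
)

// apiHandler is a hypothetical API server endpoint. The path layout
// ("/<clientID>") is an assumption made for this sketch.
func apiHandler(w http.ResponseWriter, r *http.Request) {
	clientID := strings.TrimPrefix(r.URL.Path, "/")
	if clientID == "" {
		http.Error(w, "missing client id", http.StatusBadRequest)
		return
	}
	switch r.Method {
	case http.MethodGet: // websocket upgrade: accept the connection
		fmt.Fprint(w, "ok")
	case http.MethodPost: // websocket message: the request body is the message
		msg, err := io.ReadAll(r.Body)
		if err != nil {
			http.Error(w, "cannot read body", http.StatusBadRequest)
			return
		}
		// The response body travels back to the client as a websocket message.
		fmt.Fprintf(w, "echo for %s: %s", clientID, msg)
	case http.MethodDelete: // websocket close: clean up any per-client state here
		fmt.Fprint(w, "ok")
	default:
		http.Error(w, "method not allowed", http.StatusMethodNotAllowed)
	}
}

// roundTrip plays the role of the WS server for one request and returns
// the status code and body that the API server answered with.
func roundTrip(method, path, body string) (int, string) {
	req := httptest.NewRequest(method, path, strings.NewReader(body))
	rec := httptest.NewRecorder()
	apiHandler(rec, req)
	return rec.Code, rec.Body.String()
}

func main() {
	steps := []struct{ method, body string }{
		{http.MethodGet, ""},
		{http.MethodPost, "hello"},
		{http.MethodDelete, ""},
	}
	for _, s := range steps {
		code, reply := roundTrip(s.method, "/client42", s.body)
		fmt.Println(s.method, code, reply)
	}
}
```

Because the endpoint is an ordinary stateless HTTP handler, it can be load balanced, cached and monitored like any other API call.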
And this is the server-initiated flow:
API server --[http post request]--> WS server --[message]--> WS client
Note that responses to server-to-client requests are handled as client-to-server requests.
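The server-initiated flow is a single HTTP POST from the API server to the WS server. The Go sketch below shows what that call might look like. The URL layout (client ID as the path), the `text/plain` content type and the names `sendToClient` and `demo` are assumptions made for this illustration; a throwaway `httptest` server stands in for the real WS server.

```go
package main

import (
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
	"net/url"
	"strings"
)

// sendToClient asks the WS server at baseURL to deliver message to the
// websocket client identified by clientID. The URL layout is an assumption.
func sendToClient(baseURL, clientID, message string) error {
	target := baseURL + "/" + url.PathEscape(clientID)
	resp, err := http.Post(target, "text/plain", strings.NewReader(message))
	if err != nil {
		return err
	}
	defer resp.Body.Close()
	if resp.StatusCode != http.StatusOK {
		return fmt.Errorf("ws server answered %s", resp.Status)
	}
	return nil
}

// demo starts a stand-in for the WS server, sends one message through it
// and reports which path and body the stand-in received.
func demo() (path, body string, err error) {
	received := make(chan [2]string, 1)
	fake := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		b, _ := io.ReadAll(r.Body)
		received <- [2]string{r.URL.Path, string(b)}
		fmt.Fprint(w, "ok")
	}))
	defer fake.Close()
	if err = sendToClient(fake.URL, "client42", "price update"); err != nil {
		return "", "", err
	}
	got := <-received
	return got[0], got[1], nil
}

func main() {
	path, body, err := demo()
	if err != nil {
		fmt.Println("send failed:", err)
		return
	}
	fmt.Printf("WS server received %q for %s\n", body, path)
}
```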
I have implemented WS2API several times in different languages and frameworks:
It is only 200 lines of code, so it is easy to port to any language/library. Here is a stress test (connection ramp-up) with 1 message per 10 seconds instead of per 30 seconds on a single instance of the above software:
As you can see, the RPS (requests per second) and connection ramp-up are very good in Go, which helps in cold-start situations.
Note that no centralized connection lookup is needed, as every request is always (consistently) mapped to the same WS2API "ws proxy" server as long as each client connects with a unique "Client ID" string in the URI path. This Client ID is also used as an address when sending messages back to the websocket client.
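How the mapping from Client ID to server is done is not spelled out here. One simple possibility, assumed for this sketch, is to hash the Client ID and take the result modulo the number of servers; the function name `pickServer` and the server addresses are hypothetical.

```go
package main

import (
	"fmt"
	"hash/fnv"
)

// pickServer maps a client ID to one of the servers. The same ID always
// yields the same server as long as the server list does not change.
func pickServer(clientID string, servers []string) string {
	h := fnv.New32a()
	h.Write([]byte(clientID)) // Write on a hash.Hash never returns an error
	return servers[h.Sum32()%uint32(len(servers))]
}

func main() {
	servers := []string{"ws2api-1:4000", "ws2api-2:4000"} // hypothetical addresses
	for _, id := range []string{"alice", "bob", "carol"} {
		fmt.Println(id, "->", pickServer(id, servers))
	}
}
```

Both the load balancer in front of the websocket clients and the API server sending messages back can apply the same function, so neither needs a shared lookup table. A plain modulus remaps most clients when the number of servers changes; a consistent hashing scheme would limit that.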
If you want to deal with 1 million websocket connections, it may be beneficial to convert the websocket messages to HTTP requests. This way you can deal with a "normal" high-traffic web application instead of a custom websocket solution. WS2API does this for you, and WS2API is a concept that is easy to implement.
See: https://github.com/mevdschee/ws2api
Enjoy!