This project was inspired by two things: the bloat of an "embedded" webserver that weighed in at ~400K, and The C10K Problem page,
which discusses the lack of highly scalable servers and what can be done about it.
The trick to handling many, many simultaneous clients is to do as much work for each client as possible
for as long as nothing blocks. As soon as an operation might block, you note where you left off and jump to
the next client in line. On the next I/O event for that connection, its state machine picks up where it left
off. In this fashion, (tens of) thousands of clients may be served simultaneously using a modern CPU.
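
The approach described above can be sketched roughly as follows. This is a minimal illustration in Python, not this project's actual code: the names `Connection`, `poll_once` and `RESPONSE` are hypothetical, and the "request" is assumed to be complete once a blank line has arrived. Each client carries its own small state record. Whenever a read or write would block, the handler simply returns and the loop moves on to the next ready client.

```python
import selectors
import socket

# Hypothetical canned reply; a real server would build this per request.
RESPONSE = b"HTTP/1.0 200 OK\r\nContent-Length: 2\r\n\r\nok"


class Connection:
    """Per-client state: which phase we are in and what is buffered so far."""

    def __init__(self, sock):
        self.sock = sock
        self.state = "reading"
        self.inbuf = bytearray()
        self.outbuf = b""

    def on_event(self, sel):
        """Advance as far as possible without blocking.

        Returns True to keep the connection open, False when it is done.
        """
        try:
            if self.state == "reading":
                chunk = self.sock.recv(4096)
                if not chunk:
                    return False              # peer closed before finishing
                self.inbuf += chunk
                if b"\r\n\r\n" not in self.inbuf:
                    return True               # incomplete: resume on next event
                self.outbuf = RESPONSE
                self.state = "writing"
                sel.modify(self.sock, selectors.EVENT_WRITE, self)
            # state == "writing": send whatever the kernel will accept now
            sent = self.sock.send(self.outbuf)
            self.outbuf = self.outbuf[sent:]
            return bool(self.outbuf)          # False once everything is sent
        except BlockingIOError:
            return True                       # would block: pick up here later


def poll_once(sel, timeout=1.0):
    """Service every socket that is ready right now, then return."""
    for key, _ in sel.select(timeout):
        if key.data is None:                  # the listening socket
            try:
                client, _ = key.fileobj.accept()
            except BlockingIOError:
                continue
            client.setblocking(False)
            sel.register(client, selectors.EVENT_READ, Connection(client))
        elif not key.data.on_event(sel):
            sel.unregister(key.fileobj)
            key.fileobj.close()
```

A server would register a non-blocking listening socket with `data=None` and call `poll_once` in a loop. No client ever holds up another, because the only place the process waits is inside `sel.select`.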
Other things help as well, such as pre-allocating space and careful organization to leverage memory locality. Furthermore, application-level caching can help avoid duplication of work.
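
As a rough illustration of pre-allocation and application-level caching, the sketch below allocates a fixed set of receive buffers once at startup and memoizes a rendered response. `BufferPool` and `render_page` are hypothetical names, not part of this project. Python only shows the shape of the idea here: the memory-locality benefits mentioned above depend on data layout in a lower-level language.

```python
from functools import lru_cache


class BufferPool:
    """A fixed set of receive buffers, allocated once up front."""

    def __init__(self, count, size):
        self._free = [bytearray(size) for _ in range(count)]

    def acquire(self):
        # Returns None when the pool is exhausted so the caller can
        # defer the client instead of allocating on the hot path.
        return self._free.pop() if self._free else None

    def release(self, buf):
        self._free.append(buf)


@lru_cache(maxsize=128)
def render_page(path):
    """Hypothetical page builder; repeated paths reuse the cached bytes."""
    body = ("<h1>%s</h1>" % path).encode()
    return b"HTTP/1.0 200 OK\r\nContent-Length: %d\r\n\r\n" % len(body) + body
```

A buffer obtained from `acquire` can be passed to `socket.recv_into`, so reading a request does not allocate a fresh bytes object on every call.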
This project is released under the MIT License, which in a nutshell allows unrestricted
use, modification, distribution, etc. by anyone for any reason, as long as you agree not to sue the author.