1. Scale horizontally (scale out): separate system tiers into different environments: database, Solr, memcached, push, web/app servers. Separation lets you scale each tier up or down independently.
  2. Use the cloud (Platform as a Service) from the start; it facilitates scaling out/up/down (Google AppEngine, Heroku, AWS Beanstalk, …)
  3. Monitor usage on each tier so you can scale up/down at the right time (e.g. NewRelic)
  4. Use push instead of having clients poll the server (e.g. Pusher)
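The difference can be sketched as a publish/subscribe channel: instead of clients asking repeatedly whether anything changed, the server notifies subscribers the moment an event happens. This is a minimal in-process sketch; services like Pusher provide the same pattern over WebSockets at scale.

```ruby
# Minimal push sketch: clients register a callback once, and the server
# calls it when data is available, instead of answering poll requests.
class Channel
  def initialize
    @subscribers = []
  end

  def subscribe(&block)
    @subscribers << block
  end

  def publish(event)
    @subscribers.each { |s| s.call(event) }
  end
end

received = []
channel = Channel.new
channel.subscribe { |e| received << e } # client registers once
channel.publish("new_message")          # server pushes when data exists
```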
  5. Don’t use the filesystem for storage, unless it is a distributed filesystem (e.g. AWS S3)
  6. Don’t involve your app server in long requests/responses. Slow clients may block your server and cause longer request queues (depending on the implementation).
    • If you want to receive an upload, have the client send it directly to S3 with some work on the client side.
    • If you want to send a huge response, either stream it using your app server’s streaming capability, or generate it with a background job that stores the result on S3 and, when done, sends the direct link through the app using push or through email.
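The streaming option above can be sketched with a Rack-style response body: the server calls `each` on the body and flushes each chunk to the client as it is produced, so app-server memory stays flat even for huge responses. The `StreamingReport` class and its CSV payload are hypothetical.

```ruby
# Sketch of a streaming response body (Rack convention: the body only
# needs to respond to #each, yielding one chunk at a time).
class StreamingReport
  def initialize(rows)
    @rows = rows
  end

  def each
    yield "id,total\n" # CSV header goes out first
    @rows.each do |row|
      yield "#{row[:id]},#{row[:total]}\n" # one row per chunk, never buffered
    end
  end
end

body = StreamingReport.new([{ id: 1, total: 9.5 }, { id: 2, total: 3.0 }])
chunks = []
body.each { |c| chunks << c } # the server would flush each chunk here
```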
  7. Defer long tasks to background jobs (Resque, RabbitMQ, delayed_job, …)
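The pattern behind all of those tools is the same: the request handler only enqueues work and returns immediately, while a separate worker drains the queue. A minimal in-process sketch with one worker thread (Resque/RabbitMQ do this with a persistent queue and worker processes):

```ruby
# In-process background-job sketch: enqueue on the request path,
# execute on a worker off the request path.
jobs = Queue.new # thread-safe FIFO from the Ruby standard library

worker = Thread.new do
  while (job = jobs.pop)
    break if job == :shutdown
    job.call # run the deferred task outside the request/response cycle
  end
end

results = []
# The "web request" only enqueues and returns immediately.
jobs << -> { results << "report generated" }
jobs << :shutdown
worker.join
```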
  8. Don’t clutter your app server memory with language bindings; use Apache Thrift or Google Protocol Buffers to communicate between different environments
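With Protocol Buffers, each environment works from a shared schema and exchanges a compact binary format over the wire; the generated code per language is small compared to loading a full foreign binding in-process. A hypothetical schema (the message and field names are made up for illustration):

```protobuf
// search.proto — hypothetical schema shared between services;
// each service generates its own native stubs from this file.
syntax = "proto3";

message SearchRequest {
  string query = 1; // field numbers, not names, go on the wire
  int32 page = 2;
}
```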
  9. Use Apache Solr (Lucene over HTTP) to query large data sets even if you don’t need full text search; it can be used for scoping and faceting as well (think SQL WHERE and GROUP BY)
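Because Solr is just HTTP, the WHERE/GROUP BY analogy maps directly onto query parameters: `fq` (filter query) scopes the result set, and `facet.field` returns counts per value. A sketch building such a query with the Ruby standard library; the host, core path, and field names are hypothetical.

```ruby
require "uri"

# Build a Solr select query that scopes and facets without any
# full-text search term.
params = {
  "q"           => "*:*",            # match everything
  "fq"          => "category:books", # scope: like WHERE category = 'books'
  "facet"       => "true",
  "facet.field" => "author",         # counts per author: like GROUP BY author
  "rows"        => "10",
}

query = URI.encode_www_form(params)
url = "http://localhost:8983/solr/select?#{query}" # hypothetical host
```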
  10. Autoscale your web/app servers with traffic. Monitoring shows you traffic metrics, and a tool like HireFire can autoscale your Heroku dynos
  11. Use client-side rendering (JavaScript templates) to take rendering time off the server
  12. Use caching at different layers (e.g. memcached) to be easy on your servers
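The core idea at every layer is the same fetch-or-compute loop: return the cached value while it is fresh, recompute only on a miss or expiry. A tiny in-process sketch (memcached applies the same contract across servers; the class and keys here are hypothetical):

```ruby
# Minimal TTL cache: values are stored with an expiry and the expensive
# block runs only on a miss.
class TtlCache
  def initialize
    @store = {}
  end

  def fetch(key, ttl: 60)
    entry = @store[key]
    return entry[:value] if entry && entry[:expires_at] > Time.now
    value = yield # expensive computation happens only here
    @store[key] = { value: value, expires_at: Time.now + ttl }
    value
  end
end

cache = TtlCache.new
calls = 0
first  = cache.fetch("user:1") { calls += 1; "Alice" } # miss: computes
second = cache.fetch("user:1") { calls += 1; "Alice" } # hit: cached
```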
  13. Asset hosting: static assets (JavaScript, stylesheets, HTML templates, images, …) have nothing to do with your app servers, so host them elsewhere (CloudFront/S3)
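In a Rails app this is a one-line change: pointing `asset_host` at a CDN makes the asset helpers emit CDN URLs instead of hitting your app servers. The CloudFront host name below is hypothetical.

```ruby
# config/environments/production.rb (Rails; host name is hypothetical)
config.action_controller.asset_host = "https://d1234.cloudfront.net"
```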