Holidays
Are you ready to support all your potential customers during the upcoming holidays?
During the holidays, an eCommerce site will usually get 3-5x more traffic than its highest spike during the rest of the year. Make sure you are prepared!
We have compiled a list of things you have to consider before the upcoming holidays. This will help you get through the season with a lot of sales instead of disappointed users.
The content will be divided into sections and each section represents a component in an eCommerce store.
Web Server
Ideally, if you are hosting your own infrastructure and managing your own web servers, you should have at least two load balancers with round robin at the DNS level: modern browsers can fall back to the second one if the first one is down.
All your static files should be served from the web server rather than the app server, with the correct headers to enable HTTP caching.
If you are using Ruby on Rails to power your store, you can precompile the assets on the app servers and let the web server serve them.
Another important thing is to ensure that your SSL certificates are set up correctly, with the right root and intermediate CAs.
To help secure your site, I recommend enabling rate-limit-based IP blocking: this will help protect your eCommerce store against DDoS attacks.
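To make the rate-limiting idea concrete, here is a minimal sketch of a sliding-window counter per IP. In practice this usually lives in the web server or firewall rather than in application code; the class name, the limits and the injectable clock are assumptions made for illustration only.

```python
import time
from collections import defaultdict, deque


class RateLimiter:
    """Sliding-window request counter per IP (hypothetical helper, illustration only)."""

    def __init__(self, max_requests, window_seconds, clock=time.monotonic):
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self.clock = clock  # injectable so the behaviour can be exercised in tests
        self.hits = defaultdict(deque)  # ip -> timestamps of recent requests

    def allow(self, ip):
        """Return True if the request may proceed, False if the IP is over the limit."""
        now = self.clock()
        recent = self.hits[ip]
        # Drop timestamps that have fallen out of the window.
        while recent and now - recent[0] >= self.window_seconds:
            recent.popleft()
        if len(recent) >= self.max_requests:
            return False
        recent.append(now)
        return True
```

A request handler would call `allow(ip)` first and answer with an error (or drop the connection) when it returns False. Each IP gets its own window, so one noisy client does not affect the others.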
Database
If you are using MySQL, I recommend disabling the query cache and adjusting innodb_buffer_pool_size to a more suitable value. You can find several good blog posts about choosing the correct value based on your total RAM. If you keep the MySQL query cache and you have a vast number of updates, MySQL has to invalidate cached results internally on each update, which adds latency and can eventually block new updates.
Calculate the right maximum connection limit, taking into account the new resources you might add: you do not want to wait until you have a lot of traffic to restart your database server.
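The arithmetic behind that limit can be sketched as follows. The function name, the parameters and the 20% headroom are assumptions for illustration; plug in the numbers of the fleet you expect to run at peak, including any async job workers.

```python
def required_db_connections(app_servers, processes_per_server, pool_per_process,
                            job_workers=0, pool_per_worker=1, headroom_percent=20):
    """Estimate the connection limit needed for a given fleet size (illustrative)."""
    web = app_servers * processes_per_server * pool_per_process
    jobs = job_workers * pool_per_worker
    total = web + jobs
    # Integer ceiling division so the headroom always rounds up.
    return (total * (100 + headroom_percent) + 99) // 100
```

For example, 6 app servers with 4 processes each and a pool of 5 connections per process, plus 10 job workers with a pool of 5 each, gives 120 + 50 = 170 connections, or 204 with 20% headroom.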
Make sure your backups are working and that they are recoverable; sometimes we forget to check that a backup can actually be restored. Use SSDs, preferably.
Talking about replication, ideally you should have two replicas: one replica in real time and another one with some minutes of delay, just in case you need to recover from a pretty bad disaster that also gets replicated (like an accidental DELETE without a WHERE clause).
Audit your database for missing indexes: you can use the MySQL slow query log or PostgreSQL's pg_stat_user_indexes and pg_stat_user_tables views to discover them and add them!
If you are using PostgreSQL, enable asynchronous commits: this eases IO pressure by letting the server flush commits to disk in small batches instead of one at a time. Start with a delay of around 100ms between flushes and see how it behaves. Konstantin Gredeskoul gave a pretty awesome talk at MagmaConf about their experience scaling Wanelo.com; you can find the slides here.
Faster page rendering
Make sure your CDN is properly set up and serves minified assets with gzip support enabled.
Check the headers to take advantage of HTTP caching.
It is recommended to use up to four different hosts, on domains other than your store's, as CDN hosts: if you use the same domain, the browser will send all the cookies with each request, and they are not needed there.
Hosting & Deployments
If your site is hosted on Heroku, I suggest using the Adept Scale add-on and calculating its maximum based on the connections allowed by your database. Also make sure your expected response time is accurate.
If you are managing your infrastructure, provision some machines and put them to ‘sleep’ (hopefully, your hosting provider allows it without billing those machines). If you need more resources, you are a few clicks away from turning them on and serving more requests, instead of waiting until more machines are provisioned.
Make sure your provisioning scripts are updated and fully working.
It is also important to make sure that more than one person is able to deploy to production, since you might need a hotfix. Your deployment should take less than 5 minutes and make it easy to roll back to a previous version.
Monitor EVERYTHING you can (memory, CPU, IO, network) and set alerts with thresholds. You can even use New Relic and its agent to do the whole job.
App Servers
Enable fragment caching and use at least two cache servers. The best approach is to take advantage of multi GET functions if your cache store supports them. For example, if you are rendering a category view, you can fetch all the product tiles with a single call to your cache, saving the time added by individual calls to the cache server.
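The category view example can be sketched like this. `InMemoryCache` is a hypothetical stand-in for a real cache client (it only counts round trips), and the `product_tile/<id>` key format and `render_tile` callback are assumptions for illustration.

```python
class InMemoryCache:
    """Stand-in for a cache client; counts round trips for illustration."""

    def __init__(self):
        self.store = {}
        self.round_trips = 0

    def get_multi(self, keys):
        """Return the keys that were found, using a single round trip."""
        self.round_trips += 1
        return {k: self.store[k] for k in keys if k in self.store}

    def set(self, key, value):
        self.store[key] = value


def render_category(cache, product_ids, render_tile):
    """Fetch every product tile with one multi GET and render only the misses."""
    keys = {pid: f"product_tile/{pid}" for pid in product_ids}
    cached = cache.get_multi(list(keys.values()))
    tiles = []
    for pid in product_ids:
        html = cached.get(keys[pid])
        if html is None:
            # Cache miss: render the tile and store it for the next request.
            html = render_tile(pid)
            cache.set(keys[pid], html)
        tiles.append(html)
    return tiles
```

With 40 products on a page, this costs one read round trip to the cache instead of 40, and only the tiles that were missing get rendered.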
When expiring caches, make sure to go through every single cache server expiring the same key.
Add cache to your API endpoints that are read-only with a correct expiration strategy.
Use counter cache as much as possible and preferably delegate this to Redis.
And one last tip: try to eliminate every SELECT count(1) FROM table used to calculate the number of pages when paginating results, and replace it with 1000+ or similar.
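One way to get a "1000+" style label is to let the database stop counting once it passes the cap. The sketch below uses SQLite only so that it is runnable as-is; the `products` table, the function name and the default cap are assumptions, and the same subquery-with-LIMIT idea should carry over to other databases.

```python
import sqlite3


def capped_count_label(conn, cap=1000):
    """Count rows in `products`, but never look at more than cap + 1 of them."""
    row = conn.execute(
        "SELECT count(*) FROM (SELECT 1 FROM products LIMIT ?)", (cap + 1,)
    ).fetchone()
    n = row[0]
    # More than `cap` rows exist: show "1000+" instead of the exact total.
    return f"{cap}+" if n > cap else str(n)
```

The pagination UI then shows either the exact number of results or the capped label, and the query cost stays bounded no matter how large the table grows.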
Tracking activity and audience
Make sure your Facebook meta tags are fully working and add tracking codes to every single share button.
Test your product detail page with Twitter cards and make sure that they are working properly. Also, I suggest adding the Facebook segmentation pixel: you'll know your audience better.
Protip: give this data to your marketing team so they can adjust Facebook campaigns.
Finally, make sure that Google Analytics eCommerce is properly working.
Queueing systems
Execute asynchronously every task that is not needed to render the content (such as email delivery, cache expiration, etc.).
Check the connection limit of your queueing backend and maybe add more app servers if you reach this limit.
Take into consideration the connections used by your async job processor in your total of max database connections: it's pretty easy to reach limits without realizing it.
One last thing: add uniqueness to your jobs so you do not enqueue the same job twice with the same parameters; it could save a lot of writes!
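A minimal sketch of job uniqueness, assuming a fingerprint built from the job name and its parameters. `UniqueQueue` is a hypothetical in-memory stand-in; in a real setup the set of pending fingerprints would live in a store shared by all app servers, and your job processor may already offer a plugin for this.

```python
import hashlib
import json


def job_fingerprint(job_name, params):
    """Stable digest of a job name and its parameters (key order does not matter)."""
    payload = json.dumps({"job": job_name, "params": params}, sort_keys=True)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


class UniqueQueue:
    """In-memory queue that refuses duplicates of jobs still pending."""

    def __init__(self):
        self.pending = {}  # fingerprint -> (job_name, params)

    def enqueue(self, job_name, params):
        """Return True if the job was enqueued, False if an identical one is pending."""
        key = job_fingerprint(job_name, params)
        if key in self.pending:
            return False
        self.pending[key] = (job_name, params)
        return True

    def complete(self, job_name, params):
        """Release the fingerprint so the same job can be enqueued again later."""
        self.pending.pop(job_fingerprint(job_name, params), None)
```

Releasing the fingerprint on completion matters: otherwise a legitimate later run of the same job, such as a second cache expiration for the same product, would be silently dropped.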
Integrations
Talking about third-party integrations, make external calls fault tolerant: third-party services will be hit by a lot of requests as well.
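As a rough sketch of what fault tolerant can mean here: retry a few times with a growing pause, then fall back to a safe default instead of failing the whole request. The function name, the retry count and the backoff values are assumptions, and `fn` is expected to enforce its own timeout on the external call.

```python
import time


def call_with_fallback(fn, fallback, retries=2, backoff_seconds=0.1, sleep=time.sleep):
    """Call fn(); on failure retry with exponential backoff, then return fallback."""
    for attempt in range(retries + 1):
        try:
            return fn()
        except Exception:
            # Pause before the next attempt; skip the pause after the last one.
            if attempt < retries:
                sleep(backoff_seconds * (2 ** attempt))
    return fallback
```

For instance, a shipping-rates lookup could fall back to the last cached rates, so the checkout keeps working while the third-party service is struggling.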
Do you need help?
We've been working on large projects to get everything ready for this season. If you need help tuning up your infrastructure or need to finish a feature before the holidays, do not hesitate to contact us! We can help you.