
Friday, October 26, 2007

SPAN SUMMIT LIVE: Scaling Facebook Apps

Pretty good pointers for developers of Facebook apps on how to scale them.

By Mark Mayo; CEO, Joyent

How to Scale

  • Vertically (db)
  • Horizontally (page generation) – stateless pages + segregated inserts
  • Agile Infrastructure + Dev – Optimization takes priority over features as soon as the app goes viral. Users arrive fast and furious.

Specifics

Example of “Are you normal” by Kinzin

- They had their own infrastructure from independent website

- 1st Facebook app

- but used Joyent

- Traffic: 150k yesterday, 200k today

What they did in the app

  1. Think about re-engagement
  2. First 50k users are the most difficult. Advertised to get those users
  3. Got onto the recently popular page. Traffic peaked each time they appeared on the recently popular page

Their Story

Started Small

  • started with $15 shared hosting
  • chose $45 accelerator
  • 2x125 (1GB) for launch
  • preconfigured and tested load balancing

First Recently popular page

  • 10K users
  • db was 100% cpu
  • not enough mongrels to handle db connections
  • what they did
    • split and clone stateless app server container
    • double database memory
    • tripled performance in 2 hours

It's always the database

- look for slow queries (obvious but overlooked, untested)

- find offending code, hand optimize sql (do it)

- memcache everything (no other option)

- run a memcached per app server

- most apps don’t need to go to multiple dbs (very expensive option)
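The "memcache everything" advice can be sketched as a cache-aside helper. This is only an illustration, not what Kinzin ran: `LocalCache` is a hypothetical in-process stand-in for a memcached client, and `cached_query` and its parameters are invented names. In a real deployment the cache object would be a client talking to the memcached running on each app server.

```python
import time


class LocalCache:
    """Hypothetical in-process stand-in for a memcached client (get/set with TTL)."""

    def __init__(self):
        self._store = {}

    def get(self, key):
        entry = self._store.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if time.monotonic() >= expires_at:
            # Entry has expired: drop it and report a miss.
            del self._store[key]
            return None
        return value

    def set(self, key, value, ttl_seconds):
        self._store[key] = (value, time.monotonic() + ttl_seconds)


def cached_query(cache, key, run_query, ttl_seconds=60):
    """Cache-aside: return the cached value, else run the query and store the result.

    Note: a query that returns None is never cached in this sketch.
    """
    value = cache.get(key)
    if value is None:
        value = run_query()  # only hit the database on a cache miss
        cache.set(key, value, ttl_seconds)
    return value
```

With this shape, a page that calls `cached_query(cache, "top_scores", load_top_scores)` on every request only reaches the database once per TTL window; `load_top_scores` here is a placeholder for whatever slow query the page needs.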

Second recently popular

- 50k users

- much better caching

- just one slow query

- added 2 more app servers (4x now)

- db lagging on I/O now

Recently popular #3

- already dedicated disk spindles for data and logs

- 100k users

- no problems, db cpu under 15%

What happened: went totally viral in 2 weeks. Very little time if your app goes viral. Have a strategy.

Next Steps

- mysql replica

- split read/write queries in code

- add appservers as needed

- metrics indicate another 3-4 app servers to reach 1M with capacity to spare
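Splitting read and write queries in code might look roughly like the sketch below. It assumes one primary and a list of replicas; `QueryRouter` and `pick` are hypothetical names, and the connections are plain placeholders. Only statements starting with SELECT go to a replica, and everything else stays on the primary.

```python
import itertools


class QueryRouter:
    """Hypothetical router: SELECTs go to replicas in round-robin, the rest to the primary."""

    def __init__(self, primary, replicas):
        self.primary = primary
        # Cycle through replicas forever; None means "no replicas configured".
        self._replicas = itertools.cycle(replicas) if replicas else None

    def pick(self, sql):
        parts = sql.split()
        verb = parts[0].upper() if parts else ""
        if verb == "SELECT" and self._replicas is not None:
            return next(self._replicas)
        # Writes, DDL, and anything unrecognized go to the primary.
        return self.primary


router = QueryRouter("primary-db", ["replica-1", "replica-2"])
print(router.pick("INSERT INTO answers VALUES (1, 2)"))
print(router.pick("SELECT * FROM answers"))
```

One caveat with this design: a replica can lag behind the primary, so a read issued right after a write may need to go to the primary to see the new row.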

Lessons learnt

- you only get hours on recently popular page

- must have a plan to add capacity on the fly in tens of minutes

- plan how to attack optimization in your code

- test with more than sample data to find those slow queries; it's always the db

Q&A

- to load test, write a script to add users
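A script along those lines could be as small as the sketch below. It uses an in-memory SQLite database purely so the example is self-contained; the talk's setup was MySQL. The `users` table, its columns, and the helper names are all assumptions made for illustration. The idea is to fill the table with far more than sample data and then time the queries the app actually runs.

```python
import sqlite3
import time


def add_users(conn, count):
    """Insert `count` synthetic users in a single transaction."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS users "
        "(id INTEGER PRIMARY KEY, name TEXT, score INTEGER)"
    )
    rows = ((f"user{i}", i % 100) for i in range(count))
    with conn:  # commits on success, rolls back on error
        conn.executemany("INSERT INTO users (name, score) VALUES (?, ?)", rows)


def timed_query(conn, sql, params=()):
    """Run a query and return (rows, elapsed seconds)."""
    start = time.perf_counter()
    rows = conn.execute(sql, params).fetchall()
    return rows, time.perf_counter() - start


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    add_users(conn, 200_000)
    rows, elapsed = timed_query(
        conn, "SELECT COUNT(*) FROM users WHERE score = ?", (42,)
    )
    print(f"{rows[0][0]} matching users in {elapsed * 1000:.1f} ms")
```

Rerunning the timing at increasing user counts, before and after adding an index or rewriting a query, gives a rough picture of which queries will turn slow as the app grows.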