Presentation Transcript

Web Infrastructure: Surviving The “Hockey Stick”: 

Jonathan Abrams, geekSessions, July 31, 2007

Slide2: 

Friendster Growth

How to scale a webapp: 

Lightweight non-sticky sessions
Cache almost everything (memcached)
Decouple slow processes from webapp
Segment the database (don’t use replication to scale)
Scale up not out (innovate on your app, not your infrastructure)

Session Management: 

Use simple, lightweight sessions; store them in a centralized location that can be volatile (memcached at Socializr)
9F6077E3322B4E24C90C43178F42B9C0FFD1E4A43AED1BCA -> 656280029
Keep other user data in cache or cookies
Avoid sticky sessions; keep load balancing simple
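The token-to-number line on this slide reads like a session token mapped to a user id. A minimal sketch of that idea follows, assuming an in-process dict as a stand-in for memcached; the class name `VolatileSessionStore`, the TTL value, and the injectable clock are all hypothetical and not from the talk.

```python
import secrets
import time


class VolatileSessionStore:
    """Stand-in for a memcached-like cache: entries may expire at any time."""

    def __init__(self, ttl_seconds=1800, clock=time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock
        self._data = {}  # token -> (user_id, expires_at)

    def create(self, user_id):
        # 24 random bytes -> 48 hex characters, the same length as the slide's token.
        token = secrets.token_hex(24).upper()
        self._data[token] = (user_id, self._clock() + self._ttl)
        return token

    def lookup(self, token):
        # Returns the user id, or None if the session is unknown or expired.
        entry = self._data.get(token)
        if entry is None:
            return None
        user_id, expires_at = entry
        if self._clock() >= expires_at:
            del self._data[token]
            return None
        return user_id
```

Because the session holds only a user id, any webapp node can resolve it from the shared store, so the load balancer does not need sticky routing. A lost entry just means the user logs in again.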

Tim Bray - Nov 2006 “Comparing Frameworks”: 

“For Web apps, I’ve given PHP the edge, because I think building scalable PHP is a little easier. By default, PHP gives you a ‘shared-nothing’ (or at least ‘shared very little’) architecture, which means you’re going to scale out pretty well until your database hits the wall. Java is a much richer system and assumes you’re smart enough to know whether a shared-nothing architecture is appropriate or not. The effect is, you have to be smarter to get the same kind of scaling out of Java.”

Evite Session Management: 

Evite Session Management

Caching: 

Use memcached; don’t invent your own
Put a large memcached instance on every webapp node
Cache almost everything, but think through your expiration strategy and invalidation rules
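The two concerns named on this slide, expiration and invalidation, can be sketched as a read-through cache with a TTL plus an explicit delete on write. This is only an illustration: `TTLCache` is a hypothetical in-process stand-in for memcached, and `get_profile`, `update_profile`, and the dict used as the "database" are assumptions.

```python
import time


class TTLCache:
    """Tiny in-process stand-in for memcached (get / set with TTL / delete)."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._items = {}  # key -> (value, expires_at)

    def get(self, key):
        entry = self._items.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if self._clock() >= expires_at:
            del self._items[key]  # expiration strategy: drop on first stale read
            return None
        return value

    def set(self, key, value, ttl):
        self._items[key] = (value, self._clock() + ttl)

    def delete(self, key):
        self._items.pop(key, None)


def get_profile(cache, db, user_id, ttl=300):
    # Read path: try the cache first, fall back to the database on a miss.
    key = f"profile:{user_id}"
    cached = cache.get(key)
    if cached is not None:
        return cached
    value = db[user_id]
    cache.set(key, value, ttl)
    return value


def update_profile(cache, db, user_id, new_value):
    # Write path: update the database, then invalidate the cached copy.
    db[user_id] = new_value
    cache.delete(f"profile:{user_id}")
```

Without the `delete` in `update_profile`, readers would keep seeing the old value until the TTL ran out, which is the kind of rule the slide asks you to decide up front.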

Avoid queries in loops: 

Queries in loops are SLOW and strain the database
Friendster in 2006: 100s of db and cache queries per page
Don’t be afraid of joins when they are optimized and well-indexed and the results are cached
Cache big results
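To make the difference concrete, here is a small sketch using SQLite from the Python standard library rather than MySQL. The `users` and `photos` tables and their contents are invented for illustration. Both functions build the same result, but the first issues one query per user while the second issues a single indexed join.

```python
import sqlite3


def make_db():
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE photos (id INTEGER PRIMARY KEY, user_id INTEGER, title TEXT);
        CREATE INDEX photos_user ON photos (user_id);
        INSERT INTO users VALUES (1, 'tim'), (2, 'ann'), (3, 'bob');
        INSERT INTO photos VALUES (1, 1, 'beach'), (2, 1, 'party'), (3, 2, 'cat');
    """)
    return conn


def photos_with_loop(conn):
    """Anti-pattern: one extra query per user."""
    queries = 1
    result = {}
    users = conn.execute("SELECT id, name FROM users ORDER BY id").fetchall()
    for user_id, name in users:
        rows = conn.execute(
            "SELECT title FROM photos WHERE user_id = ? ORDER BY id", (user_id,)
        ).fetchall()
        queries += 1
        result[name] = [title for (title,) in rows]
    return result, queries


def photos_with_join(conn):
    """Single query: join on the indexed column, group rows in code."""
    queries = 1
    result = {}
    rows = conn.execute("""
        SELECT u.name, p.title
        FROM users u LEFT JOIN photos p ON p.user_id = u.id
        ORDER BY u.id, p.id
    """)
    for name, title in rows:
        result.setdefault(name, [])
        if title is not None:  # users without photos yield a NULL title
            result[name].append(title)
    return result, queries
```

With three users the loop version costs four queries and the join costs one; the gap grows linearly with the number of rows looped over.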

MySQL replication is not for scaling: 

If you have mostly reads, just use memcached, not slaves
If you have many writes, the master will still be a bottleneck, and you will experience slave lag
Scaling requires you to segment the db, not replicate (especially blobs)
Use replication only for redundancy (with some exceptions, e.g. joins on shards)

How not to architect a “friend tracker”: 

Tim adds a new photo -> update his 100 friends’ trackers
Each user has their own tracker rows, sharded based on owner:
insert into tracker (owner, user, date, type, param) values …
For each trackable event, # of db writes = # of friends for that user
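The write amplification described here can be sketched in a few lines. The example uses a single in-memory SQLite table instead of sharded MySQL, and the function name, friend ids, date, and `param` value are all made up; only the column list comes from the slide.

```python
import sqlite3


def record_event_fanout(conn, user, friends, date, type_, param):
    """Write one tracker row per friend (the approach the slide warns against)."""
    rows = [(friend, user, date, type_, param) for friend in friends]
    conn.executemany(
        "INSERT INTO tracker (owner, user, date, type, param) VALUES (?, ?, ?, ?, ?)",
        rows,
    )
    return len(rows)


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE tracker (owner INTEGER, user INTEGER, date TEXT, type TEXT, param TEXT)"
)

friends_of_tim = list(range(2, 102))  # 100 hypothetical friend ids
writes = record_event_fanout(conn, 1, friends_of_tim, "2007-07-31", "photo", "photo_id=42")
```

One photo upload turns into `writes == 100` inserts. In a setup sharded by owner those rows would also be spread over many shards, so a single user action touches many databases.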

A better “friend tracker” approach:: 

Tim adds a new photo -> update his 100 friends’ trackers
Each user has their own update rows, sharded based on user:
insert into tracker (user, date, type, param) values …
For each trackable event, # of db writes = 1
To compute someone’s “tracker”: for each of n shards, select * from tracker where user in (list), then aggregate in the code
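A rough sketch of this read-time aggregation follows, assuming in-memory SQLite databases as stand-ins for MySQL shards and a simple modulo on the user id as the sharding rule. The shard count, function names, and sample events are hypothetical.

```python
import sqlite3

NUM_SHARDS = 4  # hypothetical shard count


def make_shards(n=NUM_SHARDS):
    shards = []
    for _ in range(n):
        conn = sqlite3.connect(":memory:")
        conn.execute(
            "CREATE TABLE tracker (user INTEGER, date TEXT, type TEXT, param TEXT)"
        )
        shards.append(conn)
    return shards


def shard_for(shards, user):
    # Assumed routing rule: rows live on the shard of the user who acted.
    return shards[user % len(shards)]


def record_event(shards, user, date, type_, param):
    """One trackable event -> exactly one row on one shard."""
    shard_for(shards, user).execute(
        "INSERT INTO tracker (user, date, type, param) VALUES (?, ?, ?, ?)",
        (user, date, type_, param),
    )
    return 1


def compute_tracker(shards, friend_ids):
    """Ask every shard for the friends' events, then merge in application code."""
    if not friend_ids:
        return []
    placeholders = ",".join("?" for _ in friend_ids)
    sql = f"SELECT user, date, type, param FROM tracker WHERE user IN ({placeholders})"
    events = []
    for conn in shards:
        events.extend(conn.execute(sql, list(friend_ids)).fetchall())
    events.sort(key=lambda row: row[1], reverse=True)  # newest first
    return events
```

The cost moves from writes to reads: each tracker view needs one query per shard, which is bounded by the number of shards rather than the number of friends, and the merged result is a natural candidate for caching.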

Decouple slow processes: 

Expensive computations (e.g. graph)
Uploads & photo processing
External content integration (via screen scraping, APIs, RSS, etc.)
How: iframe, AJAX, POSTs and redirects, subdomains, etc.
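One way to read the iframe/AJAX suggestion: the main page handler never runs the slow work itself, it only emits a reference to a separate endpoint that the browser fetches afterwards. The sketch below uses plain functions instead of a real web framework; the `/fragments/graph` path, the function names, and the call counter are invented for illustration.

```python
from html import escape
from urllib.parse import urlencode

calls = {"graph": 0}  # counts how often the expensive step actually runs


def expensive_graph_computation(user_id):
    # Placeholder for a slow operation such as a social-graph calculation.
    calls["graph"] += 1
    return f"graph data for user {user_id}"


def render_profile_page(user_id):
    """Main page: returns at once; the slow part is only referenced by URL."""
    src = "/fragments/graph?" + urlencode({"user": user_id})
    return f'<h1>Profile {user_id}</h1><iframe src="{escape(src)}"></iframe>'


def render_graph_fragment(user_id):
    """Separate endpoint that the browser loads later (iframe or AJAX)."""
    return f"<div>{escape(expensive_graph_computation(user_id))}</div>"
```

Rendering the profile page leaves `calls["graph"]` at zero, so a slow or failing graph computation delays only its own fragment, not the page around it.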

How not to decouple:: 

[Diagram with three components: WebApp, Queue, Servers. Labels: add photo; POST url: trackevent.php …; … trackevent.php?user=1…]

Thank You: 

abrams@jabrams.com
www.socializr.com/jobs