Building a Scalable Architecture for Web Apps - Part I
(Lessons Learned @ Directi)

By Bhavin Turakhia
CEO, Directi
(http://www.directi.com | http://wiki.directi.com | http://careers.directi.com)

Licensed under Creative Commons Attribution Sharealike Noncommercial

Agenda

Why is Scalability important
Introduction to the Variables and Factors
Building our own Scalable Architecture (in incremental steps)
Vertical Scaling
Vertical Partitioning
Horizontal Scaling
Horizontal Partitioning
… etc
Platform Selection Considerations
Tips

Why is Scalability Important in a Web 2.0 world

Viral marketing can result in instant successes
RSS / Ajax / SOA
Pull-based / polling-type access patterns
XML protocols, where meta-data often exceeds the actual data
Number of Requests exponentially grows with user base
RoR / Grails – Dynamic language landscape gaining popularity
In the end you want to build a Web 2.0 app that can serve millions of users with ZERO downtime

The Variables

Scalability - Number of users / sessions / transactions / operations the entire system can perform
Performance – Optimal utilization of resources
Responsiveness – Time taken per operation
Availability - Probability of the application or a portion of the application being available at any given point in time
Downtime Impact - The impact of a downtime of a server/service/resource - number of users, type of impact etc
Cost
Maintenance Effort

High: scalability, availability, performance & responsiveness
Low: downtime impact, cost & maintenance effort

The Factors

Platform selection
Hardware
Application Design
Database/Datastore Structure and Architecture
Deployment Architecture
Storage Architecture
Abuse prevention
Monitoring mechanisms
… and more

Let's Start …

We will now build an example architecture for an example app using the following iterative incremental steps –
Inspect current Architecture
Identify Scalability Bottlenecks
Identify SPOFs (Single Points of Failure) and Availability Issues
Identify Downtime Impact Risk Zones
Apply one of -
Vertical Scaling
Vertical Partitioning
Horizontal Scaling
Horizontal Partitioning
Repeat process

Step 1 – Let's Start …
[Diagram: a single node running the App Server & DB Server]

Step 2 – Vertical Scaling
[Diagram: the same App Server / DB Server node with additional CPU and RAM]

Introduction
Increasing the hardware resources without changing the number of nodes
Referred to as “Scaling up” the Server
Advantages
Simple to implement
Disadvantages
Finite limit
Hardware does not scale linearly (diminishing returns for each incremental unit)
Requires downtime
Increases Downtime Impact
Incremental costs increase exponentially

Step 3 – Vertical Partitioning (Services)
[Diagram: the App Server and DB Server deployed on separate nodes]

Introduction
Deploying each service on a separate node
Positives
Increases per application Availability
Task-based specialization, optimization and tuning possible
Reduces context switching
Simple to implement for out of band processes
No changes to App required
Flexibility increases
Negatives
Sub-optimal resource utilization
May not increase overall availability
Finite Scalability

Understanding Vertical Partitioning

The term Vertical Partitioning denotes –
Increase in the number of nodes by distributing the tasks/functions
Each node (or cluster) performs separate Tasks
Each node (or cluster) is different from the other
Vertical Partitioning can be performed at various layers (App / Server / Data / Hardware etc)

Step 4 – Horizontal Scaling (App Server)
[Diagram: a Load Balancer in front of multiple App Servers, all sharing one DB Server]

Introduction
Increasing the number of nodes of the App Server through Load Balancing
Referred to as “Scaling out” the App Server

Understanding Horizontal Scaling

The term Horizontal Scaling denotes –
Increase in the number of nodes by replicating the nodes
Each node performs the same Tasks
Each node is identical
Typically the collection of nodes may be known as a cluster (though the term cluster is often misused)
Also referred to as “Scaling Out”
Horizontal Scaling can be performed for any particular type of node (AppServer / DBServer etc)

Load Balancer – Hardware vs Software

Hardware Load balancers are faster
Software Load balancers are more customizable
With HTTP Servers, load balancing is typically combined with HTTP accelerators

Load Balancer – Session Management

Sticky Sessions
Requests for a given user are sent to a fixed App Server
Observations
Asymmetrical load distribution (especially during downtimes)
Downtime Impact – Loss of session data
[Diagram: Sticky Sessions – the Load Balancer routes User 1 and User 2 each to a fixed App Server]
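A minimal sketch of the idea behind sticky routing, assuming an illustrative list of App Servers and a hash on the session ID (real load balancers typically pin sessions via a cookie or source-IP hash rather than application code like this):

```java
import java.util.List;

// Illustrative sticky-session routing: the same session ID always maps to the
// same App Server, so its session data stays on that node.
public class StickyRouter {
    private final List<String> appServers; // e.g. ["app1:8080", "app2:8080", "app3:8080"]

    public StickyRouter(List<String> appServers) {
        this.appServers = appServers;
    }

    // Deterministically pick a server from the session ID.
    public String serverFor(String sessionId) {
        int bucket = Math.abs(sessionId.hashCode() % appServers.size());
        return appServers.get(bucket);
    }
}
```

If a server drops out of the list, every session that hashed to it is lost and the remaining traffic is redistributed unevenly, which is exactly the asymmetrical-load and downtime-impact observation above.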
Load Balancer – Session Management

Central Session Store
Introduces SPOF
An additional variable
Session reads and writes generate Disk + Network I/O
Also known as a Shared Session Store Cluster
[Diagram: Load Balancer and App Servers all reading/writing sessions from a central Session Store]

Load Balancer – Session Management

Clustered Session Management
Easier to set up
No SPOF
Session reads are instantaneous
Session writes generate Network I/O
Network I/O increases exponentially with increase in number of nodes
In very rare circumstances a request may get stale session data
User request reaches subsequent node faster than intra-node message
Intra-node communication fails
AKA Shared-nothing Cluster
[Diagram: Load Balancer and App Servers using clustered (peer-to-peer) session management]

Load Balancer – Session Management

Sticky Sessions with Central Session Store
Downtime does not cause loss of data
Session reads need not generate network I/O
Sticky Sessions with Clustered Session Management
No specific advantages
[Diagram: Sticky Sessions – the Load Balancer routes User 1 and User 2 each to a fixed App Server]

Load Balancer – Session Management

Recommendation
Use Clustered Session Management if you have –
Smaller Number of App Servers
Fewer Session writes
Use a Central Session Store elsewhere
Use sticky sessions only if you have to

Load Balancer – Removing SPOF

In a Load Balanced App Server Cluster the LB is an SPOF
Set up the LB in Active-Active or Active-Passive mode
Note: Active-Active nevertheless assumes that each LB is independently able to take up the load of the other
If one wants ZERO downtime, then Active-Active becomes truly cost-beneficial only if multiple LBs (more than 3 to 4) are daisy-chained as Active-Active, forming an LB Cluster
[Diagram: Active-Passive and Active-Active Load Balancer pairs in front of the App Server cluster]

Step 4 – Horizontal Scaling (App Server)
[Diagram: the deployment after Step 4 – Load Balancer, App Server cluster and a single DB Server]

Our deployment at the end of Step 4
Positives
Increases Availability and Scalability
No changes to App required
Easy setup
Negatives
Finite Scalability

Step 5 – Vertical Partitioning (Hardware)
[Diagram: the DB Server with its storage moved out to a SAN]

Introduction
Partitioning out the Storage function using a SAN
SAN config options
Refer to “Demystifying Storage” at http://wiki.directi.com -> Dev University -> Presentations
Positives
Allows “Scaling Up” the DB Server
Boosts Performance of DB Server
Negatives
Increases Cost

Step 6 – Horizontal Scaling (DB)

Introduction
Increasing the number of DB nodes
Referred to as “Scaling out” the DB Server
Options
Shared nothing Cluster
Real Application Cluster (or Shared Storage Cluster)
[Diagram: multiple DB Servers, with storage on a SAN]

Shared Nothing Cluster

Each DB Server node has its own complete copy of the database
Nothing is shared between the DB Server Nodes
This is achieved through DB Replication at DB / Driver / App level or through a proxy
Supported by most RDBMSs natively or through 3rd party software
[Diagram: three DB Servers, each with its own copy of the Database]
Note: The actual DB files may be stored on a central SAN

Replication Considerations

Master-Slave
Writes are sent to a single master which replicates the data to multiple slave nodes
Replication may be cascaded
Simple setup
No conflict management required
Multi-Master
Writes can be sent to any of the multiple masters which replicate them to other masters and slaves
Conflict Management required
Deadlocks possible if the same data is simultaneously modified in multiple places

Replication Considerations

Asynchronous
Guaranteed, but out-of-band replication from Master to Slave
Master updates its own db and returns a response to client
Replication from Master to Slave takes place asynchronously
Faster response to a client
Slave data is marginally behind the Master
Requires modification to App to send critical reads and writes to master, and load balance all other reads
Synchronous
Guaranteed, in-band replication from Master to Slave
Master updates its own db, and confirms all slaves have updated their db before returning a response to client
Slower response to a client
Slaves have the same data as the Master at all times
Requires modification to App to send writes to master and load balance all reads

Replication Considerations

Replication at RDBMS level
Support may exist in the RDBMS or through a 3rd party tool
Faster and more reliable
App must send writes to Master, reads to any db and critical reads to Master
Replication at Driver / DAO level
Driver / DAO layer ensures
writes are performed on all connected DBs
Reads are load balanced
Critical reads are sent to a Master
In most cases RDBMS agnostic
Slower and in some cases less reliable

Real Application Cluster

All DB Servers in the cluster share a common storage area on a SAN
All DB servers mount the same block device
The filesystem must be a clustered file system (eg GFS / OFS)
Currently only supported by Oracle Real Application Cluster
Can be very expensive (licensing fees)
[Diagram: multiple DB Servers sharing a single Database on a SAN]

Recommendation

Try and choose a DB which natively supports Master-Slave replication
Use Master-Slave Async replication
Write your DAO layer to ensure
writes are sent to a single DB
reads are load balanced
Critical reads are sent to a master
[Diagram: writes & critical reads go to the master DB Server; other reads are load balanced across the remaining DB Servers]
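A minimal sketch of such a DAO-level routing layer, assuming one master DataSource and a list of slave DataSources (the class and method names are illustrative, not from the slides):

```java
import java.sql.Connection;
import java.sql.SQLException;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;
import javax.sql.DataSource;

// Sends writes and critical reads to the master; load balances other reads
// across the slaves (round robin). With async replication, "critical" reads
// are those that must not see slightly stale data.
public class RoutingDao {
    private final DataSource master;
    private final List<DataSource> slaves;
    private final AtomicInteger next = new AtomicInteger();

    public RoutingDao(DataSource master, List<DataSource> slaves) {
        this.master = master;
        this.slaves = slaves;
    }

    public Connection writeConnection() throws SQLException {
        return master.getConnection();               // all writes go to the single master
    }

    public Connection readConnection(boolean critical) throws SQLException {
        if (critical || slaves.isEmpty()) {
            return master.getConnection();           // must reflect the latest writes
        }
        int i = Math.floorMod(next.getAndIncrement(), slaves.size());
        return slaves.get(i).getConnection();        // stale-tolerant reads are load balanced
    }
}
```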
Step 6 – Horizontal Scaling (DB)

Our architecture now looks like this
Positives
As Web servers grow, Database nodes can be added
DB Server is no longer SPOF
Negatives
Finite limit

Step 6 – Horizontal Scaling (DB)

Shared nothing clusters have a finite scaling limit
Reads to Writes ratio – 2:1, so 8 reads imply 4 writes
2 DBs – per db: 4 reads and 4 writes
4 DBs – per db: 2 reads and 4 writes
8 DBs – per db: 1 read and 4 writes
At some point adding another node brings negligible incremental benefit
[Diagram: reads split across DB1 and DB2, while all writes go to both]
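The arithmetic above boils down to: reads are split across the N nodes, but every write is replicated to every node, so per-node load is roughly reads/N + writes. A tiny illustration using the slide's numbers (the method is just a restatement of that model):

```java
// In a shared-nothing cluster every node holds a full copy of the data, so
// every write hits every node while reads are spread across the N nodes.
public class ReplicaLoad {
    static double perNodeLoad(double reads, double writes, int nodes) {
        return reads / nodes + writes;
    }

    public static void main(String[] args) {
        // 8 reads and 4 writes (the 2:1 ratio from the slide)
        System.out.println(perNodeLoad(8, 4, 2)); // 8.0 -> 4 reads + 4 writes per db
        System.out.println(perNodeLoad(8, 4, 4)); // 6.0 -> 2 reads + 4 writes per db
        System.out.println(perNodeLoad(8, 4, 8)); // 5.0 -> 1 read  + 4 writes per db
        // The write component never shrinks, hence the diminishing returns.
    }
}
```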
Step 7 – Vertical / Horizontal Partitioning (DB)

Introduction
Increasing the number of DB Clusters by dividing the data
Options
Vertical Partitioning - Dividing tables / columns
Horizontal Partitioning - Dividing by rows (value)
Vertical Partitioning (DB)

Take a set of tables and move them onto another DB
E.g. in a social network, the users table and the friends table can be on separate DB clusters
Each DB Cluster has different tables
Application code or DAO / Driver code or a proxy knows where a given table is and directs queries to the appropriate DB
Can also be done at a column level by moving a set of columns into a separate table
[Diagram: App Cluster querying DB Cluster 1 (Table 1, Table 2) and DB Cluster 2 (Table 3, Table 4)]
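A minimal sketch of table-level routing in the DAO layer (the table-to-cluster assignments and names are illustrative assumptions):

```java
import java.util.Map;
import javax.sql.DataSource;

// Vertical partitioning: each table lives on exactly one DB cluster, and the
// DAO / driver layer picks the cluster from the table being queried.
public class TableRouter {
    private final Map<String, DataSource> clusterByTable;

    public TableRouter(Map<String, DataSource> clusterByTable) {
        // e.g. "users" -> cluster1, "friends" -> cluster2
        this.clusterByTable = clusterByTable;
    }

    public DataSource clusterFor(String table) {
        DataSource cluster = clusterByTable.get(table);
        if (cluster == null) {
            throw new IllegalArgumentException("No DB cluster mapped for table: " + table);
        }
        return cluster;
    }
}
```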
Vertical Partitioning (DB)

Negatives
One cannot perform SQL joins or maintain referential integrity across clusters (referential integrity is in any case over-rated)
Finite Limit
[Diagram: as before – App Cluster with DB Cluster 1 (Tables 1, 2) and DB Cluster 2 (Tables 3, 4)]

Horizontal Partitioning (DB)

Take a set of rows and move them onto another DB
E.g. in a social network, each DB Cluster can contain all data for 1 million users
Each DB Cluster has identical tables
Application code or DAO / Driver code or a proxy knows where a given row is and directs queries to the appropriate DB
Negatives
SQL unions for search-type queries must be performed within code
[Diagram: App Cluster querying DB Cluster 1 and DB Cluster 2, each with identical Tables 1-4 and each holding 1 million users]

Horizontal Partitioning (DB)

Techniques
FCFS (First Come First Served)
1st million users are stored on cluster 1 and the next on cluster 2
Round Robin
Least Used (Balanced)
Each time a new user is added, a DB cluster with the least users is chosen
Hash based
A hashing function is used to determine the DB Cluster in which the user data should be inserted
Value Based
User ids 1 to 1 million stored in cluster 1 OR
all users with names starting from A-M on cluster 1
Except for Hash-based and Value-based, all other techniques also require an independent lookup map, mapping each user to a Database Cluster
This map itself will be stored on a separate DB (which may further need to be replicated)
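A minimal sketch of the Hash-based technique above (the cluster list and key choice are illustrative); Value-based routing would replace the modulo with a range or prefix check, and the lookup-map techniques would replace it with a query against the map DB:

```java
import java.util.List;
import javax.sql.DataSource;

// Horizontal partitioning: every cluster has identical tables, and all rows
// for a given user live on exactly one cluster chosen from the user id.
public class ShardRouter {
    private final List<DataSource> clusters;

    public ShardRouter(List<DataSource> clusters) {
        this.clusters = clusters;
    }

    // Hash-based: deterministic and needs no lookup map, but adding a cluster
    // changes the mapping and forces existing data to be moved.
    public DataSource clusterForUser(long userId) {
        int index = (int) Math.floorMod(userId, (long) clusters.size());
        return clusters.get(index);
    }
}
```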
Step 7 – Vertical / Horizontal Partitioning (DB)
[Diagram: App Cluster with a Lookup Map directing queries to the partitioned DB Clusters]

Our architecture now looks like this
Positives
As App servers grow, Database Clusters can be added
Note: This is not the same as table partitioning provided by the DB (e.g. MSSQL)
We may actually want to further segregate these into Sets, each serving a collection of users (refer to the next slide)

Step 8 – Separating Sets
[Diagram: a Global Redirector with a Global Lookup Map routing users to SET 1 (10 million users) and SET 2 (10 million users), each Set having its own Lookup Map]

Now we consider each deployment as a single Set serving a collection of users

Creating Sets

The goal behind creating sets is easier manageability
Each Set is independent and handles transactions for a set of users
Each Set is architecturally identical to the other
Each Set contains the entire application with all its data structures
Sets can even be deployed in separate datacenters
Users may even be added to a Set that is closer to them in terms of network latency

Step 8 – Horizontal Partitioning (Sets)
[Diagram: a Global Redirector in front of SET 1 and SET 2, each with an App Servers Cluster, DB Clusters and a SAN]

Our architecture now looks like this
Positives
Infinite Scalability
Negatives
Aggregation of data across sets is complex
Users may need to be moved across Sets if sizing is improper
Global App settings and preferences need to be replicated across Sets

Step 9 – Caching

Add caches within App Server
Object Cache
Session Cache (especially if you are using a Central Session Store)
API cache
Page cache
Software
Memcached
Terracotta (Java only)
Coherence (commercial, expensive data grid by Oracle)
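A minimal sketch of a cache-aside object cache inside the App Server, with a small LRU map standing in for memcached / Terracotta / Coherence (the loader, sizes and names are illustrative assumptions):

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Function;

// Cache-aside: look in the cache first, fall back to the DB on a miss, then
// remember the result; invalidate on writes so stale data is not served.
public class ObjectCache<K, V> {
    private final Map<K, V> lru;
    private final Function<K, V> loader; // e.g. a DAO call that hits the database

    public ObjectCache(int maxEntries, Function<K, V> loader) {
        this.loader = loader;
        this.lru = new LinkedHashMap<K, V>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
                return size() > maxEntries;   // evict the least-recently-used entry
            }
        };
    }

    public synchronized V get(K key) {
        V value = lru.get(key);
        if (value == null) {
            value = loader.apply(key);        // cache miss: load from the DB
            lru.put(key, value);
        }
        return value;
    }

    public synchronized void invalidate(K key) {
        lru.remove(key);
    }
}
```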
Step 10 – HTTP Accelerator

If your app is a web app you should add an HTTP Accelerator or a Reverse Proxy
A good HTTP Accelerator / Reverse proxy performs the following –
Redirect static content requests to a lighter HTTP server (lighttpd)
Cache content based on rules (with granular invalidation support)
Use Async NIO on the user side
Maintain a limited pool of Keep-alive connections to the App Server
Intelligent load balancing
Solutions
Nginx (HTTP / IMAP)
Perlbal
Hardware accelerators plus Load Balancers

Step 11 – Other cool stuff

CDNs
IP Anycasting
Async Nonblocking IO (for all Network Servers)
If possible - Async Nonblocking IO for disk
Incorporate a multi-layer caching strategy where required (see the sketch after this list)
L1 cache – in-process with App Server
L2 cache – across network boundary
L3 cache – on disk
Grid computing
Java – GridGain
Erlang – natively built in
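A minimal sketch of the multi-layer caching strategy mentioned above: check the in-process L1, then the networked L2, then fall back to the origin, filling the layers on the way back (the Layer interface and loader are illustrative assumptions):

```java
import java.util.Optional;
import java.util.function.Function;

// Multi-layer read path: L1 is in-process with the App Server, L2 sits across
// a network boundary (e.g. a distributed cache), and the origin is the DB or disk.
public class LayeredCache<K, V> {
    public interface Layer<K, V> {
        Optional<V> get(K key);
        void put(K key, V value);
    }

    private final Layer<K, V> l1;
    private final Layer<K, V> l2;
    private final Function<K, V> origin;

    public LayeredCache(Layer<K, V> l1, Layer<K, V> l2, Function<K, V> origin) {
        this.l1 = l1;
        this.l2 = l2;
        this.origin = origin;
    }

    public V get(K key) {
        Optional<V> hit = l1.get(key);
        if (hit.isPresent()) return hit.get();

        hit = l2.get(key);
        if (hit.isPresent()) {
            l1.put(key, hit.get());          // promote into the faster layer
            return hit.get();
        }

        V value = origin.apply(key);         // slowest path: DB or disk
        l2.put(key, value);
        l1.put(key, value);
        return value;
    }
}
```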
Platform Selection Considerations

Programming Languages and Frameworks
Dynamic languages are slower than static languages
Compiled code runs faster than interpreted code -> use accelerators or pre-compilers
Frameworks that provide Dependency Injection, Reflection and Annotations have a marginal performance impact
ORMs hide DB querying which can in some cases result in poor query performance due to non-optimized querying
RDBMS
MySQL, MSSQL and Oracle support native replication
Postgres supports replication through 3rd party software (Slony)
Oracle supports Real Application Clustering
MySQL uses locking and arbitration, while Postgres/Oracle use MVCC (MSSQL just recently introduced MVCC)
Cache
Terracotta vs memcached vs Coherence

Tips

All the techniques we learnt today can be applied in any order
Try and incorporate Horizontal DB partitioning by value into your design from the beginning
Loosely couple all modules
Implement a REST-ful framework for easier caching
Perform application sizing on an ongoing basis to ensure optimal utilization of hardware

Questions??
bhavin.t@directi.com
http://directi.com
http://careers.directi.com
Download slides: http://wiki.directi.com