Web Services Performance

Presentation Transcript

Performance Issues of Web Services: 

Performance Issues of Web Services CSCI 8710 November 29-30, 2006 Kraemer

Web Services: 

Web Services Services available via the Internet that complete tasks or conduct transactions. Self-contained, modular applications that can be described, published, and invoked over the Internet. Can be automatically invoked by application programs.

Web Services: 

Web Services May be invoked at one site or may combine results of several services executed at different sites.

Performance concerns differ from standard C/S: 

Performance concerns differ from standard C/S May involve both web service processing and network delays May be accessed by wide variety of devices -- desktop computers, PDAs, mobile phones, other servers Access via wireless communication networks: dynamic connectivity, low bandwidth, high latency

Performance concerns differ from standard C/S: 

Performance concerns differ from standard C/S Unpredictable nature of requests Highly bursty Varies with geographical location of clients, day of week, time of day Highly variable size of requested objects “Robot” access Autonomous software agents that can consume significant amounts of system resources

Types of servers providing Web Services: 

Types of servers providing Web Services Web servers Transaction servers Proxy servers Cache servers Wireless gateway servers Mirror servers

Common problems: 

Common problems Insufficient bandwidth at peak times Overloaded servers Uneven server loads Delivery of dynamic content Shortage of connections between application servers and database servers Failure of third-party servers Delivery of multi-media content

Example: Bill Paying Service: 

Example: Bill Paying Service Portal offers bill paying service Customers can pay variety of bills through the service Uses services provided by others: Debit authorization (100 tps capability) Electronic funds transfer Customer authentication

Example: Bill Paying Service: 

Example: Bill Paying Service Portal B is the bill paying service. Treat the overall web service as the ‘system’ and the component services as ‘devices’. What is the capacity of B, given that the debit authorization service can support 100 tps and that each payment transaction requires 2 visits to the debit authorization service? Forced Flow Law: Xi = Vi * X0, so 100 = 2 * X0 and X0 = 50 tps.
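The capacity calculation above can be sketched in a few lines of Python. The function name and argument names are illustrative assumptions, not part of the original slides; the relation itself is Xi = Vi * X0 solved for X0.

```python
def system_capacity(device_capacity_tps: float, visits_per_transaction: float) -> float:
    """Solve Xi = Vi * X0 for X0: the system throughput that saturates device i."""
    if visits_per_transaction <= 0:
        raise ValueError("visit count must be positive")
    return device_capacity_tps / visits_per_transaction


# Debit authorization supports 100 tps and is visited twice per payment.
portal_capacity_tps = system_capacity(100.0, 2.0)
print(portal_capacity_tps)
```

If several component services are involved, the same function can be applied to each one and the smallest result taken as the portal's capacity.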

Web server elements: 

Web server elements

HTML and XML: 

HTML and XML Most documents on the Web written using HTML “markup language” Most consist of text and inline images Can also include other multimedia objects Generates multiple requests: for document and for each inline image -- single click by user may generate series of requests XML uses tags and attributes to define/delimit data Application must interpret meaning of the tags

Hardware and Operating System: 

Hardware and Operating System Hardware view: performance a function of: Number and speed of processors Amount of main memory Bandwidth and storage capacity of disk subsystem Bandwidth of the NIC OS considerations: Performance, scalability, reliability, robustness

Content: 

Content Performance affected by: Content size Content structure Hyperlinks Popularity of content

Perception of Performance: 

Perception of Performance User view: Fast response time; no connections refused Management view: High throughput; high availability Need to have quantitative measurements that describe behavior of Web service

Metrics: 

Metrics Two most important: Response time -- seconds Throughput -- http_ops/sec, also bits/sec

Other metrics: 

Other metrics Hit any connection to a web site, including in-line requests and errors difficult to compare across sites Visit Series of page requests by a user at a single site Inter-request times < timeout_value Session Series of consecutive and related requests made during a single visit Inter-request times < timeout_value
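As a rough illustration of the timeout rule above, the sketch below groups one user's request timestamps into visits: a request belongs to the current visit when the gap from the previous request is below the timeout. The function name, the 1800-second default, and the sample timestamps are assumptions made for this example only.

```python
def split_into_visits(timestamps, timeout_sec=1800.0):
    """Group request times (seconds) into visits using an inter-request timeout."""
    visits = []
    for t in sorted(timestamps):
        if visits and t - visits[-1][-1] < timeout_sec:
            visits[-1].append(t)   # gap below timeout: same visit
        else:
            visits.append([t])     # first request, or gap too long: new visit
    return visits


# Hypothetical request times for a single user, in seconds.
request_times = [0, 40, 95, 4000, 4030, 9000]
visits = split_into_visits(request_times)
print(len(visits), visits)
```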

Other metrics: 

Other metrics User-perceived response time Set of geographically distributed agents poll the WS Error rate Increase indicates degrading performance Examples: Overflow of pending connection queue For streaming services: Jitter Startup latency

Most common measurements of Web service performance: 

Most common measurements of Web service performance End-to-end response time Site response time Throughput (req/sec) Throughput (Mbps) Errors/sec Visitors/day Unique visitors/day

Example - Travel Agency: 

Example - Travel Agency Monitor for 30 minutes: 9000 HTTP requests Three types of objects delivered: Html pages (30%, avg. size 11,200 bytes) Images (65%, avg. size 17,200 bytes) Video clips (5%, avg. size 439,000 bytes) What is the throughput: 9000 requests/1800 sec = 5 req/sec What is the throughput in Kbps?

Throughput in Kbps?: 

Throughput in Kbps? Xr = (total_req * class% * avg. size in bytes * 8)/(time * 1024), taking 1 Kbit = 1024 bits. Xhtml = (9000 * 0.30 * 11,200 * 8)/(1800 * 1024) = 131.25 Kbps. Ximage = (9000 * 0.65 * 17,200 * 8)/(1800 * 1024) = 436.72 Kbps. Xvideo = (9000 * 0.05 * 439,000 * 8)/(1800 * 1024) = 857.42 Kbps. X0 = 131.25 + 436.72 + 857.42 = 1425.39 Kbps. To support the Web traffic, the network connection should be at least a T1 line (1.544 Mbit/s).
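The per-class throughput figures can be reproduced with a short script. This is a sketch under one assumption that the slide does not state explicitly: its numbers only work out if 1 Kbit is taken as 1024 bits. The dictionary layout and names are illustrative.

```python
KBIT = 1024  # assumption: the slide's results imply 1 Kbit = 1024 bits


def class_throughput_kbps(total_requests, share, avg_size_bytes, interval_sec):
    """Throughput contributed by one object class, in Kbps."""
    bits = total_requests * share * avg_size_bytes * 8
    return bits / interval_sec / KBIT


# (share of requests, average size in bytes) per object class
classes = {
    "html": (0.30, 11_200),
    "image": (0.65, 17_200),
    "video": (0.05, 439_000),
}

per_class = {
    name: class_throughput_kbps(9000, share, size, 1800)
    for name, (share, size) in classes.items()
}
total_kbps = sum(per_class.values())

for name, kbps in per_class.items():
    print(f"{name}: {kbps:.2f} Kbps")
print(f"total: {total_kbps:.2f} Kbps")
```

Video clips are only 5% of the requests but account for well over half of the bandwidth, which is why the object mix matters as much as the request rate.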

QoS indicators for Web Services: 

QoS indicators for Web Services Response time Availability Percentage of time a service is ‘live’ (serving customer requests) Reliability Probability that WS will perform in satisfactory manner for a given period of time under specified operating and load conditions Predictability Cost

Input data needed to monitor QoS: 

Input data needed to monitor QoS Traffic Performance Usage patterns Knowledge of average and peak load

Where are the delays?: 

Where are the delays? Four categories: DNS lookup phase TCP connection set-up phase Server execution time Network time

DNS lookup phase: 

DNS lookup phase Browser converts server name in URL into an IP address to establish the TCP connection If server name can’t be resolved by local cache, send query to higher-level DNS server For leading e-commerce sites, avg. lookup times are between 0.01 and 0.11 sec. Fastest sites achieve 0.001 sec.

Anatomy of a Web transaction: 

Anatomy of a Web transaction Browser Network Server

Anatomy of a Web Transaction: the Browser: 

Anatomy of a Web Transaction: the Browser User clicks on hyperlink; requests document. Client (browser) checks local cache for document; in case of hit: returns document; user response time R’Browser,hit*. In case of miss: Browser asks DNS to map server hostname to IP address. Client opens a TCP connection to the server defined by the URL of the link. Client sends an HTTP request to the server. Browser formats and displays document and renders images. Returned document is stored in browser cache. User response time: R’Browser,miss*

Anatomy of a Web Transaction: the Network: 

Anatomy of a Web Transaction: the Network Imposes delays in delivering info from client to server (R’N1) and from server to client (R’N2). Delays are a function of components on the path between them: modems, routers, comm links, bridges, relays. R’Network = total time HTTP request spends in the network = R’N1 + R’N2

Anatomy of a Web transaction: the Server: 

Anatomy of a Web transaction: the Server Request arrives from client. Server parses the request according to HTTP. Server executes the requested method (GET, HEAD, etc.). If GET, server looks up the file in its document tree by using the file system; file may be in cache or on disk. Server reads contents of file from disk or cache and writes it to the network port. When the file send is complete, server closes the connection (if non-persistent HTTP). R’server = time spent in execution of the HTTP request; includes service time and waiting time at the server.

Anatomy of a Web transaction: 

Anatomy of a Web transaction If document not found in client’s cache: response time is the sum of residence times at all resources: Rmiss = R’Browser,miss + R’Network + R’Server. If a hit: Rhit = R’Browser,hit. Typically: Rhit << Rmiss. Average response time, R, over NT requests: R = pc * Rhit + (1 - pc) * Rmiss, where pc is the fraction of requests served from the local cache.

Example: 

Example User wants to analyze impact of local cache size of browser on Web response time perceived by user 20% of requests serviced by local cache with R=400 msec R for remotely serviced requests = 3 sec Previous expts. indicate that 3x cache size results in hit rate of 45% R_orig=0.20 * 0.4 + 0.80 * 3.0 = 2.48 sec R_new = 0.45 * 0.4 + 0.55 * 3.0 = 1.83 sec
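The weighted average used in this example is easy to check in code. The sketch below uses assumed names; the numbers (400 msec for hits, 3 sec for misses, hit rates of 20% and 45%) come from the example.

```python
def avg_response_time(hit_ratio, r_hit_sec, r_miss_sec):
    """Average response time as a hit/miss weighted mean."""
    return hit_ratio * r_hit_sec + (1.0 - hit_ratio) * r_miss_sec


r_orig = avg_response_time(0.20, 0.4, 3.0)
r_new = avg_response_time(0.45, 0.4, 3.0)
print(f"original cache: {r_orig:.2f} sec, 3x cache: {r_new:.2f} sec")
```

Tripling the cache cuts the average perceived response time by roughly a quarter, entirely through the higher hit rate.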

Bottlenecks: 

Bottlenecks bottleneck = the component that limits system performance Need to identify the bottleneck to improve performance

Example: 

Example home user takes too long to download medium-size page (avg. size 20KB) considering upgrading to processor w/2X faster CPU How will this affect response time?

Example, continued: 

Example, continued Assume: R’network = 7.5 sec R’server = 3.6 sec R’Browser, miss = 0.3 sec R = R’network + R’server +R’Browser, miss R = 7.5 sec + 3.6 sec + 0.3 sec = 11.4 sec Rnew = 7.5 + 3.6 + 0.15 = 11.25 sec not much difference … CPU not the bottleneck
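A what-if helper makes the bottleneck argument concrete: apply a speed-up to one component and see how much the total moves. The dictionary keys and function name are assumptions for illustration; the residence times are the ones assumed in the example.

```python
# Residence times in seconds, taken from the example above.
components = {"network": 7.5, "server": 3.6, "browser": 0.3}


def response_time(times, speedups=None):
    """Sum residence times after dividing each by its speed-up factor (default 1)."""
    speedups = speedups or {}
    return sum(t / speedups.get(name, 1.0) for name, t in times.items())


r_base = response_time(components)
r_fast_cpu = response_time(components, {"browser": 2.0})
r_fast_net = response_time(components, {"network": 2.0})
bottleneck = max(components, key=components.get)

print(f"base {r_base:.2f} s, 2x CPU {r_fast_cpu:.2f} s, 2x network {r_fast_net:.2f} s")
print("largest component:", bottleneck)
```

Doubling the network speed saves 3.75 sec, while doubling the client CPU saves only 0.15 sec.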

Example: 

Example Pharma co. plans intranet for training and display of images of molecules training sessions have 100 people assume 80% active at any one time Each user performs avg. of 100 ops/hour Each op requests avg. of 5 images Avg. size of requested image is 25600 bytes What is minimum bandwidth of network connection to image server?

Example, continued: 

Example, continued 100 users * 0.80 active * 100 ops/hour * 5 images/op * 25600 bytes/image * 8 bits/byte * 1 hr/3600 sec = (100 * 0.80 * 100 * 5 * 25600 * 8)/3600 = 2.28 Mbps
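The same product can be written as a small function, which helps keep every factor (including the 100 ops/hour term) visible. Names are illustrative assumptions; 1 Mbps is taken as 10^6 bits/sec.

```python
def required_bandwidth_bps(users, active_fraction, ops_per_hour,
                           objects_per_op, bytes_per_object):
    """Average bandwidth in bits/sec needed to serve the described workload."""
    active_users = users * active_fraction
    bytes_per_hour = active_users * ops_per_hour * objects_per_op * bytes_per_object
    return bytes_per_hour * 8 / 3600


image_server_bps = required_bandwidth_bps(100, 0.80, 100, 5, 25_600)
print(f"{image_server_bps / 1e6:.2f} Mbps")
```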

Web infrastructure: 

Web infrastructure Three major delay sources: “last mile” Link between end user and phone company switch, or DSL or cable connection to service provider ISPs Recently, more bandwidth added Improvements via caching, load balancing, more servers ‘backbone’ of network Collection of interconnected network providers Connect to each other to exchange traffic (peering) Public peering: at major interconnection points (NAPs, Network Access Points; MAEs, Metropolitan Area Exchanges) Delays may occur at peering points

Basic Components: 

Basic Components Servers Browsers Firewalls protect data, programs, and computers on private network from the uncontrolled activities of untrusted users and software on other computers Screens network traffic going through it, using Software, network hardware, computers Potential performance bottleneck

Proxy, Cache, Mirror: 

Proxy, Cache, Mirror Techniques for improving web performance and security Try to reduce access time to web documents Network bandwidth required for doc xfers Demand on servers w/ very popular docs

Proxy server: 

Proxy server Special type of web server that acts as an agent: server to the client, client to the server Accepts requests from clients, forwards them to web servers Receives responses from remote servers, forwards them back to the client Originally designed to provide web access for users on private networks who had to go through a firewall

Proxy server: 

Proxy server Can be configured to cache relayed responses Benefits: Improves access speed by bringing data closer to consumer Cuts down on network traffic Reduces server load Increases availability in the web Problems: Ensuring that cached docs are up-to-date What’s worth caching? For how long?

Caching: 

Caching Used in the Web: Client-side, at the browser In the network, a caching proxy Evaluating caching effectiveness: Hit ratio = requests_satisfied/total_requests Byte hit ratio = hit ratio weighted by doc size Data transferred = bytes xferred/time
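The difference between hit ratio and byte hit ratio shows up clearly on a small request log. The log below is invented for illustration (it is not data from the slides), and the variable names are assumptions.

```python
# Hypothetical proxy log: (document size in bytes, served from cache?)
request_log = [
    (4_800, True),
    (32_500, False),
    (4_800, True),
    (120_000, False),
    (4_800, False),
]

total_requests = len(request_log)
total_bytes = sum(size for size, _ in request_log)
hit_requests = sum(1 for _, hit in request_log if hit)
hit_bytes = sum(size for size, hit in request_log if hit)

hit_ratio = hit_requests / total_requests   # fraction of requests satisfied
byte_hit_ratio = hit_bytes / total_bytes    # same ratio, weighted by size

print(f"hit ratio {hit_ratio:.2f}, byte hit ratio {byte_hit_ratio:.3f}")
```

Here 40% of the requests are hits, but because the hits are all small documents they cover under 6% of the bytes.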

Example: 

Example Manager wants to install caching proxy server on corporate intranet w/ > 2000 users Use for 6 months -> then evaluate Consider two cases: Cache holds small documents, avg. size 4800 bytes, hit ratio 60% Cache holds medium documents, avg. size 32500 bytes, hit ratio 20% Monitor for one hour, observe 28800 requests

Cache efficiency: 

Cache efficiency Saved_BW = (num_req * hit_ratio * avg_size)/time Saved_BW_small = (28800 * 0.60 * 4800 * 8)/3600 sec = 184Kbps Saved_BW_med = (28800 * 0.20 * 32500*8)/3600 = 416 Kbps Holding larger documents can save more BW
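A minimal sketch of the Saved_BW formula, using the two cases from the example. Function and variable names are assumptions; here 1 Kbps is taken as 1000 bits/sec, which matches the 184 and 416 Kbps results.

```python
def saved_bandwidth_kbps(num_requests, hit_ratio, avg_size_bytes, interval_sec):
    """Bandwidth no longer fetched from origin servers, in Kbps (1000 bits/sec)."""
    return num_requests * hit_ratio * avg_size_bytes * 8 / interval_sec / 1000


saved_small = saved_bandwidth_kbps(28_800, 0.60, 4_800, 3600)
saved_medium = saved_bandwidth_kbps(28_800, 0.20, 32_500, 3600)
print(f"small docs: {saved_small:.0f} Kbps, medium docs: {saved_medium:.0f} Kbps")
```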

Mirroring: 

Mirroring Replicating site content at other servers Requires: Regular updates DNS to direct browsers to secondary sites when primary is busy Goals: Increase availability Balance server load Thus increasing quality of service

Example: 

Example Manufacturing co., employee portal, too slow for European users Idea: install mirror site in Paris What are the bandwidth savings ?

Example: Mirror site in Paris: 

Example: Mirror site in Paris Current avg. BW is 35 Mbps 40% of load from Europe 42% of traffic could be served from caching Cacheable amount: 35 * 0.42 = 14.7Mbps Estimate cache hit ratio at 38% Saved_BW = 14.7 Mbps * 0.38 = 5.6 Mbps 40% of traffic from Europe, so: 5.6 * 0.40 = 2.24 Mbps could be served from cache in Paris 6.4% savings on current BW usage at server improvement in perceived response time for European users
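The chain of multiplications in the mirror example can be followed step by step in code. Variable names are assumptions. The script carries full precision, so the intermediate values differ slightly from the slide, which rounds 5.586 to 5.6 before the last step.

```python
current_bw_mbps = 35.0
cacheable_share = 0.42      # share of traffic that could be served from a cache
cache_hit_ratio = 0.38      # estimated hit ratio
europe_share = 0.40         # share of load coming from Europe

cacheable_mbps = current_bw_mbps * cacheable_share
saved_mbps = cacheable_mbps * cache_hit_ratio
paris_mbps = saved_mbps * europe_share
savings_pct = 100 * paris_mbps / current_bw_mbps

print(f"cacheable {cacheable_mbps:.1f} Mbps, saved {saved_mbps:.2f} Mbps")
print(f"served from Paris {paris_mbps:.2f} Mbps ({savings_pct:.1f}% of current BW)")
```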

Content Delivery Networks(CDN): 

Content Delivery Networks(CDN) cache or replicate content as needed to meet demands from clients over the Web coordinated caching systems implemented through proprietary networks and data centers employ a DNS-redirecting mechanism tries to assign best location from which to serve the requested content

Content Delivery Networks(CDN): 

Content Delivery Networks(CDN) DNS-redirecting mechanism: client requests URL; browser generates a DNS request for the IP address corresponding to the domain name in the URL. CDN controls the DNS service for this domain name. CDN answers DNS requests with the IP address of a selected server rather than the IP address of the original server. Uses a routing function to select “best” server: client location, id of requested content, load of CDN network and servers, proximity of CDN servers to client are all considered. CDN should provide: scalability, high availability, manageability, performance.
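A routing function of the kind described above could be sketched as a cost minimization over candidate servers. Everything here is hypothetical: the `EdgeServer` fields, the cost weights, the miss penalty, and the server data are made up for illustration and do not describe any real CDN's algorithm.

```python
from dataclasses import dataclass


@dataclass
class EdgeServer:
    name: str
    rtt_ms: float          # proximity to the client (hypothetical measurement)
    load: float            # utilization between 0.0 and 1.0
    cached_ids: frozenset  # ids of content held at this server


def select_server(servers, content_id, load_weight_ms=100.0, miss_penalty_ms=50.0):
    """Pick the server with the lowest combined proximity/load/content cost."""
    def cost(server):
        penalty = 0.0 if content_id in server.cached_ids else miss_penalty_ms
        return server.rtt_ms + load_weight_ms * server.load + penalty
    return min(servers, key=cost)


servers = [
    EdgeServer("paris", 12.0, 0.90, frozenset({"logo.png"})),
    EdgeServer("frankfurt", 20.0, 0.30, frozenset({"logo.png"})),
    EdgeServer("origin", 95.0, 0.50, frozenset({"logo.png", "video.mp4"})),
]
best = select_server(servers, "logo.png")
print(best.name)
```

In this toy data the nearest server is not chosen because it is heavily loaded, which is the trade-off the routing function exists to make.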

The WAP Infrastructure: 

The WAP Infrastructure WAP = Wireless Application Protocol architecture + set of protocols for wireless devices to access Web services at regular Web sites wireless device communicates with WAP gateway, over wireless network WAP gateway communicates with servers

The WAP Infrastructure: 

The WAP Infrastructure Docs for wireless devices written in form of XML known as WML (wireless markup language) can also use WMLscript WML docs structured as set of “cards”, units of user interaction deck = set of cards users navigate between cards

The WAP Infrastructure: 

The WAP Infrastructure WML decks + WMLScripts stored in regular web servers on internet retrieved by WAP gateway via HTTP Web server response is binary encoded by WAP gateway and sent to wireless device via lightweight protocols designed to minimize BW requirements

WAP protocol stack: 

WAP protocol stack

Server Architectures: 

Server Architectures Web Server Application Server Transaction and Database Server Streaming Server Multi-tier Architecture

Web Server: 

Web Server listens for HTTP requests establishes requested connection sends requested file returns to listening mode can handle more than one request at a time fork a copy of the HTTP process for each request multi-threaded HTTP program pool of running processes

Dynamic content: 

Dynamic content can use client-side or server-side programs can improve performance by pushing to client-side

Application Server : 

Application Server software that handles all application operations between browser-based customers and back-end databases receive client request execute business logic, interacting with transaction and/or DB servers can be implemented in many ways: CGI scripts, FastCGIs, server-applications, server-side scripts

Transaction and Database Server: 

Transaction and Database Server Transaction Processing (TP) monitor provides: an application programming interface a set of program development tools a system to monitor and control execution of transaction programs DB server: executes and monitors transaction processing applications

Streaming Server: 

Streaming Server Initially, audio and video were “download and play” technologies Streaming media begins to play “almost” immediately client request arrives server retrieves video and audio data and begins to deliver them over the network video and audio are compressed (MPEG, MP3) typically have control part and data part

Example: 

Example Company plans to offer MM online training Employee retrieves lecture of video, audio, slides; 30 minute duration What is the number of streaming servers needed to serve the lecture presentation during busiest period of the day: 4-5 pm

Example: 

Example 400 employees at peak. One MM server can stream presentations to 150 viewers simultaneously. What is the average number of simultaneous viewers during peak period? Use Little’s Law: N = λ * R. λ = req/time = 400 viewers/60 min. R = 30 min. N = 30 * 400/60 = 200. Need two MM servers (200/150 = 1.33, rounded up).
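Little's Law and the rounding step can be written out as follows. The function names are illustrative assumptions; the inputs are the figures from the example.

```python
import math


def avg_concurrent(arrival_rate_per_min, residence_min):
    """Little's Law: average number in the system = arrival rate * residence time."""
    return arrival_rate_per_min * residence_min


def servers_needed(concurrent_viewers, viewers_per_server):
    """Round up, since a fraction of a server cannot be deployed."""
    return math.ceil(concurrent_viewers / viewers_per_server)


arrival_rate = 400 / 60                    # viewers per minute during the peak hour
viewers = avg_concurrent(arrival_rate, 30)  # 30-minute lecture
mm_servers = servers_needed(viewers, 150)
print(f"{viewers:.0f} simultaneous viewers -> {mm_servers} servers")
```

This sizes for the average number of simultaneous viewers; peaks above the average within the hour would need extra headroom.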

Multi-tier Architecture: 

Multi-tier Architecture web-based apps usually in 3-tier architecture: presentation layer user interface (browser & HTML, XML, etc.) application layer business logic collection of rules to implement application logic may also contain Java applets, ActiveX controls, etc. data service layer persistent data

Example: 

Example application layer designed to support 400 simultaneous processes app process: receives client request executes app logic, interacting with DB server Monitoring shows: app process executes for 150 msec between DB requests DB server handles 550 req/sec 400 app processes running during peak period

What if??: 

What if?? the application servers are replaced by new servers with 2X speed Each application server characterized by Z, “think time” – time between receiving a reply from the DB server and submitting a new DB request DB layer, characterized by throughput, X, in req/sec R = N/X - Z

What if ...?: 

What if ...? DB response time: R = 400/550 – 0.15 = 577 msec = 0.577 sec after cpu upgrade, app processing time should be 75 msec DB response time now: Rnew = 400/550 – 0.075 = 652 msec = 0.652 sec Improvement in app layer may not lead to improvement overall
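The two response times follow directly from R = N/X - Z. The sketch below uses assumed names and the figures from the calculation above (400 processes, 550 req/sec, think time 150 msec before and 75 msec after the upgrade).

```python
def db_response_time(n_processes, db_throughput_per_sec, think_time_sec):
    """Interactive response time law: R = N / X - Z."""
    return n_processes / db_throughput_per_sec - think_time_sec


r_before = db_response_time(400, 550.0, 0.150)
r_after = db_response_time(400, 550.0, 0.075)
print(f"before: {r_before * 1000:.0f} msec, after: {r_after * 1000:.0f} msec")
```

With the DB throughput fixed at 550 req/sec, faster application servers only submit requests sooner, so the time each request spends at the DB layer grows by the 75 msec saved in the application layer.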

Dynamic Load Balancing: 

Dynamic Load Balancing heavy traffic load adversely impacting performance add more servers buy bigger (faster) servers need to do cost-performance analysis

Dynamic Load Balancing: 

Dynamic Load Balancing web cluster: multiple web servers single location addressed by one URL and a single virtual IP address incoming requests routed among servers in user-transparent way switch acts as dispatcher, mapping virtual IP address to actual address
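The dispatcher role can be sketched as a mapping from the virtual IP address to one of the real servers. This example uses round-robin, the simplest policy; a dynamic balancer would instead pick a server from current load measurements. The class name and all addresses are hypothetical.

```python
from itertools import cycle


class Dispatcher:
    """Maps requests for one virtual IP onto a pool of real server addresses."""

    def __init__(self, virtual_ip, real_servers):
        self.virtual_ip = virtual_ip
        self._next_server = cycle(real_servers)  # round-robin over the pool

    def route(self, destination_ip):
        if destination_ip != self.virtual_ip:
            raise ValueError("request is not addressed to the cluster")
        return next(self._next_server)


# Hypothetical addresses, chosen only for the example.
dispatcher = Dispatcher("192.0.2.10", ["10.0.0.1", "10.0.0.2", "10.0.0.3"])
assigned = [dispatcher.route("192.0.2.10") for _ in range(4)]
print(assigned)
```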

Web cluster: 

Web cluster

Networks: 

Networks Bandwidth measures the rate at which data can be sent through the network usually expressed in bps Latency time needed for a bit (or small packet) to travel across the network

Bandwidth for different types of networks: 

Bandwidth for different types of networks

Planning: 

Planning Streaming service offers training videos. Training session -> 15 min video at 300 Kbps. What impact if videos go to 25 min? Service supports 35 simultaneous sessions. Average BW needed (now): 35 * 300 Kbps = 10.5 Mbps. Average number of simultaneous sessions (now): N = 35. Little’s Law: N = λ * R, so 35 = λ * 15 and λ = 35/15 sessions/min; assume this remains the same. Nnew = λ * 25 = 35/15 * 25 = 58.33. Average BW needed (new): 58.33 * 300 Kbps = 17.5 Mbps.
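The planning question reduces to holding the arrival rate constant while the session length changes. A minimal sketch, with assumed names and 1 Mbps taken as 1000 Kbps:

```python
def sessions_after_change(current_sessions, current_minutes, new_minutes):
    """Keep the arrival rate N/R fixed and apply Little's Law to the new duration."""
    arrival_rate = current_sessions / current_minutes  # sessions per minute
    return arrival_rate * new_minutes


STREAM_KBPS = 300

n_now = 35
n_new = sessions_after_change(n_now, 15, 25)
bw_now_mbps = n_now * STREAM_KBPS / 1000
bw_new_mbps = n_new * STREAM_KBPS / 1000

print(f"sessions: {n_now} -> {n_new:.2f}")
print(f"bandwidth: {bw_now_mbps:.1f} Mbps -> {bw_new_mbps:.1f} Mbps")
```

Both the session count and the bandwidth scale by 25/15, so longer videos raise the requirement by about two thirds even though no new users arrive.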

Example: 

Example Training videos, avg. size 950 MB. 100 students, 80% active at one time. Each user requests 2 clips/hour. BW needed to support: (0.80 * 100) * 2 * (8 * 950 Mbit)/3600 sec = 337.8 Mbps. Need a 622 Mbps ATM network to support.