Network Monitoring: The GGF Perspective
Transatlantic Performance Monitoring Workshop, CERN, March 2004
Mark Leese, Paul Mealor

Contents: The Grid? Why is Grid net monitoring essential? How are GGF addressing problem? Standard measurements? So what can you ask for? Is that all GGF is doing? So this is useful because? Yeah, yeah, but what does it actually do? This isn’t going to be easy, is it? Conclusion

The Grid?: Basic Grid principle: user applications (Grid apps) submit their work to the middleware, which selects the 'best' resources available to run the job. Network performance information is essential!

Why is Grid net monitoring essential?: Simply consider the following use case. Resource Brokers (RBs) are responsible for finding the best resource (Computing Element, CE) to be used for a job, e.g. run the job at B, using a copy of the data from A, then store the results at C. All other things being equal, the RB takes into account the data access requirements of the job. Out of the list of CEs capable of running the job, it uses a network cost function to identify the CE with the 'best' data access. Consider the 'best' combination of data sources and sinks, e.g.
IF source data = 10 GB AND resulting data will = 100 GB THEN pick the CE based on performance to the result-storing SE (Storage Element). European Data Grid does something along these lines.

How are GGF addressing problem?: By producing standards relating to network monitoring services. First with the Network Measurements Working Group (NM-WG): defining XML schemas for requesting tests and historic data, and for publishing network measurements. Aims: to standardise communication, and to use XML, for web services and the OGSI model. Simple use case… [Diagram label: Network Monitoring Service] All request & result messages can be formatted using standardised schemas = truly powerful combination.

Standard measurements?: Schemas are based on the NM-WG proposed measurement classification system, which describes a set of network characteristics and their classification hierarchy, and is used for creating common schemata for describing network monitoring data. Using a standard classification maximises data portability. [Diagram label: description + hierarchy]

So what can you ask for 1?: Initial schema requirements set. Four sections: what, where, when, how.
What: Use DAMED style names, e.g. path.delay.oneWay. Can request statistical data, with a specified sample interval, e.g. daily averages for one-way delay over the last month. Single characteristics (for now), but multiple statistics. Can limit the number of returned results to avoid overload.
Where: Source and destination. Flexible: IPv4|6, hostnames, or textual names such as 'core router' and 'edge router' (e.g. for security).

So what can you ask for 2?: When: The primary means of specifying the time period we are interested in (for tests or data retrieval) is a target time (an absolute time or 'now') plus relative +ve and -ve time tolerances, e.g. target_time = 14:00, -ve time tolerance = 600 secs, +ve time tolerance = 600 secs.
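As a rough illustration of the target-time-plus-tolerances idea, the Python sketch below turns a target time and two tolerances into a time window and checks whether a measurement timestamp falls inside it. The function names (`acceptance_window`, `in_window`) and the example date are hypothetical and are not part of the NM-WG schemas; this only shows the arithmetic, not how any real monitoring service implements it.

```python
from datetime import datetime, timedelta, timezone


def acceptance_window(target_time, neg_tolerance_s, pos_tolerance_s):
    """Return (start, end) of the period around target_time.

    target_time is a timezone-aware datetime or the string 'now'.
    Tolerances are given in seconds, as in the slide example.
    """
    if target_time == "now":
        # Assumption: 'now' is resolved when the window is computed.
        target_time = datetime.now(timezone.utc)
    start = target_time - timedelta(seconds=neg_tolerance_s)
    end = target_time + timedelta(seconds=pos_tolerance_s)
    return start, end


def in_window(timestamp, window):
    """True if a measurement timestamp lies inside the window (inclusive)."""
    start, end = window
    return start <= timestamp <= end


# Hypothetical date; only the 14:00 target and 600 s tolerances come from the slides.
target = datetime(2004, 3, 1, 14, 0, tzinfo=timezone.utc)
window = acceptance_window(target, 600, 600)  # 13:50 to 14:10

early_ok = in_window(datetime(2004, 3, 1, 13, 55, tzinfo=timezone.utc), window)
too_late = in_window(datetime(2004, 3, 1, 14, 15, tzinfo=timezone.utc), window)
```

Here `early_ok` is true (13:55 is within 14:00 +/- 10 minutes) and `too_late` is false (14:15 is outside it).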
So what can you ask for 3?: Setting a limit on the number of results controls the possibilities. When number of results = 'all': supply all matching measurements in the given time period. When number of results = 1: the time data defines the period for which a measurement is considered to be acceptable, e.g. 14:00 +/- 10 minutes. 'now' is evaluated as late as possible, to avoid problems with transmission and processing delays. Absolute time format: secs from 1/1/1970, XML or NTP. Can also give start & end time if you wish, but the values are mapped to target_time & number of results will = all. 'testing interval' controls how often tests are run.

So what can you ask for 4096?: How: Can supply values to act as parameters for tests, or filters for querying past data, including tool name. Uses param specific tags or a list of parameters:
<remoteParamList>-a -b 10 -c</remoteParamList>
Possible to set ranges for parameters…
<tcpBufferSize range='max'>4194304</tcpBufferSize>
<tcpBufferSize range='min'>1048576</tcpBufferSize>
…and orders of preference:
<tcpBufferSize>4194304</tcpBufferSize>
<tcpBufferSize>1048576</tcpBufferSize>
Unspecified params use the receiving system’s defaults. Can request reporting of the actual param values used. Can control whether a test is ever run.

Is that all GGF is doing?: No, the GGF Grid High Performance Networking Research Group (GHPN) is also hard at work. For the last two days you've seen fuzzy diagrams from people talking about how they plan to model the network as a Grid resource so they can perform 'advance reservation' etc. Well, GHPN is doing the same. And it will be better than all the others put together! Errr, maybe! Like the schemas, some overlap (shared effort?) with others: Internet2, DANTE…

So this is useful because?:
Computing, storage and the interconnecting network are all resources: easier to manage; all can be reserved; capability discovery; exploit commonalities; forms an integrated stack. But this is a network monitoring workshop… The diagram shows potential clients: numerous and varied.

Yeah, yeah, but what does it actually do?: Historic measurement data. Predictions. Allow clients to run scheduled tests. On-demand (real-time) tests. Provide less-frequently monitored information (network route, topology…). Event notifications, for all of the above. Across multiple administrative domains, for all of the above. And this is required for: resource selection (the use case highlighted at the start) + replica management and time selection; fault detection and analysis (self-healing Grids); SLA monitoring, which could be crucial for the 'utility computing' model (resources on demand) associated with the Grid.

This isn’t going to be easy, is it?: No, there are several outstanding issues that NM-WG, GHPN (and the others) will have to consider. Most monitoring and management tools, systems and architectures focus on the IP domain. How do we now handle light paths, lambdas, whatever? These will be required! People keep talking of predictions as if network monitoring people frequently win the lottery and make great stock market gains. Service discovery: where are the monitoring services? Network information service vs monitoring service? Can you explain the division of responsibility? Network routing and topology: how is this detected, recorded and disseminated? And the usual suspects, e.g. AAA. But the benefits will be worth the effort!

Conclusion: Grid network monitoring is crucial to the Grid: resource selection, replica management, time selection; fault detection and analysis; SLA monitoring. There’s a long way to go, but… Grid network monitoring is crucial to the Grid …and you knew that already!
The End: Questions? email@example.com firstname.lastname@example.org GET INVOLVED! http://www-didc.lbl.gov/NMWG/ http://forge.gridforum.org/projects/ghpn-rg