Presentation Transcript
Enterprise QoS: Be Careful What You Ask For!
Terry Gray
University of Washington
UW Network Overview
70,000 accounts
30,000 end systems
2,000 modems
50 remote sites
IP-only backbone
350 Gigabytes/day across backbone
NWNet founder, NOC
Center of statewide K20 net
Home of P/NW Gigapop, SNNAP
Executive Summary
Concern: we have to live with whatever’s invented here!
Focus: recurring costs and network reliability.
Hypothesis: dynamic authentication/authorization/accounting/reservation is a bad idea within an enterprise.
Good news: your most strategic asset is now dependent on those two NT boxes… the KDC and “DEN” DBMS.
Goal: do diff-serv (including premium svc) without per-user auth and without per-pkt or per-flow lookups.
Strategy: Per port subscription for premium svc, plus diff-serv based on application delay-sensitivity
Bad Drivers
Things that drive costs up
Accounting mechanisms, in general
Per-user "anything" (authentication, authorization, ...)
Keeping track of advance reservations
Reservation preemption policies
User-perceived price/performance mismatches
Free-good syndrome
Numeric multipliers (e.g. edge devices vs. core)
Things that drive reliability down
Additional (multiple/distributed) dependencies
State changes (dynamic vs. static decisions)
Complexity in general (modulo redundancy needs)
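The point about additional dependencies can be made concrete with a little arithmetic. The sketch below assumes that failures are independent and that the service is up only when every component it depends on is up; the availability figures are hypothetical, chosen only for illustration.

```python
from math import prod


def serial_availability(availabilities):
    """Availability of a service that needs every listed component to be up.

    Assumes independent failures, so the individual availabilities multiply.
    """
    return prod(availabilities)


# Hypothetical figures: the network alone, then the same network when every
# request also depends on an authentication server and a policy database.
network_only = serial_availability([0.999])
with_two_more_dependencies = serial_availability([0.999, 0.995, 0.995])

print(f"network only:          {network_only:.4f}")
print(f"plus two dependencies: {with_two_more_dependencies:.4f}")
```

Under these assumed numbers, each extra dependency in the request path can only lower the end-to-end figure, never raise it.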
QoS Axioms
QoS doesn't create bandwidth; it just determines who will get poor service at congestion points.
The most important QoS question is: how many "busy" signals constitute success for your network?
Given a busy signal, users will want to proceed anyway.
Network Managers will not trust end systems.
Biggest need is on WAN links, where it’s hardest to do! (scaling, settlements, signalling interoperability).
Multiplexing priorities on a channel improves efficiency at the cost of certainty.
Multiplexing 2 Classes of Traffic
[Figure: delay versus load for low-priority and high-priority traffic, with load marked at 80% and 100%]
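The delay-versus-load behaviour of two priority classes sharing one channel can be approximated with a textbook queueing model. The sketch below is not taken from the slides: it assumes Poisson arrivals, exponentially distributed service times with the same mean for both classes, and non-preemptive priority on a single link. The function name and the `hi_share` parameter are illustrative.

```python
def priority_waits(load, hi_share, service_time=1.0):
    """Mean queueing delay (excluding service) for two priority classes.

    Model: single link, non-preemptive priority, Poisson arrivals, exponential
    service with the same mean for both classes.
    load     -- total offered load as a fraction of link capacity, 0 <= load < 1
    hi_share -- fraction of that load carried in the high-priority class
    Returns (wait_hi, wait_lo) in the same units as service_time.
    """
    if not 0 <= load < 1:
        raise ValueError("load must be in [0, 1) for a stable queue")
    if not 0 <= hi_share <= 1:
        raise ValueError("hi_share must be in [0, 1]")
    rho_hi = load * hi_share
    residual = load * service_time  # mean residual work found by an arrival
    wait_hi = residual / (1 - rho_hi)
    wait_lo = residual / ((1 - rho_hi) * (1 - load))
    return wait_hi, wait_lo


for load in (0.5, 0.8, 0.95):
    hi, lo = priority_waits(load, hi_share=0.25)
    print(f"load={load:.2f}  hi-priority wait={hi:.2f}  low-priority wait={lo:.2f}")
```

With a quarter of the traffic marked high priority, the high-priority wait stays near one service time at 80% load while the low-priority wait is about five, and the gap widens quickly as load approaches 100%.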
QoS Conundrums
Guaranteed reservation vs. Preemption
Sender vs. Receiver control
Simplex vs. Duplex channels
Busy signals vs. getting through
Per-flow state vs. scalability
Differentiated-svcs vs. differentiated-pricing
Multicast!
QoS Policy Space
Who can do what to whom at what times…
Neither phones nor airlines provide right metaphor.
Admission Control at congestion points
Via privilege/price or via application’s need?
Via trust or policing?
Privilege/price maps to source/destination parameters:
Physical port
Node/device ID (MAC or IP address)
User/Group ID
Many delay-sensitive apps are full duplex, implying need for bidirectional reservations, thus complicating the privilege-based policy/security problem
Ours is not a perfect world...
Having end-systems negotiate with net for reserved bandwidth is conceptually a Good Thing...
BUT: end-to-end per-user QoS implies 4 bad things:
Authenticated requests, and therefore dependency on KDC.
Authorization database, and therefore dependency on DBMS.
Lots of internal router state
Staff to run the authorization DB and write/administer policies.
So...
Is it necessary to invent phone-system-like accounting machinery in order to have phone-system-like QoS?
Would a site be better off investing in a reservation/preemption mechanism or in additional capacity?
More Questions...
What do users & app developers want? Need?
Pricing objectives? (capital, use moderation)
Who/what do you police (or invoice)?
User satisfaction criteria? (delay, throughput, busy signals)
Should the CEO’s email beat your DVC pkts?
Does traffic shaping require user auth?
Is premium svc possible without user auth?
Does the 80/20 locality rule still hold?
How smart must the edge switch be?
Do we need diff serv on upstream traffic?
Where does/will congestion occur?
Is desktop traffic symmetric?
Is Subnet Traffic Symmetric?
Another view of subnet asymmetry
QoS Support Costs
Claim:
Least cost: diff serv based only on app’s need
Medium cost: add static per-port privilege
Med-High cost: per-user per-session privilege
Really high cost: per call/flow/whatever setup and accounting with advance reservations and preemption.
Per-User Dynamic QoS Cost Elements
Authentication infrastructure (Sunk cost)
Policy: Create, document, deploy, explain, verify
Handle complaints about policy
Fraud (a consequence of managing scarcity)
Hogging syndrome and/or cost avoidance
Auth may not help (e.g. spoofing app need)
Prior prevention or post audit?
Intra-enterprise sol'n may differ from inter-enterprise
QoS using both Need and Privilege
Privilege: edge switch tags pkts based on port config
Need: application/host sets TOS based on need:
Need = mostly delay sensitivity?
"L4" policy = port numbers?
Could we globally associate a Tspec with port #?
Maybe, but what about port-agile apps (e.g. FTP, H.323)?
Hosts/apps can't be trusted, but they can ask...
Policing provides carrot/stick for traffic shaping by ES
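The "L4 policy = port numbers?" idea and its weakness with port-agile applications can be sketched as a static lookup. Everything here is an assumption for illustration: the class names, the table contents, and the choice to let the static table win over the host's own request.

```python
# Hypothetical static "L4 policy": well-known destination port -> need class.
PORT_NEED = {
    23: "interactive",  # telnet
    53: "interactive",  # DNS
    80: "normal",       # HTTP
    25: "bulk",         # SMTP
    20: "bulk",         # FTP data (active mode only)
}


def classify_need(dst_port, host_request=None):
    """Return a delay-sensitivity class for a packet.

    The static port table is consulted first. If the port is unknown, the
    class the host asked for (if any) is used, on the understanding that the
    request is a hint and is still subject to policing. Otherwise "normal".
    """
    if dst_port in PORT_NEED:
        return PORT_NEED[dst_port]
    if host_request is not None:
        return host_request
    return "normal"


print(classify_need(23))                    # known port
print(classify_need(49152))                 # port-agile app, no hint
print(classify_need(49152, "interactive"))  # port-agile app that asks
```

A port-agile application (for example a data or media channel on a negotiated port) misses the table entirely, so the only remaining signals are the host's own request and whatever policing is applied to it.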
Topological Model for Strawman
[Diagram labels: Desktop (x2), Closet Switch (x2), Building Switch (x2), Core Switch, Router (x2), Border Router, Gigapop, Internet2]
Strawman: Assumptions
Internet connectivity for site:
x Mbps "best effort"
y % "better than best effort" or "premium"
Users connected at
Bronze: shared 10 HD (legacy)
Silver: switched 10 HD
Gold: switched 100 FD
Platinum: switched 100 FD "Premium"
Diff-serv only applies to gold & platinum
Encourage use of rate-adaptive applications
Do some ex-post-facto usage monitoring
Strawman: General Approach
Diff-serv using both privilege and need
No per-pkt or flow auth lookups, no reservations
Privilege based on port, not user… but
Could do per-user in context of “L2 session auth”
802.1p denotes Privilege/Price on ingress
TOS denotes Application need
Classification/policing in routers
SBM = router/switch config
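One way to read this approach is that a router can pick a queue from two header fields alone, with no per-flow state and no authentication lookup. The sketch below assumes a hypothetical 802.1p value for premium ports and uses the "minimize delay" bit of the IPv4 TOS byte as the need signal; the queue names are illustrative and not taken from the slides.

```python
IPTOS_LOWDELAY = 0x10  # "minimize delay" bit of the IPv4 TOS byte (RFC 1349)
PREMIUM_PCP = 5        # hypothetical 802.1p priority tagged on premium ports

# (privileged, delay_sensitive) -> queue name; names are illustrative only.
QUEUES = {
    (True, True): "premium-low-delay",
    (True, False): "premium",
    (False, True): "low-delay",
    (False, False): "best-effort",
}


def select_queue(pcp, tos):
    """Choose an output queue from the frame's 802.1p priority and IP TOS byte.

    pcp -- 802.1p priority (0-7), set on ingress from static port config
    tos -- IPv4 TOS byte as set by the host or application
    The decision is a pure function of the two fields: no flow table, no
    user database, no reservation state.
    """
    if not 0 <= pcp <= 7:
        raise ValueError("802.1p priority must be in 0..7")
    privileged = pcp >= PREMIUM_PCP
    delay_sensitive = bool(tos & IPTOS_LOWDELAY)
    return QUEUES[(privileged, delay_sensitive)]


print(select_queue(5, IPTOS_LOWDELAY))
print(select_queue(0, IPTOS_LOWDELAY))
print(select_queue(0, 0))
```

Because the mapping is static configuration, changing policy means editing a small table on the router rather than operating an online authorization service.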
Strawman: Responsibilities
Host/application:
Traffic shaping enforced by policing in router
Optionally setting TOS bits
Can query router for likelihood of success (RSVP?)
Edge switch, upstream
If premium port, must restrict to one MAC address
Tag all frames from premium ports
Respect TOS bits for queuing?
Interior switches, downstream
Respect 802.1p priority bits set by router
First interior router
Inbound: queue using 802.1p and/or TOS bits; police Tspec
Outbound: set 802.1p frame priority based on above
Outbound border router
Set TOS based on incoming 802.1p and TOS bits
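Policing a Tspec at the first interior router is commonly done with a token bucket. The sketch below is a minimal version under stated assumptions: the Tspec is reduced to a sustained rate and a burst size, time is passed in explicitly so the example is deterministic, and the class and method names are illustrative. What happens to a non-conforming packet (drop, or demote to best effort) is left to the caller, since the slides do not specify it.

```python
class TokenBucketPolicer:
    """Minimal token-bucket policer: sustained rate plus burst allowance."""

    def __init__(self, rate_bytes_per_s, burst_bytes):
        self.rate = rate_bytes_per_s
        self.burst = burst_bytes
        self.tokens = burst_bytes  # bucket starts full
        self.last = 0.0            # time of the previous packet, in seconds

    def conforms(self, now, packet_bytes):
        """Return True if the packet is within profile at time `now`."""
        elapsed = now - self.last
        self.last = now
        # Refill for the elapsed time, capped at the burst size.
        self.tokens = min(self.burst, self.tokens + elapsed * self.rate)
        if packet_bytes <= self.tokens:
            self.tokens -= packet_bytes
            return True
        return False


# Hypothetical profile: 1000 bytes/s sustained, 1500-byte burst.
policer = TokenBucketPolicer(rate_bytes_per_s=1000, burst_bytes=1500)
for t, size in [(0.0, 1500), (0.5, 1500), (2.0, 1500)]:
    verdict = "in profile" if policer.conforms(t, size) else "out of profile"
    print(f"t={t:.1f}s size={size}: {verdict}")
```

A host that shapes its own traffic to the same rate and burst never sees an out-of-profile verdict, which is the carrot/stick relationship between policing and end-system shaping.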
Today’s nets: Not ready for QoS
CSMA/CD = Half-Duplex = Jitter
Not to mention: desktop OS jitter
But everybody's switching to switches
Microsegmentation for performance
and Security (anti-sniffing)
Need full duplex; problematic for switched-10
Cat 3 wiring issue... 100T4 where were you?
Vendors ready with WFQ, Priority, FIFO
but not much evidence of RED or RIO
LANs vs. MANs vs. WANs
LANs and WANs have some opposite characteristics:
In LANs, cost argues for simpler edge devices, e.g. UW has 30+ routers, 1000+ hubs/switches
In LANs, edge connect speed may be > core links, e.g. labs with Gigabit Enet connections
MANs and community access links are changing:
Speeds of ADSL and cable modems will put pressure on ISP access port speeds.