Day 1: Intro to Usenet
Josh Gagliardi

Presentation Transcript

Usenet Training: 

Usenet Training 5-7 May 2004 Disney’s Coronado Springs Resort

Introductions: 

Introductions
  Highwinds Software
    Purveyors of Cyclone, Typhoon, Twister, Tornado, and Hurricane
    Long history of high-performance USENET software
  Josh Gagliardi
    Employee #2 at Highwinds Software
    Currently a member of the Highwinds Software Technology Board

Agenda: 

Agenda
  Day One
    USENET/NNTP Introduction
    Servers Introduction
  Day Two
    Building an Enterprise Network
    Highwinds APIs
    Highwinds Terminology and Advanced Usage
  Day Three
    Server Administration
    Disaster Recovery
    Customer-Driven Discussions

Day One: 

Day One Introduction to USENET and NNTP

QUIZ: 

QUIZ
  Rate Yourself from 1 to 6
  Have you ever read/posted to/used USENET?
  Have you ever installed or managed news server software?
  Do you know what the Path header line is for?
  Do you know how moderated groups work?
  What is “The Freenix Top 1000”?
  What are the big six hierarchies?

What are your problems? : 

What are your problems?
  New news administrator
  No machines
  No bandwidth
  No money
  Growing too rapidly, not enough space
  Too much traffic, we need more capacity
  Too much traffic, we want it to stop
  “Well… my wife/husband left me..”

Communications Technology Lifecycle: 

Communications Technology Lifecycle
  Phase I: Nerd Toy
  Phase II: Pornography
  Phase III: Business Use
  Phase IV: Mainstream Use
  Predecessors: Cable & Satellite TV, VCRs, the Web
  Followers: Web Phones, Camera Phones, Video Phones

What does News look like?: 

What does News look like?

How do you interact with News?: 

How do you interact with News? Newsgroups Reading Posting

Usenet saved this dog!: 

Usenet saved this dog!

How much News is there?: 

How much News is there?
  1985: One person can read every message in every group and still get work done.
  1986: RFC 977 approved
  1990: One group, rec.arts.startrek, is hard to keep up with. Following ten groups is easy.
  1997: Cyclone released
  2000: 250K articles/day, 7 GB in traffic
  One Year Ago: 1-2M articles/day, 200 GB

TODAY 2-4M articles/day 1.1 TB daily traffic: 

TODAY
  2-4M articles/day
  1.1 TB daily traffic
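
To put the daily volume in operational terms, here is a small worked calculation. It assumes the daily traffic figure is in terabytes (10^12 bytes) and uses 3 million articles/day as a midpoint of the 2-4M range; both are assumptions for illustration, not figures stated on the slide.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def sustained_mbit_per_s(bytes_per_day: float) -> float:
    """Average line rate needed to carry a full day's traffic, in Mbit/s."""
    return bytes_per_day * 8 / SECONDS_PER_DAY / 1e6


def average_article_bytes(bytes_per_day: float, articles_per_day: float) -> float:
    """Mean article size implied by the daily totals."""
    return bytes_per_day / articles_per_day


if __name__ == "__main__":
    daily_bytes = 1.1e12      # assumption: 1.1 terabytes per day
    daily_articles = 3e6      # assumption: midpoint of 2-4M articles/day
    print(f"{sustained_mbit_per_s(daily_bytes):.1f} Mbit/s sustained")
    print(f"{average_article_bytes(daily_bytes, daily_articles) / 1000:.0f} kB per article on average")
```

Under these assumptions a full feed needs roughly 100 Mbit/s around the clock for a single inbound copy, before counting any outbound feeds.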

Why Is News Important?: 

Why Is News Important?
  Network traffic is huge
  Machines are costly
  Mistakes are costly
  Users are dedicated: news makes the phone ring!

How does News get around?: 

How does News get around? Usenet ?

Inside Usenet “Flood Fill”: 

Inside Usenet “Flood Fill”
  [Diagram labels: Transit Servers, Reader Servers, Readers]

Transit Servers: 

Transit Servers
  Communicate with other servers
  Manage article propagation
  Cyclone is a transit server
  Highwinds Innovation

Reader Servers: 

Reader Servers
  Communicate with newsreaders and with transit servers
  Manage article organization
  Serve as injection point for new news
  Typhoon, Twister, and Tornado are reader servers
  [Diagram labels: Transit Server, Reader Server, Readers]

Newsreaders: 

Newsreaders
  Communicate with reader servers
  Present a coherent view of the news stream to an individual user
  Manage creation of new news articles
  Mozilla, Outlook Express, and Agent are newsreaders
  Special-Purpose Readers

Inside Usenet: 

Inside Usenet
  [Diagram labels: Transit Servers, Reader Servers, Readers]

Is News the same as Mail?: 

Is News the same as Mail?
  No: Mail is delivered point-to-point while news is broadcast.
  No: Each user pays for the storage of his own mail.
  No: If everyone voiced their opinions to 50,000 people with email, the mail server might well fill or fail.

Is News the same as Chat?: 

Is News the same as Chat?
  No: If you go away on vacation, you can catch up on your news.
  No: Your comments are reliably gatewayed to machines around the world. People don’t have to be able to connect to your chat server to talk to you.
  No: You can use attachments in newsgroups where such behavior is welcome.

How quickly does News get around?: 

How quickly does News get around?
  Before Cyclone, it could take hours and sometimes days for articles to make it to every leaf node.
  Now, articles typically arrive at well-connected sites within a few seconds and at less well-connected sites within minutes.

What is Next?: 

What is Next?
  Discussion Message Boards
  Replicated Discussion Message Boards
  Archived Instant Messages
  Archived Mailing Lists
  Hubs for Peer-to-Peer
  Numbering Servers
  Archive Servers
  ?????????

Other Servers: 

Other Servers
  Two other servers, both Highwinds inventions:
    NUMBERING SERVER - Hurricane
    ARCHIVE SERVER - Tornado BE
  [Diagram labels: Tornado BE, Tornado, Hurricane]

Anatomy of an Article: 

Anatomy of an Article
  Governed by RFC 1036, the successor to RFC 850 (the header format follows Internet mail, RFC 822)
  Header + Body
  [Diagram labels: Header, Body]

Article Anatomy - Gory Details: 

Article Anatomy - Gory Details
  Know your separators!
    CR = ^M = \r
    LF = ^J = \n
  Layout on the wire:
    HeaderName: Value \r\n
    HeaderName: Value \r\n
    \r\n
    Body
    \r\n.\r\n
  SEPARATOR (between header and body): \r\n\r\n
  TERMINATOR (end of article): \r\n.\r\n
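
The separator and terminator rules can be sketched as a small parser. This is a minimal illustration, not any server's actual code: the function name `split_article` is hypothetical, folded (continuation) header lines are joined to the previous header, and NNTP dot-stuffing of body lines is deliberately left out.

```python
def split_article(raw: bytes):
    """Split a wire-format article into ([(name, value), ...], body bytes)."""
    # Drop the ".\r\n" of the terminator, keeping the \r\n that ends the last body line.
    if raw.endswith(b"\r\n.\r\n"):
        raw = raw[:-3]

    # The first blank line (\r\n\r\n) separates the header block from the body.
    head, _sep, body = raw.partition(b"\r\n\r\n")

    headers = []
    for line in head.split(b"\r\n"):
        if line[:1] in (b" ", b"\t") and headers:
            # Continuation line: append it to the previous header's value.
            name, value = headers[-1]
            headers[-1] = (name, value + " " + line.strip().decode("utf-8", "replace"))
            continue
        name, _colon, value = line.partition(b":")
        headers.append((name.decode("ascii", "replace"),
                        value.strip().decode("utf-8", "replace")))
    return headers, body


if __name__ == "__main__":
    wire = (b"From: someone@example.com\r\n"
            b"Subject: hello\r\n"
            b"\r\n"
            b"line one\r\nline two\r\n.\r\n")
    hdrs, text = split_article(wire)
    print(hdrs)
    print(text)
```

Working on bytes rather than decoded text keeps the \r\n handling explicit, which is the point the slide is making.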

Article Headers: 

Article Headers QUIZ

Article Headers, Continued...: 

Article Headers, Continued...
  The basics:
    From
    Subject
    Newsgroups
    Path
    Message-ID
    Xref
    Date
    NNTP-Posting-Host
    NNTP-Posting-Date
    X-Trace
    References

An Article: 

An Article
  Path: ndnws01.ne.mediaone.net!chnws05.ne.mediaone.net!24.128.1.91!chnws02.mediaone.net!192.148.253.68!netnews.com!newshub.northeast.verio.net!nuq-peer.news.verio.net!news.verio.net!dfw-artgen.news.verio.net!ord-read.news.verio.net.POSTED!not-for-mail
  From: "Henry C. Barta" <hbarta@miles.wwa.com>
  Subject: Re: old IBM thinkpads and linux?
  Newsgroups: comp.os.linux.portable
  References: <19990728232115.18753.00003295@ng-ba1.aol.com>
  Message-ID: <hStB3.112$DY.3777@ord-read.news.verio.net>
  Date: Wed, 08 Sep 1999 13:52:13 GMT

  I ran Linux on a 750Cs - 33 MHz 486, 12MB Ram and 360 MB hardrive. I was able to install X and run that. It was pretty OK
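
The Path header in the sample article is a "!"-separated list in which each server prepends its own name, so the leftmost entry is the most recent hop and the rightmost is where the article was posted. A minimal sketch of how a server might read it follows; the function names are hypothetical, and the Path value is a shortened copy of the one in the sample.

```python
def path_hosts(path_value: str) -> list[str]:
    """Return the Path entries, most recent hop first."""
    return [entry for entry in path_value.strip().split("!") if entry]


def already_visited(path_value: str, server_name: str) -> bool:
    """True if server_name appears in the Path, so offering it the article is pointless."""
    return server_name in path_hosts(path_value)


if __name__ == "__main__":
    path = ("news.verio.net!dfw-artgen.news.verio.net"
            "!ord-read.news.verio.net.POSTED!not-for-mail")
    print(path_hosts(path))
    print(already_visited(path, "news.verio.net"))
    print(already_visited(path, "netnews.com"))
```

Checking the Path before offering an article is a cheap first filter; the Message-ID history check on the receiving side remains the authoritative one.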

A Day in the Life of an Article: 

A Day in the Life of an Article
  Creation
  Posting
  Feeding / Transit - By Message-ID
  Storage - By Group and Number
  Reading
  Expiration

Article Creation and Posting: 

Article Creation and Posting
  Minimal requirements on the user
  The First Time Is Always Special: POST vs. IHAVE
  Post Filtering
    SPAM prevention
    Accountability
  Moderated Groups
  Upstream

Article Feeding and Transit: 

Article Feeding and Transit
  Each server offers articles to the other servers it knows about, its “peers”
  During feeding, Message-ID matters more than Group/Article Number
  Servers offer articles with IHAVE
  Servers refuse already-seen articles
    The “history”
  IHAVE is chatty...
    CHECK / TAKETHIS
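
The offer-and-refuse behaviour described above can be simulated in a few lines. This is a toy model under stated assumptions, not how Cyclone or any real server is implemented: the `Server` class, its `offer` method, and the in-memory `history` set are all hypothetical, and a real history is a persistent database keyed by Message-ID.

```python
class Server:
    """Toy transit server: accepts unseen Message-IDs and floods them to peers."""

    def __init__(self, name: str):
        self.name = name
        self.peers = []            # other Server objects this one feeds
        self.history = set()       # Message-IDs already seen
        self.offers_refused = 0

    def offer(self, message_id: str, path: list) -> bool:
        """IHAVE-style offer. Returns True if the article was accepted."""
        if message_id in self.history:
            self.offers_refused += 1   # already seen: refuse, do not re-flood
            return False
        self.history.add(message_id)
        new_path = [self.name] + path  # prepend our name, as in the Path header
        for peer in self.peers:
            if peer.name not in new_path:   # skip servers the article already visited
                peer.offer(message_id, new_path)
        return True


def connect(a: Server, b: Server) -> None:
    """Make two servers peers of each other."""
    a.peers.append(b)
    b.peers.append(a)


if __name__ == "__main__":
    a, b, c = Server("a"), Server("b"), Server("c")
    connect(a, b)
    connect(b, c)
    connect(a, c)
    a.offer("<1@example.invalid>", [])
    for s in (a, b, c):
        print(s.name, sorted(s.history), "refused:", s.offers_refused)
```

In the triangle above every server ends up with the article exactly once, and the redundant offer from a to c is refused by c's history.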

Article Feeding II: NNTP STREAMING: 

Article Feeding II: NNTP STREAMING
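
As a rough sketch of the streaming idea: the sender issues several CHECK commands without waiting, then sends TAKETHIS only for the articles the peer asked for. The response codes used here (238 send it, 431 try again later, 438 not wanted) are the ones documented for the NNTP streaming extension; the function names and the way responses are batched are assumptions for illustration.

```python
def check_command(message_id: str) -> str:
    """Wire form of a CHECK for one Message-ID."""
    return f"CHECK {message_id}\r\n"


def classify_check_response(line: str) -> str:
    """Map a CHECK response line to an action: send, defer, skip, or error."""
    parts = line.split()
    code = parts[0] if parts else ""
    return {"238": "send", "431": "defer", "438": "skip"}.get(code, "error")


def plan_takethis(responses: list) -> tuple:
    """Return (message-ids to TAKETHIS now, message-ids to CHECK again later)."""
    send_now, retry_later = [], []
    for line in responses:
        parts = line.split()
        if len(parts) < 2:
            continue                      # malformed response: ignore in this sketch
        action = classify_check_response(line)
        if action == "send":
            send_now.append(parts[1])
        elif action == "defer":
            retry_later.append(parts[1])
    return send_now, retry_later


if __name__ == "__main__":
    ids = ["<a@example.invalid>", "<b@example.invalid>", "<c@example.invalid>"]
    print("".join(check_command(i) for i in ids), end="")
    replies = ["238 <a@example.invalid>",
               "438 <b@example.invalid>",
               "431 <c@example.invalid>"]
    print(plan_takethis(replies))
```

Because the CHECKs are pipelined, the round-trip cost is paid once per batch rather than once per article, which is what makes IHAVE look "chatty" by comparison.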

Article Feeding III: Adaptive Streaming: 

Article Feeding III: Adaptive Streaming
  Key Question: How many CHECKs should you send?
  Multiple modes, appropriate for all different load conditions
  Optimized for “getting articles into” another server
  Key Cyclone differentiator

Article Storage: 

Article Storage
  In the Highwinds servers, articles are stored in Spools.
  Articles are indexed for retrieval by readers.
  The Active File and the Overview Database create the “illusion” of groups.

Space Utilization : 

Space Utilization
  Spools are always full
  Articles expire based on SPACE, not TIME
  You decide how much space to allocate, by hierarchy or article size
  New data overwrites old data, as needed, and without any pause for cleanup
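
Space-based expiration can be pictured as a fixed-size circular buffer: the write position wraps around and whatever it overwrites has expired. The class below is a conceptual sketch only. `CyclicSpool` and its in-memory index are hypothetical and do not describe the actual Highwinds on-disk format.

```python
class CyclicSpool:
    """Fixed-size store where new articles overwrite the oldest data."""

    def __init__(self, capacity: int):
        self.capacity = capacity
        self.buf = bytearray(capacity)   # space is allocated once, up front
        self.write_pos = 0
        self.index = {}                  # message_id -> (offset, length)

    def store(self, message_id: str, data: bytes) -> None:
        n = len(data)
        if n > self.capacity:
            raise ValueError("article larger than spool")
        if self.write_pos + n > self.capacity:
            self.write_pos = 0           # wrap around to the start of the spool
        start, end = self.write_pos, self.write_pos + n
        # Any article overlapping the region about to be written has expired.
        for mid, (off, length) in list(self.index.items()):
            if off < end and start < off + length:
                del self.index[mid]
        self.buf[start:end] = data
        self.index[message_id] = (start, n)
        self.write_pos = end

    def fetch(self, message_id: str):
        """Return the article bytes, or None if it has been overwritten."""
        loc = self.index.get(message_id)
        if loc is None:
            return None
        off, n = loc
        return bytes(self.buf[off:off + n])


if __name__ == "__main__":
    spool = CyclicSpool(10)
    spool.store("<a>", b"aaaa")
    spool.store("<b>", b"bbbb")
    spool.store("<c>", b"cccc")          # does not fit at the end, so it wraps over <a>
    print(spool.fetch("<a>"), spool.fetch("<b>"), spool.fetch("<c>"))
```

There is no separate expire pass in this model: storing an article is the only operation that removes one, which matches the "no pause for cleanup" point on the slide.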

Data Storage Philosophy: 

Data Storage Philosophy
  Customized Data Structures
  Maximize Utilization
  Maximize Locality
  Avoid inode hits
  Allocate space at install time
  RESULT: Block-Oriented Storage

Reading Articles: 

Reading Articles
  LIST ACTIVE: What groups exist?
  GROUP: Select a group for reading
  XOVER: What articles exist?
  ARTICLE: Finally, read an article
  All of these commands depend on article numbers.
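
A newsreader's side of this exchange can be sketched without a network connection by building the commands and parsing typical responses. The GROUP reply layout (211 count first last group) and the tab-separated XOVER fields follow the published NNTP documents; the function names and the sample values are made up for illustration.

```python
OVERVIEW_FIELDS = ("number", "subject", "from", "date",
                   "message_id", "references", "bytes", "lines")


def reading_commands(group: str, first: int, last: int, article_number: int) -> list:
    """The command sequence a simple reader would send, in order."""
    return ["LIST ACTIVE",
            f"GROUP {group}",
            f"XOVER {first}-{last}",
            f"ARTICLE {article_number}"]


def parse_group_response(line: str) -> dict:
    """Parse '211 count first last group' into a dict of article-number bounds."""
    parts = line.split()
    if len(parts) < 5 or parts[0] != "211":
        raise ValueError(f"unexpected GROUP response: {line!r}")
    return {"group": parts[4], "count": int(parts[1]),
            "first": int(parts[2]), "last": int(parts[3])}


def parse_xover_line(line: str) -> dict:
    """Parse one tab-separated overview line into named fields."""
    entry = dict(zip(OVERVIEW_FIELDS, line.rstrip("\r\n").split("\t")))
    entry["number"] = int(entry["number"])
    return entry


if __name__ == "__main__":
    info = parse_group_response("211 120 3000 3119 comp.os.linux.portable")
    print(info)
    print(reading_commands(info["group"], info["first"], info["last"], info["last"]))
    sample = ("3119\tRe: old IBM thinkpads and linux?\thbarta@miles.wwa.com\t"
              "Wed, 08 Sep 1999 13:52:13 GMT\t<hStB3.112$DY.3777@ord-read.news.verio.net>\t"
              "<19990728232115.18753.00003295@ng-ba1.aol.com>\t1024\t12")
    print(parse_xover_line(sample)["subject"])
```

Note how every step after LIST ACTIVE is expressed in article numbers, which are local to the reader server; Message-IDs only appear as data inside the overview.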

How to Write a News Reader: 

How to Write a News Reader

How to Write a News Reader II: 

How to Write a News Reader II

Structure of the Highwinds Servers: 

Structure of the Highwinds Servers
  Installation uses a hellishly complicated object-oriented GUI-driven tool called... tar.
  Three configuration files control the server behavior:
    start.conf for command-line arguments
    <server>.conf for storage declarations and whole-server parameters
    feeds.conf for controlling how the server communicates with other servers and with users

start.conf: 

start.conf
  Contains parameters most likely to change during server tuning
  Is called by the bin/start script

<server>.conf: 

<server>.conf Contains mostly parameters having to do with storage Declares the filesystem paths for storage objects SPOOLS OVERVIEWS HISTORY ACTIVE FILE OVERVIEW CACHES

feeds.conf: 

feeds.conf
  Allows you to tune server-server and server-browser communication in excruciating detail
  Virtual Servering - the server appears different to different users
  Fine-grained feeding control - Cyclone
  Virtual-servering primitives: IncomingHostnames, rate limiting, groups visible
  Cyclone feeding primitives: backlogs, retry and failover settings

Virtual Servering on Steroids: Authentication Programs: 

Virtual Servering on Steroids: Authentication Programs
  Ultra-fine control
  Server-spawned program
  Multiple instances allowed
  Full real-time override of all virtual server parameters
  Intercepts allowed at connect
  Can force users to authenticate with username/password

Death to Spammers: SPAM filtering: 

Death to Spammers: SPAM filtering
  Server-spawned slave program
  With -fastfilter, parallel filters and shared-memory communication
  Program allows or denies each incoming article
  The post filter can do spam filtering for locally-sourced articles
  Delayed acknowledgement
  False acknowledgement

Details of Content Control : 

Details of Content Control
  Subscription
  FilterSubscription
  Globs
  Used for spools, overview caches, and virtual servering
  Examples:
    Subscription *
    FilterSubscription special.*
    Subscription basic.*
    FilterSubscription !basic.*
  CROSSPOSTING
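
To give a feel for glob-based group selection and why crossposting complicates it, here is a generic sketch. It does not reproduce the semantics of the Highwinds Subscription and FilterSubscription directives, which the slide does not spell out. It assumes a last-match-wins rule with "!" as negation, a convention borrowed from wildmat-style pattern lists in other news software, and it treats a crossposted article as wanted if any one of its groups is wanted.

```python
from fnmatch import fnmatchcase


def group_allowed(patterns: list, group: str) -> bool:
    """Apply patterns in order; the last one that matches decides (assumed rule)."""
    allowed = False
    for pattern in patterns:
        negate = pattern.startswith("!")
        glob = pattern[1:] if negate else pattern
        if fnmatchcase(group, glob):
            allowed = not negate
    return allowed


def article_allowed(patterns: list, newsgroups_header: str) -> bool:
    """A crossposted article is wanted if any listed group is wanted (assumed rule)."""
    groups = [g.strip() for g in newsgroups_header.split(",") if g.strip()]
    return any(group_allowed(patterns, g) for g in groups)


if __name__ == "__main__":
    patterns = ["*", "!alt.binaries.*"]   # hypothetical pattern list
    print(group_allowed(patterns, "comp.os.linux.portable"))
    print(group_allowed(patterns, "alt.binaries.pictures"))
    print(article_allowed(patterns, "alt.binaries.pictures,comp.os.linux.portable"))
```

The last call shows the crossposting wrinkle: an article posted to an excluded group still arrives if it is also posted to an included one, so storage and visibility policies need to account for that.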

Logfiles and Log Rotation: 

Logfiles and Log Rotation
  Many logfiles available:
    article log 0x, 0i, 0s, etc.
    Stats.in
    Stats.out
    Stats.reader
    Stats.group
  Logs are buffered
    bin/statsnow
  Rotation Recipe
    mv log log.old
    bin/statsnow
    sleep 5
    gzip log.old
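
The rotation recipe could be wrapped in a small script for use from cron. The sketch below mirrors the four steps in order; the function name `rotate_log` and its parameters are hypothetical, and it assumes it is run from the server's install directory so that `bin/statsnow` resolves. The slide does not document any arguments to `bin/statsnow`, so none are passed.

```python
import gzip
import os
import shutil
import subprocess
import time


def rotate_log(log_path: str,
               flush_cmd=("bin/statsnow",),
               settle_seconds: float = 5.0) -> str:
    """Rename the log, ask the server to flush, wait, then compress the old file."""
    old_path = log_path + ".old"
    os.rename(log_path, old_path)                  # mv log log.old
    subprocess.run(list(flush_cmd), check=True)    # bin/statsnow
    time.sleep(settle_seconds)                     # sleep 5
    gz_path = old_path + ".gz"
    with open(old_path, "rb") as src, gzip.open(gz_path, "wb") as dst:
        shutil.copyfileobj(src, dst)               # gzip log.old
    os.remove(old_path)                            # gzip removes its input file
    return gz_path


if __name__ == "__main__":
    import sys
    if len(sys.argv) == 2:
        print(rotate_log(sys.argv[1]))
    else:
        print("usage: rotate.py <logfile>")
```

The flush command and the settle delay are parameters so the sequence can be exercised against a scratch file before it is pointed at a live server.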

FINAL ADVICE: 

FINAL ADVICE
  READ the config files. They contain great examples.
  Use bin/validate before bin/restart.
  Decide your storage policies early.
  Get lots of spindles working for you.
  Check the syslog!