ac2005 advanced modules


Advanced Topics in Module Design: Threadsafety and Portability: 

Aaron Bannert
aaron@apache.org / aaron@codemass.com
http://www.codemass.com/~aaron/presentations/apachecon2005/ac2005advancedmodules.ppt

Thread-safe: 

From The Free On-line Dictionary of Computing (09 FEB 02) [foldoc]:

thread-safe: A description of code which is either re-entrant or protected from multiple simultaneous execution by some form of mutual exclusion. (1997-01-30)

APR: 

The Apache Portable Runtime

The APR Libraries: 

- APR: system-level “glue”
- APR-UTIL: portable routines built upon APR
- APR-ICONV: portable international character support

Glue Code vs. Portability Layer: 

- “Glue Code”
  - Common functional interface
  - Multiple implementations
  - e.g. db2, db3, db4, gdbm, …; sockets, file I/O, …
- “Portability Layer”
  - Routines that embody portability
  - e.g. Bucket Brigades, URI routines, …

What Uses APR?: 

- Apache HTTPD
- Apache Modules
- Subversion
- Flood
- JXTA-C
- Various ASF Internal Projects
- ...

The Basics: 

Some APR Primitive Types

A Who’s Who of Mutexes: 

- Types: apr_thread_mutex_t, apr_proc_mutex_t, apr_global_mutex_t
- apr_xxxx_mutex_lock(): grab the lock, or block until it is available
- apr_xxxx_mutex_unlock(): release the current lock

Normal vs. Nested Mutexes: 

- Normal mutexes (aka non-nested): deadlock when the same thread locks twice
- Nested mutexes: allow the same thread to lock multiple times (each lock must still be unrolled with a matching unlock)
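The difference between the two kinds of mutex can be sketched in a few lines. This is a minimal illustration that uses POSIX threads directly rather than APR, so that it builds without APR installed; the function names nested_lock_roundtrip and normal_relock_is_refused are hypothetical. The second function uses trylock on purpose: a blocking lock on a normal mutex that the same thread already holds would hang.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <pthread.h>

/* Lock a recursive ("nested") mutex `depth` times from one thread, then
 * unroll every lock. Returns 0 on success, -1 on any failure. */
int nested_lock_roundtrip(int depth)
{
    pthread_mutexattr_t attr;
    pthread_mutex_t mutex;
    int i, locked = 0, rv = 0;

    assert(depth >= 0);
    if (pthread_mutexattr_init(&attr) != 0) {
        return -1;
    }
    if (pthread_mutexattr_settype(&attr, PTHREAD_MUTEX_RECURSIVE) != 0 ||
        pthread_mutex_init(&mutex, &attr) != 0) {
        pthread_mutexattr_destroy(&attr);
        return -1;
    }
    pthread_mutexattr_destroy(&attr);

    for (i = 0; i < depth; i++) {
        if (pthread_mutex_lock(&mutex) != 0) {
            rv = -1;
            break;
        }
        locked++;
    }
    /* A nested mutex stays held until every lock has been matched. */
    while (locked-- > 0) {
        pthread_mutex_unlock(&mutex);
    }
    pthread_mutex_destroy(&mutex);
    return rv;
}

/* Try to re-acquire a normal mutex that this thread already holds.
 * Returns 1 if the second attempt was refused with EBUSY, 0 if it was
 * not, and -1 if the mutex could not be set up. */
int normal_relock_is_refused(void)
{
    pthread_mutex_t mutex;
    int second;

    if (pthread_mutex_init(&mutex, NULL) != 0) {
        return -1;
    }
    pthread_mutex_lock(&mutex);
    second = pthread_mutex_trylock(&mutex);
    pthread_mutex_unlock(&mutex);
    pthread_mutex_destroy(&mutex);
    return second == EBUSY;
}
```

In APR the same choice is made when the mutex is created, which is why it pays to decide up front whether any code path can re-enter a locked region.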

Reader/Writer Locks: 

- Type: apr_thread_rwlock_t
- apr_thread_rwlock_rdlock(): grab the shared read lock; blocks while any writer holds the lock
- apr_thread_rwlock_wrlock(): grab the exclusive write lock, blocking new readers
- apr_thread_rwlock_unlock(): release the current lock

Condition Variables: 

- Type: apr_thread_cond_t
- apr_thread_cond_wait(): sleep until any signal arrives
- apr_thread_cond_signal(): send a signal to one waiting thread
- apr_thread_cond_broadcast(): send a signal to all waiting threads
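A condition variable is always paired with a mutex and a predicate that the waiter re-checks in a loop. The sketch below shows that pattern with POSIX threads standing in for the APR calls; the mailbox struct and the function names are hypothetical and exist only for this illustration.

```c
#include <assert.h>
#include <pthread.h>

/* Hypothetical shared state: a value plus a "ready" predicate, both
 * protected by `lock`. */
typedef struct {
    pthread_mutex_t lock;
    pthread_cond_t ready_cond;
    int ready;
    int value;
} mailbox;

/* Producer thread: publish a value, then wake one waiter. */
static void *mailbox_producer(void *arg)
{
    mailbox *box = arg;

    pthread_mutex_lock(&box->lock);
    box->value = 42;
    box->ready = 1;
    pthread_cond_signal(&box->ready_cond);
    pthread_mutex_unlock(&box->lock);
    return NULL;
}

/* Start a producer and sleep until it signals. Returns the published
 * value, or -1 if the thread could not be created. */
int wait_for_value(void)
{
    mailbox box;
    pthread_t thread;
    int result;

    pthread_mutex_init(&box.lock, NULL);
    pthread_cond_init(&box.ready_cond, NULL);
    box.ready = 0;
    box.value = 0;

    if (pthread_create(&thread, NULL, mailbox_producer, &box) != 0) {
        pthread_cond_destroy(&box.ready_cond);
        pthread_mutex_destroy(&box.lock);
        return -1;
    }

    pthread_mutex_lock(&box.lock);
    /* Loop on the predicate: a wakeup alone does not prove the value
     * is ready, and the signal may already have been sent. */
    while (!box.ready) {
        pthread_cond_wait(&box.ready_cond, &box.lock);
    }
    assert(box.ready);
    result = box.value;
    pthread_mutex_unlock(&box.lock);

    pthread_join(thread, NULL);
    pthread_cond_destroy(&box.ready_cond);
    pthread_mutex_destroy(&box.lock);
    return result;
}
```

Because the waiter checks the flag before sleeping, the sketch behaves the same whether the producer runs before or after the wait begins.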

Threads: 

- Type: apr_thread_t
- apr_thread_create(): create a new thread (with specialized attributes)
- apr_thread_exit(): exit from the current thread (with a return value)
- apr_thread_join(): wait until another thread exits

One-time Calls: 

- Type: apr_thread_once_t
- apr_thread_once_init(): initialize an apr_thread_once_t variable
- apr_thread_once(): execute the given function once
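The one-time-call idea can be sketched with pthread_once, the POSIX counterpart of the APR calls listed here. The names init_shared_state, once_worker, and count_init_calls are hypothetical; the point is only that several threads race to the same call and the initializer still runs once.

```c
#include <assert.h>
#include <pthread.h>

#define MAX_ONCE_THREADS 8

static pthread_once_t init_once = PTHREAD_ONCE_INIT;
static int init_calls = 0; /* how many times the initializer actually ran */

/* The one-time initializer. pthread_once guarantees a single execution,
 * so the plain increment needs no extra locking. */
static void init_shared_state(void)
{
    init_calls++;
}

static void *once_worker(void *arg)
{
    (void)arg;
    pthread_once(&init_once, init_shared_state);
    return NULL;
}

/* Race `nthreads` threads through the same one-time call and report
 * how many times the initializer ran. Returns -1 if a thread could not
 * be created. */
int count_init_calls(int nthreads)
{
    pthread_t threads[MAX_ONCE_THREADS];
    int i, started = 0;

    assert(nthreads > 0 && nthreads <= MAX_ONCE_THREADS);
    for (i = 0; i < nthreads; i++) {
        if (pthread_create(&threads[i], NULL, once_worker, NULL) != 0) {
            break;
        }
        started++;
    }
    for (i = 0; i < started; i++) {
        pthread_join(threads[i], NULL);
    }
    return started == nthreads ? init_calls : -1;
}
```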

Apache 2.x Architecture: 

A quick MPM overview

What’s new in Apache 2.x?: 

- Filters
- MPMs
- Multithreaded Server
- Native OS Optimizations
- SSL Encryption
- Improved Proxy and Cache
- lots more…

What is an MPM?: 

- “Multi-processing Module”
- Different HTTP server process models
- Each gives us platform-specific features
- The admin may choose a suitable MPM based on:
  - Reliability
  - Performance
  - Features

Prefork MPM: 

- Classic Apache 1.3 model
- 1 connection per Child
- Pros:
  - Isolates faults
  - Performs well
- Cons:
  - Scales poorly (high memory requirements)

[Diagram: one Parent with 100s of Child processes]

Worker MPM: 

- Hybrid Process/Thread
- 1 connection per Thread
- Many threads per Child
- Pros:
  - Efficient use of memory
  - Highly scalable
- Cons:
  - Faults destroy all threads in that Child
  - 3rd party libraries must be threadsafe

[Diagram: one Parent with 10s of Child processes, each with 10s of threads]

WinNT MPM: 

- Single Parent/Single Child
- 1 connection per Thread
- Many threads per Child
- Pros:
  - Efficient use of memory
  - Highly scalable
- Cons:
  - Faults destroy all threads

[Diagram: one Parent with one Child running 100s of threads]

The MPM Breakdown: 

* The WinNT MPM has a single parent and a single child.

Other MPMs: 

- BeOS
- Netware
- Threadpool: similar to Worker, experimental
- Leader-Follower: similar to Worker, also experimental

Apache 2.x Hooks: 

Threadsafety within the Apache Framework

Useful APR Primitives: 

- mutexes
- reader/writer locks
- condition variables
- shared memory
- ...

Global Mutex Creation: 

- Create it in the Parent: usually in the post_config hook
- Attach to it in the Child: this is the child_init hook

Example: Create a Global Mutex: 

    static int shm_counter_post_config(apr_pool_t *pconf, apr_pool_t *plog,
                                       apr_pool_t *ptemp, server_rec *s)
    {
        apr_status_t rv;
        shm_counter_scfg_t *scfg;

        /* Get the module configuration */
        scfg = ap_get_module_config(s->module_config, &shm_counter_module);

        /* Create a global mutex from the config directive */
        rv = apr_global_mutex_create(&scfg->mutex, scfg->shmcounterlockfile,
                                     APR_LOCK_DEFAULT, pconf);

Example: Attach Global Mutex: 

    static void shm_counter_child_init(apr_pool_t *p, server_rec *s)
    {
        apr_status_t rv;
        shm_counter_scfg_t *scfg = ap_get_module_config(s->module_config,
                                                        &shm_counter_module);

        /* Now that we are in a child process, we have to
         * reconnect to the global mutex. */
        rv = apr_global_mutex_child_init(&scfg->mutex,
                                         scfg->shmcounterlockfile, p);

Common Pitfall: 

The double DSO-load problem:
- Apache loads each module twice:
  - First time to see if it fails at startup
  - Second time to actually load it
- Modules are also reloaded after each restart

Avoiding the Double DSO-load: 

Solution: don’t create mutexes during the first load.
- The first time in post_config, we set a userdata flag
- The next time through, we look for that userdata flag
- If it is set, we create the mutex

What is Userdata?: 

- Just a hash table
- Associated with each pool
- Same lifetime as its pool
- Key/Value entries

Example: Double DSO-load: 

    static int shm_counter_post_config(apr_pool_t *pconf, apr_pool_t *plog,
                                       apr_pool_t *ptemp, server_rec *s)
    {
        apr_status_t rv;
        void *data = NULL;
        const char *userdata_key = "shm_counter_post_config";

        apr_pool_userdata_get(&data, userdata_key, s->process->pool);
        if (data == NULL) {
            /* WARNING: This must *not* be apr_pool_userdata_setn(). */
            apr_pool_userdata_set((const void *)1, userdata_key,
                                  apr_pool_cleanup_null, s->process->pool);
            return OK; /* This would be the first time through */
        }

        /* Proceed with normal mutex and shared memory creation . . . */
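The control flow of the userdata flag can be traced without Apache at all. The following is a simulation only, assuming a tiny hand-rolled key/value table in place of the pool userdata; userdata_table, simulated_post_config, and setup_runs are hypothetical names that do not exist in APR or httpd.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_ENTRIES 8

/* Hypothetical stand-in for a pool's userdata: a small key/value table
 * that lives as long as its owner (here, the whole process). */
typedef struct {
    const char *keys[MAX_ENTRIES];
    void *values[MAX_ENTRIES];
    size_t count;
} userdata_table;

void userdata_table_init(userdata_table *t)
{
    memset(t, 0, sizeof(*t));
}

/* Look up a key; returns NULL when the key was never set. */
void *userdata_get(const userdata_table *t, const char *key)
{
    size_t i;

    assert(key != NULL);
    for (i = 0; i < t->count; i++) {
        if (strcmp(t->keys[i], key) == 0) {
            return t->values[i];
        }
    }
    return NULL;
}

/* Append a key/value pair (this sketch sets each key only once).
 * Returns 0 on success, -1 when the table is full. */
int userdata_set(userdata_table *t, const char *key, void *value)
{
    assert(key != NULL);
    if (t->count >= MAX_ENTRIES) {
        return -1;
    }
    t->keys[t->count] = key;
    t->values[t->count] = value;
    t->count++;
    return 0;
}

/* Counts how often the expensive setup (mutex, shared memory) ran. */
int setup_runs = 0;

/* Simulated post_config hook: the first call only plants the flag,
 * every later call performs the real setup. */
int simulated_post_config(userdata_table *process_data)
{
    static const char *flag_key = "shm_counter_post_config";
    static int flag_value = 1;

    if (userdata_get(process_data, flag_key) == NULL) {
        if (userdata_set(process_data, flag_key, &flag_value) != 0) {
            return -1;
        }
        return 0; /* first time through: skip setup */
    }
    setup_runs++;
    return 0;
}
```

Calling simulated_post_config twice on the same table leaves setup_runs at 1, which mirrors the intent of the hook above: the throwaway first load does no real work.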

Summary: 

- Create in the Parent (post_config)
- Attach in the Child (child_init)
- This works for these types:
  - mutexes
  - condition variables
  - reader/writer locks
  - shared memory
  - etc…

Shared Memory: 

Efficient and portable shared memory for your Apache module

Types of Shared Memory: 

- Anonymous
  - Requires process inheritance
  - Created in the parent
  - Automatically inherited in the child
- Name-based
  - Associated with a file
  - Processes need not be ancestors
  - Must deal with file permissions

Anonymous Shared Memory: 

[Diagram: Parent]

Example: Anonymous Shmem: 

    static int shm_counter_post_config(apr_pool_t *pconf, apr_pool_t *plog,
                                       apr_pool_t *ptemp, server_rec *s)
    {
        apr_status_t rv;
        ...
        /* Create an anonymous shared memory segment by passing
         * a NULL as the shared memory filename */
        rv = apr_shm_create(&scfg->counters_shm, sizeof(*scfg->counters),
                            NULL, pconf);

Accessing the Segment: 

    scfg->counters = apr_shm_baseaddr_get(scfg->counters_shm);

- The segment is mapped as soon as it is created
- It has a start address
- You can query that start address
- Reminder: the segment may not be mapped to the same address in all processes
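The inheritance model behind anonymous shared memory can be seen in a small POSIX-only sketch. It assumes a platform with fork() and MAP_ANONYMOUS and calls mmap directly instead of apr_shm_create; the function name shared_counter_after_child is hypothetical.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE 1 /* exposes MAP_ANONYMOUS on glibc */
#endif
#include <assert.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Create an anonymous shared segment in the parent, let a forked child
 * increment the counter stored in it, and return the value the parent
 * sees afterwards. Returns -1 on failure. */
int shared_counter_after_child(int start)
{
    int *counter;
    int result, status = 0;
    pid_t pid;

    assert(start >= 0); /* keeps -1 unambiguous as the error value */

    counter = mmap(NULL, sizeof(*counter), PROT_READ | PROT_WRITE,
                   MAP_SHARED | MAP_ANONYMOUS, -1, 0);
    if (counter == MAP_FAILED) {
        return -1;
    }
    *counter = start;

    pid = fork();
    if (pid < 0) {
        munmap(counter, sizeof(*counter));
        return -1;
    }
    if (pid == 0) {
        /* Child: the mapping was inherited, no attach step is needed. */
        (*counter)++;
        _exit(0);
    }

    /* Parent: wait for the child, then read what it wrote. */
    if (waitpid(pid, &status, 0) < 0) {
        munmap(counter, sizeof(*counter));
        return -1;
    }
    result = *counter;
    munmap(counter, sizeof(*counter));
    return result;
}
```

With fork() the child happens to see the segment at the same address as the parent; code that attaches by name from an unrelated process cannot rely on that, so pointers should not be stored inside the segment.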

Windows Portability: 

- Windows can’t inherit shared memory: it has no fork() call!
- Solution: just like we did with mutexes, the “child” process attaches
- Hint: to be portable to Windows, we can only use name-based shared memory

Name-based Shared Memory: 


Sharing with external apps: 

- Must use name-based shm
- Associate it with a file
- The other programs can attach to that file
- Beware of race conditions: order of file creation and attaching
- Beware of weak file permissions (note previous security problem in Apache scoreboard)

Example: Name-based Shmem: 

    static int shm_counter_post_config(apr_pool_t *pconf, apr_pool_t *plog,
                                       apr_pool_t *ptemp, server_rec *s)
    {
        apr_status_t rv;
        shm_counter_scfg_t *scfg;
        ...
        /* Get the module configuration */
        scfg = ap_get_module_config(s->module_config, &shm_counter_module);

        /* Create a name-based shared memory segment using the filename
         * out of our config directive */
        rv = apr_shm_create(&scfg->counters_shm, sizeof(*scfg->counters),
                            scfg->shmcounterfile, pconf);

Example: Name-based Shmem (cont): 

    static void shm_counter_child_init(apr_pool_t *p, server_rec *s)
    {
        apr_status_t rv;
        shm_counter_scfg_t *scfg = ap_get_module_config(s->module_config,
                                                        &shm_counter_module);

        rv = apr_shm_attach(&scfg->counters_shm, scfg->shmcounterfile, p);
        scfg->counters = apr_shm_baseaddr_get(scfg->counters_shm);

RMM (Relocatable Memory Manager): 

- Provides malloc() and free()
- Works with any block of memory
- Estimates overhead
- Thread-safe
- Usable on shared memory segments

Efficiency: 

Tricks of the Trade

Questions to ask yourself:: 

- Uniprocessor or Multiprocessor?
- What Operating System(s)?
- How can we minimize or eliminate our critical code sections?
- Exclusive access or read/write access?

APR Lock Performance Mac OS X 10.2.x PowerPC: 

[Chart: APR lock performance on Mac OS X 10.2.x, PowerPC; lower is better]

APR Lock Performance Linux 2.4.18 (Redhat 7.3): 

[Chart: APR lock performance on Linux 2.4.18 (Redhat 7.3); lower is better]

APR Lock Performance Linux 2.4.20 SMP (Redhat 9): 

[Chart: APR lock performance on Linux 2.4.20 SMP (Redhat 9); lower is better]

APR Lock Performance Solaris 2.9 x86: 

[Chart: APR lock performance on Solaris 2.9 x86; lower is better]

Relative Mutex Performance Comparing Normal Mutexes: 

[Chart: relative performance of normal mutexes; lower is better]

Relative Mutex Performance Comparing Nested Mutexes: 

[Chart: relative performance of nested mutexes; lower is better]

Relative R/W Lock Performance Comparing Read/Write Locks: 

[Chart: relative performance of read/write locks; lower is better]

R/W Locks vs. Mutexes: 

- Reader/Writer locks allow parallel reads
- APR’s nested mutexes are slow
- Reader/Writer locks tend to scale much better
- SMP hurts lock-heavy tasks

OS Observations: 

- Solaris has very fast and stable locks
- Linux is struggling but getting faster
  - NPTL shows improvement in overall thread performance, but not in lock overhead
- Mac OS X (Jaguar) is stable and moderately fast
  - rwlocks could be improved

APR Atomics: 

- Very fast operations
- Can implement a very fast mutex
- Pros:
  - Can be very efficient (sometimes it becomes just one instruction)
- Cons:
  - Produces non-portable binaries (e.g. a Solaris 7 binary may not work on Solaris 8)
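To make the "very fast mutex" remark concrete, here is a minimal spinlock sketch built on a single atomic test-and-set. It uses C11 <stdatomic.h> and POSIX threads as stand-ins rather than the APR atomics API, and the names spin_mutex, spin_lock, spin_unlock, and run_spin_counter are hypothetical.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>

#define MAX_SPIN_THREADS 16

/* A mutex that is nothing more than one atomic flag. */
typedef struct {
    atomic_flag held;
} spin_mutex;

static spin_mutex counter_lock = { ATOMIC_FLAG_INIT };
static long shared_counter = 0;

/* Spin until the flag was previously clear, i.e. until we own it. */
void spin_lock(spin_mutex *m)
{
    while (atomic_flag_test_and_set_explicit(&m->held,
                                             memory_order_acquire)) {
        /* busy-wait */
    }
}

void spin_unlock(spin_mutex *m)
{
    atomic_flag_clear_explicit(&m->held, memory_order_release);
}

/* Each worker increments the shared counter `iterations` times. */
static void *spin_worker(void *arg)
{
    long iterations = *(const long *)arg;
    long i;

    for (i = 0; i < iterations; i++) {
        spin_lock(&counter_lock);
        shared_counter++;
        spin_unlock(&counter_lock);
    }
    return NULL;
}

/* Run `nthreads` workers and return the final counter value, or -1 if
 * a thread could not be created. */
long run_spin_counter(int nthreads, long iterations)
{
    pthread_t threads[MAX_SPIN_THREADS];
    int i, started = 0;

    assert(nthreads > 0 && nthreads <= MAX_SPIN_THREADS);
    shared_counter = 0;
    for (i = 0; i < nthreads; i++) {
        if (pthread_create(&threads[i], NULL, spin_worker,
                           &iterations) != 0) {
            break;
        }
        started++;
    }
    for (i = 0; i < started; i++) {
        pthread_join(threads[i], NULL);
    }
    return started == nthreads ? shared_counter : -1;
}
```

A spinlock like this burns CPU while it waits, so it only suits very short critical sections; for anything longer a blocking mutex is usually the better choice.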

Threads: 

Adding threads to your Apache modules

Why use threads in Apache?: 

- background processing
- asynchronous event handling
- pseudo-event-driven models
- high concurrency services
- low latency services

Thread Libraries: 

Three major types:
- 1:1: one kthread = one userspace thread
- 1:N: one kthread = many userspace threads
- M:N: many kthreads ~= many userspace threads

1:1 Thread Libraries: 

- E.g. LinuxThreads, NPTL (Linux 2.6?), Solaris 9+’s threads, etc.
- Good with an O(1) scheduler
- Can span multiple CPUs
- Resource intensive

[Diagram: a Process with userspace threads thread1 to thread3, above kernel threads kthread1 to kthread6]

1:N Thread Libraries: 

- E.g. GNU Pth, FreeBSD <4.6?, etc.
- Shares one kthread
- Can NOT span multiple CPUs
- Not resource intensive
- Poor with compute-bound problems

[Diagram: a Process with userspace threads thread1 to thread3, above kernel threads kthread1 to kthread6]

M:N Thread Libraries: 

- E.g. NGPT (from IBM), Solaris 6, 7, 8, AIX, etc.
- Shares one or more kthreads
- Can span multiple CPUs
- Complicated implementation
- Good with crappy schedulers

[Diagram: a Process with userspace threads thread1 to thread3, above kernel threads kthread1 to kthread6]

Pitfalls: 

- pool association
- cleanup registration
- proper shutdown
- async signal handling
- signal masks

Bonus: apr_reslist_t: 

Resource Lists

Resource Pooling: 

- List of resources
- Created/destroyed as needed
- Useful for:
  - persistent database connections
  - request servicing threads
  - ...

Reslist Parameters: 

- min: minimum allowed available resources
- smax: soft maximum allowed available resources
- hmax: hard maximum on total resources
- ttl: maximum time an available resource may idle

Constructor/Destructor: 

- Registered callbacks
  - Create: called for a new resource
  - Destroy: called when expunging an old resource
- Implementer must ensure threadsafety

Using a Reslist: 

- Set up constructor/destructor
- Set operating parameters
- Main loop:
  - Retrieve resource
  - Use
  - Release resource
- Destroy reslist

Thank You: 

The End