logging in or signing up gc Pumbaa Download Post to : URL : Related Presentations : Share Add to Flag Embed Email Send to Blogs and Networks Add to Channel Uploaded from authorPOINTLite Insert YouTube videos in PowerPont slides with aS Desktop Copy embed code: (To copy code, click on the text box) Embed: URL: Thumbnail: WordPress Embed Customize Embed The presentation is successfully added In Your Favorites. Views: 883 Category: Entertainment License: All Rights Reserved Like it (0) Dislike it (0) Added: January 03, 2008 This Presentation is Public Favorites: 0 Presentation Description No description available. Comments Posting comment... Premium member Presentation Transcript Garbage collection: Garbage collection David Walker CS 320GC: GC Every modern programming language allows programmers to allocate new storage dynamically New records, arrays, tuples, objects, closures, etc. Every modern language needs facilities for reclaiming and recycling the storage used by programs It’s usually the most complex aspect of the run-time system for any modern language (Java, ML, Lisp, Scheme, Modula, …)Memory layout: Memory layout static data stack heap grows to preset limit new pages allocated via calls to OS physical memory TLB address translation per process virtual memoryGC: GC What is garbage? A value is garbage if it will not be used in any subsequent computation by the program Is it easy to determine which objects are garbage? GC: GC What is garbage? A value is garbage if it will not be used in any subsequent computation by the program Is it easy to determine which objects are garbage? No. It’s undecidable. Eg: if long-and-tricky-computation then use v else don’t use v GC: GC Since determining which objects are garbage is tricky, people have come up with many different techniques It’s the programmers problem: Explicit allocation/deallocation Reference counting Tracing garbage collection Mark-sweep, copying collection Generational GC Explicit MM: Explicit MM User library manages memory; programmer decides when and where to allocate and deallocate void* malloc(long n) void free(void *addr) Library calls OS for more pages when necessary Advantage: people are smart Disadvantage: people are dumb and they really don’t want to bother with such details if they can avoid itExplicit MM: Explicit MM How does malloc/free work? Blocks of unused memory stored on a freelist malloc: search free list for usable memory block free: put block onto the head of the freelist freelistExplicit MM: Explicit MM Drawbacks malloc is not free: we might have to do a significant search to find a big enough block As program runs, the heap fragments leaving many small, unusable pieces freelistExplicit MM: Explicit MM Solutions: Use multiple free lists, one for each block size Malloc and free become O(1) But can run out of size 4 blocks, even though there are many size 6 blocks or size 2 blocks! Blocks are powers of 2 Subdivide blocks to get the right size Adjacent free blocks merged into the next biggest size Overall, external fragmentation traded for internal fragmentation 30% wasted space No magic bullet: memory management always has a costAutomatic MM: Automatic MM Languages with explicit MM are much harder to program than languages with automatic MM Always worrying about dangling pointers, memory leaks: a huge software engineering burden Impossible to develop a secure system, impossible to use these languages in emerging applications involving mobile code Soon, languages with unsafe, explicit MM will all but disappearAutomatic MM: Automatic MM Question: how do we decide which objects are garbage? We conservatively approximate Normal solution: an object is garbage when it becomes unreachable from the roots The roots = registers, stack, global static data If there is no path from the roots to an object, it cannot be used later in the computation so we can safely recycle its memoryObject Graph: Object Graph How should we test reachability? r1 stack r2Reference Counting: Reference Counting Keep track of the number of pointers to each object (the reference count). When the reference count goes to 0, the object is unreachable garbageObject Graph: Object Graph Reference counting can’t detect cycles r1 stack r2Reference Counting: Reference Counting In place of a single assignment x.f = p: z = x.f c = z.count c = c – 1 z.count = c If c = 0 call putOnFreeList(z) x.f = p c = p.count c = c + 1 p.Count = c Ouch, that hurts performace-wise! Dataflow analysis can eliminate some increments and decrements, but many remain Reference counting used in some special cases but not usually as the primary GC mechanism in a language implementationCopying Collection: Copying Collection Basic idea: use 2 heaps One used by program The other unused until GC time GC: Start at the roots & traverse the reachable data Copy reachable data from the active heap (from-space) to the other heap (to-space) Dead objects are left behind in from space Heaps switch rolesCopying Collection: Copying Collection to-space from-space roots Copying GC: Copying GC Cheny’s algorithm for copying collection Traverse data breadth first, copying objects from from-space to to-space root scan nextCopying GC: Copying GC Cheny’s algorithm for copying collection Traverse data breadth first, copying objects from from-space to to-space root scan next Copying GC: Copying GC Cheny’s algorithm for copying collection Traverse data breadth first, copying objects from from-space to to-space root scan next Copying GC: Copying GC Cheny’s algorithm for copying collection Traverse data breadth first, copying objects from from-space to to-space root scan next Copying GC: Copying GC Cheny’s algorithm for copying collection Traverse data breadth first, copying objects from from-space to to-space root scan next Copying GC: Copying GC Cheny’s algorithm for copying collection Traverse data breadth first, copying objects from from-space to to-space root scan next Copying GC: Copying GC Cheny’s algorithm for copying collection Traverse data breadth first, copying objects from from-space to to-space root scan next Done when next = scanCopying GC: Copying GC Cheny’s algorithm for copying collection Traverse data breadth first, copying objects from from-space to to-space root scan next Done when next = scanCopying GC: Copying GC Pros Simple & collects cycles Run-time proportional to # live objects Automatic compaction eliminates fragmentation Fast allocation: pointer increment by object size Cons Precise type information required (pointer or not) Tag bits take extra space; normally use header word Twice as much memory used as program requires Usually, we anticipate live data will only be a small fragment of store Allocate until 70% full From-space = 70% heap; to-space = 30% Long GC pauses = bad for interactive, real-time appsBaker’s Concurrent GC: Baker’s Concurrent GC GC pauses avoided by doing GC incrementally Program only holds pointers to to-space On field fetch, if pointer to from-space, copy object and fix pointer Extra fetch code = 20% performance penalty On swap, copy roots as beforeGenerational GC: Generational GC Empirical observation: if an object has been reachable for a long time, it is likely to remain so Empirical observation: in many languages (especially functional languages), most objects died young Conclusion: we save work by scanning the young objects frequently and the old objects infrequentlyGenerational GC: Generational GC Assign objects to different generations G0, G1,… G0 contains young objects, most likely to be garbage G0 scanned more often than G1 Common case is two generations (new, tenured) Roots for GC of G0 include all objects in G1 in addition to stack, registersGenerational GC: Generational GC How do we avoid scanning tenured objects? Observation: old objects rarely point to new objects Normally, object is created and when it initialized it will point to older objects, not newer ones Only happens if old object modified well after it is created In functional languages that use mutation infrequently, pointers from old to new are very uncommon Compiler inserts extra code on object field pointer write to catch modifications to old objects Remembered set is used to keep track of objects that point into younger generation. Remembered set included in set of roots for scanning.Generational GC: Generational GC Other issues When do we promote objects from young generation to old generation Usually after an object survives a collection, it will be promoted How big should the generations be? Appel says each should be exponentially larger than the last When do we collect the old generation? After several minor collections, we do a major collectionGenerational GC: Generational GC Other issues Sometimes different GC algorithms are used for the new and older generations. Why? Because the have different characteristics Copying collection for the new Less than 10% of the new data is usually live Copying collection cost is proportional to the live data Mark-sweep for the old We’ll get to this in the next class! You do not have the permission to view this presentation. In order to view it, please contact the author of the presentation.
gc Pumbaa Download Post to : URL : Related Presentations : Share Add to Flag Embed Email Send to Blogs and Networks Add to Channel Uploaded from authorPOINTLite Insert YouTube videos in PowerPont slides with aS Desktop Copy embed code: (To copy code, click on the text box) Embed: URL: Thumbnail: WordPress Embed Customize Embed The presentation is successfully added In Your Favorites. Views: 883 Category: Entertainment License: All Rights Reserved Like it (0) Dislike it (0) Added: January 03, 2008 This Presentation is Public Favorites: 0 Presentation Description No description available. Comments Posting comment... Premium member Presentation Transcript Garbage collection: Garbage collection David Walker CS 320GC: GC Every modern programming language allows programmers to allocate new storage dynamically New records, arrays, tuples, objects, closures, etc. Every modern language needs facilities for reclaiming and recycling the storage used by programs It’s usually the most complex aspect of the run-time system for any modern language (Java, ML, Lisp, Scheme, Modula, …)Memory layout: Memory layout static data stack heap grows to preset limit new pages allocated via calls to OS physical memory TLB address translation per process virtual memoryGC: GC What is garbage? A value is garbage if it will not be used in any subsequent computation by the program Is it easy to determine which objects are garbage? GC: GC What is garbage? A value is garbage if it will not be used in any subsequent computation by the program Is it easy to determine which objects are garbage? No. It’s undecidable. Eg: if long-and-tricky-computation then use v else don’t use v GC: GC Since determining which objects are garbage is tricky, people have come up with many different techniques It’s the programmers problem: Explicit allocation/deallocation Reference counting Tracing garbage collection Mark-sweep, copying collection Generational GC Explicit MM: Explicit MM User library manages memory; programmer decides when and where to allocate and deallocate void* malloc(long n) void free(void *addr) Library calls OS for more pages when necessary Advantage: people are smart Disadvantage: people are dumb and they really don’t want to bother with such details if they can avoid itExplicit MM: Explicit MM How does malloc/free work? Blocks of unused memory stored on a freelist malloc: search free list for usable memory block free: put block onto the head of the freelist freelistExplicit MM: Explicit MM Drawbacks malloc is not free: we might have to do a significant search to find a big enough block As program runs, the heap fragments leaving many small, unusable pieces freelistExplicit MM: Explicit MM Solutions: Use multiple free lists, one for each block size Malloc and free become O(1) But can run out of size 4 blocks, even though there are many size 6 blocks or size 2 blocks! Blocks are powers of 2 Subdivide blocks to get the right size Adjacent free blocks merged into the next biggest size Overall, external fragmentation traded for internal fragmentation 30% wasted space No magic bullet: memory management always has a costAutomatic MM: Automatic MM Languages with explicit MM are much harder to program than languages with automatic MM Always worrying about dangling pointers, memory leaks: a huge software engineering burden Impossible to develop a secure system, impossible to use these languages in emerging applications involving mobile code Soon, languages with unsafe, explicit MM will all but disappearAutomatic MM: Automatic MM Question: how do we decide which objects are garbage? We conservatively approximate Normal solution: an object is garbage when it becomes unreachable from the roots The roots = registers, stack, global static data If there is no path from the roots to an object, it cannot be used later in the computation so we can safely recycle its memoryObject Graph: Object Graph How should we test reachability? r1 stack r2Reference Counting: Reference Counting Keep track of the number of pointers to each object (the reference count). When the reference count goes to 0, the object is unreachable garbageObject Graph: Object Graph Reference counting can’t detect cycles r1 stack r2Reference Counting: Reference Counting In place of a single assignment x.f = p: z = x.f c = z.count c = c – 1 z.count = c If c = 0 call putOnFreeList(z) x.f = p c = p.count c = c + 1 p.Count = c Ouch, that hurts performace-wise! Dataflow analysis can eliminate some increments and decrements, but many remain Reference counting used in some special cases but not usually as the primary GC mechanism in a language implementationCopying Collection: Copying Collection Basic idea: use 2 heaps One used by program The other unused until GC time GC: Start at the roots & traverse the reachable data Copy reachable data from the active heap (from-space) to the other heap (to-space) Dead objects are left behind in from space Heaps switch rolesCopying Collection: Copying Collection to-space from-space roots Copying GC: Copying GC Cheny’s algorithm for copying collection Traverse data breadth first, copying objects from from-space to to-space root scan nextCopying GC: Copying GC Cheny’s algorithm for copying collection Traverse data breadth first, copying objects from from-space to to-space root scan next Copying GC: Copying GC Cheny’s algorithm for copying collection Traverse data breadth first, copying objects from from-space to to-space root scan next Copying GC: Copying GC Cheny’s algorithm for copying collection Traverse data breadth first, copying objects from from-space to to-space root scan next Copying GC: Copying GC Cheny’s algorithm for copying collection Traverse data breadth first, copying objects from from-space to to-space root scan next Copying GC: Copying GC Cheny’s algorithm for copying collection Traverse data breadth first, copying objects from from-space to to-space root scan next Copying GC: Copying GC Cheny’s algorithm for copying collection Traverse data breadth first, copying objects from from-space to to-space root scan next Done when next = scanCopying GC: Copying GC Cheny’s algorithm for copying collection Traverse data breadth first, copying objects from from-space to to-space root scan next Done when next = scanCopying GC: Copying GC Pros Simple & collects cycles Run-time proportional to # live objects Automatic compaction eliminates fragmentation Fast allocation: pointer increment by object size Cons Precise type information required (pointer or not) Tag bits take extra space; normally use header word Twice as much memory used as program requires Usually, we anticipate live data will only be a small fragment of store Allocate until 70% full From-space = 70% heap; to-space = 30% Long GC pauses = bad for interactive, real-time appsBaker’s Concurrent GC: Baker’s Concurrent GC GC pauses avoided by doing GC incrementally Program only holds pointers to to-space On field fetch, if pointer to from-space, copy object and fix pointer Extra fetch code = 20% performance penalty On swap, copy roots as beforeGenerational GC: Generational GC Empirical observation: if an object has been reachable for a long time, it is likely to remain so Empirical observation: in many languages (especially functional languages), most objects died young Conclusion: we save work by scanning the young objects frequently and the old objects infrequentlyGenerational GC: Generational GC Assign objects to different generations G0, G1,… G0 contains young objects, most likely to be garbage G0 scanned more often than G1 Common case is two generations (new, tenured) Roots for GC of G0 include all objects in G1 in addition to stack, registersGenerational GC: Generational GC How do we avoid scanning tenured objects? Observation: old objects rarely point to new objects Normally, object is created and when it initialized it will point to older objects, not newer ones Only happens if old object modified well after it is created In functional languages that use mutation infrequently, pointers from old to new are very uncommon Compiler inserts extra code on object field pointer write to catch modifications to old objects Remembered set is used to keep track of objects that point into younger generation. Remembered set included in set of roots for scanning.Generational GC: Generational GC Other issues When do we promote objects from young generation to old generation Usually after an object survives a collection, it will be promoted How big should the generations be? Appel says each should be exponentially larger than the last When do we collect the old generation? After several minor collections, we do a major collectionGenerational GC: Generational GC Other issues Sometimes different GC algorithms are used for the new and older generations. Why? Because the have different characteristics Copying collection for the new Less than 10% of the new data is usually live Copying collection cost is proportional to the live data Mark-sweep for the old We’ll get to this in the next class!