logging in or signing up approx matches Freedom Download Post to : URL : Related Presentations : Share Add to Flag Embed Email Send to Blogs and Networks Add to Channel Uploaded from authorPOINT Insert YouTube videos in PowerPont slides with aS Desktop Copy embed code: (To copy code, click on the text box) Embed: URL: Thumbnail: WordPress Embed Customize Embed The presentation is successfully added In Your Favorites. Views: 14 Category: News & Reports.. License: All Rights Reserved Like it (0) Dislike it (0) Added: August 02, 2007 This Presentation is Public Favorites: 0 Presentation Description No description available. Comments Posting comment... Premium member Presentation Transcript Search for Approximate Matchesin Large Databases: Search for Approximate Matches in Large Databases Eugene Fink Jaime Carbonell Aaron Goldstein Philip Hayes Motivation: Motivation Fast identification of approximate matches in large sets of records. Applications: Medical databases Customer records National security Outline: Outline Records and queries Search for matches Experimental results Table of records: Table of records We specify a table of records by a list of attributes. Example We can describe patients in a hospital by their sex, age, and diagnosis. Records and queries: Records and queries A record includes a specific value for each attribute. A query may include lists of values and numeric ranges. Query Sex: male, female Age: 20..40 Dx: asthma, flu Query types: Query types Exact matches: Exact matches A record is an exact match for a query if every value in the record belongs to the respective range in the query. Approximate matches: Approximate matches A record is an approximate match for a query if it is 'close' to the query region. Record Approximate queries: Approximate queries An approximate query includes: Point or region Distance function Number of matches Distance limit Outline: Outline Records and queries Search for matches Experimental results Indexing structure: Indexing structure male female 30 40 50 30 asthma flu fracture ulcer asthma flu Maintain a PATRICIA tree of records Search for matches: Search for matches male female 30 40 50 30 asthma flu fracture ulcer asthma flu Depth-first search for exact matches Best-first search for approximate matches Outline: Outline Records and queries Search for matches Experimental results Performance: Performance : Twenty-one attributes 1.6 million records Experiments with a database of all patients admitted to Massachusetts hospitals from October 2000 to September 2002 Use of a Pentium computer: 2.4 GHz CPU 1 Gbyte memory 400 MHz bus Variables: Variables Control variables: Number of records Memory size Query type Measurements: Retrieval time Small memory: Small memory Number of records: 100 to 1,672,016 Memory size: 4 MByte Large memory: Large memory Number of records: 1,672,016 Memory size: 64 to 1,024 MByte Summary: Summary Retrieval time grows as fractional power (about 0.5) of database size If we extrapolate this growth rate, retrieval times are reasonable for very large databases Summary: Summary Retrieval time grows as fractional power (about 0.5) of database size If we extrapolate this growth rate, retrieval times are reasonable for very large databases: You do not have the permission to view this presentation. In order to view it, please contact the author of the presentation.
approx matches Freedom Download Post to : URL : Related Presentations : Share Add to Flag Embed Email Send to Blogs and Networks Add to Channel Uploaded from authorPOINT Insert YouTube videos in PowerPont slides with aS Desktop Copy embed code: (To copy code, click on the text box) Embed: URL: Thumbnail: WordPress Embed Customize Embed The presentation is successfully added In Your Favorites. Views: 14 Category: News & Reports.. License: All Rights Reserved Like it (0) Dislike it (0) Added: August 02, 2007 This Presentation is Public Favorites: 0 Presentation Description No description available. Comments Posting comment... Premium member Presentation Transcript Search for Approximate Matchesin Large Databases: Search for Approximate Matches in Large Databases Eugene Fink Jaime Carbonell Aaron Goldstein Philip Hayes Motivation: Motivation Fast identification of approximate matches in large sets of records. Applications: Medical databases Customer records National security Outline: Outline Records and queries Search for matches Experimental results Table of records: Table of records We specify a table of records by a list of attributes. Example We can describe patients in a hospital by their sex, age, and diagnosis. Records and queries: Records and queries A record includes a specific value for each attribute. A query may include lists of values and numeric ranges. Query Sex: male, female Age: 20..40 Dx: asthma, flu Query types: Query types Exact matches: Exact matches A record is an exact match for a query if every value in the record belongs to the respective range in the query. Approximate matches: Approximate matches A record is an approximate match for a query if it is 'close' to the query region. Record Approximate queries: Approximate queries An approximate query includes: Point or region Distance function Number of matches Distance limit Outline: Outline Records and queries Search for matches Experimental results Indexing structure: Indexing structure male female 30 40 50 30 asthma flu fracture ulcer asthma flu Maintain a PATRICIA tree of records Search for matches: Search for matches male female 30 40 50 30 asthma flu fracture ulcer asthma flu Depth-first search for exact matches Best-first search for approximate matches Outline: Outline Records and queries Search for matches Experimental results Performance: Performance : Twenty-one attributes 1.6 million records Experiments with a database of all patients admitted to Massachusetts hospitals from October 2000 to September 2002 Use of a Pentium computer: 2.4 GHz CPU 1 Gbyte memory 400 MHz bus Variables: Variables Control variables: Number of records Memory size Query type Measurements: Retrieval time Small memory: Small memory Number of records: 100 to 1,672,016 Memory size: 4 MByte Large memory: Large memory Number of records: 1,672,016 Memory size: 64 to 1,024 MByte Summary: Summary Retrieval time grows as fractional power (about 0.5) of database size If we extrapolate this growth rate, retrieval times are reasonable for very large databases Summary: Summary Retrieval time grows as fractional power (about 0.5) of database size If we extrapolate this growth rate, retrieval times are reasonable for very large databases: