Added: January 30, 2008. Category: Education. License: All Rights Reserved.

Presentation Transcript

Experience Unicode Enabling MySQL:
  Thomas Emerson
  Senior Software Engineer
  Basis Technology Corp.

Overview:
  Introduction
  What is mySQL?
  Character Set Architecture in mySQL
  Phased Implementation
  Summary
  Q & A

Preliminaries:
  Cell phones? Just say vibrate.
  If you need to take a call, please get up and leave.
  If you fall asleep, you will be ridiculed.

Assumptions:
  Unicode and Unicode terminology
  Basic RDBMS concepts

Introduction:
  Who am I and why am I here?
  Large amounts of linguistic and lexicographic data
    Simplified and Traditional Chinese
    Japanese
    Korean
    Thai
    Western and Eastern European languages
  Accessibility
    Across platforms
    Web-based interface
  Low impact
    Could not take cycles (hard-, soft-, or wet-) from our Oracle 8i system and its DBA.
    Didn’t have big iron available
    No budget

What is mySQL?:
  GPL’d buzz-word compliant SQL engine
    High performance
    Robust
    Popular
  Supports industry standards
    Entry-level SQL92
    ODBC Level 0-2
  Extensions
    Advanced (though complex) authentication system
    Extra datatypes, including ENUM and SET
  Excellent support for legacy encodings
    Big Five, GB 2312, and GBK
    EUC-JP and ShiftJIS
    TIS 620
    ISO-Latin-1
    KOI-8R
  C and C++ APIs, and bindings for Python, Perl, PHP, and others.

I18N Architecture in mySQL:
  Server can be built to support multiple encodings
  Databases can only contain a single character set
  Support for single- and double-byte character sets.

Phased Implementation:
  UTF-8 in and out
  UTF-8 as a multibyte encoding
  UCS-2 as the internal encoding

Phase I:
  No Unicode-specific features.
  Unicode support is piggy-backed as ISO-Latin-1.
  This is surprisingly effective, but:
    Wild card searches are awkward (since each “character” is composed of up to three Latin-1 characters)
    No regular expression support
    No collation support

Phase I (cont.):
  The front end problem was solved with PHP (www.php.org)
  An HTML front end using UTF-8 as the document charset
  PHP is not Unicode aware, but it just doesn’t matter!

Phase II:
  Treat UTF-8 as a multibyte character set
  Simple collation model
  Still no regular expression support

Phase II (cont.):
  Rosette is used as the Unicode layer
  No longer limited to a single character set
  But now we need to differentiate between language and script!

Phase III:
  Use UCS-2 as the internal character representation.
  Transcoding to legacy encodings as needed, so existing databases will continue to work.
  Each column can have a *different* legacy encoding

Phase III (cont.):
  Data can be imported, transcoded and filtered using Rosette’s full transform functionality.
    Hankaku/Zenkaku transformation
    Case conversion
    SGML entity folding
    Ad nauseam

Status:
  Phase I is complete and live.
  Phase II is underway, as time allows.
    UTF-8 support in place.
    Collation still going.
  Phase III is planned, but not yet started.

Status (cont.):
  Removal of Rosette to use glibc features and/or ICU
  Measure and improve performance
  All will be released to the MySQL code

Q&A:
  Tom Emerson
  tree@basistech.com
  Slides and other information available at http://cymru.basistech.com/iuc17
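
Editor's illustration of the Phase I wildcard problem. The Phase I slide says that storing UTF-8 in an ISO-Latin-1 column works surprisingly well, but that wildcard searches are awkward because each character becomes up to three Latin-1 characters. The sketch below shows that effect in Python rather than in MySQL itself. The helper names (as_latin1_view, like) are hypothetical, and the LIKE emulation is an assumption that handles only the % and _ wildcards, not MySQL's full matching rules.

```python
import re


def as_latin1_view(text: str) -> str:
    """Show text the way a Latin-1-only store sees it: one character per UTF-8 byte."""
    return text.encode("utf-8").decode("latin-1")


def like(value: str, pattern: str) -> bool:
    """Emulate SQL LIKE for the % and _ wildcards only (no escape handling)."""
    parts = []
    for ch in pattern:
        if ch == "%":
            parts.append(".*")   # any run of characters
        elif ch == "_":
            parts.append(".")    # exactly one character
        else:
            parts.append(re.escape(ch))
    return re.fullmatch("".join(parts), value, flags=re.DOTALL) is not None


word = "日本語"                  # three Unicode characters
stored = as_latin1_view(word)    # what the Latin-1 column holds

# Each of these characters occupies three bytes in UTF-8.
print(len(word), len(stored))

# The data itself survives the round trip, which is why Phase I works at all.
print(stored.encode("latin-1").decode("utf-8") == word)

first, last = as_latin1_view("日"), as_latin1_view("語")

# One underscore is meant to stand for one character...
print(like(word, "日_語"))
# ...but against the stored bytes it stands for only one byte of the middle character.
print(like(stored, first + "_" + last))
# Three underscores are needed for a three-byte character.
print(like(stored, first + "___" + last))
```

The number of underscores needed depends on how many bytes each character takes in UTF-8, so a pattern author has to know the byte length of the characters being skipped. That is the awkwardness Phase II is meant to remove by treating UTF-8 as a true multibyte character set.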