lrec metadata
Added: November 14, 2007

Presentation Transcript

Customizing the IMDI metadata schema for endangered languages
Heidi Johnson (AILLA), Arienne Dwyer (DOBES)

Introduction:
  IMDI: International Standards for Language Engineering Metadata Initiative
  DOBES: Volkswagen Foundation’s Documentation of Endangered Languages initiative
  AILLA: the Archive of the Indigenous Languages of Latin America

Types of resources:
  Audio and video recordings in various digital formats
  Annotation text files, e.g. transcriptions and translations
  Standalone texts, e.g. dictionaries, poetry
  Wide range of genres: from verbal art to scholarly analyses

Bundles of resources:
  Session (IMDI, 2001): resources resulting from a linguistic elicitation session - recordings and annotations.
  Only models one kind of resource production - a recording session.
  Collections will include a greater variety of resources, in sets of related materials.

Types of bundles:
  Canonical bundle: the original session. A digitized recording, in different formats, and some textual annotation files, also in different formats.
  Minimal bundle: a single file. Examples: dictionary, poem, recording of uninterpretable chants.
  Meta-bundle: a bundle containing other bundles.
  Example: a book about a set of annotated recordings.

Bundle elements:
  Current:
    Name of bundle
    Date and place of production
  Proposed:
    Resource relations
    Date archived
    Last modified

Major subschemas:
  Project
  Collector
  Content
  Participants
  Resources
  References

The Content Subschema:
  Genre is the top-level category:
    Interaction: conversation, interview …
    Explanation: description, recipe …
    Performance: narrative, poem, oratory …
    Teaching: primer, textbook …
    Analysis: grammar, dictionary …

Other Content categories:
  Modality: speech, writing, gesture
  Communication context: Interactivity, Planning, Involvement
  Languages
  Task
  Description
  Keys

AILLA’s Content Keys:
  Register: a characterization of how the discourse reflects the social context. Example: honorific speech.
  Style: poetic and stylistic effects. Examples: parallelism, metered verse.

The Project subschema:
  Current elements:
    Name: a nickname or acronym
    Title: official title
    ID: a unique identifier
    Contact information
  Proposed element:
    Funder: name of funding organization

The Collector subschema:
  AILLA renames this Depositor, since this is the individual we have to keep track of (e.g. for Level 3 access permission).
  When the Depositor is not also the Collector, Collector can be listed under Participants.

The Participants subschema:
  Type: functional role, e.g. creator
  Role: family relationship
  Name/Full name
  Language(s)
  Ethnic group, age, sex
  Education
  Anonymous: True if participant’s Full name is reserved; False otherwise

AILLA additions to Participants:
  Origin: Place (country, region, etc.) of origin of the creator of the primary resource in the bundle (e.g. the speaker whose voice is recorded).
  Occupation: Can be relevant in assessing accuracy of some kinds of data.

The Resources subschema:
  Resources contains information about formats and provenance of files in a bundle.
  Media Files: audio, video, etc.
  Annotation Files: text files.
  Proposal: call them all Media Files, to reduce redundancy in the database. (All have URL, size, etc. elements.)

Text resources:
  Current elements:
    Type: type of annotation, e.g. phonetic transcription.
    Content encoding: annotation encoding scheme, e.g. EUROTYP.
    Character encoding: character set(s) used in a text file.

Text resources 2:
  Proposed elements:
    Transcription type
    Translation (aka Glossing) type
    Software: used to produce transcriptions, translations, other annotations (e.g. Shoebox)
  Describe Annotator in Participants (along with Translator, etc.)

Proposed subschema:
  Place: composed of several elements:
    Continent
    Country
    Region
    Subregion (address)
  Repeated at least twice, in Bundle and in Participants (Origin).
  Might also be useful in the Language subschema.

Conclusion:
  IMDI schema is a flexible tool.
  Customization through Key/Value pairs allows local modifications.
  Most of the proposed changes are terminological, moving from the DOBES in-house terminology to more general usage.
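
To make the bundle types, the proposed Place subschema, and the Key/Value customization more concrete, here is a minimal Python sketch. It is not the IMDI schema or any archive's implementation: the class names, field names, the `bundle_type` rule, and all sample values (file URLs, bundle names, the place) are assumptions invented for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Place:
    """Proposed Place subschema: Continent, Country, Region, Subregion."""
    continent: str
    country: str
    region: Optional[str] = None
    subregion: Optional[str] = None


@dataclass
class MediaFile:
    """One file in a bundle (the slides propose calling all files Media Files)."""
    url: str
    kind: str  # hypothetical labels: "audio", "video", "text"


@dataclass
class Bundle:
    """A set of related resources, possibly containing other bundles."""
    name: str
    place: Optional[Place] = None
    files: List[MediaFile] = field(default_factory=list)
    sub_bundles: List["Bundle"] = field(default_factory=list)
    # Key/Value pairs for local customization, e.g. Register and Style.
    content_keys: Dict[str, str] = field(default_factory=dict)

    def bundle_type(self) -> str:
        """Classify the bundle using a simplified rule (an assumption)."""
        if self.sub_bundles:
            return "meta"       # contains other bundles
        if len(self.files) == 1:
            return "minimal"    # a single file
        return "canonical"      # anything else is treated as a session here


# Illustrative instances; every value below is made up.
session = Bundle(
    name="session-01",
    place=Place(continent="South America", country="Peru"),
    files=[
        MediaFile("file:///example/session-01.wav", "audio"),
        MediaFile("file:///example/session-01.txt", "text"),
    ],
    content_keys={"Register": "honorific speech"},
)
poem = Bundle(
    name="poem-01",
    files=[MediaFile("file:///example/poem-01.txt", "text")],
    content_keys={"Style": "metered verse"},
)
book = Bundle(
    name="book-01",
    files=[MediaFile("file:///example/book-01.pdf", "text")],
    sub_bundles=[session],
)

for b in (session, poem, book):
    print(b.name, b.bundle_type())
```

Storing Register and Style in a plain dictionary mirrors the idea that local additions travel as Key/Value pairs rather than as changes to the core schema. Reusing one `Place` class reflects the observation that the same elements recur in Bundle and in Participants (Origin).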