logging in or signing up Morphology Cuthbert Download Post to : URL : Related Presentations : Share Add to Flag Embed Email Send to Blogs and Networks Add to Channel Uploaded from authorPOINTLite Insert YouTube videos in PowerPont slides with aS Desktop Copy embed code: (To copy code, click on the text box) Embed: URL: Thumbnail: WordPress Embed Customize Embed The presentation is successfully added In Your Favorites. Views: 926 Category: Entertainment License: All Rights Reserved Like it (1) Dislike it (0) Added: November 26, 2007 This Presentation is Public Favorites: 0 Presentation Description No description available. Comments Posting comment... By: yosefmaprince (17 month(s) ago) thanks. really good and i benefited a lot from this piece of writing Saving..... Post Reply Close Saving..... Edit Comment Close Premium member Presentation Transcript Morphology from a computational point of view: Morphology from a computational point of view March 2001Today: Today Minimal Edit Distance, and Viterbi more generally; Letter to Sound What is morphology? Finite-state automata Finite-state phonological rules1. What is morphology?: 1. What is morphology? Study of the internal structure of words: morph-ology word-s jump-ing Why? For some purposes, we need to know what the internal pieces are. Knowledge of the words of a language can’t be summarized in a finite list: we need to know the principles of word-formationSome resources: Some resources Richard Sproat: Morphology and Computation (MIT Press, 1992) Excellent overview of computational morphology and phonoloy by Harald Trost at http://www.ai.univie.ac.at/~harald/handbook.html 2. What applications need knowledge of words?: 2. What applications need knowledge of words? Any high-level linguistic analysis: syntactic parser machine translation speech recognition, text-to-speech (TTS) information retrieval (IR) dictionary, spell-checker3. A list is not enough: 3. A list is not enough An empirical fact: AP newswire: mid-Feb – Dec 30 1988 Nearly 300,000 words. “New” words that appeared on Dec 31 1988: compounds: prenatal-care, publicly-funded, channel-switching, owner-president, logic-loving, part-Vulcan, signal-emitting, landsite, government-aligned, armhole, signal-emitting... ...new words...: ...new words... dumbbells, groveled, fuzzier, oxidized ex-presidency, puppetry, boulderlike, over-emphasized, hydrosulfite, outclassing, non-passengers, racialist, counterprograms, antiprejudice, re-unification, traumatological, refinancings, instrumenting, ex-critters, mega-lizardSlide8: ex-presidency: prefix ex- boulder-like: suffix –like over-emphasized: prefix over- antiprejudice: prefix anti This is often called the OOV problem (“out of vocabulary”). Slide9: If we work out the principles of word-formation, we will simultaneously: compress the size of our internalized list of words; become able to deal with new words on the fly.4. Overview of applications that need knowledge of words: 4. Overview of applications that need knowledge of words Speech generation: text to speech (TTS) Speech recognitionText to speech: Text to speech Problem: take text, in standard spelling, and produce a sequence of phonemes which can be synthesized by the “backend”. Severe problems: Proper names (persons, places), OOV words boathouse B OW1 T H AU2 SSpeech recognition: Speech recognition Take a sound file (e.g., *.wav) and produce a list of words in standard orthography. Bill Clinton is a recent ex-president. If someone says it, we need to figure out what the word was.Do we know what a word is?: Do we know what a word is? This is actually not an easy question! – especially if we turn to Asian languages, without a tradition of putting in “white space” between “words”, as we do in the West. German writes more compounds without white space than English does.Basic principles of morphology: Basic principles of morphology For some purposes, we need to think about phonemes, while for others it’s more convenient to talk about letters. For our purposes, I’ll talk about letters whenever we don’t need to specifically focus on phonemes.Morpheme: Morpheme It is convenient to be able to talk about the pieces into which words may be broken, and linguists call these pieces morphemes: the smallest parts of a language that can be regularly assigned a meaning.Morphemes: Morphemes Uncontroversial morphemes: door, dog, jump, -ing, -s, to More controversial morphemes sing/sang: s-ng + i/a cut/cut: cut + PASTClassic distinctions in morphology:: Classic distinctions in morphology: Analytic (isolating) languages: no morphology of derivational or inflectional sort. Synthetic (inflecting) languages: Agglutinative: 1 function per morpheme Fusional: > 1 function per morphemeAgglutinative:Finnish Nominal Declension: Agglutinative: Finnish Nominal Declension talo 'the-house' kaup-pa 'the-shop' talo-ni 'my house' kaup-pa-ni 'my shop' talo-ssa 'in the-house' kaup-a-ssa 'in the-shop' talo-ssa-ni 'in my house’ kaup-a-ssa-ni 'in my shop' talo-i-ssa 'in the-houses’ kaup-o-i-ssa 'in the-shops' talo-i-ssa-ni 'in my houses’ kaup-o-i-ssa-ni 'in my shops' Courtesy of Bucknell Univ. web pageFusional: LatinLatin Declension of hortus 'garden' : Fusional: Latin Latin Declension of hortus 'garden' Singular Plural Nominative (Subject) hort-us hort-i Genitive (of) hort-i hort-rum Dative (for/to) hort-o hort-is Accusative (Direct Obj) hort-um hort-us Vocative (Call) hort-e hort-i Ablative (from/with) hort-o hort-is Morphemes vs. morphs: Morphemes vs. morphs Some analysts distinguish between “morphemes” and “morphs”. Morphemes are motivated by an analysis, and include “plural” and “past” Morphs are strings of letters or phones that “realize” or “manifest” a morpheme.Free and bound morphemes: Free and bound morphemes Free morphemes can form (free-standing) words; bound morphemes are only found in combination with other morphemes. Examples? Functions of morphology: Functions of morphology Derivational morphology: creates one lexeme from another compute > computer > computerize > computerization Inflectional morphology: creates the form of a lexeme that’s right for a sentence: the nominative singular form of a noun; or the past 3rd person singular form of a verb.Slide23: Word: an identifiable string of letters (or phonemes) sing Word-form: a word with a specific set of syntactic and morphological features. The sing in I sing is 1st person sg, and is a distinct word-from from the sing in you sing. Lexeme: a complete set of inflectionally related word-forms, including sing, sings, and sang Lemma: a complete set of morphologically related lexemes: sing, sings, song, sang.A lexeme’s stem: A lexeme’s stem In many languages (unlike English), constellations of word-forms forming a lexeme demand the recognition of a basic stem which does not stand freely as a word: Italian ragazzo, ragazzi (boy, girl) ragazzi, ragazze (boys, girls) ragazz-Compounds: Compounds Compounds are composed of 2 (or more) words or stems Compounds: hot dog, White House, bookstore, cherry-coveredLanguages vary in the amount of morphology they have and use: Languages vary in the amount of morphology they have and use English has a lot of derivational morphology and relatively little inflectional morphology English verb’s inflectional forms: bare stem, -s, -ed, -ingEuropean languages: European languages Not uncommon for a verb to have 30 to 50+ forms: marking tense, person and number of the subjectDerivation : Derivation Derivational morphology usually consists of adding a prefix or suffix to a base (= stem). The base has a lexical category (it is a noun, verb, adjective), and the suffix typically assigns a different category to the whole word. sad ness Adj Noun -ness: suffix that takes an adjective, & makes a noun.Slide29: un interest ing Adj Adj Verb Vb Adj AdjDistinct from contractions…: Distinct from contractions… English (and some other languages) permit the collapsing together of common words. In some extremely rare cases, only the collapsed form exists (English possessive ’s). He will arrive tonight > he’ll arrive… The [King of England]’s childrenSome basics of English morphology: Some basics of English morphology Inflectional morphology Nouns: -NULL, -s, -’s Verbs: -NULL, s, -ed, -ing (so-called weak verbs) Strong verbs: 3 major groups a. Internal verb change (sing/sang, drive/drove/driven, dive/dove) b. –t suffix, typically with vowel-shortening dream/dreamt, sleep/slept c. –aught replacement: catch, teach, seek, Derivational morphology in complex: Derivational morphology in complex This morphology creates new words, by adding prefixes or suffixes. It is helpful to divide them into two groups, depending on whether they leave the pronunciation of the base unchanged or not. There are, as always, some fuzzy cases.Slide33: Level 1 ize, ization, al, ity, al, ic, al, ity, ion, y (nominaliz-ing), al, ate, ous, ive, ation Can attach to non-word stems (fratern-al, paternal; parent-al) Typically change stress and vowel quality of stem Level 2 Never precede Level 1 suffixes Never change stress pattern or vowel quality Almost always attach to words that already exist hood, ness, ly, s, ing, ish, ful, ly, ize, less, y (adj.) Combinations of Class 1,2: Combinations of Class 1,2 Class 1 + Class 1: histor-ic-al, illumina-at-tion, indetermin-at-y; Class 1 + Class 2: frantern-al-ly, transform-ate-ion-less; Class 2 + Class 2: weight-less-ness ?? Class 2 + Class 1: *weight-less-ity, fatal-ism-alSignature: Signature Set of suffixes (or prefixes) that occurs in a corpus with a set of stems.Slide36: NULL.ed.ing.s look interest add claim mark extend demand remain want succeed record offer represent cover return end explain follow help belong attempt talk fear happen assault account point award appeal train contract result request staff view fail kick visit confront attack comment sponsor Slide37: NULL.s paper retain improvement missile song truth doctor indictment window conductor dick misunderstanding struggle stake tank belief cafeteria material mind operator bassi lot movement chain notion marriage dancer scholarship reservoir sweet right battalion hold mr shot cardinal athletic revenue duel confrontation solo talent guest shoe russian commitment average monk election street roger rifle worker area plane pinch-hitter dozen browning conclusion teacher narcotic appearance alternative dealer producer mile stock shrine sometime bag successor career mistake ankle weapon model front spotlight rhode pace debate payment requirement fairway consultation chip dollar employer thank mustang rocket-bomb hat string precinct robert employee action detective pressure measure spirit forbid hitter breast yankee partner floor member Slide38: NULL.d.s increase tie hole associate reserve price fire receive challenge rate purchase propose feature celebrate decide suite single change sculpture combine privilege pledge issue frame indicate believe damage include use aide graduate surprise intervene practice trouble serve oppose promise charge note schedule continue raise decline cause operate emphasize relieve hope share judge birdie produce exchange Slide39: NULL.ed.er.ing.s report turn walk park pick flow NULL.d.ment enforce announce engage arrange replace improve encourage NULL.n.s rose low take law drive rise undertake NULL.al intern profession logic fat tradition extern margin jurisdiction historic education promotion constitution addition sensation roy ration origin classic convention NULL.man sand news police states gross sun fresh sports boss sales 3- patrol bonds ed.er.ing slugg manag crush publish robb NULL.ity.s major senior moral hospital NULL.ry hung mason ave summit scene surge rival forest NULL.a.s indian kind american Finite state morphology: Finite state morphology FSA: finite-state automata: FSA: finite-state automata Consists of a set of states a starting state a set of final or accepting states a finite set of symbols a set of transitions: each is defined by a from-state, a to-state, and a symbolSlide42: It’s natural to think of this as describing an annotated directed graph. q0 q1 q2 q3 b a a q3 ! a a a !Slide43: An FSA can be thought of as judging (accepting) a string, or as generating one. How does it judge? Find a start to finish path that matches the string. How does it generate? Walk through any start-to-finish path.Deterministic and Non-deterministic FSAs: Deterministic and Non-deterministic FSAs Just a little difference: Deterministic case: For every state, there is a maximum of one transition associated with any given symbol. You can say that there’s a function from {states}X{symbols} {states} Nondeterministic case: There is no such restriction; hence, given a state and a symbol, it is not necessarily certain which transition is to be taken.Slide45: q0 q1 q2 q3 b a a q3 ! a deterministic…Slide46: q0 q1 q2 q3 b a a q3 ! a non-deterministic The best things in life are non-deterministic.Figure 3.4 p. 68: Figure 3.4 p. 68 q0 q1 q2 q3 un- adj-root -er –est -ly e Alternate notationYet a third way: rows in an array(to-column can consist of pointers): Yet a third way: rows in an array (to-column can consist of pointers) Stop states: 2,3Slide49: Figure 3.5 p. 69 q0 q1 q2 q5 un- adj-root-1 -er –est -ly e q4 -er –est q3 adj-root-2 adj-root-1Slide50: Yet a third way: rows in an array (to-column can consist of pointers) Stop states: 2,4,5Slide51: Figure 3.6, p. 70Slide52: We can simplify greatly (generating a bit more….)Finite-State Transducers (FST): Finite-State Transducers (FST) The symbols of the FST are complex: they’re really pairs of symbols, one for each of two “tapes” or levels. Recognizer: decides if a given pair of representations fits together “OK” Generator: generates pairs of representations that fit together Translator: takes a representation on one level and produces the appropriate representation on the other level Finite state transducers: Finite state transducers can be inverted, or composed and you get another FST.Complex symbols: Complex symbols Usually of the form a:b, which means a can appear on the upper tape when b appears on the lower tape. So a:b means that’s a permissible pairing up of symbols. “a” along means a:a, etc. epsilon e means null character. Remember, “other” means “any feasible pair that is not in this transducer” (p. 78)Using FSTs for orthographic rules: Using FSTs for orthographic rulesSlide57: Using FSTs for orthographic rules fox^s#…we get to q1 with ‘x’Slide58: Using FSTs for orthographic rules fox^s#…we get to q2 with ‘^’Slide59: Using FSTs for orthographic rules fox^s#…we can get to q3 with ‘NULL’Slide60: Using FSTs for orthographic rules fox^s#…we also get to q5 with ‘s’ but we don’t want to!Slide61: fox^s#…we also get to q5 with ‘s’ but we don’t want to! So why is this transition there? ?friend^ship, ?fox^s^s (= foxes’s)Slide62: fox^s#…q4 with sSlide63: fox^s#…q0 with # (accepting state)Slide64: arizona: we leave q0 but return Other transitions…Slide65: m i s s ^ s Other transitions… You do not have the permission to view this presentation. In order to view it, please contact the author of the presentation.
Morphology Cuthbert Download Post to : URL : Related Presentations : Share Add to Flag Embed Email Send to Blogs and Networks Add to Channel Uploaded from authorPOINTLite Insert YouTube videos in PowerPont slides with aS Desktop Copy embed code: (To copy code, click on the text box) Embed: URL: Thumbnail: WordPress Embed Customize Embed The presentation is successfully added In Your Favorites. Views: 926 Category: Entertainment License: All Rights Reserved Like it (1) Dislike it (0) Added: November 26, 2007 This Presentation is Public Favorites: 0 Presentation Description No description available. Comments Posting comment... By: yosefmaprince (17 month(s) ago) thanks. really good and i benefited a lot from this piece of writing Saving..... Post Reply Close Saving..... Edit Comment Close Premium member Presentation Transcript Morphology from a computational point of view: Morphology from a computational point of view March 2001Today: Today Minimal Edit Distance, and Viterbi more generally; Letter to Sound What is morphology? Finite-state automata Finite-state phonological rules1. What is morphology?: 1. What is morphology? Study of the internal structure of words: morph-ology word-s jump-ing Why? For some purposes, we need to know what the internal pieces are. Knowledge of the words of a language can’t be summarized in a finite list: we need to know the principles of word-formationSome resources: Some resources Richard Sproat: Morphology and Computation (MIT Press, 1992) Excellent overview of computational morphology and phonoloy by Harald Trost at http://www.ai.univie.ac.at/~harald/handbook.html 2. What applications need knowledge of words?: 2. What applications need knowledge of words? Any high-level linguistic analysis: syntactic parser machine translation speech recognition, text-to-speech (TTS) information retrieval (IR) dictionary, spell-checker3. A list is not enough: 3. A list is not enough An empirical fact: AP newswire: mid-Feb – Dec 30 1988 Nearly 300,000 words. “New” words that appeared on Dec 31 1988: compounds: prenatal-care, publicly-funded, channel-switching, owner-president, logic-loving, part-Vulcan, signal-emitting, landsite, government-aligned, armhole, signal-emitting... ...new words...: ...new words... dumbbells, groveled, fuzzier, oxidized ex-presidency, puppetry, boulderlike, over-emphasized, hydrosulfite, outclassing, non-passengers, racialist, counterprograms, antiprejudice, re-unification, traumatological, refinancings, instrumenting, ex-critters, mega-lizardSlide8: ex-presidency: prefix ex- boulder-like: suffix –like over-emphasized: prefix over- antiprejudice: prefix anti This is often called the OOV problem (“out of vocabulary”). Slide9: If we work out the principles of word-formation, we will simultaneously: compress the size of our internalized list of words; become able to deal with new words on the fly.4. Overview of applications that need knowledge of words: 4. Overview of applications that need knowledge of words Speech generation: text to speech (TTS) Speech recognitionText to speech: Text to speech Problem: take text, in standard spelling, and produce a sequence of phonemes which can be synthesized by the “backend”. Severe problems: Proper names (persons, places), OOV words boathouse B OW1 T H AU2 SSpeech recognition: Speech recognition Take a sound file (e.g., *.wav) and produce a list of words in standard orthography. Bill Clinton is a recent ex-president. If someone says it, we need to figure out what the word was.Do we know what a word is?: Do we know what a word is? This is actually not an easy question! – especially if we turn to Asian languages, without a tradition of putting in “white space” between “words”, as we do in the West. German writes more compounds without white space than English does.Basic principles of morphology: Basic principles of morphology For some purposes, we need to think about phonemes, while for others it’s more convenient to talk about letters. For our purposes, I’ll talk about letters whenever we don’t need to specifically focus on phonemes.Morpheme: Morpheme It is convenient to be able to talk about the pieces into which words may be broken, and linguists call these pieces morphemes: the smallest parts of a language that can be regularly assigned a meaning.Morphemes: Morphemes Uncontroversial morphemes: door, dog, jump, -ing, -s, to More controversial morphemes sing/sang: s-ng + i/a cut/cut: cut + PASTClassic distinctions in morphology:: Classic distinctions in morphology: Analytic (isolating) languages: no morphology of derivational or inflectional sort. Synthetic (inflecting) languages: Agglutinative: 1 function per morpheme Fusional: > 1 function per morphemeAgglutinative:Finnish Nominal Declension: Agglutinative: Finnish Nominal Declension talo 'the-house' kaup-pa 'the-shop' talo-ni 'my house' kaup-pa-ni 'my shop' talo-ssa 'in the-house' kaup-a-ssa 'in the-shop' talo-ssa-ni 'in my house’ kaup-a-ssa-ni 'in my shop' talo-i-ssa 'in the-houses’ kaup-o-i-ssa 'in the-shops' talo-i-ssa-ni 'in my houses’ kaup-o-i-ssa-ni 'in my shops' Courtesy of Bucknell Univ. web pageFusional: LatinLatin Declension of hortus 'garden' : Fusional: Latin Latin Declension of hortus 'garden' Singular Plural Nominative (Subject) hort-us hort-i Genitive (of) hort-i hort-rum Dative (for/to) hort-o hort-is Accusative (Direct Obj) hort-um hort-us Vocative (Call) hort-e hort-i Ablative (from/with) hort-o hort-is Morphemes vs. morphs: Morphemes vs. morphs Some analysts distinguish between “morphemes” and “morphs”. Morphemes are motivated by an analysis, and include “plural” and “past” Morphs are strings of letters or phones that “realize” or “manifest” a morpheme.Free and bound morphemes: Free and bound morphemes Free morphemes can form (free-standing) words; bound morphemes are only found in combination with other morphemes. Examples? Functions of morphology: Functions of morphology Derivational morphology: creates one lexeme from another compute > computer > computerize > computerization Inflectional morphology: creates the form of a lexeme that’s right for a sentence: the nominative singular form of a noun; or the past 3rd person singular form of a verb.Slide23: Word: an identifiable string of letters (or phonemes) sing Word-form: a word with a specific set of syntactic and morphological features. The sing in I sing is 1st person sg, and is a distinct word-from from the sing in you sing. Lexeme: a complete set of inflectionally related word-forms, including sing, sings, and sang Lemma: a complete set of morphologically related lexemes: sing, sings, song, sang.A lexeme’s stem: A lexeme’s stem In many languages (unlike English), constellations of word-forms forming a lexeme demand the recognition of a basic stem which does not stand freely as a word: Italian ragazzo, ragazzi (boy, girl) ragazzi, ragazze (boys, girls) ragazz-Compounds: Compounds Compounds are composed of 2 (or more) words or stems Compounds: hot dog, White House, bookstore, cherry-coveredLanguages vary in the amount of morphology they have and use: Languages vary in the amount of morphology they have and use English has a lot of derivational morphology and relatively little inflectional morphology English verb’s inflectional forms: bare stem, -s, -ed, -ingEuropean languages: European languages Not uncommon for a verb to have 30 to 50+ forms: marking tense, person and number of the subjectDerivation : Derivation Derivational morphology usually consists of adding a prefix or suffix to a base (= stem). The base has a lexical category (it is a noun, verb, adjective), and the suffix typically assigns a different category to the whole word. sad ness Adj Noun -ness: suffix that takes an adjective, & makes a noun.Slide29: un interest ing Adj Adj Verb Vb Adj AdjDistinct from contractions…: Distinct from contractions… English (and some other languages) permit the collapsing together of common words. In some extremely rare cases, only the collapsed form exists (English possessive ’s). He will arrive tonight > he’ll arrive… The [King of England]’s childrenSome basics of English morphology: Some basics of English morphology Inflectional morphology Nouns: -NULL, -s, -’s Verbs: -NULL, s, -ed, -ing (so-called weak verbs) Strong verbs: 3 major groups a. Internal verb change (sing/sang, drive/drove/driven, dive/dove) b. –t suffix, typically with vowel-shortening dream/dreamt, sleep/slept c. –aught replacement: catch, teach, seek, Derivational morphology in complex: Derivational morphology in complex This morphology creates new words, by adding prefixes or suffixes. It is helpful to divide them into two groups, depending on whether they leave the pronunciation of the base unchanged or not. There are, as always, some fuzzy cases.Slide33: Level 1 ize, ization, al, ity, al, ic, al, ity, ion, y (nominaliz-ing), al, ate, ous, ive, ation Can attach to non-word stems (fratern-al, paternal; parent-al) Typically change stress and vowel quality of stem Level 2 Never precede Level 1 suffixes Never change stress pattern or vowel quality Almost always attach to words that already exist hood, ness, ly, s, ing, ish, ful, ly, ize, less, y (adj.) Combinations of Class 1,2: Combinations of Class 1,2 Class 1 + Class 1: histor-ic-al, illumina-at-tion, indetermin-at-y; Class 1 + Class 2: frantern-al-ly, transform-ate-ion-less; Class 2 + Class 2: weight-less-ness ?? Class 2 + Class 1: *weight-less-ity, fatal-ism-alSignature: Signature Set of suffixes (or prefixes) that occurs in a corpus with a set of stems.Slide36: NULL.ed.ing.s look interest add claim mark extend demand remain want succeed record offer represent cover return end explain follow help belong attempt talk fear happen assault account point award appeal train contract result request staff view fail kick visit confront attack comment sponsor Slide37: NULL.s paper retain improvement missile song truth doctor indictment window conductor dick misunderstanding struggle stake tank belief cafeteria material mind operator bassi lot movement chain notion marriage dancer scholarship reservoir sweet right battalion hold mr shot cardinal athletic revenue duel confrontation solo talent guest shoe russian commitment average monk election street roger rifle worker area plane pinch-hitter dozen browning conclusion teacher narcotic appearance alternative dealer producer mile stock shrine sometime bag successor career mistake ankle weapon model front spotlight rhode pace debate payment requirement fairway consultation chip dollar employer thank mustang rocket-bomb hat string precinct robert employee action detective pressure measure spirit forbid hitter breast yankee partner floor member Slide38: NULL.d.s increase tie hole associate reserve price fire receive challenge rate purchase propose feature celebrate decide suite single change sculpture combine privilege pledge issue frame indicate believe damage include use aide graduate surprise intervene practice trouble serve oppose promise charge note schedule continue raise decline cause operate emphasize relieve hope share judge birdie produce exchange Slide39: NULL.ed.er.ing.s report turn walk park pick flow NULL.d.ment enforce announce engage arrange replace improve encourage NULL.n.s rose low take law drive rise undertake NULL.al intern profession logic fat tradition extern margin jurisdiction historic education promotion constitution addition sensation roy ration origin classic convention NULL.man sand news police states gross sun fresh sports boss sales 3- patrol bonds ed.er.ing slugg manag crush publish robb NULL.ity.s major senior moral hospital NULL.ry hung mason ave summit scene surge rival forest NULL.a.s indian kind american Finite state morphology: Finite state morphology FSA: finite-state automata: FSA: finite-state automata Consists of a set of states a starting state a set of final or accepting states a finite set of symbols a set of transitions: each is defined by a from-state, a to-state, and a symbolSlide42: It’s natural to think of this as describing an annotated directed graph. q0 q1 q2 q3 b a a q3 ! a a a !Slide43: An FSA can be thought of as judging (accepting) a string, or as generating one. How does it judge? Find a start to finish path that matches the string. How does it generate? Walk through any start-to-finish path.Deterministic and Non-deterministic FSAs: Deterministic and Non-deterministic FSAs Just a little difference: Deterministic case: For every state, there is a maximum of one transition associated with any given symbol. You can say that there’s a function from {states}X{symbols} {states} Nondeterministic case: There is no such restriction; hence, given a state and a symbol, it is not necessarily certain which transition is to be taken.Slide45: q0 q1 q2 q3 b a a q3 ! a deterministic…Slide46: q0 q1 q2 q3 b a a q3 ! a non-deterministic The best things in life are non-deterministic.Figure 3.4 p. 68: Figure 3.4 p. 68 q0 q1 q2 q3 un- adj-root -er –est -ly e Alternate notationYet a third way: rows in an array(to-column can consist of pointers): Yet a third way: rows in an array (to-column can consist of pointers) Stop states: 2,3Slide49: Figure 3.5 p. 69 q0 q1 q2 q5 un- adj-root-1 -er –est -ly e q4 -er –est q3 adj-root-2 adj-root-1Slide50: Yet a third way: rows in an array (to-column can consist of pointers) Stop states: 2,4,5Slide51: Figure 3.6, p. 70Slide52: We can simplify greatly (generating a bit more….)Finite-State Transducers (FST): Finite-State Transducers (FST) The symbols of the FST are complex: they’re really pairs of symbols, one for each of two “tapes” or levels. Recognizer: decides if a given pair of representations fits together “OK” Generator: generates pairs of representations that fit together Translator: takes a representation on one level and produces the appropriate representation on the other level Finite state transducers: Finite state transducers can be inverted, or composed and you get another FST.Complex symbols: Complex symbols Usually of the form a:b, which means a can appear on the upper tape when b appears on the lower tape. So a:b means that’s a permissible pairing up of symbols. “a” along means a:a, etc. epsilon e means null character. Remember, “other” means “any feasible pair that is not in this transducer” (p. 78)Using FSTs for orthographic rules: Using FSTs for orthographic rulesSlide57: Using FSTs for orthographic rules fox^s#…we get to q1 with ‘x’Slide58: Using FSTs for orthographic rules fox^s#…we get to q2 with ‘^’Slide59: Using FSTs for orthographic rules fox^s#…we can get to q3 with ‘NULL’Slide60: Using FSTs for orthographic rules fox^s#…we also get to q5 with ‘s’ but we don’t want to!Slide61: fox^s#…we also get to q5 with ‘s’ but we don’t want to! So why is this transition there? ?friend^ship, ?fox^s^s (= foxes’s)Slide62: fox^s#…q4 with sSlide63: fox^s#…q0 with # (accepting state)Slide64: arizona: we leave q0 but return Other transitions…Slide65: m i s s ^ s Other transitions…