875 PERL 06 mini Haggrid
Added: November 02, 2007

Presentation Transcript

Slide1: Introduction to PERL
Genetics 875, 10/19/06

PERL is a language that is easy to use and was designed to do certain tasks (like reading, writing, and moving text and sequences) very well.

Advantages of PERL:
- it's intuitive
- it's easy to get started ... you don't need to know everything initially
- it's very good at reading and manipulating files, sequences, and text
- there is usually more than one way to accomplish a task

Disadvantages of PERL:
Perl programs are different from compiled programs, in that the program you write is "run" by another program, which interprets your code (this interpreter is actually called perl ... your programs will be run by the perl interpreter). Because of this, your code is one level removed from the actual computer ... Therefore, perl programs are usually slower than equivalent programs in compiled languages (like C, C++).
Thus, perl is not used so much for tasks that require heavy computation.

Slide2: A great way to learn PERL:
http://www.oreilly.com/catalog/lperl3/ "Learning Perl"

Also, some great online resources:
http://www.perl.com/
A short PERL tutorial: http://archive.ncsa.uiuc.edu/General/Training/PerlIntro/
And lots of other help on the web ...

Slide3: Like any language, programming languages have structure
A book has words, sentences, paragraphs, chapters, and punctuation linking them all together.

    ENGLISH      PERL
    Noun         Scalar, variable
    Verb         Function, command
    Phrase       Statement, expression
    Paragraph    Loop
    Chapter      Subroutines, packages, modules

Slide4: A variable is a container that can hold information that has the potential to vary
Variables can be singular (a "scalar"), in which case they are identified by a "$" in front of the variable name:
    e.g. $x  $File1  $StudentName
They can hold a number, a letter (called a "character" in perl), a string of numbers, or a string of letters ... just remember that whatever it is, it is considered a single item.
    e.g. $x = 5;  $motif = "GATTAC";  $StudentName = "Rutabega";
Variables can also be plural, and those come in different forms: Arrays and Hashes.

Slide5: Arrays are a list of single variables
An array is a container that holds a list of separate, single variables in a specific order.
An array is denoted by a @ in front of its name:
    e.g. @StudentNames = ("Caligula", "Randolph", "Imelda");
"Caligula" is stored in the first "cell" of the array, which is the "0" cell
"Randolph" is stored in the second "cell" of the array, which is the "1" cell
"Imelda" is stored in the third "cell" of the array, which is the "2" cell
** Note that in perl, as in most programming languages, you start counting at "0" instead of at "1"

    Position in array:               0          1          2
    Value stored at that position:   Caligula   Randolph   Imelda

Slide6: Arrays are a list of single variables
An array is a container that holds a list of separate, single variables in a specific order.
An array is denoted by a @ in front of its name:
    e.g. @StudentNames = ("Caligula", "Randolph", "Imelda");

    Position in array:               0          1          2
    Value stored at that position:   Caligula   Randolph   Imelda

You can "call" a specific cell (which, remember, is a singular variable identified by $):
    $StudentNames[0] is "Caligula"
    $StudentNames[1] is "Randolph"
The $ tells perl that you want a singular variable.
The square brackets [ ] tell perl that you are looking at a single cell in an array.
Between these two parts of the name, perl knows this is a cell of an array.

Slide7: Exercise 0: Write your first perl program!
We will start by creating a simple perl program where you will print a string to the screen.
A few things about writing perl programs:
-- The first few lines of the program (which you'll write in a .pl text file) will contain information for the computer about how to run the program
-- In order for the perl interpreter to understand your code, you must use the right syntax. Like in English, each phrase must have an obvious start and stop point.
The most common punctuation in perl is ";" which acts like a period does in English. A statement begins after the ";" from the previous statement, and ends at the next ";".
There will be other kinds of punctuation which define statements/items, like (...) {...} "..." and we'll get to these in a bit.
One useful punctuation mark is #, which means "ignore the rest of this line". It is useful because you can type in notes to yourself ("comments") that aren't part of the code.

Slide8: Exercise 0: Write your first perl program!
Open the terminal on your computer and go to the desktop. Use the unix command "cd" to change directories (type the commands shown on their own lines):
    cd Desktop
We will use the text editor "emacs" to create and write your file:
    emacs FirstProgram.pl
This should open a blank file, since you just created it.
Type this in the first line of your .pl file:
    #!/usr/bin/perl
This is a special magic line that tells the computer to use the perl interpreter to read and execute your program.
We will use a special mode of perl called "strict." To do that, type this on the second line of your .pl file:
    use strict;
Save your file using the emacs command "Ctrl x Ctrl s" (i.e., hold down the Ctrl key and hit x, then s).
You are now ready to start writing your own code!

Slide9: Exercise 0: Write your first perl program!
6. Print a sentence to the screen using the built-in perl "print" function:
    print "Hello. Welcome to your first perl program \n";
The default for the print function is to print to the screen from where you ran the program. We will learn later how you can print to a file.
Note the "\n" at the end of this print statement. "\n" stands for "new-line character". This "\n" adds a "return" to the end of your statement to end the line.
7. Save your file using the emacs command "Ctrl x Ctrl s" (i.e., hold down the Ctrl key and hit x, then s).
You've now written your first perl program. To run your program, open another terminal window. You will call the perl interpreter and then feed it your program file name:
    perl FirstProgram.pl
You will either see your sentence on the screen, or you will get some kind of error ...

Slide10:
    #!/usr/bin/perl
    use strict;
    print "Hello. Welcome to your first perl program \n";

Slide11: Exercise 1: Modify your first perl program
We will create and define a string variable and an array. You will add code to your existing program.
Make a variable called Name:
    my $Name;
* Since we are using "strict" mode, you must declare a variable before you use it ... you do that by typing "my" in front of the variable, only when you create the variable (i.e., the first time you ever type it).
Define the variable $Name to be your own name:
    $Name = "Audrey";
Create an array called FavoriteHolidays:
    my @FavoriteHolidays;
Define the array as your top 3 favorite holidays, exactly as below:
    @FavoriteHolidays = ("Halloween", "Christmas", "Arbor Day");

Slide12: Exercise 1: Modify your first perl program
We will create and define a string variable and an array.
Print the variables you just defined to the screen using the built-in perl "print" function:
    print "The top favorite holiday for $Name is $FavoriteHolidays[0]\n";
Save your program by typing Ctrl x Ctrl s
12. Exit the editor by typing Ctrl x Ctrl c
To run your program, open another terminal window. You will call the perl interpreter and then feed it your program file:
    perl FirstProgram.pl
You will either see your name and holiday, or you will get some kind of error ...

Slide13:
    #!/usr/bin/perl
    use strict;
    print "Hello. Welcome to your first perl program \n";
    my $Name;
    $Name = "Audrey";
    my @FavoriteHolidays;
    @FavoriteHolidays = ("Halloween", "Christmas", "Arbor Day");
    print "The top favorite holiday for $Name is $FavoriteHolidays[0]\n";

Slide14: Hashes are fancy containers for single variables
Whereas an array indexes variables by their position in the list:

    Position in array:               0          1          2
    Value stored at that position:   Caligula   Randolph   Imelda

a hash indexes one variable by another (known as a "key"): for example, name and hometown

    Key in hash:                     Caligula   Randolph   Imelda
    Value stored with that key:      Rome       Berlin     Manila

A hash is denoted by %. To call the individual values contained in the hash, you need the key name:
    my %HomeTowns;
    $HomeTowns{"Caligula"} = "Rome";
The $ is for calling a single variable; the curly brackets tell you it's a hash.

Slide15: Exercise 2: Create and use a Hash
1. You will add code to your existing program. Make a hash called HolidayMonth:
    my %HolidayMonth;
2. Define the hash, with the key = holiday and the stored value = the month:
    $HolidayMonth{"Halloween"} = "October";
    $HolidayMonth{"Christmas"} = "December";
    $HolidayMonth{"Arbor Day"} = "April";
3. Print the month of the top holiday:
    print "The top favorite holiday for $Name is $FavoriteHolidays[0] in $HolidayMonth{Halloween} \n";

Slide16:
    #!/usr/bin/perl
    use strict;
    print "Hello. Welcome to your first perl program \n";
    my $Name;
    $Name = "Audrey";
    my @FavoriteHolidays;
    @FavoriteHolidays = ("Halloween", "Christmas", "Arbor Day");
    my %HolidayMonth;
    $HolidayMonth{"Halloween"} = "October";
    $HolidayMonth{"Christmas"} = "December";
    $HolidayMonth{"Arbor Day"} = "April";
    print "The top favorite holiday for $Name is $FavoriteHolidays[0] in $HolidayMonth{Halloween} \n";

Slide17: Perl has a lot of built-in functions and "operators"
These things work on numbers. In the examples, $x = 2; $y = 3;

    +    means add               $x + 5      is 7
    -    means subtract          $y - 3      is 0
    *    means multiply          $x * 3      is 6
    /    means divide            ($x*3)/2    is 3
    ++   means increase by 1     $y++        makes $y 4
    =    assignment operator (set a variable equal to something)
    ==   evaluates numeric equality

There are different operators for strings. In the examples, $x = 123; $y = 456; $z = 3;

    .    means concatenate two strings    $x . $y    is 123456
    x    means replicate a string         $z x 4     is 3333
    eq   evaluates string equality

Slide18: Conditional statement
Often you only want to do something if a certain condition is true. This is a case for if/unless/else statements.
"If $x is equal to 5, then do something" translates to
    if ($x == 5) {
        something ...
    }

Slide19: Conditional statement
Often you only want to do something if a certain condition is true. This is a case for if/unless/else statements.
"If $x is equal to 5, then do something" translates to
    if ($x == 5) {
        something ...
    }
Parentheses define the start and stop of the condition.
== means "if $x is exactly equal to 5". If you type if ($x = 5), it will reset $x to be 5 and the statement is automatically true ... this is because, to perl, = means "set this variable equal to ..."
Curly brackets define what to do if the conditional statement is true.

Slide20: Conditional statement
Can also use if-else statements:
    if ($x == 5) {
        something ...
    } else {
        do something different ...
    }
OR
    if ($x == 5) {
        something ...
    } elsif ($x < 10) {
        do something different ...
    }
The program will evaluate the statement in ( ... ): if true, it will do what's in { ... }; if false, it will SKIP what's in { ... } and resume on the line after that section.

Slide21: Conditional statement
The "while" statement is useful: do something while (some condition is true).
    my $count = 0;
    while ($count < 100) {
        do some function ...
        $count++;
    }
Remember that ++ is the "increment by one" operator. So each time you go through the loop, $count increases by one. If you forget to increase $count and it stays at 0, you will be in an infinite loop.
The "while" statement turns out to be very useful for reading in files ...
Note that a while statement is a kind of loop ...

Slide22: Repeating actions: Loops
Very often, you want to repeat the same action many times (often on different variables). For example:
    -- open a file of microarray data
    -- read in each line of the file
    -- divide the 3rd cell of data by some constant
    -- save the file
    for (my $i = 0; $i < 10; $i++) {
        do something ...
    }
There are 3 components of a "for loop":

Slide23: Repeating actions: Loops
Very often, you want to repeat the same action many times (often on different variables). For example:
    -- open a file of microarray data
    -- read in each line of the file
    -- divide the 3rd cell of data by some constant
    -- save the file
    for (my $i = 0; $i < 10; $i++) {
        do something ...
    }
my $i = 0    create a new variable to use as a counter; usually start that counter off at 0
$i < 10      do whatever as long as $i < 10
$i++         after each loop, increment $i by one (using the ++ operator)

Slide24: Repeating actions: Loops
Very often, you want to repeat the same action many times (often on different variables).
For example:
    -- open a file of microarray data
    -- read in each line of the file
    -- divide the 3rd cell of data by some constant
    -- save the file
    for (my $i = 0; $i < 10; $i++) {
        do something ...
    }
my $i = 0    create a new variable to use as a counter; usually start that counter off at 0
$i < 10      do whatever as long as $i < 10
$i++         after each loop, increment $i by one (using the ++ operator)
An important concept: scope. If you create a variable inside a loop, it is a "local" variable, i.e., it only exists while you're in the loop (in this case, $i is a local variable). If you want a variable that is "global," i.e., it exists for the duration of the program, be sure to declare it outside of any loops.

Slide25: Exercise 3: using loops
    #!/usr/bin/perl
    use strict;
    print "Hello. Welcome to your first perl program \n";
    my $Name;
    $Name = "Audrey";
    my @FavoriteHolidays;
    @FavoriteHolidays = ("Halloween", "Christmas", "Arbor Day");
    my %HolidayMonth;
    $HolidayMonth{"Halloween"} = "October";
    $HolidayMonth{"Christmas"} = "December";
    $HolidayMonth{"Arbor Day"} = "April";
    for (my $i = 0; $i < 3; $i++) {
        print "Number $i favorite holiday for $Name is $FavoriteHolidays[$i]\n";
    }

Slide26: A note about loops, spaces, and punctuation:
Having the correct punctuation is critical in perl.
-- Sometimes if you have the wrong punctuation, perl will choke and give you an error
-- Other times it will NOT choke and may look like it's working, but it's actually not doing what you intended.
Spaces: perl ignores most spaces between the separate items in a statement, so use them to make your code readable:
    $x=$y.$z;               # perl can read this, but people find it hard to read
    $x = $y . $z;           # same statement; the "." operator is easier to spot when flanked by spaces
    if ($x < $y) { ... }    # a space between "if" and "(" is customary, but not required
    chomp($whatever);       # here chomp is a function and the ( ... ) hold its argument;
                            # by convention there is no space between chomp and (
Spaces do matter inside quoted strings and between neighboring words (for example, between "foreach" and "my"). It takes a while to get the hang of when spaces don't affect things and when they do.
Curly brackets in loops: Always think carefully about the structure of your program. Sometimes you want to finish one loop before beginning the next one, other times you need nested loops ... it depends on what you're trying to do. But if you set it up incorrectly, your program may not be functioning as you had intended (an example of this later on). One thing that helps a lot in looking at your own code is using tabs in your writing: every time you start a new loop, indent each line of code within the loop with a tab ... note how I have my code structured. It really helps to see where the loops are and which brackets pair up.

Slide27: File Handling: talking to the outside world
Perl can open existing files to read in data and can create new files to write to, using "open":
    open (HANDLE, "FileName.txt");
HANDLE is a shorthand file handle; "FileName.txt" is the actual file name ... the default is a read-only file.

Slide28: File Handling: talking to the outside world
Perl can open existing files to read in data and can create new files to write to, using "open":
    open (HANDLE, ">FileName.txt");
HANDLE is a shorthand file handle; "FileName.txt" is the actual file name; the ">" means it's a writable file (an existing file with that name is overwritten).
You can also use this function to create a new file and write to it:
    open (SF, ">SaveFile.txt");
    print SF "$x";    # instead of printing to the screen, you will now print to the file

Slide29: Exercise 4: print results to a file
    #!/usr/bin/perl
    use strict;
    print "Hello. Welcome to your first perl program \n";
    my $Name;
    $Name = "Audrey";
    my @FavoriteHolidays;
    @FavoriteHolidays = ("Halloween", "Christmas", "Arbor Day");
    my %HolidayMonth;
    $HolidayMonth{"Halloween"} = "October";
    $HolidayMonth{"Christmas"} = "December";
    $HolidayMonth{"Arbor Day"} = "April";
    open (SF, ">SaveFile.txt");
    for (my $i = 0; $i < 3; $i++) {
        print SF "Number $i favorite holiday for $Name is $FavoriteHolidays[$i]\n";
    }
    # Notice how I had to create SF outside the loop so that the file is globally accessible.

Slide30: Regular expressions: comparing sequences
These are some of the most useful functions in PERL. They allow you to easily scan your sequence, search for substrings, transpose, etc.
=~ is the operator for doing regular expressions.
=~ m is the match operator ... used to search for a match to some sequence
    $sequence = "CCATATAGAGATGAGCCTATA";
    if ($sequence =~ m/GATGAG/) {
        print "sequence contains GATGAG\n";
    }
This will search $sequence for whatever is between / ... / (GATGAG in this case) ... if any part of your sequence matches GATGAG, the statement is TRUE and the code inside the curly brackets runs.

Slide31: Reading in a file: combining file handling and the while statement
    open (FILE, "FileName.txt");
    while (my $line = <FILE>) {
        chomp($line);
        print "$line\n";
    }

Slide32: Reading in a file: combining file handling and the while statement
    open (FILE, "FileName.txt");
    while (my $line = <FILE>) {
        chomp($line);
        print "$line\n";
    }
my $line       create a variable to hold each line of the file
<...>          is the line input operator ... it reads each line in a file
while ( ... )  keep going while there are more lines in FILE
The chomp function removes the newline ("return") character from the end of the line, if there is one (you only want to use this if you need to get rid of that character ... try with and without this line of code to see what it's doing).

Slide33: Exercise 4: open and read a Fasta file
Create a new file called ReadFasta.pl
    emacs ReadFasta.pl
Type the usual stuff at the top of the file
    #!/usr/bin/perl
    use strict;
Open the fasta file (PAC-genes.fasta here; later slides use upstreams.fasta) and read in the data using the "while" statement
    open (FILE, "PAC-genes.fasta");
    while (my $line = <FILE>) {
        # try also with chomp($line);
        print "line = $line\n";
    }
Save the file: Ctrl x Ctrl s
Run the file: perl ReadFasta.pl

Slide34:
    #!/usr/bin/perl
    use strict;
    open (FILE, "PAC-genes.fasta");
    while (my $line = <FILE>) {
        chomp($line);    # try with and without this line
        print "line is $line\n";
    }

Slide35: Exercise 4: open and read a Fasta file
You will store the fasta sequence data in a Hash.
Go back into your program and create a hash to hold the FASTA sequence. Then create a scalar $gene to hold the gene name
    my %Fasta;
    my $gene;
In the while statement, evaluate each line to see if it is Name or Sequence. A fasta file has >NAME\n followed by sequence
    if ($line =~ m/>/) {    # if the line contains a ">" character
        $gene = $line;      # no "my" here: $gene was already declared above
    }
8. Now you know that the subsequent lines must be sequence. Store that in the hash
    else {
        $Fasta{$gene} = $Fasta{$gene} . $line;
    }
Note what we are doing: we expect >NAME to come before sequence ... but the sequence could extend for multiple lines in the file. Therefore, we need to concatenate sequence from multiple lines, hence the "." operator to concatenate strings ... Here we are resetting $Fasta{$gene} to be whatever was stored in there before plus (using the string operator . ) the new line of sequence.

Slide36:
    #!/usr/bin/perl
    use strict;
    open (FILE, "upstreams.fasta");
    my %Fasta;
    my $gene;
    while (my $line = <FILE>) {
        chomp($line);
        if ($line =~ m/>/) {
            $gene = $line;
        }
        else {
            $Fasta{$gene} = $Fasta{$gene} . $line;
        }
    }

Slide37: Exercise 4: open and read a Fasta file
Next, you'll search through each upstream sequence for each gene for a consensus sequence. We need a way to search through all of the sequences, indexed by genes. We will use the "foreach" method of looping. Because the elements of a hash are not stored in any special order, we will use a way to step through each "key" in the hash.
    foreach my $g (keys %Fasta) {
        print "gene is $g and sequence is $Fasta{$g}\n";
    }
This means: for each of the keys in %Fasta, set $g equal to the key at hand ... then do whatever functions on that particular key ... for the next loop, $g will get set to the next key in the hash ... you will cycle through the data until you've gone through all the keys in the hash.

Slide38:
    #!/usr/bin/perl
    use strict;
    open (FILE, "upstreams.fasta");
    my %Fasta;
    my $gene;
    while (my $line = <FILE>) {
        chomp($line);
        if ($line =~ m/>/) {
            $gene = $line;
        }
        else {
            $Fasta{$gene} = $Fasta{$gene} . $line;
        }
    }
    # be sure to finish loading your file (close the loop) before starting next loop!
    foreach my $g (keys %Fasta) {
        print "gene is $g and sequence is $Fasta{$g}\n";
    }

Slide39: Exercise 4: open and read a Fasta file
Next, you'll search through each upstream sequence for each gene for a consensus sequence. You will make a new hash to store the sequence matches.
10. Next, within your loop ... search each upstream sequence for the motif, GATGAG. If there is a match, print the data to the screen
    if ($Fasta{$g} =~ m/GATGAG/i) {
        print "$g contains match to GATGAG\n";
    }
The little i after the closing / means do a case-insensitive search.

Slide40:
    #!/usr/bin/perl
    use strict;
    open (FILE, "upstreams.fasta");
    my %Fasta;
    my $gene;
    while (my $line = <FILE>) {
        chomp($line);
        if ($line =~ m/>/) {
            $gene = $line;
        }
        else {
            $Fasta{$gene} = $Fasta{$gene} . $line;
        }
    }
    # be sure to finish loading your file (close the loop) before starting next loop!
    foreach my $g (keys %Fasta) {
        if ($Fasta{$g} =~ m/GATGAG/i) {
            print "$g contains GATGAG\n";
        }
    }

Slide41: Exercise 4: open and read a Fasta file
Finally, save the results to a new file.
Create the savefile, "Matches.txt", using the open function ... you must create this outside the loop so that it is globally visible to perl:
    open (SF, ">Matches.txt");
Step through the hash and print the gene and match information to the file
Save the file: Ctrl x Ctrl s
Run the program from the command line:
    perl ReadFasta.pl

Slide42:
    #!/usr/bin/perl
    use strict;
    open (FILE, "upstreams.fasta");
    my %Fasta;
    my $gene;
    while (my $line = <FILE>) {
        chomp($line);
        if ($line =~ m/>/) {
            $gene = $line;
        }
        else {
            $Fasta{$gene} = $Fasta{$gene} . $line;
        }
    }
    open (SAVEFILE, ">Matches.txt");
    foreach my $g (keys %Fasta) {
        if ($Fasta{$g} =~ m/GATGAG/i) {
            print "$g contains GATGAG\n";
            print SAVEFILE "$g contains GATGAG\n";
        }
    }

Slide43: One more useful thing ... more flexible matching
You can also search for less specific motifs by having flexible characters at specific positions in your binding site
    if ($Fasta{$g} =~ m/GA[GATC]GAG/) { ... }
Here (specifically, inside the / ... / of a "match" expression), the square brackets specify that the match could contain any ONE of the characters listed at that position of the motif. So GAGGAG, GAAGAG, GATGAG, and GACGAG would all match the sequence you're searching for.
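A quick way to try out the square-bracket idea is to run a handful of candidate sites through the same match expression. The following is only a minimal sketch: the subroutine name matching_sites and the extra test string "GANGAG" are made up for illustration and are not part of the original slides.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Return every candidate string that matches GA[GATC]GAG,
# i.e. any ONE of G, A, T or C at the third position.
# (matching_sites is a hypothetical helper name.)
sub matching_sites {
    my @candidates = @_;
    my @hits;
    foreach my $site (@candidates) {
        if ($site =~ m/GA[GATC]GAG/) {
            push @hits, $site;
        }
    }
    return @hits;
}

# "GANGAG" is a made-up negative example: N is not in the bracket list.
foreach my $site (matching_sites("GAGGAG", "GAAGAG", "GATGAG", "GACGAG", "GANGAG")) {
    print "$site matches\n";
}
```

Changing the list inside the square brackets is all it takes to allow or forbid a base at that position.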
    if ($Fasta{$g} =~ m/GA[GAT]GAG/) { ... }
Here, GAGGAG, GAAGAG, and GATGAG would match, but not GACGAG (since C is not specified in the 3rd position).
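The pieces of Exercise 4 and the flexible-matching slide can be put together roughly as follows. This is a sketch under stated assumptions, not the course's own solution: the subroutine names read_fasta and genes_with_motif are hypothetical, the two example sequences are made up, and the data is read through an in-memory filehandle so the script runs without upstreams.fasta. It also uses "use warnings" and the three-argument form of open with a lexical filehandle, which differ from the bareword handles shown on the slides.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Read FASTA-formatted text from an open filehandle into a hash:
# key = header line (including ">"), value = concatenated sequence.
# (read_fasta is a hypothetical helper name.)
sub read_fasta {
    my ($fh) = @_;
    my %fasta;
    my $gene;                        # declared once, outside the loop
    while (my $line = <$fh>) {
        chomp($line);
        if ($line =~ m/^>/) {        # header lines start with ">"
            $gene = $line;           # no second "my" here
            $fasta{$gene} = "";
        }
        elsif (defined $gene) {
            $fasta{$gene} .= $line;  # append this line of sequence
        }
    }
    return %fasta;
}

# Return the sorted names whose sequence matches GA[GAT]GAG, ignoring case.
# (genes_with_motif is a hypothetical helper name.)
sub genes_with_motif {
    my (%fasta) = @_;
    my @hits;
    foreach my $g (sort keys %fasta) {
        if ($fasta{$g} =~ m/GA[GAT]GAG/i) {
            push @hits, $g;
        }
    }
    return @hits;
}

# Made-up example data; a real run would open the fasta file instead.
my $example = ">geneA\nCCATATAGA\nGATGAGCCT\n>geneB\nTTGACGAGTT\n";
open(my $in, "<", \$example) or die "Cannot open example data: $!";
my %Fasta = read_fasta($in);
close($in);

foreach my $g (genes_with_motif(%Fasta)) {
    print "$g contains a match to GA[GAT]GAG\n";
}
```

In this made-up data, geneA's motif GATGAG is split across two lines of the file, so it is only found because the sequence lines are chomped and concatenated before searching; geneB carries GACGAG, which the [GAT] class does not allow.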
875 PERL 06 mini Haggrid Download Post to : URL : Related Presentations : Share Add to Flag Embed Email Send to Blogs and Networks Add to Channel Uploaded from authorPOINTLite Insert YouTube videos in PowerPont slides with aS Desktop Copy embed code: (To copy code, click on the text box) Embed: URL: Thumbnail: WordPress Embed Customize Embed The presentation is successfully added In Your Favorites. Views: 210 Category: Entertainment License: All Rights Reserved Like it (0) Dislike it (0) Added: November 02, 2007 This Presentation is Public Favorites: 0 Presentation Description No description available. Comments Posting comment... By: arivu (30 month(s) ago) can u send this ppt to my mail id: arivu.kkarasan@yahoo.in Saving..... Post Reply Close Saving..... Edit Comment Close By: ksubbaraju (37 month(s) ago) please can u send this PPT ksubbaraju08@gmail.com Saving..... Post Reply Close Saving..... Edit Comment Close Premium member Presentation Transcript Slide1: Introduction to PERL Genetics 875 10/19/06 PERL is a language that is easy to use and was designed to do certain tasks (like reading, writing, moving text & sequences) very well Advantages of PERL: - itâs intuitive - itâs easy to get started ⌠you donât need to know everything initially - itâs very good at reading and manipulating files, sequences, text - there is usually >1 way to accomplish a task Disadvantages of PERL: Perl programs are different from other programs, in that the program you write is ârunâ by another program which interprets your code (this interpreter is actually called perl ⌠your programs will be run by the perl interpreter). Because of this, your code is one level removed from the actual computer ⌠Therefore, perl programs are slower than other languages (like C, C++). 
Thus, perl is not used so much for functions that require heavy computation.Slide2: A great way to learn PERL: http://www.oreilly.com/catalog/lperl3/ âLearning Perlâ Also, some great online resources: http://www.perl.com/ A short PERL tutorial http://archive.ncsa.uiuc.edu/General/Training/PerlIntro/ And lots of other help on the web âŚ.Slide3: Like any language, programming languages have structure A book has words, sentences, paragraphs, chapters, and punctuation linking them all together ENGLISH PERL Noun Scalar, variable Verb Function, command Phrase Statement, expression Paragraph Loop Chapter Subroutines, packages, modulesSlide4: A variable is a container that can hold information that has the potential to vary Variables can be singular, in which case they are identified by a â$â in front of the variable name eg) $x $File1 $StudentName They can be a number, letter (called a âcharacterâ in perl), string of numbers, or string of letters ⌠just remember that whatever it is, it is considered a single item. 
eg) $x = 5 $motif = âGATTACâ $StudentName = âRutabegaâ Variables can be plural and those come in different forms: Arrays and HashesSlide5: Arrays are a list of single variables An array is a container that holds a list of separate, single variables in a specific order An array is denoted by a @ in front of its name Eg) @StudentNames = âCaligulaâ, âRandolphâ, âImeldaâ âCaligulaâ is stored in the first âcellâ of the array, which is the â0â cell âRandophâ is stored in the second âcellâ of the array, which is the â1â cell âImeldaâ is stored in the third âcellâ of the array, which is the â2â cell ** Note that in programming languages, you always start counting at â0â instead of at â1â Position in array: 0 1 2 Value stored at that position: Caligula Randolph ImeldaSlide6: Arrays are a list of single variables An array is a container that holds a list of separate, single variables in a specific order An array is denoted by a @ in front of its name Eg) @StudentNames = ( âCaligulaâ, âRandophâ, âImeldaâ ) Position in array: 0 1 2 Value stored at that position: Caligula Randoph Imelda You can âcallâ a specific cell (which, remember, is a singular variable identified by $): $StudentNames[0] = âCaligulaâ $StudentNames[1] = âRandolphâ This $ tells perl that you want a singular variable These brackets tell perl that you are looking at a single cell in an array Between these two parts of the name, perl knows this is a cell of an arraySlide7: Exercise 0: Write your first perl program! We will start by creating a simple perl program where you will print a string to the screen. A few things about writing perl programs: -- The first few lines of the program (which youâll write in a .pl text file) will contain information for the computer about how to run the program -- In order for the perl interpreter to understand your code, you must use the right syntax. Like in English, each phrase must have an obvious start and stop point. 
The most common punctuation in perl is â;â which acts like a period does in English. A statement begins after the â;â from the previous statement, and ends at the next â;â There will be other kinds of punctuation which define statements/items, like (âŚ.) {âŚâŚ} ââŚ..â and weâll get to these in a bit. One useful punctuation is # which means âDonât read this line of the fileâ â it is useful because you can type in notes to yourself (âcommentsâ) that arenât part of the code.Slide8: Exercise 0: Write your first perl program! Open the terminal on your computer and go to the desktop use the unix command âcdâ to change directories (type everything written in brown) cd Desktop We will use the text editor âemacsâ to create and write your file emacs FirstProgram.pl This should open a blank file, since you just created it Type this in the first line of your .pl file: #!/usr/bin/perl This is a special magic command that tells the computer to use the perl interpreter to read and execute your program. We will use a special mode of perl called âstrict.â To do that, type this on the second line of your .pl file: use strict; Save your file using the emacs command, âCtrl x Ctrl sâ (ie, hold down Ctrl key and hit x then s) You are now ready to start writing your own code!Slide9: Exercise 0: Write your first perl program! 6. Print a sentence to the screen using the built-in perl âprintâ function print âHello. Welcome to your first perl program \nâ; The default for the print function is to print to the screen from where you ran the program. We will learn later how you can print to a file. Note the â\nâ at the end of this print statement. â\nâ stands for ânew-line characterâ This â\nâ adds a âreturnâ to the end of your statement to end the line Youâve now written your first perl program. To run your program, open another terminal window. You will call the perl interpreter and then feed it your program file name perl FirstProgram.pl 7. 
Save your file using the emacs command, âCtrl x Ctrl sâ (ie, hold down Cntrl key and hit x then s) You will either see your sentenced on the screen, or you will get some kind of error âŚSlide10: #!usr/bin/perl use strict; print âHello. Welcome to your first perl program \nâ;Slide11: Exercise 1: Modify your first perl program We will create and define a string variable and an array. You will add code to your existing program. Make a variable called Name: my $Name; *since we are using âstrictâ mode, you must define a variable before you use it ⌠for whatever reason, you do that by typing âmyâ in front of the variable, only when you create the variable (ie. The first time you ever type it) Define the variable $Name to be your own name: $Name = âAudreyâ; Create an array called FavoriteHolidays my @FavoriteHolidays; Define the array as your top 3 favorite holidays, exactly as below: @FavoriteHolidays = (âHalloweenâ, âChristmasâ , âArbor Dayâ); Slide12: Exercise 1: Modify your first perl program Print the variables you just defined to the screen using the built-in perl âprintâ function print âThe top favorite holiday for $Name is $FavoriteHolidays[0]\nâ; Save your program by typing Ctrl x Ctrl s 12. Exit the program by typing Ctrl x Ctrl c You will either see your name and holiday, or you will get some kind of error ⌠To run your program, open another terminal window. You will call the perl interpreter and then feed it your program file perl FirstProgram.pl We will create and define a string variable and an array.Slide13: #!usr/bin/perl use strict; print âHello. 
Welcome to your first perl program \nâ; my $Name; $Name = "Audrey"; my @FavoriteHolidays; @FavoriteHolidays = ("Halloween", "Christmas", "Arbor Day"); print "The top favorite holiday for $Name is $FavoriteHolidays[0]\n"; Slide14: Hashes are fancy containers for single variables Whereas an array indexes variables by their position in the list: A hash indexes one variable by another (known as a âkeyâ): for example, Name and hometown Key in hash: Caligula Randolph Imelda Value stored with that key: Rome Berlin Manila A hash is denoted by %. To call the individual values contained in the hash, you need the key name my %HomeTowns; $HomeTowns{ âCaligulaâ} = âRomeâ Position in array: 0 1 2 Value stored at that position: Caligula Randoph Imelda $ for calling single variable curly brackets tell you itâs a hashSlide15: Exercise 2: Create and use a Hash 1. You will add code to your existing program. Make a hash called HolidayMonth: my %HolidayMonth; Define the Hash, with the key = holiday and the stored value = the month $HolidayMonth{ âHalloweenâ } = âOctoberâ; $HolidayMonth{ âChristmasâ } = âDecemberâ; $HolidayMonth{ âArbor Dayâ } = âAprilâ; 3. Print the month of the top holiday print âThe top favorite holiday for $Name is $FavoriteHolidays[0] in $HolidayMonth{Halloween} \nâ;Slide16: #!usr/bin/perl use strict; print âHello. 
Welcome to your first perl program \n";
    my $Name;
    $Name = "Audrey";
    my @FavoriteHolidays;
    @FavoriteHolidays = ("Halloween", "Christmas", "Arbor Day");
    my %HolidayMonth;
    $HolidayMonth{"Halloween"} = "October";
    $HolidayMonth{"Christmas"} = "December";
    $HolidayMonth{"Arbor Day"} = "April";
    print "The top favorite holiday for $Name is $FavoriteHolidays[0] in $HolidayMonth{Halloween} \n";

Slide17: Perl has a lot of built-in functions and "operators"
These operators work on numbers. With $x = 2; $y = 3;
    +     means add                $x + 5      is 7
    -     means subtract           $y - 3      is 0
    *     means multiply           $x * 3      is 6
    /     means divide             ($x*3)/2    is 3
    ++    means increase by 1      $y++        $y is now 4
    =     assignment operator (set a variable equal to something)
    ==    evaluates numeric equality
There are different operators for strings. With $x = 123; $y = 456; $z = 3;
    .     means concatenate two strings    $x . $y    is 123456
    x     means replicate a string         $z x 4     is 3333
    eq    evaluates string equality

Slide18: Conditional statement
Often you only want to do something if a certain condition is true. This is a case for if/unless/else statements.
"If $x is equal to 5, then do something" translates to:
    if ($x == 5) {
        something ...
    }

Slide19: Conditional statement
Parentheses define the start and stop of the condition.
== tests whether $x is exactly equal to 5. If you type if ($x = 5), it will reset $x to be 5 and the condition is automatically true. This is because, to perl, = means "set this variable equal to ...".
Curly brackets define what to do if the conditional statement is true.

Slide20: Conditional statement
Can also use if-then-else statements:
    if ($x == 5) {
        something ...
    } else {
        do something different ...
    }

    if ($x == 5) {
        something ...
    } elsif ($x < 10) {
        do something different ...
    }
Use else for a catch-all branch, or elsif to test a second condition.
The program will evaluate the statement in ( ... ): if true, it will do what is in { ... }; if false, it will SKIP what is in { ... } and resume on the line after that section.

Slide21: Conditional statement
The "while" statement is useful: do something while (some condition is true).
    my $count = 0;
    while ($count < 100) {
        do some function ...
        $count++;
    }
The "while" statement turns out to be very useful for reading in files ...
Remember that ++ is the "increment by one" operator, so each time you go through the loop, $count increases by one. If you forget to increase $count and it stays at 0, you will be in an infinite loop.
Note that a while statement is a kind of loop ...

Slide22: Repeating actions: Loops
Very often, we want to repeat the same function many times (often on different variables). For example:
    -- open a file of microarray data
    -- read in each line of the file
    -- divide the 3rd cell of data by some constant
    -- save the file
    for (my $i = 0; $i < 10; $i++) {
        do something ...
    }
There are 3 components of a "for loop":

Slide23: Repeating actions: Loops
    my $i = 0    create a new variable to use as a counter; usually start that counter off at 0
    $i < 10      do whatever as long as $i < 10
    $i++         after each loop, increment $i by one (using the ++ operator)

Slide24: Repeating actions: Loops
Very often, we want to repeat the same function many times (often on different variables).
    for (my $i = 0; $i < 10; $i++) {
        do something ...
    }
An important concept: scope. If you create a variable inside a loop, it is a "local" variable: it only exists while you are in the loop (in this case, $i is a local variable). If you want a variable that is "global", i.e., one that exists for the duration of the program, be sure to declare it outside of any loops.

Slide25: Exercise 3: using loops
    #!/usr/bin/perl
    use strict;
    print "Hello. Welcome to your first perl program \n";
    my $Name;
    $Name = "Audrey";
    my @FavoriteHolidays;
    @FavoriteHolidays = ("Halloween", "Christmas", "Arbor Day");
    my %HolidayMonth;
    $HolidayMonth{"Halloween"} = "October";
    $HolidayMonth{"Christmas"} = "December";
    $HolidayMonth{"Arbor Day"} = "April";
    for (my $i = 0; $i < 3; $i++) {
        print "Number $i favorite holiday for $Name is $FavoriteHolidays[$i]\n";
    }

Slide26: A note about loops, spaces, and punctuation
Having the correct punctuation is critical in perl.
    -- Sometimes if you have the wrong punctuation, perl will choke and give you an error.
    -- Other times it will NOT choke and may look like it is working, but it is actually not doing what you intended.
Spaces: perl ignores most whitespace between the separate items in a statement, so spacing is mainly there to help the human reader. The first two lines below do the same thing, but the second is much easier to read:
    $x=$y.$z;                # legal, but hard to read
    $x = $y . $z;            # the "." operator is easier to spot when flanked by spaces
    if ($x < $y) { ... }     # a space between "if" and "(" is conventional, but not required
    chomp( $whatever );      # here chomp is a function and the ( ... )
actually are part of that function call; by convention no space is written between chomp and (, although perl accepts one.
Spaces do matter inside quoted strings, where every space is kept exactly as typed. It takes a while to get the hang of when spaces affect things and when they do not.
Curly brackets in loops: Always think carefully about the structure of your program. Sometimes you want to finish one loop before beginning the next one; other times you need nested loops. It depends on what you are trying to do, but if you set it up incorrectly, your program may not function as you intended (an example of this later on).
One thing that helps a lot in looking at your own code is using tabs in your writing: every time you start a new loop, indent each line of code within the loop with a tab. Note how I have my code structured: it really helps to see where the loops are and which brackets pair up.

Slide27: File Handling: talking to the outside world
Perl can open existing files to read in data and can create new files to write to, using "open":
    open (HANDLE, "FileName.txt");
HANDLE is the shorthand file handle and "FileName.txt" is the actual file name. The default is a read-only file.

Slide28: File Handling: talking to the outside world
Perl can open existing files to read in data and can create new files to write to, using "open":
    open (HANDLE, ">FileName.txt");
HANDLE is the shorthand file handle, "FileName.txt" is the actual file name, and the ">" means it is a writable file (an existing file with that name is overwritten).
You can also use this function to create a new file and write to it:
    open (SF, ">SaveFile.txt");
    print SF "$x";    # instead of printing to the screen, you will now print to the file

Slide29:
    #!/usr/bin/perl
    use strict;
    print "Hello. 
Welcome to your first perl program \n";
    my $Name;
    $Name = "Audrey";
    my @FavoriteHolidays;
    @FavoriteHolidays = ("Halloween", "Christmas", "Arbor Day");
    my %HolidayMonth;
    $HolidayMonth{"Halloween"} = "October";
    $HolidayMonth{"Christmas"} = "December";
    $HolidayMonth{"Arbor Day"} = "April";
    open (SF, ">SaveFile.txt");
    for (my $i = 0; $i < 3; $i++) {
        print SF "Number $i favorite holiday for $Name is $FavoriteHolidays[$i]\n";
    }
Exercise 4: print results to a file
# Notice how I had to create SF outside the loop so that the file is globally accessible.

Slide30: Regular expressions: comparing sequences
These are some of the most useful functions in PERL. They allow you to easily scan your sequence, search for substrings, transpose, etc.
=~ is the operator for doing regular expressions.
=~ m is the match operator, used to search for a match to some sequence:
    my $sequence = "CCATATAGAGATGAGCCTATA";
    if ($sequence =~ m/GATGAG/) {
        print "sequence contains GATGAG\n";
    }
# This will search $sequence for whatever is between / ... / (GATGAG in this case) ...
if any part of your sequence matches GATGAG, the statement is TRUE and the code inside the curly brackets runs.

Slide31: Reading in a file: combining file handling and the while statement
    open (FILE, "FileName.txt");
    while (my $line = <FILE>) {
        chomp($line);
        print "$line\n";
    }

Slide32: Reading in a file: combining file handling and the while statement
    my $line         create a variable to hold each line of the file
    <FILE>           <..> is the line input operator; it reads each line in a file
    while ( ... )    keep going while there are more lines in FILE
The chomp function removes the newline from the end of the line (you only want to use this if you need to get rid of the "return" character at the end of a line). Try with and without this line of code to see what it is doing.

Slide33: Exercise 4: open and read a Fasta file
Create a new file called ReadFasta.pl:
    emacs ReadFasta.pl
Type the usual stuff at the top of the file:
    #!/usr/bin/perl
    use strict;
Open the fasta file (PAC-genes.fasta in this example) and read in the data using the "while" statement:
    open (FILE, "PAC-genes.fasta");
    while (my $line = <FILE>) {
        # try also with chomp($line);
        print "line = $line\n";
    }
Save the file: Ctrl x Ctrl s
Run the file: perl ReadFasta.pl

Slide34:
    #!/usr/bin/perl
    use strict;
    open (FILE, "PAC-genes.fasta");
    while (my $line = <FILE>) {
        chomp($line);    # try with and without this line
        print "line is $line\n";
    }

Slide35: Exercise 4: open and read a Fasta file
You will store the fasta sequence data in a hash. Go back into your program and create a hash to hold the FASTA sequence. Then create a scalar $gene to hold the gene name:
    my %Fasta;
    my $gene;
In the while statement, evaluate each line to see if it is a name or sequence. A fasta file has >NAME\n followed by sequence:
    if ($line =~ m/>/) {    # if the line contains a ">" character
        $gene = $line;      # no "my" here: $gene was already declared above
    }
Now you know that the subsequent lines must be sequence. Store that in the hash:
    else {
        $Fasta{$gene} = $Fasta{$gene} . 
$line;
    }
Note what we are doing: we expect >NAME to come before sequence, but the sequence could extend for multiple lines in the file. Therefore, we need to concatenate sequence from multiple lines, hence the "." operator to concatenate strings. Here we are resetting $Fasta{$gene} to be whatever was stored in there before plus (using the string operator . ) the new line of sequence.

Slide36:
    #!/usr/bin/perl
    use strict;
    open (FILE, "upstreams.fasta");
    my %Fasta;
    my $gene;
    while (my $line = <FILE>) {
        chomp($line);
        if ($line =~ m/>/) {
            $gene = $line;
        } else {
            $Fasta{$gene} = $Fasta{$gene} . $line;
        }
    }

Slide37: Exercise 4: open and read a Fasta file
Next, you will search through each upstream sequence for each gene for a consensus sequence. We need a way to search through all of the sequences, indexed by genes. We will use the "foreach" method of looping. Because the elements of a hash are not stored in any special order, we will use a way to step through each "key" in the hash:
    foreach my $g (keys %Fasta) {
        print "gene is $g and sequence is $Fasta{$g}\n";
    }
This means: for each of the keys in %Fasta, set $g equal to the key at hand, then do whatever functions on that particular key. For the next loop, $g will get set to the next key in the hash. You will cycle through the data until you have gone through all the keys in the hash.

Slide38:
    #!/usr/bin/perl
    use strict;
    open (FILE, "upstreams.fasta");
    my %Fasta;
    my $gene;
    while (my $line = <FILE>) {
        chomp($line);
        if ($line =~ m/>/) {
            $gene = $line;
        } else {
            $Fasta{$gene} = $Fasta{$gene} . $line;
        }
    }
    # be sure to finish loading your file (close the loop) before starting next loop!
    foreach my $g (keys %Fasta) {
        print "gene is $g and sequence is $Fasta{$g}\n";
    }

Slide39: Exercise 4: open and read a Fasta file
Next, you will search through each upstream sequence for each gene for a consensus sequence. You will make a new hash to store the sequence matches.
Next, within your loop, search each upstream sequence for the motif GATGAG. If there is a match, print the data to the screen:
    foreach my $g (keys %Fasta) {
        if ($Fasta{$g} =~ m/GATGAG/i) {    # this little i means do a case-insensitive search
            print "$g contains match to GATGAG\n";
        }
    }

Slide40:
    #!/usr/bin/perl
    use strict;
    open (FILE, "upstreams.fasta");
    my %Fasta;
    my $gene;
    while (my $line = <FILE>) {
        chomp($line);
        if ($line =~ m/>/) {
            $gene = $line;
        } else {
            $Fasta{$gene} = $Fasta{$gene} . $line;
        }
    }
    # be sure to finish loading your file (close the loop) before starting next loop!
    foreach my $g (keys %Fasta) {
        if ($Fasta{$g} =~ m/GATGAG/i) {
            print "$g contains GATGAG\n";
        }
    }

Slide41: Exercise 4: open and read a Fasta file
Finally, save the results to a new file.
Create the save file "Matches.txt" using the open function. You must create this outside the loop so that it is globally visible to perl:
    open (SAVEFILE, ">Matches.txt");
Step through the hash and print the gene and match information to the file.
Save the file: Ctrl x Ctrl s
Run the program from the command line:
    perl ReadFasta.pl

Slide42:
    #!/usr/bin/perl
    use strict;
    open (FILE, "upstreams.fasta");
    my %Fasta;
    my $gene;
    while (my $line = <FILE>) {
        chomp($line);
        if ($line =~ m/>/) {
            $gene = $line;
        } else {
            $Fasta{$gene} = $Fasta{$gene} . $line;
        }
    }
    open (SAVEFILE, ">Matches.txt");
    foreach my $g (keys %Fasta) {
        if ($Fasta{$g} =~ m/GATGAG/i) {
            print "$g contains GATGAG\n";
            print SAVEFILE "$g contains GATGAG\n";
        }
    }

Slide43: One more useful thing ... more flexible matching
You can also search for less specific motifs by having flexible characters at specific positions in your binding site:
    if ($Fasta{$g} =~ m/GA[GATC]GAG/) { ... }
Here (specifically, inside the / ... / of a "match" expression), the square brackets specify that the match could contain any ONE of the characters listed at that position of the motif. So GAGGAG, GAAGAG, GATGAG, and GACGAG would all match the sequence you are searching for.
    if ($Fasta{$g} =~ m/GA[GAT]GAG/) { ... }
Here, GAGGAG, GAAGAG, and GATGAG would match, but not GACGAG (since C is not specified in the 3rd position).
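
The two bracketed patterns above can be compared side by side. The following is only a minimal sketch and is not part of the original exercise: the helper name matching_motifs and the in-memory list of candidate motifs are assumptions made for illustration, standing in for the sequences you would normally read from a fasta file.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not from the slides): return every candidate
# string that matches the given pattern, ignoring upper/lower case.
sub matching_motifs {
    my ($pattern, @candidates) = @_;
    return grep { $_ =~ m/$pattern/i } @candidates;
}

# The four motifs discussed above, used here as test data.
my @candidates = ('GAGGAG', 'GAAGAG', 'GATGAG', 'GACGAG');

# [GATC] allows any of the four bases at the 3rd position.
my @any_base = matching_motifs('GA[GATC]GAG', @candidates);

# [GAT] leaves out C, so GACGAG should be rejected.
my @no_c = matching_motifs('GA[GAT]GAG', @candidates);

print "GA[GATC]GAG matches: @any_base\n";
print "GA[GAT]GAG matches: @no_c\n";
```

In your own program you could pass the values of %Fasta to a helper like this in place of the fixed list. Whether a helper is worth it is a matter of taste; a plain if inside the foreach loop works just as well.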