The program to search for strand-bias (see class8) was modified as follows:

* Homework assignments:

Read through the different scripts to understand what is going on (compare to class8).

Run the script on at least 3 genomes (see hw5 for where to find the genomes, and the questions below for how to choose appropriate ones). When selecting genomes, keep in mind that the individual lists are "incomplete"; e.g., the listing of completed "microbial genomes" does not contain some (any?) archaea. Interpret your results!

Repeat the analysis with differently sized oligos (please feel free to modify the script so that the oligo size can be changed without rewriting the code each time).
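One way to make the oligo size adjustable is to pass it as a parameter. The Python sketch below is not the class script; the function names (`count_oligos`, `strand_bias`) and the bias measure used here, (n - m) / (n + m) for an oligo counted n times and its reverse complement counted m times on the same strand, are assumptions made for illustration. The class script may define the bias differently.

```python
from collections import Counter

# Translation table for complementing a DNA sequence.
COMPLEMENT = str.maketrans("ACGT", "TGCA")
VALID_BASES = set("ACGT")


def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def count_oligos(seq, k):
    """Count all overlapping oligos of length k, skipping ambiguous bases."""
    seq = seq.upper()
    counts = Counter()
    for i in range(len(seq) - k + 1):
        word = seq[i:i + k]
        if set(word) <= VALID_BASES:
            counts[word] += 1
    return counts


def strand_bias(seq, k):
    """Assumed bias measure: (n - m) / (n + m), where n is the count of an
    oligo and m the count of its reverse complement in the same sequence.
    Values range from -1 to 1; 0 means no bias."""
    counts = count_oligos(seq, k)
    bias = {}
    for word, n in counts.items():
        m = counts.get(reverse_complement(word), 0)
        bias[word] = (n - m) / (n + m)
    return bias


if __name__ == "__main__":
    # Toy sequence, only to show the call; use a real genome in practice.
    for word, value in sorted(strand_bias("AAAATT", 2).items()):
        print(word, value)
```

With the oligo size as an argument, repeating the analysis for several sizes becomes a loop over `k` rather than an edit to the code.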

** Is there a limit to how large one could make the oligos? Why would it be meaningless to calculate the bias for a 30mer?
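A quick calculation can help with this question. The sketch below compares the number of possible oligos of length k (4 to the power k) with the number of positions in a genome. The genome length of 5,000,000 bp is an assumed, roughly bacterial-sized value, and the expected count assumes all oligos are equally likely, which real genomes do not satisfy.

```python
def expected_count(genome_length, k):
    """Expected occurrences of one specific k-mer on one strand, assuming
    all 4**k oligos are equally likely (a simplification)."""
    positions = genome_length - k + 1
    return positions / 4 ** k


if __name__ == "__main__":
    genome_length = 5_000_000  # assumed genome size in bp
    for k in (4, 8, 12, 30):
        print(k, 4 ** k, expected_count(genome_length, k))
```

When the expected count per oligo falls far below one, almost every oligo that occurs at all occurs exactly once, so a comparison between an oligo and its reverse complement no longer rests on meaningful counts.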

** Do different organisms have the same or similar oligo strand bias? How quickly does the bias change (with respect to the oligo sequence) when one moves to less closely related organisms? Possible questions to address: Do archaea have the same bias as bacteria? How about genomes for the same species or the same genus?

** To what extent can the oligo strand bias be explained by the nucleotide strand bias? Can one calculate how big an oligo bias should be expected from a given single nucleotide bias?
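One possible starting point for the last question is a null model in which the nucleotides along a strand are independent of each other. Under that assumption the expected frequency of an oligo is the product of its nucleotide frequencies, and the expected ratio of an oligo to its reverse complement is the product of f(b) / f(complement of b) over the bases b of the oligo. The sketch below illustrates this; the function names and the example frequencies are made up for illustration, and the independence assumption is a simplification that real genomes violate.

```python
import math

COMP = {"A": "T", "C": "G", "G": "C", "T": "A"}


def nucleotide_freqs(seq):
    """Relative frequencies of A, C, G, T on the given strand."""
    seq = seq.upper()
    total = sum(seq.count(base) for base in COMP)
    return {base: seq.count(base) / total for base in COMP}


def expected_ratio(word, freqs):
    """Expected count(word) / count(reverse complement of word), assuming
    independent nucleotides with the given single-strand frequencies."""
    ratio = 1.0
    for base in word.upper():
        ratio *= freqs[base] / freqs[COMP[base]]
    return ratio


if __name__ == "__main__":
    # Made-up frequencies with an excess of A over T on this strand.
    freqs = {"A": 0.30, "T": 0.20, "C": 0.25, "G": 0.25}
    for word in ("AA", "AT", "AAAA", "ACGT"):
        print(word, expected_ratio(word, freqs))
    # Deviation of an observed ratio from this expectation, on a log scale,
    # would indicate bias not explained by single nucleotides alone.
    print(math.log(expected_ratio("AAAA", freqs)))
```

Comparing the observed ratio for each oligo with this expectation shows how much of the oligo bias goes beyond what the single nucleotide bias alone would produce.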