UMGC FASTA Nucleotide Sequence Python Project
Description
Q2) Write a program that asks the user for a file containing a FASTA nucleotide sequence (you can use the same sequence.fasta file as above). Then prompt the user to select a frame (number 1 through 6). Your program should then find the translation (protein sequence) of the nucleotide sequence in that frame. Print the translation to the screen.
Input validation: Check to see that the file name entered by the user exists AND that the sequence is in FASTA format. You can assume that there is only one sequence in the file.
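One possible shape for Q2 is sketched below. It is a minimal sketch, not a reference solution, and it rests on stated assumptions: the standard genetic code is used, frames 1 to 3 read the forward strand at offsets 0 to 2, frames 4 to 6 read the reverse complement at the same offsets, and a sequence is accepted as FASTA if it has a ">" header followed only by the letters A, C, G, T or N. The function names (parse_fasta, read_fasta, translate_frame) are illustrative and do not come from the assignment.

```python
import os
from itertools import product

# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def parse_fasta(text):
    """Return (header, sequence) or raise ValueError if text is not FASTA."""
    lines = [ln.strip() for ln in text.splitlines() if ln.strip()]
    if not lines or not lines[0].startswith(">"):
        raise ValueError("first line must start with '>'")
    seq = "".join(lines[1:]).upper()
    # Assumption: only A, C, G, T and N count as valid nucleotide letters.
    if not seq or set(seq) - set("ACGTN"):
        raise ValueError("sequence is empty or has non-nucleotide characters")
    return lines[0][1:], seq


def read_fasta(path):
    """Check that the file exists, then parse it as single-record FASTA."""
    if not os.path.isfile(path):
        raise FileNotFoundError(path)
    with open(path) as handle:
        return parse_fasta(handle.read())


def reverse_complement(seq):
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def translate_frame(seq, frame):
    """Translate seq in frame 1-6; codons containing N become 'X'."""
    if frame not in range(1, 7):
        raise ValueError("frame must be between 1 and 6")
    if frame > 3:
        # Frames 4-6 are read from the reverse-complement strand.
        seq = reverse_complement(seq)
        frame -= 3
    start = frame - 1
    codons = (seq[i:i + 3] for i in range(start, len(seq) - 2, 3))
    return "".join(CODON_TABLE.get(codon, "X") for codon in codons)
```

A full program would wrap these in a loop that calls input() for the file name and frame number, re-prompts on FileNotFoundError or ValueError, and prints the result of translate_frame. Stop codons are shown as "*" here; whether to stop at the first one is left open by the assignment.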
Q3) Write a program that asks the user for a sequence in GenBank format (included is a file called sequence.gb that you can use). Your program should convert the GenBank formatted sequence into FASTA format. Write the FASTA formatted sequence to a file whose name includes the accession number (e.g. NM_001250672.txt, where NM_001250672 is the accession number).
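The conversion in Q3 could be sketched roughly as follows. This assumes a single-record GenBank flat file in which the accession sits on the ACCESSION line, the description on the DEFINITION line (possibly continued on indented lines), and the sequence between ORIGIN and the closing "//". The function names and the 70-column line width are choices made for this sketch, not requirements of the assignment.

```python
def genbank_to_fasta(gb_text, width=70):
    """Return (accession, fasta_text) for a single-record GenBank string."""
    accession = ""
    definition = []
    seq_parts = []
    in_definition = False
    in_origin = False

    for line in gb_text.splitlines():
        if line.startswith("//"):          # end-of-record marker
            break
        if in_origin:
            # Sequence lines mix position numbers, spaces and bases;
            # keep only the letters.
            seq_parts.append("".join(ch for ch in line if ch.isalpha()))
        elif line.startswith("ACCESSION"):
            fields = line.split()
            if len(fields) > 1:
                accession = fields[1]
            in_definition = False
        elif line.startswith("DEFINITION"):
            definition = [line[len("DEFINITION"):].strip()]
            in_definition = True
        elif line.startswith("ORIGIN"):
            in_origin = True
            in_definition = False
        elif in_definition and line.startswith(" "):
            definition.append(line.strip())   # continuation of DEFINITION
        else:
            in_definition = False

    sequence = "".join(seq_parts).upper()
    if not accession or not sequence:
        raise ValueError("input does not look like a GenBank record")

    header = ">" + " ".join([accession] + definition)
    body = "\n".join(sequence[i:i + width]
                     for i in range(0, len(sequence), width))
    return accession, header + "\n" + body + "\n"


def write_fasta(accession, fasta_text):
    """Write the FASTA text to '<accession>.txt' and return the file name."""
    path = f"{accession}.txt"
    with open(path, "w") as handle:
        handle.write(fasta_text)
    return path
```

With the supplied sequence.gb, reading the file into a string, passing it to genbank_to_fasta and then calling write_fasta with the returned values should produce a file named after the accession number, as the question asks.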
Q4) Write a program that asks the user for a file containing a nucleotide sequence AND the name of a restriction enzyme. Your program should return the positions in the sequence where the enzyme cuts. Parse out the enzymes and their cut sites from the attached RestrictionEnzymes.txt file.
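A rough sketch for Q4 follows. The layout of RestrictionEnzymes.txt is not shown here, so the parser below assumes one enzyme per line, with the name followed by whitespace and a recognition site in which "^" marks the cut (for example "EcoRI G^AATTC"); if the real file differs, parse_enzymes would need adjusting. The sketch also assumes that a position means the 1-based index of the base immediately before the cut, and it searches the given strand only. All function names are illustrative.

```python
import re

# IUPAC nucleotide codes mapped to regular-expression character classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]", "B": "[CGT]", "D": "[AGT]",
    "H": "[ACT]", "V": "[ACG]", "N": "[ACGT]",
}


def parse_enzymes(text):
    """Map enzyme name -> site string such as 'G^AATTC'.

    Assumed (hypothetical) file layout: 'NAME SITE' per line, with '^'
    marking the cut. Lines that do not fit are skipped.
    """
    enzymes = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) >= 2 and "^" in fields[1]:
            enzymes[fields[0]] = fields[1].upper()
    return enzymes


def cut_positions(sequence, site):
    """Return 1-based positions of the base just before each cut."""
    offset = site.index("^")               # bases to the left of the cut
    bases = site.replace("^", "")
    # A lookahead lets overlapping occurrences of the site be found.
    pattern = re.compile("(?=(" + "".join(IUPAC[b] for b in bases) + "))")
    return [m.start() + offset for m in pattern.finditer(sequence.upper())]
```

For a lookup by name, the program would read the enzyme file once with parse_enzymes, check that the name the user typed is a key of the resulting dictionary, and pass the matching site to cut_positions together with the sequence read from the user's file.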
Q5) Read in a whole genome (in FASTA format – file called genome.txt, see attached) and compute the background codon frequencies. The background frequency of a codon is computed by the formula:
background_frq(codon) = 100 * N(codon)/ Total_codons
where N(codon) is the number of occurrences of the codon across the entire genome, and Total_codons is the total number of codons in the whole genome. Print out the background frequency of each codon, from AAA to TTT. Use a dictionary in your solution.
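The formula above can be sketched in code as shown below. The assignment does not say how the genome should be split into codons, so this sketch assumes non-overlapping triplets read from the first base of the concatenated sequence, and it assumes that triplets containing letters other than A, C, G and T are left out of both N(codon) and Total_codons. The function names are illustrative.

```python
from itertools import product


def read_genome(path):
    """Concatenate all sequence lines of a FASTA file, skipping '>' headers."""
    with open(path) as handle:
        return "".join(line.strip() for line in handle
                       if not line.startswith(">")).upper()


def background_frequencies(sequence):
    """Return {codon: 100 * N(codon) / Total_codons} for all 64 codons."""
    # Pre-fill the dictionary so every codon from AAA to TTT appears,
    # in alphabetical order, even when its count is zero.
    counts = {"".join(c): 0 for c in product("ACGT", repeat=3)}
    sequence = sequence.upper()
    for i in range(0, len(sequence) - 2, 3):
        codon = sequence[i:i + 3]
        if codon in counts:                # skip triplets with N, etc.
            counts[codon] += 1
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no complete codons found")
    return {codon: 100 * n / total for codon, n in counts.items()}


def print_frequencies(freqs):
    for codon, value in freqs.items():
        print(f"{codon}\t{value:.3f}")
```

Calling print_frequencies(background_frequencies(read_genome("genome.txt"))) would print one line per codon. Because dictionaries keep insertion order, the output runs from AAA to TTT without an explicit sort.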