Use WU-blastn or NCBI’s blastn (the Sequence Identification Application under the Applications link on CU’s Bioinformatics web page).
a. Human ornithine aminotransferase (OAT), nuclear gene encoding mitochondrial protein
BASE COUNT 584 a 373 c 449 g 624 t
ORIGIN
1 gcgtgtaccc ggttgtcctc aggcgctgtc agatctgtgg tttttctact tgaaggacac
61 aatgttttcc aaactagcac atttgcagag gtttgctgta cttagtcgcg gagttcattc 121 ttcagtggct tctgctacat ctgttgcaac taaaaaaaca gtccaaggcc ctccaacctc 181 tgatgacatt tttgaaaggg aatataagta tggtgcacac aactaccatc ctttacctgt 241 agccctggag agaggaaaag gtatttactt atgggatgta gaaggcagaa aatattttga 301 cttcctgagt tcttacagtg ctgtcaacca agggcattgt caccccaaga ttgtgaatgc 361 tctgaagagt caagtggaca aattgacctt aacatctaga gctttctata ataacgtact 421 tggtgaatat gaggagtata ttactaaact tttcaactac cacaaagttc ttcctatgaa 481 tacaggagtg gaggctggag agactgcctg taaactagct cgtaagtggg gctataccgt 541 gaagggcatt cagaaataca aagcaaagat tgtttttgca gctgggaact tctggggtag 601 gacgttgtct gctatctcca gttccacaga cccaaccagt tacgatggtt ttggaccatt 661 tatgccggga ttcgacatca ttccctataa tgatctgccc gcactggagc gtgctcttca 721 ggatccaaat gtggctgcgt tcatggtaga accaattcag ggtgaagcag gcgttgttgt 781 tccggatcca ggttacctaa tgggagtgcg agagctctgc accaggcacc aggttctctt 841 tattgctgat gaaatacaga caggattggc cagaactggt agatggctgg ctgttgatta 901 tgaaaatgtc agacctgata tagtcctcct tggaaaggcc ctttctgggg gcttataccc 961 tgtgtctgca gtgctgtgtg atgatgacat catgctgacc attaagccag gggagcatgg 1021 gtccacatac ggtggcaatc cactaggctg ccgagtggcc atcgcagccc ttgaggtttt 1081 agaagaagaa aaccttgctg aaaatgcaga caaattgggc attatcttga gaaatgaact 1141 catgaagcta ccttctgatg ttgtaactgc cgtaagagga aaaggattat taaatgctat 1201 tgtcattaaa gaaaccaaag attgggatgc ttggaaggtg tgtctacgac ttcgagataa 1261 tggacttctg gccaagccaa cccatggcga cattatcagg tttgcgcctc cgctggtgat 1321 caaggaggat gagcttcgag agtccattga aattattaac aagaccatct tgtctttctg 1381 agggtagcca gctgttttca gtggtccctg ggagccagct ggagacaggt ggtcctgtaa 1441 aagctttatt cctaatgtgg gcacattcca ctcccatgag tcttcaaaaa cttttttttt 1501 gaatatattt ttttcagttg atacataata gaacaacgtt tatgaacctg ccgtttgctt 1561 tgtaacgtaa ctaaataatg taatggcatc tatattcagt tgaagtgttt tgatgtgcat 1621 gtgtacttcc taaggtgaaa tgcatctata tacagacagc ctctaaatca agtccttcag 1681 tataattgat atatgttttt ataatttcct cactggtata agtgtttcat atttgaaaaa 1741 gttatctctg ggtattgcat 
aaaaggcttc atcttataaa gtgaaatcat tgttattgaa 1801 ttttaggaag gattaatggt taagtgtata taaaatacta atattaagta aacttcatat 1861 tggccaacac cagggttgta ttctatggat gtcattattt tgaattaaga attagtgttt 1921 aacattccta aattgttttg agtgcttgat tataatttgt aaaaaatgtt tattttcaat 1981 acttctttaa atttaaaata aagcttatat ttcaaaaaaa aaaaaaaaaa
b. YEL076C gene in chromosome V of Saccharomyces cerevisiae.
The sequence is too long to copy into this link. For the correct answer, copy and paste (or type) the following into the search box:
YEL076C AND Saccharomyces cerevisiae AND chromosome V
and click on “GO”.
c. Homo sapiens herpesvirus entry protein B (HVEB) mRNA
BASE COUNT 388 a 623 c 600 g 317 t
ORIGIN
1 gagcagaaca gggaggctag agcgcagcgg gaaccggccc ggagccggag ccggagcccc 61 acaggcacct actaaaccgc ccagccgatc ggcccccaca gagtggcccg cgggcctccg 121 gccgggccca gtcccctccc gggccctcca tggcccgggc cgctgccctc ctgccgtcga 181 gatcgccgcc gacgccgctg ctgtggccgc tgctgctgct gctgctcctg gaaaccggag 241 cccaggatgt gcgagttcaa gtgctacccg aggtgcgagg ccagctcggg ggcaccgtgg 301 agctgccgtg ccacctgctg ccacctgttc ctggactgta catctccctg gtgacctggc 361 agcgcccaga tgcacctgcg aaccaccaga atgtggccgc cttccaccct aagatgggtc 421 ccagcttccc cagcccgaag cctggcagcg agcggctgtc cttcgtctct gccaagcaga 481 gcactgggca agacacagag gcagagctcc aggacgccac gctggccctc cacgggctca 541 cggtggagga cgagggcaac tacacttgcg agtttgccac cttccccaag gggtccgtcc 601 gagggatgac ctggctcaga gtcatagcca agcccaagaa ccaagctgag gcccagaagg 661 tcacgttcag ccaggaccct acgacagtgg ccctctgcat ctccaaagag ggccgcccac 721 ctgcccggat ctcctggctc tcatccctgg actgggaagc caaagagact caggtgtcag 781 ggaccctggc cggaactgtc actgtcacca gccgcttcac cttggtgccc tcgggccgag 841 cagatggtgt cacggtcacc tgcaaagtgg agcatgagag cttcgaggaa ccagccctga 901 tacctgtgac cctctctgta cgctaccctc ctgaagtgtc catctccggc tatgatgaca 961 actggtacct cggccgtact gatgccaccc tgagctgtga cgtccgcagc aacccagagc 1021 ccacgggcta tgactggagc acgacctcag gcaccttccc gacctccgca gtggcccagg 1081 gctcccagct ggtcatccac gcagtggaca gtctgttcaa taccaccttc gtctgcacag 1141 tcaccaatgc cgtgggcatg ggccgcgctg agcaggtcat ctttgtccga gaaaccccca 1201 gggcctcgcc ccgagatgtg ggcccgctgg tgtggggggc cgtggggggg acactgctgg 1261 tgctgctgct tctggctggg gggtccttgg ccttcatcct gctgagggtg aggaggagga 1321 ggaagagccc tggaggagca ggaggaggag ccagtggcga cgggggattc tacgatccga 1381 aagctcaggt gttgggaaat ggggaccccg tcttctggac accagtagtc cctggtccca 1441 tggaaccaga tggcaaggat gaggaggagg aggaggagga agagaaggca gagaaaggcc 1501 tcatgttgcc tccaccccca gcactcgagg atgacatgga gtcccagctg gacggctccc 1561 tcatctcacg gcgggcagtt tatgtgtgac ctggacacag acagagacag agccaggccc 1621 ggccctcccg cccccgacct gaccacgccg gcctagggtt ccagactggt tggacttgtt 1681 cgtctggacg acactggagt 
ggaacactgc ctcccacttt cttgggactt ggagggaggt 1741 ggaacagcac actggacttc tcccgtctct agggctgcat ggggagcccg gggagctgag 1801 tagtggggat ccagagagga cccccgcccc cagagacttg gttttggctc cagccttccc 1861 ctggccccgt gacactcagg agttaataaa tgccttggag gaaaacaaaa aaaaaaaaaa
1921 aaaaaaaa
d. Simian immunodeficiency virus of chimpanzee (Pan troglodytes), related to HIV.
The sequence is too long to copy into this link. For the correct answer, copy and paste (or type) the following into the search box:
Pan troglodytes AND immunodeficiency
and click on “GO”.
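The BASE COUNT lines in the GenBank records above can be checked against the pasted ORIGIN text before running a search. The following is a minimal sketch; the function names are illustrative and not part of any tool mentioned here. It counts a, c, g and t while skipping the position numbers and spaces of the ORIGIN layout.

```python
from collections import Counter


def base_count(origin_block: str) -> Counter:
    """Count a/c/g/t in GenBank ORIGIN text, ignoring numbers and whitespace."""
    return Counter(ch for ch in origin_block.lower() if ch in "acgt")


def matches_header(origin_block: str, a: int, c: int, g: int, t: int) -> bool:
    """Compare the counted bases with the values from a BASE COUNT line."""
    counts = base_count(origin_block)
    return (counts["a"], counts["c"], counts["g"], counts["t"]) == (a, c, g, t)


if __name__ == "__main__":
    # Toy ORIGIN fragment (not one of the sequences above).
    toy = "1 acgtacgtaa\n11 ttg"
    print(base_count(toy))
```

A mismatch between the counted bases and the BASE COUNT header usually means that part of the sequence was lost while copying.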
2. Find the possible origin and gene function of the following nucleotide sequences:
Use WU-blastn or NCBI’s blastn (Sequence Identification Application).
a.
Sequence 1
CACGGTGATCAAAGTGAGAATGAGCTCCCAGGATTGGGGGGCAAGGAAGATAGGAGGGTCAAACAGAGTCGGGGAGAAGCCAGGGAGAGCTACAGAGAAACCGGGTCCAGCAGAGCAAGTGATGAGAGAGCTGCCCATCTTCCAACCAGCACACCCCTAGACATTGACACTGCATCGGAGTCAGGCCAAGATCCGCAGGACAGTCGAAGGTCAGCTGACGCCCTGCTCAGGCTGCAAGCCATGGCAGGAATCTTGGAGGAACAAGGCTCAGACACGGACACCCCTAGGGTG
Measles Virus (NE-94N) nucleoprotein mRNA, length 1578
b.
Sequence 2
CCCTGTGGAGCCACACCCTAGGGTTGGCCAATCTACTCCCAGGAGCAGGGAGGGCAGGAGCCAGGGCTGGGCATAAAAGTCAGGGCAGAGCCATCTATTGCTTACATTTGCTTCTGACACAACTGTGTTCACTAGCAACCTCAAACAGACACCATGGTGCACCTGACTCCTGAGGAGAAGTCTGCCGTTACTGCCCTGTGGGGCAAGGTGAACGTGGATGAAGTTGGTGGTGAGGCCCTGGGCAGGTTGGTATCAAGGTTACAAGACAGGTTTAAGGAGACCAATAGAAACTGGGCATGTGGAGACAGAGAAGACTCTTGGGTTTCTGATAGGCACTGACTCTCTCTGCCTATTGGTCTATTTTCCCACCCTTAGGCTGCTGGTGATCTACCCTTGGACCCAGAGGTTCTTTGAGTCCTTTGGGGATCTGTCCACTCCTGATGCTGTTATGGG
Human beta-globin gene from a thalassemia patient, length 3703
c.
Sequence 3
AGTACGGTACACCAATTATTTTGAAGGCCGCTTATGGAGGAGGTGGTCGTGGAATTCGTCGTGTTGATAAATTGGAAGAAGTTGAAGAGGCATTCCGTAGATCCTACTCGGAAGCTCAAGCTGCGTTTGGAGACGGAAGTCTTTTCGTTGAAAAGTTTGTTGAGAGACCAAGACATATTGAAGTTCAGCTGCTTGGAGACCATCATGGAAATATTGTTCATTTGTATGAGCGTGATTGTTCAGTGCAACGTCGTCATCAAAAGGTTGTTGAAATTGCTCCAGCGCCAGCTCTCCCAGAAGGTGTTCGTGAGAAAATTTTGGCAGACGCTCTTCGACTTGCAAGACATGTTGGATACCAAAATGCTGGTACAGTCGAATTCCTGGTTGATCAGAAGGGCAACTACTATTTCATCGAAGTGAATGCACGTCTTCAAGTCGAGCATACAGTAACTGAAGAGATCACTGGTGTGGATCTTGTCCAAGCTCAAATTCGTATCGCCGAAGGAAAATCTCTGGATGATCTGAAGCTTTCACAGGAAACTATTCAAACTACTGGCTCAGCTATTCAATGTCGTGTCACAACTGAAGATCCAGCTAAAGGATTCCAGCCAGATTCCGGAAGAATTGAAGTTTTCCGATCCGGAGAGGGAATGGGAATTCGTCTTGATTCAGCATCTGCCTTCGCAGGATCAGTCATTTCACCTCACTACGATTCTTTGATGGTCAAAGTAATTGCATCGGCTAGAAATCATCCGAACGCTGCCGCAAAAATGATTCGTGCCCTCAAAAAGTTCCGTATCCGAGGCGTAAAGACAAACATCCCATTTCTGCTCAACGTTCTTCGCCAGCCCAGCTTCCTTGATGCATCCGTCGATACGTATTTCATTGATGAGCATCCAGAGTTGTTCCAATTCAAACCAAGCCAAAACCGTGCTCAAAAGTTGTTGAACTATTTGGGAGAAGTTAAGGTGAACGGTCCAACTACTCCTCTTGCTACTGACCTGAAACCAGCAGTTGTTT
Caenorhabditis elegans pyruvate carboxylase (pyc-1) mRNA, length 3864
3. Align the following sequences (single sequence alignment):
a.
Sequence 1
CAGGGGCAGTGCGGGAGGTGCTGGGCTTTCTCAACAGCTGAGGTGATTTCCGATCGAACATGTATTGCAA
GCAATGGTACCCAACAACCAATCATCTCCCCAACTGATCTGCTCACTTGTTGTGGAATGTCATGCGGAGA
GGGCTGTAACGGCGGCTAT
b.
Sequence 2
ATGTCTACTATTCCATCAGAAATAATCAATTGGACAATTTTGAACGAGATAATTTCCATGGATGACGATG
ACAGTGATTTTAGCAAAGGATTAATCATTCAATTCATAGATCAGGCTCAAACCACGTTTGCACAAATGCA
GAGACAACTAGACGGCGAAAAGAATCTTACTGAACTGGATAACTTGGGGCATTTCTTAAAAGGATCGTCT
GCCGCGCTCGGCTTGCAAAGGATCGCTTGGGTTTGTGAGCGTATTCAGAATTTAGGGAGAAAGATGGAAC
ACTTTTTCCCTAACAAAACAGAACTAGTAAATACCCTTTCAGATAAGTCCATTATAAATGGAATCAACAT
TGACGAGGATGATGAAGAAATAAAAATTCAAGTCGACGATAAAGATGAAAATAGTATCTATTTGATTTTA
ATAGCAAAGGCCCTGAACCAATCTCGATTGGAGTTTAAATTAGCTAGAATTGAACTATCAAAGTACTATA
ATACTAATCTT
c.
Sequence 3
ATTCTTTTGAGTCGGGAGAACTAGGTAACAATTCGGAAACTCCAAAGGGTGGATGAGGGGCGCGCGGGGT
GTGTGTGGGGGATACTCTGGTCCCCCGTGCAGTGACCTCTAAGTCAGAGGCTGGCACACACACACCTTCC
ATTTTTTCCCAACCGCAGGATGGCGCCTCATCCCTTGGATGCGCTCACCATCCAAGTGTCCCCAGAGACA
CAACAACCTTTTCCCGGAGCCTCGGACCACGAAGTGCTCAGTTCCAATTCCACCCCACCTAGCCCCACTC
TCATACCTAGGGACTGCTCCGAAGCAGAAGTGGGTGACTGCCGAGGGACCTCGAGGAAGCTCCGCGCCCG
ACGCGGAGGGCGCAACAGGCCCAAGAGCGAGTTGGCACTCAGCAAACAGCGAAGAAGCCGGCGCAAGAAG
GCCAATGATCGGGAGCGCAATCGCATGCACAACCTCAACTCGGCGCTGGATGCGCTGCGCGGTGTCCTGC
CCACCTTCCCGGATGACGCCAAACTTACAAAGATCGAGACCCTGCGCTTCGCCCACAACTACATCTGGGC
ACTGACTCAGACGCTGCGCATAGCGGACCACAGCTTCTATGGCCCGGAGCCCCCTGTGCCCTGTGGAGAG
CTGGGGAGCCCCGGAGGTGGCTCCAACGGGGACTGGGGCTCTATCTACTCCCCAGTCTCCCAAGCGGGTA
ACCTGAGCCCCACGGCCTCATTGGAGGAATTCCCTGGCCTGCAGGTGCCCAGCTCCCCATCCTATCTGCT
CCCGGGAGCACTGGTGTTCTCAGACTTCTTGTGAAGAGACCTGTCTGGCTCTGGGTGGTGGGTGCTAGTG
GAAAGGGAGGGGACCACAGCC
4. Translate the following nucleotide sequences in 3 reading frames:
Use the translation link in the Applications page of CU’s Bioinformatics web site. Cut and paste each sequence into the box provided and select the 3 phase option.
a. Sequence 1
CACGGTGATCAAAGTGAGAATGAGCTCCCAGGATTGGGGGGCAAGGAAGATAGGAGGGTCAAACAGAGTCGGGGAGAAGCCAGGGAGAGCTACAGAGAAACCGGGTCCAGCAGAGCAAGTGATGAGAGAGCTGCCCATCTTCCAACCAGCACACCCCTAGACATTGACACTGCATCGGAGTCAGGCCAAGATCCGCAGGACAGTCGAAGGTCAGCTGACGCCCTGCTCAGGCTGCAAGCCATGGCAGGAATCTTGGAGGAACAAGGCTCAGACACGGACACCCCTAGGGTG
Phase 2 gives the following sequence of amino acids. The highlighted section is the one “most likely to be correct”. However, an alignment should be run to check the phase.
TVIKVRMSSQDWGARKIGGSNRVGEKPGRATEKPGPAEQVMRELPIFQPAHP*TLTLHRSQAKIRRTVEGQLTPCSGCKPWQESWRNKAQTRTPLG
b.
Sequence 2
CAGGGGCAGTGCGGGAGGTGCTGGGCTTTCTCAACAGCTGAGGTGATTTCCGATCGAACATGTATTGCAA
GCAATGGTACCCAACAACCAATCATCTCCCCAACTGATCTGCTCACTTGTTGTGGAATGTCATGCGGAGA
GGGCTGTAACGGCGGCTAT
Phase 1
QGQCGRCWAFSTAEVISDRTCIASNGTQQPIISPTDLLTCCGMSCGEGCNGGY
Phase 2
RGSAGGAGLSQQLR*FPIEHVLQAMVPNNQSSPQLICSLVVECHAERAVTAA
Phase 3
GAVREVLGFLNS*GDFRSNMYCKQWYPTTNHLPN*SAHLLWNVMRRGL*RRL
c.
Sequence 3
CCCTGTGGAGCCACACCCTAGGGTTGGCCAATCTACTCCCAGGAGCAGGGAGGGCAGGAGCCAGGGCTGGGCATAAAAGTCAGGGCAGAGCCATCTATTGCTTACATTTGCTTCTGACACAACTGTGTTCACTAGCAACCTCAAACAGACACCATGGTGCACCTGACTCCTGAGGAGAAGTCTGCCGTTACTGCCCTGTGGGGCAAGGTGAACGTGGATGAAGTTGGTGGTGAGGCCCTGGGCAGGTTGGTATCAAGGTTACAAGACAGGTTTAAGGAGACCAATAGAAACTGGGCATGTGGAGACAGAGAAGACTCTTGGGTTTCTGATAGGCACTGACTCTCTCTGCCTATTGGTCTATTTTCCCACCCTTAGGCTGCTGGTGATCTACCCTTGGACCCAGAGGTTCTTTGAGTCCTTTGGGGATCTGTCCACTCCTGATGCTGTTATGGG
Phase 1
PCGATP*GWPIYSQEQGGQEPGLGIKVRAEPSIAYICF*HNCVH*QPQTDTMVHLTPEEKSAVTALWGKVNVDEVGGEALGRLVSRLQDRFKETNRNWACGDREDSWVSDRH*LSLPIGLFSHP*AAGDLPLDPEVL*VLWGSVHS*CCYG
Phase 2
PVEPHPRVGQSTPRSREGRSQGWA*KSGQSHLLLTFASDTTVFTSNLKQTPWCT*LLRRSLPLLPCGAR*TWMKLVVRPWAGWYQGYKTGLRRPIETGHVETEKTLGFLIGTDSLCLLVYFPTLRLLVIYPWTQRFFESFGDLSTPDAVM
Phase 3
LWSHTLGLANLLPGAGRAGARAGHKSQGRAIYCLHLLLTQLCSLATSNRHHGAPDS*GEVCRYCPVGQGERG*SWW*GPGQVGIKVTRQV*GDQ*KLGMWRQRRLLGF**ALTLSAYWSIFPPLGCW*STLGPRGSLSPLGICPLLMLLW
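The three-phase translations above can be reproduced with a short script. This is a minimal sketch, assuming the standard genetic code and the convention used in the answers (phase 1 starts at the first base, stop codons are written as *). The function names are illustrative and are not those of the translation tool.

```python
# Standard genetic code, codons ordered with bases T, C, A, G.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(seq: str, phase: int) -> str:
    """Translate one forward reading frame; phase is 1, 2 or 3."""
    seq = "".join(seq.split()).upper()
    residues = []
    for i in range(phase - 1, len(seq) - 2, 3):
        # Codons with ambiguous bases are reported as X.
        residues.append(CODON_TABLE.get(seq[i:i + 3], "X"))
    return "".join(residues)


def three_phases(seq: str) -> dict:
    """Return the translations for phases 1, 2 and 3."""
    return {phase: translate(seq, phase) for phase in (1, 2, 3)}


if __name__ == "__main__":
    # First 24 bases of Sequence 2.
    for phase, protein in three_phases("CAGGGGCAGTGCGGGAGGTGCTGG").items():
        print(f"Phase {phase}: {protein}")
```

Only the three forward frames are covered, matching the question; the reverse-complement frames would need an extra step.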
5. Find three article/journal citations for each of the following topics:
Use the link to Journal Retrieval in the Applications section of CU’s Bioinformatics web page.
a. Nanotechnology in medicine and the biological sciences.
Search PubMed in NCBI and type nanotechnology in the search box provided.

Biomedical microdevices and nanotechnology.
Trends Biotechnol. 2001 Feb;19(2):38-9. No abstract available.
[MEDLINE record in process]
PMID: 11252264

Nanotechnology. Molecular movers and shakers.
Nature. 2000 Dec 21-28;408(6815):904. No abstract available.
PMID: 11140655

Is nanotechnology dangerous?
Science. 2000 Nov 24;290(5496):1526-7. No abstract available.
PMID: 11185512
b.
Human Immunodeficiency Virus (HIV).
Type HIV in the search box and click ‘GO’. There are many interesting articles, including:

Freitag C, Chougnet C, Schito M, Near KA, Shearer GM, Li C, Langhorne J, Sher A.
Malaria Infection Induces Virus Expression in Human Immunodeficiency Virus Transgenic Mice by CD4 T Cell-Dependent Immune Activation.
J Infect Dis. 2001 Apr 15;183(8):1260-1268.
[Record as supplied by publisher]
PMID: 11262209

Dezzutti CS, Guenthner PC, Cummins Jr JE, Cabrera T, Marshall JH, Dillberger A, Lal RB.
Cervical and Prostate Primary Epithelial Cells Are Not Productively Infected but Sequester Human Immunodeficiency Virus Type 1.
J Infect Dis. 2001 Apr 15;183(8):1204-1213.
[Record as supplied by publisher]
PMID: 11262202

Glucose disorders associated with HIV and its drug therapy.
Ann Pharmacother. 2001 Mar;35(3):343-51.
[MEDLINE record in process]
PMID: 11261533
6. Does the restriction enzyme XhoI cut this sequence? If so, how many times does it cut, where does it cut (give the approximate base pair region), and does the cut leave blunt ends or staggered ends?
Go to Max Heiman’s Webcutter 2.0 site (reachable from the restriction enzyme mapping link on the Applications page of CU’s Bioinformatics web page).
Cut and paste the desired sequence into the box and select:
Linear sequence
Map of restriction sites
All enzymes (w/ rainbow highlights is helpful and easier to look at)
Only the following enzyme: XhoI
ATTCTTTTGAGTCGGGAGAACTAGGTAACAATTCGGAAACTCCAAAGGGTGGATGAGGGGCGCGCGGGGT
GTGTGTGGGGGATACTCTGGTCCCCCGTGCAGTGACCTCTAAGTCAGAGGCTGGCACACACACACCTTCC
ATTTTTTCCCAACCGCAGGATGGCGCCTCATCCCTTGGATGCGCTCACCATCCAAGTGTCCCCAGAGACA
CAACAACCTTTTCCCGGAGCCTCGGACCACGAAGTGCTCAGTTCCAATTCCACCCCACCTAGCCCCACTC
TCATACCTAGGGACTGCTCCGAAGCAGAAGTGGGTGACTGCCGAGGGACCTCGAGGAAGCTCCGCGCCCG
ACGCGGAGGGCGCAACAGGCCCAAGAGCGAGTTGGCACTCAGCAAACAGCGAAGAAGCCGGCGCAAGAAG
GCCAATGATCGGGAGCGCAATCGCATGCACAACCTCAACTCGGCGCTGGATGCGCTGCGCGGTGTCCTGC
CCACCTTCCCGGATGACGCCAAACTTACAAAGATCGAGACCCTGCGCTTCGCCCACAACTACATCTGGGC
ACTGACTCAGACGCTGCGCATAGCGGACCACAGCTTCTATGGCCCGGAGCCCCCTGTGCCCTGTGGAGAG
CTGGGGAGCCCCGGAGGTGGCTCCAACGGGGACTGGGGCTCTATCTACTCCCCAGTCTCCCAAGCGGGTA
ACCTGAGCCCCACGGCCTCATTGGAGGAATTCCCTGGCCTGCAGGTGCCCAGCTCCCCATCCTATCTGCT
CCCGGGAGCACTGGTGTTCTCAGACTTCTTGTGAAGAGACCTGTCTGGCTCTGGGTGGTGGGTGCTAGTG
GAAAGGGAGGGGACCACAGCC
XhoI cuts once, between base pairs 301 and 375 (at position 330), and leaves staggered (sticky) ends: cutting c/tcgag on both strands produces a 4-base 5' overhang.
XhoI
gaagcagaagtgggtgactgccgagggacctcgaggaagctccgcgcccgacgcggagggcgcaacaggccc
aacttcgtcttcacccactgacggctccctggagctccttcgaggcgcgggctgcgcctcccgcgttgtccgggttc
Enzyme name   No. cuts   Positions of sites   Recognition sequence
XhoI          1          330                  c/tcgag
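The Webcutter result can be cross-checked by hand. The sketch below is not Webcutter's implementation and the function names are illustrative. It finds every occurrence of a recognition site using 1-based positions and classifies the ends from the cut notation (for example c/tcgag), assuming a palindromic site that is cut at the same offset on both strands.

```python
def find_sites(seq: str, site: str) -> list:
    """Return the 1-based start positions of every occurrence of site in seq."""
    seq = "".join(seq.split()).upper()
    site = site.upper()
    positions = []
    start = seq.find(site)
    while start != -1:
        positions.append(start + 1)
        start = seq.find(site, start + 1)  # step by one to allow overlaps
    return positions


def end_type(pattern: str) -> str:
    """Classify the ends left by a palindromic site written like 'c/tcgag'."""
    cut = pattern.index("/")       # bases to the left of the top-strand cut
    length = len(pattern) - 1      # site length without the slash
    if 2 * cut == length:
        return "blunt"
    overhang = abs(length - 2 * cut)
    side = "5'" if 2 * cut < length else "3'"
    return f"staggered, {overhang}-base {side} overhang"


if __name__ == "__main__":
    fragment = "GGACCTCGAGGAAG"  # short stretch around the XhoI site
    print(find_sites(fragment, "ctcgag"), end_type("c/tcgag"))
    print(end_type("ccc/ggg"))
```

A cut in the middle of the site (ccc/ggg) gives blunt ends; a cut nearer the 5' end of the site, as in c/tcgag, leaves a 5' overhang.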
7. Determine the restriction digest map of the following sequence with restriction enzymes that recognize six base pair sequences.
Go to Max Heiman’s Webcutter 2.0 site (reachable from the restriction enzyme mapping link on the Applications page of CU’s Bioinformatics web page).
Cut and paste the desired sequence into the box and select:
Linear sequence
Map of restriction sites
All enzymes (w/ rainbow highlights is helpful and easier to look at)
Only enzymes with recognition sites greater than or equal to 6 bases long.
Click analyze sequence when ready.
>gi|7524758|ref|NC_001776.1||
Oryza sativa mitochondrial plasmid B2, complete sequence
CCGAGGTACCAGAGGATAGCCTTTAGGATATCCGTCGGTAGGAGATATCGAGTCGTTAGAGCACATATCT
CCGTGATCCCTTCCACAGCTAAACGTAGATCTCTATCTTGTAGTGGAGTGTAGAGTAAAGATATACCGGA
GTGGAGTGCAAGAAGGTTATAGGGGAGTGTGGCGCCCGTTGCCCGCCTTTCGTCTCGGTTGAAAAACAAA
ACTGTTGCTTGTTTTGTTCATTCAAGTAGTGGTTTAGCCCTCGAATCTATAGCGAAAGAGTTAGCTCAAC
CTATCTACTACTACCCTTGGAAAGGGGGTGTGTGTGGTATACCCGTTACGACTCCCCTCGAACTGAGCCA
TTTTCATGATTTTACAATCAAAGAAAGTTCCATACTTTTTTATTCCCGGTATGGAGCCAACGTCGCTTCG
CTTTCGCAGGACAATTTCCTTCTGGAAAAATATCCAGTGGCATATCCCATCTAGCTCTGGGAATCTTTCC
ACAACCAGCGCCTTCTTGTCCGATGAAGTTCAAGTCTCTAACCTGTTCTTGCCCGGGAATGTCCCTCCTT
CGAACTCGAAGTCAGAGTTTTCCGGTTATCCGATGGGCGATTTATGATAGTTCCGATGCACTTCTCATGC
TAGCCTATCGACAAGATCAGGGTTCAACTACTGTTGCCTCAACGGGTTCGCTGGTCACTTTAACAGTTTT
TTAAGCAGGCATTTTCGTGGTTCTCGTCATATAAACCAGTCAGGTACTAGCGTGATTTCATGCATTAATC
TATTGCTCGAAGGAACCTTTCGTCAGAAAACTTATTCAAAAGGTACCCGGTGTTCTAGCGCATTTCTGTT
AAGAGAAGTACCCCTTGTAGGCCTGAGATGCCCTCTATCGAGCGATATAACGACGGTTGCTATTTTTAGG
TACACGTGTTTTTTGAAGAAAGCACTTTTAAAAAGGAAAGTAGTAAAATGAAAATACTCAATTTAATAAA
TCCTAAATTATCTGTGGCATTGAATTATGAAAGTAAACACTGAAAGTTTACCACTTACAAAAGTAAATGT
ACTACCCGACTAAAAGGAGGAAATCCAATTAGAGGTGTAAACTCTTTTCTTTTCTATCGAGTAGATTCCC
TTTTATTTATAAATAGTAAACTAACGAATAGATAATTAGTATTCCCTCATATAATTAGTACTAGCTGTTC
CTTCATTAATGGTACGTGTATTCACGAGGGCTCTGCTTCGCTCTATTCCACCTCCCGGGTGTATGGAAAA
ATCTTTAAAAGAAATAAATCATGCGGTACTTCTTGTTTTAATCACTCTTCCTTTCCTCCCGGAACTGGGG
AAGAACCCTTCTGAAAAGGAATGGAGGTCTATCTTTACTCGGAGTGCAACGCAGAGATAGAGATCGAAGA
GCCGCTGTGCTTGGGGACGAGGGGACTCACTCAACCTCAAAGTTGAGGGAATCCCGAAGCCTCGGAATCT
AGCGTTAGCGTAAGG
SmaI cuts twice, once in the region 526-600 and again in the region 1201-1275. It leaves blunt ends; the recognition sequence is ccc/ggg.
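A map for several six-base cutters can be sketched in the same spirit. The enzyme panel below is a small illustrative selection of well-known recognition sequences, not Webcutter's enzyme database, and restriction_map is a hypothetical helper name.

```python
import re

# Illustrative panel of six-base cutters (recognition sequences only).
SIX_CUTTERS = {
    "BamHI": "GGATCC",
    "EcoRI": "GAATTC",
    "HindIII": "AAGCTT",
    "KpnI": "GGTACC",
    "SmaI": "CCCGGG",
    "XhoI": "CTCGAG",
}


def restriction_map(seq: str, enzymes: dict = SIX_CUTTERS) -> dict:
    """Map each enzyme that cuts to the 1-based start positions of its sites."""
    seq = "".join(seq.split()).upper()
    result = {}
    for name, site in enzymes.items():
        # The lookahead reports overlapping occurrences as well.
        hits = [m.start() + 1 for m in re.finditer(f"(?={site})", seq)]
        if hits:
            result[name] = hits
    return result


if __name__ == "__main__":
    # Toy sequence containing one SmaI site and one KpnI site.
    print(restriction_map("TTGCCCGGGAATGGTACCAA"))
```

Pasting the full plasmid sequence into restriction_map gives the start position of each site, which can be compared with the regions reported by Webcutter.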
8. Convert the following protein sequences back to their nucleotide sequences using the correct codon usage table. For example, use the Homo sapiens codon usage table if the protein sequence corresponds to a human gene product.
Use the Backtranslation page in the Applications section of CU’s Bioinformatics web page.
>gi|126001|sp|P00709|LCA_HUMAN
ALPHA-LACTALBUMIN PRECURSOR (LACTOSE SYNTHASE B PROTEIN)
MRFFVPLFLVGILFPAILAKQFTKCELSQLLKDIDGYGGIALPELICTMFHTSGYDTQAIVENNESTEYGL
FQISNKLWCKSSQVPQSRNICDISCDKFLDDDITDDIMCAKKILDIKGIDYWLAHKALCTEKLEQWLCEKL
ATG AGA TTC TTC GTA CCT CTA TTT CTC GTGGGA ATA CTG TTT CCG GCA ATA CTG GCG AAACAG TTC ACA AAG TGC GAA TTA TCT CAA TTATTA AAA GAC ATA GAT GGC TAC GGA GGA ATCGCA CTT CCC GAA CTT ATA TGT ACG ATG TTTCAC ACA TCC GGT TAT GAT ACC CAA GCC ATCGTC GAA AAT AAT GAG AGC ACT GAA TAC GGGCTC TTT CAG ATT AGT AAC AAG CTT TGG TGTAAA TCG AGT CAA GTT CCA CAG AGT AGA AACATA TGC GAT ATC TCA TGC GAC AAA TTC TTGGAC GAT GAC ATT ACA GAT GAC ATC ATG TGCGCA AAA AAG ATT CTA GAT ATT AAA GGA ATAGAC TAT TGG TTG GCA CAT AAG GCT TTG TGTACA GAG AAG CTG GAG CAA TGG CTA TGT GAGAAG CTC
>gi|2119314|pir||I61690
myosin - human (fragment)
CVIISGESGAGKTVAAKYIMSYISRVSGGGTKVQHVKDIILQSNPLLEAFGNAKTVRNNNSSRFGKYFEI
QFSPGGEPDGGKISNFLLEK
TGT GTA ATT ATA TCT GGG GAA AGC GGC GCGGGT AAA ACA GTA GCC GCA AAG TAC ATC ATGAGT TAT ATA TCC AGA GTA AGT GGG GGT GGAACA AAG GTC CAG CAT GTG AAG GAT ATC ATTTTA CAA AGT AAC CCA TTA TTA GAG GCA TTCGGA AAT GCT AAG ACA GTT AGA AAT AAC AACAGT TCA AGA TTT GGC AAA TAT TTC GAG ATACAA TTT AGT CCA GGC GGA GAA CCA GAC GGTGGG AAA ATA TCG AAT TTT TTA TTA GAA AAA
>gi|746398|gb|AAA97505.1|
ampicillin-binding protein (E. coli)
MRFSRFIIGLTSCIAFSVQAANVDEYITQLPAGANLALMVQKVGASAPAIDYHSQQMALPASTQKVITAL
AALIQLGPDFRFTTTLETKGNVENGVLKGDLVARFGADPTLKRQDIRNMVATLKKSGVNQIDGNVLIDTS
IFASHDKAPGWPWNDMTQCFSAPPAAAIVDRNCFSVSLYSAPKPGDMAFIRVASYYPVTMFSQVRTLPRG
SAEAQYCELDVVPGDLNRFTLTGCLPQRSEPLPLAFAVQDGASYAGAILKDELKQAGITWSGTLLRQTQV
NEPGTVVASKQSAPLHDLLKIMLKKSDNMIADTVFRMIGHARFNVPGTWRAGSDAVRQILRQQAGVDIGN
TIIADGSGLSRHNLIAPATMMQVLQYIAQHDNELNFISMLPLAGYDGSLQYRAGLHQAGVDGKVSAKTGS
LQGVYNLAGFITTASGQRMAFVQYLSGYAVEPADQRNRRIPLVRFESRLYKDIYQNN
ATG AGG TTC AGC AGG TTC ATC ATT GGG CTTACA TCC TGC ATC GCC TTC AGT GTT CAG GCAGCA AAT GTG GAC GAG TAT ATA ACG CAA CTCCCT GCT GGG GCT AAC CTG GCA TTA ATG GTCCAA AAG GTC GGC GCA TCT GCA CCA GCG ATTGAT TAT CAT TCA CAG CAG ATG GCC TTG CCAGCG TCC ACG CAA AAG GTG ATT ACA GCG TTAGCG GCA CTC ATA CAG CTC GGC CCA GAT TTTCGG TTC ACT ACG ACC CTA GAG ACG AAA GGTAAT GTA GAG AAC GGA GTT TTA AAG GGA GATTTA GTT GCA AGA TTT GGG GCA GAC CCC ACACTG AAG AGG CAA GAC ATA AGG AAT ATG GTAGCG ACA CTG AAA AAA AGC GGA GTG AAT CAAATA GAC GGG AAC GTT CTA ATT GAT ACC AGTATT TTT GCA TCG CAT GAT AAA GCG CCT GGATGG CCG TGG AAT GAT ATG ACC CAA TGT TTCAGT GCC CCT CCC GCG GCC GCT ATA GTC GATCGT AAC TGT TTT TCA GTG TCA CTG TAT AGCGCG CCG AAG CCG GGT GAC ATG GCC TTT ATTAGA GTA GCC TCC TAC TAT CCG GTT ACC ATGTTT AGC CAA GTG AGA ACA CTC CCC CGC GGTTCG GCT GAA GCC CAG TAT TGC GAG TTA GATGTG GTT CCC GGA GAT CTA AAC CGC TTT ACCCTA ACT GGG TGT CTG CCA CAA CGT TCA GAACCC CTA CCT CTT GCT TTC GCT GTA CAG GACGGA GCC TCT TAC GCA GGT GCT ATC CTT AAAGAC GAA CTC AAG CAA GCA GGA ATC ACA TGGAGT GGG ACA CTG TTA AGA CAA ACA CAA GTCAAT GAA CCA GGT ACT GTA GTG GCG AGT AAGCAG TCC GCA CCA TTA CAT GAT CTC CTT AAGATT ATG TTA AAG AAA TCG GAC AAC ATG ATCGCT GAC ACT GTA TTC CGA ATG ATA GGT CACGCA CGC TTT AAT GTG CCG GGC ACG TGG CGTGCT GGG TCG GAC GCC GTA CGG CAG ATT CTGCGA CAA CAA GCG GGA GTC GAC ATC GGG AATACC ATA ATC GCG GAT GGC AGT GGT TTG TCTCGG CAT AAT CTT ATT GCT CCT GCA ACG ATGATG CAG GTT TTG CAA TAC ATC GCT CAG CACGAC AAC GAG TTG AAC TTT ATA TCT ATG TTGCCG TTA GCT GGC TAC GAT GGT TCC CTA CAGTAT CGC GCC GGC CTT CAC CAG GCC GGA GTCGAT GGC AAA GTC AGC GCG AAA ACT GGG TCGCTC CAG GGC GTT TAC AAT CTT GCG GGT TTCATA ACT ACA GCC TCT GGA CAA AGA ATG GCATTT GTA CAG TAT CTA AGT GGC TAT GCC GTCGAA CCT GCT GAT CAG AGA AAC CGA CGT ATACCC TTG GTA CGG TTC GAA TCA CGA TTG TACAAA GAC ATC TAC CAA AAT AAC
>gi|476966|pir||A47398
serotonin transporter - human
METTPLNSQKQLSACEDGEDCQENGVLQKVVPTPGDKVESGQISNGYSAVPSPGAGDDTRHSIPATTTTL
VAELHQGERETWGKKVDFLLSVIGYAVDLGNVWRFPYICYQNGGGAFLLPYTIMAIFGGIPLFYMELALG
QYHRNGCISIWRKICPIFKGIGYAICIIAFYIASYYNTIMAWALYYLISSFTDQLPWTSCKNSWNTGNCT
NYFSEDNITWTLHSTSPAEEFYTRHVLQIHRSKGLQDLGGISWQLALCIMLIFTVIYFSIWKGVKTSGKV
VWVTATFPYIILSVLLVRGATLPGAWRGVLFYLKPNWQKLLETGVWIDAAAQIFFSLGPGFGVLLAFASY
NKFNNNCYQDALVTSVVNCMTSFVSGFVIFTVLGYMAEMRNEDVSEVAKDAGPSLLFITYAEAIANMPAS
TFFAIIFFLMLITLGLDSTFAGLEGVITAVLDEFPHVWAKRRERFVLAVVITCFFGSLVTLTFGGAYVVK
LLEEYATGPAVLTVALIEAVAVSWFYGITQFCRDVKEMLGFSPGWFWRICWVAISPLFLLFIICSFLMSP
PQLRLFQYNYPYWSIILGYCIGTSSFICIPTYIAYRLIITPGTFKERIIKSITPETPTEIPCGDIRLNAV
ATG GAG ACG ACC CCG TTG AAC TCC CAA AAGCAG TTA AGT GCC TGT GAG GAT GGG GAG GACTGC CAG GAA AAT GGT GTC CTA CAA AAG GTGGTA CCT ACG CCT GGT GAC AAA GTA GAA TCAGGT CAA ATC TCT AAC GGC TAT TCA GCT GTTCCA AGT CCA GGT GCG GGC GAC GAC ACT AGACAT TCC ATA CCC GCG ACC ACG ACG ACC TTGGTA GCC GAA CTA CAT CAA GGT GAG AGA GAAACC TGG GGC AAG AAG GTG GAC TTC TTG CTAAGC GTG ATT GGC TAC GCA GTC GAT CTG GGAAAC GTT TGG CGT TTT CCA TAC ATC TGC TACCAG AAT GGC GGA GGG GCC TTT TTG TTG CCGTAC ACG ATC ATG GCA ATT TTC GGA GGG ATTCCC CTG TTT TAC ATG GAG CTC GCA TTA GGGCAG TAC CAT CGC AAT GGA TGC ATA TCT ATATGG AGG AAG ATT TGT CCA ATA TTC AAG GGCATT GGA TAT GCC ATT TGC ATA ATT GCG TTTTAC ATA GCA TCC TAT TAT AAC ACA ATC ATGGCT TGG GCA TTA TAT TAC TTA ATA TCG TCTTTT ACC GAC CAA TTA CCA TGG ACC AGT TGCAAA AAT AGT TGG AAC ACA GGG AAT TGC ACGAAT TAC TTT TCC GAG GAT AAT ATA ACT TGGACA TTA CAC AGT ACA TCA CCC GCA GAG GAATTT TAT ACT CGA CAC GTG CTT CAA ATT CATCGT TCG AAA GGC CTT CAG GAC CTG GGG GGAATT AGT TGG CAA CTT GCA CTA TGT ATC ATGCTT ATA TTC ACT GTA ATC TAT TTC AGC ATCTGG AAG GGG GTT AAA ACG TCG GGC AAG GTTGTT TGG GTT ACG GCT ACA TTT CCA TAT ATTATC CTG AGC GTG CTG CTA GTG CGA GGC GCTACT CTC CCC GGA GCC TGG AGA GGT GTA CTATTC TAC CTC AAG CCC AAT TGG CAG AAA CTTTTA GAG ACT GGT GTA TGG ATA GAT GCG GCCGCT CAA ATC TTT TTT TCA TTG GGG CCT GGTTTT GGG GTG CTG TTA GCC TTC GCC AGT TACAAT AAA TTT AAC AAC AAT TGT TAT CAG GACGCG CTG GTC ACT TCT GTC GTA AAC TGT ATGACA AGC TTC GTG TCA GGA TTT GTC ATA TTCACT GTC CTT GGT TAC ATG GCT GAG ATG AGAAAC GAA GAT GTC TCG GAA GTT GCC AAA GATGCT GGC CCT AGT CTA TTA TTC ATA ACT TACGCG GAA GCA ATA GCA AAT ATG CCG GCA AGTACC TTT TTC GCG ATC ATT TTC TTC TTG ATGCTG ATT ACC CTA GGA CTG GAT AGT ACA TTCGCG GGT TTG GAA GGA GTA ATC ACA GCA GTGTTG GAC GAA TTC CCC CAC GTA TGG GCT AAAAGG CGG GAG CGG TTC GTA CTC GCT GTG GTTATT ACA TGT TTC TTC GGT AGT TTA GTT ACCCTA ACA TTT GGC GGG GCG TAT GTA GTA AAACTC CTT GAG GAA TAT GCT ACC GGC CCC GCCGTC TTA ACA GTG GCC CTT ATT GAG GCG GTTGCG GTT 
TCA TGG TTT TAC GGA ATT ACA CAGTTC TGT CGC GAT GTC AAA GAG ATG TTG GGATTT AGC CCT GGG TGG TTT TGG CGA ATA TGTTGG GTC GCT ATA TCC CCA CTT TTT CTC CTCTTC ATC ATA TGC TCG TTC CTG ATG TCT CCGCCA CAA CTC CGT CTC TTT CAG TAT AAC TATCCT TAC TGG AGC ATA ATT CTT GGG TAT TGTATT GGA ACG TCG TCC TTT ATC TGC ATC CCTACT TAT ATT GCA TAT CGC CTA ATA ATC ACTCCA GGT ACC TTT AAA GAA CGG ATC ATC AAGTCT ATC ACA CCG GAA ACG CCG ACG GAA ATACCG TGC GGA GAT ATA AGG CTC AAC GCA GTC
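The idea behind back-translation can be sketched as a lookup from residue to codon. This sketch always picks one fixed codon per residue, which is simpler than what the Backtranslation page does (its output above uses several different codons for the same amino acid). The PREFERRED dictionary is a placeholder with only a few entries; a real run would fill it from the codon usage table of the source organism.

```python
# Placeholder preferred-codon table; entries are illustrative only and
# should be replaced with values from the organism's codon usage table.
PREFERRED = {
    "M": "ATG",  # Met has a single codon
    "W": "TGG",  # Trp has a single codon
    "K": "AAG",
    "F": "TTC",
}


def backtranslate(protein: str, preferred: dict) -> str:
    """Replace each residue with its preferred codon, space-separated."""
    codons = []
    for residue in "".join(protein.split()).upper():
        if residue not in preferred:
            raise KeyError(f"no codon listed for residue {residue!r}")
        codons.append(preferred[residue])
    return " ".join(codons)


if __name__ == "__main__":
    print(backtranslate("MKWF", PREFERRED))
```

Because the genetic code is degenerate, any back-translation is only one of many nucleotide sequences that encode the protein.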
9. Determine all possible properties of the 4 protein sequences from Problem #8. Molecular weight, isoelectric point, extinction coefficients, titration curve, etc.
For all cases go to the Determining Properties of a Protein Sequence link in the Applications page.
Click on the links to the desired properties (MW, pI and extinction coefficients can be calculated with the first link; the titration curve can be found with the second link). Cut and paste the protein (amino acid) sequence, not the back-translated nucleotide sequence, into the box and follow the directions in the Applications.
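a.) LCA_HUMAN ALPHA-LACTALBUMIN PRECURSOR (LACTOSE SYNTHASE B PROTEIN)
Note: the values below correspond to a 426-residue input, which is the back-translated nucleotide sequence (3 x 142 codons) read as a chain of Ala, Cys, Gly and Thr residues. The precursor protein itself has 142 amino acids, so the analysis should be rerun on the amino acid sequence.
Number of amino acids: 426
Molecular weight: 35010.9
Theoretical pI: 5.26
Formula: C1300H2176N426O542S80
Total number of atoms: 4524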
Extinction coefficients:
Conditions: 6.0 M guanidinium hydrochloride, 0.02 M phosphate buffer, pH 6.5.
Extinction coefficients are in units of M-1 cm-1. The first table lists values computed assuming ALL Cys residues appear as half cystines, whereas the second table assumes that NONE do.

                     276 nm   278 nm   279 nm   280 nm   282 nm
Ext. coefficient       5800     5080     4800     4800     4800
Abs 0.1% (=1 g/l)     0.166    0.145    0.137    0.137    0.137

                     276 nm   278 nm   279 nm   280 nm   282 nm
Ext. coefficient          0        0        0        0        0
Abs 0.1% (=1 g/l)     0.000    0.000    0.000    0.000    0.000
pK values used:
alpha-NH3: 9.69  alpha-COOH: 2.34  Arg: 12.40  Lys: 10.50  His: 6.00  Asp: 3.86  Glu: 4.25  Cys: 8.33  Tyr: 10.00
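The pK values listed above are enough to sketch how a titration curve and a theoretical pI can be computed. This is a minimal sketch under stated assumptions: every ionizable group titrates independently according to the Henderson-Hasselbalch equation, all Cys are treated as free thiols, and the pI is found by bisection on the net charge. It is not the algorithm of the site's tool, and the function names are illustrative.

```python
# pK values taken from the list above.
PK = {
    "Nterm": 9.69, "Cterm": 2.34,
    "R": 12.40, "K": 10.50, "H": 6.00,
    "D": 3.86, "E": 4.25, "C": 8.33, "Y": 10.00,
}
POSITIVE = "RKH"   # side chains that carry +1 when protonated
NEGATIVE = "DECY"  # side chains that carry -1 when deprotonated


def net_charge(protein: str, ph: float) -> float:
    """Net charge of the chain at a given pH (one point of a titration curve)."""
    charge = 1 / (1 + 10 ** (ph - PK["Nterm"]))
    charge -= 1 / (1 + 10 ** (PK["Cterm"] - ph))
    for residue in "".join(protein.split()).upper():
        if residue in POSITIVE:
            charge += 1 / (1 + 10 ** (ph - PK[residue]))
        elif residue in NEGATIVE:
            charge -= 1 / (1 + 10 ** (PK[residue] - ph))
    return charge


def isoelectric_point(protein: str, tol: float = 1e-4) -> float:
    """Bisect between pH 0 and 14 for the pH where the net charge is zero."""
    low, high = 0.0, 14.0
    while high - low > tol:
        mid = (low + high) / 2
        if net_charge(protein, mid) > 0:
            low = mid    # still positive: the pI lies at higher pH
        else:
            high = mid
    return (low + high) / 2


if __name__ == "__main__":
    print(round(isoelectric_point("G"), 2))
```

Evaluating net_charge over a range of pH values gives the titration curve; the pI is the pH at which that curve crosses zero.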
This site is funded by the National Science Foundation.