Beginners Problem Set 1 Answers

 

1. Find the nucleotide sequences of the following genes:

 

Use WU-blastn or NCBI’s blastn (the Sequence Identification application under the Applications link on CU’s Bioinformatics web page).

 

a. Human ornithine aminotransferase (OAT), nuclear gene encoding mitochondrial protein

 
BASE COUNT      584 a    373 c    449 g    624 t
ORIGIN      
        1 gcgtgtaccc ggttgtcctc aggcgctgtc agatctgtgg tttttctact tgaaggacac
       61 aatgttttcc aaactagcac atttgcagag gtttgctgta cttagtcgcg gagttcattc
      121 ttcagtggct tctgctacat ctgttgcaac taaaaaaaca gtccaaggcc ctccaacctc
      181 tgatgacatt tttgaaaggg aatataagta tggtgcacac aactaccatc ctttacctgt
      241 agccctggag agaggaaaag gtatttactt atgggatgta gaaggcagaa aatattttga
      301 cttcctgagt tcttacagtg ctgtcaacca agggcattgt caccccaaga ttgtgaatgc
      361 tctgaagagt caagtggaca aattgacctt aacatctaga gctttctata ataacgtact
      421 tggtgaatat gaggagtata ttactaaact tttcaactac cacaaagttc ttcctatgaa
      481 tacaggagtg gaggctggag agactgcctg taaactagct cgtaagtggg gctataccgt
      541 gaagggcatt cagaaataca aagcaaagat tgtttttgca gctgggaact tctggggtag
      601 gacgttgtct gctatctcca gttccacaga cccaaccagt tacgatggtt ttggaccatt
      661 tatgccggga ttcgacatca ttccctataa tgatctgccc gcactggagc gtgctcttca
      721 ggatccaaat gtggctgcgt tcatggtaga accaattcag ggtgaagcag gcgttgttgt
      781 tccggatcca ggttacctaa tgggagtgcg agagctctgc accaggcacc aggttctctt
      841 tattgctgat gaaatacaga caggattggc cagaactggt agatggctgg ctgttgatta
      901 tgaaaatgtc agacctgata tagtcctcct tggaaaggcc ctttctgggg gcttataccc
      961 tgtgtctgca gtgctgtgtg atgatgacat catgctgacc attaagccag gggagcatgg
     1021 gtccacatac ggtggcaatc cactaggctg ccgagtggcc atcgcagccc ttgaggtttt
     1081 agaagaagaa aaccttgctg aaaatgcaga caaattgggc attatcttga gaaatgaact
     1141 catgaagcta ccttctgatg ttgtaactgc cgtaagagga aaaggattat taaatgctat
     1201 tgtcattaaa gaaaccaaag attgggatgc ttggaaggtg tgtctacgac ttcgagataa
     1261 tggacttctg gccaagccaa cccatggcga cattatcagg tttgcgcctc cgctggtgat
     1321 caaggaggat gagcttcgag agtccattga aattattaac aagaccatct tgtctttctg
     1381 agggtagcca gctgttttca gtggtccctg ggagccagct ggagacaggt ggtcctgtaa
     1441 aagctttatt cctaatgtgg gcacattcca ctcccatgag tcttcaaaaa cttttttttt
     1501 gaatatattt ttttcagttg atacataata gaacaacgtt tatgaacctg ccgtttgctt
     1561 tgtaacgtaa ctaaataatg taatggcatc tatattcagt tgaagtgttt tgatgtgcat
     1621 gtgtacttcc taaggtgaaa tgcatctata tacagacagc ctctaaatca agtccttcag
     1681 tataattgat atatgttttt ataatttcct cactggtata agtgtttcat atttgaaaaa
     1741 gttatctctg ggtattgcat aaaaggcttc atcttataaa gtgaaatcat tgttattgaa
     1801 ttttaggaag gattaatggt taagtgtata taaaatacta atattaagta aacttcatat
     1861 tggccaacac cagggttgta ttctatggat gtcattattt tgaattaaga attagtgttt
     1921 aacattccta aattgttttg agtgcttgat tataatttgt aaaaaatgtt tattttcaat
     1981 acttctttaa atttaaaata aagcttatat ttcaaaaaaa aaaaaaaaaa 
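
The BASE COUNT line of a record can be cross-checked against the ORIGIN rows. The following is a minimal sketch, not part of the original answer: the helper name count_bases is hypothetical, and it assumes only the numbered sequence rows are pasted in (not the ORIGIN or BASE COUNT header lines, whose letters would be miscounted).

```python
from collections import Counter

def count_bases(origin_rows):
    """Count a/c/g/t in pasted ORIGIN rows, ignoring position numbers and spaces."""
    return Counter(ch for ch in origin_rows.lower() if ch in "acgt")

# First row of the OAT record, copied together with its position number.
row = "        1 gcgtgtaccc ggttgtcctc aggcgctgtc agatctgtgg tttttctact tgaaggacac"
counts = count_bases(row)
print(dict(counts), sum(counts.values()))
```

Running it over all rows of a record should reproduce the four numbers on its BASE COUNT line; their sum is the sequence length.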

 

 

 

b. YEL076C gene in chromosome V of Saccharomyces cerevisiae.

 

The sequence is too long to reproduce on this page. To retrieve the correct answer, do the following.

 

Copy and paste (or type) the following into the search box:           

YEL076C AND Saccharomyces cerevisiae AND chromosome V

and click on “GO”.
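
The same query can also be expressed as a URL. The sketch below only builds the address and does not contact any server. It assumes NCBI's E-utilities ESearch endpoint and the nuccore database as a programmatic stand-in for the search box described above; the function name build_search_url is hypothetical.

```python
from urllib.parse import urlencode

# Assumed endpoint: NCBI E-utilities ESearch (not mentioned in the original answer).
ESEARCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"

def build_search_url(term, db="nuccore"):
    """Return an ESearch URL for a boolean query; spaces are encoded as '+'."""
    return ESEARCH + "?" + urlencode({"db": db, "term": term})

url = build_search_url("YEL076C AND Saccharomyces cerevisiae AND chromosome V")
print(url)
```

Keeping the boolean operators in upper case mirrors the query typed into the search box.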

 

 

c. Homo sapiens herpesvirus entry protein B (HVEB) mRNA

 
BASE COUNT      388 a    623 c    600 g    317 t
ORIGIN      
        1 gagcagaaca gggaggctag agcgcagcgg gaaccggccc ggagccggag ccggagcccc
       61 acaggcacct actaaaccgc ccagccgatc ggcccccaca gagtggcccg cgggcctccg
      121 gccgggccca gtcccctccc gggccctcca tggcccgggc cgctgccctc ctgccgtcga
      181 gatcgccgcc gacgccgctg ctgtggccgc tgctgctgct gctgctcctg gaaaccggag
      241 cccaggatgt gcgagttcaa gtgctacccg aggtgcgagg ccagctcggg ggcaccgtgg
      301 agctgccgtg ccacctgctg ccacctgttc ctggactgta catctccctg gtgacctggc
      361 agcgcccaga tgcacctgcg aaccaccaga atgtggccgc cttccaccct aagatgggtc
      421 ccagcttccc cagcccgaag cctggcagcg agcggctgtc cttcgtctct gccaagcaga
      481 gcactgggca agacacagag gcagagctcc aggacgccac gctggccctc cacgggctca
      541 cggtggagga cgagggcaac tacacttgcg agtttgccac cttccccaag gggtccgtcc
      601 gagggatgac ctggctcaga gtcatagcca agcccaagaa ccaagctgag gcccagaagg
      661 tcacgttcag ccaggaccct acgacagtgg ccctctgcat ctccaaagag ggccgcccac
      721 ctgcccggat ctcctggctc tcatccctgg actgggaagc caaagagact caggtgtcag
      781 ggaccctggc cggaactgtc actgtcacca gccgcttcac cttggtgccc tcgggccgag
      841 cagatggtgt cacggtcacc tgcaaagtgg agcatgagag cttcgaggaa ccagccctga
      901 tacctgtgac cctctctgta cgctaccctc ctgaagtgtc catctccggc tatgatgaca
      961 actggtacct cggccgtact gatgccaccc tgagctgtga cgtccgcagc aacccagagc
     1021 ccacgggcta tgactggagc acgacctcag gcaccttccc gacctccgca gtggcccagg
     1081 gctcccagct ggtcatccac gcagtggaca gtctgttcaa taccaccttc gtctgcacag
     1141 tcaccaatgc cgtgggcatg ggccgcgctg agcaggtcat ctttgtccga gaaaccccca
     1201 gggcctcgcc ccgagatgtg ggcccgctgg tgtggggggc cgtggggggg acactgctgg
     1261 tgctgctgct tctggctggg gggtccttgg ccttcatcct gctgagggtg aggaggagga
     1321 ggaagagccc tggaggagca ggaggaggag ccagtggcga cgggggattc tacgatccga
     1381 aagctcaggt gttgggaaat ggggaccccg tcttctggac accagtagtc cctggtccca
     1441 tggaaccaga tggcaaggat gaggaggagg aggaggagga agagaaggca gagaaaggcc
     1501 tcatgttgcc tccaccccca gcactcgagg atgacatgga gtcccagctg gacggctccc
     1561 tcatctcacg gcgggcagtt tatgtgtgac ctggacacag acagagacag agccaggccc
     1621 ggccctcccg cccccgacct gaccacgccg gcctagggtt ccagactggt tggacttgtt
     1681 cgtctggacg acactggagt ggaacactgc ctcccacttt cttgggactt ggagggaggt
     1741 ggaacagcac actggacttc tcccgtctct agggctgcat ggggagcccg gggagctgag
     1801 tagtggggat ccagagagga cccccgcccc cagagacttg gttttggctc cagccttccc
     1861 ctggccccgt gacactcagg agttaataaa tgccttggag gaaaacaaaa aaaaaaaaaa

     1921 aaaaaaaa

 

 

d. Simian immunodeficiency virus of Chimpanzee (Pan troglodytes), related to HIV.

 

The sequence is too long to reproduce on this page. To retrieve the correct answer, do the following.

 

Copy and paste (or type) the following into the search box:           

Pan troglodytes AND immunodeficiency

and click on “GO”.

 

 

 

 

2. Find the possible origin and gene function of the following nucleotide sequences:

 

Use WU-blastn or NCBI’s blastn (Sequence Identification Application)

 

a. Sequence 1

 

CACGGTGATCAAAGTGAGAATGAGCTCCCAGGATTGGGGGGCAAGGAAGATAGGAGGGTCAAACAGAGTCGGGGAGAAGCCAGGGAGAGCTACAGAGAAACCGGGTCCAGCAGAGCAAGTGATGAGAGAGCTGCCCATCTTCCAACCAGCACACCCCTAGACATTGACACTGCATCGGAGTCAGGCCAAGATCCGCAGGACAGTCGAAGGTCAGCTGACGCCCTGCTCAGGCTGCAAGCCATGGCAGGAATCTTGGAGGAACAAGGCTCAGACACGGACACCCCTAGGGTG

 

Measles Virus (NE-94N) nucleoprotein mRNA, length 1578

 

b. Sequence 2

 

CCCTGTGGAGCCACACCCTAGGGTTGGCCAATCTACTCCCAGGAGCAGGGAGGGCAGGAGCCAGGGCTGGGCATAAAAGTCAGGGCAGAGCCATCTATTGCTTACATTTGCTTCTGACACAACTGTGTTCACTAGCAACCTCAAACAGACACCATGGTGCACCTGACTCCTGAGGAGAAGTCTGCCGTTACTGCCCTGTGGGGCAAGGTGAACGTGGATGAAGTTGGTGGTGAGGCCCTGGGCAGGTTGGTATCAAGGTTACAAGACAGGTTTAAGGAGACCAATAGAAACTGGGCATGTGGAGACAGAGAAGACTCTTGGGTTTCTGATAGGCACTGACTCTCTCTGCCTATTGGTCTATTTTCCCACCCTTAGGCTGCTGGTGATCTACCCTTGGACCCAGAGGTTCTTTGAGTCCTTTGGGGATCTGTCCACTCCTGATGCTGTTATGGG

 

Human beta-globin gene from a thalassemia patient, length 3703

 

c. Sequence 3

 

AGTACGGTACACCAATTATTTTGAAGGCCGCTTATGGAGGAGGTGGTCGTGGAATTCGTCGTGTTGATAAATTGGAAGAAGTTGAAGAGGCATTCCGTAGATCCTACTCGGAAGCTCAAGCTGCGTTTGGAGACGGAAGTCTTTTCGTTGAAAAGTTTGTTGAGAGACCAAGACATATTGAAGTTCAGCTGCTTGGAGACCATCATGGAAATATTGTTCATTTGTATGAGCGTGATTGTTCAGTGCAACGTCGTCATCAAAAGGTTGTTGAAATTGCTCCAGCGCCAGCTCTCCCAGAAGGTGTTCGTGAGAAAATTTTGGCAGACGCTCTTCGACTTGCAAGACATGTTGGATACCAAAATGCTGGTACAGTCGAATTCCTGGTTGATCAGAAGGGCAACTACTATTTCATCGAAGTGAATGCACGTCTTCAAGTCGAGCATACAGTAACTGAAGAGATCACTGGTGTGGATCTTGTCCAAGCTCAAATTCGTATCGCCGAAGGAAAATCTCTGGATGATCTGAAGCTTTCACAGGAAACTATTCAAACTACTGGCTCAGCTATTCAATGTCGTGTCACAACTGAAGATCCAGCTAAAGGATTCCAGCCAGATTCCGGAAGAATTGAAGTTTTCCGATCCGGAGAGGGAATGGGAATTCGTCTTGATTCAGCATCTGCCTTCGCAGGATCAGTCATTTCACCTCACTACGATTCTTTGATGGTCAAAGTAATTGCATCGGCTAGAAATCATCCGAACGCTGCCGCAAAAATGATTCGTGCCCTCAAAAAGTTCCGTATCCGAGGCGTAAAGACAAACATCCCATTTCTGCTCAACGTTCTTCGCCAGCCCAGCTTCCTTGATGCATCCGTCGATACGTATTTCATTGATGAGCATCCAGAGTTGTTCCAATTCAAACCAAGCCAAAACCGTGCTCAAAAGTTGTTGAACTATTTGGGAGAAGTTAAGGTGAACGGTCCAACTACTCCTCTTGCTACTGACCTGAAACCAGCAGTTGTTT

 

Caenorhabditis elegans pyruvate carboxylase (pyc-1) mRNA, length 3864

 

 

 

 

3. Align the following sequences (single sequence alignment):

 

a. Sequence 1

 

CAGGGGCAGTGCGGGAGGTGCTGGGCTTTCTCAACAGCTGAGGTGATTTCCGATCGAACATGTATTGCAA

GCAATGGTACCCAACAACCAATCATCTCCCCAACTGATCTGCTCACTTGTTGTGGAATGTCATGCGGAGA

GGGCTGTAACGGCGGCTAT

 

b. Sequence 2

 

ATGTCTACTATTCCATCAGAAATAATCAATTGGACAATTTTGAACGAGATAATTTCCATGGATGACGATG

ACAGTGATTTTAGCAAAGGATTAATCATTCAATTCATAGATCAGGCTCAAACCACGTTTGCACAAATGCA

GAGACAACTAGACGGCGAAAAGAATCTTACTGAACTGGATAACTTGGGGCATTTCTTAAAAGGATCGTCT

GCCGCGCTCGGCTTGCAAAGGATCGCTTGGGTTTGTGAGCGTATTCAGAATTTAGGGAGAAAGATGGAAC

ACTTTTTCCCTAACAAAACAGAACTAGTAAATACCCTTTCAGATAAGTCCATTATAAATGGAATCAACAT

TGACGAGGATGATGAAGAAATAAAAATTCAAGTCGACGATAAAGATGAAAATAGTATCTATTTGATTTTA

ATAGCAAAGGCCCTGAACCAATCTCGATTGGAGTTTAAATTAGCTAGAATTGAACTATCAAAGTACTATA

ATACTAATCTT

 

c. Sequence 3

 

ATTCTTTTGAGTCGGGAGAACTAGGTAACAATTCGGAAACTCCAAAGGGTGGATGAGGGGCGCGCGGGGT

GTGTGTGGGGGATACTCTGGTCCCCCGTGCAGTGACCTCTAAGTCAGAGGCTGGCACACACACACCTTCC

ATTTTTTCCCAACCGCAGGATGGCGCCTCATCCCTTGGATGCGCTCACCATCCAAGTGTCCCCAGAGACA

CAACAACCTTTTCCCGGAGCCTCGGACCACGAAGTGCTCAGTTCCAATTCCACCCCACCTAGCCCCACTC

TCATACCTAGGGACTGCTCCGAAGCAGAAGTGGGTGACTGCCGAGGGACCTCGAGGAAGCTCCGCGCCCG

ACGCGGAGGGCGCAACAGGCCCAAGAGCGAGTTGGCACTCAGCAAACAGCGAAGAAGCCGGCGCAAGAAG

GCCAATGATCGGGAGCGCAATCGCATGCACAACCTCAACTCGGCGCTGGATGCGCTGCGCGGTGTCCTGC

CCACCTTCCCGGATGACGCCAAACTTACAAAGATCGAGACCCTGCGCTTCGCCCACAACTACATCTGGGC

ACTGACTCAGACGCTGCGCATAGCGGACCACAGCTTCTATGGCCCGGAGCCCCCTGTGCCCTGTGGAGAG

CTGGGGAGCCCCGGAGGTGGCTCCAACGGGGACTGGGGCTCTATCTACTCCCCAGTCTCCCAAGCGGGTA

ACCTGAGCCCCACGGCCTCATTGGAGGAATTCCCTGGCCTGCAGGTGCCCAGCTCCCCATCCTATCTGCT

CCCGGGAGCACTGGTGTTCTCAGACTTCTTGTGAAGAGACCTGTCTGGCTCTGGGTGGTGGGTGCTAGTG

GAAAGGGAGGGGACCACAGCC

 

 

 

 

 

 

4. Translate the following nucleotide sequences in 3 reading frames:

 

Use the translation link in the Applications page of CU’s Bioinformatics web site. Cut and paste this sequence into the box provided and select 3 phase.

 

 

a. Sequence 1

 

CACGGTGATCAAAGTGAGAATGAGCTCCCAGGATTGGGGGGCAAGGAAGATAGGAGGGTCAAACAGAGTCGGGGAGAAGCCAGGGAGAGCTACAGAGAAACCGGGTCCAGCAGAGCAAGTGATGAGAGAGCTGCCCATCTTCCAACCAGCACACCCCTAGACATTGACACTGCATCGGAGTCAGGCCAAGATCCGCAGGACAGTCGAAGGTCAGCTGACGCCCTGCTCAGGCTGCAAGCCATGGCAGGAATCTTGGAGGAACAAGGCTCAGACACGGACACCCCTAGGGTG

 

Phase 2 gives the following sequence of amino acids. The highlighted section is the one “most likely to be correct”.  However, an alignment should be run to check the phase.

 

TVIKVRMSSQDWGARKIGGSNRVGEKPGRATEKPGPAEQVMRELPIFQPAHP*TLTLHRSQAKIRRTVEGQLTPCSGCKPWQESWRNKAQTRTPLG

 

 

 

b. Sequence 2

 

CAGGGGCAGTGCGGGAGGTGCTGGGCTTTCTCAACAGCTGAGGTGATTTCCGATCGAACATGTATTGCAA

GCAATGGTACCCAACAACCAATCATCTCCCCAACTGATCTGCTCACTTGTTGTGGAATGTCATGCGGAGA

GGGCTGTAACGGCGGCTAT

 

Phase 1

QGQCGRCWAFSTAEVISDRTCIASNGTQQPIISPTDLLTCCGMSCGEGCNGGY

 

Phase 2

RGSAGGAGLSQQLR*FPIEHVLQAMVPNNQSSPQLICSLVVECHAERAVTAA

 

Phase 3

GAVREVLGFLNS*GDFRSNMYCKQWYPTTNHLPN*SAHLLWNVMRRGL*RRL

 

 

 

 

c. Sequence 3

 

CCCTGTGGAGCCACACCCTAGGGTTGGCCAATCTACTCCCAGGAGCAGGGAGGGCAGGAGCCAGGGCTGGGCATAAAAGTCAGGGCAGAGCCATCTATTGCTTACATTTGCTTCTGACACAACTGTGTTCACTAGCAACCTCAAACAGACACCATGGTGCACCTGACTCCTGAGGAGAAGTCTGCCGTTACTGCCCTGTGGGGCAAGGTGAACGTGGATGAAGTTGGTGGTGAGGCCCTGGGCAGGTTGGTATCAAGGTTACAAGACAGGTTTAAGGAGACCAATAGAAACTGGGCATGTGGAGACAGAGAAGACTCTTGGGTTTCTGATAGGCACTGACTCTCTCTGCCTATTGGTCTATTTTCCCACCCTTAGGCTGCTGGTGATCTACCCTTGGACCCAGAGGTTCTTTGAGTCCTTTGGGGATCTGTCCACTCCTGATGCTGTTATGGG

 

Phase 1

PCGATP*GWPIYSQEQGGQEPGLGIKVRAEPSIAYICF*HNCVH*QPQTDTMVHLTPEEKSAVTALWGKVNVDEVGGEALGRLVSRLQDRFKETNRNWACGDREDSWVSDRH*LSLPIGLFSHP*AAGDLPLDPEVL*VLWGSVHS*CCYG

                                                 

Phase 2

PVEPHPRVGQSTPRSREGRSQGWA*KSGQSHLLLTFASDTTVFTSNLKQTPWCT*LLRRSLPLLPCGAR*TWMKLVVRPWAGWYQGYKTGLRRPIETGHVETEKTLGFLIGTDSLCLLVYFPTLRLLVIYPWTQRFFESFGDLSTPDAVM

 

Phase 3

LWSHTLGLANLLPGAGRAGARAGHKSQGRAIYCLHLLLTQLCSLATSNRHHGAPDS*GEVCRYCPVGQGERG*SWW*GPGQVGIKVTRQV*GDQ*KLGMWRQRRLLGF**ALTLSAYWSIFPPLGCW*STLGPRGSLSPLGICPLLMLLW
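
The three-phase translations above can be reproduced with a few lines of code. This is a minimal sketch assuming the standard genetic code and forward-strand phases only; the names CODON_TABLE and translate are illustrative and are not taken from the translation tool itself.

```python
from itertools import product

# Standard genetic code, codons enumerated with bases in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

def translate(seq, phase=1):
    """Translate a DNA string starting at base `phase` (1, 2 or 3); '*' marks a stop."""
    seq = "".join(seq.upper().split())  # drop line breaks and spaces
    codons = (seq[i:i + 3] for i in range(phase - 1, len(seq) - 2, 3))
    return "".join(CODON_TABLE.get(c, "X") for c in codons)

start_of_sequence_1 = "CACGGTGATCAAAGTGAGAATGAGCTCCCAG"
for phase in (1, 2, 3):
    print("Phase", phase, translate(start_of_sequence_1, phase))
```

Comparing where stop codons (*) fall in each phase is a quick first check on which reading frame is open before running an alignment.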

 

 

 

 

5. Find three article/journal citations for each of the following topics:

 

Use the link to Journal Retrieval in the applications section of CU’s Bioinformatics web page. 

 

 

a. Nanotechnology in medicine and the biological sciences.

 

Search PubMed in NCBI and type nanotechnology in the search box provided.

 

 

2:

Woolley AT.

Biomedical microdevices and nanotechnology.
Trends Biotechnol. 2001 Feb;19(2):38-9. No abstract available.
[MEDLINE record in process]
PMID: 11252264

 

15:

Ball P.

Nanotechnology. Molecular movers and shakers.
Nature. 2000 Dec 21-28;408(6815):904. No abstract available.
PMID: 11140655

 

7:

Service RF.

Is nanotechnology dangerous?
Science. 2000 Nov 24;290(5496):1526-7. No abstract available.
PMID: 11185512

 

 

 

b. Human Immunodeficiency Virus (HIV).

 

Type HIV in the search box and click “GO”. There are many interesting articles, including:

 

 

4:

Freitag C, Chougnet C, Schito M, Near KA, Shearer GM, Li C, Langhorne J, Sher A.

 

Malaria Infection Induces Virus Expression in Human Immunodeficiency Virus Transgenic Mice by CD4 T Cell-Dependent Immune Activation.
J Infect Dis. 2001 Apr 15;183(8):1260-1268.
[Record as supplied by publisher]
PMID: 11262209

 

Dezzutti CS, Guenthner PC, Cummins Jr JE, Cabrera T, Marshall JH, Dillberger A, Lal RB.

 

Cervical and Prostate Primary Epithelial Cells Are Not Productively Infected but Sequester Human Immunodeficiency Virus Type 1.
J Infect Dis. 2001 Apr 15;183(8):1204-1213.
[Record as supplied by publisher]
PMID: 11262202

 

 

 

16:

Hardy H, Esch LD, Morse GD.

 

Glucose disorders associated with HIV and its drug therapy.
Ann Pharmacother. 2001 Mar;35(3):343-51.
[MEDLINE record in process]
PMID: 11261533

 

 

 

 

 

6. Does the restriction enzyme XhoI cut this sequence? If so, how many times does it cut, where does it cut (give the approximate base pair region), and does the cut leave blunt ends or staggered ends?

 

Go to Max Heiman’s Webcutter 2.0 site (reachable from the restriction enzyme mapping link on the Applications page of CU’s Bioinformatics web page).

 

Cut and paste the desired sequence into the box and select:

Linear sequence

Map of restriction site

All enzymes (w/ rainbow highlights is helpful and easier to look at)

Only the following enzyme: XhoI

 

 

ATTCTTTTGAGTCGGGAGAACTAGGTAACAATTCGGAAACTCCAAAGGGTGGATGAGGGGCGCGCGGGGT

GTGTGTGGGGGATACTCTGGTCCCCCGTGCAGTGACCTCTAAGTCAGAGGCTGGCACACACACACCTTCC

ATTTTTTCCCAACCGCAGGATGGCGCCTCATCCCTTGGATGCGCTCACCATCCAAGTGTCCCCAGAGACA

CAACAACCTTTTCCCGGAGCCTCGGACCACGAAGTGCTCAGTTCCAATTCCACCCCACCTAGCCCCACTC

TCATACCTAGGGACTGCTCCGAAGCAGAAGTGGGTGACTGCCGAGGGACCTCGAGGAAGCTCCGCGCCCG

ACGCGGAGGGCGCAACAGGCCCAAGAGCGAGTTGGCACTCAGCAAACAGCGAAGAAGCCGGCGCAAGAAG

GCCAATGATCGGGAGCGCAATCGCATGCACAACCTCAACTCGGCGCTGGATGCGCTGCGCGGTGTCCTGC

CCACCTTCCCGGATGACGCCAAACTTACAAAGATCGAGACCCTGCGCTTCGCCCACAACTACATCTGGGC

ACTGACTCAGACGCTGCGCATAGCGGACCACAGCTTCTATGGCCCGGAGCCCCCTGTGCCCTGTGGAGAG

CTGGGGAGCCCCGGAGGTGGCTCCAACGGGGACTGGGGCTCTATCTACTCCCCAGTCTCCCAAGCGGGTA

ACCTGAGCCCCACGGCCTCATTGGAGGAATTCCCTGGCCTGCAGGTGCCCAGCTCCCCATCCTATCTGCT

CCCGGGAGCACTGGTGTTCTCAGACTTCTTGTGAAGAGACCTGTCTGGCTCTGGGTGGTGGGTGCTAGTG

GAAAGGGAGGGGACCACAGCC

 

 

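XhoI cuts once, at position 330 (within the 301-375 region of the map), and leaves staggered (sticky) ends: cutting c/tcgag on both strands produces a four-base 5' overhang (tcga).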

 

                                      

                                                XhoI                       

gaagcagaagtgggtgactgccgagggacctcgaggaagctccgcgcccgacgcggagggcgcaacaggccc
aacttcgtcttcacccactgacggctccctggagctccttcgaggcgcgggctgcgcctcccgcgttgtccgggttc


 

Enzyme       No.  Positions                        Recognition

 

name         cuts of sites                         sequence

XhoI          1   330                              c/tcgag            
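
The table entry can be checked by hand or with a short script. The sketch below is an illustration only: find_cuts and end_type are hypothetical helpers, the cut offset of 1 encodes the c/tcgag notation shown in the table, and end_type assumes a palindromic site that is cut at the same offset on both strands.

```python
def find_cuts(seq, site, cut_offset):
    """Return 1-based positions of the base just before each top-strand cut."""
    seq = "".join(seq.upper().split())
    site = site.upper()
    cuts = []
    i = seq.find(site)
    while i != -1:
        cuts.append(i + cut_offset)   # 0-based site start + offset = 1-based position
        i = seq.find(site, i + 1)     # continue, allowing overlapping sites
    return cuts

def end_type(site, cut_offset):
    """Blunt only when a palindromic site is cut exactly at its centre."""
    return "blunt" if 2 * cut_offset == len(site) else "staggered"

# Short excerpt around the XhoI site of the sequence above.
fragment = "GACTGCCGAGGGACCTCGAGGAAGCTCCGCG"
print(find_cuts(fragment, "CTCGAG", 1), end_type("CTCGAG", 1))
```

Applied to the full sequence, the same call gives the cut position in whole-sequence coordinates, and the two fragment lengths follow from that position and the total length.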

 

 

 

 

 

 

7. Determine the restriction digest map of the following sequence with restriction enzymes that recognize six base pair sequences.

 

Go to Max Heiman’s Webcutter 2.0 site (reachable from the restriction enzyme mapping link on the Applications page of CU’s Bioinformatics web page).

 

Cut and paste the desired sequence into the box and select:

Linear sequence

Map of restriction site

All enzymes (w/ rainbow highlights is helpful and easier to look at)

Only enzymes with recognition sites greater than or equal to 6 bases long.

Click “Analyze sequence” when ready.

 

 

>gi|7524758|ref|NC_001776.1|| Oryza sativa mitochondrial plasmid B2, complete sequence

 

CCGAGGTACCAGAGGATAGCCTTTAGGATATCCGTCGGTAGGAGATATCGAGTCGTTAGAGCACATATCT

CCGTGATCCCTTCCACAGCTAAACGTAGATCTCTATCTTGTAGTGGAGTGTAGAGTAAAGATATACCGGA

GTGGAGTGCAAGAAGGTTATAGGGGAGTGTGGCGCCCGTTGCCCGCCTTTCGTCTCGGTTGAAAAACAAA

ACTGTTGCTTGTTTTGTTCATTCAAGTAGTGGTTTAGCCCTCGAATCTATAGCGAAAGAGTTAGCTCAAC

CTATCTACTACTACCCTTGGAAAGGGGGTGTGTGTGGTATACCCGTTACGACTCCCCTCGAACTGAGCCA

TTTTCATGATTTTACAATCAAAGAAAGTTCCATACTTTTTTATTCCCGGTATGGAGCCAACGTCGCTTCG

CTTTCGCAGGACAATTTCCTTCTGGAAAAATATCCAGTGGCATATCCCATCTAGCTCTGGGAATCTTTCC

ACAACCAGCGCCTTCTTGTCCGATGAAGTTCAAGTCTCTAACCTGTTCTTGCCCGGGAATGTCCCTCCTT

CGAACTCGAAGTCAGAGTTTTCCGGTTATCCGATGGGCGATTTATGATAGTTCCGATGCACTTCTCATGC

TAGCCTATCGACAAGATCAGGGTTCAACTACTGTTGCCTCAACGGGTTCGCTGGTCACTTTAACAGTTTT

TTAAGCAGGCATTTTCGTGGTTCTCGTCATATAAACCAGTCAGGTACTAGCGTGATTTCATGCATTAATC

TATTGCTCGAAGGAACCTTTCGTCAGAAAACTTATTCAAAAGGTACCCGGTGTTCTAGCGCATTTCTGTT

AAGAGAAGTACCCCTTGTAGGCCTGAGATGCCCTCTATCGAGCGATATAACGACGGTTGCTATTTTTAGG

TACACGTGTTTTTTGAAGAAAGCACTTTTAAAAAGGAAAGTAGTAAAATGAAAATACTCAATTTAATAAA

TCCTAAATTATCTGTGGCATTGAATTATGAAAGTAAACACTGAAAGTTTACCACTTACAAAAGTAAATGT

ACTACCCGACTAAAAGGAGGAAATCCAATTAGAGGTGTAAACTCTTTTCTTTTCTATCGAGTAGATTCCC

TTTTATTTATAAATAGTAAACTAACGAATAGATAATTAGTATTCCCTCATATAATTAGTACTAGCTGTTC

CTTCATTAATGGTACGTGTATTCACGAGGGCTCTGCTTCGCTCTATTCCACCTCCCGGGTGTATGGAAAA

ATCTTTAAAAGAAATAAATCATGCGGTACTTCTTGTTTTAATCACTCTTCCTTTCCTCCCGGAACTGGGG

AAGAACCCTTCTGAAAAGGAATGGAGGTCTATCTTTACTCGGAGTGCAACGCAGAGATAGAGATCGAAGA

GCCGCTGTGCTTGGGGACGAGGGGACTCACTCAACCTCAAAGTTGAGGGAATCCCGAAGCCTCGGAATCT

AGCGTTAGCGTAAGG

 

SmaI cuts twice, once in the 526-600 region and once in the 1201-1275 region. It leaves blunt ends; its recognition sequence is ccc/ggg.
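
A map for several six-base cutters can be sketched the same way. The enzyme list below is a small hand-picked sample with well-known recognition sequences, not Webcutter's database, and restriction_map is a hypothetical helper. It reports the 1-based start of each recognition site, which may differ from the position convention Webcutter uses.

```python
# Sample six-base cutters: name -> recognition sequence (top strand, 5'->3').
SIX_CUTTERS = {
    "SmaI": "CCCGGG",
    "XhoI": "CTCGAG",
    "KpnI": "GGTACC",
    "EcoRV": "GATATC",
    "BglII": "AGATCT",
    "EcoRI": "GAATTC",
}

def restriction_map(seq, enzymes=SIX_CUTTERS):
    """Return {enzyme: [1-based start of every recognition site]} for a linear sequence."""
    seq = "".join(seq.upper().split())
    result = {}
    for name, site in enzymes.items():
        starts = []
        i = seq.find(site)
        while i != -1:
            starts.append(i + 1)
            i = seq.find(site, i + 1)
        result[name] = starts
    return result

# First 70 bases of the plasmid sequence above.
first_row = "CCGAGGTACCAGAGGATAGCCTTTAGGATATCCGTCGGTAGGAGATATCGAGTCGTTAGAGCACATATCT"
for name, starts in restriction_map(first_row).items():
    print(name, starts)
```

Passing the complete sequence instead of the first row lists every site for the sampled enzymes, which can then be compared with the Webcutter output.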

 

 

 

 

 

 

8. Convert the following protein sequences back to their nucleotide sequences using the correct codon usage table. For example, use the Homo sapiens codon usage table if the protein sequence corresponds to a human gene product.

 

Use the Backtranslation page in the Applications section of CU’s Bioinformatics web page

 

>gi|126001|sp|P00709|LCA_HUMAN ALPHA-LACTALBUMIN PRECURSOR (LACTOSE SYNTHASE B PROTEIN)

MRFFVPLFLVGILFPAILAKQFTKCELSQLLKDIDGYGGIALPELICTMFHTSGYDTQAIVENNESTEYGL

FQISNKLWCKSSQVPQSRNICDISCDKFLDDDITDDIMCAKKILDIKGIDYWLAHKALCTEKLEQWLCEKL

 
ATG AGA TTC TTC GTA CCT CTA TTT CTC GTG
GGA ATA CTG TTT CCG GCA ATA CTG GCG AAA
CAG TTC ACA AAG TGC GAA TTA TCT CAA TTA
TTA AAA GAC ATA GAT GGC TAC GGA GGA ATC
GCA CTT CCC GAA CTT ATA TGT ACG ATG TTT
CAC ACA TCC GGT TAT GAT ACC CAA GCC ATC
GTC GAA AAT AAT GAG AGC ACT GAA TAC GGG
CTC TTT CAG ATT AGT AAC AAG CTT TGG TGT
AAA TCG AGT CAA GTT CCA CAG AGT AGA AAC
ATA TGC GAT ATC TCA TGC GAC AAA TTC TTG
GAC GAT GAC ATT ACA GAT GAC ATC ATG TGC
GCA AAA AAG ATT CTA GAT ATT AAA GGA ATA
GAC TAT TGG TTG GCA CAT AAG GCT TTG TGT
ACA GAG AAG CTG GAG CAA TGG CTA TGT GAG
AAG CTC 
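
Back-translation is not unique, because most amino acids have several synonymous codons, so any result can be verified by translating it forward again. The sketch below assumes the standard genetic code. back_translate and its `preferred` argument are hypothetical: without a preference it simply takes the first synonymous codon, whereas a real tool would choose codons from the organism's codon usage table.

```python
from itertools import product

BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

# Group codons by the amino acid they encode.
SYNONYMS = {}
for codon, aa in CODON_TABLE.items():
    SYNONYMS.setdefault(aa, []).append(codon)

def back_translate(protein, preferred=None):
    """Return space-separated codons; `preferred` maps amino acid -> chosen codon."""
    preferred = preferred or {}
    return " ".join(preferred.get(aa, SYNONYMS[aa][0]) for aa in protein.upper())

def translate(dna):
    """Translate space- or newline-separated codons back to one-letter amino acids."""
    dna = "".join(dna.upper().split())
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna) - 2, 3))

first_row = "ATG AGA TTC TTC GTA CCT CTA TTT CTC GTG"
print(translate(first_row))
print(back_translate("MRFFVPLFLV"))
```

The first row of the answer above translates back to the first ten residues of the protein, which is the check to apply to every row.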

 

>gi|2119314|pir||I61690 myosin - human (fragment)

CVIISGESGAGKTVAAKYIMSYISRVSGGGTKVQHVKDIILQSNPLLEAFGNAKTVRNNNSSRFGKYFEI
QFSPGGEPDGGKISNFLLEK
 
TGT GTA ATT ATA TCT GGG GAA AGC GGC GCG
GGT AAA ACA GTA GCC GCA AAG TAC ATC ATG
AGT TAT ATA TCC AGA GTA AGT GGG GGT GGA
ACA AAG GTC CAG CAT GTG AAG GAT ATC ATT
TTA CAA AGT AAC CCA TTA TTA GAG GCA TTC
GGA AAT GCT AAG ACA GTT AGA AAT AAC AAC
AGT TCA AGA TTT GGC AAA TAT TTC GAG ATA
CAA TTT AGT CCA GGC GGA GAA CCA GAC GGT
GGG AAA ATA TCG AAT TTT TTA TTA GAA AAA

 

>gi|746398|gb|AAA97505.1| ampicillin-binding protein (E. coli)

MRFSRFIIGLTSCIAFSVQAANVDEYITQLPAGANLALMVQKVGASAPAIDYHSQQMALPASTQKVITAL
AALIQLGPDFRFTTTLETKGNVENGVLKGDLVARFGADPTLKRQDIRNMVATLKKSGVNQIDGNVLIDTS
IFASHDKAPGWPWNDMTQCFSAPPAAAIVDRNCFSVSLYSAPKPGDMAFIRVASYYPVTMFSQVRTLPRG
SAEAQYCELDVVPGDLNRFTLTGCLPQRSEPLPLAFAVQDGASYAGAILKDELKQAGITWSGTLLRQTQV
NEPGTVVASKQSAPLHDLLKIMLKKSDNMIADTVFRMIGHARFNVPGTWRAGSDAVRQILRQQAGVDIGN
TIIADGSGLSRHNLIAPATMMQVLQYIAQHDNELNFISMLPLAGYDGSLQYRAGLHQAGVDGKVSAKTGS
LQGVYNLAGFITTASGQRMAFVQYLSGYAVEPADQRNRRIPLVRFESRLYKDIYQNN
 
ATG AGG TTC AGC AGG TTC ATC ATT GGG CTT
ACA TCC TGC ATC GCC TTC AGT GTT CAG GCA
GCA AAT GTG GAC GAG TAT ATA ACG CAA CTC
CCT GCT GGG GCT AAC CTG GCA TTA ATG GTC
CAA AAG GTC GGC GCA TCT GCA CCA GCG ATT
GAT TAT CAT TCA CAG CAG ATG GCC TTG CCA
GCG TCC ACG CAA AAG GTG ATT ACA GCG TTA
GCG GCA CTC ATA CAG CTC GGC CCA GAT TTT
CGG TTC ACT ACG ACC CTA GAG ACG AAA GGT
AAT GTA GAG AAC GGA GTT TTA AAG GGA GAT
TTA GTT GCA AGA TTT GGG GCA GAC CCC ACA
CTG AAG AGG CAA GAC ATA AGG AAT ATG GTA
GCG ACA CTG AAA AAA AGC GGA GTG AAT CAA
ATA GAC GGG AAC GTT CTA ATT GAT ACC AGT
ATT TTT GCA TCG CAT GAT AAA GCG CCT GGA
TGG CCG TGG AAT GAT ATG ACC CAA TGT TTC
AGT GCC CCT CCC GCG GCC GCT ATA GTC GAT
CGT AAC TGT TTT TCA GTG TCA CTG TAT AGC
GCG CCG AAG CCG GGT GAC ATG GCC TTT ATT
AGA GTA GCC TCC TAC TAT CCG GTT ACC ATG
TTT AGC CAA GTG AGA ACA CTC CCC CGC GGT
TCG GCT GAA GCC CAG TAT TGC GAG TTA GAT
GTG GTT CCC GGA GAT CTA AAC CGC TTT ACC
CTA ACT GGG TGT CTG CCA CAA CGT TCA GAA
CCC CTA CCT CTT GCT TTC GCT GTA CAG GAC
GGA GCC TCT TAC GCA GGT GCT ATC CTT AAA
GAC GAA CTC AAG CAA GCA GGA ATC ACA TGG
AGT GGG ACA CTG TTA AGA CAA ACA CAA GTC
AAT GAA CCA GGT ACT GTA GTG GCG AGT AAG
CAG TCC GCA CCA TTA CAT GAT CTC CTT AAG
ATT ATG TTA AAG AAA TCG GAC AAC ATG ATC
GCT GAC ACT GTA TTC CGA ATG ATA GGT CAC
GCA CGC TTT AAT GTG CCG GGC ACG TGG CGT
GCT GGG TCG GAC GCC GTA CGG CAG ATT CTG
CGA CAA CAA GCG GGA GTC GAC ATC GGG AAT
ACC ATA ATC GCG GAT GGC AGT GGT TTG TCT
CGG CAT AAT CTT ATT GCT CCT GCA ACG ATG
ATG CAG GTT TTG CAA TAC ATC GCT CAG CAC
GAC AAC GAG TTG AAC TTT ATA TCT ATG TTG
CCG TTA GCT GGC TAC GAT GGT TCC CTA CAG
TAT CGC GCC GGC CTT CAC CAG GCC GGA GTC
GAT GGC AAA GTC AGC GCG AAA ACT GGG TCG
CTC CAG GGC GTT TAC AAT CTT GCG GGT TTC
ATA ACT ACA GCC TCT GGA CAA AGA ATG GCA
TTT GTA CAG TAT CTA AGT GGC TAT GCC GTC
GAA CCT GCT GAT CAG AGA AAC CGA CGT ATA
CCC TTG GTA CGG TTC GAA TCA CGA TTG TAC
AAA GAC ATC TAC CAA AAT AAC 

 

 

>gi|476966|pir||A47398 serotonin transporter - human

METTPLNSQKQLSACEDGEDCQENGVLQKVVPTPGDKVESGQISNGYSAVPSPGAGDDTRHSIPATTTTL
VAELHQGERETWGKKVDFLLSVIGYAVDLGNVWRFPYICYQNGGGAFLLPYTIMAIFGGIPLFYMELALG
QYHRNGCISIWRKICPIFKGIGYAICIIAFYIASYYNTIMAWALYYLISSFTDQLPWTSCKNSWNTGNCT
NYFSEDNITWTLHSTSPAEEFYTRHVLQIHRSKGLQDLGGISWQLALCIMLIFTVIYFSIWKGVKTSGKV
VWVTATFPYIILSVLLVRGATLPGAWRGVLFYLKPNWQKLLETGVWIDAAAQIFFSLGPGFGVLLAFASY
NKFNNNCYQDALVTSVVNCMTSFVSGFVIFTVLGYMAEMRNEDVSEVAKDAGPSLLFITYAEAIANMPAS
TFFAIIFFLMLITLGLDSTFAGLEGVITAVLDEFPHVWAKRRERFVLAVVITCFFGSLVTLTFGGAYVVK
LLEEYATGPAVLTVALIEAVAVSWFYGITQFCRDVKEMLGFSPGWFWRICWVAISPLFLLFIICSFLMSP
PQLRLFQYNYPYWSIILGYCIGTSSFICIPTYIAYRLIITPGTFKERIIKSITPETPTEIPCGDIRLNAV

 

ATG GAG ACG ACC CCG TTG AAC TCC CAA AAG
CAG TTA AGT GCC TGT GAG GAT GGG GAG GAC
TGC CAG GAA AAT GGT GTC CTA CAA AAG GTG
GTA CCT ACG CCT GGT GAC AAA GTA GAA TCA
GGT CAA ATC TCT AAC GGC TAT TCA GCT GTT
CCA AGT CCA GGT GCG GGC GAC GAC ACT AGA
CAT TCC ATA CCC GCG ACC ACG ACG ACC TTG
GTA GCC GAA CTA CAT CAA GGT GAG AGA GAA
ACC TGG GGC AAG AAG GTG GAC TTC TTG CTA
AGC GTG ATT GGC TAC GCA GTC GAT CTG GGA
AAC GTT TGG CGT TTT CCA TAC ATC TGC TAC
CAG AAT GGC GGA GGG GCC TTT TTG TTG CCG
TAC ACG ATC ATG GCA ATT TTC GGA GGG ATT
CCC CTG TTT TAC ATG GAG CTC GCA TTA GGG
CAG TAC CAT CGC AAT GGA TGC ATA TCT ATA
TGG AGG AAG ATT TGT CCA ATA TTC AAG GGC
ATT GGA TAT GCC ATT TGC ATA ATT GCG TTT
TAC ATA GCA TCC TAT TAT AAC ACA ATC ATG
GCT TGG GCA TTA TAT TAC TTA ATA TCG TCT
TTT ACC GAC CAA TTA CCA TGG ACC AGT TGC
AAA AAT AGT TGG AAC ACA GGG AAT TGC ACG
AAT TAC TTT TCC GAG GAT AAT ATA ACT TGG
ACA TTA CAC AGT ACA TCA CCC GCA GAG GAA
TTT TAT ACT CGA CAC GTG CTT CAA ATT CAT
CGT TCG AAA GGC CTT CAG GAC CTG GGG GGA
ATT AGT TGG CAA CTT GCA CTA TGT ATC ATG
CTT ATA TTC ACT GTA ATC TAT TTC AGC ATC
TGG AAG GGG GTT AAA ACG TCG GGC AAG GTT
GTT TGG GTT ACG GCT ACA TTT CCA TAT ATT
ATC CTG AGC GTG CTG CTA GTG CGA GGC GCT
ACT CTC CCC GGA GCC TGG AGA GGT GTA CTA
TTC TAC CTC AAG CCC AAT TGG CAG AAA CTT
TTA GAG ACT GGT GTA TGG ATA GAT GCG GCC
GCT CAA ATC TTT TTT TCA TTG GGG CCT GGT
TTT GGG GTG CTG TTA GCC TTC GCC AGT TAC
AAT AAA TTT AAC AAC AAT TGT TAT CAG GAC
GCG CTG GTC ACT TCT GTC GTA AAC TGT ATG
ACA AGC TTC GTG TCA GGA TTT GTC ATA TTC
ACT GTC CTT GGT TAC ATG GCT GAG ATG AGA
AAC GAA GAT GTC TCG GAA GTT GCC AAA GAT
GCT GGC CCT AGT CTA TTA TTC ATA ACT TAC
GCG GAA GCA ATA GCA AAT ATG CCG GCA AGT
ACC TTT TTC GCG ATC ATT TTC TTC TTG ATG
CTG ATT ACC CTA GGA CTG GAT AGT ACA TTC
GCG GGT TTG GAA GGA GTA ATC ACA GCA GTG
TTG GAC GAA TTC CCC CAC GTA TGG GCT AAA
AGG CGG GAG CGG TTC GTA CTC GCT GTG GTT
ATT ACA TGT TTC TTC GGT AGT TTA GTT ACC
CTA ACA TTT GGC GGG GCG TAT GTA GTA AAA
CTC CTT GAG GAA TAT GCT ACC GGC CCC GCC
GTC TTA ACA GTG GCC CTT ATT GAG GCG GTT
GCG GTT TCA TGG TTT TAC GGA ATT ACA CAG
TTC TGT CGC GAT GTC AAA GAG ATG TTG GGA
TTT AGC CCT GGG TGG TTT TGG CGA ATA TGT
TGG GTC GCT ATA TCC CCA CTT TTT CTC CTC
TTC ATC ATA TGC TCG TTC CTG ATG TCT CCG
CCA CAA CTC CGT CTC TTT CAG TAT AAC TAT
CCT TAC TGG AGC ATA ATT CTT GGG TAT TGT
ATT GGA ACG TCG TCC TTT ATC TGC ATC CCT
ACT TAT ATT GCA TAT CGC CTA ATA ATC ACT
CCA GGT ACC TTT AAA GAA CGG ATC ATC AAG
TCT ATC ACA CCG GAA ACG CCG ACG GAA ATA
CCG TGC GGA GAT ATA AGG CTC AAC GCA GTC

 

 

 

9. Determine all possible properties of the 4 protein sequences from Problem #8: molecular weight, isoelectric point, extinction coefficients, titration curve, etc.

 

For all cases go to the Determining Properties of a Protein Sequence link in the Applications page. Click on the links to the desired properties (MW, pI and extinction coefficients can be calculated with the first link; the titration curve can be found with the second link). Cut and paste the protein sequence (not the back-translated nucleotide sequence) into the box and follow the directions in the Applications.

 

 

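a) LCA_HUMAN ALPHA-LACTALBUMIN PRECURSOR (LACTOSE SYNTHASE B PROTEIN)

The values below were obtained by submitting the 426-base back-translated nucleotide sequence from Problem 8, so the tool read each A, C, G and T as Ala, Cys, Gly and Thr. They do not describe the 142-residue alpha-lactalbumin precursor; submit the protein sequence itself to obtain its properties.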

 

Number of amino acids: 426
 
Molecular weight: 35010.9
 
Theoretical pI: 5.26

 

Formula: C1300H2176N426O542S80
Total number of atoms: 4524
 
Extinction coefficients:
 
Conditions: 6.0 M guanidium hydrochloride
            0.02 M phosphate buffer
            pH 6.5
 
Extinction coefficients are in units of M^-1 cm^-1.
The first table lists values computed assuming ALL Cys 
residues appear as half cystines, whereas the second table 
assumes that NONE do. 
 
                      276     278     279     280     282
                       nm      nm      nm      nm      nm
Ext. coefficient     5800    5080    4800    4800    4800
Abs 0.1% (=1 g/l)   0.166   0.145   0.137   0.137   0.137
 
 
 
                      276     278     279     280     282
                       nm      nm      nm      nm      nm
Ext. coefficient        0       0       0       0       0
Abs 0.1% (=1 g/l)   0.000   0.000   0.000   0.000   0.000
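
The 280 nm column can be reproduced by simple arithmetic. The sketch below assumes per-residue contributions of 1280 (Tyr), 5690 (Trp) and 120 (cystine) M^-1 cm^-1, which are consistent with the first table: 80 Cys forming 40 cystines give 40 x 120 = 4800 with no Tyr or Trp present. The function names are hypothetical.

```python
# Assumed molar contributions at 280 nm, in M^-1 cm^-1.
EXT_TYR, EXT_TRP, EXT_CYSTINE = 1280, 5690, 120

def extinction_280(seq, cys_as_cystine=True):
    """Sum Tyr, Trp and (optionally) cystine contributions for a one-letter sequence."""
    seq = seq.upper()
    n_cystine = seq.count("C") // 2 if cys_as_cystine else 0
    return seq.count("Y") * EXT_TYR + seq.count("W") * EXT_TRP + n_cystine * EXT_CYSTINE

def abs_01_percent(ext, mol_weight):
    """Absorbance of a 1 g/l solution: extinction coefficient divided by molecular weight."""
    return ext / mol_weight

# Composition matching the table: 80 Cys, no Tyr or Trp (other residues do not absorb here).
example = "C" * 80 + "A" * 346
print(extinction_280(example), extinction_280(example, cys_as_cystine=False))
print(round(abs_01_percent(4800, 35010.9), 3))
```

The second table is all zeros because, with no Tyr or Trp and no cystines assumed, nothing in the sequence contributes at these wavelengths.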

 

 

 

 

Titration curve

pK values used:

     alpha-NH3 :  9.69        alpha-COOH :  2.34
     Arg : 12.40     Lys : 10.50     His :  6.00
     Asp :  3.86     Glu :  4.25     Cys :  8.33     Tyr : 10.00
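
A titration curve is the net charge of the protein plotted against pH, and the isoelectric point is the pH at which that charge is zero. The sketch below uses the pK values listed above with the Henderson-Hasselbalch relation and finds the zero crossing by bisection. It is an illustration under those assumptions, not the tool's own algorithm, and a program that uses a different pK set will report a somewhat different pI.

```python
# pK values from the list above.
PK_POS = {"R": 12.40, "K": 10.50, "H": 6.00}            # side chains that carry + charge
PK_NEG = {"D": 3.86, "E": 4.25, "C": 8.33, "Y": 10.00}  # side chains that carry - charge
PK_NTERM, PK_CTERM = 9.69, 2.34

def net_charge(seq, ph):
    """Net charge of a one-letter protein sequence at the given pH."""
    seq = seq.upper()
    pos = 1 / (1 + 10 ** (ph - PK_NTERM))
    neg = 1 / (1 + 10 ** (PK_CTERM - ph))
    for aa, pk in PK_POS.items():
        pos += seq.count(aa) / (1 + 10 ** (ph - pk))
    for aa, pk in PK_NEG.items():
        neg += seq.count(aa) / (1 + 10 ** (pk - ph))
    return pos - neg

def isoelectric_point(seq, lo=0.0, hi=14.0, tol=1e-4):
    """Bisection on pH: the net charge falls monotonically as pH rises."""
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if net_charge(seq, mid) > 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

# Points of a titration curve for a short example peptide.
for ph in range(0, 15, 2):
    print(ph, round(net_charge("MRFFVPLFLV", ph), 2))
```

For a sequence with no ionizable side chains the result reduces to the midpoint of the two terminal pK values, which is a convenient sanity check.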

 

 

 

 

 

This site is funded by the National Science Foundation.