Frequently Asked Questions

 


What is BLAST?

What are the different types of BLAST and what do they do?

What are Similarity, Identity and Homology?

What is FASTA?

What do the scores mean in BLAST and FASTA?

What is the difference between BLAST and FASTA?

What is NCBI?

What is GenBank?

What is Entrez?

What is PubMed?

Related Sites


What is BLAST?

BLAST (Basic Local Alignment Search Tool) compares a query protein or nucleotide sequence against a database of known sequences. A BLAST search is often used to determine how similar a sequence is to previously published sequences; if the sequence has not yet been published, the matches can give some insight into the function of the DNA or protein.

What are the different types of BLAST and what do they do?

blastn compares your query nucleotide sequence with a database of nucleotide sequences.

blastp compares your query protein sequence with a database of protein sequences that were derived from cDNA of interest.

blastx first translates your query nucleotide sequence into amino acids in all six reading frames (three forward and three reverse), then compares the resulting protein sequences with a protein database.

tblastn compares your query protein sequence with a nucleotide database after translating each database sequence into protein in all six reading frames. (This algorithm takes a long time, but is more likely to find distantly related sequences than blastn, blastx, and blastp.)

tblastx translates both the query nucleotide sequence and the database nucleotide sequences in all six reading frames and then compares the protein sequences. (This, like tblastn, is very time-consuming, but finds more results.)
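The six-frame translation that blastx, tblastn and tblastx rely on can be sketched in a few lines of Python. This is only an illustration of the idea, not BLAST's own implementation: it assumes the standard genetic code, and the function names (`reverse_complement`, `translate`, `six_frame_translations`) are hypothetical names chosen for this example.

```python
from itertools import product

# Standard genetic code, built from the conventional TCAG codon ordering.
# "*" marks a stop codon.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna):
    """Return the reverse complement of a DNA string."""
    return dna.upper().translate(COMPLEMENT)[::-1]


def translate(dna):
    """Translate a DNA string codon by codon, ignoring a trailing partial codon.

    Codons containing characters other than A, C, G, T become "X".
    """
    dna = dna.upper()
    return "".join(
        CODON_TABLE.get(dna[i:i + 3], "X")
        for i in range(0, len(dna) - 2, 3)
    )


def six_frame_translations(dna):
    """Return the three forward (+1..+3) and three reverse (-1..-3) translations."""
    forward = dna.upper()
    reverse = reverse_complement(dna)
    frames = {}
    for offset in range(3):
        frames[f"+{offset + 1}"] = translate(forward[offset:])
        frames[f"-{offset + 1}"] = translate(reverse[offset:])
    return frames


if __name__ == "__main__":
    for frame, protein in six_frame_translations("ATGGCCTAA").items():
        print(frame, protein)
```

Each of the six protein strings is then what would be compared against protein sequences; doing this for every sequence in a nucleotide database is why tblastn and tblastx take so much longer.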

What are Similarity, Identity and Homology?

Identity is the degree of exact correspondence between two aligned sub-sequences, with no gaps between them, usually expressed as the percentage of positions that match. As a rough guide, an identity of 25% or higher implies similarity of function, while 18-25% implies similarity of structure or function. Keep in mind that two completely unrelated or random sequences (longer than 100 residues) can have higher than 20% identity.

Similarity is the degree of resemblance between two sequences when they are compared. It depends on their identity.

Homology refers to the resemblance or similarity between two sequences that results from the organisms sharing common ancestry (that is, descending from a common evolutionary ancestor).
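To make the identity figure concrete, here is a minimal sketch of how percent identity could be computed for two sub-sequences that are already aligned without gaps. The function name `percent_identity` and the example peptides are made up for illustration; real alignment programs report this value for you.

```python
def percent_identity(seq_a, seq_b):
    """Percent of positions with identical residues in two ungapped, aligned sequences."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be the same length (ungapped alignment)")
    if not seq_a:
        raise ValueError("sequences must not be empty")
    matches = sum(a == b for a, b in zip(seq_a.upper(), seq_b.upper()))
    return 100.0 * matches / len(seq_a)


if __name__ == "__main__":
    # Hypothetical 8-residue peptides that differ at two positions.
    print(percent_identity("MKTAYIAK", "MKTAHIAR"))
```

A value like this says nothing by itself about homology; as noted above, unrelated sequences can still show identity above 20%.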

What is FASTA?

STILL WORKING ON ANSWER FOR THIS QUESTION

What do the scores mean in BLAST and FASTA?

STILL WORKING ON ANSWER FOR THIS QUESTION.

 

What is Entrez?

Entrez is a search engine that searches numerous databases supported by NCBI, including PubMed, OMIM, GenBank, MMDB, and more.

What is NCBI?

The National Center for Biotechnology Information was established in 1988 as a national resource for molecular biology information. It creates public databases, conducts research in computational biology, develops software tools for analyzing genome data, and disseminates biomedical information - all for the better understanding of molecular processes affecting human health and disease.

What is GenBank?

GenBank is a large database, maintained by the US government for public use, of nucleotide sequences submitted by scientists from around the world. Each sequence is given an ID, or gi number, for easy identification in the database.

What is PubMed?

PubMed is the literature citation database at the National Center for Biotechnology Information. It can be searched through Entrez with a simple keyword search. Full articles are not provided in this database; only citations and abstracts are available to view.
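A keyword search of this kind can also be written as a URL. The sketch below assumes NCBI's E-utilities interface to Entrez (its `esearch.fcgi` endpoint with the `db`, `term` and `retmax` parameters), which this FAQ does not otherwise cover; the helper name `build_esearch_url` is hypothetical. It only builds the URL and does not contact NCBI.

```python
from urllib.parse import urlencode

# Assumed endpoint: NCBI E-utilities "esearch", the programmatic form of an Entrez search.
EUTILS_ESEARCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def build_esearch_url(database, keywords, max_results=20):
    """Build an Entrez search URL for the given database and keyword string."""
    params = {"db": database, "term": keywords, "retmax": max_results}
    return f"{EUTILS_ESEARCH}?{urlencode(params)}"


if __name__ == "__main__":
    # Example: search PubMed citations for a keyword phrase.
    print(build_esearch_url("pubmed", "developmental biology", 5))
```

Changing the `database` argument would point the same keyword search at a different Entrez database.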

 

For more information on these and many other related topics, please visit the following sites.

Glossary on CU Bioinformatics Website

 

MCDB 4660 Developmental Biology Lab 5 (Spring 2001)

(You will need Adobe Acrobat on your computer to open the downloaded file.)