Text-based and bibliographic search engines
- PubMed MEDLINE
NLM's search service to access the 9 million citations in MEDLINE
and Pre-MEDLINE (with links to participating on-line journals), and
other related databases.
- Entrez Search System. WWW Entrez allows you to
retrieve molecular biology data and bibliographic citations from the NCBI's
integrated databases.
- BioABACUS is a
searchable database of abbreviations and acronyms in Biotechnology that
contains terms in such categories as: Biochemistry, Bioinformatics,
Cell Biology, Computers and Internet, Diseases, Grants,
Journals, Laboratories, Medicine, Molecular Biology/Genetics,
Neuroscience, Other Organizations, Professional Societies and US
Government.
Mass Spectrometry sites
Protein characterization and identification
From ExPASy:
- AACompIdent
- Identify a protein by its amino acid composition
- MultiIdent
- Identify proteins with pI, Mw,
amino acid composition, sequence tag and peptide mass fingerprinting data
- TagIdent
- Identify proteins with pI, Mw
and sequence tag, or generate a list of proteins close to a given pI
and Mw.
- Translate
a DNA sequence.
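As a rough illustration of what composition-based identification and translation involve, the sketch below computes the percentage amino acid composition of a protein and translates a DNA sequence using the standard genetic code. The function names are hypothetical and this is not the ExPASy implementation; the real tools compare the composition against database entries and offer other genetic codes and reading frames.

```python
from collections import Counter
from itertools import product

# Standard genetic code, codons enumerated in TCAG order (third base varies fastest).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna):
    """Translate frame 1 of a DNA string; '*' marks stop codons, 'X' unknown codons."""
    dna = dna.upper()
    codons = (dna[i:i + 3] for i in range(0, len(dna) - 2, 3))
    return "".join(CODON_TABLE.get(codon, "X") for codon in codons)


def composition(protein):
    """Return the percentage of each residue present in a protein sequence."""
    counts = Counter(protein.upper())
    total = sum(counts.values())
    return {aa: 100.0 * n / total for aa, n in counts.items()}


print(translate("ATGGCCTAA"))
print(composition("AAGG"))
```

Trailing bases that do not fill a complete codon are ignored in this sketch.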
Sequence homology search engines
- BCM
(USA) General protein sequence/pattern searches. Programs
include fast methods (BLAST, FASTA, PROSITE) and full dynamic
programming methods (FASTA, BLAST, BLITZ, MPSEARCH).
- GeneQuiz
from the Sander group at EBI provides highly automated analysis of
biological sequences.
- SRS Sequence
Retrieval System at EMBL. SRS mirrors
worldwide.
- Similarity
Search Engines. Search Engines classified by e.g. Organism or
Protein family.
Blast Searches
- From NCBI
Basic BLAST search and
Advanced BLAST search.
- BLAST 2.0 at EMBL (Bork), EBI, ISREC (embnet).
- PSI-BLAST
search. Position-Specific Iterated BLAST. Especially
sensitive searches for subtle similarity signals.
- PHI-BLAST
search. PHI-BLAST (Pattern-Hit Initiated BLAST)
is a search program that combines matching of regular expressions with
local alignments surrounding the match.
- BLAST 2
SEQUENCES. This tool produces the alignment of two given
sequences using the BLAST engine for local alignment.
Currently only the blastn and blastp programs are available. Using
sequences longer than 150 kb is not recommended.
- BEAUTY
The BEAUTY (BLAST Enhanced Alignment Utility)
Post-Processor adds a variety of very useful information to BLAST
search results returned by the NCBI's BLAST server.
- BioSCAN
(Biological Sequence Comparative Analysis Node) is a computer system
employing special-purpose VLSI hardware to quickly search a biological
sequence database using the program AllSeg (similar to Blast) for
entries with similarities to a query sequence.
- SALSA Protein
Sequence Database Search, Version 1.8.2.
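The pattern-plus-local-alignment idea described for PHI-BLAST above can be sketched as a seeding step: find every regular expression hit in a sequence and keep a window of flanking residues that a local alignment could then be built around. This is only an illustration under that assumption, not the NCBI implementation; the function name, the `flank` parameter and the example pattern are hypothetical.

```python
import re


def pattern_seeds(sequence, regex, flank=5):
    """Return (start, end, window) for each non-overlapping regex hit.

    `window` is the hit plus up to `flank` residues of context on each
    side, i.e. the region a local alignment would be extended over.
    """
    seeds = []
    for match in re.finditer(regex, sequence):
        lo = max(0, match.start() - flank)
        hi = min(len(sequence), match.end() + flank)
        seeds.append((match.start(), match.end(), sequence[lo:hi]))
    return seeds


# Toy query: two cysteines around a histidine, with two residues of context.
print(pattern_seeds("MKTAYCAAHCGGWLL", "C..HC", flank=2))
```

Note that `re.finditer` reports non-overlapping hits only, which is a simplification.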
Fasta Searches
Smith-Waterman searches
- Bioccelerators
(BICs): Crake (EMBL), Shag (EMBL), Croma (EBI), Weizmann Inst.
Bioccelerators are dedicated, purpose-built search computers
similar to Blitz. Using them is the only way to
get the sensitivity provided by a Smith-Waterman search.
- Bic-SW uses
the Smith-Waterman algorithm to search SwissProt and sends the results
back. Bic is also available from Compugen.
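For readers unfamiliar with the algorithm behind these servers, here is a minimal sketch of the Smith-Waterman local alignment score. It assumes a simple match/mismatch scheme and a linear gap penalty (the values are illustrative choices); production servers typically use substitution matrices and affine gap penalties, so this should not be read as what the hardware above computes.

```python
def smith_waterman_score(a, b, match=2, mismatch=-1, gap=-1):
    """Return the best local alignment score between sequences a and b.

    Uses the classic dynamic programming recurrence with scores clamped
    at zero, keeping only two rows of the matrix in memory.
    """
    prev = [0] * (len(b) + 1)
    best = 0
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            substitution = match if a[i - 1] == b[j - 1] else mismatch
            curr[j] = max(
                0,                           # start a new local alignment
                prev[j - 1] + substitution,  # align a[i-1] with b[j-1]
                prev[j] + gap,               # gap in b
                curr[j - 1] + gap,           # gap in a
            )
            best = max(best, curr[j])
        prev = curr
    return best


print(smith_waterman_score("GATTACA", "TTAC"))
```

The running time grows with the product of the two sequence lengths, which is why full database searches with this method were delegated to dedicated hardware.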
Multiple alignment
Pattern and profile searches
Databases
Search engines
- PredictProtein (PP). PP is an automatic service for protein
database searches and the prediction of aspects of protein structure,
e.g. motifs and domains, secondary structure, solvent accessibility,
transmembrane helices and coiled-coil regions.
- ScanProsite
- Scan a sequence against the patterns from PROSITE, or a pattern against
SWISS-PROT and TrEMBL, at ExPASy
- ProfileScan
- Scan a sequence against the profile entries in PROSITE. This server
uses the
pfscan program to search a single sequence against currently
available profile databases.
- FindMod
- Predict potential protein post-translational modifications and
potential single amino acid substitutions in peptides. Experimentally
measured peptide masses are compared with the theoretical peptides
calculated from a specified SWISS-PROT entry or from a user-entered
sequence, and mass differences are used to better characterize the
protein of interest.
- MEME
Multiple EM for Motif Elicitation: Version 2.2. Motif
discovery tool. Use this form to submit DNA or protein sequences to MEME.
MEME will analyze your sequences for similarities among them and
produce a description (motif)
for each pattern it discovers. Your data will be processed on the Cray
T3E supercomputer at the San
Diego Supercomputer Center and the results will be sent to
you by e-mail.
- MAST
-- Motif Alignment and Search Tool: Version 2.2. Motif search
tool. Use this form to submit motifs to MAST to
be used in searching a sequence database. Your data will be processed
at the San Diego
Supercomputer Center and the results will be sent to you via
e-mail.
- Repeat
Finder using Blast. Finding repeats in Protein or DNA
sequences. All significant local alignments are reported following the
full end-to-end self-alignment.
- TargetFinder.
A new tool to perform database searches for candidate target genes of
DNA-binding proteins. This program allows you to search a
database of annotated sequences for binding sites located in context
with other important transcription regulatory signals and regions, such as
the TATA element, the transcription start site and the promoter,
thereby greatly reducing the background usually associated with this
kind of search.
- Pfam-A
HMM search at Washington
University - Scan a sequence against the PFAM HMM
protein families using Hidden Markov Models.
- FPAT -
Regular expression searches in protein databases.
- PRATT -
Interactively generates conserved patterns from a series of unaligned
proteins.
- CBS SignalP 1.1. Recognition of prokaryotic and eukaryotic signal
peptides.
- CBS TMHMM 0.1. This server is for prediction of transmembrane helices
in proteins.
- CBS NetOGlyc 2.0. Glycosylation of mammalian proteins.
- CBS DictyOGlyc 1.1. O-glycosylation sites in Dictyostelium discoideum
proteins.
- CBS NetPicoRNA 1.0. Posttranslational cleavage by picornaviral
proteases.
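Several of the servers above (ScanProsite, FPAT) match sequence patterns against proteins. A minimal sketch of that idea, assuming the usual PROSITE pattern conventions (`x` for any residue, `[...]` for allowed residues, `{...}` for excluded residues, `(n)` or `(n,m)` for repeats, `<` and `>` for sequence ends), converts a pattern into a Python regular expression and scans a sequence with it. The function names are hypothetical, and anchors inside brackets are not handled.

```python
import re


def prosite_to_regex(pattern):
    """Convert a PROSITE-style pattern into a Python regular expression."""
    parts = []
    for element in pattern.rstrip(".").split("-"):
        # Excluded residues: {P} -> [^P] (done before repeats reuse braces).
        element = element.replace("{", "[^").replace("}", "]")
        # Repeat counts: x(2,4) -> x{2,4}
        element = element.replace("(", "{").replace(")", "}")
        # Wildcard and sequence-end anchors.
        element = element.replace("x", ".")
        element = element.replace("<", "^").replace(">", "$")
        parts.append(element)
    return "".join(parts)


def scan(sequence, pattern):
    """Return (position, matched text) for every hit, overlapping hits included."""
    regex = re.compile("(?=(" + prosite_to_regex(pattern) + "))")
    return [(m.start(), m.group(1)) for m in regex.finditer(sequence)]


# N-glycosylation site pattern applied to a toy sequence.
print(prosite_to_regex("N-{P}-[ST]-{P}."))
print(scan("MKNGSAANPTQ", "N-{P}-[ST]-{P}."))
```

The lookahead wrapper is a design choice that lets overlapping sites be reported, which a plain `finditer` would skip.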
Structure prediction
- CBS CPHmodels. Protein structure from sequence: distance constraints.
- Swiss-Model
- an automated knowledge-based protein modelling server.
- Also found at ExPASy:
- ProtParam
- Physico-chemical parameters of a protein sequence (composition,
extinction coefficient, etc.)
- ProtScale
- Amino acid scale representation (Hydrophobicity, other conformational
parameters, etc.)
- SAPS
- Statistical analysis of protein sequences at ISREC (Also available at
EBI)
- PSORT - Prediction
of protein sorting signals and localization sites
- Coils
- Prediction of coiled coil regions in proteins (Lupas's method)
- Paircoil
- Prediction of coiled coil regions in proteins (Berger's method)
- Multicoil
- Prediction of two- and three-stranded coiled coils
- Vector
Alignment Search Tool (VAST). Protein structure neighbors in Entrez
are determined by direct comparison of 3-dimensional protein structures
with the VAST algorithm.
- Protein Data Bank (PDB)
at Brookhaven National Laboratory. The Protein Data Bank is an archive
of experimentally determined three-dimensional structures of biological
macromolecules.
- SCOP. Structural
Classification of Proteins. This is a service that among
other things allows you to enter a sequence and sequence related
information and to find proteins in SCOP which have sequence similarity.
- The IMB Jena Image
Library of Biological Macromolecules contains visual and
other information on three-dimensional biopolymer structures. It
provides access to all structure entries deposited at the Protein
Data Bank (PDB) or at the
Nucleic Acid Database (NDB).
In addition, general information on the architecture of biopolymer
structures is available.
- Services at
NIH Center for Molecular modelling. Links to a lot of 3D
sites.
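To show what an amino acid scale representation of the ProtScale kind looks like, the sketch below averages a hydrophobicity scale over a sliding window along a sequence. The choice of the Kyte-Doolittle hydropathy scale, the window size and the function name are assumptions for illustration; the server itself offers many scales and weighting options.

```python
# Kyte-Doolittle hydropathy values (chosen here as an example scale).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(protein, window=9, scale=KYTE_DOOLITTLE):
    """Return the mean scale value for each full window along the sequence.

    Raises KeyError if the sequence contains a residue missing from the scale.
    """
    values = [scale[residue] for residue in protein.upper()]
    return [
        sum(values[i:i + window]) / window
        for i in range(len(values) - window + 1)
    ]


# Windows with a high average suggest hydrophobic stretches.
for position, score in enumerate(hydropathy_profile("MKKLLIVLAAGG", window=5)):
    print(position, round(score, 2))
```

A sequence shorter than the window yields an empty profile.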
Gene finding
- A bibliography
on computational gene recognition. The papers listed in this
bibliography are an accumulation of more than 15 years of research in
computational molecular biology on this topic.
- GRAIL
version 1.3, at Oak Ridge National Laboratory (ORNL), is a suite
of tools designed to provide analysis and putative annotation of DNA
sequences both interactively and through the use of automated
computation.
- GenQuest, also
at ORNL, is an integrated sequence comparison server which
allows users to make use of a wide variety of sequence comparison
methods and target databases.
- PROCRUSTES
WWW server. PROCRUSTES is
based on the so-called spliced alignment algorithm, which explores all
possible exon assemblies and finds the multi-exon structure with the
best fit to a related protein. Basic
mode: maps of predicted exon-intron structures aligned
against each other (the highest-scoring, most reliable prediction is
highlighted), predicted proteins, spliced alignments, alignment plots. Test
mode: a user-defined exon-intron structure of the genomic sequence
is used for computing the quality of prediction.
- Genome
analysis on the web. A website with a lot of different tools
for gene finding at the Genome Sequencing Centre Jena.
- Entrez Genomes
was created to provide a practical approach to the handling of complete
genomes (large and small) as well as genetic and physical maps. This
document provides details on the organization of this information.
- The Expressed
Genome Anatomy Database, EGAD, was constructed by extraction
and curation of sequences from GenBank
to create a non-redundant set of human (HT) and non-human (ET)
transcript sequences. In some cases, transcripts were created by
splicing together distinct GenBank accessions for each exon in those
transcripts, or by splicing exons from a genomic sequence.
- PEDANT,
Protein Extraction, Description, and ANalysis Tool. Computational
analysis of complete genomic sequences and experimental unfinished
genomic sequences.
- OMIM
Online Mendelian Inheritance in Man. This database is a catalog of
human genes and genetic disorders authored and edited by Dr. Victor A.
McKusick and his colleagues at Johns Hopkins and elsewhere. The
database contains textual information, pictures, and reference
information. It also contains copious links to NCBI's Entrez
database of MEDLINE articles and sequence information.
- DerBrowser
a (German) Genome Navigator.
Cancer gene links
- The Cancer Genome Anatomy Project (CGAP) is an interdisciplinary program to establish the
information and technological tools needed to decipher the molecular
anatomy of the cancer cell.
- SAGE,
Serial Analysis of Gene Expression, is an experimental technique designed to gain a
quantitative measure of gene expression. The SAGE technique itself
includes several steps utilizing molecular biological, DNA sequencing
and bioinformatics techniques. Visit the analysis
tools for the SAGE library data.
Selected Databases
Human
Mouse
Yeast
Some other databases
- Sequenced genomes database. Fully
sequenced genomes present in public databases.
- All
known Genome databases, found at
ProteoMetrics, LLC.
- Databanks
of Gene Structure and Regulation at Sanger
Centre.
- MIPS
genome project page with several search engines,
for example ALERT. The Alert utility is
designed to keep you abreast of new protein sequences added to
the protein and nucleic acid databases by sending you, once per week
via email, the new database entries related to your field of interest.
- TIGR Database (TDB)
is a collection of curated databases containing DNA and protein
sequence, gene expression, cellular role, protein family, and taxonomic
data for microbes, plants and humans. See also What's New at TIGR.
- The
Merck Gene Index.
- Celis'
Proteome Database at University of Aarhus.
- SwissProt
- OWL non-redundant protein sequence database
- DIP -
Database of Interacting Proteins.
- Swiss2D
SWISS-2DPAGE contains data on
proteins identified on various 2-D PAGE reference maps.
Other very useful servers