SEQATOMS -- a web tool for identifying missing regions in PDB in sequence context
What you can do:
Visualize all "missing" protein regions in PDB.
Highlights:
- With over 46 000 proteins, the Protein Data Bank (PDB) is the most important database of structural information on biological macromolecules. PDB files contain sequence and coordinate information.
- Residues present in the sequence can be absent from the coordinate section, which means their position in space is unknown.
- Similarity searches are routinely carried out against sequences taken from PDB SEQRES. However, SEQRES makes no distinction between residues whose position in the 3D protein structure is known and those whose position is unknown.
- We present a FASTA sequence database that is produced by combining the sequence and coordinate information.
- All residues absent from the PDB coordinate section are masked with lower-case letters, thereby providing a view of these residues in the context of the entire protein sequence, which facilitates inspecting 'missing' regions.
- We also provide a masked version of the CATH domain database.
- A user-friendly BLAST interface is available for similarity searching.
- In contrast to standard (stand-alone) BLAST output, which only contains upper-case letters, our output retains the lower-case letters of the masked regions.
- Thus, our server can be used to perform case-sensitive BLAST searches.
- Here, we have applied it to the study of missing regions in their sequence context.
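The masking idea described in the highlights can be illustrated with a minimal sketch. It is not the SEQATOMS implementation: the function names `mask_missing` and `missing_regions` are hypothetical, the example sequence is made up, and the sketch assumes the hard part (mapping SEQRES positions to residues that actually have coordinates) has already been done and is supplied as a set of 0-based indices.

```python
import re
from typing import List, Set, Tuple


def mask_missing(seqres: str, observed: Set[int]) -> str:
    """Lower-case every residue whose 0-based SEQRES index is not in `observed`.

    `observed` is assumed to hold the indices of residues that have
    coordinates; deriving it from a PDB file is outside this sketch.
    """
    return "".join(
        aa.upper() if i in observed else aa.lower()
        for i, aa in enumerate(seqres)
    )


def missing_regions(masked: str) -> List[Tuple[int, int, str]]:
    """Return (start, end, residues) for each lower-case run.

    Positions are 1-based and inclusive, counted along the full sequence.
    """
    return [
        (m.start() + 1, m.end(), m.group())
        for m in re.finditer(r"[a-z]+", masked)
    ]


# Made-up example: residues at indices 3..11 have coordinates,
# the two termini do not.
seqres = "MKTAYIAKQRQISFVK"
observed = set(range(3, 12))

masked = mask_missing(seqres, observed)
print(masked)                   # mktAYIAKQRQIsfvk
print(missing_regions(masked))  # [(1, 3, 'mkt'), (13, 16, 'sfvk')]
```

Because the masked residues stay in place, each missing region can be read off directly in the context of the full sequence, which is the property the lower-case convention is meant to provide.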
Keywords:
- PDB
- protein data bank
- sequence similarity
- missing protein regions
- SEQRES
- protein sequence
- protein coordinate
- similarity searching
Literature & Tutorials:
This record last updated: 07-24-2008