UniqueProt -- creating representative protein sequence sets
What you can do:
Create representative, unbiased data sets of protein sequences.
Highlights:
- UniqueProt takes a submitted list of proteins and derives a list of sequence-unique proteins in two steps: it first compares the sequences with BLAST, then applies a greedy algorithm to the BLAST output to select a representative set with maximum coverage and minimum redundancy.
- A simple greedy algorithm finds the largest possible representative sets, using the HSSP-value as the measure of sequence similarity.
- UniqueProt is not a real clustering program, in the sense that the 'representatives' are not at the centres of well-defined clusters, since the definition of such clusters is problem-specific.
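The greedy selection described in the highlights above can be illustrated with a minimal Python sketch. It is not UniqueProt's actual implementation: the function name `greedy_representatives`, the `(id_a, id_b, hssp_value)` hit format, the default threshold of 0.0, and the rule of removing the most-connected protein first are all assumptions made for illustration. The sketch assumes the pairwise HSSP-values have already been computed from the BLAST alignments.

```python
from collections import defaultdict


def greedy_representatives(proteins, hits, threshold=0.0):
    """Greedy redundancy reduction (illustrative sketch, not UniqueProt's code).

    proteins  -- iterable of protein identifiers
    hits      -- iterable of (id_a, id_b, hssp_value) tuples (hypothetical format)
    threshold -- pairs with hssp_value >= threshold count as similar (assumed rule)
    """
    # Build an undirected "too similar" graph from the pairwise hits.
    neighbours = defaultdict(set)
    for a, b, hval in hits:
        if a != b and hval >= threshold:
            neighbours[a].add(b)
            neighbours[b].add(a)

    remaining = set(proteins)
    while remaining:
        # Pick the protein with the most similar partners still in the set;
        # sorting first makes tie-breaking deterministic.
        worst = max(sorted(remaining),
                    key=lambda p: len(neighbours[p] & remaining))
        if not neighbours[worst] & remaining:
            break  # no similar pair is left, so the set is non-redundant
        remaining.discard(worst)
    return sorted(remaining)


if __name__ == "__main__":
    example_hits = [("A", "B", 5.0), ("B", "C", 12.0), ("C", "D", -3.0)]
    print(greedy_representatives(["A", "B", "C", "D"], example_hits))
```

In this toy input, B is similar to both A and C, so removing B alone leaves a set in which no pair reaches the threshold. The result is a non-redundant subset rather than a set of cluster centres, which matches the caveat about clustering above.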
Keywords:
- protein sequence analysis tool
- protein sequence alignment tool
- protein family
Literature & Tutorials:
This record last updated: 04-21-2014