Superior homology search software
PatternHunter
Benchmark
Here we provide a comparison of PatternHunter with Blastn and MegaBlast via BL2SEQ, using the most favorable parameters for Blastn and MegaBlast and standard parameters for PatternHunter.
Computer: Pentium III 700 MHz, Red Hat Linux 7.1, 1 GB main memory
| Sequence Length | Blastn | PatternHunter |
| 816k vs 580k | 47 sec | 9 sec |
| 4639k vs 1830k | 716 sec | 44 sec |
| 20M vs 18M | out of memory | 13 min |
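To make the gap easier to read, the timings in the table can be turned into speedup ratios. The following is a minimal sketch that uses only the two rows where both programs finished; the function name `speedup` and the dictionary layout are illustrative choices, not part of either tool.

```python
# Timings copied from the table above, in seconds.
# The 20M vs 18M row is omitted because Blastn ran out of memory.
timings = {
    "816k vs 580k": {"blastn": 47, "patternhunter": 9},
    "4639k vs 1830k": {"blastn": 716, "patternhunter": 44},
}


def speedup(blastn_seconds, patternhunter_seconds):
    """Return how many times faster PatternHunter was than Blastn."""
    return blastn_seconds / patternhunter_seconds


for pair, t in timings.items():
    ratio = speedup(t["blastn"], t["patternhunter"])
    print(f"{pair}: {ratio:.1f}x")
# prints:
# 816k vs 580k: 5.2x
# 4639k vs 1830k: 16.3x
```

On these two measurements the advantage grows with input size, although two data points are too few to establish a trend.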
Here's how the time and memory use compare with MegaBlast on long sequences:
The output quality is also on par with the default Blastn and much superior to MegaBlast; the next figure shows
a typical comparison of how alignment scores fall off (from best to worst):
At default Blastn sensitivity, PatternHunter runs at MegaBlast speed while using only 1/4 of the memory used by either program. For a genome of length N, PatternHunter requires about 8N bytes of internal memory. When given two inputs of lengths M and N, PatternHunter requires about M + 8N bytes of internal memory. Memory usage can be reduced further with PatternHunter's
automatic database partitioning feature.
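The memory rule of thumb above can be checked with a few lines of arithmetic. This is a rough sketch, not PatternHunter's own accounting: the function names are hypothetical, lengths are assumed to be counted in bases, and the example assumes that "20M vs 18M" means M = 20,000,000 and N = 18,000,000 (the text does not say which input plays which role).

```python
def estimate_memory_bytes(n, m=0):
    """Estimate internal memory from the rule of thumb: about M + 8N bytes.

    n: length of the input that costs about 8 bytes per base.
    m: length of the second input (0 when there is only one sequence).
    """
    return m + 8 * n


def to_megabytes(num_bytes):
    """Convert bytes to decimal megabytes (10**6 bytes)."""
    return num_bytes / 1_000_000


# Single genome of length N = 18,000,000: about 8N bytes.
print(to_megabytes(estimate_memory_bytes(18_000_000)))              # 144.0

# Two inputs, assuming M = 20,000,000 and N = 18,000,000: about M + 8N bytes.
print(to_megabytes(estimate_memory_bytes(18_000_000, 20_000_000)))  # 164.0
```

Under these assumptions the estimate for the largest benchmark pair is roughly 164 MB, comfortably inside the 1 GB of main memory on the test machine.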
We also compared the running time and sensitivity of different configurations of
PatternHunter with BLAST. In the following figure, the sensitivity of the Smith-Waterman
algorithm is defined as 100%, and the sensitivity curves of PatternHunter and BLAST show
what fraction of the homologies found by Smith-Waterman each program also finds.
The data we used in this comparison are approximately 30k mouse EST sequences (25Mb) and 4k human EST sequences (3Mb).
According to the figure, PatternHunter with 4 seeds runs at the same speed as BLAST but with sensitivity close to that of Smith-Waterman.
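The sensitivity measure used in this comparison can be sketched as follows: treat the homologies reported by Smith-Waterman as the reference set and count how many of them a faster program also reports. The function name, the representation of a homology as a pair of sequence IDs, and the sample IDs are all hypothetical and are not taken from the benchmark data.

```python
def sensitivity(reference_hits, tool_hits):
    """Return the fraction of reference homologies that the tool also found.

    Both arguments are sets of (query_id, subject_id) pairs. The reference
    set plays the role of Smith-Waterman, whose sensitivity is 100% by
    definition.
    """
    if not reference_hits:
        raise ValueError("reference set must not be empty")
    return len(reference_hits & tool_hits) / len(reference_hits)


# Made-up example data, for illustration only.
smith_waterman = {("m1", "h1"), ("m2", "h1"), ("m3", "h2"), ("m4", "h3")}
fast_tool = {("m1", "h1"), ("m3", "h2"), ("m4", "h3"), ("m9", "h9")}

print(f"{sensitivity(smith_waterman, fast_tool):.0%}")  # 75%
```

Hits that the faster program reports but Smith-Waterman does not, such as `("m9", "h9")` here, do not raise the score, because only the reference set counts toward the denominator.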