Features & Details
Paracel BLAST Overview
Researchers are now regularly investigating the largest and most complex genome sequences on the planet. As the size and complexity of sequences increase, the large database and high throughput requirements of performing even a simple BLAST sequence similarity search can dramatically escalate.
The best solution to these increased computational demands is high-performance parallel computing, which splits large, complex tasks across multiple processors. Paracel BLAST is the only commercially supported BLAST solution for parallel computing. Paracel recoded the original NCBI BLAST source code from the ground up, eliminating important performance bottlenecks and optimizing it for use on parallel platforms.
- Search large databases without loss of performance
Searches that previously failed because of their sheer size can be executed rapidly with Paracel BLAST. Researchers can routinely complete searches that would be prohibitively long using traditional methods.
- Automatic parallelization, queuing, and scheduling
Paracel BLAST automatically executes your searches in parallel on multiple processors, giving you the highest performance without complex administration.
Paracel BLAST scales up to run very rapidly on multi-CPU systems. A 32-CPU Linux cluster can complete in several hours a large database search that would take days using NCBI code on an ordinary single-processor system.
Enhancements in Parallelism
A key reason why Paracel BLAST is far superior to other solutions is its ability to run incredibly large sequences on massively parallel systems. Paracel BLAST's parallel-computing enhancements include:
- Incorporates a custom scheduler that is tightly coupled with the application to promote job parallelism and dynamically control query parallelism.
- Performs splitting and merging of results internally, which removes the need for external parsing and means the report is generated only once.
- Splits the database dynamically, so the user does not have to split it and assign processes manually when the database is loaded, or re-split it when the number of processors changes.
- Keeps track of which nodes have which sections of the database loaded into their RAM, allowing the integrated scheduler to choose the best node to perform the given task.
- Partitions jobs over multiple processors. Paracel BLAST tends to over-partition so that, instead of becoming idle, each node has a new task to work on immediately after finishing its initial task.
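To make the ideas in this list concrete, here is a minimal sketch of over-partitioning, cache-aware node selection, and internal merging of results. It is not Paracel's implementation, which this page does not describe. Every name in it (split_database, pick_chunk, schedule, merge_hits, the over_partition factor, and the score-based merge) is a hypothetical chosen for illustration.

```python
import heapq
from itertools import chain


def split_database(num_sequences, num_nodes, over_partition=4):
    """Split a database into more chunks than nodes (over-partitioning).

    Returns a list of (start, end) half-open index ranges.
    The over_partition factor is an assumed tuning knob.
    """
    num_chunks = max(1, min(num_sequences, num_nodes * over_partition))
    base, extra = divmod(num_sequences, num_chunks)
    chunks, start = [], 0
    for i in range(num_chunks):
        size = base + (1 if i < extra else 0)  # spread the remainder evenly
        chunks.append((start, start + size))
        start += size
    return chunks


def pick_chunk(node, pending, cache):
    """Prefer a pending chunk already held in this node's RAM."""
    for chunk_id in pending:
        if chunk_id in cache[node]:
            return chunk_id
    return pending[0]  # nothing cached: take the next chunk in line


def schedule(chunk_costs, num_nodes, cache):
    """Simulate handing chunks to whichever node becomes free first.

    chunk_costs[i] is an assumed run time for chunk i.
    cache maps node -> set of chunk ids in RAM; it is updated in place.
    Returns (assignment per node, total elapsed time).
    """
    pending = list(range(len(chunk_costs)))
    free_at = [(0.0, node) for node in range(num_nodes)]
    heapq.heapify(free_at)
    assignment = {node: [] for node in range(num_nodes)}
    while pending:
        now, node = heapq.heappop(free_at)  # earliest idle node
        chunk_id = pick_chunk(node, pending, cache)
        pending.remove(chunk_id)
        cache[node].add(chunk_id)  # chunk now resides on this node
        assignment[node].append(chunk_id)
        heapq.heappush(free_at, (now + chunk_costs[chunk_id], node))
    makespan = max(t for t, _ in free_at)
    return assignment, makespan


def merge_hits(per_chunk_hits, top_k):
    """Merge per-chunk (score, subject_id) hits into one ranked list."""
    return heapq.nlargest(top_k, chain.from_iterable(per_chunk_hits))
```

For example, with two nodes where node 0 already holds chunk 2 in memory, schedule([1, 1, 1, 1], 2, {0: {2}, 1: set()}) gives chunk 2 to node 0 first, and each node picks up a further chunk as soon as it finishes rather than sitting idle. A real merge step would also need to reconcile search statistics across chunks, which this sketch leaves out.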