How Magic-BLAST works
Index vs. BLAST database
Unlike most mapping tools, Magic-BLAST does not build an index of a genome and instead it builds an index of a batch of reads and scans a BLAST database for potential matches. BLAST database can be created from a FASTA file in seconds or minutes instead of hours for most indices. It also allows for mapping to or searching arbitrarily large collections of sequences.
Magic-BLAST can also work with a genome as a FASTA file. However it is not recommended for more than a few million bases, because mapping to a FASTA file is much slower than to a BLAST database.
Seed and extend
Magic-BLAST works similarly to other BLAST programs. First it finds a seed alignment, an exact 16-base match and extends alignment to the left and right. Shorter alignments are combined over introns if splice signals are found. For paired reads, the cumulative pair score is used to select the best mapping.
Database word counts
To avoid mapping to repeats Magic-BLAST scans the database and counts 16-base words. Those that appear more than 10 times are not extended.
This functionality can be turned off with
-limit_lookup F option and should not be used when mapping to transcripts.