1. Create all sub-strings from a DNA file and store them efficiently in memory, along with their locations in the file.
2. Search for exact matches of sub-strings in the data set.
3. Search for partial matches of sub-strings in the data set.
4. Run searches in parallel on multi-core systems.
5. Generate statistics for each input file: the distance between matching sub-strings, and string frequency (i.e. how often string X occurs in the DNA file).
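
The indexing, exact-search, and statistics features above could be sketched roughly as follows. This is only an illustration, not the project's actual implementation: it assumes sub-strings of a single fixed length k, and the names build_index, find_exact, frequency, and gaps are hypothetical. A plain dictionary is used for clarity; a suffix array or a packed 2-bit encoding would likely use less memory on large files.

```python
from collections import defaultdict


def build_index(sequence, k):
    """Map every length-k sub-string to the list of offsets where it starts."""
    index = defaultdict(list)
    for i in range(len(sequence) - k + 1):
        index[sequence[i:i + k]].append(i)
    return index


def find_exact(index, query):
    """Return the start offsets of query, in ascending order.

    Assumes len(query) equals the k used to build the index.
    """
    return index.get(query, [])


def frequency(index, query):
    """How often does query occur in the indexed sequence?"""
    return len(find_exact(index, query))


def gaps(index, query):
    """Distances between consecutive occurrences of query."""
    positions = find_exact(index, query)
    return [b - a for a, b in zip(positions, positions[1:])]


if __name__ == "__main__":
    idx = build_index("ACGTACGTAC", 3)
    print(find_exact(idx, "ACG"), frequency(idx, "ACG"), gaps(idx, "ACG"))
```

Because each sub-string maps directly to its offsets, an exact lookup is a single dictionary access, and the frequency and distance statistics fall out of the same offset list.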
Future Features
1. Statistical graphs
2. Find matches between strings that differ in only N positions
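
One way the second future feature might work is a sliding-window comparison using Hamming distance (the number of positions at which two equal-length strings differ). The sketch below is an assumption about how it could be done, not a description of planned code; the names hamming and find_within are hypothetical, and "differ in only N" is read here as "differ in at most N positions".

```python
def hamming(a, b):
    """Count the positions at which two equal-length strings differ."""
    if len(a) != len(b):
        raise ValueError("strings must have equal length")
    return sum(x != y for x, y in zip(a, b))


def find_within(sequence, query, max_mismatches):
    """Return (offset, mismatches) for every window within max_mismatches of query."""
    k = len(query)
    hits = []
    for i in range(len(sequence) - k + 1):
        d = hamming(sequence[i:i + k], query)
        if d <= max_mismatches:
            hits.append((i, d))
    return hits


if __name__ == "__main__":
    print(find_within("ACGTACCTAC", "ACGT", 1))
```

This brute-force scan compares the query against every window, so it costs roughly len(sequence) * len(query) character comparisons; an index-based approach would be needed for large files.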