Project Overview

This project searches a very large sequence of characters (a DNA genome) for exact or partial matches of substrings, finds patterns in that sequence, and reports the results to the user succinctly.
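As a rough illustration of the two kinds of search described above, here is a minimal Python sketch. It is not the project's code: the function names `find_exact` and `find_partial` are hypothetical, and it assumes that a "partial" match means a same-length window that differs from the pattern in at most a given number of characters (Hamming distance). The project may define partial matches differently.

```python
def find_exact(genome: str, pattern: str) -> list[int]:
    """Return every start index where pattern occurs exactly (overlaps included)."""
    hits = []
    pos = genome.find(pattern)
    while pos != -1:
        hits.append(pos)
        # Resume one character later so overlapping occurrences are found too.
        pos = genome.find(pattern, pos + 1)
    return hits


def find_partial(genome: str, pattern: str, max_mismatches: int) -> list[tuple[int, int]]:
    """Return (start index, mismatch count) for each window of len(pattern)
    that differs from pattern in at most max_mismatches positions.

    Assumption: "partial match" is modeled as substitutions only
    (no insertions or deletions).
    """
    m = len(pattern)
    hits = []
    for start in range(len(genome) - m + 1):
        mismatches = 0
        for a, b in zip(genome[start:start + m], pattern):
            if a != b:
                mismatches += 1
                if mismatches > max_mismatches:
                    break  # too many differences, give up on this window
        else:
            # Loop finished without break: window is within the mismatch budget.
            hits.append((start, mismatches))
    return hits


if __name__ == "__main__":
    sample = "ACGTACGTTACG"  # made-up sequence for illustration only
    print(find_exact(sample, "ACG"))
    print(find_partial(sample, "ACGA", max_mismatches=1))
```

The partial search here is a brute-force scan that costs time proportional to genome length times pattern length, so it is only meant to make the idea concrete. On a file with millions of characters, an indexed or parallel approach would be needed.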

This project extends work that Brian Delgado and several others did in CS 510 Multi-core Programming in Summer 2008. For now, we plan to extend only Brian's code, but two of the previous members (Geoff Shauger and Dave Revell) have given us permission to build on their prior work if needed.

Ideal project outcome: a tool that lets researchers quickly search large genome files for matches between different genomes.

Finding a needle in a haystack (2.7M characters in a small sample DNA file)