A parallel reconfigurable platform for efficient sequence alignment

Bioinformatics is a rapidly growing field, and a major part of it deals with DNA. DNA analysis requires large amounts of memory and highly efficient computation to produce accurate outputs. Researchers use various bioinformatics algorithms for sequencing and pattern detection, but these computations still take an enormous amount of time. We propose time-, memory- and speed-optimized algorithms for efficiently finding repeats in genomes and proteins. Another major aspect is the hardware implementation, which further reduces the complexity of the process. We therefore propose to implement the optimized algorithms on a reconfigurable and user-friendly FPGA platform. Our proposal thus focuses on efficient and optimized computation, analysis and sequencing of DNA patterns. Its distinctive feature is a reduction in computation time from several hours to a few seconds.


INTRODUCTION
As the set of known biological species expands, finding repetitive structures in genomes and proteins is important for understanding their biological functions. DNA sequencing is the process of determining the precise order of nucleotides within a DNA molecule. It includes any method or technology used to determine the order of the four bases, Adenine, Guanine, Cytosine and Thymine, in a strand of DNA (Surendar et al., 2013). The advent of rapid DNA sequencing methods has greatly accelerated biological and medical research and discovery. DNA is one of the most important factors that determine heredity, and one of its well-known features is its repetitive structures. As the number of maximal repeats increases, finding those structures becomes tedious. The major disadvantage of existing methods, such as the Burrows-Wheeler transform and wavelet coding, is their time consumption; they also need a large amount of computer memory to process the structures. Many existing methods propose different data compression formats to reduce space consumption. Although these methods save memory, they do not achieve time and speed efficiency. To optimize the bioinformatics algorithms, the following tools are used: i) Bloom filter; ii) content-addressable memory; iii) Aho-Corasick algorithm.

TOOLS FOR OPTIMIZATION

Field-programmable gate array (FPGA)
A field-programmable gate array (FPGA) is an integrated circuit designed to be configured by a customer or a designer. The FPGA configuration is generally specified using a hardware description language (HDL). FPGAs are reprogrammable silicon chips. They provide hardware-timed speed and reliability, and they are truly parallel in nature. Where a design may previously have included 6 to 10 ASICs, the same design can now be achieved using only one FPGA (Surendar et al., 2013).

Altium NanoBoard 3000, Xilinx variant
The Altium NanoBoard 3000, Xilinx variant, is a good entry point for discovering and exploring the world of soft design. It is a programmable hardware platform that allows rapid and interactive implementation and debugging of digital designs. It has a fixed user FPGA on the motherboard, which increases the speed of processing. Circuits can be probed, analyzed and debugged interactively using an array of virtual instruments and JTAG-based monitoring features.

Bloom filter
A Bloom filter (Arun and Krishnan, 2011) is a space-efficient probabilistic data structure that is used to test whether an element is a member of a set (Figure 1). This compact representation is the payoff for allowing a small rate of false positives in membership queries; that is, a query might incorrectly recognize an element as a member of the set, although careful design can make the false positive rate negligible. The filter consists of a number of hash tables and hash functions that store and handle the incoming strings. Each hash table entry stores only a single bit of data, so a hash table of size M is made up of M entries, each one bit in size. A Bloom filter must be trained with a dictionary of malicious strings before it can be used in a system. All hash table entries are initialized to 0 before training begins. During the training phase, malicious strings are fed one at a time to the Bloom filter. Each of the k hash functions then acts on every incoming dictionary string (Table 1) and computes an output in the range 0 to M − 1. The entries corresponding to these k outputs are set to 1 in the hash table. During querying, the same k hash functions are applied to the input string, and if all k corresponding entries are 1, the Bloom filter reports the current string in its window as a member of the dictionary on which it was trained. If a non-dictionary input string (Table 2) hashes to k hash table entries, each of which was set to 1 by one or more dictionary elements during the training phase, the Bloom filter will erroneously report this string as a member of the dictionary; this is a false positive. Although a Bloom filter may occasionally report false positives, it does not allow false negatives: a true member never goes unreported. The working of the Bloom filter can thus be explained in the following steps: 1) training the Bloom filter with various strings; 2) defining the hash functions and building the hash tables; 3) testing the incoming strings for the given string with the help of the hash functions; 4) determining the result from the behavior of the filter.
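The training and query phases described above can be illustrated with a minimal Python sketch. It is not the FPGA implementation: the class name BloomFilter, the single shared bit array, the use of salted SHA-256 to emulate k hash functions, the parameter values and the example DNA strings are all assumptions made for illustration.

```python
import hashlib


class BloomFilter:
    """Illustrative Bloom filter: M one-bit entries and k hash functions."""

    def __init__(self, m, k):
        self.m = m
        self.k = k
        self.bits = [0] * m  # all entries are initialized to 0 before training

    def _positions(self, item):
        # Assumption: k hash functions are emulated by salting SHA-256
        # with the hash index; each output falls in the range 0..M-1.
        for i in range(self.k):
            digest = hashlib.sha256(f"{i}:{item}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.m

    def train(self, item):
        # Training phase: set the k entries selected by the hash functions.
        for pos in self._positions(item):
            self.bits[pos] = 1

    def query(self, item):
        # Reported as a member only if all k entries are 1.
        # False positives are possible, false negatives are not.
        return all(self.bits[pos] for pos in self._positions(item))


# Hypothetical dictionary of short DNA strings.
dictionary = ["ACGT", "GATTACA", "TTAGGG"]
bf = BloomFilter(m=1024, k=3)
for s in dictionary:
    bf.train(s)

trained_hits = [bf.query(s) for s in dictionary]
```

Every trained string is reported as a member, which reflects the absence of false negatives. A string that was never trained is usually rejected, but it may be accepted with a small probability that depends on M, k and the number of trained strings.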

Content addressable memory
Manuscript of content-addressable memory (CAM) was received on July 17, 1987 and revised October 5, 1987. A content-addressable memory (CAM) is a high-speed matching unit because it has parallel matching capability (Yoshiki et al., 2002). It speeds up data searching and pattern matching. CAMs are storage devices whose contents are accessed on the basis of a match between a specified key and the stored contents, a process called "content addressing". CAM architectures fall between two extremes: the bit-serial CAM and the fully parallel CAM. In the bit-serial CAM, the matching logic is associated with one bit position and shared among all the bits in a word, in effect matching one bit at a time simultaneously in all the CAM words. In the fully parallel CAM, each word has its own bit-parallel matching logic, allowing all words to be matched in parallel. Here, the device is first trained with a database of 8-bit binary words, and then the input binary parameter is given. The number of 1s in the parameter is extracted by parameter extraction and stored in the parameter memory. The 1s in the input parameter are then compared with the trained database; if they match, the required result is produced, otherwise the next input parameter is taken. The working of the CAM filter can thus be expressed in the following steps (Castelo et al., 2002): 1) the device is trained with the database; 2) the input is given; 3) the number of 1s and 0s is extracted by parameter extraction; 4) it is stored in memory; 5) the 1s in the given input are compared with the database; 6) finally, the required result is produced if there is a match (Figure 3).
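The steps above can be sketched in software as follows. This is only an illustrative model under stated assumptions: the class SimpleCAM and its methods are hypothetical names, the stored words are arbitrary 8-bit examples, and the loop over stored words stands in for the comparison that a hardware CAM performs on all words in parallel.

```python
class SimpleCAM:
    """Illustrative software model of an 8-bit content-addressable memory."""

    def __init__(self):
        self.words = []  # trained database; the list index is the address
        self.ones = []   # parameter memory: number of 1s in each stored word

    def train(self, word):
        if not 0 <= word <= 0xFF:
            raise ValueError("only 8-bit words are supported")
        self.words.append(word)
        self.ones.append(bin(word).count("1"))  # parameter extraction

    def search(self, key):
        # Extract the parameter of the input, then compare it and the word
        # itself against every stored entry. Hardware does this in parallel;
        # here the comparison is emulated sequentially.
        key_ones = bin(key).count("1")
        return [
            addr
            for addr, (word, n) in enumerate(zip(self.words, self.ones))
            if n == key_ones and word == key
        ]


cam = SimpleCAM()
for w in (0b10110010, 0b00001111, 0b10110010, 0b11110000):
    cam.train(w)

matches = cam.search(0b10110010)  # addresses holding the key
misses = cam.search(0b01010101)   # same number of 1s as some entries, no match
```

The search returns the addresses at which the key is stored rather than the data at a given address, which is the defining property of content addressing. Comparing the count of 1s first is a cheap pre-filter; the full word comparison decides the match.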

Aho-Corasick
The Aho-Corasick algorithm (Komodia, 2012), developed by Alfred V. Aho and Margaret J. Corasick, is a dictionary-matching algorithm that searches for elements of a finite set of strings in the input text. Because it locates all patterns in a single pass, the time complexity of the algorithm (Jung et al., 2006) is proportional to the sum of the lengths of the patterns, the length of the input text and the number of matches. The algorithm builds a trie with a suffix-tree-like set of links from each node representing a string to the node corresponding to its longest proper suffix. It also contains links from each node to the longest suffix node that corresponds to a dictionary entry, so all of the matches can be enumerated by following the resulting linked list. The trie is used at runtime to keep track of the longest match, and the suffix links ensure that the computation is proportional to the length of the input. A match is reported for every node visited that is in the dictionary and for every link followed along the dictionary suffix linked list. Since the pattern database is usually known in advance, a program can build the trie, compile it and save it for later use. In this case, the computational complexity at runtime is proportional to the sum of the length of the input and the number of matched entries. Figure 6 shows an example of the data structure built from a few strings. Each row represents a node in the trie, while the column entry gives the unique sequence of characters from the root to that node. At every step, the algorithm tries to extend the current node by finding its child; if that child does not exist, it tries the suffix node's child, and so on recursively until it reaches the root node. Steps taken when scanning "abccab" are shown below.

Simulation on an input text
Since there may be two or more dictionary entries ending at a character location in the input text, more than one dictionary suffix link may need to be followed. The working of Aho-Corasick can be explained as follows.
1. The data pattern to be analyzed is built as a dictionary, 2. The pattern to be found is given as input, 3. The nodes are built based on suffix matching, 4. The input for the next string is taken from the previous node.
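The procedure can be sketched in Python as follows. This is a minimal illustration, not the FPGA design: the function names build_automaton and search are hypothetical, and the dictionary used in the example is an assumption chosen for illustration, since the strings behind Figure 6 are not listed in the text. Only the scanned text "abccab" comes from the description above.

```python
from collections import deque


def build_automaton(patterns):
    goto = [{}]  # goto[node][char] -> child node (the trie)
    fail = [0]   # suffix link: longest proper suffix present in the trie
    out = [[]]   # dictionary entries ending at this node, own and inherited
    for pat in patterns:
        node = 0
        for ch in pat:
            if ch not in goto[node]:
                goto.append({})
                fail.append(0)
                out.append([])
                goto[node][ch] = len(goto) - 1
            node = goto[node][ch]
        out[node].append(pat)

    # Breadth-first pass: depth-1 nodes keep the root as their suffix link.
    queue = deque(goto[0].values())
    while queue:
        node = queue.popleft()
        for ch, child in goto[node].items():
            queue.append(child)
            f = fail[node]
            while f and ch not in goto[f]:
                f = fail[f]
            fail[child] = goto[f].get(ch, 0)
            # Inherit the matches reachable through the suffix link.
            out[child] = out[child] + out[fail[child]]
    return goto, fail, out


def search(text, patterns):
    goto, fail, out = build_automaton(patterns)
    node, matches = 0, []
    for i, ch in enumerate(text):
        # Follow suffix links until a child for ch exists or the root is hit.
        while node and ch not in goto[node]:
            node = fail[node]
        node = goto[node].get(ch, 0)
        for pat in out[node]:
            matches.append((i - len(pat) + 1, pat))  # (start index, pattern)
    return matches


# Assumed example dictionary; "abccab" is the text scanned in the paper.
dictionary = ["a", "ab", "bab", "bc", "bca", "c", "caa"]
matches = search("abccab", dictionary)
```

Each match is reported as a pair of start index and pattern. At index 2 of the text, both "bc" and "c" end at the same character, which is the situation in which more than one dictionary entry must be reported at a single location.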

Conclusion
The filters used not only reduce execution time but, above all, save memory. The flip-flops and latches used are triggered efficiently by accurate clocks. The total number of I/O ports used for this process is very small. CPU processing time is greatly reduced, which enhances the output processing capability. The table below shows the requirements of the optimized filters. The optimized algorithms are implemented on a reconfigurable FPGA platform, the Altium NanoBoard 3000 with a Xilinx Spartan FPGA. The analysis shows that the proposed algorithms perform efficiently. When implemented on a reconfigurable platform, they work in a more optimized way and produce accurate outputs. Because the NanoBoard 3000 is a programmable design environment, it is well suited to this analysis.