This paper proposes a parallelization of the PrefixSpan method for motif discovery. PrefixSpan extracts frequent patterns from a sequence database. The proposed system runs on multiple computers connected through a local area network. The algorithm uses multiple threads to handle communication between a master process and multiple slave processes, and it applies dynamic scheduling to avoid idle time. In addition, we employ a technique called selective sampling. We implemented the algorithm on hardware with 4 GB of memory and an AMD Phenom X4 processor. Our experimental results show that the algorithm achieves good efficiency in motif extraction.
Key words: Motif discovery, parallel mining, wild cards, task scheduling, sequence mining, thread scheduling, parallel tree, DNA sequences.
Copyright © 2023 Author(s) retain the copyright of this article.
This article is published under the terms of the Creative Commons Attribution License 4.0
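As a point of reference for the PrefixSpan method named in the abstract, the following is a minimal sequential sketch in Python. It is not the parallel algorithm of this paper: it omits the master and slave processes, dynamic scheduling, selective sampling, and wild cards. It assumes each sequence is a string of single-character items (for example DNA bases) and that a pattern may match with gaps. The names `prefixspan`, `mine`, and `min_support` are illustrative and are not taken from the paper.

```python
from collections import defaultdict


def prefixspan(sequences, min_support):
    """Return {pattern: support} for every frequent subsequence.

    Assumption: each sequence is a string of single-character items,
    and a pattern may match with gaps between its items.
    """
    results = {}

    def mine(prefix, projected):
        # projected holds one (sequence index, suffix start) pair for each
        # sequence that contains the current prefix.
        first_pos = defaultdict(dict)
        for sid, start in projected:
            seq = sequences[sid]
            for pos in range(start, len(seq)):
                item = seq[pos]
                # Record only the earliest occurrence of the item in the suffix.
                if sid not in first_pos[item]:
                    first_pos[item][sid] = pos
        for item, occurrences in sorted(first_pos.items()):
            support = len(occurrences)  # number of distinct sequences
            if support < min_support:
                continue
            pattern = prefix + item
            results[pattern] = support
            # Project each supporting sequence past the matched item and recurse.
            mine(pattern, [(sid, pos + 1) for sid, pos in occurrences.items()])

    mine("", [(i, 0) for i in range(len(sequences))])
    return results
```

For the three sequences `"ACGT"`, `"ACT"`, and `"AGT"` with `min_support=3`, the call returns `{"A": 3, "AT": 3, "T": 3}`. Each recursive call depends only on its own projected database, which is the property a parallel version can exploit by treating the projections as independent tasks.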