Isolation and identification of microsatellite repeat motifs from the Epinephelus fuscoguttatus genome

Epinephelus fuscoguttatus belongs to Serranidae, one of the largest fish families. Genetic information on existing wild populations is crucial for conservation, particularly since the species is listed on the IUCN Red List because of intense fishing. Microsatellites of E. fuscoguttatus were isolated using a streptavidin-biotin enrichment method. In total, 378 microsatellites were identified and characterized. Of these, 46 (12.2%) were mononucleotides, 175 (46.3%) were dinucleotides, 109 (28.8%) were trinucleotides, 36 (9.5%) were tetranucleotides, 7 (1.9%) were pentanucleotides, 4 (1.1%) were hexanucleotides and 1 (0.3%) was a heptanucleotide. The most abundant microsatellite in E. fuscoguttatus was the dinucleotide motif (AC)n.


INTRODUCTION
Groupers belong to the family Serranidae in the order Perciformes; the family contains about 500 species, of which approximately 159 belong to the Epinephelinae (Heemstra and Randall, 1993). Groupers have an oblong body with small scales, saw-toothed edges, a very large mouth, and coarse, spiny fins (Braise, 1987), and they inhabit stony environments in tropical areas with coral reefs. Groupers vary in color according to their habitat, water depth, and the extent of distressing conditions (Heemstra and Randall, 1993). Among the commercially important Epinephelinae species is Epinephelus fuscoguttatus, which is cultured intensively in Kuwait, Indonesia, Malaysia, Thailand, the Philippines, Hong Kong, Taiwan, The Republic of China, Japan, and Mexico (INFOFISH, 1989; Tookwinas, 1989; Bombeo-Tuburan et al., 2001; Lin and Shiau, 2003). In 2007, world grouper production increased by 12,358 tonnes from the previous year, with 22.9% coming from Taiwan alone (Froese and Pauly, 2010). The International Union for Conservation of Nature and Natural Resources (IUCN) has listed E. fuscoguttatus as 'near threatened' because of extensive collection from the wild, since every stage or size has its own value (Cornish, 2004).
*Corresponding author. E-mail: nataqain@ukm.my. Tel: +603-89214550. Fax: +603-89213398.
Over the years, different types of genetic markers have been introduced for studying the genetic structure of species or populations, and microsatellites are among them. Microsatellites are powerful markers in population genetics primarily because they evolve rapidly, are found throughout the nuclear genome, generally have several alleles per locus, and are typically inherited in a codominant fashion (Liu and Cordes, 2004). In the fisheries and aquaculture industry, microsatellites have been used for the characterization of genetic stocks, broodstock selection, the construction of dense linkage maps, the mapping of economically important quantitative traits, the identification of genes responsible for these traits, and marker-assisted breeding programs (Chistiakov et al., 2006). The major disadvantage of microsatellites is the high cost and extensive time required to develop the markers, because microsatellites must be developed for individual species, or at least for each group of closely related species (Parker et al., 1998). Although the effort of marker development may be a deterrent, many researchers have recognized the benefits of microsatellites. Numerous reports detail the isolation and characterization of markers from various fish species, such as Epinephelus quernus (Rivera et al., 2003), Plectropomus maculatus (Zhu et al., 2005), Morone saxatilis (Rexroad et al., 2006), Prochilodus lineatus (Yazbeck and Kalapothakis, 2007), Leporinus macrocephalus (Morelli et al., 2007), Varicorhinus alticorpus (Chiang et al., 2008), Johnius belengerii (Xu et al., 2009), Trachinotus blochii (Gong et al., 2008), and Epinephelus awoara (Zhao et al., 2009).
This study reports on the isolation of a large number of microsatellite markers from E. fuscoguttatus using an enrichment method, followed by the identification and characterization of the markers according to type and repeat unit.

MATERIALS AND METHODS
Sample collection
A fin sample was obtained from the Fisheries Research Institute (FRI), Besut, Terengganu, Malaysia, and the fish was identified as E. fuscoguttatus based on the FAO catalogue by Heemstra and Randall (1993). The fin was preserved in 100% ethanol during transport to the laboratory.

Library development
Development of the genomic DNA library followed the enrichment method of Glenn and Schable (2005), with some modifications. Genomic DNA was extracted using the MasterPure™ Complete DNA and RNA Purification Kit (Epicentre, USA) according to the manufacturer's instructions. Electrophoresis was done on 1% agarose at 80 V for 40 min. The concentration and purity of the extracted genomic DNA were analyzed with a NanoDrop™ ND-1000 Spectrophotometer (NanoDrop Technologies, USA). Approximately 10 µg of genomic DNA was digested using RsaI. Oligonucleotide linkers designated SuperSNX24 (5'-GTTTAAGGCCTAGCTAGCAGAATC-3') and SuperSNX24+4p, which carries a 5' phosphate modification (5'-pGATTCTGCTAGCTAGGCCTTAAACAAAA-3'), were ligated to the digested DNA using the Fast-Link™ DNA Ligation Kit (Epicentre, USA). PCR amplification was done using SuperSNX24 as the primer under the following conditions: initial denaturation at 95°C for 2 min; 30 cycles of denaturation at 95°C for 20 s, annealing at 60°C for 20 s and elongation at 72°C for 1 min 30 s; and a final extension at 72°C for 5 min. PCR products were visualized by separation on 1% agarose at 80 V for 40 min. Microsatellite enrichment was done using 5'-biotinylated microsatellite oligonucleotide probes in three mixtures, according to the protocol of Glenn and Schable (2005): (i) (AG)12, (TG)12, (AAC)6, (AAG)8, (AAT)12, (ACT)12, (ATC)8; (ii) (AAAC)6, (AAAG)6, (AATC)6, (AATG)6, (ACAG)6, (ACCT)6, (ACTC)6, (ACTG)6; and (iii) (AAAT)8, (AACT)8, (AAGT)8, (ACAT)8, (AGAT)8. A PCR thermal cycler program was used to anneal the probes to the digested genomic DNA. The program consisted of an initial denaturation at 95°C for 5 min, followed by a ramp to 70°C and step-downs of 0.2°C every 5 s for 99 cycles. The temperature was then held at 50°C for 10 min, followed by step-downs of 0.5°C every 5 s for 20 cycles and a final hold at 4°C.
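The probe-annealing program above can be checked with a short calculation. The following Python sketch is illustrative only: it assumes that each of the 99 (or 20) cycles lowers the block temperature by exactly one step, and it ignores ramp times and the final 4°C hold. The function name `hybridization_profile` is hypothetical.

```python
def hybridization_profile():
    """Return (label, start_temp_C, end_temp_C, seconds) for each stage.

    Assumption: every cycle lowers the temperature by exactly one step;
    ramping between stages and the final 4 C hold are not modelled.
    """
    stages = [("initial denaturation", 95.0, 95.0, 5 * 60)]

    # Step down 0.2 C every 5 s for 99 cycles, starting from 70 C.
    slow_end = round(70.0 - 0.2 * 99, 1)
    stages.append(("slow step-down", 70.0, slow_end, 99 * 5))

    # Hold at 50 C for 10 min.
    stages.append(("hold", 50.0, 50.0, 10 * 60))

    # Step down 0.5 C every 5 s for 20 cycles, starting from 50 C.
    fast_end = round(50.0 - 0.5 * 20, 1)
    stages.append(("fast step-down", 50.0, fast_end, 20 * 5))
    return stages


if __name__ == "__main__":
    for label, start, end, seconds in hybridization_profile():
        print(f"{label:22s} {start:5.1f} -> {end:5.1f} C  {seconds:4d} s")
```

Under these assumptions the slow step-down ends just above the 50°C hold temperature, which is consistent with the hold following it directly.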
Isolation of microsatellite DNA fragments was done using Dynabeads M-280 Streptavidin (Invitrogen, USA), following the manufacturer's instructions. The repeat-enriched DNA was eluted from the beads and ethanol precipitated (Sambrook et al., 1989). The product purified by ethanol precipitation was amplified by PCR again, using only SuperSNX24 as the primer. Amplicons were purified using the Wizard® Gel and PCR Clean Up System (Promega, USA) according to the manufacturer's instructions. PCR products were visualized on 1% agarose gels. The enriched, fragmented DNA was cloned into the pGEM®-T Easy Vector (Promega, USA), transformed into JM109 competent cells (Promega, USA), and spread onto LB plates containing ampicillin (100 µg/ml), 5-bromo-4-chloro-3-indolyl-β-D-galactopyranoside (X-Gal) (80 µg/ml), and isopropyl β-D-1-thiogalactopyranoside (IPTG) (0.5 mM). White colonies (331), representing possible recombinant plasmids in transformed cells, were picked, and 36 random colonies were amplified by PCR using the M13 universal primer. Colonies were grown overnight at 37°C before plasmid extraction. Plasmids were extracted using the Wizard® Plus SV Miniprep DNA Purification System Kit (Promega, USA) and were sequenced in both directions using M13F and M13R on an ABI PRISM® 3100 DNA Sequencer.

Sequence analysis
Sequences were analyzed using Biology Workbench 3.2 and Microsatellite Repeat Finder (http://biophp.org/minitools/microsatellite_repeats_finder.php). Four sequence analyses were carried out: (i) ability of the forward and reverse sequences to be aligned; (ii) presence of the SuperSNX24 linker at the 5' and 3' termini; (iii) identification of the type of microsatellite as mono-, di-, tri-, tetra-, penta- or hexanucleotide; and (iv) identification of each isolated repeat as a perfect, imperfect or compound repeat. Duplicate sequences were eliminated before microsatellite identification. Sequences were compared against the GenBank database using BLASTn to find similarities. Novel sequences were subsequently deposited in the GenBank database (accession no. GU799128-GU799322, HM149347).
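To illustrate step (iii), the following Python sketch finds perfect tandem repeats with regular expressions and classifies them by unit length. It is not the algorithm of the web tool used in this study. The minimum repeat counts in `MIN_REPEATS` are hypothetical values chosen for the example, since no thresholds are stated here, and imperfect and compound repeats are not handled.

```python
import re

CLASS_NAMES = {1: "mono", 2: "di", 3: "tri", 4: "tetra", 5: "penta", 6: "hexa"}

# Hypothetical minimum number of repeat units per unit length (assumption).
MIN_REPEATS = {1: 10, 2: 6, 3: 4, 4: 3, 5: 3, 6: 3}


def is_primitive(motif):
    """True if the motif is not itself a repetition of a shorter unit."""
    n = len(motif)
    return not any(
        n % d == 0 and motif == motif[:d] * (n // d) for d in range(1, n)
    )


def find_perfect_repeats(seq):
    """Return sorted (start, class, motif, repeat_count) for perfect repeats."""
    seq = seq.upper()
    hits = []
    for unit_len, min_rep in MIN_REPEATS.items():
        # One unit followed by at least (min_rep - 1) further copies of it.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (unit_len, min_rep - 1))
        for match in pattern.finditer(seq):
            motif = match.group(1)
            if not is_primitive(motif):
                continue  # e.g. ACAC is reported once, as the dinucleotide AC
            repeats = len(match.group(0)) // unit_len
            hits.append((match.start(), CLASS_NAMES[unit_len], motif, repeats))
    return sorted(hits)


if __name__ == "__main__":
    example = "TTG" + "AC" * 8 + "GGT" + "CTG" * 5 + "A"
    for hit in find_perfect_repeats(example):
        print(hit)
```

The primitive-motif filter keeps a run such as (AC)8 from also being counted as a tetranucleotide (ACAC)4 repeat.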

RESULTS AND DISCUSSION
Library development
The enrichment method was chosen for its ability to isolate different types of microsatellite repeats, in contrast to the degenerate primer approach. Although the use of degenerate primers is a fast and robust technique, it is limited to specific repeated sequences (Fisher et al., 1996). Different biotinylated oligo probes were used to target the various repeated sequences present in the E. fuscoguttatus genome. Glenn and Schable (2005) suggested three mixtures of oligo repeats that would cover different microsatellite repeats present in the genome. The incorporation of the 'GTTT' sequence at the 5'-end of the SuperSNX linker has made cloning simpler and has eliminated a purification step by reducing small PCR products (Glenn and Schable, 2005).
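The relationship between the two linker oligonucleotides can be verified from the sequences given in the methods. The short Python sketch below checks that SuperSNX24 starts with GTTT and that SuperSNX24+4p is its reverse complement with four extra bases at the 3' end. The helper names are hypothetical and the 5' phosphate is not represented.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

# Linker sequences as given in the methods (5' to 3').
SUPER_SNX24 = "GTTTAAGGCCTAGCTAGCAGAATC"
SUPER_SNX24_4P = "GATTCTGCTAGCTAGGCCTTAAACAAAA"  # 5' phosphate not modelled


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def linker_overhang(top, bottom):
    """Return the 3' overhang of `bottom` if it begins with the reverse
    complement of `top`; return None if the strands do not pair that way."""
    rc = reverse_complement(top)
    if bottom.startswith(rc):
        return bottom[len(rc):]
    return None


if __name__ == "__main__":
    print("5' GTTT present:", SUPER_SNX24.startswith("GTTT"))
    print("3' overhang:", linker_overhang(SUPER_SNX24, SUPER_SNX24_4P))
```

With the sequences above, the two strands pair over all 24 bases of SuperSNX24 and leave an AAAA overhang on SuperSNX24+4p.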
Approximately 500 colonies grew on the LB plates. Overnight incubation resulted in 331 white colonies. PCR of 13 randomly selected colonies showed a differently sized PCR product for each colony, indicating that fragments of different sizes had been cloned. These size differences result from random cuts by the restriction enzyme RsaI.
Extracted plasmids were sequenced in both directions using the M13 universal primers. Of the 331 sequence pairs, 45 were not complementary and did not carry either the SuperSNX24 or the SuperSNX24+4p linker. This often results from poor sequencing quality, which is likely caused by low plasmid purity after ethanol precipitation. In addition, size fractionation after digestion has been suggested as a way to reduce the number of sequences with low read length. Alignments also showed that 91 sequences were duplicated. Sequences that could not be aligned, as well as duplicated sequences, were eliminated before microsatellite identification.
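The filtering described above can be sketched in Python. The example assumes that the 45 unalignable pairs and the 91 duplicates do not overlap, and it treats a sequence and its reverse complement as the same clone. Duplicates in this study were found by alignment, so exact matching is a simplification, and the function names are hypothetical.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def canonical(seq):
    """Strand-independent key: the smaller of a sequence and its reverse complement."""
    seq = seq.upper()
    rc = seq.translate(COMPLEMENT)[::-1]
    return min(seq, rc)


def deduplicate(seqs):
    """Keep the first occurrence of each sequence, ignoring strand."""
    seen = set()
    unique = []
    for seq in seqs:
        key = canonical(seq)
        if key not in seen:
            seen.add(key)
            unique.append(seq)
    return unique


def usable_sequences(total_pairs, unalignable, duplicates):
    """Count sequences left after both filters (assumes disjoint sets)."""
    return total_pairs - unalignable - duplicates


if __name__ == "__main__":
    print(deduplicate(["AACG", "CGTT", "AACG", "GGGA"]))
    print(usable_sequences(331, 45, 91))
```

With the counts reported above, 331 sequence pairs minus 45 unalignable pairs and 91 duplicates leaves 195 sequences.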

Microsatellite identification
Microsatellite Repeat Finder is a useful tool for microsatellite sequence searches. It determines whether a repeat is perfect or imperfect and identifies the type of microsatellite sequence. After the elimination of unusable sequences, 195 sequences were considered valid because their forward and reverse sequences aligned and they carried the linker sequence. These 195 sequences contained 378 microsatellites. Of these 378, 46 (12.2%) were mono-, 175 (46.3%) were di-, 109 (28.8%) were tri-, 36 (9.5%) were tetra-, 7 (1.9%) were penta-, 4 (1.1%) were hexa- and 1 (0.3%) was a heptanucleotide repeat. Table 1 shows the microsatellite distribution in the E. fuscoguttatus genome. The identified microsatellites can also be categorized by repeat type: 276 sequences (75.2%) were perfect, 44 (11.9%) were imperfect, and 49 sequences (12.9%) were compound repeats. By repeat motif, the most common perfect mononucleotide microsatellite was (T)n/(A)n, at 7.2%, and the most common perfect dinucleotide was (AC)n/(TG)n, at 20.6% of the total perfect dinucleotide microsatellites, followed by (CA)n/(GT)n at 12.3% and (GA)n/(CT)n at 3.9%. For trinucleotide repeat motifs, two motifs, (CTC)n/(GAG)n and (CTG)n/(GAC)n, were found at 3.9%; for tetranucleotide repeat motifs, (GGGA)n/(CCCT)n was found at 1.4% and both (CAAA)n/(GTTT)n and (AAAG)n/(TTTC)n at 0.7%. The longer the repeat motif, the less frequently it occurred in the genome of E. fuscoguttatus. Of the 378 microsatellites identified, only 162 were suitable for primer design in future studies.
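The class percentages reported above follow directly from the counts. A minimal Python sketch of the calculation, using only the counts given in the text (the function name is hypothetical):

```python
# Microsatellite counts per repeat class, as reported in the text.
CLASS_COUNTS = {
    "mono": 46,
    "di": 175,
    "tri": 109,
    "tetra": 36,
    "penta": 7,
    "hexa": 4,
    "hepta": 1,
}


def class_percentages(counts):
    """Return each class as a percentage of the total, to one decimal place."""
    total = sum(counts.values())
    return {name: round(100 * n / total, 1) for name, n in counts.items()}


if __name__ == "__main__":
    print("total:", sum(CLASS_COUNTS.values()))
    for name, pct in class_percentages(CLASS_COUNTS).items():
        print(f"{name:6s} {pct:5.1f}%")
```

The counts sum to 378, and the rounded percentages match those given for each repeat class.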
Based on microsatellite data accessed through the GenBank database (until 6 April 2011), three other research groups have also focused on E. fuscoguttatus. The first group (unpublished data; accession no. GQ912319-GQ912328) isolated a higher proportion of trinucleotide repeats (81.8%) than we did because they used a 5'-anchored degenerate primer. A second group (Lo and Yue, 2008; accession no. EU016533-EU016545) isolated a higher proportion of dinucleotide repeats (55.2%) than we did because they employed an enrichment method. The third group (Koedprang et al., 2007; accession no. AY736039) deposited only one sequence, containing a dinucleotide repeat. A total of 63 microsatellites were deposited in GenBank; of these, 2