Combination of genetic tools to discern Bacillus species isolated from hot springs in South Africa

Forty-three Gram-positive, spore-forming bacteria of the phylum Firmicutes were isolated and cultured from five hot water springs in South Africa and identified using phylogenetic analysis of the 16S rRNA gene. Thirty-nine isolates belonged to the family Bacillaceae, genus Bacillus (n = 31) and genus Anoxybacillus (n = 8), while four isolates belonged to the family Paenibacillaceae, genus Brevibacillus. The majority of isolates fell into the Bacillus Bergey’s Group A together with Bacillus subtilis and Bacillus licheniformis. One isolate matched Bacillus panaciterrae, which has not previously been described as a hot-spring isolate. Three unknown isolates from this study (BLAST <95% match) and three “uncultured Bacillus” clones from hot springs in India, China and Indonesia listed in NCBI Genbank were included in the analysis. When the bioinformatic tools Basic Local Alignment Search Tool (BLAST), in silico amplified rDNA restriction analysis (ARDRA), guanine-cytosine (GC) percentage and phylogenetic analysis were used in combination, but not independently, differentiation between the complex Bacillus and closely related species was possible. Identification that relies solely on BLAST of the 16S rRNA sequence can be misleading.


INTRODUCTION
Microbes from extreme environments are of interest because they often have unique properties relevant to biotechnology, including extremozymes and potential for new drug discovery (Gerday, 2002; Jardine et al., 2018). López-López et al. (2013) suggested that the diversity of hot spring environments is not fully appreciated, with an estimated <1% of bacteria in hot springs being isolated and identified using traditional culture-based methods.
The method most commonly used for bacterial identification is a comparison of the 16S rRNA (ribosomal RNA) gene sequence with known public databases, although this offers no information on physiology and biochemistry. This gene is selected because it has changed little over time and is highly conserved across different bacteria. It is, however, large enough (1500 bp) to allow for extraction of bioinformatic information (Janda and Abbott, 2007). Although metagenomics reveals the variety of genetic diversity within a microbial population, it does not take into account viability and circumstantial contamination. Therefore, any novel bacteria isolated from these unique environmental sites are essential contributions to the current database and general understanding of microbial communities. Furthermore, cultured viable bacteria are critical in understanding the biochemical potential and production of bioactive molecules related to gene expression (Handelsman, 2004).
*Corresponding author. E-mail: euniceubombajaswa@yahoo.com.
Author(s) agree that this article remain permanently open access under the terms of the Creative Commons Attribution License 4.0 International License.
There are additional tools to differentiate bacterial genera and species. The guanine-cytosine percentage (GC%) of the DNA of bacterial genomes varies between genera and is useful in bacterial systematics. The GC% has also been correlated with the thermostability of a genome and is higher in thermophiles (Wang et al., 2006). Amplified rDNA restriction analysis (ARDRA) allows for a more accurate, rapid and efficient identification compared with the more traditional microbiological and biochemical methods (Rajendhran and Gunasekaran, 2011). A computer-simulated restriction fragment length polymorphism (RFLP) analysis of the polymerase chain reaction (PCR)-amplified 16S rDNA, which is equivalent to ARDRA, is a valid means of identifying unknown organisms (Moyer et al., 1996). The phylogenetic analysis of the 16S rRNA gene allows for maximum discrimination between closely related individual isolates, taking into account each base of the entire gene, while ARDRA represents only variations in the restriction enzyme sites.
The classification of the genus Bacillus was transformed by major changes in which several new genera were proposed (Ludwig et al., 2009). In addition, this is a highly diverse and expanding group, with 25 new genera being described in the past two years (Mandic-Mulec et al., 2015). The relatively new genus Anoxybacillus was established in 2000 and is growing rapidly, with six new species being described since 2011 (Mandic-Mulec et al., 2015). Of the 115 endospore-forming Bacillus isolates from geothermal regions in Turkey, Anoxybacillus was the most abundant, being represented by 53 isolates (Cihan, 2013), suggesting that geothermal environments could be a niche for the discovery of new Anoxybacillus species. Because the genera Bacillus and Anoxybacillus have been reclassified and novel species are being described at a rapid rate, there may be some incongruence and confusion when comparing the nomenclature of this group from studies prior to the reclassification, and between different studies where the new nomenclature is not taken into account.
In South Africa, more than a third of the 80 hot springs are located in the Limpopo Province. Metagenomic studies of four hot springs revealed only a very low abundance of the phylum Firmicutes, which includes Bacillus and Bacillus-related species (Tekere et al., 2011, 2012). Various other phyla were reported but in very small percentages of the total rRNA sequences (<0.2%). The discovery that these hot springs hold a great diversity of bacteria suggests that they may be a resource for potential thermophiles that could have novel biotechnological applications. The aims of this study were to use conventional culture techniques for the isolation of bacteria and to use 16S rRNA gene sequence analysis for genotypic identification of the isolates. In addition, the sequences were analysed for GC%, ARDRA, and phylogenetic analysis. The use of a combination of tools for identification was investigated.

MATERIALS AND METHODS

Sampling and sampling sites
Water and sediment samples were collected from five hot springs (Tshipise, Siloam, Mphephu, Lekkerrus, and Libertas) in the Limpopo Province, South Africa. Their geographical location with GPS coordinates, average water temperature and pH conditions and local site description have been previously described (Jardine et al., 2017).

Isolation of bacteria and determination of optimal growth conditions
Aliquots of 100 mL of water were passed through a 0.22 µm membrane filter, and the membrane filters were then placed on the surface of different agar media (Himedia, India): nutrient agar, Actinomycete isolation agar, minimal Luria broth media, potato dextrose agar and cyanobacterial agar, and incubated for 48 h at 37 and 53°C exactly as described by Jardine et al. (2017). Bacterial isolates from sediment samples were obtained using the streak plate method. Once pure cultures of the isolates were obtained, they were studied for optimal conditions of growth relating to temperature, pH and salinity in order to maintain them in the laboratory.
The optimum pH, temperature and salinity for growth of the bacteria initially isolated at 53°C were determined by growth in nutrient broth from pH 6 to 10 in intervals of one unit, at temperatures between 45 and 70°C in intervals of 5°C, and at sodium chloride (NaCl) concentrations ranging from 0 to 15% w/v, respectively. A bacterial suspension at an optical density at 600 nm (OD600) of approximately 0.3 was made, and 1 mL volumes of nutrient broth were inoculated with 10 µL of the bacterial suspension and incubated under the various conditions. The OD at 600 nm was measured using a spectrophotometer (Phillips PU8620 UV/VIS/NIR) to determine whether growth had occurred.

DNA extraction protocol, 16S rRNA gene sequencing and phylogeny
DNA extraction and 16S rRNA gene sequencing were performed as previously described by Jardine et al. (2017), without any modifications. DNA was extracted by the method described by Dashti et al. (2009), and the 16S rRNA gene was subjected to PCR with universal primers 8F, 27F and 1472R (Galkiewicz and Kellogg, 2008) with the cycling conditions as described by Jardine et al. (2017). The amplicon was Sanger sequenced with the Big Dye Terminator 3.1 cycle sequencing kit (ABI) according to the manufacturer's instructions, at the African Centre for DNA Barcoding (ACDB), University of Johannesburg. The resulting consensus sequence of approximately 1400 bp was compared with those in the NCBI database (Genbank) using the Basic Local Alignment Search Tool (BLAST) (McGinnis and Madden, 2004), and EzTaxon-e (Kim et al., 2012). Isolates with a >99% match to the published sequences were identified to the species level, and those with a >97% match were identified to the genus level (Yarza et al., 2014). Alignments were made by CLUSTAL OMEGA (www.ebi.ac.uk), and manually refined using SeaView (Gouy et al., 2010). Statistical confidence in branching points was determined by 1000 bootstrap replicates. Complete and partial sequences from this study were submitted to Genbank. The Genbank accession numbers of the type strains used in the phylogenetic trees are listed in Appendix A.
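The similarity cut-offs stated above can be expressed as a small decision rule. The following is a minimal sketch only: the function name and the label "unresolved" are illustrative assumptions, and the thresholds are taken directly from the text (>99% for species, >97% for genus).

```python
def identification_level(percent_identity: float) -> str:
    """Map a 16S rRNA percentage identity to the taxonomic level it supports.

    Thresholds follow the text: >99% species, >97% genus.
    The label "unresolved" is a hypothetical name for everything else.
    """
    if percent_identity > 99.0:
        return "species"
    if percent_identity > 97.0:
        return "genus"
    return "unresolved"


# Example: the 99.85% match reported for isolate 24M exceeds the species cut-off.
print(identification_level(99.85))
```

Both comparisons are strict, so a value of exactly 97% is not treated as a genus-level match.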

Computer-simulated PCR-RFLP or amplified rDNA restriction analysis (ARDRA)
Computer-simulated PCR-RFLP patterns were generated from the approximately 1400 bp sequence of the 16S rRNA gene using the computer program RestrictionMapper version 3 (www.Restrictionmapper.org) and the restriction enzymes Alu1 (15 sites), Taq1 (18 sites) (Wu et al., 2006), HaeIII (24 sites), Hinf1 (21 sites), Rsa1 (18 sites) (Wahyudi et al., 2010), Hph1 (23 sites), MboII (15 sites) and Fok1 (14 sites). The presence or absence of each simulated band was used to create a binary data file, and the results were presented together as a composite. Several bacterial strains from published data were included in the study to determine the phylogenetic groups into which the isolates fell. The SeaView program was used to analyze the binary data, and a distance neighbour-joining tree was created for detection of clusters.
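To illustrate how a simulated digest can be turned into binary band data, the sketch below cuts a sequence at the recognition sites of five of the enzymes and records which fragment lengths occur. It is not the RestrictionMapper algorithm. The enzyme names are written in the standard Roman-numeral form (AluI for the Alu1 of the text), only the enzymes with simple palindromic sites are included, and treating every distinct fragment length as one band is an assumption made for this example.

```python
import re

# Recognition sequence and cut offset (bases from the start of the site).
# Only enzymes with palindromic sites are included in this sketch.
ENZYMES = {
    "AluI": ("AGCT", 2),
    "HaeIII": ("GGCC", 2),
    "RsaI": ("GTAC", 2),
    "TaqI": ("TCGA", 1),
    "HinfI": ("GANTC", 1),  # N matches any base
}


def fragment_lengths(seq, enzyme):
    """Return the sorted fragment lengths from a simulated single-enzyme digest."""
    site, offset = ENZYMES[enzyme]
    seq = seq.upper()
    # A lookahead pattern reports every site position, including overlapping ones.
    pattern = re.compile("(?=(" + site.replace("N", "[ACGT]") + "))")
    cuts = [m.start() + offset for m in pattern.finditer(seq)]
    bounds = [0] + cuts + [len(seq)]
    return sorted(b - a for a, b in zip(bounds, bounds[1:]) if b > a)


def band_matrix(seqs, enzymes):
    """Build a presence/absence (1/0) row per sequence over all observed bands.

    A band is identified by the pair (enzyme, fragment length); this is an
    assumption of the sketch, not a statement of how the published data were coded.
    """
    profiles = {
        name: {(e, n) for e in enzymes for n in fragment_lengths(s, e)}
        for name, s in seqs.items()
    }
    bands = sorted(set().union(*profiles.values()))
    rows = {name: [int(b in p) for b in bands] for name, p in profiles.items()}
    return bands, rows
```

In practice the rows would be computed over the full set of enzymes and the roughly 1400 bp consensus sequences, and the resulting matrix passed to a tree-building program.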

Guanine-cytosine (GC) content (in percentage)
The GC% for the Firmicutes group was calculated from the 1400 bp 16S rRNA gene fragment (Yamane et al., 2011) using the ENDMEMO GC calculating tool (www.endmemo.com/bio/gcratio) for all the isolates.
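The calculation itself is simple and can be sketched as follows. This is an illustrative stand-in for the online tool, not its implementation; ignoring ambiguous bases such as N in the denominator is an assumption of this sketch.

```python
def gc_percent(seq: str) -> float:
    """Return the percentage of G and C among the unambiguous bases of seq."""
    seq = seq.upper()
    counted = sum(seq.count(base) for base in "ACGT")
    if counted == 0:
        raise ValueError("sequence contains no A, C, G or T bases")
    return 100.0 * (seq.count("G") + seq.count("C")) / counted


# Example with a short made-up sequence (not data from this study).
print(gc_percent("GGCCAATT"))
```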

RESULTS

Bacterial strains in this study
The 16S rRNA gene sequences of hot spring isolates from South Africa were allocated accession numbers and deposited in Genbank as indicated in Table 1.

Optimal growth conditions for bacterial isolation and growth
The optimal growth conditions for the isolates were determined as the bacteria needed to be grown as inoculum for further experiments and investigations. The average optimal pH was 7, the average optimal temperature was 55°C, and the average optimal salinity was 5%. However, 19% were also able to grow in 10% salinity. These results are available in Jardine (2017).

16S rRNA gene sequencing
The contiguous sequences were compared to two databases, namely Genbank and EzTaxon-e and the highest percentage similarities and accession numbers are listed in Table 1. Values >97% suggest a match to the genus level, while a value of >99% suggests a match to species level (Yarza et al., 2014). Where no PCR product was obtained, the sequencing was not determined (nd), and in some cases, sequencing was incomplete which did not allow for a full consensus sequence to be constructed. Sequences from this study were submitted to Genbank with their relevant accession numbers as listed earlier.

Percentage guanine-cytosine (GC) content
The GC% and accession numbers for 31 Bacillus spp., eight Anoxybacillus spp., five Brevibacillus spp., one Aneurinibacillus spp., and reference strains (from Genbank) are listed in Appendix B for the approximately 1400 bp 16S rRNA gene fragment. Based on the GC% of the 16S rRNA sequences, the isolates in this study were grouped into four genera (Anoxybacillus, Bacillus, Aneurinibacillus, and Brevibacillus). The averages and standard deviations of the GC% for these isolates together with reference strains were calculated (Appendix B) and plotted in Figure 1. The isolates that fell outside one standard deviation of the average GC% were earmarked as potentially different from the group, that is, isolates 1T, 11T, 14S and 33Li (indicated by * in Appendix B). In all other respects, there was a general match between the GC% and the groupings into the four genera. Figure 1 shows the GC% of the family Bacillaceae, including the genera Anoxybacillus and Bacillus, the family Paenibacillaceae, including the genera Aneurinibacillus and Brevibacillus, and the unclassified Bacillales genus Solibacillus. The standard deviation ranges of the two genera within each family did not overlap, and the genera were therefore considered different. Therefore, Aneurinibacillus could be distinguished from Brevibacillus, and similarly, Anoxybacillus could be distinguished from Bacillus based on GC%.
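The one-standard-deviation criterion described above can be sketched as follows. The function name is hypothetical, and the use of the sample standard deviation (n - 1 denominator) is an assumption, since the text does not state which form was used. The values in the usage example are invented for illustration and are not data from Appendix B.

```python
from statistics import mean, stdev


def flag_outliers(gc_by_isolate, n_sd=1.0):
    """Return the isolates whose GC% lies more than n_sd standard deviations from the mean.

    Assumes at least two values; uses the sample standard deviation.
    """
    values = list(gc_by_isolate.values())
    mu, sd = mean(values), stdev(values)
    return sorted(name for name, v in gc_by_isolate.items() if abs(v - mu) > n_sd * sd)


# Hypothetical GC% values for four isolates of one genus.
example = {"a": 55.0, "b": 55.2, "c": 54.8, "d": 58.0}
print(flag_outliers(example))
```

Because the mean and standard deviation are computed per group, the function would be applied separately to each genus together with its reference strains.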

Computer-simulated amplified (16S rRNA) ribosomal RNA restriction analysis or ARDRA
In this investigation, the ARDRA patterns of the eight restriction enzymes were compiled, resulting in a cumulative 148 sites, which were aligned and analyzed using SeaView (Gouy et al., 2010) and presented as a distance neighbour-joining dendrogram (Figure 2). The three main groups (A, B, C) are listed in Table 1. Group A included the Bacillus reference type strains determined by Bergey's classification (Ludwig et al., 2009) as well as the closely knit group of the Anoxybacillus reference strain with Anoxybacillus spp. from this study. Three uncultured unknown Bacillus spp. previously reported from other hot spring studies (uncultured Bacillus clones TPB_GMAT_AC4, DGG30 and KSB12) fell into group A together with isolate 24M. Bacillus spp. from this study fell into both groups B and C; however, the Brevibacillus spp. and the single Aneurinibacillus spp. all fell into group C.
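A distance-based dendrogram requires a pairwise distance between the binary restriction profiles. The sketch below uses the Jaccard distance as one plausible choice; this is an assumption, as the text does not state which distance measure SeaView applied, and the function names are illustrative.

```python
def jaccard_distance(a, b):
    """Jaccard distance between two equal-length 0/1 profiles."""
    both = sum(1 for x, y in zip(a, b) if x and y)
    either = sum(1 for x, y in zip(a, b) if x or y)
    return 0.0 if either == 0 else 1.0 - both / either


def distance_matrix(profiles):
    """Return sorted isolate names and the square matrix of pairwise distances."""
    names = sorted(profiles)
    matrix = [[jaccard_distance(profiles[p], profiles[q]) for q in names] for p in names]
    return names, matrix


# Hypothetical three-band profiles for two isolates.
names, matrix = distance_matrix({"x": [1, 1, 0], "y": [1, 0, 1]})
```

The resulting matrix is the kind of input a neighbour-joining routine takes; the tree construction itself is left to dedicated software.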

Phylogenetic analysis
Phylogenetic analysis of the 16S rRNA gene is commonly used for bacterial identification with greater accuracy than only a BLAST search as it defines the relationship between individual bacteria at every base. It is also more accurate than ARDRA which only provides information at the site of the restriction enzyme activity. A comparison of the three molecular tools (16S rRNA BLAST search, GC% and ARDRA) and phylogenetic tree analysis shows that there was, in general, good correlation with the grouping of the phylum Firmicutes into family Bacillaceae with genera Bacillus (n = 31) and Anoxybacillus (n = 8), and family Paenibacillaceae genera Brevibacillus (n = 3) and Aneurinibacillus (n = 1). A neighbour-joining phylogenetic tree of a 914 bp fragment of the 16S rRNA gene sequences between isolates from this study and representative members of type strains of Anoxybacillus, Bacillus, Brevibacillus, and Aneurinibacillus is presented in Appendix C supporting the information presented in Table 1.
In order to further discern whether the Bacillus spp. in the study fell into specific Bergey's groupings (Ludwig et al., 2009), a maximum likelihood phylogenetic tree (% bootstrap values based on 1000 replicates) was drawn with additional reference strains as shown in Figure 3. Isolates 18S and 15S grouped with the reference strains (non-Bergey's Bacillus A group) and this is consistent with their low match by BLAST of 96

Analysis of unknown isolates
The GC% of four isolates with a <97% BLAST match to published 16S rRNA sequences differed significantly from that of their group, that is, isolate 11T within the Anoxybacillus group, and isolates 1T, 14S, and 33Li within the Bacillus group (Table 1). ARDRA confirmed the difference for 11T, which was similar to Anoxybacillus spp. by BLAST but fell in ARDRA group C with Bacillus/Brevibacillus spp.; no further information could be obtained for the other three isolates. Therefore, isolates that were not definitively identified by BLAST (<97%) could in some cases be further differentiated by ARDRA. Three isolates (15S, 52M and 73T) were even more poorly matched (<95%) by BLAST.

DISCUSSION

Optimal growth conditions for bacterial isolation and growth
In order to maintain the stock cultures, it was necessary to determine the optimal temperature, pH and salinity conditions for growth. The results showed that most of the bacteria preferred a neutral pH of 7, an incubation temperature ranging between 50 and 55°C, and a salinity of 5% NaCl (w/v). The range of temperature was selected because the average temperature of the five hot springs was 52°C. Obeidat et al. (2012) tested eight Geobacillus species from hot springs in Jordan with temperatures ranging from 48 to 62°C, and pH between 6 and 7, and found the optimal temperatures to be between 60 and 65°C, and pH 6 to 8. Zhang et al. (2011) reported on two isolates of Anoxybacillus with an optimal growth temperature of 55°C and pH of 8. It therefore appears that these spore-forming Bacillus and Bacillus-related organisms are robust, with a tolerance for a wide range of environmental conditions. Extremophiles isolated in this study include the alkaliphilic thermophile Anoxybacillus flavithermus with an optimal pH of 10 and a temperature of 50°C (isolate 17S), thermophilic Anoxybacillus rupiensis with an optimal temperature of 60°C (isolate 13S), and halotolerant thermophiles of B. licheniformis that could grow in 10% (w/v) NaCl (isolates 2T, 6T and 8T).

16S rRNA gene sequencing
Molecular techniques based on genetic sequencing have far surpassed the traditional culture methods in predicting biochemical and phenotypic information of a single bacterium or a population. Handelsman (2004) elaborated that culture methods are dependent on environmental and external factors, and can be time-consuming, laborious and subject to error. Phenotypic characteristics related to colony morphology, biochemical reactions, serology, pathogenicity and antibiotic resistance can vary considerably, unlike DNA, which remains relatively unchanged. The 16S rRNA gene sequence has been used as the gold standard for identification of microorganisms because it is relatively conserved in all microorganisms, with enough similarity to allow for PCR with universal primers but enough variability to permit differentiation between species (Rajendhran and Gunasekaran, 2011; Yıldırım et al., 2011). Conventionally, to identify unknown bacterial isolates, the 16S rRNA gene sequences are compared with those in the Genbank database (BLAST), and the closest similarities are listed in Table 1. However, the disadvantage of this tool is that the database is created from public contributions, and it is therefore possible that unknown sequences may be compared to misidentified or incorrectly named strains. Therefore, to confirm the BLAST results, the sequences were also compared with a more specific 16S rRNA prokaryote gene sequence database using EzTaxon-e (ChunLab USA Inc), a Web-based tool for the identification of prokaryotes (including uncultured prokaryotes). This database is manually curated and quality controlled, and is thus less susceptible to contamination by false species identifications made by the public; hence, it would be expected to be more accurate. For example, in the case of Bacillus spp., the results of the BLAST search will not take into account the reclassification of genera in 2009 and current changes within the group's nomenclature, and the results may therefore be erroneous or out of date.
Whichever database is used, the cut-off value for the percentage similarity is also critical. Yarza et al. (2014) and López-López et al. (2013) described with statistical proof that >97-98% allows for determination at a species level. Other investigators have used >97% as a cut-off value (Belduz et al., 2003; Drancourt and Raoult, 2005). When the value is lower than 95%, the result cannot be accurate at the genus or species level.

Percentage guanine-cytosine (GC) content
The GC% of a fragment of DNA or of a whole genome is the proportion of all bases present that are either guanine or cytosine rather than adenine or thymine. A G-C base pair is held by three hydrogen bonds and an A-T pair by two, so a higher G-C content results in a more stable DNA molecule. The GC% of a bacterial genome and the GC% of the stem of the 16S rDNA have been correlated with optimal growth temperatures (Galtier and Lobry, 1997; Wang et al., 2006). Furthermore, the GC% varies among different genera (Muto and Osawa, 1987), which has led to its inclusion as supportive information in the taxonomic classification of bacteria. The GC% of the 16S rRNA gene was included in this study (Figure 1) to establish whether this ratio could show which strains were similar and whether it could provide supportive data for discriminating between different genera. The results of this study showed that different genera could be distinguished from each other based on GC% (Aneurinibacillus from Brevibacillus, and similarly Anoxybacillus from Bacillus). GC% therefore provides supportive information but cannot be used in isolation for identification at a genus level.

Computer-simulated amplified ribosomal DNA restriction analysis (ARDRA)
ARDRA is based on the number and size of fragments that are generated when a PCR product of the 16S rRNA gene is digested with a restriction enzyme, and the fragments are separated according to their lengths by agarose gel electrophoresis. The generated pattern can discriminate between species depending on the enzyme used (Rajendhran and Gunasekaran, 2011). The use of computer-simulated fragments is a valid assessment of genotyping (Moyer et al., 1996; Wei et al., 2007; Sklarz et al., 2009), and is even faster and more cost-effective than digesting with the enzyme in the laboratory. In this study, ARDRA was not performed in the laboratory but in silico, due to limited resources. Restriction enzymes were selected based on previous investigations of ARDRA on Bacillus spp., namely Rsa1, Hinf1 and HaeIII (Wahyudi et al., 2010), Alu1, Taq1 and Rsa1 (Wu et al., 2006), and HaeIII and Alu1 (Rai et al., 2015), together with Hph1, MboII and Fok1, which resulted in several different fragments when processed by the online tool RestrictionMapper. In this study, the eight restriction enzymes (Alu1, Taq1, HaeIII, Hinf1, Rsa1, Hph1, MboII and Fok1) used independently did not produce informative clustering since each enzyme only revealed information from 14 to 24 sites, and therefore a total of 148 sites from all the restriction enzyme patterns were analysed together. This produced a dendrogram with three main groups (A, B, C) (Figure 2). There was some overlap with the maximum likelihood phylogeny tree (Figure 3), although, as expected, with a much lower resolution. Anoxybacillus spp. clustered separately from the Bacillus spp. with a convincing bootstrap value of 72%. In silico ARDRA with eight restriction enzymes could not discern the Bergey's groupings of Bacillus (A-K) but was useful in demonstrating similarities between isolates in this study. This is discussed further in the identification of isolates that were poorly defined by a <95% match with BLAST.
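The combined total of 148 sites can be checked against the per-enzyme counts given in the methods. The short sketch below simply sums those reported counts; the variable names are illustrative.

```python
# Number of sites reported per enzyme in the methods section.
SITES_PER_ENZYME = {
    "Alu1": 15,
    "Taq1": 18,
    "HaeIII": 24,
    "Hinf1": 21,
    "Rsa1": 18,
    "Hph1": 23,
    "MboII": 15,
    "Fok1": 14,
}

total_sites = sum(SITES_PER_ENZYME.values())
fewest = min(SITES_PER_ENZYME.values())
most = max(SITES_PER_ENZYME.values())

print(total_sites, fewest, most)  # 148 14 24
```

No single enzyme contributes more than about a sixth of the combined sites, which is consistent with the observation that individual enzymes did not produce informative clustering.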
Although the usefulness of ARDRA to cluster related bacteria has been described (Rahmani et al., 2006), including Bacillus spp. (Rai et al., 2015) from hot springs (Pagaling et al., 2012), soil environments (Wu et al., 2006; Wahyudi et al., 2010), and clinical, dairy and industrial settings (Logan et al., 2002), Sklarz et al. (2009) warned that its power to predict the identity of clones should be treated with caution. In the latter study of computer-generated ARDRA of 48 759 sequences from a ribosomal database, they reported that clones could be separated into different genera, but the clusters did not overlap with phylogenetic analysis of sequence data. This was supported by the present study. In combination with GC%, BLAST and phylogeny, in silico ARDRA is a useful tool for identification of bacteria using 16S rDNA, with phylogenetic analysis providing the most discriminating and accurate information. Because in silico findings completely reflected experimental results reported in Lactobacillus species (Firmicutes) (Oztuk and Meterelliyoz, 2015), and because of the shortage of publications describing computer-generated ARDRA in Bacillus spp., the following discussions include comparisons with investigations where ARDRA was not simulated.

Phylogenetic analysis
The family Bacillaceae are distinguished by their ability to form heat tolerant endospores, and as a result, they are abundant, robust and well distributed in many environmental niches, including hot springs. The prototype B. subtilis was first described in 1872, and prior to the 1990s, the genus Bacillus mainly constituted the family Bacillaceae. However, since then, many significant taxonomic changes have occurred, which has resulted in new genera being described and several species formerly "Bacillus" now reclassified into other genera. Consequently, a comparison with older published literature revealed that previously named "Bacillus" would appear as other genera in later publications. Mandic-Mulec et al. (2015) reviewed this group in great detail. Also, this family is expanding extremely rapidly with 25 new genera described in 2013 and 2014, and a total of 62 genera listed in 2015. The genus Bacillus within the family Bacillaceae is the largest group with 226 species described in 2015, and it is expanding rapidly with 38 new species having been described between August 2013 and March 2015.
Some of these species are represented by only one isolate, making verification challenging and increasing the complexity of Bacillus phylogeny. This confusion regarding the phylogeny of the genus Bacillus was reported by Maughan and Van der Auwera (2011), who observed that phenotypic groupings are not congruent with 16S rRNA groupings because this group is phenotypically so variable. A comparison of publications on Bacillus phylogeny in 1991 (Rössler et al., 1991), 2003 (Xu and Cote, 2003) and 2009 (Ludwig et al., 2009) confirms the rapid changes in this group's nomenclature. As a result, the nomenclature and classification of this group are challenging and difficult to keep up to date. Therefore, a literature review of Bacillus spp. isolated from hot springs will find different nomenclature depending on the date of publication. What may have been previously called Bacillus could be named "Geobacillus" or "Paenibacillus" (meaning "almost Bacillus") in later publications, introducing incongruence between different studies. A significant proportion of publications on the identification of Bacillus spp. from hot springs rely on only one tool of identification, a comparison of the 16S rRNA gene sequence to a public database, that is, BLAST (Ghalib et al., 2014; Obeidat et al., 2012), which has its shortcomings as previously mentioned. However, this study shows that other means of genotyping, such as phylogenetic analysis, can disprove conclusions that are based only on the BLAST tool. The consensus for an identification using the 16S rRNA gene sequence on Genbank BLAST is a >97% match (Yarza et al., 2014), and if studies are not stringent in applying this cut-off value, and merely report bacterial identification based on any genetic similarity, this leads to more "misidentification" within this group.

Family Bacillaceae genus Anoxybacillus
As compared to other groups, the Anoxybacillus group is relatively new, having been established in 2000. Cihan et al. (2012) suggested that Anoxybacillus is the most dominant genus in hot springs. Twelve of the 15 new species of Anoxybacillus listed in Appendix A were isolated from hot springs. Thirty-five of the 53 isolates of Anoxybacillus from hot springs in Turkey showed uniquely different patterns with ARDRA compared with 12 type species (Cihan, 2013) providing further evidence that new species of Anoxybacillus can be found in hot springs and that differentiation from reference strains is discernible by 16S rRNA phylogeny and ARDRA. A BLAST search confirmed that eight isolates from this study were Anoxybacillus spp. including A. flavithermus and A. rupiensis. The neighbour-joining phylogenetic tree grouped the eight isolates, with convincing bootstrap values (Appendix C).
However, the GC% of isolate 11T differed by more than one standard deviation from published Anoxybacillus spp. data (Appendix B), and the other Anoxybacillus isolates from this study. Results from ARDRA analysis confirmed that isolate 11T did indeed group separately from the Anoxybacillus cluster A (Figure 2), and requires further investigation.

Family Bacillaceae genus Bacillus
The majority of the isolates in this study fell within the genus Bacillus, more specifically into Bergey's Group A, which includes B. subtilis and B. licheniformis, two very closely related species (Ludwig et al., 2009). These two species are commonly described as isolates from hot springs in many investigations. In order to ensure that the isolates did not fall into other Bergey's groups not represented in the phylogenetic tree of Appendix C, another phylogenetic tree was drawn with only the Bacillus isolates from this study and reference type strains from Bergey's Group B (Bacillus lentus), Group C (Bacillus megaterium), Group D (Bacillus cereus), Group E (Bacillus aquimaris), Group F (Bacillus coagulans), Group G (Bacillus halodurans), Group H (Bacillus arsenicus), Group I (Bacillus smithii) and Group J (Bacillus panaciterrae) (Figure 3). The sequences were obtained from published databases as listed in Appendix A. The tree confirmed that the isolates in this study clustered with Group A (B. subtilis/B. licheniformis) by phylogeny, and that the single isolate 32Le was not a Bergey's Group A Bacillus sp. but clustered with B. panaciterrae, as confirmed by the Genbank BLAST result. Isolate 32Le was not differentiated from the rest of the Bacillus reference strains with respect to GC% (Appendix B), and ARDRA clustering did not correlate with the phylogeny tree. B. panaciterrae is represented by only one type strain (Gsoil1517), isolated from a ginseng field (Ten et al., 2006), and has not been previously reported as an isolate from a hot spring environment. However, only a tentative conclusion can be made that this is the first report of B. panaciterrae being isolated from hot springs, because there is only one type strain represented in this group and the result is therefore statistically inconclusive. Its novelty and difference from the other Bacillus isolates need further investigation.
Another example of the complexity of Bacillus identification relates to isolate 24M which, with a BLAST search, convincingly matched (99.85%) with Bacillus aerophilus and Bacillus stratosphericus. However, these reference strains were isolated (Shivaji et al., 2006) from samples of high altitude atmospheric cryotubes. Recently, Branquinho et al. (2015) suggested that these names be dropped from bacterial systematics, as B. aerophilus and B. stratosphericus were not represented in any type culture collection, and that they should be absorbed into the group of B. pumilus. However, Liu et al. (2015) reported that B. aerophilus was actually Bacillus altitudinis, and B. stratosphericus was a Proteus spp. This finding is a prime example where a Genbank BLAST result matches a nomenclature that is already dubious and questionable. By maximum likelihood phylogeny, the placement of 24M was inconclusive, and it did not cluster with B. pumilus. B. aerophilus has not been reported as an isolate of hot springs although B. pumilus has (Aanniz et al., 2015). One needs to be aware that a bacterial isolate with a 99.85% match to 16S rDNA sequences in databases within the public domain can be a different species.

Family Paenibacillaceae genus Brevibacillus
Brevibacillus is generally thought to be mesophilic, although in this study isolates 16S and 36Li were isolated at 53°C. Isolation of mesophilic Brevibacillus at higher temperatures is typical due to the presence of heat-tolerant spores, and Brevibacillus spp. have been reported previously from hot springs (Derekova et al., 2007; Cihan et al., 2012).
Even though there are challenges in precise identification of isolates, the aerobic Gram-positive spore-forming bacteria isolated in this study were similar to those reported in other investigations. Narayan et al. (2008) reported that of 104 isolates from hot springs in Fiji, 58% were A. flavithermus and 19% were B. licheniformis/Geobacillus stearothermophilus. Anoxybacillus, Brevibacillus, Geobacillus, and Bacillus made up the 76 isolates cultured from hot springs in Turkey (Derekova et al., 2008). Of 115 isolates, Cihan et al. (2012) listed seven genera in hot springs in Turkey which included Anoxybacillus, Brevibacillus, Geobacillus, and Paenibacillus. From hot springs in Morocco, Aanniz et al. (2015) found that 97.5% of 240 isolates were Bacillus spp. including B. licheniformis (n = 119), B. subtilis (n = 6) and B. pumilus (n = 3).

Analysis of unknown isolates
Unknown isolates in this investigation (15S, 52M and 73T) and three "unknown Bacillus" sequences obtained from Genbank that remain as yet unidentified were included in the analysis: clone TPB_GMAT_AC4 (Genbank HG327138.1) from hot springs in India, clone KSB12 (Genbank JX047075.1) from Indonesia, and clone DGG30 (Genbank AY082370.1) from China.
Isolate 15S matched B. licheniformis using BLAST and GC%, but not with phylogenetic analysis, where it clustered with 18S supported by a 75% bootstrap value, suggesting that it was possibly not a "group A type Bacillus". Similarly, isolate 52M matched Brevibacillus spp. using BLAST, but did not group with the Brevibacillus reference strain either phylogenetically or by GC% (Appendix C). ARDRA results suggested it was associated with Bergey's group A Bacillus and not Brevibacillus. Had the identification of isolates 15S and 52M relied exclusively on BLAST results, both could have been misidentified. Solibacillus, an undefined member of the family Bacillaceae, was included in this study because isolate 73T was similar to it using BLAST. The "different" status of this isolate was confirmed by a lower GC% of 53.91%, but ARDRA did not add any further discerning information. Neighbour-joining phylogenetic analysis placed isolate 73T with Aneurinibacillus spp. (Appendix C), and therefore this isolate could not be assigned to any genus with any degree of certainty.
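The GC% values referred to above are simple base-composition statistics. A minimal Python sketch of the calculation is given here; it assumes a plain nucleotide string as input and ignores ambiguous bases such as N, which may differ from the way the values in Appendix B were computed. The function name and the short test sequence are hypothetical.

```python
def gc_percent(sequence: str) -> float:
    """Return the percentage of G and C among the unambiguous bases (A, C, G, T)."""
    bases = [b for b in sequence.upper() if b in "ACGT"]
    if not bases:
        raise ValueError("sequence contains no unambiguous bases")
    gc_count = sum(1 for b in bases if b in "GC")
    return 100.0 * gc_count / len(bases)


# Hypothetical short fragment, for illustration only (not a real 16S sequence).
example = "AGAGTTTGATCCTGGCTCAG"
print(f"GC% = {gc_percent(example):.2f}")
```

In practice the same function would be applied to the full 16S rRNA gene sequence of each isolate and reference strain, and the resulting values compared between groups.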
Clone DGG30 from China had a GC% of 53.2%, similar to Solibacillus and a standard deviation different to that of Bacillus, Anoxybacillus and Brevibacillus. It clustered with the ARDRA group A, suggesting it was not related to Bergey's group A Bacillus spp. Clone KSB12 from India was grouped with Bacillus spp. by GC% and, by ARDRA, clustered with B. megaterium and B. lentus with a 56% bootstrap value, suggesting that it could be related to Bergey's Bacillus group B or C. Indonesian clone TPB_GMAT_AC4 was confirmed to be Bacillus spp. by GC%, but no further resolution could be obtained about its identification. Thus, published 16S rDNA sequences obtained from "uncultured bacteria" can be analyzed retrospectively using a combination of tools.
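In silico ARDRA amounts to locating restriction sites in a 16S rDNA sequence and recording the resulting fragment lengths for each enzyme. The following Python sketch shows the principle only. The eight enzymes used in this study are not named in this section, so the three enzymes below (AluI, HaeIII and TaqI, with their standard recognition sites) are illustrative choices, and the function names and test sequence are hypothetical.

```python
# Recognition site and cut offset (bases from the start of the site).
# ASSUMPTION: illustrative enzymes, not necessarily those used in the study.
ENZYMES = {
    "AluI": ("AGCT", 2),    # AG^CT
    "HaeIII": ("GGCC", 2),  # GG^CC
    "TaqI": ("TCGA", 1),    # T^CGA
}


def cut_positions(sequence: str, site: str, offset: int) -> list[int]:
    """Return every cut position for one recognition site in a linear sequence."""
    positions = []
    start = sequence.find(site)
    while start != -1:
        positions.append(start + offset)
        start = sequence.find(site, start + 1)
    return positions


def digest(sequence: str, enzyme: str) -> list[int]:
    """Return fragment lengths (5' to 3') after a complete digest with one enzyme."""
    site, offset = ENZYMES[enzyme]
    sequence = sequence.upper()
    bounds = [0] + cut_positions(sequence, site, offset) + [len(sequence)]
    return [bounds[i + 1] - bounds[i] for i in range(len(bounds) - 1)]


def ardra_profile(sequence: str) -> dict[str, list[int]]:
    """Fragment lengths per enzyme, sorted largest first as on a gel."""
    return {name: sorted(digest(sequence, name), reverse=True) for name in ENZYMES}


# Hypothetical 16-base sequence, for illustration only.
print(ardra_profile("AAAAGCTTTTGGCCAA"))
```

Profiles produced in this way for each isolate and reference strain can then be compared or clustered. Because all three recognition sites are palindromic, searching one strand is sufficient.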
In conclusion, 43 isolates from Limpopo hot springs were cultured and, by comparison with 16S rDNA sequences in public databases and by phylogenetic analysis, grouped into four genera: Anoxybacillus, Bacillus, Brevibacillus, and Aneurinibacillus. More specifically, the following species were identified: A. flavithermus, A. rupiensis, B. subtilis and B. licheniformis. Single Bacillus isolates that are phylogenetically related to B. panaciterrae, B. pumilus and B. methylotrophicus were also identified; however, these three isolates require further characterization. All except B. panaciterrae have previously been isolated from hot-spring environments. When the 16S rRNA gene sequences were also analyzed by computer-simulated ARDRA using a collection of eight different restriction enzymes, additional discernment of individual isolates was possible. Therefore, this study shows that the use of a single molecular tool may result in misidentification of Bacillus and Bacillus-related isolates and that, when possible, a combination of tools should be used.
The complexity and problems regarding the Bacillus phylogeny were discussed. Only a small portion of the microbial diversity present in hot springs can be cultured, compared with the more comprehensive assessment of microbial diversity obtained using the metagenomic approach. Isolation rates could be improved by the use of different media and different incubation conditions. The isolation of three different types of extremophiles with different properties (alkaliphilic, thermophilic and halophilic), as well as three unknowns, suggests that hot-spring water is a resource for potentially important bacteria useful in biotechnology and a source of novel bacteria. Hot-spring sites need to be protected, conserved and maintained as valuable indigenous and pristine natural resources.