Developing macrohabitat models for bats in parks using maxent and testing them with data collected by citizen scientists

Protected areas may function as islands of habitat in otherwise adverse environments for many species of North American animals. It is currently unclear how to maintain suitable foraging habitat for bats within these areas. Bats are nocturnal and highly mobile, making their specific needs difficult to determine. Using data collected from acoustically surveyed sites within protected areas in the Oak Openings Region of Northwest Ohio, a biodiversity hotspot, we developed spatially explicit macrohabitat models using maximum entropy modeling (Maxent). We then used data collected by citizen scientists to test these models to determine their success in predicting species presence. We found that the models were successful (AUC values > 0.75) at predicting the occurrence of the seven species for which models were developed, Lasionycteris noctivagans, Lasiurus borealis, Lasiuruscinereus, Myotis lucifugus, Myotiss eptentrionalis, Nycticeius humeralis, and Perimyotis subflavus. Within protected areas, it is important to manage for heterogeneous habitat composition at this intermediate scale to maintain potential for foraging areas for all occurring bat species. Data collected by citizen scientists is useful to test spatially explicit models and can potentially be used to monitor long term changes in bat species composition in these systems and across regions.


INTRODUCTION
Studies on the activity of bats in the summer are often based in areas that are relatively forested and intact (Brigham, 2007).However, data on activity in systems typified by habitat loss and urbanization are lacking (Avila-Flores and Fenton, 2005;Dixon, 2011;Duchamp et al., 2004;Gehrt andChelsvig, 2003, 2004;Sparks et al., 2005).Evidence from more thoroughly studied species suggests, however, that activity will differ depending on the landscape context (Estes and Mannan, 2003).
Within human dominated systems, protected areas for example metroparks and parkland, exist in pockets of relative isolation (Donnelly and Marzluf, 2004;Rothley et al., 2004) surrounded by a potentially hostile matrix of development and agriculture.How bats utilize protected areas is unclear, but studies have demonstrated higher species diversity inside parks compared to outside of them (Avila-Flores and Fenton, 2005;Duchamp and Swihart, 2008;Glendell and Vaughn, 2002;Jung and Kalko, 2011).This suggests that protected areas may serve as a critical refuge from human-mediated impacts *Corresponding author.E-mail: jessicasewald@missouristate.edu.(Glendell and Vaughan, 2002;Loeb et al., 2009).
It is important to identify the characteristics of these parks and their context that promote long-term viability of native species.Additionally, there may be more variability in bat activity and presence within a park than between parks (Gehrt andChelsvig, 2003, 2004;Johnson et al., 2008) and species diversity may be lower in urban rather than rural parks (Johnson et al., 2008;Kurta and Teramino, 1992;Loeb et al., 2009).Some of these findings may also reflect the variability of individual species for specific habitat characteristics (Lacki et al., 2007) as a result of different morphology and echolocation abilities (Aldridge and Rautenbach, 1987).In general, these species-specific differences are predictive of where a bat will forage, but this may vary depending on the location (Kurta and Whitaker, 1998) or species may show plasticity in their foraging (Ratcliffe and Dawson, 2003).
To determine potential effects of management, habitat changes, and prioritizing areas for protection, we need to understand these differences in species distributions and behavior.Previous models for predicting bat distributions have often been conducted at the landscape scale, which by their nature are relatively coarse, placing up to 1 km buffers around survey points (Ford et al., 2006).A focus on this large scale provides limited insight into the characteristics of the area immediately surrounding the foraging environment and this intermediate macro habitat (Saab, 1999) scale is often the target of management within protected areas (Abella et al., 2001).
To address this need, we relied on acoustic surveys to determine bat presence at the macrohabitat scale.Acoustic surveys can be conducted in areas where capturing bats with the traditional method of mist netting is difficult (that is within parks).The present data collected by acoustics has been successfully used to model species presence in association with habitat characteristics in a number of other studies (Brooks and Ford, 2005;Erickson and West, 2003;Ford et al., 2005;Ford et al., 2006;Francl et al., 2004;Johnson and Gates, 2008;Loeb and O'Keefe, 2006;Zimmerman and Glanz, 2000).
Linking species data to the associated habitat characteristic relies on any number of statistical approaches depending on the sample sizes and the structure of the data.We chose to use maximum entropy modeling in the program Maxent to build predictive habitat models based on presence of bats detected through acoustic monitoring.Maximum entropy modeling and the program Maxent is a way to model species distribution using presence data (Phillips et al., 2006) with small sample sizes (Hernandez et al., 2006;Kumar and Stohlgren, 2009).Although we had absence data for our sites, the data can be misleading as we cannot be confident that these are true absences (Anderson, 2003).Maxent has been used to successfully model the distribu-tions of a range of taxa: plants (Schetter, 2012;Kumar and Stohlgren, 2009), exotic ant species (Ward et al., 2007), birds (Elith et al., 2006), geckos (Pearson et al., 2007), as well as African (Lamb et al., 2008), Asian (Hughes et al., 2012), and European (Rebelo et al., 2010) bats.The program takes the user-defined environmental layers within a geographic area and estimates the probability distribution of maximum entropy (or closest to uniform).This approach allows us to maximize the use of our acoustic data while minimizing the assumptions necessary.
In most modeling situations, a subset of the originally collected data is withheld and then used to test (testing data) the model.We chose instead to use a novel approach of testing the model with a data set collected by citizen science volunteers.In this way, we demonstrate that data collected by citizen scientists can be used as an effective independent test of the model and increase our confidence in its application (Guisan and Zimmerman 2000).The data collected by citizen scientists are also easily added to the program Maxent and analyzed using "Area Under the Curve" (AUC) of a "Receiver Operating Characteristics" (ROC).
Acoustic surveys of bats conducted by volunteers have been a way of monitoring bat trends in England for a number of years (Walsh et al., 1993), but have not been widely used in the United States.The original goals of citizen science programs were education and outreach, but large amounts of scientifically useful data can also be collected (Bonney et al., 2009).Examples of this abound in the United States in which large scale studies of birds are quite successful (Lepczyk, 2005).Citizen science is now increasingly used in studies from classifying star systems (Raddick et al., 2010) and monitoring seismic activity (Cochran et al., 2009) to wildlife sightings on major roads (Lee et al., 2006).
Our goals were to develop a macrohabitat model of bat presence for all occurring bat species at the macrohabitat level using Maxent, and then demonstrate the usefulness of testing these models with data collected from citizen scientists.

Study area
The Oak Openings Region of Northwest Ohio (located in the north central portion of the United States) is a 476 km² area characterized by soil types from post glaciation events and contain a heterogeneous mix of habitats including vulnerable or imperiled plant communities (Noss et al., 2005) such as the critically endangered oak savanna (Brewer and Vankat, 2004).Considerable fragmentation has occurred due to increased urbanization and agricultural expansion (Brewer and Vankat, 2004) (Figure 1).This region remains an area of high biodiversity and protected areas within the region are considered critical stopover locations for migrating birds (Ewert et al., 2005) and potentially bats (V.Bingman, personal communication).

Acoustic monitoring to determine species presence
From June 1st to September 2nd, 2009, we acoustically surveyed 32 sites five times each with a broadband acoustic device (Anabat, Titley Electronic, Ballina, New South Wales, Australia).These sites were within two of the main protected areas within the region, an area of 1722 ha, comprising approximately 10% of the natural area remaining.These protected areas are owned and maintained by the Metroparks of The Toledo Area and are both within Lucas County, Ohio.
Sites within these parks were chosen because they encompassed all described habitat types (Ford et al. 2005, Loeb andO'Keefe, 2006) and were within 0.5 km of water, which all bats rely on to varying degrees (Francl, 2008;Vaughan et al., 1997).
Methods of echolocation monitoring followed those previously well established (Brooks and Ford, 2005;Brooks, 2009;Johnson and Gates, 2008;Johnson et al., 2008;Ford et al., 2005;Ford et al., 2006;Francl et al., 2004;Francl, 2008) for acoustic monitoring.Four sites within a quarter of a mile to each other were surveyed in the same night and each was actively surveyed for 20 minbefore moving onto the next site.Monitoring began approximately 0.5 h after sunset and ended 3 h thereafter, covering the time frame when bat activity is most homogeneous (Hayes, 1997).All sites were greater than 100 m apart, which is well outside the reception area of the Anabat (Livengood, 2003).We also avoided sampling during times of strong wind (for example > 3 on Beaufort scale) or rain.
All files with more than three identifiable calls were analyzed and taxa determined (by the primary author who compared calls to a known call library) to species level, both qualitatively (Analook version 3.7w), and quantitatively (Allen, Bat Call Identification version 2.0.5.2,Kansas City, MO).When the two methods disagreed on identification, the call file was qualitatively inspected again and the primary author determined identification.Each species was considered present if it was detected at least once during the five surveys at a given location.

Macrohabitat characteristics
We derived macrohabitat characteristics using ArcMap 9.2 software (ESRI, Redlands, California, USA).The original landcover map for the Oak Openings Region was developed by Schetter and Root (2011) using 30 m pixel Landsat data and contains a total of 15 different land classes, including asphalt, turf, residential, swamp, floodplain and upland forest, savanna, wet prairie, prairie, barren, meadow, shrub/scrub, conifer, crop and pond.We excluded wet prairie, barren, shrub/scrub and conifer cover in further analysis due to their low sample size and relatively low frequency within the Oak Openings Region.
We used the program FRAGSTATS (McGarigal and Marks, 1995) and a 60 m circular moving window to determine the percentage of land cover type around each 30 m pixel as well as measures of fragmentation including cohesion, number of patches, landscape shape index, and the Simpson diversity index for types of land cover.We also determined distance to nearest road, stream (US Census Bureau, 2009), residential and agricultural area.

Model development
We first ran correlation analysis on all environmental variables and those that were correlated r > 0.6, p<0.05 were assessed and variables chosen a priori to decrease overfitting due to model complexity (Merow et al., 2013).Distance to residential area, distance to roads, and the percentage of residential cover were all correlated.We chose to use distance to roads in all models since roads are a critical feature that indicate human influence on the landscape.Roads may also facilitate unimpeded movement, or foraging, as ditches (which may be used as alternative water and feeding sites, Vindigni et al., 2009) are commonly adjacent to most major roads in the area.Measures of fragmentation and heterogeneity were also found to be highly correlated so we only included the number of habitat patches as a general measure of fragmentation in model development.
For all species, we included distance to stream, and distance to agriculture, as these variables have been found to be important for bats in general (Yates and Muzika, 2006;Either and Fahrig, 2011;Grindal et al., 1999) and particularly those in urban/agricultural matrices (Gehrt and Chelsvig, 2004;Duchamp and Swihart, 2008).We retained all measures of forest for example upland, swamp and floodplain) and open cover (for example meadow and prairie), as well as savanna because of its unique status in this region To decrease the issues surrounding spatial autocorrelation (Veloz, 2009), we employed the methods of Parolo et al. (2008) for species in which we had greater than 15 presence points (northern Myotis, little brown bat and eastern red bat).We used GIS to determine the distance of each presence point from each other and randomly removed presence points within a threshold distance (30% of locations closest to each other).We chose not to remove presence points for the remaining species, choosing instead to maximize sample size, with the understanding that some overfitting in the models may occur fo rthose species.
We ran ten replicates with the default settings (Phillips and Dudik, 2008) in the Maxent program (v.3.3.3k, Phillips et al., 2006) to develop habitat distribution models for each bat taxon that was recorded during our acoustic surveys.The model outputs were on a logistic scale in which each map pixel was assigned a number between 0 (low habitat suitability) and 1 (high habitat suitability).Each model was then combined into an overall species richness model.This was done by summing the model output for each of the seven species with a resulting map made up of pixels ranging in number from zero to seven.A zero represents no species likely present, while a seven would be all species likely present.Our methodology did allow us to gather absences but due to the potential pitfalls of absence data (Anderson, 2003) we chose to use the Maxent method which uses only the presence points.However, we also conducted Wilcoxon-signed rank tests between the presence and absence of each species in association with the environmental variables to further support the Maxent models.

Model testing
From June1 th -August 15 th of 2011, a citizen science program held in conjunction with the Metroparks of the Toledo Area was initiated.Volunteers walked along ten park trails that occurred within two parks where data was originally collected, and two smaller areas not previously surveyed.Volunteers were trained on the use and how to hold the acoustic monitor while walking and the pace at which to walk.Volunteers began walking the trails between 15 minand a half hour after sunset and concluded 45 min to 1 h later.Each trail was surveyed between one and five times.This program was continued in 2012, and nine of the ten original trails were monitored using the same protocol as the previous year.
The data for each species along volunteer-monitored trails were used to test the relevant macrohabitat model.GPS coordinates corresponding to the detection of each species were taken and entered into Maxent as test data.The model performance in terms of the test data was evaluated using ROC curves.ROC curves balance both omission and commission errors in a model set generating a graph line that represents a random level of performance (Fawcett, 2006).The AUC are between 0 and 1 and values of 0.5 are considered a random prediction (Fawcett, 2006).A second evaluation of the test data given by the Maxent program is a threshold dependent evaluation (ROC is threshold independent).This uses a χ² test to determine the difference between the proportions of predicted area generated by the model, versus what would be predicted from random (Phillips et al., 2006).

Species detected
During the initial 2009 surveys, a total of 1,570 call files were recorded and identified to species.Species detected included big brown (Eptesicus fuscus) (1,195 files), Eastern red (Lasiurus borealis) (118 files), little brown (Myotis lucifugus) (81 files), tri-colored (Perimyotis subflavus) (54 files), Northern Myotis (Myotiss eptentrionalis) (39 files), silver-haired (Lasionycteris noctivagans) (34 files), hoary (Lasiurus cinereus) (26 files), and evening (Nycticeius humeralis) (23 files) bats.Three files keyed out to the endangered Indiana bat (Myotis sodalis), but because of the difficulty of distinguishing the calls of this species from the little brown bat (Britzke et al., 2002), we could not definitively determine its presence.Big browns have been found to be ubiquitous in many urban situations (Loeb et al., 2009;Johnson et al., 2008), and we had similar results.Big browns were present in every location in both the originally collected data and the citizen science collected data; therefore we dropped them from further habitat modeling since there was no difference in occupancy across the surveyed sites any resulting model would include all similar natural areas.The remaining seven species were present at a low of five sites for the hoary bat to a high of 19 sites for the little brown bat.

Habitat models
The percentage of contribution of the ten environmental variables to the Maxent models are shown in Table 1, while Figure 2a and b show the suitable area for each species.Those environmental factors associated with urban/agricultural areas, including distance to roads, distance to agriculture and the number of patches, had varying importance in models for each species.
Presence of Northern myotis, little brown, tri-colored and eastern red bats were most likely at intermediate distances from agriculture.Distance to roads provided a negligible contribution to all models, while the number of different habitat patches contributed to the models for evening, eastern red and silver-haired bats.The distance to water provided a large contribution to models for all seven species and presence was more likely closer to water, although this difference was not as evident when looking at only the results of the Wilcoxon-signed rank test (Tables 1 and 2).The type of forest covers that contributed to each species model generally aligned with expectations for that species based on previous literature.
Northern myotis and little brown bat models had contributions from upland forests.Open adapted bats (silver-haired, eastern red and hoary) had combi-nations of contributions resulting from upland forest, prairie, meadow and savanna.
The developed models for all seven species were significantly better than random when considering the threshold dependent χ² test at the 1, 5 and 10% omission thresholds (a proxy measure for the amount of suitable habitat misclassified as unsuitable), as well as when commission and omission rates are balanced (Table 3).
In all cases, the models were significantly better than a random model at predicting suitable habitat.The predicted amount of suitable habitat at the 10% threshold ranges from a low of 30% for the little brown bat to a high of 48% for the hoary bat.The multi-species model (Figure 3) demonstrates overlap in locations throughout the Oak Openings Region that are potentially suitable for all seven species.
The models using the training data all exceeded the "very good" threshold of 0.9 based on the threshold independent AUC tests (Swets, 1988); however, only two models using the test data met this threshold (Table 3) (evening and eastern red bats).The remaining models using the test data were still well above the cut-off of 0.75, though, which indicates that the discrimination ability of the model was still considered useful (Elith et al., 2006).(Vaughan and Omerod, 2005).Gathering these data can be expensive and time consuming.
In contrast, employing volunteers in citizen science data collection is relatively inexpensive and provides an opportunity to gather large data sets across space and time (Reisch et al., 2013).This volunteer program has The original acoustic data was used to determine the training AUC and this was what was used to develop the model.The test AUC used the citizen science collected acoustic data.Maxent statistically compares test data against a random prediction with the same fractional predicted area.All test data was significantly better than random at the <0.001 level for all omission thresholds.continued and we hope to expand it to not only other parks, but urban and suburban areas throughout the region.
Habitat models provide a powerful tool for monitoring and conserving vulnerable species in a rapidly changing world.Our models are a baseline, or a first step, in characterizing the diversity patterns in the area and modeling presence on a macrohabitat scale, something that has not been done for bats.These models can be refined and test data can be included into training data as new test data becomes available.We used variables that are easily extracted from GIS and updated over time.The models allow us to aid managers in finding potentially important foraging sites and assist in long-term monitoring.
Indeed, these models have already been used to identify areas of priority conservation within the protected areas (Lipps, unpublished data).We hope that they can be used to identify areas outside of parks that could be protected as commuting or foraging habitat.With this long term monitoring, there is also the potential to understand how changes such as management may increase (for example by removing structural clutter (Tichenell et al., 2011), or decrease (through loss of canopy cover (Smith and Gehrt, 2010) species presence.
The results of our models indicate a need to maintain heterogeneity in habitat types.It appears that protected areas within the Oak Openings Region can support a suite of species when considering foraging activity as long as a variety of successional states are maintained.At this macrohabitat scale, and within protected areas, the fragmentation and development that we measured did not appear to affect the presence of these species, although very few areas outside of the parks appear to be suitable habitat.
Distance to roads had a negligible contribution to all species models; the largest contribution of this factor was for little brown bats.This species is generally found further away from urban development (Duchamp and Swihart, 2008), but roads could serve as commuting areas to roosts (Riskin and Pybus, 1998); insect hot spots due to heat retention; openings within forests; or edges in areas where tree lines are next to roads.The concentration of suitable habitat in the northern part of the Oak Openings Region follows the drainage ditches that are unique to this area, which are also associated with roads.
Measures of forest cover were predictors for the taxa generally considered to be forest obligate (northern Myotis and little brown bats), while the presence of open areas -prairie and savanna-contributed to the models of the larger bodied eastern red bats.The presence of savanna also contributed to little brown, tri-colored, evening and silver-haired bats, indicating a need to explore the importance of this habitat type further.
Bats are an integral part of North American ecosystems as the primary predators of night flying insects, and as such, it is important that we understand how to maintain populations of these organisms across diverse contexts.This is an important consideration within protected areas as they are often considered islands of suitable habitat.Through this work, we found that the scale of consideration is important and may differ across species, but that the macrohabitat scale is generally predictive of species presence and can be used in predicting species occurrence within protected areas of this region.In terms of management, heterogeneity of land cover and successional states is important in supporting a diverse group of species.Using the combined approach of Maxent modeling and model testing using data collected by citizen scientists, we were able to increase our understanding of the important habitat components for bat species in protected areas to assist in conservation and management, while engaging and educating the local stakeholders.

Figure 1 .
Figure 1.Map of the Oak Openings Region showing the extent of fragmentation caused by roads (lines), agriculture and urban areas (inset of location within Ohio).

Figure 2 .
Figure 2. (A) Maxent model results for the four species that are considered open adapted of the bats within the Oak Openings Region of Northwest Ohio including silver-haired, red, tri-colored and hoary bat.Map showing both the full extent of the Oak Openings and that within protected areas.(B) Maxent model results for the three species that are considered forest adaptedof the bats within the Oak Openings Region of Northwest Ohio,including little brown, northern Myotis and Evening bats.Map showing both the full extent of the Oak Openings and that within protected areas.

Figure 3 .
Figure 3. Multispecies bat model with all protected areas indicated within the Oak Openings.Region of Northwest Ohio.Model developed by combining each individual species model from Maxent.Zero indicates no species likely present to a high of all seven species likely present.

Table 1 .
Percentage of contribution of ten environmental variables to maxent species distribution models developed within the Oak Openings region for each of seven species of bats.
Symbols that follow each percentage indicate the response curve given to each environmental variable by Maxent."+" indicates increasing, "-" indicates decreasing, "+/-" indicates an initial increase followed by a decrease and "n" is no change.

Table 2 .
Wilcoxon-signed rank test between the presence and absence of seven bat species across ten environmental macrohabitat variables.Tests were conducted on data collected at 32 acoustic survey sites in two protected areas within the Oak Openings Region in 2009.

Table 3 .
Results of maxent models and "area under the curve" ROC analysis for each of seven species of bats within the oak openings region of northwest Ohio.Also displayed are the percent of predicted suitable area under 1, 5 and 10% omission thresholds of the test data.