Advances in GIS and information technology have taken conservation work and strategies a step further, as most conservation work now depends on these technologies. The present study explores the predictive ability of MAXENT with a very low sample size by applying jackknife analysis over a well-defined smaller region and using only climate data. Vanda bicolor is a horticulturally important orchid that grows in certain patches of the North Eastern region of India, and the species is considered “Vulnerable”. The present study reports a distribution prediction model for a small area using different geo-climatic parameters. Model validation by ground truthing gave a significantly successful result, which clearly demonstrates the ability of the MAXENT prediction model to give a high success rate (71%) with few training samples. Using a low sample size over a larger area results in unstable models; however, applying these samples within a smaller radius around the occurrence points can provide good working models.
Recent changes in climatic conditions have increased the pressure on plant species, and many important species have been subjected to severe stress, pushing them to the brink of extinction. In this context, conservation has become a topic of utmost importance. Many plant species have been destroyed before they were documented or their value was realized, and most medicinal and ornamental plants are highly exploited for their economic value. In this scenario, knowledge of the distribution and niche radius of a target plant species allows conservators to work out effective conservation strategies for species at risk. Species distribution modeling has become an important tool for conservation work, as it provides insight into a species' geographical and climatic requirements, and these data can be of immense help to conservationists.
Developing a niche model requires large numbers of climatic variables. However, certain species, especially rare and threatened ones, are restricted to a narrow geographical area, and it is not possible to obtain large numbers of variables to develop a robust distribution model. In the recent past, a few reports have been published on models developed using low sample sizes that gave significant results, e.g., a sample size of 2 - 4 [
Vanda bicolor Griff. is a rare epiphytic orchid of the genus Vanda in the family Orchidaceae. The plant has a leafy stem almost fully enveloped by leaf sheaths; the leaves are oblong, curved, slightly twisted in the middle, and obliquely bilobed at the apex, with each lobe tridentate. Vanda bicolor flowers from March to June; the inflorescence is axillary and glabrous, and the flowers are white-purplish, mottled above and tinged violet beneath, with a floral size of 4 - 6 cm (
The study was carried out in Nagaland, India, which lies between 93˚20' - 95˚15' East Longitude and 25˚31' - 27˚1' North Latitude, with a total geographical area of 16,579 km². According to the Forest Survey of India, 2013, the state has a forest cover of 13,044 km² (78.68%), with a total forest cover loss of 274 km² since the 2011 report. The state falls under seven forest types viz. Tropical Wet Evergreen, Tropical Semi Evergreen, Tropical Moist Deciduous, Subtropical Broad Leaved Hill, Subtropical Pine and Montane Temperate Forests, with an average rainfall of 1583 mm [
In the present study, only 4 occurrence points were used to develop the model; all presence points were geo-referenced from primary ground surveys using GPS. The occurrence points were subjected to a quality test and their positional accuracy was ascertained through Google Earth. Duplicates were identified and removed, keeping only one point within each 1 × 1 km grid cell to avoid sampling bias, which would otherwise favor the climatic conditions of sites where sampling is highly concentrated. As the number of presence points is below 20, the 1.5 × interquartile range (1.5 IQR) method was applied to check for outliers, based on climate data derived from the environmental data obtained from the WorldClim website at 30'' resolution. All climate data were cross-checked for resolution accuracy and corrected to 30'' pixel resolution.
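The two screening steps described above, spatial thinning to one point per grid cell and the 1.5 IQR outlier check, can be sketched roughly as follows. This is only an illustrative sketch and not the procedure actually used in the study: the function names `thin_to_grid` and `iqr_outliers`, the approximation of a 1 km cell by a 30 arc-second (1/120 degree) cell, the choice of quartile method, and all sample values are assumptions.

```python
import math
import statistics


def thin_to_grid(points, cell_deg=1 / 120):
    """Keep only the first (lon, lat) point that falls in each grid cell.

    cell_deg is the cell size in degrees; 1/120 degree (30 arc-seconds)
    is used here as an assumed stand-in for a roughly 1 km cell.
    """
    seen = set()
    kept = []
    for lon, lat in points:
        cell = (math.floor(lon / cell_deg), math.floor(lat / cell_deg))
        if cell not in seen:
            seen.add(cell)
            kept.append((lon, lat))
    return kept


def iqr_outliers(values, k=1.5):
    """Return the values lying outside [Q1 - k*IQR, Q3 + k*IQR]."""
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < lower or v > upper]


# Hypothetical coordinates: the first two points share one grid cell.
sample_points = [(94.4200, 26.2041), (94.4201, 26.2042), (94.3110, 25.9683)]
thinned = thin_to_grid(sample_points)

# Hypothetical values of one climate variable extracted at presence points.
sample_values = [10, 11, 12, 13, 50]
flagged = iqr_outliers(sample_values)
```

In this sketch the thinning step keeps whichever point is encountered first in a cell; a real workflow might instead choose the point with the best positional accuracy.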
Nineteen bioclimatic variables for zone 29 were obtained from WorldClim at 30'' pixel resolution, which consist of interpolated datasets of temperature and precipitation [
All modeling work was carried out using MAXENT version 3.3.3k, as our work is based on presence points only and has a low sample size [
The model calibration gives a test AUC of 0.984, with a standard deviation of 0.004. The AUC ranges from 0.5 for models that are no better than random to 1.0 for models with perfect predictive ability (
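To make the AUC scale concrete, the following minimal sketch computes AUC as the probability that a randomly chosen presence site receives a higher suitability score than a randomly chosen background site, with ties counted as one half. It is an illustration of the statistic only, not MAXENT's internal implementation; the function name `auc` and the example scores are assumptions.

```python
def auc(presence_scores, background_scores):
    """Rank-based AUC: share of presence/background pairs ranked correctly.

    A pair counts 1 when the presence score is higher, 0.5 when tied,
    and 0 when the background score is higher.
    """
    wins = 0.0
    for p in presence_scores:
        for b in background_scores:
            if p > b:
                wins += 1.0
            elif p == b:
                wins += 0.5
    return wins / (len(presence_scores) * len(background_scores))


# Hypothetical suitability scores, for illustration only.
perfect = auc([0.9, 0.8], [0.1, 0.2])    # every presence outranks every background
partial = auc([0.9, 0.3], [0.5, 0.1])    # three of four pairs ranked correctly
uninformative = auc([0.5], [0.5])        # tied scores, equivalent to random
```

A model that separates presence from background perfectly scores 1.0, while one that cannot distinguish them scores 0.5, which matches the range described above.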
MAXENT jackknife test of variable (
Variable | % contribution | Permutation importance (%) |
---|---|---|
bio18 | 50.3 | 31.6 |
bio13 | 13.8 | 1.9 |
bio19 | 11.5 | 23.3 |
bio2 | 6.8 | 4.7 |
bio14 | 6.3 | 5.6 |
bio7 | 4.5 | 18.8 |
bio5 | 3.4 | 0.2 |
bio17 | 0.9 | 7.4 |
bio4 | 0.8 | 2 |
omitted was also observed in bio18, which therefore appears to have the most information that isn’t present in the other variables.
Same jackknife test, using test gain instead of training gain (
In the jackknife test using AUC on test data, the AUC plot shows that bio18 is the most effective single variable for predicting the distribution of the occurrence data set aside for testing when predictive performance is measured using AUC, though it was hardly used by the model built using all variables; the relative importance of bio4 also increases in the test gain plot (
The model was developed using very few occurrence points, and most areas of Nagaland were predicted to fall under the high suitability threshold. To validate this, the model was subjected to intensive ground truthing and to introduction of plants at sites in different prediction thresholds to assess its predictive ability (
truthing works by random selection of sites in different prediction threshold levels gave a significant result with 10 new occurrence records (
Though the model was developed using only four training sites, it was able to predict suitable sites in the neighboring Northeastern states of India and in neighboring countries (
Area | Longitude | Latitude | Prediction Threshold | Niche Status |
---|---|---|---|---|
Izheto | 94.4200 | 26.2041 | High | Disturbed forest, frequent Jhuming |
Ghukimi | 94.3110 | 25.9683 | High | Disturbed forest, frequent Jhuming |
Tsupfume | 94.3443 | 25.5342 | High | Disturbed forest, frequent Jhuming and forest fire |
Aopao | 94.8919 | 26.5591 | Medium | Disturbed forest, frequent Jhuming |
Ghokhuye | 94.4832 | 25.9083 | High | Disturbed forest, frequent Jhuming |
Kengnyu | 94.9535 | 26.1159 | High | Disturbed forest, frequent Jhuming |
Reguri | 94.6557 | 25.5701 | High | Disturbed forest, frequent Jhuming |
Chisailhem | 93.4831 | 25.3424 | Very high | Disturbed forest, frequent Jhuming and forest fire |
Nsong | 93.5392 | 25.2798 | Very high | Disturbed forest, frequent Jhuming and forest fire |
Old Tesen | 93.6391 | 25.473 | Very high | Disturbed forest, frequent Jhuming and forest fire |
suitable sites over larger areas might be lowered as the training points are very few and confined to a smaller area (i.e., Nagaland). The model, however, shows a more robust prediction outside the target area in Bhutan and Myanmar.
During the present study, it was observed that most of the occurrence areas are under high biological disturbance such as logging, Jhuming and forest fire, and these are some of the factors bringing noticeable changes to the forest over a short period. These spatially separated populations share similarities in host plant and in seasonal climatic variables such as precipitation and temperature. Most of the areas fall under the high suitability threshold but are under high anthropogenic disturbance, and only a small portion of the study area within the very high suitability threshold is undisturbed; interestingly, Intanki National Park, India, falls under the very high suitability threshold. Introducing the species to randomly chosen forests will prove futile if the forest condition is not carefully assessed; areas like Intanki National Park will serve as excellent sites for in-situ conservation and possible re-introduction for species recovery.
The study was able to produce significant prediction models using a very small sample size over a defined area, which has been validated statistically and through ground truthing. Earlier studies on model development with low sample sizes have also reported effective models with minimum sample sizes of 4 and 5, as in a study on cryptic geckos [
In the present study, the effectiveness of a low sample size and climate data for MAXENT model development, and its usability in real-world applications, has been validated statistically, through ground truthing, and by testing sites through the introduction of plants to predicted sites. The model brought out new insights into the climatic parameters that define species survival, and it successfully predicted new populations in the wild as well as existing populations in neighboring states and countries, with a success rate of 70% (calculated on a stack developed using the MAXENT prediction map threshold value over the area of occurrence). Conservation work focuses on species under high threat, and species in high threat categories usually have few occurrence records; modeling will be of little use to conservationists unless working models can be developed for these threatened species. The present study gives a good example of how a low sample size can be used to develop effective prediction models.
The present work was financially supported by the Department of Biotechnology, Ministry of Science & Technology, Govt. of India, New Delhi, through a research grant to Prof. C. R. Deb vide No. BT/ENV/BC/01/2010.
Deb, C.R., Jamir, N.S. and Kikon, Z.P. (2017) Distribution Prediction Model of a Rare Orchid Species (Vanda bicolor Griff.) Using Small Sample Size. American Journal of Plant Sciences, 8, 1388-1398. https://doi.org/10.4236/ajps.2017.86094