Velvetleaf (Abutilon theophrasti Medic.) infestations negatively impact row crop production throughout the United States and Canada’s eastern provinces. To implement management strategies to control velvetleaf, managers need tools for differentiating it from crop plants. 5 Band, 7 Band, 8 Band, and 16 Band multispectral datasets simulating LANDSAT 3 plus a blue band, LANDSAT 8, WorldView 2, and WorldView 3 spectral bands, respectively, were tested as input into the random forest algorithm for velvetleaf soybean [Glycine max (L.) Merr.] discrimination. During two separate greenhouse experiments in 2014, leaf reflectance measurements were obtained at the vegetative growth stage of velvetleaf plants and two soybean varieties. The reflectance measurements were collected with a plant contact probe attached to a hyperspectral spectroradiometer. Leaf hyperspectral reflectance measurements were convolved to the four multispectral datasets with computer software. Overall, user’s, and producer’s accuracies and the kappa coefficient were employed to determine classification accuracies. Using the multispectral datasets as input, the random forest algorithm differentiated velvetleaf from the soybean varieties with accuracies ranging from 86.7% to 100%. The 7 Band, 16 Band, 8 Band, and 5 Band datasets ranked or tied for the highest accuracies seventeen, sixteen, twelve, and one time, respectively. Kappa coefficients indicated almost perfect agreement (i.e., kappa value, 0.81 - 1.0) to substantial agreement (i.e., kappa value, 0.61 - 0.80) between reference data and model predicted classes. This study was the first to demonstrate the application of the random forest machine learner and leaf multispectral reflectance data as tools to distinguish velvetleaf from soybean and to identify multispectral band combinations providing the best accuracies.
Findings support further application of the random forest machine learner along with remotely-sensed multispectral data as tools for velvetleaf soybean discrimination with future implications for site-specific management of velvetleaf.
Velvetleaf (Abutilon theophrasti Medic.), a broadleaf plant native to China, was introduced into the United States from India as a fiber crop. It escaped cultivation and now has become a problem weed in row crops, especially in corn (Zea mays L.) and soybean (Glycine max (L.) Merr.) fields throughout the United States and Canada’s eastern provinces. The summer annual weed grows to heights ranging from 0.3 to 2.0 m. The plant reproduces from seed and can develop up to 17,000 seeds that may remain viable for up to sixty years. Velvetleaf grows best in warm regions and invades vacant lots, gardens, and cultivated fields. Once established, it is a problem weed for many years to come.
Velvetleaf infestations negatively impact a crop and field in several ways. Velvetleaf plants emerging before or at the same time as crop plants are highly competitive for water and plant nutrients and thus can outgrow the crop. A 25% decrease in crop yield can occur when the velvetleaf plant population is equivalent to 1 plant per 30 cm [
Producers commonly use preemergence and postemergence measures to manage or control velvetleaf infestations. Detecting and eliminating the plant before seeding is vital because of the long-term dormancy of the seeds and the future problems they may cause. Therefore, field managers need additional techniques besides the common field survey for detecting velvetleaf infestation in crop fields.
Remote sensing technology has gained popularity as a tool for weed detection in agricultural systems [
Soybean weed discrimination has been the focus of several remote sensing studies including velvetleaf as one of the weeds of interest. Reference [
Reference [
Reference [
Based on the above studies, more information is needed on the potential of using remote sensing technology for soybean weed discrimination, especially in the case of velvetleaf. Currently, information is lacking on the comparison of multispectral systems wavebands for soybean velvetleaf discrimination. Additionally, no information exists on including shortwave infrared spectral data to discriminate soybeans from velvetleaf. The shortwave infrared region of the light spectrum (1300 - 2500 nm) is sensitive to the water content in plants [
Another key aspect of using remote sensing technology is the computer algorithm employed to process the data. The success or failure of using the technology is affected by the algorithm selected to analyze the data. In this study, it is proposed to use the random forest machine learner for soybean velvetleaf discrimination. Random forest has gained popularity as a tool to use for classification problems because it is fully automated, and users have the ability to design powerful models with little experience in using the machine learner. Random forest has been ranked as one of the best learners to employ for classification and regression problems [
Random forest is an ensemble learning method based on the principle that a group of “weak learners” can come together to develop a “strong learner” [
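The "weak learners into a strong learner" idea can be illustrated with a small calculation. The sketch below is not part of the original study: it assumes fully independent binary classifiers that each vote correctly with the same probability `p` and are combined by simple majority vote. Trees in a real random forest are correlated, so the numbers only indicate the direction of the effect; the function name `majority_vote_accuracy` is hypothetical.

```python
from math import comb


def majority_vote_accuracy(p, n_learners):
    """Probability that a strict majority of n independent learners is correct.

    Assumption (not from the source): every learner is correct with the same
    probability p and the learners err independently of one another.
    """
    needed = n_learners // 2 + 1  # smallest number of correct votes that wins
    return sum(
        comb(n_learners, k) * p**k * (1 - p) ** (n_learners - k)
        for k in range(needed, n_learners + 1)
    )


# Odd ensemble sizes avoid tied votes.
for n in (1, 11, 101):
    print(n, round(majority_vote_accuracy(0.6, n), 3))
```

Under these idealized assumptions, the accuracy of the vote rises with the number of learners whenever each learner is better than chance.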
Currently, no information is available on using leaf multispectral reflectance data as input into random forest for soybean velvetleaf discrimination. The objective of this investigation was to evaluate leaf multispectral reflectance data as input into the random forest machine learner to differentiate velvetleaf from soybean. Specifically, the study focused on evaluating multispectral data mimicking the spectral bands of satellite sensors to discriminate the velvetleaf from two soybean varieties. Spectral bands of satellite sensors were chosen because the bands are strategically placed in different regions of the light spectrum for land cover mapping, thus providing different spectral band combinations for the model to test for separating velvetleaf from soybean.
Two Progeny (P) brand LibertyLink (LL) soybean varieties (P4928LL and P5460LL, Progeny Ag Products, Wynne, Arkansas) and non-glyphosate resistant velvetleaf (United States Department of Agriculture, Agricultural Research Service, Stoneville, MS) were grown for the study. All three plants are characterized as pubescent plants, consisting of gray, light tawny, and white hairs for soybean P4928LL, soybean P5460LL, and velvetleaf, respectively. Soybean P4928LL is characterized as having an indeterminate growth habit (i.e., a continuation of vegetative growth after flowering) and soybean P5460LL as having a determinate growth habit (i.e., vegetative growth completed prior to flowering). The maturity groups assigned to soybean P4928LL and soybean P5460LL are 4.9 and 5.4, respectively.
The study was conducted at the United States Department of Agriculture, Agricultural Research Service, Stoneville, MS facility. Data were collected from two separate greenhouse experiments initiated on June 13, 2014, and August 28, 2014. Soybean and velvetleaf seeds were sown in plugs containing commercial potting mix (Pro-Mix, Ultimate Potting Mix, Quakertown, Pennsylvania). Ten days after germination, thirty plants of each soybean variety and weed species were transplanted to individual 1 L pots filled with the commercial potting mix. Plants were watered at three- to four-day intervals. The potting mix contained a slow-release nitrogen, phosphorus, and potassium fertilizer. The plants were grown at a temperature of 26.6 °C with a 14 h photoperiod.
Leaf reflectance measurements were obtained with a full range hyperspectral spectroradiometer (FieldSpec 3, PANalytical Boulder, Boulder, CO). The instrument’s fiber optic was attached to a plant probe (PANalytical Boulder, Boulder, CO) equipped with a light source. The plant probe has a 1 cm field of view. A leaf clip (PANalytical Boulder, Boulder, CO) was fastened to the contact probe. This device has a trigger lock/release gripping system designed to hold the leaf in place without removing it from the plant or causing damage to the plant. The leaf clip has a two-sided rotating head. One side of the head contains a black panel face, and the other side has a white panel face. The black and white panels are ideal for reflectance and transmittance measurements, respectively. The former was employed in this study.
The spectroradiometer obtained continuous spectra in the range of 350 - 2500 nm. Its sampling interval and spectral resolution were 1.4 nm and 3 nm, respectively, within the 350 nm to 1000 nm spectral range. The sampling interval and spectral resolution were 2 nm and 10 nm, respectively, within the 1000 nm to 2500 nm spectral range. The proprietary software operating the instrument resampled the reflectance data to 1 nm wavelengths.
Reflectance measurements were collected from the most recently matured leaf of each plant. Soybean has a trifoliate leaf; therefore, the center leaflet of the most recently matured leaf was chosen for data collection. At the selected sample spot of each plant leaf, each reflectance measurement was an average of fifteen readings. Leaf reflectance measurements were obtained on June 30, 2014, and September 17, 2014, for the first and second experiments, respectively. Because velvetleaf must be identified and treated prior to seeding, measurements were obtained for all plants during the vegetative growth stage. The instrument was calibrated with a white spectralon panel (PANalytical Boulder, Boulder, CO) at 15-minute intervals.
The hyperspectral reflectance measurements of the soybean and velvetleaf leaves were resampled to four multispectral datasets (
The conditional inference version of random forest (cforest) was used to create the models evaluated in this study. Reference [
Spectral Band | Wavelengths of Each Dataset | | | |
---|---|---|---|---|
 | 5 Band^a | 7 Band | 8 Band | 16 Band |
Coastal | | 430 - 450 nm | 400 - 450 nm | 400 - 450 nm |
Blue | 400 - 500 nm | 450 - 510 nm | 450 - 510 nm | 450 - 510 nm |
Green | 500 - 600 nm | 530 - 590 nm | 510 - 580 nm | 510 - 580 nm |
Yellow | | | 585 - 625 nm | 585 - 625 nm |
Red | 600 - 700 nm | 640 - 670 nm | 630 - 690 nm | 630 - 690 nm |
Red-edge | | | 705 - 745 nm | 705 - 745 nm |
Near infrared 1 | 700 - 800 nm | 850 - 880 nm | 770 - 895 nm | 770 - 895 nm |
Near infrared 2 | 800 - 1100 nm | | 860 - 1040 nm | 860 - 1040 nm |
Shortwave infrared 1 | | 1570 - 1650 nm | | 1195 - 1225 nm |
Shortwave infrared 2 | | 2110 - 2290 nm | | 1550 - 1590 nm |
Shortwave infrared 3 | | | | 1640 - 1680 nm |
Shortwave infrared 4 | | | | 1710 - 1750 nm |
Shortwave infrared 5 | | | | 2145 - 2185 nm |
Shortwave infrared 6 | | | | 2185 - 2225 nm |
Shortwave infrared 7 | | | | 2235 - 2285 nm |
Shortwave infrared 8 | | | | 2295 - 2365 nm |
^a 5 Band simulates LANDSAT 3 spectral bands plus an additional blue band, 7 Band simulates LANDSAT 8 spectral bands, 8 Band simulates WorldView 2 spectral bands, and 16 Band simulates WorldView 3 spectral bands.
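The reduction of a 1 nm hyperspectral spectrum to the band limits in the table above can be sketched as follows. This is a minimal illustration, not the software used in the study: it assumes a uniform (boxcar) response with inclusive band limits, whereas the source does not state which response functions were applied. The spectrum is synthetic, and the names `BANDS_8` and `resample_to_bands` are hypothetical.

```python
from statistics import mean

# Band limits (nm) for the 8 Band dataset, taken from the table above.
BANDS_8 = {
    "Coastal": (400, 450),
    "Blue": (450, 510),
    "Green": (510, 580),
    "Yellow": (585, 625),
    "Red": (630, 690),
    "Red-edge": (705, 745),
    "Near infrared 1": (770, 895),
    "Near infrared 2": (860, 1040),
}


def resample_to_bands(spectrum, bands):
    """Average 1 nm reflectance values inside each band's wavelength limits.

    spectrum: dict mapping integer wavelength (nm) -> reflectance.
    Assumption: uniform band response, both band limits included.
    """
    return {
        name: mean(spectrum[wl] for wl in range(lo, hi + 1))
        for name, (lo, hi) in bands.items()
    }


# Synthetic spectrum (not measured data): reflectance rising linearly
# from 0.05 at 350 nm to 0.55 at 2500 nm in 1 nm steps.
spectrum = {wl: 0.05 + 0.5 * (wl - 350) / 2150 for wl in range(350, 2501)}
band_values = resample_to_bands(spectrum, BANDS_8)
```

Each leaf spectrum then becomes one row of band values, and the same function can be reused with the band limits of the other three datasets.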
[
The number of variables to evaluate at each split of the tree (mtry) and the number of trees to use for creating the model (ntree) were the two parameters that needed to be set before completing the classification. For this study, the default mtry value of 5 was used for each dataset. The default ntree value of 500 was employed as the starting point and was adjusted accordingly to obtain consistent variable importance rankings.
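The idea of raising ntree until the variable importance rankings become consistent can be sketched as a simple loop. Everything below is an assumption made for illustration: the stopping rule (identical rankings across a few random seeds), the step size of 500 trees, and the upper limit are not taken from the source, and `toy_importance` is a deterministic stand-in for a fitted forest rather than a real model.

```python
def ranking(importances):
    """Return variable names ordered from most to least important."""
    return sorted(importances, key=importances.get, reverse=True)


def find_stable_ntree(importance_fn, seeds=(1, 2, 3), start=500, step=500, limit=5000):
    """Increase ntree until every seed produces the same importance ranking.

    importance_fn(ntree, seed) -> dict of variable name -> importance score.
    Returns the first ntree with identical rankings, or None if none is
    found up to `limit`. The stopping rule is an assumption, not the
    procedure reported in the study.
    """
    ntree = start
    while ntree <= limit:
        rankings = {tuple(ranking(importance_fn(ntree, seed))) for seed in seeds}
        if len(rankings) == 1:
            return ntree
        ntree += step
    return None


def toy_importance(ntree, seed):
    """Stand-in for a fitted forest: a seed-dependent shift in the score of
    the red band that shrinks as ntree grows (purely illustrative)."""
    shift = (seed - 2) * 25 / ntree
    return {"G": 0.30, "NIR1": 0.20, "R": 0.18 + shift}


stable_ntree = find_stable_ntree(toy_importance)
```

In practice `importance_fn` would fit a forest with the given ntree and seed and return its variable importance scores.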
The following procedure was used to test the robustness of the models relative to variable importance [
Classification accuracies of the selected models were determined by evaluating the user’s, producer’s, and overall accuracies and kappa coefficient [
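The four accuracy measures can all be derived from a confusion matrix, as the sketch below shows. The source does not report its confusion matrices; the counts used here are hypothetical, chosen on the assumption of 30 plants per class so that they reproduce the 5 Band, June 30, 2014, velvetleaf versus soybean P4928LL values reported in the results. The function name `accuracy_report` is also hypothetical.

```python
def accuracy_report(matrix, labels):
    """Compute overall, user's, and producer's accuracies and kappa.

    matrix[i][j] = number of samples of reference class i predicted as class j.
    """
    k = len(labels)
    n = sum(sum(row) for row in matrix)
    row_totals = [sum(matrix[i]) for i in range(k)]                       # reference totals
    col_totals = [sum(matrix[i][j] for i in range(k)) for j in range(k)]  # predicted totals
    observed = sum(matrix[i][i] for i in range(k)) / n
    # Agreement expected by chance, from the marginal totals.
    expected = sum(row_totals[i] * col_totals[i] for i in range(k)) / n**2
    return {
        "overall": observed,
        "kappa": (observed - expected) / (1 - expected),
        # Producer's accuracy: correct / reference total (omission error view).
        "producers": {labels[i]: matrix[i][i] / row_totals[i] for i in range(k)},
        # User's accuracy: correct / predicted total (commission error view).
        "users": {labels[i]: matrix[i][i] / col_totals[i] for i in range(k)},
    }


# Hypothetical counts (rows: reference class, columns: predicted class).
matrix = [[29, 1],   # velvetleaf: 29 correct, 1 called soybean
          [2, 28]]   # soybean: 2 called velvetleaf, 28 correct
report = accuracy_report(matrix, ["velvetleaf", "soybean"])
```

With these counts the overall accuracy is 57/60 and chance agreement is 0.5, which gives a kappa of 0.900.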
The accuracy assessment results of the random forest classification for the June 30, 2014, dataset are summarized in
The random forest classification results of the velvetleaf soybean P4928LL classes are tabulated in
Overall, user’s, and producer’s accuracies and the kappa coefficients are presented in
The September 17, 2014, dataset for the velvetleaf soybean P5460LL classification indicated that the 16 Band dataset model was ranked or tied for first in all of the accuracy categories (
Classification | Date | Accuracy Measurement | Multispectral Dataset^a | | | |
---|---|---|---|---|---|---|
 | | | 5 Band | 7 Band | 8 Band | 16 Band |
Velvetleaf-soybean P4928LL | June 30, 2014 | User’s accuracy velvetleaf | 93.5% | 96.7% | 93.8% | 96.7% |
 | | User’s accuracy soybean P4928LL | 96.6% | 96.7% | 100% | 96.7% |
 | | Producer’s accuracy velvetleaf | 96.7% | 96.7% | 100% | 96.7% |
 | | Producer’s accuracy soybean P4928LL | 93.3% | 96.7% | 93.3% | 96.7% |
 | | Overall accuracy | 95.0% | 96.7% | 96.7% | 96.7% |
 | | Kappa coefficient | 0.900 | 0.933 | 0.933 | 0.933 |
Velvetleaf-soybean P4928LL | September 17, 2014 | User’s accuracy velvetleaf | 90.3% | 90.6% | 90.3% | 90.3% |
 | | User’s accuracy soybean P4928LL | 93.1% | 96.4% | 93.1% | 93.1% |
 | | Producer’s accuracy velvetleaf | 93.3% | 96.7% | 93.3% | 93.3% |
 | | Producer’s accuracy soybean P4928LL | 90.0% | 90.0% | 90.0% | 90.0% |
 | | Overall accuracy | 91.7% | 93.3% | 91.7% | 91.7% |
 | | Kappa coefficient | 0.833 | 0.867 | 0.833 | 0.833 |
^a Refer to
Classification | Date | Accuracy Measurement | Multispectral Dataset^a | | | |
---|---|---|---|---|---|---|
 | | | 5 Band | 7 Band | 8 Band | 16 Band |
Velvetleaf-soybean P5460LL | June 30, 2014 | User’s accuracy velvetleaf | 93.5% | 96.7% | 96.7% | 96.7% |
 | | User’s accuracy soybean P5460LL | 96.6% | 100% | 100% | 100% |
 | | Producer’s accuracy velvetleaf | 96.7% | 100% | 100% | 100% |
 | | Producer’s accuracy soybean P5460LL | 93.3% | 96.7% | 96.7% | 96.7% |
 | | Overall accuracy | 95.0% | 98.3% | 98.3% | 98.3% |
 | | Kappa coefficient | 0.900 | 0.967 | 0.967 | 0.967 |
Velvetleaf-soybean P5460LL | September 17, 2014 | User’s accuracy velvetleaf | 87.1% | 93.1% | 93.1% | 93.3% |
 | | User’s accuracy soybean P5460LL | 89.7% | 90.3% | 90.3% | 93.3% |
 | | Producer’s accuracy velvetleaf | 90.0% | 90.0% | 90.0% | 93.3% |
 | | Producer’s accuracy soybean P5460LL | 86.7% | 93.3% | 93.3% | 93.3% |
 | | Overall accuracy | 88.3% | 91.7% | 91.7% | 93.3% |
 | | Kappa coefficient | 0.767 | 0.833 | 0.833 | 0.867 |
^a Refer to
For fourteen out of the sixteen classification models, the default mtry and ntree values were adequate for obtaining stable variable importance readings (
The variable importance rankings of the random forest models used for the June 30, 2014, velvetleaf versus soybean P4928LL were as follows (
Variable importance rankings of the random forest models are shown in
Classification | Dataset^a | mtry^b | ntree (June 30, 2014) | ntree (September 17, 2014) |
---|---|---|---|---|
Velvetleaf-soybean P4928LL | 5 Band | 5 | 500 | 500 |
Velvetleaf-soybean P4928LL | 7 Band | 5 | 500 | 500 |
Velvetleaf-soybean P4928LL | 8 Band | 5 | 500 | 3500 |
Velvetleaf-soybean P4928LL | 16 Band | 5 | 500 | 500 |
Velvetleaf-soybean P5460LL | 5 Band | 5 | 500 | 500 |
Velvetleaf-soybean P5460LL | 7 Band | 5 | 500 | 500 |
Velvetleaf-soybean P5460LL | 8 Band | 5 | 500 | 500 |
Velvetleaf-soybean P5460LL | 16 Band | 5 | 500 | 4500 |
^a Refer to
portant to the model for the 5 Band multispectral dataset. Distinct differences were observed in the scores, with the G band ranked the most important. Essential spectral bands for the 7 Band dataset model in descending order were G, NIR1, SWIR1, and R. The 8 Band dataset random forest model selected the G, Y, NIR1, and NIR2 spectral bands as valuable variables for the classification; the rankings appeared in four distinct tiers: tier one, G; tier two, Y; tier three, NIR2; and tier four, NIR1. Five tiers were observed for the most important rankings for the 16 Band dataset: the G spectral band in tier one, the Y spectral band in tier two, the NIR1 and NIR2 spectral bands in tier three, the SWIR1 and SWIR3 spectral bands in tier four, and the SWIR4 band in tier five.
Variable importance scores of the random forest models are shown in
The objective of this study was to evaluate leaf multispectral reflectance data as input into the random forest classification algorithm to differentiate soybean from velvetleaf, an invasive weed affecting soybean production throughout the United States and eastern provinces of Canada. The study emphasized using different multispectral band combinations as input into the algorithm to differentiate velvetleaf from two different soybean varieties. The algorithm achieved overall, user’s, and producer’s accuracies that were greater than 85% for velvetleaf soybean discrimination (
Generally, for all the datasets, the G and NIR spectral bands were ranked as important variables to the models for discriminating velvetleaf from soybean. Plant leaf reflectance and absorption of green light are influenced by leaf chlorophyll content [
micellular spaces of plant leaves affect their ability to reflect and absorb near infrared light [
With the increase in the number of spectral bands, more variables were ranked important to the random forest models (Figures 1-4); however, the increase in the number of bands per se did not always result in an increase in classification accuracy. For example, the number of accuracy test results completed for both dates and soybean varieties equals twenty-four. The 7 Band, 16 Band, 8 Band, and 5 Band datasets ranked or tied for the highest accuracies seventeen, sixteen, twelve, and one time, respectively. The differences in overall, user’s, and producer’s accuracies ranged from 0% to 6.6%, with the lowest accuracies occurring 95% of the time for the 5 Band dataset. For the kappa coefficients, the 5 Band model ranked last 100% of the time. The lower classification accuracies observed for the 5 Band dataset were most likely a result of the broader bandwidths (i.e., 100 nm or greater). Also, the findings indicated that reliable accuracies generally can be achieved using the default mtry and ntree values (
To put this study into perspective, leaf multispectral reflectance data were used as input into the random forest model for differentiating the velvetleaf from the soybean varieties. Leaf reflectance measurements represent pure reflectance measurements. Plant canopy response is affected by leaf angle, leaf positioning in the plant canopy, inter-canopy shadowing, soil background, and intermixing of plant canopies. Those aspects could lead to a different variable importance ranking of the spectral bands for plant canopy studies. Additionally, the study focused on binary classifications of soybean versus velvetleaf. Future studies need to focus on determining the potential of discriminating more than one weed at a time from soybean. Overall, this study provided valuable information on using the machine learning technique and on the influence of using different multispectral band combinations as input into the model for velvetleaf soybean discrimination.
This study provided new information on using the random forest algorithm with leaf multispectral reflectance data for differentiating velvetleaf from soybean. It demonstrated that the random forest algorithm could be used with a complement of multispectral datasets to separate velvetleaf from soybean. The best accuracies were achieved with multispectral datasets sensitive to visible (green and yellow spectral bands), near infrared, and shortwave infrared light. Findings support further application of the random forest machine learner along with remotely-sensed multispectral data as tools for velvetleaf soybean discrimination with future implications for site-specific management of velvetleaf.
The author is grateful to Dr. Vijay Nandula for supplying the velvetleaf seed, Mr. Milton Gaston Jr., Mr. Arrington Smith, Ms. Keysha Hamilton, Mr. David Fisher, Ms. Raven Thompson, and Ms. Keyanna Nealon for their assistance in data collection, and Dr. Ken Fisher and Dr. Chenghai Yang for their critical review of the manuscript. Mention of trade names or commercial products in this report is solely for the purpose of providing specific information and does not imply recommendation or endorsement by the US Department of Agriculture.
Reginald S. Fletcher (2015) Testing Leaf Multispectral Reflectance Data as Input into Random Forest to Differentiate Velvetleaf from Soybean. American Journal of Plant Sciences, 6, 3193-3204. doi: 10.4236/ajps.2015.619311