Ganoderma lucidum (G. lucidum) spores, a valuable Chinese herbal medicine, have considerable market potential owing to their bioactivities and medicinal efficacy. This study aims to develop an effective and simple analytical method to distinguish G. lucidum spores from the fruiting body, which is essential for the quality control and fast discrimination of raw materials of Chinese herbal medicine. Attenuated total reflection Fourier transform infrared (ATR-FTIR) spectroscopy, combined with appropriate chemometric methods including penalized discriminant analysis, principal component discriminant analysis and partial least squares discriminant analysis, proved to be a rapid and powerful tool for discriminating G. lucidum spores from the fruiting body, with a classification accuracy of 99%. The model selects informative spectral absorption bands, which improves the classification accuracy, reduces the model complexity and supports a quantitative interpretation of the chemical constituents of G. lucidum spores with regard to their anticancer effects.
Ganoderma lucidum (G. lucidum), a fungus well known as a traditional Chinese herbal medicine, has been widely used for preventing and treating a range of diseases. G. lucidum spores are the reproductive cells of the fungus, ejected from the cap after the fruiting body matures. Although the fruiting body of G. lucidum has been used as a Chinese medicine for several thousand years, the spores have been recognized and used only since the 20th century. Recent studies demonstrated that the spores not only inherit all the active ingredients of G. lucidum but also have stronger bioactivities, about 75 times those of the fruiting body, in effects such as enhancing immunity, inhibiting tumors, preventing diabetes and protecting the liver [
G. lucidum spores contain a complex system of compounds. The methods commonly used for the analysis of herbal medicines, such as high performance liquid chromatography (HPLC), thin layer chromatography (TLC) and colorimetry, are expensive, time-consuming and labour-intensive, and require large quantities of organic solvents. Their results are also inadequate for classification purposes because only a limited number of active chemical components can be detected in the very complex system of G. lucidum spores [
Attenuated total reflection Fourier transform infrared (ATR-FTIR) spectroscopy is very efficient for revealing the chemical composition and structure of herbal medicines: the technique is easy and direct to use, is non-destructive, needs only a small quantity of sample and has a short data acquisition time. Since no chemical processing of the herbal material is needed, the chemical composition of the material remains in its original form [
ATR-FTIR spectra of herbal medicines consist of many overlapping absorption bands representing the different modes of vibration of a large number of molecular constituents in the compounds. These vibrational bands are sensitive to the physical and chemical states of the compounds, and they can be detected at low levels [
In this paper, a penalized discriminant analysis (PDA) model was developed to identify informative spectral features for distinguishing between the spores and the fruiting body of G. lucidum. Multivariate methods based on the whole spectrum, namely principal component discriminant analysis (PCDA) and partial least squares discriminant analysis (PLSDA), were also explored. The performance of the models based on the selected wavelength bands and on the whole spectrum was compared in terms of classification accuracy and interpretation of spectral features. The discrimination models established in this paper are intended as an accurate, simple and robust tool for the quality control of G. lucidum spores. In particular, the discriminant vectors can help to interpret the spectral features needed for discrimination and to provide a quantitative explanation of the major chemical constituents of G. lucidum spores with regard to their anticancer effects.
Ten samples of G. lucidum fruiting body and ten samples of G. lucidum spores originated from Taishan, China. The fruiting body of G. lucidum was cross-sectioned into thin slices. From each sample, multiple spectra were taken at five different positions: top surface, middle area, bottom surface, outer stipe and inner stipe. In total, 80 spectra from the fruiting body samples and 30 spectra from the spore samples were collected.
A Fourier transform infrared (FTIR) spectrometer (Perkin-Elmer Spectrum 100 model) with an attenuated total reflectance (ATR) accessory was used to record the absorbance spectra of the G. lucidum fruiting body and spores directly, without any processing. The ATR-FTIR spectra of all the G. lucidum samples were recorded in the mid-IR region of 4000 - 400 cm−1 at a resolution of 4 cm−1 with 20 scans for each spectrum. Each spectrum, with a high signal-to-noise ratio of about 50, was obtained by averaging these 20 scans. Background spectra were always recorded before the sample spectra in order to obtain absorbance spectra with a smooth baseline and with minimal absorption bands from the water vapor and carbon dioxide present in the optical path of the infrared beam in the spectrometer. Strong absorption bands in the absorbance spectra of the G. lucidum samples were obtained accurately by pressing the sample firmly onto the ZnSe crystal with a diamond tip in the ATR set-up [
ATR-FTIR spectra are affected by both the concentration of the chemical constituents and the physical properties of the analyzed product. The physical effects, such as baseline variation, light scattering, path length differences, etc., account for the majority of the variance among spectra while the variance due to chemical composition is considered to be small. Therefore mathematical pretreatments are essential to reduce the variation due to physical effects so as to enhance the contribution of the chemical composition [
The spectra were first smoothed using the Savitzky-Golay algorithm [
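As an illustration of the smoothing step, the following minimal sketch applies a 5-point quadratic Savitzky-Golay filter in plain Python. The window length, the polynomial order and the function name `savgol_smooth_5pt` are assumptions made for this example; the settings actually used in the study are not stated in this text.

```python
def savgol_smooth_5pt(y):
    """Smooth a spectrum with a 5-point quadratic Savitzky-Golay filter.

    Interior points use the classic convolution weights (-3, 12, 17, 12, -3)/35.
    The two points at each edge are left unchanged for simplicity.
    """
    weights = (-3, 12, 17, 12, -3)
    smoothed = [float(v) for v in y]
    for i in range(2, len(y) - 2):
        window = y[i - 2:i + 3]
        smoothed[i] = sum(w * v for w, v in zip(weights, window)) / 35.0
    return smoothed


# Hypothetical absorbance values with a single-channel noise spike.
noisy = [0.10, 0.11, 0.12, 0.30, 0.14, 0.15, 0.16]
print(savgol_smooth_5pt(noisy))
```

A quadratic filter of this kind reproduces any locally quadratic band shape exactly, so it damps isolated noise while distorting broad absorption bands very little.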
1) Principal component discriminant analysis (PCDA) and partial least squares discriminant analysis (PLSDA)
Multivariate statistical methods including principal component analysis (PCA), partial least squares (PLS) and linear discriminant analysis (LDA) [
A simple and direct implementation of LDA on high-dimensional spectroscopic data gives poor classification results, and the results are difficult to interpret, because of the singularity problem and the highly correlated spectral features. Many traditional approaches address this by performing feature selection to reduce the variable dimension before classification. PCA and PLS were used to reduce the dimension of the original spectral data matrix with little loss of information. LDA then finds a linear combination of the new variables, provided by either PCA or PLS, to construct the canonical variate that best separates the two groups. Using the pretreated spectral data described in Section 2.3, classification rules were derived using principal component discriminant analysis (PCDA) [
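The two-stage idea (dimension reduction, then LDA on the reduced scores) can be sketched as follows. This is a deliberately small illustration, not the implementation used in the study: it keeps a single principal component found by power iteration, assumes equal class priors, and uses made-up three-channel "spectra". All function and variable names are hypothetical.

```python
import math


def project(row, means, axis):
    """Score of one spectrum on a principal axis after mean-centring."""
    return sum((value - m) * a for value, m, a in zip(row, means, axis))


def first_principal_axis(X, n_iter=200):
    """Estimate column means and the first principal axis by power iteration."""
    n, p = len(X), len(X[0])
    means = [sum(row[j] for row in X) / n for j in range(p)]
    axis = [1.0 / (j + 1) for j in range(p)]  # arbitrary non-zero start vector
    for _ in range(n_iter):
        scores = [project(row, means, axis) for row in X]
        axis = [sum(s * (row[j] - means[j]) for s, row in zip(scores, X))
                for j in range(p)]
        norm = math.sqrt(sum(a * a for a in axis))
        axis = [a / norm for a in axis]
    return means, axis


def fit_pcda(X, y):
    """One-component PCA followed by two-class LDA on the scores."""
    means, axis = first_principal_axis(X)
    scores = [project(row, means, axis) for row in X]
    class_means = {}
    for label in (0, 1):
        members = [s for s, c in zip(scores, y) if c == label]
        class_means[label] = sum(members) / len(members)
    return means, axis, class_means


def predict_pcda(model, row):
    """With one score, pooled variance and equal priors, LDA picks the nearer class mean."""
    means, axis, class_means = model
    score = project(row, means, axis)
    return min(class_means, key=lambda label: abs(score - class_means[label]))


# Made-up data: label 0 and label 1 stand for the two groups.
X_toy = [[1.0, 2.0, 0.5], [1.1, 2.1, 0.4], [0.9, 1.9, 0.6],
         [3.0, 0.5, 0.5], [3.1, 0.4, 0.6], [2.9, 0.6, 0.4]]
y_toy = [0, 0, 0, 1, 1, 1]
model = fit_pcda(X_toy, y_toy)
print([predict_pcda(model, row) for row in X_toy])
```

A PLSDA variant would replace the principal axis with a direction chosen using the class labels, which is why it can often separate the groups with fewer components.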
2) Penalized discriminant analysis (PDA)
Although spectroscopic data are highly correlated because of the large number of overlapping broad peaks, in many applications only a small subset of variables (wavelengths/wavenumbers) contains sufficient information for discrimination. Hence there is interest in determining whether a small subset of spectral wavenumbers contains as much information as the full spectrum does.
Penalized discriminant analysis (PDA) [
where
class covariance matrix, and λ is a non-negative tuning parameter. A diagonal estimate of the within-class covariance matrix is used here because it has been shown to give good results in the high-dimensional setting [
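To make the role of the tuning parameter and of the diagonal within-class covariance estimate concrete, here is a simplified sketch. It is not the PDA criterion or algorithm of this paper, whose exact form is not reproduced in this text; it assumes an L1-type penalty and approximates it by soft-thresholding the standardized difference between class means, one wavenumber channel at a time. The names `soft_threshold`, `sparse_discriminant` and `lam` are hypothetical.

```python
import math


def soft_threshold(value, lam):
    """Shrink value toward zero by lam; values within [-lam, lam] become exactly 0."""
    if value > lam:
        return value - lam
    if value < -lam:
        return value + lam
    return 0.0


def sparse_discriminant(X, y, lam, eps=1e-12):
    """Sparse two-class discriminant vector under a diagonal within-class covariance.

    Assumption: each coefficient is the soft-thresholded standardized mean
    difference, rescaled by the pooled within-class standard deviation.
    """
    g0 = [row for row, label in zip(X, y) if label == 0]
    g1 = [row for row, label in zip(X, y) if label == 1]
    beta = []
    for j in range(len(X[0])):
        m0 = sum(r[j] for r in g0) / len(g0)
        m1 = sum(r[j] for r in g1) / len(g1)
        ss = sum((r[j] - m0) ** 2 for r in g0) + sum((r[j] - m1) ** 2 for r in g1)
        sd = math.sqrt(ss / (len(X) - 2) + eps)  # diagonal entry only
        beta.append(soft_threshold((m1 - m0) / sd, lam) / sd)
    return beta


# Made-up data: channels 0 and 1 differ between groups, channel 2 does not.
X_demo = [[1.0, 2.0, 0.5], [1.1, 2.1, 0.4], [0.9, 1.9, 0.6],
          [3.0, 0.5, 0.5], [3.1, 0.4, 0.6], [2.9, 0.6, 0.4]]
y_demo = [0, 0, 0, 1, 1, 1]
beta = sparse_discriminant(X_demo, y_demo, lam=1.0)
print([j for j, b in enumerate(beta) if b != 0.0])  # indices of selected channels
```

Channels whose coefficient is exactly zero drop out of the discriminant rule, and a larger `lam` removes more of them, which is the sense in which a penalty on the discriminant vector performs wavelength selection.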
Leave-one-out cross-validation was used to train the algorithm: the PCDA, PLSDA and PDA classification rules were fitted on all the data except one sample, which was then tested. This was repeated until all samples had been tested, and an overall model accuracy was determined. To ensure that the wavelengths selected by the PDA model are not specific to the training set, the PDA model was also validated with 70% of the data treated as training data and 30% as test data. To ensure statistical robustness, this process was repeated 50 times with different random splits into training and test sets, and the average misclassification rates are presented to assess the classification performance.
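Both validation schemes can be sketched in a few lines. The example below uses a nearest-centroid rule purely as a stand-in classifier (an assumption for brevity, not one of the PCDA, PLSDA or PDA models) and made-up data; the 50 repeats and the 70/30 split follow the description above, while the function names and the random seed are hypothetical.

```python
import random


def fit_centroids(X, y):
    """Stand-in classifier: store the mean spectrum of each class."""
    centroids = {}
    for label in set(y):
        rows = [r for r, c in zip(X, y) if c == label]
        centroids[label] = [sum(col) / len(rows) for col in zip(*rows)]
    return centroids


def predict_centroid(model, row):
    """Assign the class whose mean spectrum is closest in squared distance."""
    def dist2(centre):
        return sum((a - b) ** 2 for a, b in zip(row, centre))
    return min(model, key=lambda label: dist2(model[label]))


def leave_one_out_accuracy(X, y):
    """Hold out each spectrum once, train on the rest, and count correct calls."""
    hits = 0
    for i in range(len(X)):
        model = fit_centroids(X[:i] + X[i + 1:], y[:i] + y[i + 1:])
        hits += int(predict_centroid(model, X[i]) == y[i])
    return hits / len(X)


def repeated_split_error(X, y, n_repeats=50, train_frac=0.7, seed=0):
    """Average misclassification rate over random train/test splits."""
    rng = random.Random(seed)
    indices = list(range(len(X)))
    errors = []
    for _ in range(n_repeats):
        rng.shuffle(indices)
        n_train = int(round(train_frac * len(indices)))
        train, test = indices[:n_train], indices[n_train:]
        model = fit_centroids([X[i] for i in train], [y[i] for i in train])
        wrong = sum(predict_centroid(model, X[i]) != y[i] for i in test)
        errors.append(wrong / len(test))
    return sum(errors) / len(errors)


X_cv = [[1.0, 2.0, 0.5], [1.1, 2.1, 0.4], [0.9, 1.9, 0.6],
        [3.0, 0.5, 0.5], [3.1, 0.4, 0.6], [2.9, 0.6, 0.4]]
y_cv = [0, 0, 0, 1, 1, 1]
print(leave_one_out_accuracy(X_cv, y_cv), repeated_split_error(X_cv, y_cv))
```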
All the algorithms for computations and analyses were implemented in R statistical programming language [
Polysaccharides, triterpenes, sterols, amino acids, proteins and fatty acids are known to be the most biologically active substances in G. lucidum spores [
For the PCDA and PLSDA models using the full spectral region, the number of PCs or PLS components chosen is crucial to the discrimination performance. The cross-validated discrimination results were used to optimize the number of PCs or PLS components. For the PCDA model, the first seven PCs were used to construct the discrimination model, and the leave-one-out cross-validation analysis gave a discrimination accuracy of 97%. Because the relationship between the spectral variables and the responses is taken into account when the latent variables are designed, the PLSDA model needed fewer latent variables (only three PLS components) to construct the canonical variate, and the leave-one-out cross-validation achieved a discrimination accuracy of 99%. In
Wavenumber (cm−1) | Functional group assignment |
---|---|
3500 - 3700 | O-H stretching vibration of hydroxyl groups (mainly lipids and proteins) |
3380 | O-H stretching, amine N-H stretching vibration (mainly carbohydrates and proteins) |
2957 | CH3 asym stretching (mainly lipids) |
2922 | CH2 asym stretching (mainly lipids) |
2873 | CH3 sym stretching (mainly proteins) |
2852 | CH2 sym stretching (mainly lipids) |
1710, 1733 | C=O carbonyl stretching of saturated aliphatic esters |
1630 | Amide I (protein C=O stretch) |
1555 | Amide II (C-N, N-H stretching) mainly proteins |
1415 | O-H bending, polysaccharides |
1377 | Symmetric bending of aliphatic CH3, triterpene compounds (CH2=CH-CH3) |
1250 | Pectic substances |
1235 | Amide III (C-N, N-H stretching) mainly proteins |
1145 | Cellulose (β-glucan), triterpene compounds (C-O) |
1101 | Antisym in-phase, pectic substances |
1073 | Rhamnogalacturonan, β-galactan |
1064 | C-O stretching, cell wall polysaccharides (glucomannan) |
1035 | OH and C-OH stretching in sugars, cell wall polysaccharides (arabinan) |
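When a model highlights particular wavenumbers, a lookup against the band assignments in the table gives a first chemical reading. The sketch below copies a few rows of the table into a dictionary and returns the nearest tabulated band; the 10 cm−1 tolerance and the names `BAND_ASSIGNMENTS` and `annotate` are assumptions for illustration only.

```python
# A few rows copied from the table above: band position (cm^-1) -> assignment.
BAND_ASSIGNMENTS = {
    2922: "CH2 asym stretching (mainly lipids)",
    1630: "Amide I (protein C=O stretch)",
    1555: "Amide II (C-N, N-H stretching) mainly proteins",
    1377: "Symmetric bending of aliphatic CH3, triterpene compounds",
    1035: "OH and C-OH stretching in sugars, cell wall polysaccharides (arabinan)",
}


def annotate(wavenumber, tolerance=10.0):
    """Return the assignment of the nearest tabulated band, or None if it is too far away."""
    nearest = min(BAND_ASSIGNMENTS, key=lambda band: abs(band - wavenumber))
    if abs(nearest - wavenumber) <= tolerance:
        return BAND_ASSIGNMENTS[nearest]
    return None


# Hypothetical selected wavenumbers.
for selected in (1632.0, 1040.0, 2000.0):
    print(selected, "->", annotate(selected))
```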
The PDA model seeks the value of the tuning parameter λ that gives the lowest error rate.
It is possible that the superior performance of PDA model relative to the models using full wavelength is due to the fact that ATR-FTIR spectroscopic data consist of many overlapping absorption bands of which only a small proportion may be informative for explaining the response. Including those uninformative wavelength points in a model may introduce a great deal of noise and thus reduce the performance of the model.
The good discrimination results from all these models suggested that there may exist some inherent compositional differences between spores and fruiting body of G. lucidum.
The discrimination performance of the models may be explained by the correlation between spectral features and chemical constituents of G. lucidum spores. With only the non-zero features selected for discrimination, the discriminant vector of the PDA model makes a direct and valuable contribution to interpreting the spectral features related to the medicinal effects of G. lucidum spores. In
For the PCDA and PLSDA model, the PCDA or PLSDA loading of the original variables combines the loading from PCA or PLS and the loading from LDA when constructing a canonical variate. Therefore the PCDA loading and the PLSDA loading show the contribution at each wavelength to the linear diagnostic rule and thus can be related easily to the spectral features, which permits interpretation of its spectral basis. A comparison between PCDA loading and the PLSDA loading for discrimination between spores and fruiting body of G. lucidum can be seen in
emphasized by PCDA model, which are consistent with group differences between spores and fruiting body of G. lucidum as shown in
G. lucidum contains approximately 400 different bioactive compounds [
spores have immunomodulating properties. G. lucidum spores have been found to modulate immune responses effectively, and thus show immunostimulatory and antitumor activity. Some comparative studies also reported that the spores and the fruiting body of G. lucidum differ in efficacy with regard to their antitumor effects and immunomodulatory activities [
The spores of G. lucidum have a higher economic value than the fruiting body because they contain far more bioactive substances. Since a variety of commercial G. lucidum products are available in various forms, it is essential to distinguish the spores from the fruiting body of G. lucidum for the purpose of quality assurance.
In this study, the combination of ATR-FTIR spectroscopy and chemometric methods proved to be a very powerful tool for distinguishing G. lucidum spores from the fruiting body efficiently. An excellent classification performance of up to 99% accuracy was achieved by the discrimination models, using the spectral features either selected or emphasized by the proposed models. By imposing penalties on the discriminant vectors, the PDA model presented in this paper automatically selects a small number of informative wavelength points to construct an effective discrimination model, which gives comparable or even higher accuracy than the PCDA and PLSDA models based on the full spectrum.
The most essential contribution of the model is that the spectral regions selected for discriminant analysis show a good link between the spectral features and the chemical components of G. lucidum spores, which provides some evidence for their anticancer effect. This is a novel and important finding, as it provides quantitative interpretation of, and scientific support for, the claims about the health benefits and antitumor properties of G. lucidum spores. The approach is also a potentially useful tool for quality control and fast discrimination of raw materials of traditional herbal medicine. The identified spectral regions may be targeted for further analysis of the active biochemical components of herbal medicines.
This research is supported by Academic Research Funds (AcRF: RI 12/10 TTL and AcRF: RI 6/14 ZY) of National Institute of Education, Nanyang Technological University, Singapore.
Ying Zhu, Augustine Tuck Lee Tan (2015) Chemometric Feature Selection and Classification of Ganoderma lucidum Spores and Fruiting Body Using ATR-FTIR Spectroscopy. American Journal of Analytical Chemistry, 06, 830-840. doi: 10.4236/ajac.2015.610079