Journal of Data Analysis and Information Processing, 2013, 1, 30-34
http://dx.doi.org/10.4236/jdaip.2013.13005 Published Online August 2013 (http://www.scirp.org/journal/jdaip)

Application of Extended Multiplicative Signal Correction to Short-Wavelength near Infrared Spectra of Moisture in Marzipan

Pedro dos Santos Panero1, Francisco dos Santos Panero2, João dos Santos Panero3, Henrique Eduardo Bezerra da Silva4

1Câmpus Novo Paraíso, Instituto Federal de Educação, Ciência e Tecnologia de Roraima, Boa Vista, Brazil
2Instituto Federal de Educação, Ciência e Tecnologia de Roraima, Boa Vista, Brazil
3Departamento de Engenharia Elétrica, Universidade Federal de Roraima, Boa Vista, Brazil
4Departamento de Química, Universidade Federal de Roraima, Boa Vista, Brazil
Email: fspaneroit@yahoo.com.br

Received June 15, 2013; revised July 22, 2013; accepted August 12, 2013

Copyright © 2013 Pedro dos Santos Panero et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

ABSTRACT

Short-wavelength near infrared (SW-NIR) spectroscopy is a rapid, versatile and precise technique that can be used in many different situations and for various types of products and chemical compounds. Extended multiplicative signal correction (EMSC) is a modification of the standard MSC pre-processing method that allows the separation of physical light scattering effects from chemical (vibrational) light absorbance effects in spectra. In this paper, EMSC is applied and compared with the first derivative, second derivative, MSC and SNV, each in combination with PLSR, to obtain models that are robust in terms of accuracy and predictive ability with a reduced calibration data set, using SW-NIR spectra of moisture in marzipan.
Extended multiplicative signal correction (EMSC) and its combinations with other methods provide the best results in terms of prediction ability and calibration for SW-NIR spectra of moisture in marzipan. The best results were obtained by EMSC followed by the second derivative.

Keywords: EMSC; PLSR; SW-NIR; Extended Multiplicative Signal Correction; Chemometrics

1. Introduction

Multivariable electromagnetic spectrophotometry in the near or mid-infrared region offers great practical and economic advantages for the analysis of large sample series, as demonstrated by diffuse NIR reflectance or transmittance spectroscopy in areas such as agriculture, food technology, pharmaceutics, and petrochemistry [1]. Today, such high-speed instruments are routinely designed to yield precise quantitative determination of a variety of chemical and physical properties, using multivariate calibration to solve the selectivity problems caused by the lack of sample preparation and for automatic detection of outliers [2]. Preprocessing of the spectral measurements is used to optimize the subsequent multivariate calibration.

When more or less intact complex samples are analyzed by diffuse reflectance or transmittance spectroscopy, uncontrolled variations in light scattering are often a dominating artifact that complicates subsequent quantitative chemical analysis. This undesired scattering variation is due to uncontrolled physical variations in the measured samples: particle size and shape, sample packing, sample surface, etc. If the light scattering could be modeled and corrected for mathematically in a more elaborate preprocessing stage, these problems could be reduced or eliminated. The cost of NIR analysis could then be reduced, because the need for controlled sample preparation could be further reduced, the number of calibration samples could be reduced, and the statistical calibration modeling process could be simplified.
Moreover, a pragmatic but reasonably accurate model-based light scatter correction may shed new light on the light scattering processes themselves [3].

One of the main sources of error in quantitative determinations by spectroscopy, including short-wavelength near infrared (SW-NIR) spectroscopy (700 - 1100 nm), is the scattering of light caused by the inhomogeneity of the sample, mainly differences in particle size, geometry, packing and orientation of the particles [4,5]. Light scattering alters the functional relationship between the intensity of the measured reflectance and the concentration of the absorbing species present in the sample. Modeling light scattering and reflectance simultaneously is extremely difficult, because the geometry and orientation of the particles vary randomly from sample to sample. Thus, to build precise and robust models, it is necessary to minimize the effect of scattering [4,5].

Copyright © 2013 SciRes. JDAIP

Short-wavelength near infrared (SW-NIR) spectroscopy (700 - 1100 nm) is a relatively new analytical technique with high potential for food analysis [6-10]. SW-NIR presents several advantages over conventional near-infrared methods (900 - 2500 nm): 1) absorption in this wavelength region arises from the third- and fourth-overtone vibrational (CH, OH, NH) transitions, which are characterized by low extinction coefficients; 2) signal-to-noise ratios on the order of 10,000:1 can be obtained, making very subtle changes in the spectra useful for analysis; 3) good quantitative results can be obtained for highly scattering samples [8]. These absorbances are overtones and combination bands of fundamental molecular vibration bands in the IR region. The IR absorbances themselves are often too strong to allow simple, representative analysis of complex samples, but the NIR bands are sufficiently weak to allow the light to penetrate anywhere from a few millimeters to a couple of centimeters through the samples. NIR measurements can be taken as reflectance or transmittance, depending on what is most practical. Finally, since NIR instruments require both multivariate calibration (chemometrics) and spectroscopic insight, this multidisciplinary technique may fall between the traditionally specialized academic chairs.
One disadvantage of SW-NIR spectroscopy is that spectral features often overlap considerably and require sophisticated data analysis techniques, such as partial least-squares (PLS) calibration methods, to obtain meaningful correlations [8,9].

This paper presents a new application of a spectral preprocessing method for improving the multivariate calibration of multichannel analytical instruments based on spectroscopic background knowledge: extended multiplicative signal correction (EMSC), which is designed to improve the separation of light scattering and light absorbance. Conventional projection on latent structures regression (PLSR) is then used for the subsequent empirical "soft modelling" calibration. Near infrared (NIR) data are used to illustrate the application. Before the data are modeled by chemometric techniques, they are preprocessed to minimize the effects that make an ideal spectrum difficult to obtain. This study compares EMSC (Extended Multiplicative Signal Correction) with the first derivative, second derivative, SNV (Standard Normal Variate) and MSC (Multiplicative Signal Correction) methods in terms of the robustness and prediction ability of the final PLS models for SW-NIR spectra of marzipan.

2. Methodology

2.1. PLS Regression

PLSR is today probably the most widely applied multivariate calibration method in chemometrics [11,12]. It is commonly used in quantitative spectroscopy to correlate spectroscopic data (X) with related physico-chemical data (Y). Like PCA and PCR, it is based on so-called latent variables [12], but in PLSR the decomposition of X during regression is guided by the variation in Y: the explained covariance between X and Y is maximized, so that the variation in X directly correlating with Y is extracted.

An important feature of PLSR is that, because it is based on latent variables, it can handle the usually highly collinear spectroscopic data, in contrast to MLR [13]. The linear model between the vector $Y_c$ containing the centered reference data and the matrix $X_c$ containing the centered spectral data can be described by:

$$Y_c = X_c b + e \qquad (1)$$

where $b$ is a vector that contains the regression coefficients to be determined during the calibration, and $e$ is the residual. To obtain a good estimate of $b$, the PLSR model needs to be calibrated on samples that span the variation in Y well and are in general representative of the future samples. Depending on the complexity of the future samples, this may require a very large number of samples [14].

2.2. Extended Multiplicative Signal Correction

A number of chemometric preprocessing methods have been proposed to explicitly model the effect of multiplicative light scattering [15]. One of the most frequently reported techniques in the literature is multiplicative signal correction (MSC) [16]. The methodology involves regressing each spectrum in a set of related samples, i.e., samples comprising the same chemical components, on a reference spectrum (for example, the mean spectrum) to estimate the intercept and slope of the regression equation, which will theoretically capture the information relating to the effect of multiplicative light scattering. Each individual spectrum is then corrected by subtracting the intercept and dividing by the slope. An alternative procedure proposed to correct for multiplicative light scattering, similar to MSC, is inverted scatter correction (ISC).

More recently, extended multiplicative signal correction (EMSC) was introduced as a modification of the standard MSC pre-processing method [4,5,15-18] that allows the separation of physical light scattering effects from chemical (vibrational) light absorbance effects in spectra. Developed by Martens and Stark [19,20], the methodology identifies and separates various effects in multi-channel measurements, making the measurements suitable for multivariate calibration and improving robustness and predictive ability [4,5,19]. This approach is able to estimate and separate multiplicative physical effects (path length, light scattering, sample thickness, etc.) from additive chemical effects (absorbance of analytes and interferants) and additive physical effects (temperature shifts, baseline variations, etc.) [4,5,14,21]. It can also be used to remove identified but undesired "physical" and "chemical" interference effects, while retaining identified but desired effects as well as unidentified effects in the data.
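The MSC step described above, and its extension with further additive terms, can be sketched in a few lines. The following is a minimal Python sketch under stated assumptions, not the authors' implementation: the function names `solve` and `emsc`, the use of low-order polynomial baseline terms on a scaled wavelength axis, and the synthetic spectra are all introduced here for illustration. With `poly_order=0` the regression contains only an intercept and a slope on the reference spectrum, which is the MSC procedure described in the text; higher orders add wavelength-dependent baseline terms, which is one common form of EMSC.

```python
import math

def solve(a, b):
    """Solve the small linear system a * p = b by Gaussian elimination
    with partial pivoting (a is a list of rows)."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    p = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(m[r][c] * p[c] for c in range(r + 1, n))
        p[r] = (m[r][n] - tail) / m[r][r]
    return p

def emsc(spectra, reference=None, poly_order=2):
    """Regress each spectrum on [reference, 1, x, ..., x**poly_order],
    subtract the fitted additive terms and divide by the fitted slope.
    poly_order=0 is plain MSC (intercept and slope only)."""
    n = len(spectra[0])
    if reference is None:  # default reference: the mean spectrum
        reference = [sum(col) / len(spectra) for col in zip(*spectra)]
    x = [-1.0 + 2.0 * j / (n - 1) for j in range(n)]  # scaled wavelength axis
    poly = [[xj ** k for xj in x] for k in range(poly_order + 1)]
    columns = [list(reference)] + poly
    # Normal equations of the least-squares fit (same for every spectrum)
    gram = [[sum(u * v for u, v in zip(ci, cj)) for cj in columns] for ci in columns]
    corrected = []
    for s in spectra:
        rhs = [sum(u * v for u, v in zip(c, s)) for c in columns]
        coef = solve(gram, rhs)
        slope, additive = coef[0], coef[1:]
        baseline = [sum(a * pk[j] for a, pk in zip(additive, poly)) for j in range(n)]
        corrected.append([(sj - bj) / slope for sj, bj in zip(s, baseline)])
    return corrected

# Synthetic illustration (not the marzipan data): one "chemical" band shape
# distorted by per-sample offset a, slope b and baseline terms d, e.
n_points = 51
x_axis = [-1.0 + 2.0 * j / (n_points - 1) for j in range(n_points)]
base = [1.0 + math.exp(-((xj - 0.2) / 0.3) ** 2) for xj in x_axis]
effects = [(0.10, 0.8, 0.05, -0.02), (-0.20, 1.3, -0.10, 0.04), (0.00, 1.0, 0.00, 0.00)]
spectra = [[a + b * bj + d * xj + e * xj ** 2 for bj, xj in zip(base, x_axis)]
           for a, b, d, e in effects]
corrected = emsc(spectra, reference=base, poly_order=2)
msc_corrected = emsc([[a + b * bj for bj in base] for a, b, _, _ in effects],
                     reference=base, poly_order=0)
```

In this synthetic case every distortion lies in the span of the regressors, so each corrected spectrum coincides with the reference. With real spectra the chemical differences between samples remain in the residual and are carried through the correction. Plain Python lists are used only to keep the sketch self-contained; an array library would be the usual choice in practice.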
EMSC also allows prior knowledge about the system and its components (constituents' spectra) to be used in the correction, which can sometimes be very useful and yield good calibration results [4,5]. EMSC appears to be applicable to different types of spectroscopic data (UV, VIS, NIR, IR, Raman), as well as to chromatography, electrophoresis and sensory data [14].

3. Experimental

3.1. Data Sets

The data set comprises spectral measurements and compositional analysis of 32 marzipan samples. Traditional moisture determination was performed on all samples. The spectral data matrix was obtained with an Infratec 1255 instrument, which operates on the dispersive scanning optical principle; the available spectral range is 850 - 1050 nm and the spectral sampling interval is 2 nm. The data set was produced by J. Christensen et al. [22]. The data file was obtained from the public data sets for multivariate data analysis on the home page of Quality and Technology, Department of Food Science, Faculty of Science, University of Copenhagen, as a 32 × 100 matrix. More information: http://www.models.kvl.dk/research/data/Marzipan/index.asp.

3.2. Model of Calibration and Prediction

The spectra of the 32 marzipan samples were used to build the calibration and prediction models using full cross validation (CV). One calibration and prediction model was built for each preprocessing: Raw, First Derivative (1st), Second Derivative (2nd), SNV, MSC, EMSC, EMSC + 1st, 1st + EMSC, EMSC + 2nd and 2nd + EMSC, where the first method named in a combination is applied first. All preprocessing techniques were evaluated in terms of the robustness and prediction ability of the final PLSR model. For the construction of the models, the data set was mean-centered. The robustness of the preprocessing techniques was evaluated by RMSEC (root mean square error of cross validation), RMSEP (root mean square error of prediction) and the correlation (calibration and prediction).
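The evaluation scheme of this section, a PLSR model assessed by full cross validation and summarized by a root mean square error and a correlation coefficient, can be illustrated as follows. This is a hedged Python sketch and not the software procedure used by the authors: it assumes that full cross validation means leave-one-out, uses a textbook NIPALS PLS1 algorithm, and runs on synthetic data rather than the marzipan spectra. The names `pls1_fit`, `pls1_predict`, `pearson` and `full_cross_validation` are hypothetical.

```python
import math

def pls1_fit(X, y, n_components):
    """Fit PLS1 by NIPALS on mean-centered X (list of rows) and y."""
    n, m = len(X), len(X[0])
    x_mean = [sum(col) / n for col in zip(*X)]
    y_mean = sum(y) / n
    E = [[v - mu for v, mu in zip(row, x_mean)] for row in X]  # centered X
    f = [v - y_mean for v in y]                                # centered y
    comps = []
    for _ in range(n_components):
        w = [sum(E[i][j] * f[i] for i in range(n)) for j in range(m)]
        norm = math.sqrt(sum(v * v for v in w))
        if norm < 1e-12:          # nothing left in y to explain
            break
        w = [v / norm for v in w]                              # weight vector
        t = [sum(e * wj for e, wj in zip(row, w)) for row in E]  # scores
        tt = sum(v * v for v in t)
        p = [sum(E[i][j] * t[i] for i in range(n)) / tt for j in range(m)]
        q = sum(fi * ti for fi, ti in zip(f, t)) / tt
        # Deflate X and y before extracting the next latent variable
        E = [[e - ti * pj for e, pj in zip(row, p)] for row, ti in zip(E, t)]
        f = [fi - q * ti for fi, ti in zip(f, t)]
        comps.append((w, p, q))
    return x_mean, y_mean, comps

def pls1_predict(model, x):
    """Predict y for one spectrum by repeating the deflation steps."""
    x_mean, y_mean, comps = model
    e = [v - mu for v, mu in zip(x, x_mean)]
    y_hat = y_mean
    for w, p, q in comps:
        t = sum(ei * wi for ei, wi in zip(e, w))
        y_hat += q * t
        e = [ei - t * pi for ei, pi in zip(e, p)]
    return y_hat

def pearson(a, b):
    """Pearson correlation coefficient between two equal-length lists."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    cov = sum((u - ma) * (v - mb) for u, v in zip(a, b))
    return cov / math.sqrt(sum((u - ma) ** 2 for u in a) * sum((v - mb) ** 2 for v in b))

def full_cross_validation(X, y, n_components):
    """Leave-one-out CV: each sample is predicted by a model built without it."""
    preds = []
    for i in range(len(X)):
        model = pls1_fit(X[:i] + X[i + 1:], y[:i] + y[i + 1:], n_components)
        preds.append(pls1_predict(model, X[i]))
    rmse = math.sqrt(sum((p - t) ** 2 for p, t in zip(preds, y)) / len(y))
    return rmse, pearson(preds, y)

# Synthetic illustration: 12 noise-free "spectra" whose band height follows
# a concentration that is linearly related to the reference value y.
band = [math.exp(-((j - 50) / 15.0) ** 2) for j in range(100)]
conc = [0.5 + 0.1 * i for i in range(12)]
X = [[c * b for b in band] for c in conc]
y = [10.0 + 5.0 * c for c in conc]
rmsecv, r_cv = full_cross_validation(X, y, n_components=1)
```

Because the synthetic data are noise-free and of rank one, a single latent variable predicts every left-out sample essentially exactly. On real spectra the number of components would itself be chosen from the cross-validated error, and each preprocessing method would be applied to the spectra before the same loop is run.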
The calibration and prediction models were built, and the preprocessing techniques compared, using UNSCRAMBLER V 9.2.

4. Results and Discussion

Figure 1 presents different preprocessing methods applied to the short-wavelength near infrared (SW-NIR) spectra (850 - 1050 nm) of moisture in marzipan. Figure 1(a) shows the SW-NIR spectra without any preprocessing: the raw spectra of the 32 marzipan samples in the calibration set. Large additive offsets, multiplicative scaling effects and light scattering are readily observed. Figure 1(b) displays the same SW-NIR spectra after SNV pre-transformation, Figure 1(c) shows the MSC pre-transformed spectra, and Figure 1(d) displays the same spectra after EMSC pre-transformation.

Figure 1. SW-NIR spectra (850 - 1050 nm) of the 32 marzipan samples: (a) Raw; (b) SNV transformed; (c) MSC transformed; (d) EMSC transformed.

Table 1 presents the performance statistics of the PLSR models for quantification of marzipan moisture using SW-NIR (850 - 1050 nm) spectra, for calibration and prediction (32 samples), using full cross validation. CV is cross validation, RMSEC is the root mean square error of cross validation and RMSEP is the root mean square error of prediction. The results for all tested pre-transformation methods are summarized in terms of RMSEC (CV), RMSEP (CV) and the corresponding correlation coefficients.

Table 1. Performance statistics of the PLSR models.

                          Correlation                         Prediction error (% moisture)
Preprocessing             Calibration (CV)  Prediction (CV)   RMSEC (CV)  RMSEP (CV)
Raw                       0.991             0.987             0.489       0.570
First Derivative (1st)    0.992             0.989             0.449       0.529
Second Derivative (2nd)   0.991             0.988             0.480       0.561
SNV                       0.993             0.990             0.462       0.533
MSC                       0.993             0.991             0.467       0.536
EMSC                      0.991             0.988             0.447       0.527
EMSC + 1st                0.992             0.990             0.443       0.518
1st + EMSC                0.963             0.987             0.429       0.512
EMSC + 2nd                0.993             0.990             0.420       0.500
2nd + EMSC                0.984             0.979             0.631       0.730

Compared to the untransformed raw data, the second derivative (2nd), SNV and MSC did not affect the results very much. The first derivative (1st) and EMSC pre-transformations affected the results more. The combinations of EMSC with the first derivative (EMSC + 1st, 1st + EMSC) and with the second derivative applied after EMSC (EMSC + 2nd) gave the lowest RMSEC (CV) and RMSEP (CV) values, whereas applying the second derivative before EMSC (2nd + EMSC) gave larger errors than the raw data. The best calibration and prediction results were obtained with EMSC followed by the second derivative (EMSC + 2nd), with an RMSEC (CV) of 0.420 and an RMSEP (CV) of 0.500. This combination usually reduces the scatter-related offset and the light scattering and reveals more spectral features than the raw spectra.

5. Conclusion

The comparison of the different preprocessing methods shows that the extended MSC, that is, the extended multiplicative signal correction (EMSC) methods, provide the best overall performance in terms of prediction ability and calibration for short-wavelength near infrared (SW-NIR) spectra of moisture in marzipan. The success is presumably due to the ability of spectral modeling to separate chemical light-absorbance and physical light-scatter effects. The best results were obtained by EMSC followed by the second derivative (EMSC + 2nd).
Thus, the application of extended multiplicative signal correction (EMSC) is very important for estimating and separating multiplicative physical effects (light scattering) and additive chemical effects (light absorbance) in near infrared spectroscopy.

REFERENCES

[1] C. Pasquini, "Near Infrared Spectroscopy: Fundamentals, Practical Aspects and Analytical Applications," Journal of the Brazilian Chemical Society, Vol. 14, No. 2, 2003, pp. 198-219. doi:10.1590/S0103-50532003000200006
[2] H. Martens and T. Næs, "Multivariate Calibration," 1st Edition, John Wiley & Sons, New York, 1996.
[3] H. Martens, J. P. Nielsen and S. B. Engelsen, "Light Scattering and Light Absorbance Separated by Extended Multiplicative Signal Correction. Application to Near-Infrared Transmission Analysis of Powder Mixtures," Analytical Chemistry, Vol. 75, No. 3, 2003, pp. 394-404. doi:10.1021/ac020194w
[4] H. E. B. da Silva, F. S. Panero and L. P. D. Ribeiro, "Application by Extended Multiplicative Signal Correction to Reflectance Difuse the Near-Infrared of Ration for Shrimp," 10th International Conference on Chemometrics in Analytical Chemistry, Águas de Lindóia, 10-15 September 2006, p. 23.
[5] F. S. Panero and H. E. B. da Silva, "Application by Extended Multiplicative Signal Correction to NIR FR Raman Spectra of Pharmaceutical Tablets," 58th Pittsburgh Conference on Analytical Chemistry and Applied Spectroscopy, Chicago, 25 February-2 March 2007, p. 57.
[6] M. Lin, A. G. Cavinato, D. M. Mayes, S. Smiley, Y. Huang, M. Al-Holy and B. A. Rasco, "Bruise Detection in Pacific Pink Salmon (Oncorhynchus gorbuscha) by Short-Wavelength Near-Infrared Spectroscopy," Journal of Agricultural and Food Chemistry, Vol. 51, No. 22, 2003, pp. 6404-6408. doi:10.1021/jf0346197
[7] M. H. Lee, A. G. Cavinato, D. M. Mayes and B. A. Rasco, "Noninvasive Short-Wavelength Near-Infrared Spectroscopic Method to Estimate the Crude Lipid Content in the Muscle of Intact Rainbow Trout," Journal of Agricultural and Food Chemistry, Vol. 40, No. 1, 1992, pp. 2176-2181. doi:10.1021/jf00023a026
[8] E. Ben-Dor, Y. Inbar and Y. Chen, "The Reflectance Spectra of Organic Matter in the Visible near Infrared and Short Wave Infrared Region (400 - 2500 nm) during a Controlled Decomposition Process," Remote Sensing of Environment, Vol. 61, No. 1, 1997, pp. 1-15. doi:10.1016/S0034-4257(96)00120-4
[9] M. G. Trevisan and R. J. Poppi, "Química Analítica de Processos," Química Nova, Vol. 29, No. 5, 2006, pp. 1065-1071. doi:10.1590/S0100-40422006000500029
[10] A. S. Malik, O. Boyko, N. Atkar and W. F. Young, "A Comparative Study of MR Imaging Profile of Titanium Pedicle Screws," Acta Radiologica, Vol. 42, No. 3, 2001, pp. 291-293. doi:10.1080/028418501127346846
[11] H. Martens and T. Næs, "Multivariate Calibration," 1st Edition, John Wiley & Sons, Chichester, 1989.
[12] T. Næs, T. Isaksson, T. Fearn and T. Davies, "A User-Friendly Guide to Multivariate Calibration and Classification," 1st Edition, NIR Publications, Chichester, 2002.
[13] D. L. Massart, B. G. M. Vandeginste, S. N. Deming, Y. Michotte and L. Kaufman, "Data Handling in Science and Technology, Vol. 2: Chemometrics: A Textbook," 1st Edition, Elsevier, Amsterdam, 1988.
[14] M. J. Saiz-Abajo, B. H. Mevik, V. H. Segtnan and T. Næs, "Ensemble Methods and Data Augmentation by Noise Addition Applied to the Analysis of Spectroscopic Data," Analytica Chimica Acta, Vol. 533, No. 2, 2005, pp. 147-159. doi:10.1016/j.aca.2004.10.086
[15] Z. P. Chen, J. Morris and E. Martin, "Extracting Chemical Information from Spectral Data with Multiplicative Light Scattering Effects by Optical Path-Length Estimation and Correction," Analytical Chemistry, Vol. 78, No. 22, 2006, pp. 7674-7681. doi:10.1021/ac0610255
[16] P. Geladi, D. McDougall and H. Martens, "Linearization and Scatter-Correction for Near-Infrared Reflectance Spectra of Meat," Applied Spectroscopy, Vol. 39, No. 3, 1985, pp. 491-500. doi:10.1366/0003702854248656
[17] H. Martens and E. Stark, "Extended Multiplicative Signal Correction and Spectral Interference Subtraction: New Preprocessing Methods for near Infrared Spectroscopy," Journal of Pharmaceutical and Biomedical Analysis, Vol. 9, No. 8, 1991, pp. 625-635. doi:10.1016/0731-7085(91)80188-F
[18] M. Zeaiter, J.-M. Roger and V. Bellon-Maurel, "Robustness of Models Developed by Multivariate Calibration. Part II: The Influence of Pre-Processing Methods," Trends in Analytical Chemistry, Vol. 24, No. 5, 2005, pp. 437-445. doi:10.1016/j.trac.2004.11.023
[19] B. D. K. Pedersen, H. Martens, J. Pram-Nielsen and S. Balling-Engelsen, "Near-Infrared Absorption and Scattering Separated by Extended Inverted Signal Correction (EISC): Analysis of Near-Infrared Transmittance Spectra of Single Wheat Seeds," Applied Spectroscopy, Vol. 56, No. 9, 2002, pp. 1206-1224. doi:10.1366/000370202760295467
[20] H. Martens, J. Pram-Nielsen and S. Balling-Engelsen, "Light Scattering and Light Absorbance Separated by Extended Multiplicative Signal Correction. Application to Near-Infrared Transmission Analysis of Powder Mixtures," Analytical Chemistry, Vol. 75, No. 3, 2003, pp. 394-404. doi:10.1021/ac020194w
[21] M. Decker, P. V. Nielsen and H. Martens, "Near-Infrared Spectra of Penicillium camemberti Strains Separated by Extended Multiplicative Signal Correction Improved Prediction of Physical and Chemical Variations," Applied Spectroscopy, Vol. 59, No. 1, 2005, pp. 56-68. doi:10.1366/0003702052940486
[22] J. Christensen, L. Nørgaard, H. Heimdal, J. G. Pedersen and S. B. Engelsen, "Rapid Spectroscopy Analysis of Marzipan: Comparative Instrumentation," Journal of Near Infrared Spectroscopy, Vol. 12, No. 1, 2004, pp. 63-75. doi:10.1255/jnirs.408