Engineering, 2013, 5, 277-283
http://dx.doi.org/10.4236/eng.2013.510B058 Published Online October 2013 (http://www.scirp.org/journal/eng)
Copyright © 2013 SciRes.

Splitting of Gaussian Models via Adapted BML Method Pertaining to Cry-Based Diagnostic System

Hesam Farsaie Alaie, Chakib Tadj
Department of Electrical Engineering, École de Technologie Supérieure, Montréal, Canada
Email: hesam.farsaie-alaie.1@ens.etsmtl.ca

Received June 2013

ABSTRACT
In this paper, we make use of the boosting method to introduce a new learning algorithm for Gaussian Mixture Models (GMMs) called adapted Boosted Mixture Learning (BML). The method possesses the ability to rectify the existing problems in other conventional techniques for estimating the GMM parameters, due in part to a new mixing-up strategy to increase the number of Gaussian components. The discriminative splitting idea is employed for Gaussian mixture densities, followed by learning via the introduced method. Then, the GMM classifier was applied to distinguish between healthy infants and those that present a selected set of medical conditions. Each group includes both full-term and premature infants. A cry-pattern for each pathological condition is created by using the adapted BML method and 13-dimensional Mel-Frequency Cepstral Coefficient (MFCC) feature vectors. The test results demonstrate that the introduced method for training GMMs has a better performance than the traditional method based upon random splitting and EM-based re-estimation, used as a reference system, in the multi-pathological classification task.

Keywords: Adapted Boosted Mixture Learning; Gaussian Mixture Model; Splitting of Gaussians; Expectation-Maximization Algorithm; Cry Signals

1. Introduction
The Gaussian Mixture Model (GMM) has the capability to form smooth approximations to arbitrarily shaped densities, and it has proved to be an effective probabilistic model for biometric systems, most notably in speaker recognition and speaker identification systems [1]. GMMs are estimated from available training data using a special case of the Expectation-Maximization (EM) algorithm based on the maximum-likelihood (ML) criterion [2]. A finite amount of sample data produces detrimental effects that commit statistical errors in the training of the GMMs. Nevertheless, this iterative algorithm comes with a guarantee that the likelihood function will not decrease after each iteration, and it therefore converges to locally optimal parameters [3]. Performance degradation due to parameter estimation errors is a function of the number of free parameters in the classifier [3], so there are still some problems when increasing model complexity. For example, in Automatic Speech Recognition (ASR) systems using HTK with the method based on random splitting and EM-based re-estimation [4]: first, there is no guarantee that the mixture newly added by random splitting always increases the likelihood function prior to re-estimation; second, convergence to the optimum point in EM-based re-estimation is not guaranteed, due to the sensitivity to the initial parameters of the randomly split Gaussians. More recently, the traditional boosting method has been used to solve some problems of mixture models [5,6]. Another new method called Boosted Mixture Learning (BML), which learns Gaussian mixture Hidden Markov Models (HMMs), was introduced to overcome the aforementioned problems in the other available conventional techniques for estimating the GMM parameters [4]. In [7], the discriminative splitting idea was used for log-linear mixture densities in a speech recognition task. For this purpose, the parameters of the Gaussian model were transformed into their equivalents in the log-linear model, as presented in [8,9], and then trained in a Maximum Mutual Information (MMI) framework.

GMMs represent a statistical pattern recognition approach that enables optimal processing of data, both for training the classifier (via the EM algorithm) and for performing classification. A cry-based diagnostic system for newborn infants can be valuable in medical problems which are currently undetectable until it is too late for treatment. Recently, several classifiers such as the General Regression Neural Network (GRNN), Multi-Layer Perceptron (MLP), Time Delay Neural Network (TDNN), Probabilistic Neural Network (PNN), Radial Basis Function (RBF) network, and hybrid systems under several approaches such as bagging and boosting [10] were examined for discriminating between normal and sick infants' cry signals [11-17]. In our previous work [18], we made use of cry signals to distinguish between healthy and sick infants, both full-term and premature. Most of the previous studies [11-17,19] concentrate on the health status of infants via a binary classification task, but this paper focuses on identifying several different pathological conditions. In this article, a method for splitting Gaussian mixture densities is presented, based on the boosting algorithm, to maximize the frame-level ML objective function. The experiments performed on the diagnosis of infants' diseases show that it has fairly superior performance to the conventional method based on random splitting and EM-based re-estimation.

This paper is organized as follows: In Section 2 we give a brief review of the GMM. Section 3 explains the different parts of the introduced learning algorithm. In Section 4, preprocessing steps and experiments are reported, and in Section 5 a follow-up analysis of the results and a conclusion are presented to finalize this paper.
2. Gaussian Mixture Model
A complete GMM for a D-dimensional continuous-valued data vector $X$ can be represented by the weighted sum of $M$ Gaussian component densities $\mathcal{N}(\cdot;\mu_k,\Sigma_k)$, $k = 1,\ldots,M$, as follows:

$$F_M(X) = \sum_{k=1}^{M} c_k \,\mathcal{N}(X;\mu_k,\Sigma_k), \qquad \sum_{k=1}^{M} c_k = 1 \qquad (1)$$

where each mixture component $\mathcal{N}_k$ is a D-dimensional multivariate Gaussian distribution and $c_k$, $\mu_k$ and $\Sigma_k$ are its mixture weight, mean vector and covariance matrix, respectively. Since GMMs are usually used in unsupervised learning and clustering problems with an unknown number of mixtures and unknown parameters, the choice of model configuration is largely determined by the amount of data available for estimating the GMM parameters in a particular application. The GMM, as a parametric probability density function, combined with the following adapted learning method, could be a successful candidate for a cry-based physical or psychological status identification system.

3. Adapted Boosted Mixture Model
Generally, the boosting method combines weak learners or base classifiers in a weighted majority voting scheme to improve the overall classification accuracy for almost any type of learning algorithm [20,21]. The main idea of boosting is that instead of always treating all data points as equal, component classifiers should specialize on certain examples. Moreover, some recent work has shown that the boosting method can effectively increase the margin of all training samples, which can be explained by a theoretical view related to functional gradient techniques [4,22]. We should note that the boosting algorithm does not always improve the accuracy of a learning algorithm, nor does it always increase the margin. In the presented method, a new component $\mathcal{N}_k$ and its weight $c_k$ are trained discriminatively, based on a predefined objective function denoted as $\ell(\cdot)$, in an optimal way. Then, they are added to the previous mixture model $F_{k-1}$, which has $k-1$ mixture components, to grow into a new mixture model $F_k$.
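As a concrete illustration, the weighted sum of Equation (1) and this component-adding step can be sketched in one dimension (the helper names and toy values are ours, not the paper's; the real cry models use 13-dimensional MFCC vectors with full covariance matrices):

```python
import math

def gauss_pdf(x, mean, var):
    # 1-D Gaussian density, a stand-in for the D-dimensional N(X; mu_k, Sigma_k)
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2 * math.pi * var)

def gmm_pdf(x, weights, means, variances):
    # Equation (1): F_M(x) = sum_k c_k * N(x; mu_k, var_k), with sum_k c_k = 1
    return sum(c * gauss_pdf(x, m, v)
               for c, m, v in zip(weights, means, variances))

def add_component(weights, means, variances, c_new, m_new, v_new):
    # Grow F_{k-1} into F_k = (1 - c_k) F_{k-1} + c_k N_k by down-scaling the
    # old weights so the new mixture still sums to one.
    weights = [(1.0 - c_new) * c for c in weights] + [c_new]
    return weights, means + [m_new], variances + [v_new]

weights, means, variances = [1.0], [0.0], [1.0]   # single ML-trained Gaussian
weights, means, variances = add_component(weights, means, variances, 0.3, 4.0, 1.0)
```

After the call, the mixture has two components with weights 0.7 and 0.3, so it still integrates to one while gaining mass around the new mean.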
$$F_k(X) = (1 - c_k)\,F_{k-1}(X) + c_k\,\mathcal{N}_k(X) \qquad (2)$$

The objective function is defined as the log-likelihood of the mixture model $F_k$ over all training data $X_1, X_2, \ldots, X_T$:

$$\ell(F_k) = \sum_{t=1}^{T} \log F_k(X_t) \qquad (3)$$

where $c_k$ is the weight that combines the new mixture component with the current model. When a new mixture component $\mathcal{N}_k$ is added, it must increase the ML objective function with respect to $F_{k-1}$ until the stopping criterion, which will be explained later, is met:

$$\ell\bigl((1-\varepsilon)F_{k-1} + \varepsilon\,\mathcal{N}_k\bigr) > \ell(F_{k-1}) \qquad (4)$$

where $\varepsilon$ is a small deviation constant. Thus, the new mixture component $\mathcal{N}_k$ should be estimated so as to increase the ML objective function the most. By employing Taylor's series and the predefined inner product of mixture models $P$ and $Q$ over the training samples,

$$\langle P, Q \rangle = \frac{1}{T} \sum_{t=1}^{T} P(X_t)\,Q(X_t) \qquad (5)$$

the optimal new component can be obtained by:

$$\mathcal{N}_k^{*} = \arg\max_{\mathcal{N}_k} \bigl\langle \nabla \ell(F_{k-1}), \mathcal{N}_k \bigr\rangle = \arg\max_{\mathcal{N}_k} \sum_{t=1}^{T} \frac{\mathcal{N}_k(X_t)}{F_{k-1}(X_t)} \qquad (6)$$

The new mixture component is generated along the direction of the functional gradient where the objective function grows the most. There is no closed form of this optimization problem for GMMs, but it can be solved by optimizing a lower bound on the boosting learning formula with the EM algorithm [4]. After estimating $\mathcal{N}_k^{*}$, the mixture weight $c_k^{*}$ can be obtained by using the following line search:

$$c_k^{*} = \arg\max_{c_k \in [0,1]} \ell\bigl((1-c_k)\,F_{k-1} + c_k\,\mathcal{N}_k^{*}\bigr) \qquad (7)$$

3.1. Process of Adding a New Component
In this method, a single Gaussian model initialized by ML training is first estimated to fit the data, and then in each step it is split into two Gaussians, followed by learning via the introduced method. In the splitting or adding process, the part of the training vectors on which $\mathcal{N}_k(X)$ has a higher value than the remainder of the mixture model, denoted by $F_k - c_k\mathcal{N}_k$, is selected. Then this subset of data, denoted $X_{\mathrm{sub}}$, should be modeled by a small GMM consisting of two Gaussian components, called $\mathcal{N}_k^{*}$ and $\mathcal{N}_{k+1}$.
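The line search of Equation (7) can be sketched directly: evaluate the mixed log-likelihood on a grid of candidate weights and keep the best. This is a minimal 1-D version with toy data; the grid resolution is our choice for illustration, not the paper's:

```python
import math

def gauss_pdf(x, mean, var):
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2 * math.pi * var)

def line_search_weight(data, f_prev, new_comp, grid=101):
    # Equation (7): pick c_k in [0, 1] maximizing
    # sum_t log((1 - c_k) F_{k-1}(X_t) + c_k N_k*(X_t)) over a simple grid.
    best_c, best_ll = 0.0, float("-inf")
    for i in range(grid):
        c = i / (grid - 1)
        ll = sum(math.log((1 - c) * f_prev(x) + c * new_comp(x) + 1e-300)
                 for x in data)
        if ll > best_ll:
            best_c, best_ll = c, ll
    return best_c

# Toy data: half the samples near 0, half near 5. F_{k-1} only covers the
# cluster at 0, so the candidate component at 5 should earn a sizable weight.
data = [-0.2, 0.1, 0.0, 0.3, 4.8, 5.1, 5.0, 4.9]
f_prev = lambda x: gauss_pdf(x, 0.0, 1.0)
new_comp = lambda x: gauss_pdf(x, 5.0, 1.0)
c_star = line_search_weight(data, f_prev, new_comp)
```

With half the mass in each cluster, the search settles near c = 0.5.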
The initial component comes from the EM-based re-estimation, and then the second component and its weight are estimated based upon the adapted BML method. We then consider the estimated component (the second one) as an initial component and run the algorithm again. This process continues repeatedly until it reaches the optimal maximum log-likelihood estimate of the parameters over $X_{\mathrm{sub}}$. This procedure for finding the best two new components $\mathcal{N}_{k+1}$ and $\mathcal{N}_k^{*}$ is continued for $k = 1, \ldots, K$. Amongst all the $K$ created mixture models, the one that gives the highest value of the objective function is selected and added to the mixture by adjusting its weight. This iterative density splitting process in the ML framework is repeated as long as the added component causes an increase in the predefined objective function.

3.2. Partial and Global Updating
During the previous step, instead of finding the new mixture weight from the line search, there is an alternative method called partial updating, in which each new component and its weight are estimated at the same time. This is preferable since it may result in more robust and reliable estimation:

$$(c_k^{*}, \mathcal{N}_k^{*}) = \arg\max_{c_k, \mathcal{N}_k} \ell\bigl((1-c_k)\,F_{k-1} + c_k\,\mathcal{N}_k\bigr) \qquad (8)$$

The iterative re-estimation formulas for the model parameters $\Phi_k^{(n+1)} = \{c_k^{(n+1)}, \mu_k^{(n+1)}, \Sigma_k^{(n+1)}\}$ at the $(n+1)$-th iteration can be evaluated as follows [4]:

$$w^{(n)}(X_t) = \frac{c_k^{(n)}\,\mathcal{N}_k^{(n)}(X_t)}{(1-c_k^{(n)})\,F_{k-1}(X_t) + c_k^{(n)}\,\mathcal{N}_k^{(n)}(X_t)}$$

$$\bar{w}^{(n)}(X_t) = \frac{w^{(n)}(X_t)}{\sum_{t'=1}^{T} w^{(n)}(X_{t'})}$$

$$c_k^{(n+1)} = \frac{1}{T} \sum_{t=1}^{T} w^{(n)}(X_t)$$

$$\mu_k^{(n+1)} = \sum_{t=1}^{T} \bar{w}^{(n)}(X_t)\,X_t$$

$$\Sigma_k^{(n+1)} = \sum_{t=1}^{T} \bar{w}^{(n)}(X_t)\,\bigl(X_t - \mu_k^{(n+1)}\bigr)\bigl(X_t - \mu_k^{(n+1)}\bigr)^{\mathrm{T}} \qquad (9)$$

where $w^{(n)}(X_t)$ denotes the weight assigned to sample $X_t$ at the $n$-th iteration, similar to the sample weights used in traditional boosting algorithms. Moreover, in order to speed up the convergence process and find the minimum number of Gaussian components in the final mixture, the current mixture model $F_k$ should be updated globally over the training data samples before adding the next component.
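One pass of the Equation (9) updates can be sketched for a single 1-D component; the toy data and starting values here are ours, chosen only to show the weights concentrating on the cluster the new component should absorb:

```python
import math

def gauss_pdf(x, mean, var):
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2 * math.pi * var)

def reestimate_step(data, f_prev, c_k, mu_k, var_k):
    # One iteration of the Equation (9) updates for a 1-D new component N_k.
    nk = lambda x: gauss_pdf(x, mu_k, var_k)
    # Sample weights: responsibility of the new component under the mixture
    w = [c_k * nk(x) / ((1 - c_k) * f_prev(x) + c_k * nk(x)) for x in data]
    total = sum(w)
    w_bar = [wi / total for wi in w]                  # normalized weights
    c_next = total / len(data)                        # weight update
    mu_next = sum(wb * x for wb, x in zip(w_bar, data))
    var_next = sum(wb * (x - mu_next) ** 2 for wb, x in zip(w_bar, data))
    return c_next, mu_next, var_next

data = [-0.2, 0.1, 0.0, 0.3, 4.8, 5.1, 5.0, 4.9]
f_prev = lambda x: gauss_pdf(x, 0.0, 1.0)   # current (k-1)-component model
c, mu, var = reestimate_step(data, f_prev, 0.5, 4.0, 2.0)
```

After one step, the new component's mean moves onto the cluster near 5 and its weight stays close to the cluster's share of the data.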
For example, in the GMM with $k$ components, denoted by $F_k$, the $k$-th component can be re-estimated for $k = 1, \ldots, K$ while the remainder of the mixture model is assumed to be fixed. This means that after obtaining a mixture model $F_K$, we can update each component $\mathcal{N}_k$ and its weight over all training feature vectors by using the same updating equations. The parameter updating phase, subsequent to splitting the selected density in half, brings about an increase in the objective function through the localized training of each component separately.

3.3. Initialization of Sample Weights
A problem may arise when the initial values of the weights are chosen by boosting theory as follows:

$$w^{(0)}(X_t) = \frac{1}{F_{k-1}(X_t)} \qquad (10)$$

The dynamic range of $1/F_{k-1}$ is so large that it can be dominated by only a few outliers or samples with low probabilities. We use the so-called "weight decay" method [23] to compensate for the low probabilities by smoothing the sample weights based on power scaling:

$$w^{(0)}(X_t) = \frac{1}{\bigl(F_{k-1}(X_t)\bigr)^{p}}, \qquad 0 < p < 1 \qquad (11)$$

where $p$ is a decay parameter or exponential scaling factor. In the second method, the idea of sampling boosting in [24] is applied to form a subset of training feature vectors according to the mean and variance values of the decayed weights. Afterwards, the vectors contained in the previously created subset are utilized with equal weights to estimate the new component parameters. Let $M$ and $\sigma^2$ denote the mean and variance of the log-weights calculated in Equation (11), as defined below:

$$M = \mathrm{mean}\bigl(\log w^{(0)}(X_t)\bigr), \qquad \sigma^2 = \mathrm{variance}\bigl(\log w^{(0)}(X_t)\bigr) \qquad (12)$$

Then, the aforementioned subset with large weights is selected as described below:

$$X_{\mathrm{sub}} = \bigl\{\, X_t \;\big|\; \log w^{(0)}(X_t) \ge M + \beta\sigma \,\bigr\} \qquad (13)$$

where $\beta$ is a linear scaling factor to control the size of the subset $X_{\mathrm{sub}}$. In the experiments, we set $p = 0.05$ and $\beta = 0.5$ to overcome overfitting; these are the same parameter values utilized for the BML algorithm in [4].
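A sketch of the weight-decay initialization and the subset selection of Equations (11)-(13), under our reading that a sample is kept when its log-weight exceeds the mean log-weight by $\beta$ standard deviations (the data and density below are illustrative only):

```python
import math

def decayed_weights(data, f_prev, p=0.05):
    # Equation (11): w0(X_t) = 1 / F_{k-1}(X_t)^p, with 0 < p < 1
    return [1.0 / (f_prev(x) ** p) for x in data]

def select_subset(data, weights, beta=0.5):
    # Equations (12)-(13): keep samples whose log-weight exceeds the mean
    # log-weight by beta standard deviations (beta scales the subset size).
    logs = [math.log(w) for w in weights]
    m = sum(logs) / len(logs)
    sigma = math.sqrt(sum((l - m) ** 2 for l in logs) / len(logs))
    return [x for x, l in zip(data, logs) if l >= m + beta * sigma]

data = [-0.2, 0.1, 0.0, 0.3, 4.8, 5.1, 5.0, 4.9]
f_prev = lambda x: math.exp(-0.5 * x * x) / math.sqrt(2 * math.pi)  # N(0, 1)
w0 = decayed_weights(data, f_prev, p=0.05)
subset = select_subset(data, w0, beta=0.5)
```

Samples far from the current model (the cluster near 5) receive the largest decayed weights and survive the thresholding, so the new component is fitted exactly where the current mixture is weakest.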
Criterion for Model Selection
The process of adding a new mixture component to the previous mixture model continues incrementally and recursively until the optimal number of mixtures is reached. The set of selected Gaussian components should represent the space covered by the feature vectors. For this purpose, the strategy selected to stop the adding process is based on the Bayesian Information Criterion (BIC). It can be represented as follows [25]:

$$\mathrm{BIC}(F_k) = \ell(F_k) - \frac{1}{2}\,M_k \log T \qquad (14)$$

where $\ell(F_k)$ is the log-likelihood of the mixture model over all training data, $M_k$ is the number of free parameters used in model $F_k$, and $T$ denotes the total number of training data. Figure 1 shows a brief review of all the mentioned processes, in order, to train a GMM for each available pathological condition. A simple procedure to evaluate the presented learning method is to monitor its progress during the learning phase on a created training dataset whose samples have been drawn from a known mixture of multivariate Gaussian distributions. Given training data with 600 two-dimensional samples, we wish to estimate the parameters of the GMM, $\{c_k, \mu_k, \Sigma_k\}$, which in some sense best match the distribution of the training feature vectors. Figure 2 shows the final trained GMM and the whole discriminative splitting process after each substitution step. We compared the log-likelihood score between our method and the mentioned traditional method at the end of the discriminative training of this model. The negative log-likelihood score of the estimated GMM bears a close resemblance to that of the model trained with the traditional method using the correct number of Gaussian components on the same data; the values are $2.7682 \times 10^3$ and $2.7684 \times 10^3$, respectively.

4. Experiments
4.1. Preprocessing and Feature Extraction
It would be worthwhile to find a clear correlation between infants' medical statuses and extracted cry characteristics.
This concept could prove useful in an early infant diagnosis system. Several different cry characteristics and features were described in [19,26] and have been shown to work well in practice for distinguishing between a healthy infant's cry and that of infants with asphyxia, brain damage, hyperbilirubinemia, Down's syndrome, and of mothers who abused drugs during their pregnancies. Therefore, selecting the most informative features to distinguish between the healthy baby class and the pathological infant classes with different pathology conditions plays a significant role in pathological classification tasks. Table 1 shows the list of available pathological conditions and the number of samples in each class, totaling 63 cry signals for each of the healthy and sick infant classes, including both full-term and premature infants per class.

Figure 1. Block diagram of adapted BML technique.

In a similar way to typical speech recognition systems, the pre-processing and feature extraction phases are designed so that information irrelevant to the phonetic content of the cries (e.g., nurses talking and environmental noises) is eliminated as far as possible. The Mel-Frequency Cepstral Coefficients (MFCCs), which carry the vocal tract information, are selected to be extracted from the cries [27]. This type of characteristic is one of the popular schemes in speaker recognition and identification systems [27-30]. It is common practice to pre-emphasize the signal prior to computing the speech parameters by applying the filter $P(z) = 1 - 0.97 z^{-1}$ [31,32]. In all related practical applications, short-term frames should be utilized, which implies that the signal characteristics are assumed uniform within each region. Prior to any frequency analysis, Hamming windowing is necessary to reduce any discontinuities at the edges of the selected region. A common choice for the window length is 10 - 30 ms [32-34].
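The pre-emphasis filter and the Hamming-window framing described above can be sketched as follows; the 8 kHz sampling rate implied by the 160-sample frame is our assumption for illustration, as the paper does not state one:

```python
import math

def pre_emphasize(signal, alpha=0.97):
    # Apply P(z) = 1 - 0.97 z^-1, i.e. y[n] = x[n] - 0.97 * x[n-1]
    return [signal[0]] + [signal[n] - alpha * signal[n - 1]
                          for n in range(1, len(signal))]

def hamming_frames(signal, frame_len, overlap=0.3):
    # Cut the signal into frames with 30% overlap between consecutive
    # windows and taper each frame with a Hamming window.
    step = int(frame_len * (1 - overlap))
    window = [0.54 - 0.46 * math.cos(2 * math.pi * n / (frame_len - 1))
              for n in range(frame_len)]
    return [[signal[start + n] * window[n] for n in range(frame_len)]
            for start in range(0, len(signal) - frame_len + 1, step)]

# A 20 ms frame at an assumed 8 kHz sampling rate is 160 samples.
signal = [math.sin(0.1 * n) for n in range(400)]
frames = hamming_frames(pre_emphasize(signal), 160)
```

Each resulting frame would then be passed to the MFCC computation; in practice the filterbank and DCT stages are handled by a toolkit such as HTK [31].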
A total of 12 MFCCs, $C_n$, $n = 1, \ldots, 12$, are computed directly from the data [31,35]. For better performance, the 0th cepstral coefficient $C_0$ is appended to the vector; it is simply a version of the energy (i.e., a weighting with a zero-frequency cosine). Therefore, each frame is represented by a 13-dimensional MFCC feature vector [33].

4.2. Multi-Pathology Classification
In the training phase of the algorithm, in order to estimate the parameters of the GMMs for the pathology classes, almost 63% of the total cry signals were employed, and the remainder was kept for system evaluation. The GMM classifier is employed to identify infants' pathological conditions. The Maximum Likelihood (ML) decision criterion is applied to choose between hypotheses:

$$\text{Pathology}^{*} = \arg\max_{j} P(X \mid \lambda_j) \qquad (15)$$

where $P(X \mid \lambda_j)$ is the likelihood of a feature vector $X$ given the Gaussian model $\lambda_j$ for the $j$-th pathology class.

Figure 2. Estimated contour (a) of first Gaussian component, (b) after splitting GMM into 2 components, (c) of final GMM.

Table 1. Cry database.

Infants   | State   | Pathologies                 | Number
Full term | Healthy | N/A                         | 38
Full term | Sick    | Bovine protein allergy      | 13
Full term | Sick    | Tetralogy of Fallot         | 5
Full term | Sick    | Thrombosis in the vena cava | 13
Premature | Healthy | N/A                         | 25
Premature | Sick    | Tetralogy of Fallot         | 9
Premature | Sick    | Cardio complex              | 14
Premature | Sick    | X chromosomal abnormalities | 9

This multi-pathology classification was done by using the predefined feature vectors extracted with different frame durations (10, 20, 25, 30 ms) and the same overlap percentage (30%) between two consecutive windows, to assess what improvements this may bring. Our results show that, on average, the introduced method had a better accuracy rate than the traditional method based on random splitting and EM-based re-estimation for GMMs, which serves as our reference system. It is worth mentioning that the GMMs created by the traditional method for each class were trained by setting the number of components equal to that of the mixture model learned by the adapted BML method.
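The decision rule of Equation (15) reduces to an argmax over per-class likelihoods. A minimal sketch with two hypothetical single-Gaussian class models follows (the class names and parameters are illustrative; the real system compares full GMMs over 13-dimensional MFCC frames):

```python
import math

def gauss_loglik(x, mean, var):
    # Log of a 1-D Gaussian density, standing in for log P(X | lambda_j)
    return -0.5 * ((x - mean) ** 2 / var + math.log(2 * math.pi * var))

def classify(feature, class_models):
    # Equation (15): pick the class whose model gives the highest likelihood.
    return max(class_models,
               key=lambda j: gauss_loglik(feature, *class_models[j]))

# Two hypothetical "cry-pattern" models as (mean, variance) pairs
models = {"healthy": (0.0, 1.0), "pathology_A": (5.0, 1.0)}
label = classify(4.6, models)
```

An observation near 5 is attributed to the hypothetical pathology model, one near 0 to the healthy model; with GMM class models the same rule applies frame by frame, typically by summing per-frame log-likelihoods.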
The coefficient of variation (CV) is used to represent the reliability of the performance tests. It gives the standard deviation as a percentage of the mean value, computed from the frequency distribution over all pathology classes, as follows [36]:

$$\mathrm{CV} = \frac{\text{Standard Deviation}}{\text{Mean}} \times 100\% \qquad (16)$$

Due to space limitations, Table 2 shows only the results for two frame lengths (10 ms and 20 ms), as the most reliable results. Note that the states correspond to the order given in Table 1. It can be seen that both methods delivered strong performances for most pathology classes, but, based on the frequency distribution of the cry samples, the presented method with the 20 ms frame size had a better final accuracy rate. Moreover, the larger the CV, the more the performance varies.

Table 2. Obtained accuracy rate (%) for multi-pathology task.

State |  20 ms EM-Based | 20 ms ABML | 10 ms EM-Based | 10 ms ABML
1     |  100            | 100        | 100            | 100
2     |  100            | 100        | 80             | 80
3     |  100            | 100        | 100            | 100
4     |  75             | 100        | 75             | 75
5     |  100            | 88.9       | 100            | 100
6     |  100            | 100        | 100            | 100
7     |  80             | 60         | 80             | 80
8     |  100            | 100        | 100            | 100
Mean  |  94.16          | 94.58      | 92.08          | 92.08
CV    |  10.9           | 12         | 11.8           | 11.8

5. Conclusion
An adapted mixture learning method for GMMs based on the boosting algorithm was introduced in this paper. Advanced signal processing and machine learning techniques were employed in different parts of the learning process, such as adding a new component per step, the weighting function for the samples, model selection, and global re-estimation of the parameters. The focus of this paper has been on the application of discriminative training via the introduced GMM-ABML method as it pertains to pathology detection through infants' cry signals. For each pathology class in our cry database, the adapted BML method trained a mixture model with a separate Gaussian pool as a cry-pattern. The results show that, on average, it delivers a higher classification accuracy rate (94.58%) than the traditional method based on random splitting and EM-based re-estimation.
It might be early to reach strong conclusions, since there are not enough cases of the pathological classes, but the results have the potential to serve as a mixture learning method for further research. We are currently trying to use alternative discriminative criteria like MMI rather than ML, and collecting more sample cries for further tests.

6. Acknowledgements
We would like to thank Dr. Barrington and the members of the neonatology group of the Mother and Child University Hospital Center in Montreal (QC) for their dedication to the collection of the infant cry database. This research work has been funded by a grant from the Bill & Melinda Gates Foundation through the Grand Challenges Explorations Initiative.

REFERENCES
[1] D. A. Reynolds and R. C. Rose, "Robust Text-Independent Speaker Identification Using Gaussian Mixture Speaker Models," IEEE Transactions on Speech and Audio Processing, Vol. 3, 1995, pp. 72-83. http://dx.doi.org/10.1109/89.365379
[2] A. P. Dempster, et al., "Maximum Likelihood from Incomplete Data via the EM Algorithm," Journal of the Royal Statistical Society, Series B (Methodological), Vol. 39, 1977, pp. 1-38.
[3] L. P. Heck and K. C. Chou, "Gaussian Mixture Model Classifiers for Machine Monitoring," IEEE International Conference on Acoustics, Speech, and Signal Processing, Vol. 6, 1994, pp. VI/133-VI/136.
[4] D. Jun, et al., "Boosted Mixture Learning of Gaussian Mixture Hidden Markov Models Based on Maximum Likelihood for Speech Recognition," IEEE Transactions on Audio, Speech, and Language Processing, Vol. 19, 2011, pp. 2091-2100. http://dx.doi.org/10.1109/TASL.2011.2112352
[5] M. Kim and V. Pavlovic, "A Recursive Method for Discriminative Mixture Learning," Proceedings of the 24th International Conference on Machine Learning, 2007, pp. 409-416.
[6] V. Pavlovic, "Model-Based Motion Clustering Using Boosted Mixture Modeling," Proceedings of the 2004 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, Vol. 1, 2004, pp. I-811-I-818.
[7] W. Boyu, et al., "Gaussian Mixture Model Based on Genetic Algorithm for Brain-Computer Interface," 3rd International Congress on Image and Signal Processing (CISP), 2010, pp. 4079-4083.
[8] G. Heigold, et al., "Equivalence of Generative and Log-Linear Models," IEEE Transactions on Audio, Speech, and Language Processing, Vol. 19, 2011, pp. 1138-1148. http://dx.doi.org/10.1109/TASL.2010.2082532
[9] G. Heigold, et al., "On the Equivalence of Gaussian and Log-Linear HMMs," INTERSPEECH, 2008, pp. 273-276.
[10] E. Bauer and R. Kohavi, "An Empirical Comparison of Voting Classification Algorithms: Bagging, Boosting, and Variants," Machine Learning, Vol. 36, 1999, pp. 105-139. http://dx.doi.org/10.1023/A:1007515423169
[11] M. Hariharan, et al., "Normal and Hypoacoustic Infant Cry Signal Classification Using Time-Frequency Analysis and General Regression Neural Network," Computer Methods and Programs in Biomedicine, Vol. 108, 2012, pp. 559-569. http://dx.doi.org/10.1016/j.cmpb.2011.07.010
[12] M. Hariharan, et al., "Pathological Infant Cry Analysis Using Wavelet Packet Transform and Probabilistic Neural Network," Expert Systems with Applications, Vol. 38, 2011, pp. 15377-15382. http://dx.doi.org/10.1016/j.eswa.2011.06.025
[13] E. Amaro-Camargo and C. Reyes-García, "Applying Statistical Vectors of Acoustic Characteristics for the Automatic Classification of Infant Cry," In: D.-S. Huang, et al., Eds., Advanced Intelligent Computing Theories and Applications. With Aspects of Theoretical and Methodological Issues, Vol. 4681, Springer, Berlin/Heidelberg, 2007, pp. 1078-1085.
[14] S. Cano, et al., "A Combined Classifier of Cry Units with New Acoustic Attributes," In: J. Martínez-Trinidad, et al., Eds., Progress in Pattern Recognition, Image Analysis and Applications, Vol. 4225, Springer, Berlin/Heidelberg, 2006, pp. 416-425.
[15] O. Galaviz and C. García, "Infant Cry Classification to Identify Hypo Acoustics and Asphyxia Comparing an Evolutionary-Neural System with a Neural Network System," In: A. Gelbukh, et al., Eds., MICAI 2005: Advances in Artificial Intelligence, Vol. 3789, Springer, Berlin/Heidelberg, 2005, pp. 949-958.
[16] S. C. Ortiz, et al., "A Radial Basis Function Network Oriented for Infant Cry Classification," In: A. Sanfeliu, et al., Eds., Progress in Pattern Recognition, Image Analysis and Applications, Vol. 3287, Springer, Berlin/Heidelberg, 2004, pp. 15-36.
[17] J. Orozco and C. A. R. Garcia, "Detecting Pathologies from Infant Cry Applying Scaled Conjugate Gradient Neural Networks," European Symposium on Artificial Neural Networks, Bruges, 2003.
[18] H. Farsaie Alaie and C. Tadj, "Cry-Based Classification of Healthy and Sick Infants Using Adapted Boosting Mixture Learning Method for Gaussian Mixture Models," Modelling and Simulation in Engineering, Vol. 2012, p. 10.
[19] O. Wasz-Hockert, et al., "Twenty-Five Years of Scandinavian Cry Research," New York, 1985.
[20] C. Bishop, "Pattern Recognition and Machine Learning," Springer, Berlin, 2006.
[21] R. O. Duda, et al., "Pattern Classification," John Wiley & Sons, 2001.
[22] L. Mason, et al., "Functional Gradient Techniques for Combining Hypotheses," In: A. J. Smola, et al., Eds., Advances in Large Margin Classifiers, MIT Press, Cambridge, 2000, pp. 221-246.
[23] S. Rosset, "Robust Boosting and Its Relation to Bagging," Proceedings of the 11th ACM SIGKDD International Conference on Knowledge Discovery in Data Mining, Chicago, Illinois, 2005.
[24] Y. Freund and R. E. Schapire, "A Decision-Theoretic Generalization of On-Line Learning and an Application to Boosting," Journal of Computer and System Sciences, Vol. 55, 1997, pp. 119-139. http://dx.doi.org/10.1006/jcss.1997.1504
[25] G. Schwarz, "Estimating the Dimension of a Model," The Annals of Statistics, Vol. 6, 1978, pp. 461-464. http://dx.doi.org/10.1214/aos/1176344136
[26] M. J. Corwin, et al., "The Infant Cry: What Can It Tell Us?" Current Problems in Pediatrics, Vol. 26, 1996, pp. 325-334. http://dx.doi.org/10.1016/S0045-9380(96)80012-0
[27] M. D. Plumpe, et al., "Modeling of the Glottal Flow Derivative Waveform with Application to Speaker Identification," IEEE Transactions on Speech and Audio Processing, Vol. 7, 1999, pp. 569-586.
[28] W. Longbiao, et al., "Speaker Identification by Combining MFCC and Phase Information in Noisy Environments," IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2010, pp. 4502-4505.
[29] K. S. R. Murty and B. Yegnanarayana, "Combining Evidence from Residual Phase and MFCC Features for Speaker Recognition," IEEE Signal Processing Letters, Vol. 13, 2006, pp. 52-55.
[30] Z. Nengheng, et al., "Integration of Complementary Acoustic Features for Speaker Recognition," IEEE Signal Processing Letters, Vol. 14, 2007, pp. 181-184. http://dx.doi.org/10.1109/LSP.2006.884031
[31] S. Young, et al., "The HTK Book (for HTK Version 3.4)," Cambridge University Engineering Department, 2006.
[32] L. R. Rabiner and R. W. Schafer, "Digital Processing of Speech Signals," Prentice-Hall, Upper Saddle River, 1978.
[33] X. Huang, et al., "Spoken Language Processing: A Guide to Theory, Algorithm, and System Development," Prentice Hall, Upper Saddle River, 2001.
[34] M. Benzeghiba, et al., "Automatic Speech Recognition and Speech Variability: A Review," Speech Communication, Vol. 49, 2007, pp. 763-786. http://dx.doi.org/10.1016/j.specom.2007.02.006
[35] J. R. Deller, et al., "Discrete-Time Processing of Speech Signals," Prentice Hall, Upper Saddle River, 1993.
[36] D. Zill, et al., "Advanced Engineering Mathematics," Fourth Edition, 2011.