Journal of Transportation Technologies, 2013, 3, 220-231
http://dx.doi.org/10.4236/jtts.2013.33023 Published Online July 2013 (http://www.scirp.org/journal/jtts)

Configuration for Predicting Travel-Time Using Wavelet Packets and Support Vector Regression

Adeel Yusuf, Vijay K. Madisetti
School of Electrical and Computer Engineering, Georgia Institute of Technology, Atlanta, USA
Email: adeel@gatech.edu, vkm@gatech.edu

Received May 28, 2013; revised June 28, 2013; accepted July 5, 2013

Copyright © 2013 Adeel Yusuf, Vijay K. Madisetti. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

ABSTRACT

Travel-time prediction has gained significance over the years, especially in urban areas, due to increasing traffic congestion. In this paper, the basic building blocks of travel-time prediction models are discussed, with a small review of the previous work. A model for travel-time prediction on freeways based on wavelet packet decomposition and support vector regression (WDSVR) is proposed, which uses the multi-resolution and equivalent frequency distribution ability of the wavelet transform to train the support vector machines. The results are compared against the classical support vector regression (SVR) method. Our results indicated that the wavelet reconstructed coefficients, when used as an input to the support vector machine for regression, performed better (with selected wavelets only) than the support vector regression model without wavelet decomposition for prediction horizons of 45 minutes and more. The data used in this paper was taken from the California Department of Transportation (Caltrans) District 12, with a detector density of 2.73, experiencing daily peak hours except most weekends.
The data was stored for a period of 214 days, accumulated over 5-minute intervals, over a distance of 9.13 miles. The results indicated MAPE ranging from 12.35% to 14.75%, against the classical SVR method with MAPE ranging from 12.57% to 15.84%, with a prediction horizon of 45 minutes to 1 hour. The basic criteria for selection of wavelet basis for preprocessing the inputs of support vector machines are also explored to filter the set of wavelet families for the WDSVR model. Finally, a configuration of travel-time prediction on freeways is presented with interchangeable prediction methods.

Keywords: Travel-Time Prediction; Wavelet Packets; Support Vector Regression; Advanced Traveler Information System

1. Introduction

Accurate travel-time forecast information has become a fundamental component of all ATIS (Advanced Traveler Information Systems). Currently, drivers demand an accurate travel-time calculator that can forecast their commute time in advance. This forecast is even more significant in the morning and evening hours, when commuters face jammed freeways and want to avoid the peak-hour congestion. Drivers prefer precise information about future traffic conditions to manage their route. Presently, most of the State Department traffic websites provide the current traffic conditions; some sites even calculate a forecast of the travel time based on the historical data and/or current data by employing a suitable algorithm [1,2].

The travel-time is dependent on multiple factors that are related through a complex-dependent relationship with one another. Such factors include weather conditions, driver behavior, and time of the day. This complex dependence makes the traffic data both nonlinear and non-stationary. Consequently, accurate prediction of travel time becomes a challenging task.

Travel-time prediction methods can be classified from different perspectives as shown in Figure 1.
While a brief overview of all types is given in Section 2, the focus of this paper is on improving a short-term data-driven prediction method. Table 1 shows a brief overview of the prior art in this area.

The prediction horizons in Table 1 range from 5 minutes to 60 minutes. However, lower forecast horizons are not very useful for commuters in the real-world scenario, as there are delays involved in every module of the travel-time prediction process; the process diagram of the prediction process is shown in Figure 2.

Artificial Intelligence methods were extensively used
in travel-time prediction [7-10]. Most of this work was concentrated on short-term travel-time prediction (prediction horizon less than 60 minutes), mainly using the artificial neural network (ANN) technique. On the other hand, machine learning methods such as support vector regression (SVR), which have shown superior performance when compared with other traditional methods for prediction of nonlinear data, have not been applied aggressively in the area of travel-time prediction.

Figure 1. A taxonomy of travel time prediction approaches. (Diagram labels: Travel Time Prediction; Prediction Methodology; Prediction Horizon; Short Term Prediction; Long Term Prediction; Data-driven methods; Traffic Flow Model based Approach; Direct; Indirect; Input Data Type.)

Figure 2. Process diagram of travel-time prediction methods. (Diagram blocks: Data Acquisition & Storage from ILDs; Preprocessing; Traffic Database with historical, real-time and freeway info; Travel-time Estimation, trajectory based or traffic flow based; Travel-time Prediction with model parameters, training and testing; Predicted Travel-time.)

Support vector machines, since their inception by Vapnik [11,12], have been extensively used in classification and prediction problems. SVM uses a simple geometric interpretation and gives a sparse solution. The solution of SVM is also global and unique, as SVM employs the structural-risk-minimization principle. The support vector regression method [13] approaches the linear regression forecast by addressing it as a convex optimization problem (details in Section 4).
Its performance in financial time series forecasting [14], bioinformatics [15] and various other areas of research also makes it a viable method in intelligent transportation systems (ITS) applications. SVR was first applied as a forecasting tool in ITS by Wu [5], who predicted short-term travel time on the basis of past and current values. Recently, Wang [16] used a wavelet kernel support vector machine for regression to predict traffic flow in ITS applications.

In recent years many researchers have decomposed time series into more informative domains, such as the wavelet transform [17] and the S-transform [18], as an input to the SVR, which showed more accurate results than the non-decomposed method. This improved performance, along with the ability of SVR to predict nonlinear data, formed the motivation of our research to explore the effectiveness of travel-time prediction using wavelet transformed travel-time values as an input to SVR.

The rest of the paper is organized as follows: the problem statement along with some highlights of the past research is given in Section 2. Wavelet theory and support vector regression are explained in Sections 3 and 4, respectively. In Section 5 the proposed model is explained. Then we show the results of our model in Section 6. Finally, the paper is concluded in Section 7, with a brief on the claims made and future research direction.

2. Problem Description

The travel-time prediction problem can be viewed from the perspective of the input data type, prediction methodology and prediction horizon as shown in Figure 1. Irrespective of the class of travel-time prediction, the fundamental components of the process are similar, as shown in Figure 2. Below we explain each component with a review of the main published work done in each area.

2.1. Data Acquisition and Storage (ILD)

Formulation of an accurate predictive inference relies significantly on the quality of the traffic data.
A typical speed plot constructed using a portion of the dataset we used is shown in Figure 3. The blue area represents congestion, while the red part shows the free flow speeds. Inductive Loop Detector (ILD) data, based on its abundance and known quality issues, has been used as input data in most travel-time prediction papers [6,19-25]. The scalability of the model also biased the choice of the researcher towards choosing ILD as a data source. Other forms of datasets include probe vehicle data, traffic camera feeds, satellite data, and data obtained from microwave radar, license plate matching, and automated vehicle tag matching.

Figure 3. Speed plot of a portion of the dataset.

Table 1. Comparison of related work: prior art related to short-term travel-time prediction.

| Prediction Method | Author/Year of Publication | Length of Roadway | Accuracy | Prediction Horizon |
|---|---|---|---|---|
| Neural Networks | J.W.C. Van Lint (2004) [3] | 5.28 mi (8.5 km) | RMSEP: 7.7%; MRE: 0.49%; SRE: 6% | 15 min |
| Kalman Filter | Chen and Steven Chien (2001) [4] | 8 mi (12.88 km) | MARE: 0.0173 to 0.0208 | 5 min |
| Support Vector Regression | Wu, Ho and Lee (2004) [5] | 28 to 217.5 mi (45 to 350 km) | RME: 0.96% to 4.42%; RMSE: 1.33% to 7.35% | 3 min |
| PCA/Nearest Neighbor | Rice and Zwet (2004) [1] | 48 mi (77.25 km) | RMSE: 2.6 to 11 (approx.) | 60 min |
| Regression | Kwon, Coifman and Bickel (2000) [6] | 6.2 mi (10 km), 20 mi (32.19 km) | MAPE: tree method 6.9% to 28.7%; regression 7.7% to 23.3% | 10 to 60 min |

Before using ILD data as our data source, certain known issues required attention in the context of the site selection and data preprocessing phases. Spacing between consecutive loop detectors directly affects the quality of the data captured. The standard spacing requirement between consecutive loop detectors is not defined in the literature. However, [26] concluded that a detector spacing of 1 to 1.5 km is optimum for short-term forecasting of traffic parameters. In [27], it was shown that a detector spacing of 0.33 to 1 mile does not destabilize the travel-time estimation errors, while [28] concluded that a detector spacing of 0.5 miles is sufficient to represent traffic congestion with acceptable accuracy.

After data acquisition, preprocessing steps are performed on this data to ensure its validity. ILDs are prone to a number of errors [29]. These data errors are usually detected and removed using imputation methods [29,30]. [29] gave a linear model based on historical data using neighboring detectors to detect faulty values, and imputed the missing or bad values through linear regression. The method proposed in [29] was adopted by CALTRANS for data processing of the loop detector data in California roadways.

2.2. Travel-Time Estimation

Like any prediction problem, the ground truth (estimated travel time) is essential to calculate and evaluate the results (predicted travel time). The travel-time estimation methods are divided into two broad categories: trajectory-based and flow-based.

2.2.1. Trajectory-Based Methods

The trajectory-based methods convert the time-mean speeds collected from detectors to space-mean speed. Different methods are proposed to calculate link travel time from this speed. The two common methods are the midpoint method and the average speed method. Both of these methods assume a constant speed between links, which in reality is never the case, especially when traffic is in transition from free flow to congestion or vice versa. Hence, the algorithms proposing a constant speed lose their accuracy with the increase in congestion [31]. Van Lint and Van der Zijpp proposed an alternate approach, the "Piecewise Linear Speed" method [32], which solved the function of the travel-time based on the time-mean speed using an ordinary differential equation to calculate the trajectory of the vehicle in the section based on space-mean speed.

2.2.2. Flow-Based Methods

An alternate way of estimating travel-time is through flow-based models, which focus on capturing the dynamics of traffic using traffic-flow theory concepts and, through traffic data simulation, draw the travel-time of the segment. Accurate flow information is also required for a precise estimation; however, in most cases it is difficult to collect data from all on-ramps and off-ramps using the existing infrastructure, which becomes a bottleneck for flow-based estimation methods. These models are, however, more popular in research involving traffic flow simulation.

2.3. Travel-Time Prediction

The travel-time prediction approach is mainly classified w.r.t. the prediction horizon, modeling approach and type of input data as shown in Figure 1. Further classification
is also possible w.r.t. the road type (freeways, arterials); but since the scope of this proposal is confined to freeways, we would not discuss the arterial travel-time prediction problem.

The historical data of traffic parameters can represent a traffic profile, which could be implemented to predict future values in similar traffic conditions. This approach demands offline processing. The data is classified into different subtypes based on their characteristics. In [33] the data was sub-classified into the "type of day" for prediction of travel-time. This forecast method does not take into account the dynamics of traffic for travel-time prediction, which makes this method less robust for short-term prediction. Consequently, it produces low accuracy results when the current traffic is not representative of its historical profile. The historical predictor is normally used for long-term prediction.

A hybrid approach of combining historical data with current data was used in [34], where real-time data was captured directly from the road side terminals, and using it with aggregated historical data showed improved results. [1] used principal component analysis and windowed nearest neighbor, while combining historical and instantaneous data.

Traffic data shares similarities when compared with historical data of the same day and time as the current data. Regression methods with coefficients varying with the time of the day were used by [1], [35] and [36] to predict travel-time. [6] also used linear regression with a stepwise variable selection method. Regression models involve the examination of historical data, thereby extracting parameters which represent traffic characteristics, and projecting them into the future to predict travel-time.

Autoregressive integrated moving average (ARIMA) was introduced by [37] and [38] as an alternative to model the stochastic nature of traffic. [39] used an autoregression model to predict travel time. Nonlinear time series with multifractal analysis was implemented in [40] and [41] for travel time prediction.

Kalman and Extended Kalman Filters used in [2,42] provide good performance in predicting travel-time for a one time-step ahead horizon, which is normally not more than 5 minutes, as the state model needs real observations to calculate each error term.

Artificial neural networks (ANN) were extensively used for marking nonlinear boundaries. To address the problem of a time series forecast, a subtype of ANN called the recurrent neural network (RNN) was considered suitable [19,24,43]. RNN has an internal state, which keeps track of the temporal behavior between classes. Different architectures of the multilayer perceptron have been used to predict travel-time with an improved accuracy [7,8,10,19,20,23,24,43-45]. The support vector regression method was also investigated in [5,46].

On the other hand, traffic flow models work on the concept of correlating the theory of fluid dynamics with vehicular flow. From the perspective of traffic flow models, travel-time prediction is more of a boundary condition prediction problem, because the flow model is designed offline, and it would predict the time based on the values of demand and supply at on-ramps and off-ramps respectively. The model is run using a simulation scheme, which is based on the assumptions of the car-following, gap acceptance, and risk avoidance parameters. The simulation model predicts the aggregated parameters of simulated vehicles to display the predicted travel-time [47,48]. This makes traffic flow models very complex and requires a high degree of expertise and long man-hours for design and maintenance.

Traffic flow models give us a better understanding of the traffic flow dynamics, but as far as their accuracy for travel time prediction is concerned, they demand a precise infrastructure of input detectors, whose location would be defined by the flow model. To manage the supply and demand parameters, the flow models require additional detectors on each off-ramp and on-ramp. Traffic flow based models are a good method to evaluate the cause and effect of traffic phenomena, but applying them for travel-time prediction would entail a huge design and maintenance cost for every freeway section. Due to their modular design, traffic flow models for travel-time prediction would only be as accurate as the predicted inputs and boundary conditions.

3. An Overview of Wavelets

Wavelets are functions which present a multi-resolution decomposition of a signal x using a mother function $\psi$ and a linear combination of its dilated and/or shifted versions (1).

$$\psi_{u,s}(x) = \frac{1}{\sqrt{s}}\,\psi\!\left(\frac{x-u}{s}\right) \tag{1}$$

where s defines the dilation and u defines the shift. To ensure orthonormality of basis functions [49] the time-scale parameters are sampled on a dyadic grid on the time-scale plane. Thus Equation (1) becomes

$$\psi_{j,n}(t) = \frac{1}{\sqrt{2^{j}}}\,\psi\!\left(\frac{t}{2^{j}} - n\right).$$

The orthonormal wavelet transform is then given by

$$\langle x(t), \psi_{j,n} \rangle = \frac{1}{\sqrt{2^{j}}} \int x(t)\,\psi\!\left(\frac{t}{2^{j}} - n\right) dt.$$

To make the transform computationally effective the concept of subband coding [50] was used to filter the signal with a series of high pass and low pass filters to
analyze its high frequency and low frequency components respectively. The input signal x(t) can now be represented in the discrete domain as

$$x(t) = \sum_{n \in \mathbb{Z}} c_{J,n}\,\phi_{J,n}(t) + \sum_{j \le J} \sum_{n \in \mathbb{Z}} d_{j,n}\,\psi_{j,n}(t).$$

The sampled scaling coefficients $c_{j,n}$ and wavelet coefficients $d_{j,n}$ can now be defined using the high pass filter $h_l$ and the low pass filter $g_l$:

$$c_{j,n} = \sum_{l \in \mathbb{Z}} g_l\, c_{j-1,\,2n+1-l}, \qquad d_{j,n} = \sum_{l \in \mathbb{Z}} h_l\, c_{j-1,\,2n+1-l}.$$

To add translation-invariance to the discrete wavelet transform (DWT), the maximum overlap discrete wavelet transform (MODWT) was introduced, which instead of down sampling and up sampling the signal introduces high and low pass filters up sampled by a factor of $2^{j-1}$. The up sampled filters also introduce redundancy in the output, since the number of samples at the output of every level is equal to the number of samples in the input signal. This makes multi-resolution analysis much more effective, especially from the perspective of using this transform as an input to another system.

$$d^{(M)}_{j,n} = \sum_{l=0}^{L-1} h_l\, c^{(M)}_{j-1,\,(n - 2^{j-1} l) \bmod N}, \qquad c^{(M)}_{j,n} = \sum_{l=0}^{L-1} g_l\, c^{(M)}_{j-1,\,(n - 2^{j-1} l) \bmod N}.$$

The filters can now be represented as a circular filter of the original time series:

$$d^{(M)}_{j,n} = \sum_{l=0}^{L_j-1} h_{j,l}\, x_{(n-l) \bmod N}, \qquad c^{(M)}_{j,n} = \sum_{l=0}^{L_j-1} g_{j,l}\, x_{(n-l) \bmod N}.$$

To generate the wavelet packet tree, both the approximation and detail coefficients are decomposed, instead of just the approximation coefficients as in the case of the DWT. Hence the wavelet packet distributes the frequency of the original signal evenly between all coefficients, as opposed to the wavelet transform where 50% of the signal frequency is in the first detail, as shown in Figure 4. In the WDSVR model, we chose the wavelet packet transform to evenly distribute the signal frequency in each support vector module.

Figure 4. (a) Frequency allocation of 2-level DWT; (b) frequency allocation of 2-level wavelet packet transform.

4. Support Vector Regression

Support vector machines (SVM) work on the concept of Structural Risk Minimization [12] by transforming a low dimensional input x into a high dimensional feature space through a mapping function $\phi$ and then approximating the function f(x) using linear regression

$$f(x) = \sum_{i=1}^{D} w_i\,\phi_i(x) + b$$

where b is the threshold and w is the normal vector to the hyperplane. The coefficients can be determined from the data by minimizing the regression risk function

$$R_{\text{reg}}(w) = \frac{1}{2}\,\lVert w \rVert^{2} + C \sum_{i=1}^{N} \lvert y_i - f(x_i) \rvert_{\varepsilon} \tag{2}$$

where C is the cost parameter, which defines the tradeoff between training error and model complexity. The ε-SVR algorithm ignores training errors that lie within the threshold ε defined by the user. Mathematically,

$$\lvert y_i - f(x_i) \rvert_{\varepsilon} = \begin{cases} \lvert y_i - f(x_i) \rvert - \varepsilon & \text{for } \lvert f(x_i) - y_i \rvert \ge \varepsilon \\ 0 & \text{otherwise} \end{cases} \tag{3}$$

Equation (3) is also known as Vapnik's ε-insensitive loss function. Both Equation (3) and the regression risk function, Equation (2), can be minimized by introducing Lagrangian multipliers $\alpha_i$ and $\alpha_i^{*}$ to this quadratic problem, yielding the solution

$$f(x, \alpha, \alpha^{*}) = \sum_{i=1}^{N} (\alpha_i^{*} - \alpha_i)\, k(x_i, x) + b$$

with $\alpha_i \alpha_i^{*} = 0$ and $\alpha_i, \alpha_i^{*} \ge 0$ for $i = 1, \dots, N$. Here $k(x_i, x)$ is the kernel function, which is computed by calculating the dot product in some feature space:

$$k(x, y) = \sum_{j=1}^{D} \phi_j(x)\,\phi_j(y).$$

It is important to note that the kernel k(x,y) has a known analytical form and must obey Mercer's condition.

5. Wavelet Packet Support Vector Regression

The structure of wavelet packet support vector regression is schematically outlined in Figure 5. The model works by evenly distributing the original signal's frequency into the SVR modules using the wavelet packet transform. The time series signal, which represented the travel-time of the freeway, was sampled from the database based on the prediction horizon selected. The time signal was then transformed into the wavelet packet decomposed signals $\{W_{j,n}\}_{n=0}^{2^{j}-1}$, where j is the level of the decomposition. The wavelet decomposition was done using a sliding window as shown in Figure 6. The window size determines the number of input features given to the support vector machine. In our case a window size of 8 was selected and the decomposition was done at level 2. These wavelet coefficients were stored for the support vector regression module. The four frequency components were processed through their respective support vector machines to compute a one time-step ahead output, where the step was equal to the time interval between the consecutive input values. The support vector regression output was finally aggregated to calculate the travel-time forecast. Table 2 gives the step by step implementation of the wavelet packet support vector regression algorithm.

6. Experiments and Results

6.1. Selection of Mother Wavelet

The major computational load of the proposed travel-time prediction model was divided into two parts: computation of the wavelet packet reconstructed time series data, and training of the support vector regression machines using the optimal cost and epsilon values. The grid search method was used for searching for epsilon and cost values.

A definite procedure for selection of mother wavelets is yet to be established for wavelet decomposed support vector regression models. However, analyzing the wavelet reconstructed signal in the context of the characteristics of the support vector machines helped us in filtering the relevant wavelet basis. The accuracy of the proposed model is superior to the classical SVR model if the condition in Equation (4) is met.

$$\varepsilon_{SVR_{2,0}} + \varepsilon_{SVR_{2,1}} + \varepsilon_{SVR_{2,2}} + \varepsilon_{SVR_{2,3}} < \varepsilon_{SVR} \tag{4}$$

where $\varepsilon_{SVR}$ is the error of the classical support vector regression method. It is clear from Equation (4) that WDSVR would not produce more accurate results than SVR for shorter time horizons, knowing that prediction error is proportional to the prediction horizon. In our datasets, the WDSVR gave more accurate results than the SVR method for prediction horizons of 45 minutes or more. We conducted two basic tests for the admissibility of all wavelets for the support vector machine module.

6.1.1. Cross-Correlation of Wavelet Decomposed Data

For accurate predictions of a nonlinear and non-stationary dataset, the reconstructed wavelet coefficients of successive windows should not be correlated with one another. A positive linear correlation of +1.0 would indicate a similar pattern to the SVR module for every input and would adversely affect its prediction accuracy. To test our hypothesis we computed the cross-correlation of each window with the other.

6.1.2. Recurrence Relationship

The second test was to detect if the reconstructed wavelet coefficient windows were following a certain pattern at a particular location. We know that the input data of the successive windows is nonlinear. The existence of a unique pattern at a similar location in the input signal would indicate a similar pattern to the support vector machine in every iteration, which in reality is not the case. Consequently, it would adversely affect the performance of the SVR module.
To detect such events we calculated the first difference of each successive window.

Table 2. Algorithm for wavelet decomposed support vector regression.

1) Sample the travel-time array into subsets for their respective prediction horizons, $y(t) = \{x(hk/5)\}_{k=0}^{N-1}$, where h is the prediction horizon in minutes.
2) Initialize p = 0 and decompose the sampled signal using wavelet packet decomposition at level j = 2: $W_{j,n}\big(\{y(t_k)\}_{k=p}^{p+7}\big)$.
3) Store $W_{j,n}$ computed in step 2 for the SVR module and increment p = p + 1.
4) Repeat steps 2 and 3 until the end of the input array y(t).
5) Increment n = n + 1 and repeat steps 2 to 4 until n = $2^{j}$.
6) Divide $W_{j,n}$ into training and testing sets and compute the one step ahead prediction value using their respective SVR modules.
7) Aggregate the predictions of all 4 SVR modules to calculate the predicted travel time.
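The steps in Table 2 can be sketched in Python as follows. This is an illustrative sketch under stated assumptions, not the implementation used in the experiments: it uses the Haar wavelet only because it is short to write by hand, it replaces each trained SVR module with a hypothetical persistence predictor that repeats the last value of its input, and all function names and the sample data are our own.

```python
import math

SQRT2 = math.sqrt(2.0)


def sample_by_horizon(x, horizon_min, interval_min=5):
    """Step 1: keep every (horizon / interval)-th value of the 5-minute series."""
    step = horizon_min // interval_min
    return x[::step]


def haar_split(x):
    """One Haar analysis step: returns (approximation, detail) halves."""
    approx = [(x[i] + x[i + 1]) / SQRT2 for i in range(0, len(x), 2)]
    detail = [(x[i] - x[i + 1]) / SQRT2 for i in range(0, len(x), 2)]
    return approx, detail


def haar_merge(approx, detail):
    """Inverse of haar_split."""
    x = []
    for a, d in zip(approx, detail):
        x.extend([(a + d) / SQRT2, (a - d) / SQRT2])
    return x


def wp_components(window):
    """Step 2: level-2 wavelet packet of one window.

    Each of the four level-2 nodes is reconstructed on its own (the other
    nodes zeroed), giving four time-domain components that sum to the window.
    """
    approx, detail = haar_split(window)
    nodes = [*haar_split(approx), *haar_split(detail)]
    components = []
    for n in range(4):
        kept = [node if k == n else [0.0] * len(node)
                for k, node in enumerate(nodes)]
        components.append(haar_merge(haar_merge(kept[0], kept[1]),
                                     haar_merge(kept[2], kept[3])))
    return components


def persistence(component):
    """Hypothetical stand-in for a trained SVR module: repeat the last value."""
    return component[-1]


def wdsvr_forecast(y, predictors, window=8):
    """Steps 2-7: slide the window, predict per component, sum the outputs."""
    forecasts = []
    for p in range(len(y) - window + 1):
        components = wp_components(y[p:p + window])
        forecasts.append(sum(f(c) for f, c in zip(predictors, components)))
    return forecasts


# Made-up 5-minute travel-time values, for illustration only.
travel_times = [10.0 + (i % 12) * 0.5 for i in range(180)]
y = sample_by_horizon(travel_times, horizon_min=45)
forecasts = wdsvr_forecast(y, [persistence] * 4)
print(len(y), len(forecasts))
```

With the persistence stand-in, the aggregated forecast for each window equals the last value of that window, which makes the decomposition and aggregation easy to check before real regressors are plugged in.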
To identify the above characteristics in the wavelet signal we used a subset of the data, chosen at random, spanning four days. In Figure 7(a) the wavelet reconstructed difference signal converged to zero at a similar point in every iteration. Figure 7(c) is the first difference of the Biorthogonal 1.1 filter output at level 2,3, which indicates a linear correlation among the successive windows. On the other hand, the best performing wavelet at the one hour prediction horizon, the Reverse Biorthogonal 6.8 wavelet, showed no cross-correlation or recurrence relationship, as shown in Figures 7(b) and (d).

Based on our admissibility tests, 9 wavelets were filtered out of a total of 42, hence reducing the computational load of our project by 21.43%. While a detailed study on wavelet selection for WDSVR is needed, our results on the selection of wavelets for the support vector machines have shown encouraging results to motivate further work in this area.

6.2. An Alternate Configuration for Interchangeable Prediction Methods

The WDSVR and SVR have both proven suitable for travel-time prediction depending on the selected forecast horizon. In our dataset, we observed that SVR is more accurate for prediction horizons of less than 45 minutes. From 45 minutes onwards, WDSVR gives more accurate results. Considering the effectiveness of both models in different horizons, we have proposed an interchangeable configuration in Figure 8, where travel-times using both models are computed in parallel and the configuration for active use is then switched depending on the selected prediction horizon. The cloud component, which houses both the prediction models, is flexible and can be scaled either horizontally or vertically to accommodate the computation overhead.

6.3. Experimental Setup

The data for our model validation and testing was collected from the Caltrans Performance Measurement System (PeMS) website [2]. The route of 9.13 miles on I-5N was selected, with a detector density of 2.73. The data was observed for 214 consecutive days, from March 01, 2011 to September 30, 2011, from 1 pm to 8 pm. The time slot was selected after observing the daily pattern of congestion during this period. The data revealed daily congestion in the evening hours except holidays and most weekends. This loop detector data was collected over a 5-minute interval.

The speed data was converted to a travel-time series using the PLSB travel-time estimation method [32]. We decomposed the time series using the wavelet packet decomposition at level 2. The data was then reshaped into a u × v matrix with u = N − 7 and v = 8. The decomposed and reshaped wavelet transform of the travel-time matrix gave us $2^{j}$ matrices at level j, represented as

$$W_{j,n} = \begin{bmatrix} W_{j,n,t_1} & \cdots & W_{j,n,t_8} \\ \vdots & \ddots & \vdots \\ W_{j,n,t_{N-7}} & \cdots & W_{j,n,t_N} \end{bmatrix}$$

The four matrices were given as input to their respective support vector machines, with (N − 7) × 0.7 rows used for training and the remaining 30% for evaluation.

Figure 5. Schematic diagram of the wavelet decomposed support vector regression model. (Diagram blocks: Historic Travel-Time Database; Wavelet Tree Decomposition & Coefficient Reconstruction; W 2,0 to W 2,3; SVR 2,0 to SVR 2,3; Ŵ 2,0 to Ŵ 2,3; Predicted Travel Time.)

Figure 6. Flow diagram of the algorithm for wavelet decomposed support vector regression. (Diagram blocks: Reshaped Travel Time Data; Wavelet Coefficients W 2,0 to W 2,3 over t−7 to t; SVR 2,0 to SVR 2,3; W 2,0,t+1 to W 2,3,t+1; Predicted travel time value for time t + 1.)
Figure 7. A comparison of wavelet recurrence relationship and cross correlation of better and worse performing wavelets: (a) First difference signal of wavelet packet reconstructed time series at level 2,3 using Biorthogonal 3.3; (b) First difference signal of wavelet packet reconstructed time series at level 2,3 using Reverse Biorthogonal 6.8; (c) First difference signal of wavelet packet reconstructed time series at level 2,3 using Biorthogonal 1.1; (d) First difference signal of wavelet packet reconstructed time series at level 2,3 using Reverse Biorthogonal 6.8.
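The two admissibility checks of Sections 6.1.1 and 6.1.2, whose outcome Figure 7 illustrates, can be sketched as below. The sketch assumes each window is a list of wavelet reconstructed coefficients; the function names, the correlation limit of 0.999, the zero tolerance, and the toy windows are illustrative assumptions and are not values taken from the experiments.

```python
import math


def pearson(a, b):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return 0.0  # assumption: treat a constant window as uncorrelated
    return cov / math.sqrt(var_a * var_b)


def first_difference(window):
    """Difference between each value and the one before it."""
    return [window[i + 1] - window[i] for i in range(len(window) - 1)]


def max_successive_correlation(windows):
    """Largest correlation between each window and the next one."""
    return max(pearson(windows[i], windows[i + 1])
               for i in range(len(windows) - 1))


def recurrent_positions(windows, tol=1e-9):
    """Positions where the first difference is (near) zero in every window."""
    diffs = [first_difference(w) for w in windows]
    return [k for k in range(len(diffs[0]))
            if all(abs(d[k]) <= tol for d in diffs)]


def is_admissible(windows, corr_limit=0.999):
    """True when neither test flags the wavelet (both limits are assumptions)."""
    return (max_successive_correlation(windows) < corr_limit
            and not recurrent_positions(windows))


# Toy windows, for illustration only.
correlated = [[1.0, 2.0, 3.0, 4.0], [2.0, 4.0, 6.0, 8.0], [3.0, 6.0, 9.0, 12.0]]
recurrent = [[1.0, 1.0, 3.0, 2.0], [5.0, 5.0, 2.0, 7.0], [0.0, 0.0, 4.0, 1.0]]
varied = [[1.0, 3.0, 2.0, 5.0], [4.0, 1.0, 3.0, 2.0], [2.0, 5.0, 1.0, 4.0]]
print(is_admissible(correlated), is_admissible(recurrent), is_admissible(varied))
```

A wavelet whose windows fail either check would be dropped before any SVR training, which is where the saving in computation comes from.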
The evaluation matrix for each $W_{j,n}$ above was represented as $W^{\text{label}}_{j,n}$, which holds the one step ahead value for every row of $W_{j,n}$. The predicted labels of each support vector machine were aggregated to compute the forecast time value. Finally, the values generated by SVR were evaluated for errors.

We tested our model using Daubechies, Coiflets, Symlets, Reverse Biorthogonal and Biorthogonal wavelets in 42 different configurations, with different values of cost and epsilon. It was observed that not all wavelets gave better results than the benchmark SVR predicted values. However, some of the worse performing wavelets were filtered out using our wavelet selection process to save computational cost. The best outputs in each time horizon subcategory are shown in Tables 3-5.

Mean Absolute Percentage Error (MAPE), Root Mean Squared Error (RMSE) and Pearson Product-Moment Correlation were the three indicators chosen for evaluation of our model and for comparison with the classical Support Vector Regression model. Table 4 shows the comparison of MAPE between SVR and SVR with wavelet decomposed inputs. Table 5 shows the comparison of Pearson product-moment correlation between SVR and SVR with wavelet decomposed inputs.

Our results indicated that the wavelet decomposed support vector regression model consistently showed better performance for prediction horizons of 45 minutes and above, but below 45 minutes the classical SVR method was more accurate. Figure 9 shows the better tracking ability of the proposed model in comparison with the SVR model.

Table 3. Comparison of RMSE between SVR and SVR with wavelet decomposed inputs (our approach).

| Prediction Method | 45 min | 50 min | 55 min | 60 min |
|---|---|---|---|---|
| Wavelet Packet SVR: wavelet and parameters | bior2.6, ε = 0.1, C = 100 | bior6.8, ε = 0.01, C = 100 | coif5, ε = 0.1, C = 100 | db6, ε = 0.001, C = 100 |
| Wavelet Packet SVR: RMSE | 2.2 | 2.31 | 2.41 | 2.46 |
| SVR Predictor: parameters | ε = 0.01, C = 100 | ε = 0.1, C = 1 | ε = 0.001, C = 100 | ε = 0.1, C = 10 |
| SVR Predictor: RMSE | 2.26 | 2.4 | 2.48 | 2.88 |

Table 4. Comparison of MAPE (%) between SVR and SVR with wavelet decomposed inputs.

| Prediction Method | 45 min | 50 min | 55 min | 60 min |
|---|---|---|---|---|
| Wavelet Packet SVR: wavelet and parameters | bior2.6, ε = 0.1, C = 1 | rbio2.8, ε = 0.1, C = 100 | rbio2.8, ε = 0.001, C = 100 | rbio6.8, ε = 0.01, C = 100 |
| Wavelet Packet SVR: MAPE (%) | 12.35 | 13.1 | 13.66 | 14.74 |
| SVR Predictor: parameters | ε = 0.01, C = 10 | ε = 0.01, C = 100 | ε = 0.1, C = 1 | ε = 0.1, C = 100 |
| SVR Predictor: MAPE (%) | 12.57 | 13.5 | 13.96 | 15.06 |

Table 5. Comparison of Pearson product-moment correlation between SVR and SVR with wavelet decomposed inputs.

| Prediction Method | 45 min | 50 min | 55 min | 60 min |
|---|---|---|---|---|
| Wavelet Packet SVR: wavelet and parameters | bior2.6, ε = 0.1, C = 1 | bior6.8, ε = 0.01, C = 100 | coif5, ε = 0.1, C = 100 | db6, ε = 0.001, C = 100 |
| Wavelet Packet SVR: correlation | 0.8767 | 0.8441 | 0.8623 | 0.8486 |
| SVR Predictor: parameters | ε = 0.01, C = 100 | ε = 0.1, C = 100 | ε = 0.1, C = 10 | ε = 0.1, C = 10 |
| SVR Predictor: correlation | 0.8702 | 0.8498 | 0.8381 | 0.8406 |

7. Summary of Results

The proposed wavelet packet decomposed SVR method showed improved results for travel-time data prediction over the conventional SVR method for prediction horizons of 45 minutes and above. For accurate state estimation through machine learning methods large datasets are
A. YUSUF, V. K. MADISETTI 229 PeMS LAN (100 Mbps) D3 ATM (45 Mbps) AT M Link FTP Session CALTRANS TMC CALTRANS TMC CALTRANS TMC TRANSACCT Ethernet/Router Cloud Services Traveltime Prediction Service Automobiles Head Unit / devices / Tablet / PC Mobile Flat Files Traffic DB Predefined SQL Queries Wavelet Packet Decomposition Support Vector ession Regr Predicte Travel Tim d es Travel Times using other methods Other Intelner Applications ligent Trasportation Cloud Svices Prediction Horizon CALTRANS WAN Figure 8. Propo for ATIS. sed configuration for traveltime prediction Fig traveltime by Support Vector Regression and Wavelet rt Vector Regression methods. needed, many of wle T iple methods would require significant computation cost and stowhich, we have posed an alternate framework with a cloud component, r the memory and computation requirements. We pro ucted coeffi ci tterns, some examples are by con by day of the week or both. so makes it a viable option , Vol. /TITS.2004.833765 ure 9. Comparison of actual travel time, and predicted decomposed Suppo hich, are now availab online.heir training with mult rage, for pro which could be scaled horizontally or vertically to cater fo posed a modular prediction method, where multiple pre diction algorithms are stored in the cloud and the best performing algorithm is selected based on the prediction horizon. We also investigated wavelet properties in con junction with their effectiveness for support vector ma chines. We observed that wavelet basis, whose cross correlation between the wavelet reconstr ents of successive windows resulted in a linear correla tion of +1.0 or the ones with recurrent relationships are not useful for WDSVR model and should be discarded to reduce the computation cost. In our dataset it reduced computational cost by 21.43%. 
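The first discard criterion in the summary (a linear correlation of +1.0 between successive windows of the reconstructed coefficients) can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names, the window length, and the tolerance `tol` are hypothetical, the input is assumed to be an already reconstructed wavelet packet series for one candidate basis, and the separate "recurrent relationship" criterion is not covered.

```python
import math

def pearson(x, y):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return float("nan")  # correlation is undefined for a constant window
    return sxy / math.sqrt(sxx * syy)

def successive_window_correlations(series, window):
    """Correlate each non-overlapping window with the window that follows it."""
    starts = range(0, len(series) - 2 * window + 1, window)
    return [
        pearson(series[s:s + window], series[s + window:s + 2 * window])
        for s in starts
    ]

def keep_wavelet(series, window, tol=1e-6):
    """Return False when every successive-window correlation is (numerically) +1.0.

    Hypothetical rule of thumb: such a basis is treated as uninformative and
    skipped before any SVR training. Series shorter than two windows are kept,
    because there is nothing to compare.
    """
    corrs = [c for c in successive_window_correlations(series, window)
             if not math.isnan(c)]
    degenerate = bool(corrs) and all(c >= 1.0 - tol for c in corrs)
    return not degenerate
```

Running this check before training is what saves computation: a discarded basis never reaches the SVR stage. As a side calculation, a 21.43% saving over 42 equally weighted configurations would correspond to 9 discarded configurations (9/42 ≈ 0.2143); the text does not state that count, so it is only an inference.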
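The modular idea of choosing the best-performing algorithm per prediction horizon can be illustrated with a small lookup built from the MAPE values in Table 4. The sketch below is only an illustration: the dictionary layout, the method keys, and `best_method` are hypothetical names, and a deployed service would presumably refresh these scores from validation runs rather than hard-code them.

```python
# MAPE (%) by prediction horizon in minutes, copied from Table 4.
# The keys "wavelet_packet_svr" and "svr" are illustrative labels.
MAPE_BY_HORIZON = {
    45: {"wavelet_packet_svr": 12.35, "svr": 12.57},
    50: {"wavelet_packet_svr": 13.10, "svr": 13.50},
    55: {"wavelet_packet_svr": 13.66, "svr": 13.96},
    60: {"wavelet_packet_svr": 14.74, "svr": 15.06},
}

def best_method(horizon_min, scores=MAPE_BY_HORIZON):
    """Return the method with the lowest MAPE for a tabulated horizon."""
    if horizon_min not in scores:
        # No scores are tabulated for this horizon, so refuse to guess.
        raise KeyError(f"no validation scores for a {horizon_min}-minute horizon")
    by_method = scores[horizon_min]
    return min(by_method, key=by_method.get)
```

Horizons below 45 minutes are left out on purpose: the text reports that classical SVR was more accurate there but tabulates no values for them here.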
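For reference, the three evaluation indicators used in the comparison (MAPE, RMSE and the Pearson product-moment correlation) are commonly defined as follows. These are the standard textbook forms, given as an assumption because the exact expressions are not restated in this part of the paper; the symbols $y_i$ (actual travel time), $\hat{y}_i$ (predicted travel time), $n$ (number of test samples) and the sample means $\bar{y}$, $\bar{\hat{y}}$ are introduced here for illustration only.

$$
\mathrm{MAPE} = \frac{100}{n}\sum_{i=1}^{n}\left|\frac{y_i-\hat{y}_i}{y_i}\right|,
\qquad
\mathrm{RMSE} = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(y_i-\hat{y}_i\right)^2}
$$

$$
r = \frac{\sum_{i=1}^{n}\left(y_i-\bar{y}\right)\left(\hat{y}_i-\bar{\hat{y}}\right)}
{\sqrt{\sum_{i=1}^{n}\left(y_i-\bar{y}\right)^2}\;\sqrt{\sum_{i=1}^{n}\left(\hat{y}_i-\bar{\hat{y}}\right)^2}}
$$

Under these definitions, lower MAPE and RMSE and a higher $r$ indicate closer tracking of the actual travel time.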
Further improvements to our model might be made possible by subdividing the dataset based on its patterns; some examples are by congested and free-flow parts, by day of the week, or both. The scalability of the model also makes it a viable option for its application to calculate arterial travel times.

REFERENCES

[1] J. Rice and E. Van Zwet, “A Simple and Effective Method for Predicting Travel Times on Freeways,” IEEE Transactions on Intelligent Transportation Systems, Vol. 5, No. 3, 2004, pp. 200-207. doi:10.1109/TITS.2004.833765
[2] C. M. Kuchipudi and S. I. J. Chien, “Development of a Hybrid Model for Dynamic Travel-Time Prediction,” Transportation Data Research: Planning and Administration, Washington DC, 2003, pp. 22-31.
[3] H. van Lint, “Reliable Travel Time Prediction for Freeways,” Ph.D. Thesis, TU Delft, Delft, 2004.
[4] M. Chen and S. I. Chien, “Dynamic Freeway Travel-Time Prediction with Probe Vehicle Data: Link Based versus Path Based,” Transportation Research Record: Journal of the Transportation Research Board, Vol. 1768, 2001, pp. 157-161. doi:10.3141/1768-19
[5] C. H. Wu, J. M. Ho and D. Lee, “Travel-Time Prediction with Support Vector Regression,” IEEE Transactions on Intelligent Transportation Systems, Vol. 5, No. 4, 2004, pp. 276-281. doi:10.1109/TITS.2004.837813
[6] J. Kwon, B. Coifman and P. Bickel, “Day-to-Day Travel Time Trends and Travel-Time Prediction from Loop Detector Data,” Transportation Research Record: Journal of the Transportation Research Board, Vol. 1717, 2000, pp. 120-129. doi:10.3141/1717-15
[7] A. Dharia and H. Adeli, “Neural Network Model for Rapid Forecasting of Freeway Link Travel Time,” Engineering Applications of Artificial Intelligence, Vol. 16, No. 7-8, 2003, pp. 607-613. doi:10.1016/j.engappai.2003.09.011
[8] D. Park, L. R. Rilett and G. Han, “Spectral Basis Neural Networks for Real-Time Travel Time Forecasting,” Journal of Transportation Engineering, Vol. 125, No. 6, 1999, pp. 515-523. doi:10.1061/(ASCE)0733-947X(1999)125:6(515)
[9] D. J. Park and L. R. Rilett, “Forecasting Multiple-Period Freeway Link Travel Times Using Modular Neural Networks,” In: Land Use and Transportation Planning and Programming Applications, 1998, pp. 163-170.
[10] L. R. Rilett and D. Park, “Direct Forecasting of Freeway Corridor Travel Times Using Spectral Basis Neural Networks,” Transportation Research Record: Journal of the Transportation Research Board, Vol. 1752, 2001, pp. 140-147.
[11] C. Cortes and V. Vapnik, “Support-Vector Networks,” Machine Learning, Vol. 20, No. 3, 1995, pp. 273-297. doi:10.1007/BF00994018
[12] V. N. Vapnik, “The Nature of Statistical Learning Theory,” Springer Verlag, Berlin, 2000. doi:10.1007/978-1-4757-3264-1
[13] V. Vapnik, S. E. Golowich and A. Smola, “Support Vector Method for Function Approximation, Regression Estimation, and Signal Processing,” Advances in Neural Information Processing Systems, Vol. 9, 1997, pp. 281-287.
[14] T. B. Trafalis and H. Ince, “Support Vector Machine for Regression and Applications to Financial Forecasting,” IJCNN 2000, Proceedings of the IEEE International Joint Conference on Neural Networks, Como, 27 July 2000, pp. 348-353.
[15] M. Song, C. M. Breneman, J. Bi, N. Sukumar, K. P. Bennett, S. Cramer and N. Tugcu, “Prediction of Protein Retention Times in Anion-Exchange Chromatography Systems Using Support Vector Regression,” Journal of Chemical Information and Computer Sciences, Vol. 42, No. 6, 2002, pp. 1347-1357. doi:10.1021/ci025580t
[16] F. Wang, G. Tan and Y. Fang, “Multiscale Wavelet Support Vector Regression for Traffic Flow Prediction,” 3rd International Symposium on Intelligent Information Technology Application (IITA 2009), Nanchang, 21-22 November 2009, pp. 319-322.
[17] S. Yao, C. Hu and W. Peng, “Server Load Prediction Based on Wavelet Packet and Support Vector Regression,” 2006 International Conference on Computational Intelligence and Security, Guangzhou, 3-6 November 2006, pp. 1016-1019.
[18] M. Faisal and A. Mohamed, “A New Technique to Predict the Sources of Voltage Sags using Support Vector Regression based S-Transform,” IASTED, International Conference on Modelling, Simulation, and Identification/658: Power and Energy Systems/660, 661, 662, Beijing, 2009.
[19] H. van Lint, S. P. Hoogendoorn and H. J. van Zuylen, “State Space Neural Networks for Freeway Travel Time Prediction,” In: J. R. Dorronsoro, Ed., Artificial Neural Networks—ICANN 2002, Madrid, 28-30 August 2002, pp. 1043-1048.
[20] S. Innamaa, “Short-Term Prediction of Travel Time Using Neural Networks on an Interurban Highway,” Transportation, Vol. 32, No. 6, 2005, pp. 649-669. doi:10.1007/s11116-005-0219-y
[21] J. W. C. van Lint, “Reliable Real-Time Framework for Short-Term Freeway Travel Time Prediction,” Journal of Transportation Engineering, Vol. 132, No. 12, 2006, pp. 921-932. doi:10.1061/(ASCE)0733-947X(2006)132:12(921)
[22] N. Zou, J. W. Wang and G. L. Chang, “A Reliable Hybrid Prediction Model for Real-Time Travel Time Prediction with Widely Spaced Detectors,” 11th International IEEE Conference on Intelligent Transportation Systems, Beijing, 12-15 October 2008, pp. 91-96.
[23] N. Zou, J. W. Wang, G. L. Chang and J. Paracha, “Application of Advanced Traffic Information Systems Field Test of a Travel-Time Prediction System with Widely Spaced Detectors,” Transportation Research Record: Journal of the Transportation Research Board, Vol. 2129, 2009, pp. 62-72. doi:10.3141/2129-08
[24] J. W. C. van Lint, S. P. Hoogendoorn and H. J. van Zuylen, “Accurate Freeway Travel Time Prediction with State-Space Neural Networks under Missing Data,” Transportation Research Part C: Emerging Technologies, Vol. 13, No. 5-6, 2005, pp. 347-369. doi:10.1016/j.trc.2005.03.001
[25] H. J. M. Van Grol, M. Danech-Pajouh, S. Manfredi and J. Whittaker, “DACCORD: On-Line Travel Time Prediction,” World Transport Research: Selected Proceedings of the 8th World Conference on Transport Research, Pergamon, Oxford, 1999.
[26] H. Chen, M. S. Dougherty and H. R. Kirby, “The Effects of Detector Spacing on Traffic Forecasting Performance Using Neural Networks,” Computer-Aided Civil and Infrastructure Engineering, Vol. 16, No. 6, 2001, pp. 422-430. doi:10.1111/0885-9507.00244
[27] P. Yi, S. Ding, H. Wei and G. W. Saylor, “Investigating the Effect of Detector Spacing on Midpoint-Based Travel Time Estimation,” Journal of Intelligent Transportation Systems, Vol. 13, No. 3, 2009, pp. 149-159.
[28] J. Kwon, K. Petty and P. Varaiya, “Probe Vehicle Runs or Loop Detectors?: Effect of Detector Spacing and Sample Size on Accuracy of Freeway Congestion Monitoring,” Transportation Research Record: Journal of the Transportation Research Board, Vol. 2012, 2007, pp. 57-63. doi:10.3141/2012-07
[29] C. Chen, J. Kwon, J. Rice, A. Skabardonis and P. Varaiya, “Detecting Errors and Imputing Missing Data for Single Loop Surveillance Systems,” Transportation Research Record: Journal of the Transportation Research Board, Vol. 1855, 2003, pp. 160-167. doi:10.3141/1855-20
[30] L. N. Jacobson, N. L. Nihan and J. D. Bender, “Detecting Erroneous Loop Detector Data in a Freeway Traffic Management System,” Transportation Research Record, Washington DC, 1990.
[31] C. D. R. Lindveld, R. Thijs, P. H. L. Bovy and N. J. Van der Zijpp, “Evaluation of Online Travel Time Estimators and Predictors,” Transportation Research Record: Journal of the Transportation Research Board, Vol. 1719, 2000, pp. 45-53. doi:10.3141/1719-06
[32] J. W. C. van Lint and N. Van der Zijpp, “Improving a Travel-Time Estimation Algorithm by Using Dual Loop Detectors,” Transportation Research Record: Journal of the Transportation Research Board, Vol. 1855, 2003, pp. 41-48. doi:10.3141/1855-05
[33] M. Saito and T. Watanabe, “Prediction and Dissemination System for Travel Time Utilizing Vehicle Detectors,” Steps Forward: Intelligent Transport Systems World Congress, Yokohama, 9-11 November 1995, p. 106.
[34] I. Steven, J. Chien and C. M. Kuchipudi, “Dynamic Travel Time Prediction with Real-Time and Historic Data,” Journal of Transportation Engineering, Vol. 129, No. 6, 2003, p. 608. doi:10.1061/(ASCE)0733-947X(2003)129:6(608)
[35] X. Zhang and J. A. Rice, “Short-Term Travel Time Prediction,” Transportation Research Part C: Emerging Technologies, Vol. 11, No. 3-4, 2003, pp. 187-210. doi:10.1016/S0968-090X(03)00026-3
[36] H. Sun, H. X. Liu, H. Xiao, R. R. He and B. Ran, “Use of Local Linear Regression Model for Short-Term Traffic Forecasting,” Transportation Research Record: Journal of the Transportation Research Board, Vol. 1836, 2003, pp. 143-150. doi:10.3141/1836-18
[37] M. S. Ahmed and A. R. Cook, “Analysis of Freeway Traffic Time-Series Data by Using Box-Jenkins Techniques,” Transportation Research Record, No. 722, 1979, pp. 1-9.
[38] M. Levin and Y. D. Tsao, “On Forecasting Freeway Occupancies and Volumes (Abridgment),” Transportation Research Record, No. 722, 1980, pp. 47-49.
[39] T. Oda, “An Algorithm for Prediction of Travel Time Using Vehicle Sensor Data,” Third International Conference on Road Traffic Control, London, 1-3 May 1990, pp. 40-44.
[40] M. P. D’Angelo, H. M. Al-Deek and M. C. Wang, “Travel-Time Prediction for Freeway Corridors,” Transportation Research Record: Journal of the Transportation Research Board, Vol. 1676, 1999, pp. 184-191. doi:10.3141/1676-23
[41] S. Ishak and H. Al-Deek, “Performance Evaluation of Short-Term Time-Series Traffic Prediction Model,” Journal of Transportation Engineering, Vol. 128, No. 6, 2002, pp. 490-498. doi:10.1061/(ASCE)0733-947X(2002)128:6(490)
[42] H. F. Ji, A. G. Xu, X. Sui and L. Y. Li, “The Applied Research of Kalman in the Dynamic Travel Time Prediction,” Geoinformatics, 2010 18th International Conference on, Beijing, 18-20 June 2010, pp. 1-5.
[43] J. W. C. van Lint, S. P. Hoogendoorn and H. J. van Zuylen, “Freeway Travel Time Prediction with State Space Neural Networks-Modeling State-Space Dynamics with Recurrent Neural Networks,” Advanced Traffic Management Systems for Freeways and Traffic Signal Systems 2002: Highway Operations, Capacity, and Traffic Control, No. 1811, 2002, pp. 30-39.
[44] C. P. I. van Hinsbergen, J. W. C. van Lint and H. J. van Zuylen, “Bayesian Training and Committees of State Space Neural Networks for Online Travel Time Prediction,” Transportation Research Record, Vol. 2105, 2009, pp. 118-126. doi:10.3141/2105-15
[45] D. Park, L. R. Rilett and G. H. Han, “Forecasting Multiple-Period Freeway Link Travel Times Using Neural Networks with Expanded Input Nodes,” Applications of Advanced Technologies in Transportation, No. 1617, 1998, pp. 325-332.
[46] L. Vanajakshi and L. R. Rilett, “Support Vector Machine Technique for the Short Term Prediction of Travel Time,” 2007 IEEE Intelligent Vehicles Symposium, Istanbul, 13-15 June 2007, pp. 600-605.
[47] M. Ben-Akiva, M. Bierlaire, H. Koutsopoulos and R. Mishalani, “DynaMIT: A Simulation-Based System for Traffic Prediction,” DACCORD Short Term Forecasting Workshop, Delft, 1998.
[48] M. Ben-Akiva, M. Bierlaire, D. Burton, H. N. Koutsopoulos and R. Mishalani, “Network State Estimation and Prediction for Real-Time Traffic Management,” Networks and Spatial Economics, Vol. 1, No. 3-4, 2001, pp. 293-318. doi:10.1023/A:1012883811652
[49] S. G. Mallat, “A Wavelet Tour of Signal Processing,” Academic Press, Edinburgh, 1999.
[50] M. Vetterli and J. Kovačević, “Wavelets and Subband Coding,” Prentice Hall PTR, Upper Saddle River, 1995.
