Based on ODE and ARIMA Modeling for the population of China

doi:10.4236/ti.2013.41B006


Technology and Investment, 2013, 4, 27-30

Published Online February 2013 (http://www.SciRP.org/journal/ti)


Xiaohua Hu, Min Yu

School of Mathematics and Statistics, Hainan Normal University, Haikou, China

Email: 1241957415@qq.com

Received 2012

ABSTRACT

Economic data can usually be decomposed into a deterministic part and a random part. From a purely mathematical point of view, we establish an ordinary differential equation (ODE) model for the deterministic part via the principle of integration and differencing, and an ARIMA$(p, d, q)$ model for the random part. The two models are then combined to predict and control the original series. We apply the method to the population of China from 1978 to 2007, establish the corresponding mathematical model, and obtain a forecast of the population of China in 2008 (1.3879503 billion). Finally, we carry out a stability analysis.

Keywords: Natural Asset; Financial Value; Neural Network

1. Introduction

Many time series (such as economic data) can be regarded as a discretization of a continuous-time process. If a time series $Y_t$ has a biological background, we can establish a differential equation model for $Y_t$ in accordance with its growth rate. For example, let the constant $L$ be the growth limit of the variable $Y$. If the growth rate $dY/dt$ is proportional to $L - Y$, or if the relative growth rate $(dY/dt)/Y$ is proportional to $(L - Y)/L$ or to $\ln L - \ln Y$, then simple growth models can be established, such as the Logistic model and the Gompertz model [1]. These are called growth mechanism models.
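As a worked illustration of the two named models, the relative-growth-rate assumptions can be integrated in closed form. The rate constant $k > 0$ and the initial value $Y_0 = Y(0)$ are symbols introduced here for the sketch; they do not appear in the original text.

```latex
% Logistic model: relative growth rate proportional to (L - Y)/L
\frac{1}{Y}\frac{dY}{dt} = k\,\frac{L - Y}{L}
\quad\Longrightarrow\quad
Y(t) = \frac{L}{1 + \dfrac{L - Y_0}{Y_0}\, e^{-kt}}

% Gompertz model: relative growth rate proportional to ln L - ln Y
\frac{1}{Y}\frac{dY}{dt} = k\,(\ln L - \ln Y)
\quad\Longrightarrow\quad
Y(t) = L \exp\!\left(-\ln\frac{L}{Y_0}\; e^{-kt}\right)
```

In both cases $Y(0) = Y_0$ and $Y(t) \to L$ as $t \to +\infty$, which is why $L$ is called the growth limit.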

In general, a time series $Q_t$, $t = 0, 1, \ldots, n$, can be decomposed into two parts [2]: a deterministic part $F_t$ and a random part $h_t$, $t = 0, 1, \ldots, n$, or

$$Q_t = F_t + h_t, \quad t = 0, 1, \ldots, n, \tag{1}$$

where $F_t$ is usually described by trend, seasonal and cyclic terms, and $h_t$ is described by a more complex stochastic approach. In this paper, we ignore the specific economic background of the variable and, from a purely mathematical point of view, establish a differential equation model for the discrete-time series $F_t$ under some conditions, while $h_t$ is described by an ARIMA$(p, d, q)$ model [3].

2. Principles and Technique of Differential Equations Modeling

Suppose that $y(t)$, $0 \le t < +\infty$, is a continuous function and that $y_t$ is regarded as the discretized value of $y(t)$. Set

$$X(t) = \int y(t)\,dt \approx \sum y(t)\,\Delta t, \qquad X'(t) = y(t) \approx \frac{X(t + \Delta t) - X(t)}{\Delta t},$$

when $\Delta t$ is small. Now consider the discrete case (a time series) and let $\Delta t = 1$ (unit time). It follows that

$$X(t) \approx \sum y(t), \qquad X'(t) = y(t) \approx X(t+1) - X(t).$$

Write $y_t = y(t)$, $X_t = X(t)$. Given an original time series $y_t = y(t)$, $t = 0, 1, \ldots, n$, its cumulative sum series is generated as $X_t = \sum_{i=0}^{t} y_i$, $t = 0, 1, \ldots, n$. Suppose that we can find a relationship between the original series $y_t$ and its cumulative sum series $X_t$ of the form

$$F(y_t, X_t, \varepsilon_t) = 0, \quad t = 0, 1, \ldots, n,$$

where $\varepsilon_t$ is a residual term, a random variable satisfying $E(\varepsilon_t) = 0$, $\operatorname{Var}(\varepsilon_t) = \sigma^2$, $\sigma > 0$. We now view $y(t)$, $X(t)$ as continuous. Because $X'(t) = y(t)$, ignoring $\varepsilon_t$, the corresponding first-order differential equation model can be established as follows:

$$F(X'(t), X(t)) = 0 \quad \text{or} \quad X'(t) = f(X(t)). \tag{2}$$

We only need to solve (2) for $X(t)$; then

$$y_t = y(t) = X'(t) \approx X(t+1) - X(t), \quad y_0 = y(0) = X(0), \quad t = 1, \ldots, m, \ m \ge n$$

(where $m$ is a positive integer).
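The accumulation step can be sketched in a few lines of Python. This is a minimal illustration on a hypothetical toy series, not the population data; the helper names `cumulative_sum` and `recover_original` are made up for this sketch. Note that with $X_0 = y_0$ the exact discrete inverse is the backward difference $y_t = X_t - X_{t-1}$, while the forward difference $X(t+1) - X(t)$ used in the text is the unit-step approximation of $X'(t)$.

```python
import numpy as np

def cumulative_sum(y):
    """Return X with X_t = y_0 + y_1 + ... + y_t, so that X_0 = y_0."""
    return np.cumsum(np.asarray(y, dtype=float))

def recover_original(X):
    """Invert the accumulation: y_0 = X_0 and y_t = X_t - X_{t-1} for t >= 1."""
    X = np.asarray(X, dtype=float)
    return np.concatenate(([X[0]], np.diff(X)))

# Hypothetical toy series (assumption for illustration only).
y = np.array([3.0, 5.0, 4.0, 6.0])
X = cumulative_sum(y)   # first cumulative sum
Z = cumulative_sum(X)   # second cumulative sum
y_back = recover_original(recover_original(Z))  # undo both accumulations
```

Applying `recover_original` once per accumulation returns the original series, which is the discrete counterpart of differentiating the solved $X(t)$ or $Z(t)$.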

X. HU, M. YU

In the same way, $X_t = X(t)$, $t = 0, 1, \ldots, n$, is viewed as a new time series, and we can generate its cumulative sum series $Z(t) = \sum X(t)$, that is, the second cumulative sum of $y_t$. Write $Z_t = Z(t)$, so that $Z'(t) = X(t) \approx Z(t+1) - Z(t)$. We now look for a relationship among the three series $y_t, X_t, Z_t$. Suppose that there exists a relationship of the form

$$F(y_t, X_t, Z_t, \varepsilon_t) = 0, \quad t = 0, 1, \ldots, n,$$

where $\varepsilon_t$ is the same as before. We now view $y(t)$, $X(t)$ and $Z(t)$ as continuous. Because $Z''(t) = X'(t) = y(t)$, ignoring $\varepsilon_t$, the second-order differential equation model can be established:

$$F(Z''(t), Z'(t), Z(t)) = 0 \quad \text{or} \quad Z''(t) = f(Z'(t), Z(t)). \tag{3}$$

We only need to solve (3) for $Z(t)$; then

$$X_t = X(t) = Z'(t) \approx Z(t+1) - Z(t),$$

$$y_t = y(t) = X'(t) \approx X(t+1) - X(t),$$

$$y(0) = X(0) = Z(0), \quad t = 1, \ldots, m, \ m \ge n.$$

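In general, suppose that the original time series is $X^{(0)}(t)$. If $X^{(0)}(t)$ is not purely random data, that is, if its change per unit time is not random, then the data have a trend. Its first cumulative sum series is $X^{(1)}(t)$, ..., and its $p$-th cumulative sum series is $X^{(p)}(t)$, $t = 1, \ldots, n$, where $p$ is a positive integer. Suppose that we can find a relationship among $X^{(0)}(t), X^{(1)}(t), \ldots, X^{(p)}(t)$ of the form

$$F\left(X^{(0)}(t), X^{(1)}(t), \ldots, X^{(p)}(t), \varepsilon_t\right) = 0, \quad t = 0, 1, \ldots, n,$$

where $\varepsilon_t$ is the same as before. Ignoring $\varepsilon_t$, the $p$-th order differential equation model can be established:

$$F\left(\frac{d^{p} X^{(p)}(t)}{dt^{p}}, \frac{d^{p-1} X^{(p)}(t)}{dt^{p-1}}, \ldots, \frac{d X^{(p)}(t)}{dt}, X^{(p)}(t)\right) = 0$$

or

$$\frac{d^{p} X^{(p)}(t)}{dt^{p}} = f\left(\frac{d^{p-1} X^{(p)}(t)}{dt^{p-1}}, \ldots, \frac{d X^{(p)}(t)}{dt}, X^{(p)}(t)\right).$$

We only need to solve for $X^{(p)}(t)$; then

$$X^{(i)}(t) \approx X^{(i+1)}(t+1) - X^{(i+1)}(t), \quad i = 0, 1, \ldots, p-1, \quad t = 1, \ldots, m, \ m \ge n,$$

$$X^{(0)}(0) = X^{(1)}(0) = \cdots = X^{(p)}(0).$$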

It is usually difficult to find $F$ or $f$, but we can consider establishing a multiple linear (or nonlinear) regression model. Given a significance level $\alpha$ (for example, $\alpha = 0.05$), if the regression equation passes the significance test, we obtain the corresponding differential equation at the $1 - \alpha$ confidence level, which indicates how reasonable the established differential equation is. The adjusted R-squared ($\bar{R}^2$), or goodness of fit, describes the quality of the fit. We call $F = 0$ the main model. For the residual series $\varepsilon_t$, we test whether or not it is random. If it is not a purely random series (such as a white-noise series [4]), some valuable information remains hidden in the residual series and should be extracted by means of the B-J method [5]. In that case we establish a model for the residual series, called the auxiliary model. Finally, we combine the main model and the auxiliary model to forecast.
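The regression-then-residual-check workflow described above can be sketched as follows. This is a hedged illustration on synthetic data, using NumPy rather than the EViews software used in the paper; the function names `fit_quadratic` and `durbin_watson` and the synthetic series are assumptions made for the example. The Durbin-Watson statistic is the standard $\sum (e_t - e_{t-1})^2 / \sum e_t^2$.

```python
import numpy as np

def fit_quadratic(R, Z):
    """Least-squares fit of Z = c1 + c2*R + c3*R**2; returns (coefficients, residuals)."""
    R = np.asarray(R, dtype=float)
    Z = np.asarray(Z, dtype=float)
    A = np.column_stack([np.ones_like(R), R, R**2])  # design matrix [1, R, R^2]
    coef, *_ = np.linalg.lstsq(A, Z, rcond=None)
    return coef, Z - A @ coef

def durbin_watson(resid):
    """Durbin-Watson statistic; values near 2 suggest no first-order autocorrelation."""
    resid = np.asarray(resid, dtype=float)
    return np.sum(np.diff(resid) ** 2) / np.sum(resid ** 2)

# Synthetic illustration only: a known quadratic plus a small alternating disturbance.
R = np.linspace(11.0, 15.0, 19)
noise = 0.01 * (-1.0) ** np.arange(R.size)
Z = 2.0 - 0.3 * R + 0.01 * R**2 + noise

coef, resid = fit_quadratic(R, Z)
dw = durbin_watson(resid)  # far from 2 here, so the residuals are not white noise
```

A statistic far from 2 is the signal that an auxiliary model for the residual series is worth fitting.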

3. Empirical Analysis Application

We study the Chinese population data [6] from 1978 to 2007; the total number of sample observations is 21, see Table 1.

Table 1. Data of the population of China (unit: ten thousand)

Year        1978   1980   1985   1990   1991   1992   1993   1994   1995   1996   1997
Population  96259  9870…  1058…  1143…  1158…  1171…  1185…  1198…  1211…  1223…  1236…

Year        1998   1999   2000   2001   2002   2003   2004   2005   2006    2007
Population  1247…  1257…  1267…  1276…  1284…  1292…  1299…  1307…  131448  132129

We use the sample data from 1978 to 2005 to build the model, leaving the two observations for 2006 and 2007 as a reference for assessing the accuracy of the short-term forecast of the established model.

3.1. Analyse the Relationship Between the Original Series and its Cumulative Series

Let the original sample time series be $y_t = y(t)$, $t = 0, 1, \ldots, 20$. The time range used for modeling is from 1978 to 2005, that is, $t = 0, 1, \ldots, 18$. The first cumulative sum series of $y_t$ is $X = X(t) = X_t$; see fig.1 and fig.2. We generate new series $Z$, $R$ from $y$, $X$ as follows:

$$Z = Z(t) = \frac{y_t}{X_t}, \qquad R = R(t) = \ln X(t).$$

The scatter plot of $Z_t$ and $R_t$ is shown in fig.3. We establish the regression model

$$Z_t = c(1) + c(2) R_t + c(3) R_t^2. \tag{4}$$

Using the least-squares method in EViews 6.0, it is easy to obtain the estimates of the parameters $c(1), c(2), c(3)$; see fig.4 and Table 2:

$$Z_t = 22.19772 - 3.0877147\, R_t + 0.107731\, R_t^2. \tag{5}$$

Table 2. Results of estimation and tests (significance level $\alpha$ = 0.05)

Variable   Coefficient   Std. Error   t-Statistic   Prob.
c(1)       22.19772      1.334405     16.63491      0.0000
c(2)       -3.0877147    0.202384     -15.25674     0.0000
c(3)       0.107731      0.007644     14.09369      0.0000

The adjusted $R^2 = 0.986264$. The t-statistics of the three regression coefficients $c(1), c(2), c(3)$ in (4) are 16.64, -15.26 and 14.09, respectively, each with Prob = 0.0000 < 0.05, and the F-statistic is 647.2 with Prob = 0.0000 < 0.05. The coefficients and the regression equation as a whole therefore pass the significance tests. We conclude that the Chinese population data from 1978 to 2005 can be described by (5) at the 95% confidence level, with a goodness of fit of 98.6%. However, DW = 1.317 indicates autocorrelation in the residual series, which means that some valuable information has not yet been extracted from it, so we establish a model for the residual series.

fig.1: the original time series

fig.2: the cumulative sum series

fig.3: the scatter plot of $Z_t$ and $R_t$

fig.4: the fitting and residuals plot of $Z_t$ and $R_t$

3.2. Establish the Main Model: Differential Equations Model

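It follows from (2), (4) and (5) that

$$\frac{dX}{dt} = X'(t) = \left[c(1) + c(2)\ln X(t) + c(3)\ln^2 X(t)\right] X(t), \tag{6}$$

$$\frac{dX}{dt} = X'(t) = \left[22.198 - 3.088\ln X(t) + 0.108\ln^2 X(t)\right] X(t). \tag{7}$$

By the formula [7], valid when $4ac - b^2 > 0$,

$$\int \frac{du}{a + bu + cu^2} = \frac{2}{\sqrt{4ac - b^2}} \arctan\frac{2cu + b}{\sqrt{4ac - b^2}} + C,$$

it is easy to obtain from (6), with $u = \ln X$,

$$\frac{2}{\sqrt{4c(1)c(3) - c(2)^2}} \arctan\frac{2c(3)\ln X + c(2)}{\sqrt{4c(1)c(3) - c(2)^2}} + C = t,$$

$$X = \exp\left\{\frac{1}{2c(3)}\left[-c(2) + \sqrt{4c(1)c(3) - c(2)^2} \times \tan\left(\frac{\sqrt{4c(1)c(3) - c(2)^2}}{2}\,(t - C)\right)\right]\right\}.$$

For (7) this gives

$$X = \exp\left\{14.296 + 1.074\tan\left(\frac{t}{8.624} + C_1\right)\right\}.$$

With $t = 0$, $X(0) = 96259$, we obtain $C_1 = -69.519$, so

$$X = \exp\left(14.296 + 1.074\tan\left(\frac{t}{8.624} - 69.519\right)\right), \tag{8}$$

$$X' = 0.125\sec^2\left(\frac{t}{8.624} - 69.519\right)\exp\left(14.296 + 1.074\tan\left(\frac{t}{8.624} - 69.519\right)\right). \tag{9}$$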
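The numerical constants in (8) and (9) can be cross-checked against the rounded coefficients printed in (7). The short sketch below (variable names are illustrative, not from the paper) applies the substitution $u = \ln X$ and the arctan formula, and reproduces 8.624, 14.296, 1.074 and 0.125 to within rounding in the third decimal place.

```python
import math

# Rounded coefficients as printed in equation (7).
c1, c2, c3 = 22.198, -3.088, 0.108

disc = 4 * c1 * c3 - c2 ** 2           # 4ac - b^2, must be > 0 for the arctan formula
root = math.sqrt(disc)

time_scale = 2 / root                  # t is divided by this inside tan(...)
offset = -c2 / (2 * c3)                # constant term in the exponent
amplitude = root / (2 * c3)            # multiplier of tan(...)
sec2_factor = amplitude / time_scale   # factor in front of sec^2 in the derivative

print(round(time_scale, 3), round(offset, 3), round(amplitude, 3), round(sec2_factor, 3))
```

The check depends only on the coefficients in (7); it does not use the initial condition or the integration constant.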

3.3. Establish the Auxiliary Model for Residual Series

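Let the residual series be $h_t$. We establish an ARIMA$(p, d, q)$ model for $h_t = h(t)$. Take the first-order and second-order differences of $h_t$:

$$\nabla h_t = h(t) - h(t-1) = h_t - h_{t-1}, \qquad \nabla^2 h_t = \nabla h_t - \nabla h_{t-1}.$$

Based on an analysis of the autocorrelation coefficients and partial autocorrelation coefficients of $h_t$ and of the second-order difference of $h_t$, we take $p = q = d = 2$ and establish an ARMA$(2, 2)$ model for $\nabla^2 h_t$:

$$\nabla^2 h_t = \varphi_1 \nabla^2 h_{t-1} + \varphi_2 \nabla^2 h_{t-2} + \varepsilon_t - \theta_1 \varepsilon_{t-1} - \theta_2 \varepsilon_{t-2}. \tag{10}$$

Since $\nabla^2 h_t = h_t - 2h_{t-1} + h_{t-2}$, expanding (10) gives

$$h_t = (2 + \varphi_1) h_{t-1} + (\varphi_2 - 2\varphi_1 - 1) h_{t-2} + (\varphi_1 - 2\varphi_2) h_{t-3} + \varphi_2 h_{t-4} + \varepsilon_t - \theta_1 \varepsilon_{t-1} - \theta_2 \varepsilon_{t-2}.$$

This is an ARIMA$(2, 2, 2)$ model for $h_t$, where $\varepsilon_t$ is a white-noise series. The parameters in (10) are estimated with EViews 6.0:

$$\hat{\varphi}_1 = 0.8436, \quad \hat{\varphi}_2 = 0.0328, \quad \hat{\theta}_1 = 0, \quad \hat{\theta}_2 = 0.9992,$$

so that

$$h_t = 2.8436\, h_{t-1} - 2.6544\, h_{t-2} + 0.7780\, h_{t-3} + 0.0328\, h_{t-4} + \varepsilon_t - 0.9992\, \varepsilon_{t-2}. \tag{11}$$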
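The lag coefficients on $h_{t-1}, \ldots, h_{t-4}$ follow from multiplying the AR polynomial $1 - \varphi_1 B - \varphi_2 B^2$ by the differencing polynomial $(1 - B)^2$, where $B$ is the backshift operator. A minimal sketch of that expansion, assuming the parameter estimates quoted in the text and using polynomial multiplication via `numpy.convolve`:

```python
import numpy as np

# AR estimates quoted in the text; the MA parameters do not affect the lag coefficients.
phi1, phi2 = 0.8436, 0.0328

ar_poly = np.array([1.0, -phi1, -phi2])   # 1 - phi1*B - phi2*B^2
diff2 = np.array([1.0, -2.0, 1.0])        # (1 - B)^2
full = np.convolve(ar_poly, diff2)        # coefficients of B^0 .. B^4

# Moving every lagged term to the right-hand side flips the signs.
lag_coeffs = -full[1:]                    # multipliers of h_{t-1}, ..., h_{t-4}
print(np.round(lag_coeffs, 4))
```

The same product gives the general expressions $2 + \varphi_1$, $\varphi_2 - 2\varphi_1 - 1$, $\varphi_1 - 2\varphi_2$ and $\varphi_2$ when carried out symbolically.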

3.4. Forecast Based on the Main and Auxiliary Model

By (8) and (11), for $t = 19, 20, 21$ the main model gives $F_t = 131190.5504,\ 136191.0595,\ 138795.0589$ and the auxiliary model gives $h_t = -0.007129,\ -0.0133,\ -0.01979$. By (1), the predicted population of China in 2006 and 2007 is therefore 1.3119054 and 1.3619104 billion, respectively, whereas the actual figures are 1.31448 and 1.32129 billion. The absolute errors are 2.58 and 40.62 million, and the relative errors are 0.2% and 3.1%, respectively. This shows that the combined model (1), (8) and (11) has high prediction accuracy; the predicted population of China in 2008 is 1.3879503 billion. We also find that the predicted value of the auxiliary model (11) has little impact on the total predicted value, which depends mainly on the main model (8), that is, on the predicted value of the differential equations model (7). Hence the short-term forecast accuracy of differential equations modeling for a time series can be very high under some conditions.
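The error figures quoted for 2006 and 2007 can be recomputed directly from the predicted and actual values given in the text. The sketch below is only that arithmetic; the helper name `forecast_errors` is hypothetical, and the absolute error is expressed in millions, which is the unit consistent with the quoted figures.

```python
# Predicted and actual population in billions, as quoted in the text.
predicted = {2006: 1.3119054, 2007: 1.3619104}
actual = {2006: 1.31448, 2007: 1.32129}

def forecast_errors(pred, obs):
    """Return (absolute error in millions, relative error in percent)."""
    abs_err = abs(pred - obs) * 1000.0     # billions -> millions
    rel_err = abs(pred - obs) / obs * 100.0
    return abs_err, rel_err

errors = {year: forecast_errors(predicted[year], actual[year]) for year in predicted}
for year, (abs_err, rel_err) in errors.items():
    print(year, round(abs_err, 2), round(rel_err, 2))
```

The recomputed values agree with the quoted errors up to rounding in the last digit.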

We further consider the stability of the equilibrium points of (6) or (7). Let

$$\frac{dX}{dt} = \left[c(1) + c(2)\ln X(t) + c(3)\ln^2 X(t)\right] X(t) = f(X) = 0.$$

Since $X \neq 0$, we have $c(1) + c(2)\ln X + c(3)\ln^2 X = 0$. When $c(2)^2 - 4c(3)c(1) > 0$, this quadratic equation in $u = \ln X$ has two real roots,

$$u_{1,2} = \frac{-c(2) \pm \sqrt{c(2)^2 - 4c(3)c(1)}}{2c(3)},$$

so there are two equilibrium points $X_1, X_2$, with $X_1 = \exp(u_1)$, $X_2 = \exp(u_2)$. On the other hand,

$$f'(X) = c(1) + c(2) + \left(c(2) + 2c(3)\right)\ln X + c(3)\ln^2 X.$$

When $c(2)^2 - 4c(3)c(1) > 0$,

$$f'(X_1) = \sqrt{c(2)^2 - 4c(3)c(1)}, \qquad f'(X_2) = -\sqrt{c(2)^2 - 4c(3)c(1)}, \qquad f'(X_1) > 0, \ f'(X_2) < 0.$$

The equilibrium point $X_2$ of (6) is stable, that is, $X \to X_2$ as $t \to +\infty$. The equilibrium point $X_1$ of (6) is unstable, that is, $X \to +\infty$ as $t \to +\infty$. So we must control the factors that affect $c(1), c(2), c(3)$ in (6) such that $c(2)^2 - 4c(3)c(1) > 0$; otherwise $X \to +\infty$ as $t \to +\infty$.

However, for model (7) we in fact have $c(2)^2 - 4c(3)c(1) < 0$, which shows that there is no equilibrium point in (7). It is obvious from (9) that as $t/8.624 - 69.519 \to \pi/2$, that is, $t \to 613$, we have $y \to +\infty$. This shows that under model (7) China's population would tend to infinity after 613 years, so model (7) is only suitable for short-term prediction.
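Both numerical statements in this paragraph are short arithmetic checks. The sketch below, with illustrative variable names, evaluates the discriminant for the rounded coefficients of (7) and the value of $t$ at which the tangent argument $t/8.624 - 69.519$ from (8) equals $\pi/2$; it takes the constants 8.624 and 69.519 from the text as given.

```python
import math

# Rounded coefficients as printed in equation (7).
c1, c2, c3 = 22.198, -3.088, 0.108

# Discriminant of c1 + c2*u + c3*u^2 = 0 with u = ln X.
disc = c2 ** 2 - 4 * c3 * c1
has_equilibrium = disc > 0      # real roots u exist only if disc > 0

# Value of t at which t/8.624 - 69.519 equals pi/2 (constants taken from (8)).
t_star = 8.624 * (math.pi / 2 + 69.519)

print(disc, has_equilibrium, round(t_star, 1))
```

A negative discriminant means the bracket in (7) never vanishes, so the right-hand side keeps one sign and no finite equilibrium exists.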

4. Acknowledgements

The authors are grateful to the anonymous referees for their helpful comments and suggestions.

REFERENCES

[1] Zhu Minhui. Fitting Gompertz Model and Logistic Model. Mathematics in Practice and Theory, 2003; 2: 705-709.

[2] Peter J. Brockwell et al. Time Series: Theory and Methods (2nd edn). China Higher Education Press Beijing and Springer-Verlag Berlin Heidelberg: Beijing, 2001; 75.

[3] Yi Danhui. Data Analysis and EViews Application. China Statistics Press: Beijing, 2002; 66-70.

[4] Zhang Shiying. The Financial Time Series Analysis. Tsinghua University Press: Beijing, 2008; 90-93.

[5] Philip Hans Franses. Time Series Models for Business and Economic Forecasting. Chinese People's University Press: Beijing, 2002.

[6] http://www.cpachn.org.cn/chinese/Teaching/Information.asp?, 2009-05-02.

[7] Tongji University Department of Applied Mathematics. Advanced Mathematics (5th edn). Higher Education Press: Beijing, 2004.