We start by analyzing stochastic dependence in the classic bivariate normal density framework. We focus on the way the conditional density of one of the random variables depends on realizations of the other. In the bivariate normal case this dependence takes the form of a parameter (here the “expected value”) of one probability density depending continuously (here linearly) on realizations of the other random variable. The point is that such a pattern need not be restricted to the classical case of the bivariate normal. We show that this paradigm can be generalized and viewed in ways that allow one to extend it far beyond the class of bivariate or multivariate normal probability distributions.
This paper can be viewed as an extension of our previous work (Filus and Filus [
It is a well-known fact that among existing multivariate probability distributions, only a few classes are widely and successfully applied in practical stochastic modeling. Typically, the underlying random variables are assumed either to be independent or to have an “approximately Gaussian” bivariate or multivariate distribution. Normality is often assumed even when the corresponding data hardly agree with that mathematical model (showing asymmetry, for example). On the other hand, of all the multivariate distributions used in applications, the normal seems to be “the best”. The reason is that Gaussian models capture the stochastic relationship (mainly through a regression function) between their marginal random variables in the most natural way. We first analyze and interpret the specific way the multivariate normal density of the random vector
Pursuing this method successively for
In Section 2, we analyze the stochastic dependences between the marginal random variables of the bivariate normal in order to point out the original version of the parameter dependence pattern, which is then extended to other constructed bivariate probability densities. The explanation, as well as the example of applications of the bivariate normal, is different from that in Filus and Filus [
In Section 7, we point out that the “method of parameter dependence” is also used in other areas of reliability theory, in situations different from those we consider. One such area is accelerated life testing theory, where the dependence of a life-time distribution’s parameter on a given (high) stress is investigated.
Another (fairly new) area is the “load optimization theory” sometimes associated with the load sharing phenomena analysis that we sketch in Subsection 7.2. The differences between these approaches and our theory are pointed out in 7.2.
We start with the following situation. Suppose the normally distributed random variable X2 describes an attribute of a physical or biological object, say u. Consider the (stochastic) behavior of the object u in two distinct “physical” situations. In the first situation, u is exposed to some random stress whose magnitude is described by a normally distributed random variable X1. In the second situation we assume that no such stress is present or that the stress takes on a fixed predetermined value. The usual task here is to determine the joint distribution of X1, X2. Let the densities of X1, X2 be normal, i.e.,
Imagine the following fictitious experiment whose goal is to establish the possible stochastic impact of a change in a medication’s dose on the results of some cancer treatment. Suppose a person of a certain fixed age was diagnosed with a kind of cancer. Assume that one of the significant characteristics of that kind of cancer is a tumor of size X2. During a given time period T after the patient was diagnosed, a specific medication was administered. Also suppose that this medication was routinely administered in the past, and that the average dose is estimated (or fixed) to be m1 milligrams per kilogram of body weight daily. Assume that, originally, the known (either measured or estimated) average size of the tumor is, say, m2 millimeters and after the period T of treatment the tumor size X2 is measured again and its negative or positive increment
We assume that the goal of the underlying experiment is to make a prediction on effect
To find the joint probability distribution of X1, X2 it suffices to determine the conditional density g2(x2|x1) of X2|X1, since the marginal density of X1 does not change.
In accordance with the “linear regression rule”, the dependence of the (new) expected value
where a = r(σ2/σ1) and r is the (linear) correlation coefficient of the variables X1, X2.
This approach directly leads to the determination of the conditional density of the random variable X2 given any realization X1 = x1. It is a well-known fact that the conditional density g2(x2|x1) is, again, normal and
i.e., the
The joint density g(x1, x2) of the random variables X1, X2 is given by the usual arithmetic product g2(x2|x1)g1(x1).
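The product rule above can be checked numerically. The following is a minimal sketch that assumes the standard textbook form of the bivariate normal: the conditional mean of X2 given X1 = x1 is m2 + a(x1 − m1) with a = r(σ2/σ1), and the conditional standard deviation is σ2√(1 − r²). The function names and the numerical values are illustrative only and do not come from the paper.

```python
import math

def normal_pdf(x, mean, sd):
    """Univariate normal density N(mean, sd) evaluated at x."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))

def joint_by_product(x1, x2, m1, m2, s1, s2, r):
    """Joint density built as g2(x2|x1) * g1(x1) (parameter dependence form)."""
    a = r * s2 / s1                          # regression slope a = r * (sigma2 / sigma1)
    cond_mean = m2 + a * (x1 - m1)           # mean of X2 shifted linearly by x1
    cond_sd = s2 * math.sqrt(1.0 - r * r)    # constant conditional standard deviation
    return normal_pdf(x2, cond_mean, cond_sd) * normal_pdf(x1, m1, s1)

def bivariate_normal_pdf(x1, x2, m1, m2, s1, s2, r):
    """Classical closed form of the bivariate normal density, for comparison."""
    z1 = (x1 - m1) / s1
    z2 = (x2 - m2) / s2
    q = (z1 * z1 - 2.0 * r * z1 * z2 + z2 * z2) / (1.0 - r * r)
    return math.exp(-0.5 * q) / (2.0 * math.pi * s1 * s2 * math.sqrt(1.0 - r * r))

# Illustrative parameter values (assumptions, not taken from the paper).
params = dict(m1=2.0, m2=15.0, s1=0.5, s2=3.0, r=-0.6)
difference = abs(joint_by_product(2.3, 13.0, **params)
                 - bivariate_normal_pdf(2.3, 13.0, **params))
```

Under these assumptions the two expressions agree up to floating-point rounding, which illustrates that the whole dependence structure is carried by the conditional density.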
In the example above, one can reinterpret the “response random variable” X2 to be for example the patient’s “residual life-time”, or blood pressure, or level of some important chemical in the blood (such as cholesterol). In such cases the mathematics of the problem would remain the same.
Note the obvious fact that the tumor size X2 does not have a physical influence on the medication dose X1 so that the original marginal pdf g1(x1) remains the same. However, the stochastic dependence between X1, X2 is mutual, since, in general, g1(x1|x2) ≠ g1(x1).
It is well known that the actual problem with the bivariate normal density construction is to get to the conditional density (2), which fully represents the underlying stochastic dependence of random variable X2 on X1.
Our claim is that the above paradigm for the stochastic dependence (characteristic for the bivariate Gaussians) can be extended to other classes of bivariate and multivariate distributions (see Filus and Filus [
Historically, people relied on the nice symmetry in the stochastic dependence of X1 and X2 when using their joint bivariate normal distribution. This kind of symmetry (i.e., both marginal and both conditional distributions are normal and both sides regression functions are linear) can only be achieved with the linear regression functions as described above (1). However, are linear regression functions really the only functions that one can successfully apply within this framework? Assuming that the function m2(X1) is any continuous function in X1, one obtains a wide and interesting extension of the class of bivariate normal densities. We called this class FF-normal (previously named “pseudonormal”, see [
With the bivariate FF-normal densities of (X1, X2) we can use general continuous m2(x1) and σ2(x1) functions, and, performing similar calculations as above, we find, rather surprisingly, that g2(x2|x1) is once more a regular normal density in x2.
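As an illustration, a bivariate FF-normal density can be evaluated and sampled in two steps: draw x1 from its normal marginal, then draw x2 from a normal density whose parameters are functions of x1. The sketch below is only an example; the particular functions `m2_of_x1` and `sigma2_of_x1` are hypothetical choices made here and are not taken from the paper.

```python
import numpy as np

def normal_pdf(x, mean, sd):
    """Normal density N(mean, sd) evaluated at x (works on arrays)."""
    z = (x - mean) / sd
    return np.exp(-0.5 * z**2) / (sd * np.sqrt(2.0 * np.pi))

def m2_of_x1(x1):
    """Hypothetical nonlinear regression function m2(x1)."""
    return 1.0 + 0.5 * x1**2

def sigma2_of_x1(x1):
    """Hypothetical continuous, strictly positive function sigma2(x1)."""
    return 1.0 + 0.25 * np.abs(x1)

def ff_normal_pdf(x1, x2, m1=0.0, s1=1.0):
    """Joint density g(x1, x2) = g1(x1) * g2(x2 | x1)."""
    return normal_pdf(x1, m1, s1) * normal_pdf(x2, m2_of_x1(x1), sigma2_of_x1(x1))

def sample_ff_normal(n, m1=0.0, s1=1.0, seed=0):
    """Draw n pairs: first x1 from its marginal, then x2 given x1."""
    rng = np.random.default_rng(seed)
    x1 = rng.normal(m1, s1, size=n)
    x2 = rng.normal(m2_of_x1(x1), sigma2_of_x1(x1))
    return x1, x2

x1, x2 = sample_ff_normal(10_000)
```

For every fixed x1 the density is normal in x2, while the marginal distribution of X2 is in general no longer normal.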
Consider now the following situation with a bivariate FF-normal distribution in which the “physical” interpretation of the underlying random variables can now be more general than above. Let u1, u2 be two objects (or phenomena) which are characterized by the random variables X1, X2 respectively. If the objects are physically separated then the random variables X1, X2 are assumed to be independent, having normal pdfs
For the conditional density of X2|x1 we have:
The functions
More explicitly, one obtains the bivariate FF-normal pdf in the form:
where
In particular, one may consider the “nonlinear regression function”
with arbitrary real parameters
Realize that in the case A = 0 and
Generally speaking, the essence of the construction method is that for any pair of (“initially independent”) random variables X1 and X2 with given probability densities
Then the joint density of the pair (X1, X2) is always
This situation is especially natural if we consider X2 to be the life-time of an object and X1 is the stress put on it.
Roughly speaking, the construction method for bivariate distributions presented above is an extension of the method used in the construction of the bivariate normal.
Consider a 2-component (say u1, u2) parallel system reliability setting in which X1, X2 represent the components’ life-times (see Barlow and Proschan [
One can also imagine this situation as follows. During the two components’ “in-system” performance, component u1 creates a situation in which component u2 is “constantly bombarded” by a string of harmful “micro-shocks” (see Filus and Filus [
Suppose that, in “laboratory conditions”, the lifetimes X1 and X2 of the components u1 and u2 are independent random variables with Weibull densities.
Let
Here, for k = 1, 2, we have the “vector parameter”
Next consider the components u1 and u2 as acting within the system. Let the resulting (changed) values
One then obtains the wide class of bivariate FF-Weibullian densities:
where, for ease of computation, we recommend applying, as a “sub-model”, the following family of “parameter functions”:
In particular, s may depend on x1.
Another analytically interesting “sub-model” is given by:
Note that both factors g1(x1) and g2(x2|x1) of the joint density g(x1, x2) given by (3) are Weibullian densities. In particular, g2(x2|x1) is Weibullian with respect to the argument x2 alone.
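A minimal sketch of this construction is given below. It assumes the common shape/scale parametrization of the Weibull density and lets only the scale of X2 depend on x1; the function `scale2_of_x1` and all numerical values are hypothetical and serve only as an illustration, not as the parameter functions proposed in the paper.

```python
import math
import numpy as np

def weibull_pdf(x, shape, scale):
    """Weibull density in shape/scale form, for x >= 0."""
    x = np.asarray(x, dtype=float)
    return (shape / scale) * (x / scale) ** (shape - 1.0) * np.exp(-((x / scale) ** shape))

def weibull_mean(shape, scale):
    """Mean of a Weibull random variable in shape/scale form."""
    return scale * math.gamma(1.0 + 1.0 / shape)

def scale2_of_x1(x1, base_scale=2.0, a=0.5):
    """Hypothetical parameter function: the scale of X2 shrinks as x1 grows."""
    return base_scale / (1.0 + a * x1)

def ff_weibull_pdf(x1, x2, shape1=1.5, scale1=1.0, shape2=2.0):
    """Joint density g(x1, x2) = g1(x1) * g2(x2 | x1); both factors are Weibull."""
    return weibull_pdf(x1, shape1, scale1) * weibull_pdf(x2, shape2, scale2_of_x1(x1))

def sample_ff_weibull(n, shape1=1.5, scale1=1.0, shape2=2.0, seed=0):
    """Draw x1 from its Weibull marginal, then x2 from a Weibull with scale depending on x1."""
    rng = np.random.default_rng(seed)
    x1 = scale1 * rng.weibull(shape1, size=n)
    x2 = scale2_of_x1(x1) * rng.weibull(shape2, size=n)
    return x1, x2

x1, x2 = sample_ff_weibull(10_000)
```

Dividing each sampled x2 by `scale2_of_x1(x1)` should give back a standard Weibull sample with shape `shape2`, which is one simple way to check a fitted parameter function against data.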
For the simpler FF-exponential example, see [
A parallel and basically independent path of investigation, which also has its roots in the bivariate normal distribution’s dependence paradigm, is present in the literature under the key word “conditioning”.
This method, used in the construction of numerous multivariate probability distributions, was extensively developed mostly since around 1987. See, for example, Arnold, Castillo and Sarabia [
The underlying method (called the “conditioning method” by numerous authors) relies on imposing the conditional structures X|Y and Y|X on “baseline” probability densities f(x; A) and g(y; B), given in advance, of some (“initially independent”) random variables X and Y respectively, where A and B are scalar or vector parameters. The two conditional densities are defined as we did above, i.e.
g(y|x) = g(y; B(x)) and f(x|y) = f(x; A(y))
where A(y) and B(x) are continuous functions of realizations of the random variables Y, X respectively.
In this case the task is to find two proper (unknown) marginal densities for the bivariate probability distribution of (X, Y) which are, as a rule, not unique and sometimes do not exist.
Despite similarities this method essentially differs from ours. In our case, instead of the two conditional densities g(y|x) and f(x|y), we define only one, say, g(y|x), but together with the marginal f(x).
Pursuing this way we always directly obtain a unique model simply as the product of the two (known) densities.
In this way, we have obtained a wide class of bivariate densities which is essentially disjoint from the class obtained by that alternative method. Also, the physical interpretation of the conditional densities so defined differs in the two approaches. However, both approaches serve the same purpose, which is to extend the paradigm of the bivariate normal in stochastic modeling. Nevertheless, with the conditioning method it is very difficult to construct multivariate distributions of dimension higher than two.
In practice that method reduces to the bivariate case, while the method we present makes it remarkably easy to construct probability distributions of arbitrary finite dimension. Namely, there is a recurrence procedure which allows one to construct any j-th dimensional pdf based on the corresponding (j – 1)-th dimensional pdf
The next section is devoted to the construction of multivariate distributions for any arbitrary finite dimension.
For the construction mentioned in the title, we successively use a simple recurrence method that yields the j-th dimensional probability density, given the (j – 1)-th one. Note that (for j = 3) we have already defined the 2-dimensional densities g2(x1, x2) by means of the products g1(x1)g2(x2|x1), where each underlying conditional density was given by g2(x2|x1) = g2(x2, θ2(x1)).
So the “first step” is already done. Suppose now that we have at our disposal the (j – 1)-th dimensional (j ≥ 3) pdf, say
Now, the “new” value
The j-dimensional pdf of the random vector
The latter pdf becomes the basis for identical construction of the (j + 1)-dimensional pdf and so on.
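The recurrence can be sketched in code as follows. The example uses exponential baseline densities whose rate depends on all earlier realizations; the function `rate_j` is a hypothetical parameter function chosen only for illustration and is not taken from the paper.

```python
import numpy as np

def rate_j(previous, base_rate=1.0, a=0.3):
    """Hypothetical parameter function: the rate of the j-th variable
    increases with the sum of the earlier realizations (rows of `previous`)."""
    return base_rate * (1.0 + a * previous.sum(axis=1))

def sample_chain(n, m, seed=0):
    """Draw n realizations of an m-dimensional vector, one coordinate at a time."""
    rng = np.random.default_rng(seed)
    x = np.empty((n, m))
    x[:, 0] = rng.exponential(1.0, size=n)      # first marginal: rate 1
    for j in range(1, m):
        lam = rate_j(x[:, :j])                  # parameter given x_1, ..., x_{j-1}
        x[:, j] = rng.exponential(1.0 / lam)    # NumPy uses scale = 1 / rate
    return x

def joint_pdf(point):
    """Joint density as the product of the first marginal and the conditionals."""
    point = np.asarray(point, dtype=float)
    density = np.exp(-point[0])
    for j in range(1, len(point)):
        lam = rate_j(point[None, :j])[0]
        density *= lam * np.exp(-lam * point[j])
    return density

sample = sample_chain(1_000, 4)
```

Each pass through the loop adds one dimension, so the same code produces a vector of any finite dimension m without solving for unknown marginals.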
We then stop the procedure once j + 1 = m, where m is the total dimension of the considered (maximal) random vector, say,
Since the analogy with the construction of each j-dimensional normal pdf
where the random variables
where B is a lower triangular and M is an orthogonal matrix. From (5) we obtain that any nonsingular lower triangular matrix B can be represented as the product:
where A is an arbitrary nonsingular matrix. If we replace representation (4) of the random normal vector X by the following representation
then we replace the arbitrary random vector X by an arbitrary “triangular” random vector Y related to X by:
where MT is an arbitrary orthogonal transformation.
Since the two zero-expectation random vectors X – µ, Y – µ are obtained one from another by an isometry (here, rotation) MT in the Euclidean space Rj, they may be considered as representing the same “stochastic data” expressed in two different (but still rectangular) coordinate systems. So from a stochastic viewpoint the “difference” between the random vectors X and Y is inessential and we can consider the random vector Y as an “arbitrary normal” (“with accuracy to the rotation” MT).
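The matrix fact behind this argument can be illustrated numerically. The sketch below assumes that the factorization in question has the form A = BM, with B lower triangular and M orthogonal, and obtains it from a QR decomposition of the transpose of A; the helper name is an illustrative choice. Because M is orthogonal, AAᵀ = BBᵀ, so µ + AZ and µ + BZ have the same covariance matrix and hence the same normal distribution.

```python
import numpy as np

def lower_triangular_times_orthogonal(a):
    """Factor a nonsingular matrix as a = b @ m, with b lower triangular and
    m orthogonal, using the QR decomposition of the transpose."""
    q, r = np.linalg.qr(a.T)    # a.T = q @ r, with r upper triangular
    b = r.T                     # lower triangular factor
    m = q.T                     # orthogonal factor
    return b, m

rng = np.random.default_rng(0)
a = rng.normal(size=(4, 4))     # an arbitrary (almost surely nonsingular) matrix
b, m = lower_triangular_times_orthogonal(a)

cov_from_a = a @ a.T            # covariance of mu + A Z
cov_from_b = b @ b.T            # covariance of mu + B Z
```

The diagonal entries of `b` may be negative here; flipping the signs of the corresponding columns of `b` and rows of `m` gives a factor with a positive diagonal if that is preferred.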
Collecting all the above, we will consider the normal random vector Y, given by (7), where matrix B is any lower triangular matrix and Z is the standard normal j-vector. Write (7) in the form:
where m ≥ j is the actual dimension of the constructed (final) random vector, say
Considering the first –1 lines in (7*) as a system of linear equations, one obtains all
Realize that transformation (9) is easily reversible.
Assuming that realizations
where from the above assumed nonsingularity we have ckk ≠ 0. From (10) it follows that the conditional density of each Yk, given the values
while for the (constant) conditional variance we obtain
To adapt the above procedure to our concept of “baseline” Tj versus “in system” Yj random variables, replace in (9) the independent standard random variables
Replace transformation (7) by
where
This yields the conditional pdf of
Finally, the general pattern of “creation” of any successive j-variate normal pdf can be explained as follows.
Given are the first j – 1 lines of transformation (11) in the form:
for some
We may assume that the next baseline random variable Tj, originally having the N(µj, σj) pdf, is incorporated into the “system” by transforming
This transformation is thought of as adding to (12) the following j-th line:
[“Physically” this could mean that the variables
From (13) one can determine the conditional pdf of Yj, given any realization
Thus, as the j-th “object” (originally independent from the “system” and characterized by the random quantity Tj) was “put into the system”, the quantity Tj turns into the quantity Yj and, in parallel, the parameters µj and σj of its normal density are turned into
Clearly, the new value
of realizations
As we have shown also in multivariate cases, the origin of the “parameter dependence method for the construction”, lies in the construction of the multivariate normal distributions. Recall that having defined the conditional pdf
Preserving the general spirit of the multivariate normal pdf derivation, let us extend all the Equations (13) for
where Φj() and Ψj() are arbitrary continuous functions and
From (13*) we obtain its inverse:
and then for each observation
It is clear that the sequence of the densities (14)
However, the marginal pdfs of
The main conclusion that follows from the considerations in Sections 6.2 - 6.4 may be stated as follows: there is a generic relationship that associates the construction method of the parameter dependence with the stochastic dependence structure present within the multivariate normal distribution of any dimension.
As an example of this relationship realize that the transformations (13) and (13*), when applied to the independent normal random variables
Let now
Applying to the random vector
where the latter is the two parameter exponential density with respect to yj for
Another interesting case of the m-variate FF-Weibullian pdf can be obtained by applying transformations (13*) to m independent Weibullian random variables. An even more general class of FF-Weibullians one obtains using the pseudopower transformations (see Filus and Filus [
All these distributions (including the m-variate normal) can also be obtained by direct use of the “parameter dependence pattern”, which produces more m-variate models than the transformations considered above. On the other hand, the existence of the defining transformations facilitates the underlying statistical analysis and simulations.
Some paradigms, applied in the reliability literature, are exactly those of the “parameter dependence” that we describe in this paper. However, in most of the cases they are not directly related to the problem of construction of multivariate probability distributions (so, also are different from the “conditioning” procedures in [
When testing the life times of some high reliability products, the stresses usually encountered, such as temperature, humidity or voltage, are sometimes kept at significantly higher than usual levels in order to make the life times shorter than they are in normal conditions. The data so obtained (a “sample”) are then extrapolated to those (hypothetical) life times that would presumably be obtained under the regular values of the stresses. Rules that associate the products’ life times with the values of the stresses applied are necessary for performing proper extrapolations. Several such rules, typically known as the Arrhenius or Eyring (see, Meeker and Escobar [
Unfortunately, with this method the simplicity often comes along with inaccuracy of the predictions. Other methods apply the “Proportional Hazards Relationships” known also as Cox Model (see Cox [
More recently ([
In what follows we discuss the differences.
1) The generality of the “parameter dependence theory” we built in this paper is significantly higher than that of the very special case applied in accelerated life testing theory. There are three reasons for that.
Firstly, in our approach the subject of constructing conditional probability distributions is not limited to the life testing, and not even to the “stress-life time” pattern only. The range of applications of our theory is very wide, including many biomedical (see, Collett [
Secondly, in the paradigm we consider, the relation between a parameter and stress (or any other random quantity) is given by an arbitrary continuous function, while the number of such functions applied in association with accelerated life time testing is very limited. Actually, the functions are restricted to a few “models” such as the Arrhenius, Eyring, inverse power law, log-linear, and not many more (see, for example, the Eyring-Weibull model in [
Our idea is to omit the complicated physical or chemical phenomena that are often poorly understood and to apply a two-step, purely empirical approach.
Speaking roughly, the first step is an “educated guess” (for choice of a proper function) and the second is statistical verification of this guess.
Thirdly, in our theory we may consider an arbitrary parameter of an arbitrary probability distribution as a stress dependent, while, according to our knowledge (see, for example, [
2) Besides the generality (of the constructed conditional distributions), our concept also differs with regard to its purpose. Namely, independently of the construction of conditional distributions, we also have the construction of bivariate and multivariate probability distributions such as the FF-normal, FF-exponential, FF-Weibullian, FF-gamma and others (for comparison with similar “conditioning methods” of construction present in the literature, see Section 5).
The construction of high dimension multivariate distributions based on parameter dependence can easily be extended to Markovian and non-Markovian (still simple!) stochastic processes (see Filus and Filus [
1) Besides accelerated life testing, another area where the “parameter dependence paradigm” is applied is a set of problems centered around the notion of “load optimization” (see Filus [
2) A similar application of the parameter dependence pattern also occurs when “load sharing phenomena” take place. Suppose that we have a parallel system supporting a load, such as a multi-engine aircraft or two electric power lines. Failure of any component of the system may cause the total load to be redistributed among fewer components, so that the load on each of them increases by some predictable value. Now we may encounter either the load optimization problem [
Remark. As a final remark, we mention the relationship between the parameter dependence presented in this paper and the stochastic dependence based on models initiated in 1961 by Freund [
In contrast, in the models we introduce, the component interactions take place only while the components work. Any failure of a system component stops its influence on the remaining components’ life times. Therefore, the two paradigms, “Freund’s load sharing” and our “parameter dependence”, are “disjoint” and in a sense “complementary”. In reality, both (physical) phenomena may take place at the same time, and it seems quite possible that stochastic models (i.e., multivariate probability distributions) obeying both paradigms will be constructed in the future.
Nevertheless, we stress the generic relation of all the multivariate probability distributions based on the parameter dependence with the multivariate Gaussians.