﻿Runs and Patterns in a Sequence of Markov Dependent Bivariate Trials

Open Journal of Statistics
Vol.1 No.2(2011), Article ID:6183,13 pages DOI:10.4236/ojs.2011.12014

Runs and Patterns in a Sequence of Markov Dependent Bivariate Trials

Kirtee Kiran Kamalja, Ramkrishna Lahu Shinde

Department of Statistics, School of Mathematical Sciences, North Maharashtra University, Jalgaon, India

E-mail: kirteekamalja@gmail.com

Received May 13, 2011; revised June 2, 2011; accepted June 8, 2011

Keywords: Markov Dependent Bivariate Trials, Conditional Probability Generating Function, Joint Distribution

Abstract

In this paper we consider a sequence of Markov dependent bivariate trials whose each component results in an outcome success (0) and failure (1) i.e. we have a sequence of valued Markov dependent bivariate trials. By using the method of conditional probability generating func-tions (pgfs), we derive the pgf of joint distribution of where for, de-notes the number of occurrences of i-runs of length in the first component and denotes the number of occurrences of i-runs of length in the second component of Markov dependent bivariate trials. Further we consider two patterns and of lengths and respectively and obtain the pgf of joint distribution of using method of conditional probability generating functions where denotes the number of occurrences of pattern of length in the first (second) n components of bivariate trials. An algorithm is developed to evaluate the exact probability distributions of the vector random variables from their derived probability generating functions. Further some waiting time distributions are studied using the joint distribution of runs.

1. Introduction

The distributions of several run statistics are used in various areas such as reliability theory, testing of statistical hypothesis, DNA sequencing, psychology [1], start up demonstration tests [2] etc. There are various counting schemes of runs. Some of the most popular counting schemes of runs are non-overlapping success runs of length [3], overlapping success runs of length [4], success runs of length at least, - overlapping success runs of length [5], success runs of exact length [6].

The probability distribution of various run statistics associated with the above counting schemes have been studied extensively in the literature in different situations such as independent Bernoulli trials (BT), non-identical BT, Markov dependent BT (MBT), higher order MBT, binary sequence of order, multi-state trials etc. But very little work is found on the distribution theory of run statistics in case of bivariate trials which has applications in different areas such as start up demonstration tests with regard to simultaneous start ups of two equipment, reliability theory of two dimensional consecutive out of: -Lattice system etc as specified by [7]. [7] have studied the distribution of sooner and later waiting time problems for runs in Markov dependent bivariate trials by giving system of linear equations of the conditional pgfs of the waiting times. The distribution of number of occurrences of runs in the two components of bivariate sequence of trials and their joint distributions are still unknown to the literature.

Consider a sequence of -valued trials where is set of all possible outcomes of trials under study. The simple pattern is composed of specified sequence of states i.e. where. The number of occurrences of patterns can be counted according to the non-overlapping or overlapping counting scheme. The non-overlapping counting scheme starts recounting of the pattern immediately after the occurrence of the pattern while the overlapping counting scheme of patterns allows an overlap of prespecified fixed length in the successive occurrences of patterns.

Recently the study of distributions of different statistics based on patterns has become a focus area for many researchers due to its wide applicability area. Distribution of, the waiting time for the occurrence of pattern of length in the sequence of multistate trials is studied by [1,8,9]. [10] considered the sequence generated by Polya’s urn scheme and study the waiting time distribution of for.

Joint distribution of number of occurrences of pattern of length and pattern of length in Markov dependent multi-state trials is studied by [11]. [12] considered a sequence of -   dimensional i.i.d. Random column vectors whose entries are -valued i.i.d. random variables and obtain the waiting time distribution of two dimensional patterns with general shape. The general method, which is an extension of method of conditional pgfs, is used to study these distributions by [12].

Even though the distribution of waiting time of the pattern of general shape in the sequence of multi-variate trials with i.i.d. components has been done, the joint   distribution of number of occurrences of patterns in the sequence of component of the -variate trials is still unknown. Here we derive the pgf of joint distribution of number of occurrences of runs in both the components of the bivariate trials and generalize this study to the distribution of number of occurrences of patterns in both components of the bivariate trials.

In this paper we consider the sequence of -valued Markov dependent bivariate trials. In Section 2, we obtain the pgf of joint distribution of number of occurrences of -runs of length in first components and -runs of length in the second components of the bivariate trials. We study this joint distribution of runs under the non-overlapping counting scheme of runs by using the method of conditional pgfs. Further in section 3, we study the joint distribution of number of occurrences of pattern of length in the first component and number of occurrences of pattern of length in the second component of bivariate trials. In Section 4, we develop an algorithm to evaluate the exact probability distributions of the random variables under study. As an application of the derived joint distributions, in Section 5, we obtain distributions of several waiting times associated with the runs and patterns in bivariate trials. In Section 6 we present some numerical work based on distribution of runs and patterns. Finally in Section 7, we discuss an application and generalization of the studied distributions.

2. The Joint Distribution of Number of Occurrences of Runs

Let. Consider the sequence of -valued Markov dependent bivariate trials with the transition probabilities,

and the initial probabilities

for.

Let be the number of -runs (= 0,1) of length in trials associated with the first (second) component of bivariate trials (i.e. number of -runs in. In this section we derive the joint distribution of

.

Let be the pgf of distribution of

. Assume that for a non-negative integer, we have observed until trial

(i.e. we have observed). We define as pgf of conditional distribution of number of -runs of length in and number of -runs of length in given that we have observed

and currently have -run of length in first component and -runs of length in the second component of bivariate trials is observed, , ,

We define for.

Now by assuming, we have,

(2.1)

Also we have,

(2.2)

Conditioning on the first trial we have the following system of recurrent relation of conditional pgfs for.

Conditioning on the next trial from each stage, we have the following system of recurrent relations of conditional pgfs for and.

if, ,

if, ,

if,

if, ,

if, ,

if, ,

if, ,

if,

if,

Thus we have recurrent relations of conditional pgfs for and these can be written as,

(2.3)

where

and is a square matrix of size. Each row of this matrix corresponds to corresponding element of and its elements are the coefficients of elements of.

For, we have,

where is the column vector with all the elements equal to 1.

Using (2.3) recurrently for, we have,

(2.4)

Hence from (2.1) and (2.4) we get the pgf of distribution of as,

(2.5)

where is the first column of identity matrix of order

.

3. The Joint Distribution of Number of Occurrences of Patterns

Let and be two patterns of lengths and respectively where,. We consider the non-overlapping counting scheme of patterns in which to count the number of occurrences of pattern in a given sequence of trials, one has to restart counting from scratch each time the pattern occurs. Let denotes the number of non-overlapping occurrences of patterns in the first component of bivariate trials (i.e. in) and denotes the number of non-overlapping occurrences of patterns in the second component of bivariate trials (i.e. in). Consider the random vector. In this section we obtain the pgf of joint distribution of using method of conditional pgfs.

Let pgf of joint distribution of be. Assume that for a non-negative integer, we have observed until trial i.e.

. For, we define, the following conditional pgfs.

Let be pgf of conditional distribution of number of occurrences of patterns in and number of occurrences of pattern in of bivariate trials given that currently (i.e. at trial) we have observed none of the subpatterns of and and, where.

Similarly let, be pgf of conditional distribution of number of occurrences of patterns in and patterns in of bivariate trials given that at trial we have observed the sub-pattern of length of in the first component and none of the sub pattern of is observed in the second component of bivariate trials and

where and.

Let be pgf of conditional distribution of number of occurrences of patterns in and patterns in of bivariate trials given that at trial we have observed none of the sub-pattern of in the first component and a sub pattern of the length of in the second component of bivariate trials is observed and where and.

Also let as pgf of conditional distribution of number of occurrences of pattern in and pattern in of bivariate trials given that at trial we have observed the sub-pattern of length of in the first component and a sub pattern of the length of in the second component of bivariate trials and where

and.

Let,

,

and,.

For, we assume that at trial, the sub-pattern of length of is observed in the first component of bivariate trials. If we condition on the next trial as i.e. the sub-pattern of length of observed at trial breaks at trial then to check whether any sub-pattern of of length has occurred, we define the indicator function as,

.

Similarly we have indicator function as,

.

Let

,

Assuming, we have,

(3.1)

We also have,

.(3.2)

Now for each (denotes the number of trials remaining to observe) we condition on the next trial to obtain the recurrent relations of conditional pgfs as follows.

if,

if

if

if.

Similarly we have recurrent relations of the conditional pgfs for as follows.

if

if

if

The recurrent relations of the conditional pgfs for and are as follows.

if

if

The above system of recurrent relations of conditional pgfs,;

,;,

and

can be written as follows.

(3.3)

where is column vector with all its elements 1 and is column vector with its elements as follows.

From (3.1) we get,

where is first column of identity matrix of order.

The recurrent use of (3.3) gives the pgf of as follows.

(3.4)

4. Exact distribution of

We note that the pgf of is a particular case of pgf of. Hence we develop an algorithm to obtain the exact probability distribution of which can further beapplied to obtain the exact probability distribution of from its pgf.

Observe that the pgf of joint distribution of as obtained in (2.5) in general involves matrix polynomial in, , and of order n where

.

Hence the joint probability distribution can be obtained by expanding the polynomial with respect to, , and.

That is,

i.e.

(4.1)

Exact formula to obtain the coefficient matrix is quiet tedious since multiplication of matrices is not commutative operation. But the interesting recurrent relations are found between these coefficient matrices. Let

be the coefficient matrix of

in the expansion of matrix polynomial and for, let

. The following Lemma gives the recurrent relations of the coefficient matrices of with.

Lemma 4.1 Let be the coefficient matrix of in the expansion of the matrix polynomial

Then satisfies the following recurrent relation.

(4.2)

with, , and for. Here is the row of the identity matrix of order 2.

Proof Obviously for, we have,

where O is the null matrix of same order as that of A.

For, observe that satisfies (4.2).

Assume that Equation (4.2) is true for some.

Hence we have,

Then

Hence we get the required proof of the lemma.

Theorem 4.1 The exact probability distribution of is given by,

,(4.3)

where is the coefficient matrix of in expansion of matrix polynomial and it satisfies (4.2)

Proof The proof follows by applying the Lemma 4.1 to matrix polynomial involved in the pgf of distribution of in (2.5).

Remark 4.1 The expected number of failure-runs of length in first components of Markov dependent bivariate trials is given by,

.

On simplifying this expression, we have,

(4.4)

5. Waiting Time Distributions Related to Runs and Patterns

The exact probability distribution of from its pgf given in (2.5) can be expressed as,

(5.1)

The components of the above expression can be interpreted as follows.

For,

and

.

Similarly for,

(5.2)

so that

(5.3)

The above interpretation is useful for deriving different waiting time distributions.

Let be the event that failure-run of length occur for the first time in the component of the Markov dependent bivariate trials and be the event that success-run of length occur for the first time in the component of the Markov dependent bivariate trials.

Let be the waiting time for sooner occurring event between, ,. To obtain the distribution of sooner waiting time we define the following random variables.

: The waiting time for sooner occurring event between, , given that is the sooner event (i.e. -run of length of component of bivariate trials is the sooner event) and.

: The waiting time for the first occurrence of -run of length in the component of the bivariate trials, and.

: The waiting time until the occurrence of both events and simultaneously, (i.e. - run of length in the first component and -run of length in the second component of the bivariate trials occur simultaneously).

Then we have,

.(5.4)

Let, , and be the pgf of, , and, and respectively. In the next sub-section we obtain the sooner waiting time distribution.

5.1. Sooner Waiting Time Distribution

The probability that 0-run of length (i.e. event) occurs for the first time at trial given that none of the events , has occurred until trial can be written as follows.

Hence pgf of can be given as,

.

Similarly the probability that 1-run of length (i.e. event) occurs for the first time at trial given that none of the events , has occurred until trial can be written as follows.

Hence pgf of is,

.

Similarly and is given by,

and pgf of is,

.

Now the probability that both events and occurs simultaneously for the first time at trial

can be written as follows.

In general,

Also pgf of is,

Hence from (5.4), the exact probability distribution of random variable is given by,

, .(5.5)

The pgf of is given by,

(5.6)

Here we note that it is quiet difficult to study the later waiting time distribution between, , , particularly when one is interested in studying the waiting time distribution for the later occurring event between more than two events. For the waiting time distribution of later occurring event between any two runs in the components of the bivariate trials or for the two patterns in the two components of bivariate trials the theory can be developed in general. For this we refer [13] who derive the waiting time distribution of later occurring event between the two events and where is the event that -run of length occurs for the first time in the sequence of higher order Markov dependent BT. [13] use the joint distribution of number of occurrences of 1-runs (i.e. success-runs) of length and the number of occurrences of 0-runs (i.e. failure runs) of length.

5.2. Waiting Time Distribution for Runs

Let be the waiting time for the occurrence of -run of length for the component of bivariate trials,;. We obtain the distribution of

using distribution of number of occurrences of -run of length for the component of bivariate trials;. The pgf of joint distribution of

as obtained in (2.5) is as follows.

In particular pgf of marginal distribution of is obtained by setting in the pgf

of and is as follows.

(5.7)

Let pgf of be for;. Then pgf of in (5.7) can be expressed in a simplified form as follows.

where and.

The exact probability distribution of can be obtained from Lemma 4.1 as follows.

(5.8)

where is the coefficient matrix of in the matrix polynomial and in general for satisfies the recurrent relation

with

where is the null matrix of order same as matrix and. The probability in (5.8) can be written as follows.

The above components of can be interpreted as follows.

(5.9)

Now we have,

Using the pgf of and interpretations in (5.9) we have,

Particularly when, we have,

The pgf of is given by,

and formula for exact probability is

,.

Similarly waiting time distributions of, and can be obtained.

6. Numerical Study

In this section we present the numerical study based on the joint distribution of patterns in the sequence of bivariate trials. We consider the sequence

of -valued Markov dependent bivariate trials with with transition probabilities as follows.

, , , ,

, , , ,

, , , ,

, , ,

Let and. The joint distribution of is evaluated numerically for using the algorithm given in Theorem 4.1. The evaluated joint pmf of is described in the following Table 1.

7. Application and Extension

[14] introduced the two-dimensional engineering system consisting of a grid of components arranged in -rows and -columns. The system and its components can be either in working or failed state. The system fails if and only if a grid of size components fails. Particularly, for, we can formulate the states of the components in this system as a sequence of independent bivariate trials. We assume that the component in a column is in state 1 if it is in failed state and in state 0 otherwise. For we define,

and

The reliability of consecutive--out-of-:

F-Lattice system can now be obtained simply by using the joint distribution of asP(consecutive--out-of-: F-Lattice system works) =where is 1-run of length in the first component and is 1-run of length in the second component of the bivariate trials.

Extending the concept of bivariate trials to multivariate trials, the joint distributions of number of occurrences of patterns of length in the component of -variate trials can be used to get the reliability of general two-dimensional consecutive- -out-of-: F-Lattice system.

Table 1. Distribution of.

The pgf of joint distribution of, the number of occurrences of patterns of length in the component of -variate trials obtained in general by using the method of conditional pgfs is of form,

(7.1)

Extending the Lemma 4.1 for the pgf in (7.1) exact joint probability distribution of can be obtained.

The above study of runs and patterns can be extended in another direction by generalizing the sequence of Markov dependent bivariate trials to the sequence of Markov dependent multivariate trials. In the following, we discuss briefly the method of deriving the joint distribution of where,

denotes the number of occurrences of two dimensional patterns of rectangular shape   in the sequence of Markov dependent multivariate trials.

Let

so that. Consider the sequence of -variate -valued Markov dependent trials. Let the transition probabilities of these Markov dependent trials be,

, ,.

and initial probabilities be.

Consider the two dimensional pattern of rectangular shape, where. Let, be the number of occurrences of patterns in n trials. We obtain the joint distribution of

using the method for deriving joint distribution of number of occurrences of patterns in the sequence of multi-state trials as done by [11]. For this consider the following transformation of sequence of -variate sequence of -valued Markov dependent trials to the univariate sequence of Markov dependent trials.

Define the function such that

where. Hence

. Each in can be treated as a -digit binary number. The function converts this -digit binary number into a unique equivalent decimal number in.

Now corresponding to the -variate sequence of -valued Markov dependent trials, we have the univariate sequence of -valued Markov dependent trials where. The transition probabilities for the sequence of trials and for the converted sequence of trials are related as follows.

where and. Convert the pattern into a pattern

where for. Now the original problem of studying the joint distribution of in the sequence of

-valued Markov dependent trials reduces to the problem of studying distribution of

for the sequence of

-valued Markov dependent trials. [15] obtain the joint distribution of number of occurrences of -runs of length, while [11] obtain the joint distribution of number of occurrences of patterns in the sequence of -valued Markov dependent trials using the method of conditional pgfs. The related waiting time distributions can also be studied following the same process as in Section 5 in the case of Markov dependent multivariate trials.

7. References

[1]    S. J. Schwager, “Run Probabilities in Sequences of Markov-Dependent Trials,” Journal of American Statistical Association, Vol. 78, No. 381, 1983, pp. 168-175. doi:10.2307/2287125

[2]    N. Balakrishnan and P. S. Chan, “Start-up Demonstration Tests with Rejection of Units upon Observing d Failures,” Annals of Institute of Statistical Mathematics, Vol. 52, No. 1, 2000, pp. 184-196. doi:10.1023/A:1004101402897

[3]    W. Feller, “An Introduction to Probability Theory and Its Applications,” 3rd Edition, John Wiley & Sons, Hoboken, 1968.

[4]    K. D. Ling, “On Binomial Distributions of Order k,” Statistics & Probability Letters, Vol. 6, No. 4, 1988, pp. 247-250. doi:10.1016/0167-7152(88)90069-7

[5]    S. Aki and K. Hirano, “Number of Success-Runs of Specified Length until Certain Stopping Time Rules and Generalized Binomial Distributions of Order k,” Annals of Institute of Statistical Mathematics, Vol. 52, No. 4, 2000, pp. 767-777. doi:10.1023/A:1017585512412

[6]    A. M. Mood, “The Distribution Theory of Runs,” Annals of Mathematical Statistics, Vol. 11, No. 4, 1940, pp. 367-392. doi:10.1214/aoms/1177731825

[7]    S. Aki and K. Hirano, “Sooner and Later Waiting Time Problems for Runs in Markov Dependent Bivariate Trials,” Annals of Institute of Statistical Mathematics, Vol. 51, No. 1, 1999, pp. 17-29. doi:10.1023/A:1003874900507

[8]    M. Uchida, “On Generating Functions of Waiting Time Problems for Sequence Patterns of Discrete Random Variables,” Annals of Institute of Statistical Mathematics, Vol. 50, No. 4, 1998, pp. 655-671. doi:10.1023/A:1003756712643

[9]    D. L. Antzoulakos, “Waiting Times for Patterns in a Sequence of Multistate Trials,” Journal of Applied Probability, Vol. 38, No. 2, 2001, pp. 508-518. doi:10.1239/jap/996986759

[10]    K. Inoue and S. Aki, “Generalized Waiting Time Problems Associated with Pattern in Polya’s Urn Scheme,” Annals of Institute of Statistical Mathematics, Vol. 54, No. 3, 2002, pp. 681-688. doi:10.1023/A:1022431631697

[11]    K. S. Kotwal and R. L. Shinde, “Joint Distributions of Patterns in the Sequence of Markov Dependent MultiState Trials,” 2011 Submitted.

[12]    S. Aki and K. Hirano, “Waiting Time Problems for a Two-Dimensional Pattern,” Annals of Institute of Statistical Mathematics, Vol. 56, No. 1, 2004, 169-182. doi:10.1007/BF02530530

[13]    K. S. Kotwal and R. L. Shinde, “Joint Distributions of Runs in a Sequence of Higher-Order Two-State Markov trials,” Annals of Institute of Statistical Mathematics, Vol. 58, No. 1, 2006, pp. 537-554. doi:10.1007/s10463-005-0024-6

[14]    A. A. Salvia and W. C. Lasher, “2-Dimensional Consecutive k-out-of-n: F Models,” IEEE Transactions on Reliability, Vol. R-39, No. 3, 1990, pp. 382-385. doi:10.1109/24.103023

[15]    R. L. Shinde and K. S. Kotwal, “On the Joint Distribution of Runs in the Sequence of Markov Dependent MultiState Trials,” Statistics & Probability Letters, Vol. 76, No. 10, 2006, pp. 1065-1074. doi:10.1016/j.spl.2005.12.005