Open Journal of Statistics, 2012, 2, 48-59
http://dx.doi.org/10.4236/ojs.2012.21006 Published Online January 2012 (http://www.SciRP.org/journal/ojs)
Copyright © 2012 SciRes. OJS

Minimum MSE Weights of Adjusted Summary Estimator of Risk Difference in Multi-Center Studies

Chukiat Viwatwongkasem1*, Jirawan Jitthavech2, Dankmar Böhning3, Vichit Lorchirachoonkul2
1Department of Biostatistics, Faculty of Public Health, Mahidol University, Bangkok, Thailand
2School of Applied Statistics, National Institute of Development Administration, Bangkok, Thailand
3Applied Statistics, School of Biological Sciences, University of Reading, Reading, UK
Email: *phcvw@mahidol.ac.th
Received October 14, 2011; revised November 18, 2011; accepted November 30, 2011

ABSTRACT

The simple adjusted estimator of the risk difference in each center is easily constructed by adding a value c to the number of successes and to the number of failures in each arm of the proportion estimator. To assess a treatment effect in multi-center studies, we propose minimum MSE (mean square error) weights for an adjusted summary estimate of the risk difference under the assumption of a constant common risk difference over all centers. To evaluate the performance of the proposed weights, we compare them with two conventional weights, the Cochran-Mantel-Haenszel weights and the inverse variance (weighted least square) weights, both in terms of estimation (bias, variance, and MSE) and in terms of testing (type I error probability and power) in a variety of situations. The results illustrate that the proposed weights perform well in both point estimation and hypothesis testing and can be recommended as an alternative choice. Finally, two applications illustrate the practical use.

Keywords: Minimum MSE Weights; Optimal Weights; Cochran-Mantel-Haenszel Weights; Inverse Variance Weights; Multi-Center Studies; Risk Difference

1. Introduction

It is widely known that the conventional proportion estimator, $\hat p = X/n$, is a maximum likelihood estimator (MLE) and a uniformly minimum variance unbiased estimator (UMVUE) of the binomial parameter $p$, where the binomial random variable $X$ is the number of successes out of $n$ patients. However, Agresti and Coull [1], Agresti and Caffo [2], Ghosh [3], and Newcombe [4,5] pointed out that $\hat p$ might not be a good choice for $p$ when the assumption $n\hat p \ge 5$ and $n(1-\hat p) \ge 5$ is violated. This violation often occurs when the sample size $n$ is small or when the estimated probability $\hat p$ is close to 0 or 1 (the boundaries of the parameter space), and it leads to the problem of a zero estimate of the variance of $\hat p$: the estimated variance $\hat V(\hat p) = \hat p(1-\hat p)/n$ is zero whenever $X = 0$ or $X = n$. Böhning and Viwatwongkasem [6] proposed the simple adjusted proportion estimator obtained by adding a value $c$ to the number of successes and to the number of failures; consequently, $\hat p_c = (X+c)/(n+2c)$ is their proposed estimate of $p$, with the nonzero variance estimate $\hat V(\hat p_c) = n\hat p_c(1-\hat p_c)/(n+2c)^2$. They concluded that the estimator $(X+1)/(n+2)$ minimizes the Bayes risk (the average MSE of $\hat p_c$) in the class of all estimators of the form $(X+c)/(n+2c)$ with respect to the uniform prior on [0,1] and the Euclidean loss function; furthermore, the estimator $(X+1)/(n+2)$ has smaller MSE than $X/n$ for $p$ in the approximate interval [0.15, 0.85]. As a further argument from the Bayesian approach, Casella and Berger [7] showed that $(X+\alpha)/(n+\alpha+\beta)$ is a Bayes estimator of $p$ under the conditional binomial sampling $X \mid p \sim \mathrm{binomial}(n, p)$ and the beta prior $p \sim \mathrm{beta}(\alpha, \beta)$. Note that for $\alpha = \beta = 1$ the beta distribution reduces to the uniform distribution over [0,1]. Consequently, the estimator $(X+c)/(n+2c)$ derived from the Bayesian approach and from the Bayes risk approach under the above-mentioned criteria gives the same result at $c = 1$.
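The adjusted proportion estimator and its variance estimate described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names `adjusted_proportion` and `adjusted_variance` are hypothetical and do not come from the paper, and the counts in the usage lines are made up.

```python
def adjusted_proportion(x, n, c=1.0):
    """Adjusted estimate (X + c) / (n + 2c) of a binomial proportion.

    With c = 0 this is the conventional estimator X / n.
    """
    if n <= 0 or not 0 <= x <= n or c < 0:
        raise ValueError("need n > 0, 0 <= x <= n and c >= 0")
    return (x + c) / (n + 2 * c)


def adjusted_variance(x, n, c=1.0):
    """Plug-in variance estimate n * p_c * (1 - p_c) / (n + 2c)^2."""
    p_c = adjusted_proportion(x, n, c)
    return n * p_c * (1 - p_c) / (n + 2 * c) ** 2


# Hypothetical boundary case: no successes among n = 8 patients.
# The conventional variance estimate (c = 0) is exactly zero,
# whereas the adjusted one (c = 1) stays positive.
print(adjusted_proportion(0, 8, c=0), adjusted_variance(0, 8, c=0))
print(adjusted_proportion(0, 8, c=1), adjusted_variance(0, 8, c=1))
```

The boundary case is the one that motivates the adjustment: with $X = 0$ or $X = n$ the conventional variance estimate vanishes, which breaks any weight or test statistic that divides by it.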
With the idea of $\hat p_c = (X+c)/(n+2c)$, the extension leads to $\hat\theta_c = \hat p_{c1} - \hat p_{c2}$, the adjusted risk difference estimator between two independent binomial proportions, for estimating a common risk difference $\theta$, where $\hat p_{c1} = (X_1+c_1)/(n_1+2c_1)$ and $\hat p_{c2} = (X_2+c_2)/(n_2+2c_2)$ are the proportion estimators for the treatment and control arms. In a multi-center study of size $k$, the parameter of interest is also a common risk difference $\theta$ that is assumed to be constant across centers. We are concerned with combining several adjusted risk difference estimators $\hat\theta_{cj} = \hat p_{c1j} - \hat p_{c2j}$ from the $j$th center ($j = 1, 2, \ldots, k$) into an adjusted summary estimator of the risk difference of the form $\hat\theta_{cw} = \sum_{j=1}^{k} f_{cj}\hat\theta_{cj}$, where $f_{cj}$ are weights subject to the condition $\sum_{j=1}^{k} f_{cj} = 1$. In this study, we propose optimal weights $f_{cj}$ as an alternative choice, based on minimizing the MSE of $\hat\theta_{cw}$, in Section 2, and then state the well-known candidates, the Cochran-Mantel-Haenszel (CMH) weights and the inverse variance (INV) weights, in Section 3. A simulation plan for comparing the performance of the weights in terms of estimation and hypothesis testing is presented in Section 4. The comparison of the estimators based on bias, variance, and MSE, and the evaluation of the tests associated with these weights through the type I error probability and the power, are given in Section 5. Some numerical examples are presented in Section 6. Finally, conclusions and discussion are presented in Section 7.

*Corresponding author.

2. Deriving Minimum MSE Weights of Adjusted Summary Estimator

Under the assumption of a constant common risk difference $\theta$ across $k$ centers, we combine several adjusted risk difference estimators $\hat\theta_{cj} = \hat p_{c1j} - \hat p_{c2j}$, in which $\hat p_{c1j} = (X_{1j}+c_1)/(n_{1j}+2c_1)$ and $\hat p_{c2j} = (X_{2j}+c_2)/(n_{2j}+2c_2)$, from the $j$th center ($j = 1, 2, \ldots, k$) to arrive at an adjusted summary estimator of the risk difference of the form $\hat\theta_{cw} = \sum_{j=1}^{k} f_{cj}\hat\theta_{cj}$, where $f_{cj}$ are nonrandom weights subject to the constraint $\sum_{j=1}^{k} f_{cj} = 1$. Observe that for a single center ($k = 1$) the adjusted summary estimator $\hat\theta_{cw}$, subject to $\sum_{j=1}^{k} f_{cj} = 1$, reduces to the simple adjusted estimator $\hat\theta_c = \hat p_{c1} - \hat p_{c2}$.
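The structure of the adjusted summary estimator can be made concrete with a short Python sketch. It only assumes the definitions given above; the function names and the three centers of counts are hypothetical, and equal weights are used purely as a placeholder for whichever weighting scheme is chosen.

```python
def adjusted_risk_difference(x1, n1, x2, n2, c1=1.0, c2=1.0):
    """Center-level adjusted risk difference p_c1j - p_c2j."""
    return (x1 + c1) / (n1 + 2 * c1) - (x2 + c2) / (n2 + 2 * c2)


def summary_estimate(thetas, weights, tol=1e-9):
    """Weighted summary sum_j f_cj * theta_cj with weights summing to one."""
    if len(thetas) != len(weights):
        raise ValueError("one weight per center is required")
    if abs(sum(weights) - 1.0) > tol:
        raise ValueError("weights must sum to one")
    return sum(f * t for f, t in zip(weights, thetas))


# Hypothetical counts (X1j, n1j, X2j, n2j) for k = 3 centers.
centers = [(3, 10, 1, 10), (0, 8, 0, 8), (5, 12, 2, 12)]
thetas = [adjusted_risk_difference(*row) for row in centers]

# Equal weights serve only as a placeholder weighting scheme here.
equal = [1.0 / len(centers)] * len(centers)
print(thetas, summary_estimate(thetas, equal))
```

The sum-to-one check mirrors the constraint on $f_{cj}$; with a single center the only admissible weight is 1, so the summary equals the center-level estimate.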
Our minimum MSE weights $f_{cj}$ of the adjusted summary estimator $\hat\theta_{cw}$ are derived by Lagrange's method [8] under the assumption of a constant common risk difference over all centers, with the pooled point estimator $\hat\theta_{pool}$ used to estimate $\theta$. Lui and Chang [9] proposed optimal weights proportional to the reciprocal of the variance, with the Mantel-Haenszel point estimator, under the assumption of noncompliance. The two sets of optimal weights have different formulae because they rest on different assumptions, even though both are derived by Lagrange's method. The proposed weights minimize the MSE of $\hat\theta_{cw}$:

$$Q = MSE(\hat\theta_{cw}) = E(\hat\theta_{cw} - \theta)^2 = E\Big(\sum_{j=1}^{k} f_{cj}\hat\theta_{cj} - \theta\Big)^2 .$$

To obtain the minimum of $Q$ subject to the constraint $\sum_{j=1}^{k} f_{cj} = 1$, we form the auxiliary function

$$\phi = E\Big(\sum_{j=1}^{k} f_{cj}\hat\theta_{cj} - \theta\Big)^2 + \lambda\Big(\sum_{j=1}^{k} f_{cj} - 1\Big),$$

where $\lambda$ is a Lagrange multiplier. The weights $f_{cj}$ and $\lambda$ are obtained by solving the equations $\partial\phi/\partial\lambda = 0$ and $\partial\phi/\partial f_{cj} = 0$, $j = 1, 2, \ldots, k$, simultaneously. The details are presented in the Appendix. The resulting estimated weight for the $j$th center is

$$\hat f_{cj} = \frac{1}{\hat V_j\,\hat a}\left[1 + \hat a_j\,\frac{\hat\theta_{pool}\,\hat a - \hat b}{\hat a + \sum_{m=1}^{k} \hat a_m \hat E_m \hat V_m^{-1}}\right],$$

where

$$\hat E_j = \frac{n_{1j}\hat p_{c1j} + c_1}{n_{1j} + 2c_1} - \frac{n_{2j}\hat p_{c2j} + c_2}{n_{2j} + 2c_2}, \qquad \hat V_j = \frac{n_{1j}\hat p_{c1j}(1 - \hat p_{c1j})}{(n_{1j} + 2c_1)^2} + \frac{n_{2j}\hat p_{c2j}(1 - \hat p_{c2j})}{(n_{2j} + 2c_2)^2},$$

$$\hat a_j = \hat E_j\,\hat a - \hat b, \qquad \hat a = \sum_{j=1}^{k} \hat V_j^{-1} = \sum_{j=1}^{k} \frac{1}{\hat V_j}, \qquad \hat b = \sum_{j=1}^{k} \frac{\hat E_j}{\hat V_j}, \qquad \hat\theta_{pool} = \hat p_1 - \hat p_2,$$

$$\hat p_1 = \frac{\sum_{j=1}^{k} n_{1j}\hat p_{1j}}{\sum_{j=1}^{k} n_{1j}} = \frac{\sum_{j=1}^{k} X_{1j}}{\sum_{j=1}^{k} n_{1j}}, \qquad \hat p_2 = \frac{\sum_{j=1}^{k} n_{2j}\hat p_{2j}}{\sum_{j=1}^{k} n_{2j}} = \frac{\sum_{j=1}^{k} X_{2j}}{\sum_{j=1}^{k} n_{2j}} .$$

In the particular case $c_1 = c_2 = 0$, the estimator $\hat\theta_{cw} = \sum_{j=1}^{k} f_{cj}\hat\theta_{cj}$ reduces to the popular inverse-variance weighted estimator. Under a common risk difference over all centers, the variance of $\hat\theta_{cw}$ in the case of nonrandom weights $f_{cj}$ is

$$V(\hat\theta_{cw}) = \sum_{j=1}^{k} f_{cj}^2\, V(\hat\theta_{cj}) = \sum_{j=1}^{k} f_{cj}^2 \left[\frac{n_{1j}p_{1j}(1 - p_{1j})}{(n_{1j} + 2c_1)^2} + \frac{n_{2j}p_{2j}(1 - p_{2j})}{(n_{2j} + 2c_2)^2}\right].$$

Supposing that a normal approximation is reliable, the asymptotic distribution is

$$\frac{\hat\theta_{cw} - \theta}{\sqrt{\hat V(\hat\theta_{cw})}} = \frac{\sum_{j=1}^{k} \hat f_{cj}\hat\theta_{cj} - \theta}{\sqrt{\sum_{j=1}^{k} \hat f_{cj}^2\, \hat V(\hat\theta_{cj})}} \sim N(0, 1),$$

and for testing $H_0: \theta = \theta_0$ we have the normal approximate test

$$Z_{cw} = \frac{\sum_{j=1}^{k} \hat f_{cj}\hat\theta_{cj} - \theta_0}{\sqrt{\sum_{j=1}^{k} \hat f_{cj}^2\, \hat V(\hat\theta_{cj} \mid H_0)}} .$$

We reject $H_0$ at level $\alpha$ for the two-sided test if $|Z_{cw}| \ge z_{\alpha/2}$, where $z_{\alpha/2}$ is the upper $100(\alpha/2)$th percentile of the standard normal distribution. Alternatively, $H_0$ is rejected when the p-value $p$ is less than or equal to $\alpha$, where $p = 2[1 - \Phi(|Z_{cw}|)]$ and $\Phi$ is the standard normal cumulative distribution function.

3. Other Well-Known Weights

3.1. Cochran-Mantel-Haenszel (CMH) Weights

Cochran [10,11] proposed an estimator of a common risk difference $\theta$, weighted by center-specific sample sizes and based on the unconditional binomial likelihood, as

$$\hat\theta_{CMH} = \frac{\sum_{j=1}^{k} w_j\hat\theta_j}{\sum_{j=1}^{k} w_j},$$

where $w_j = (1/n_{1j} + 1/n_{2j})^{-1} = n_{1j}n_{2j}/(n_{1j} + n_{2j})$ and $\hat\theta_j = \hat p_{1j} - \hat p_{2j} = X_{1j}/n_{1j} - X_{2j}/n_{2j}$. Cochran's weight $w_j$ is widely used as a standard nonrandom weight derived from the harmonic mean of the center-specific sample sizes. Note that $f_j = w_j / \sum_{j=1}^{k} w_j$ is also Cochran's weight, subject to the condition $\sum_{j=1}^{k} f_j = 1$. A straightforward derivation shows that $\hat\theta_{CMH}$ is an unbiased estimate of $\theta$ and that the variance of $\hat\theta_{CMH}$ is readily available as

$$V(\hat\theta_{CMH}) = \frac{\sum_{j=1}^{k} w_j^2 V_j}{\big(\sum_{j=1}^{k} w_j\big)^2},$$

where $V_j = p_{1j}(1 - p_{1j})/n_{1j} + p_{2j}(1 - p_{2j})/n_{2j}$. Assuming that a normal approximation is reliable, Cochran's Z-statistic for testing $H_0: \theta = \theta_0$ is

$$Z_{CMH} = \frac{\sum_{j=1}^{k} w_j\hat\theta_j \big/ \sum_{j=1}^{k} w_j - \theta_0}{\sqrt{\sum_{j=1}^{k} w_j^2\, \hat V(\hat\theta_j \mid H_0) \big/ \big(\sum_{j=1}^{k} w_j\big)^2}},$$

where $\hat V(\hat\theta_j \mid H_0) = \hat p_{1j}(1 - \hat p_{1j})/n_{1j} + \hat p_{2j}(1 - \hat p_{2j})/n_{2j}$. The rejection rule for $H_0$ is the same as for the previous standard normal test.

Alternatively, Mantel and Haenszel [12] suggested a test based on the conditional hypergeometric likelihood for a common odds ratio among the set of $k$ tables under the null hypothesis $H_0: OR = 1$ ($\theta = 0$). Under this null criterion, the Mantel-Haenszel weight stated by Sanchez-Meca and Marin-Martinez [13] is $w_j = n_{1j}n_{2j}/(n_{1j} + n_{2j} - 1)$.
Since the conditional Mantel-Haenszel weight and the unconditional Cochran weight differ only slightly, in their denominators, the two are often referred to interchangeably as the Cochran-Mantel-Haenszel weight. In this study, we use $w_j = n_{1j}n_{2j}/(n_{1j} + n_{2j})$.

3.2. Inverse Variance (INV) or Weighted Least Square (WLS) Weights

Fleiss [14] and Lipsitz et al. [15] showed that the inverse-variance weighted (INV) estimator, or weighted least-square (WLS) estimator, of $\theta$ is a summary estimator in the form of a weighted mean (a linear, unbiased estimator),

$$\hat\theta_{INV} = \frac{\sum_{j=1}^{k} w_j\hat\theta_j}{\sum_{j=1}^{k} w_j},$$

where $\hat\theta_j = \hat p_{1j} - \hat p_{2j} = X_{1j}/n_{1j} - X_{2j}/n_{2j}$ and $w_j$ is defined as the reciprocal of the variance,

$$w_j = \left[\frac{p_{1j}(1 - p_{1j})}{n_{1j}} + \frac{p_{2j}(1 - p_{2j})}{n_{2j}}\right]^{-1} = \frac{1}{V_j} .$$

The nonrandom and nonnegative weights $w_j$ yield the minimum variance of the summary estimator $\hat\theta_{INV}$ for estimating $\theta$. The variance of $\hat\theta_{INV}$ is

$$V(\hat\theta_{INV}) = \frac{\sum_{j=1}^{k} w_j^2 V_j}{\big(\sum_{j=1}^{k} w_j\big)^2} = \frac{\sum_{j=1}^{k} w_j}{\big(\sum_{j=1}^{k} w_j\big)^2} = \frac{1}{\sum_{j=1}^{k} w_j} .$$

However, the weights $w_j$ cannot be used in practice since $p_{1j}$ and $p_{2j}$ are unknown. It has therefore become common practice to replace them by their sample estimators, which yields

$$\hat w_j = \left[\frac{\hat p_{1j}(1 - \hat p_{1j})}{n_{1j}} + \frac{\hat p_{2j}(1 - \hat p_{2j})}{n_{2j}}\right]^{-1} .$$

This weight is suggested in several textbooks of epidemiology, such as Kleinbaum et al. [16], and in textbooks of meta-analysis, such as Petitti [17]. Assuming that a normal approximation is reliable, the inverse-variance weighted test statistic for testing $H_0: \theta = \theta_0$ is

$$Z_{INV} = \frac{\sum_{j=1}^{k} \hat w_j\hat\theta_j \big/ \sum_{j=1}^{k} \hat w_j - \theta_0}{\sqrt{1 \big/ \sum_{j=1}^{k} \hat w_j}},$$

where $\hat w_j = 1/\hat V(\hat\theta_j \mid H_0)$. The rejection rule for $H_0$ is again the same as for the standard normal test above.
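The three weighting schemes of Sections 2 and 3 can be compared side by side in a short Python sketch, here run on the three age strata of Table 4. The function names are hypothetical. The sketch assumes that $\hat E_j$ and $\hat V_j$ plug in the adjusted proportions $\hat p_{c1j}$ and $\hat p_{c2j}$, that $c_1 = c_2 = c$, and that the null variances are estimated by the same plug-in variances; it is an illustration, not the authors' code.

```python
from math import sqrt

# Table 4 (Pocock [20]): (X1j, n1j, X2j, n2j) for the three age strata.
STRATA = [(26, 453, 21, 464), (25, 174, 11, 165), (11, 70, 8, 69)]


def normalise(raw):
    """Scale raw weights so that they sum to one."""
    total = sum(raw)
    return [w / total for w in raw]


def center_stats(strata, c=0.0):
    """Per-center (adjusted) risk differences and plug-in variances."""
    theta, var = [], []
    for x1, n1, x2, n2 in strata:
        p1, p2 = (x1 + c) / (n1 + 2 * c), (x2 + c) / (n2 + 2 * c)
        theta.append(p1 - p2)
        var.append(n1 * p1 * (1 - p1) / (n1 + 2 * c) ** 2
                   + n2 * p2 * (1 - p2) / (n2 + 2 * c) ** 2)
    return theta, var


def cmh_weights(strata):
    """Cochran weights n1j * n2j / (n1j + n2j), normalised."""
    return normalise([n1 * n2 / (n1 + n2) for _, n1, _, n2 in strata])


def inv_weights(var):
    """Inverse-variance weights; undefined if a variance estimate is 0."""
    return normalise([1.0 / v for v in var])


def min_mse_weights(strata, c=1.0):
    """Minimum MSE weights with plug-in E_j, V_j and pooled theta."""
    _, var = center_stats(strata, c)
    expect = []
    for x1, n1, x2, n2 in strata:
        p1, p2 = (x1 + c) / (n1 + 2 * c), (x2 + c) / (n2 + 2 * c)
        expect.append((n1 * p1 + c) / (n1 + 2 * c)
                      - (n2 * p2 + c) / (n2 + 2 * c))
    pool = (sum(s[0] for s in strata) / sum(s[1] for s in strata)
            - sum(s[2] for s in strata) / sum(s[3] for s in strata))
    a = sum(1.0 / v for v in var)
    b = sum(e / v for e, v in zip(expect, var))
    a_j = [e * a - b for e in expect]
    denom = a + sum(x * e / v for x, e, v in zip(a_j, expect, var))
    shift = (pool * a - b) / denom
    return [(1.0 + x * shift) / (v * a) for x, v in zip(a_j, var)]


def summarise(weights, theta, var, theta0=0.0):
    """Summary estimate, standard error, and Z statistic for theta0."""
    est = sum(f * t for f, t in zip(weights, theta))
    se = sqrt(sum(f * f * v for f, v in zip(weights, var)))
    return est, se, (est - theta0) / se


theta_raw, var_raw = center_stats(STRATA, c=0.0)
theta_adj, var_adj = center_stats(STRATA, c=1.0)
results = {
    "CMH": summarise(cmh_weights(STRATA), theta_raw, var_raw),
    "INV": summarise(inv_weights(var_raw), theta_raw, var_raw),
    "minMSE(c=1)": summarise(min_mse_weights(STRATA, 1.0),
                             theta_adj, var_adj),
}
for name, (est, se, z) in results.items():
    print(f"{name}: estimate={est:.3f} se={se:.3f} z={z:.3f}")
```

Under these assumptions the sketch should agree, up to rounding, with the summary differences (0.031, 0.024, 0.030) and Z statistics (2.237, 1.823, 2.197) quoted for this example in Section 6. Note that `inv_weights` fails when a center has a zero variance estimate, which is exactly the sparse-data situation the adjustment by $c$ is meant to avoid.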
4. Monte Carlo Simulation

We perform simulations for estimating a common risk difference $\theta$ and for testing the null hypothesis $H_0: \theta = \theta_0$ according to the following plan.

Parameter Setting: The common risk difference $\theta$ takes constant values from 0 to 0.6 in steps of 0.1. Baseline risks $p_2 = (p_{21}, p_{22}, \ldots, p_{2k})$ in the control arm for the $j$th center ($j = 1, 2, \ldots, k$) are generated from a uniform distribution over $[0, 0.95 - \theta]$. The corresponding risks $p_{1j}$ for the treatment arm in the $j$th center are obtained as $p_{1j} = p_{2j} + \theta$. For example, if $\theta = 0.2$, then $p_{2j} \sim U(0, 0.75)$ and $p_{1j} = p_{2j} + \theta \sim U(0.2, 0.95)$. The sample sizes $n_{1j}$ and $n_{2j}$ take the values 4, 8, 16, 32, 100. The number of centers $k$ takes the values 1, 2, 4, 8, 16, 32.

Statistics: Binomial random variables $X_{1j}$ and $X_{2j}$ in the treatment and control arms are generated with parameters $(n_{1j}, p_{1j})$ and $(n_{2j}, p_{2j})$ for each center $j$.

Estimation: All summary estimates of $\theta$ are computed with the different weights. The procedure is replicated 5000 times. From these replicates, bias, variance, and MSE (mean square error) are computed in the conventional way.

Type I Error: Using the above parameter setting, we set $\theta = \theta_0$ under the null hypothesis $H_0: \theta = \theta_0$ and compute all tests. The procedure is replicated 5000 times. From these replicates, the number of rejections of the null hypothesis is counted to obtain the empirical type I error

$$\hat\alpha = \frac{\text{number of rejections of } H_0 \text{ when } H_0 \text{ is true}}{\text{number of replications (5000)}} .$$

The evaluation of the two-sided tests in terms of the type I error probability is based on the Cochran limits [18]:
At $\alpha = 0.01$, the value of $\hat\alpha$ lies in [0.005, 0.015].
At $\alpha = 0.05$, the value of $\hat\alpha$ lies in [0.04, 0.06].
At $\alpha = 0.10$, the value of $\hat\alpha$ lies in [0.08, 0.12].
If the empirical type I error $\hat\alpha$ lies within the Cochran limits, the statistical test is said to control the type I error.
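The type I error part of this plan can be sketched in Python for a single test. The sketch below uses only the CMH statistic of Section 3.1 with balanced sample sizes across centers; the function names, the seed, and the reduced number of replications in the usage line are assumptions made for illustration, and a replicate whose variance estimate is zero is counted as a non-rejection, a convention the paper does not state.

```python
import random
from math import sqrt

Z_975 = 1.959964  # upper 2.5% point of the standard normal distribution
COCHRAN_LIMITS = {0.01: (0.005, 0.015), 0.05: (0.04, 0.06), 0.10: (0.08, 0.12)}


def binomial(n, p, rng):
    """Draw one binomial(n, p) count as a sum of Bernoulli trials."""
    return sum(rng.random() < p for _ in range(n))


def cmh_z(strata, theta0):
    """CMH Z statistic; returns None when the variance estimate is zero."""
    w_sum, num, var = 0.0, 0.0, 0.0
    for x1, n1, x2, n2 in strata:
        p1, p2 = x1 / n1, x2 / n2
        w = n1 * n2 / (n1 + n2)
        w_sum += w
        num += w * (p1 - p2)
        var += w * w * (p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    if var == 0.0:
        return None
    return (num / w_sum - theta0) / (sqrt(var) / w_sum)


def empirical_type1(k, n1, n2, theta0, reps=5000, seed=1):
    """Share of replicates in which H0: theta = theta0 is rejected at 5%."""
    rng = random.Random(seed)
    rejections = 0
    for _ in range(reps):
        strata = []
        for _ in range(k):
            p2 = rng.uniform(0.0, 0.95 - theta0)  # baseline risk
            p1 = p2 + theta0                      # common risk difference
            strata.append((binomial(n1, p1, rng), n1,
                           binomial(n2, p2, rng), n2))
        z = cmh_z(strata, theta0)
        if z is not None and abs(z) > Z_975:
            rejections += 1
    return rejections / reps


def controls_type1(alpha_hat, alpha=0.05):
    """True if alpha_hat lies within the Cochran limits for alpha."""
    lower, upper = COCHRAN_LIMITS[alpha]
    return lower <= alpha_hat <= upper


alpha_hat = empirical_type1(k=4, n1=32, n2=32, theta0=0.0, reps=500)
print(alpha_hat, controls_type1(alpha_hat))
```

Passing a different statistic in place of `cmh_z` would give the corresponding rate for the INV or minimum MSE tests; the full design in the text uses 5000 replications and also varies $n_{1j}$, $n_{2j}$, $k$, and $\theta_0$.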
Power of Tests: Before tests are evaluated on their power, all tests being compared should be calibrated to have the same type I error rate under $H_0$; the test whose power is highest under $H_1$ is then the best test. To generate the alternative hypothesis, we assume the random effect model for $\theta_j$,

$$\theta_j = 0.1 + U_m = 0.1 + m(2U - 1),$$

where $U_m$, the between-center effect, is uniform over $(-m, m)$ for a given $m$ between 0 and 0.1 or, equivalently, $U$ is a uniform variable over (0, 1). That is, $E(\theta_j) = 0.1$ and $Var(\theta_j) = (2m)^2/12$. We also have $p_{1j} = p_{2j} + \theta_j$, where $p_{2j}$ is uniformly distributed over [0.1, 0.8]. Binomial random variables $X_{1j}$ and $X_{2j}$ are drawn with parameters $(n_{1j}, p_{1j})$ and $(n_{2j}, p_{2j})$, respectively. All proposed test statistics are then computed. The procedure is replicated 5000 times. From these replicates, the empirical power of each test is computed as

$$1 - \hat\beta = \frac{\text{number of rejections of } H_0 \text{ when } H_1 \text{ is true}}{\text{number of replications (5000)}} .$$

5. Results

Since the complete set of simulation results is too large to present, only selected cases are shown; the main findings are nevertheless summarized in full.

5.1. Results for Estimating Risk Differences

Table 1 presents some results for point estimation of a common risk difference $\theta$. The following conclusions can be drawn.
• The number of centers, $k$, does not change the order of the MSE of the weighted estimators, even though an increase in $k$ decreases the variance and the MSE of all estimators and thus increases efficiency. Likewise, for fixed $k$, increasing $n_{1j}$ and $n_{2j}$ decreases the variance of all estimators. Unbalanced $n_{1j}$ and $n_{2j}$ within center $j$ rarely affect the order of the MSE of the estimates.
• For the most common situations, $\theta = 0$, $0.1$, $0.2$, and $0.3$, the proposed summary estimator $\hat\theta_{cw}$ adjusted by $c_1 = c_2 = c = 1$ (and also by $c = 2$) is the best choice, with the smallest MSE. The estimator $\hat\theta_{cw}$ adjusted by $c = 0.5$ and the inverse-variance (INV) weighted estimator ($c = 0$) are close together and are the second choice. The Cochran-Mantel-Haenszel (CMH) weight performs worst in this simulation setting. This finding is useful in most clinical trials and in most studies of causal relations between a disease and a suspected risk factor, since the risk difference is often less than 0.25 [19].
• For $\theta = 0.4$, the proposed estimator $\hat\theta_{cw}$ adjusted by $c = 1$ performs best; for $\theta = 0.5$, the proposed estimator $\hat\theta_{cw}$ adjusted by $c = 0.5$ performs best; for $\theta = 0.6$, the INV weighted estimator ($c = 0$) performs best.

5.2. Results for Studying Type I Error

Table 2 presents some results on controlling the empirical type I error. The performance of the tests according to the empirical $\hat\alpha$ under $H_0: \theta = \theta_0$ can be summarized as follows.
Table 1. Mean, variance, and MSE for estimating θ.

θ     k    n1j   n2j   Measure   CMH         INV (c=0)   c=0.5       c=1         c=2
0.0   1    2     2     Mean      -0.001700   -0.000850   -0.001130   -0.000850   -0.000570
                       Var        0.171245    0.042811    0.076109    0.042811    0.019027
                       MSE        0.171250    0.042813    0.076112    0.042813    0.019028
0.0   1    4     4     Mean      -0.000800    0.000400   -0.000640   -0.000530   -0.000400
                       Var        0.088874    0.053058    0.056879    0.039499    0.022219
                       MSE        0.088875    0.053058    0.056880    0.039500    0.022219
0.0   1    8     8     Mean       0.002625    0.001965    0.002333    0.002100    0.001750
                       Var        0.042575    0.035480    0.033641    0.027249    0.018923
                       MSE        0.042584    0.035483    0.033647    0.027254    0.018926
0.0   1    16    16    Mean      -0.000050    0.000328   -0.000047   -0.000044   -0.000040
                       Var        0.021759    0.020761    0.019275    0.017193    0.013926
                       MSE        0.021759    0.020761    0.019275    0.017193    0.013926
0.0   1    32    32    Mean      -0.001900   -0.001950   -0.001840   -0.001790    0.001690
                       Var        0.010805    0.010674    0.010160    0.009572    0.008538
                       MSE        0.010809    0.010678    0.010164    0.009575    0.008540
0.0   1    100   100   Mean       0.000566    0.000572    0.000560    0.000555    0.000544
                       Var        0.003482    0.003478    0.003413    0.003346    0.003219
                       MSE        0.003482    0.003478    0.003413    0.003347    0.003219
0.1   16   2     2     Mean       0.102200    0.051100    0.068133    0.051100    0.034067
                       Var        0.178755    0.044689    0.079446    0.044689    0.019861
                       MSE        0.178759    0.047080    0.080462    0.047080    0.024210
0.1   16   4     4     Mean       0.101900    0.071067    0.081520    0.067933    0.050950
                       Var        0.093292    0.056358    0.059708    0.041462    0.023323
                       MSE        0.093295    0.057194    0.060047    0.042490    0.025729
0.1   16   4     8     Mean       0.091175    0.073915    0.078964    0.069820    0.056883
                       Var        0.068527    0.048536    0.047903    0.036184    0.023445
                       MSE        0.068605    0.049217    0.048345    0.037095    0.025305
0.1   16   4     16    Mean       0.096425    0.086770    0.087330    0.080322    0.069865
                       Var        0.057752    0.041273    0.040889    0.032469    0.024164
                       MSE        0.057764    0.041448    0.041048    0.032856    0.025072
0.1   16   4     32    Mean       0.103087    0.094537    0.095306    0.089488    0.080958
                       Var        0.052651    0.037007    0.037127    0.030458    0.025400
                       MSE        0.052662    0.037037    0.037149    0.030568    0.025763
0.1   16   8     8     Mean       0.105625    0.091604    0.093890    0.084500    0.070417
                       Var        0.047621    0.041375    0.037626    0.030478    0.021165
                       MSE        0.047653    0.041446    0.037664    0.030718    0.022040
0.1   16   8     16    Mean       0.100700    0.094838    0.093524    0.087382    0.077367
                       Var        0.035620    0.031899    0.029404    0.024987    0.019128
                       MSE        0.035620    0.031926    0.029445    0.025147    0.019641
0.1   16   8     32    Mean       0.097381    0.093334    0.092488    0.088258    0.081217
                       Var        0.028539    0.025407    0.023764    0.020808    0.017542
                       MSE        0.028546    0.025452    0.023820    0.020945    0.017895
0.1   16   16    16    Mean       0.099100    0.094834    0.093271    0.088089    0.079280
                       Var        0.023792    0.023050    0.021075    0.018798    0.015227
                       MSE        0.023793    0.023077    0.021120    0.018941    0.015656
0.1   16   32    32    Mean       0.100794    0.099611    0.097741    0.094866    0.089594
                       Var        0.011022    0.010951    0.010364    0.009764    0.008709
                       MSE        0.011023    0.010951    0.010369    0.009790    0.008817
0.1   16   100   100   Mean       0.100052    0.099934    0.099061    0.098092    0.096204
                       Var        0.003728    0.003725    0.003654    0.003583    0.003446
                       MSE        0.003728    0.003725    0.003655    0.003587    0.003461
Table 2. Empirical type I error (percent) for testing H0: θ = θ0 at the 5% significance level.

θ0    k    n1j   n2j   CMH     INV (c=0)   c=0.5   c=1     c=2
0.0   1    4     4     3.42    3.42        3.42    3.42    3.42
           4     8     2.08    2.08        6.84    4.76    4.76
           4     16    3.00    3.00        6.52    5.80    8.18
           4     32    2.76    2.76        6.66    6.18    10.50
           4     100   2.54    2.54        7.30    6.46    14.40
           8     8     3.28    3.28        6.76    4.16    4.16
           8     16    4.26    4.26        6.54    4.74    4.30
           8     32    4.34    4.34        5.58    4.22    5.10
           8     100   5.02    5.02        6.58    6.00    8.90
           16    16    4.74    4.74        4.48    4.48    3.38
           16    32    4.50    4.50        4.94    4.44    3.90
           16    100   5.02    5.02        5.30    4.58    5.10
           32    32    5.04    5.04        4.66    4.34    3.88
           32    100   5.22    5.22        5.16    4.46    4.34
           100   100   4.74    4.74        4.60    4.40    4.14
0.0   4    4     4     3.68    3.68        3.68    3.68    3.68
           8     8     3.40    3.40        7.14    4.56    4.56
           16    16    4.84    4.84        4.66    4.66    3.54
           16    32    4.52    4.52        5.00    4.52    4.10
           16    100   5.46    5.46        5.66    4.72    5.26
           32    32    4.74    4.74        4.42    4.18    3.92
           32    100   5.34    5.34        5.48    4.74    4.46
           100   100   5.04    5.04        4.98    4.86    4.64
0.1   4    4     4     1.26    1.26        8.28    8.28    6.22
           8     8     4.24    4.24        7.6     4.66    4.66
           16    16    5.18    5.18        5.76    5.04    4.06
           16    32    5.66    5.66        5.82    5.40    5.30
           16    100   5.86    5.86        6.20    4.84    4.88
           32    32    5.72    5.72        5.64    4.96    4.44
           32    100   5.88    5.88        5.44    5.20    4.82
           100   100   5.22    5.22        5.16    5.10    4.82
0.2   4    4     4     1.74    1.74        4.36    4.36    8.00
           8     8     4.66    4.66        8.58    5.38    5.38
           16    16    7.54    7.54        6.32    6.28    6.58
           16    32    7.26    7.26        6.22    5.56    5.60
           16    100   6.24    6.24        6.18    5.40    5.88
           32    32    5.46    5.46        5.40    5.46    5.08
           32    100   5.56    5.56        5.26    5.22    4.88
           100   100   5.34    5.34        5.16    5.10    5.22
0.4   4    4     4     3.00    3.00        12.06   7.44    18.04
           8     8     8.00    8.00        6.82    9.18    12.04
           16    16    5.78    5.78        5.92    5.16    7.04
           16    32    6.82    6.82        6.56    6.16    7.56
           16    100   6.38    6.38        6.18    5.80    7.06
           32    32    5.96    5.96        5.78    5.94    6.28
           32    100   5.92    5.92        5.80    6.04    6.72
           100   100   5.68    5.68        5.34    5.14    5.48

Bold values in the published table denote that the statistical test can control the type I error, that is, the empirical rate lies within the Cochran limits of 4 to 6 percent.
• Increasing $k$ does not change the order of the empirical type I error rates of the tests. Unbalanced $n_{1j}$ and $n_{2j}$ within center $j$ have only a slight effect on the order of the empirical type I error rates of the tests.
• None of the tests can control the type I error rate when the sample size of the treatment or control arm is very small ($n_{1j} = 4$ or $n_{2j} = 4$). Few tests can control the type I error when the sample size is small ($n_{1j} = 8$ or $n_{2j} = 8$).
• For $\theta_0 = 0$, almost all tests can control the type I error rate when the sample size is moderate to large ($n_{1j} \ge 16$ or $n_{2j} \ge 16$). This case arises frequently in practice, where $H_0: \theta = 0$ is tested.
• For $\theta_0 = 0.2$, $0.4$, and $0.6$, almost all tests can control the type I error rate when the sample size is large to very large ($n_{1j} \ge 32$ or $n_{2j} \ge 32$).

5.3. Results for Studying Power of Tests

Table 3 gives more details of the powers. Fortunately, almost all tests under $H_0: \theta = 0$ can control the type I error rate when the sample size is moderate to large ($n_{1j} \ge 16$ or $n_{2j} \ge 16$). We do not compare the tests when the sample size is very small ($n_{1j} = 4$ or $n_{2j} = 4$), since none of the tests can control the type I error rate there. The performance of the weighted tests according to their powers under $H_1: \theta_j = 0.1 + U_m$ can be summarized as follows.
• The empirical powers follow a pattern similar to that of the MSE. An increase in the number of centers, $k$, increases the power but does not change the order of the powers.
• Overall, the proposed weights adjusted by $c = 1$ (and also by $c = 2$) perform best, with the highest power, in a multi-center study of size $k \ge 2$ when $n_{1j} \ge 16$ or $n_{2j} \ge 16$.
• The INV weight and the CMH weight achieve the highest powers in a single-center study when $n_{1j} \ge 16$ or $n_{2j} \ge 16$.
• When the sample size is large to very large ($n_{1j} \ge 32$ or $n_{2j} \ge 32$), all weights perform well.

6. Numerical Examples

Two examples are presented to illustrate the implementation of the methodology.
Pocock [20] presented data from a randomized trial studying the effect of placebo and metoprolol on mortality after heart attack (AMI: acute myocardial infarction), classified by three age strata, namely 40-64, 65-69, and 70-74 years. Table 4 shows the data and the weights corresponding to the CMH, the INV, and the proposed strategies. The estimated summary differences based on the CMH, the INV, and the proposed weights are 0.031, 0.024, and 0.030, respectively. The estimated standard errors of these overall differences are 0.014, 0.013, and 0.014, respectively. Since both $Z_{CMH} = 2.237$ and $Z_{cw} = 2.197$ are greater than $z_{\alpha/2} = 1.96$, the CMH test and the proposed test at $c = 1$ reject the null hypothesis at the 5% level for a two-sided test and lead to the conclusion of a significant difference between the placebo and metoprolol mortality rates, whereas the INV test with $Z_{INV} = 1.823$ fails to reject the null hypothesis at the 5% level.

Turner et al. [21] presented data from clinical trials studying the effect of selective decontamination of the digestive tract on the risk of respiratory tract infection in patients in intensive care units. The data and weights are given in Table 5. The estimated overall differences and their estimated standard errors are 0.152 (0.012), 0.140 (0.011), and 0.162 (0.012) for the CMH, the INV, and the proposed weights at $c = 1$, respectively. All tests reject the null hypothesis, with $Z_{CMH} = 12.584$, $Z_{INV} = 12.215$, and $Z_{cw} = 13.719$, and lead to the conclusion that selective decontamination of the digestive tract has a significant effect on the risk of respiratory tract infection.

7. Conclusions and Discussion

In the most common situations, where the risk difference lies in [0, 0.25], the results confirm that the minimum MSE weight of the proposed summary estimator $\hat\theta_{cw}$ adjusted by $c_1 = c_2 = c = 1$ (and also by $c_1 = c_2 = c = 2$) is the best choice, with the smallest MSE, under a constant common risk difference over all $k$ centers.
The number of centers, $k$, does not change the order of the MSE of the weighted estimators, even though an increase in $k$ decreases the variance and the MSE of all weighted estimators. Likewise, for fixed $k$, increasing $n_{1j}$ and $n_{2j}$ decreases the variance of all estimators. Unbalanced $n_{1j}$ and $n_{2j}$ within center $j$ have only a slight effect on the order of the MSE of the estimates. The minimum MSE weight is designed to yield a more precise estimate than the CMH and INV weights. Another benefit of the proposed weight is that it is easy to compute because of its closed-form formula. On the basis of the smallest MSE and the easy-to-compute formula, we suggest using the proposed weight. In addition, the choice of $c$ has been reconsidered. The use of $c = 0.5$ as a conventional correction term [22] should be revised. A better value of $c$ to add to the number of successes and to the number of failures is at least $c = 1$ (including $c = 2$). This result is supported by Böhning and Viwatwongkasem [6], Agresti and Coull [1], and Agresti and Caffo [2], who recommended values of $c$ greater than or equal to 1.
Table 3. Empirical power (percent) at m = 0.04 after controlling the estimated type I error at the nominal 5% level.

X = controllable type I error rate. Each row gives n1j and n2j, then the X marks, then the empirical power rates. Both blocks follow the column order CMH, INV, c=0.5, c=1, c=2; rows with fewer than five entries list only the non-empty cells of the published table, in column order.

k = 1
8    8     X X          6.8 6.8
8    16    X X X X      7.3 7.3 8.3 7.4
8    32    X X X X X    9.5 9.5 11.5 9.6 9.9
8    100   X X X        11.3 11.3 11.8
16   16    X X X X      11.2 11.2 10.6 10.6
16   32    X X X X      12.2 12.2 12.7 11.8
16   100   X X X X X    16.4 16.4 15.4 14.8 14.6
32   32    X X X X      17.6 17.6 16.5 16.4
32   100   X X X X X    21.4 21.4 21.2 20.8 20.3
100  100   X X X X X    36.8 36.8 36.8 36.5 36.1

k = 4
8    8     X X          26.9 29.7
8    16    X X          29.5 32.8
8    32    X X X X X    20.8 23.8 31.5 33.1 35.2
8    100   X X X        23.6 28.9 36.8
16   16    X X X X      25.3 27.0 31.0 33.4
16   32    X X X X X    32.4 35.9 38.0 40.6 43.6
16   100   X X X X X    39.1 44.6 45.2 46.8 48.9
32   32    X X X X      44.2 46.1 47.8 49.5
32   100   X X X X X    58.1 60.6 61.4 62.8 64.6
100  100   X X X X X    85.6 87.0 86.7 87.2 87.8

k = 8
8    8
8    16    X X          44.0 48.8
8    32    X X X X X    35.3 39.5 50.2 53.9 59.0
8    100   X X          39.9 46.0
16   16    X X X X      43.1 45.9 52.7 56.9
16   32    X X X X X    53.4 57.0 61.3 64.3 68.5
16   100   X X X X X    65.7 69.3 72.0 74.5 77.1
32   32    X X X X X    71.1 72.9 74.8 76.9 80.4
32   100   X X X X X    86.1 87.7 88.3 89.1 90.4
100  100   X X X X X    98.8 98.9 99.0 99.1 99.1

k = 16
8    8     X X          68.3 77.5
8    16    X            68.5
8    32    X X X X X    60.9 64.6 74.2 77.1 82.1
8    100   X X X        67.4 72.1 82.0
16   16    X X X X      71.0 73.8 79.0 82.2
16   32    X X X X X    82.5 84.4 87.3 89.2 92.0
16   100   X X X X X    90.3 90.8 93.1 93.8 94.8
32   32    X X X X      93.9 94.9 95.3 96.0
32   100   X X X X X    99.0 99.1 99.1 99.2 99.3
100  100   X X X X X    100.0 100.0 100.0 100.0 100.0

k = 32
8    8
8    16    X X X X      81.8 83.2 92.7 95.6
8    32    X X X X X    88.7 90.0 94.1 95.1 96.7
8    100   X X X        92.2 93.4 96.4
16   16    X X          94.5 95.0 97.5
16   32    X X X X      98.1 98.5 99.0 99.1
16   100   X X X X X    99.7 99.5 99.8 99.9 99.9
32   32    X X X X      99.8 99.9 99.9 99.9
32   100   X X X X X    100.0 100.0 100.0 100.0 100.0
100  100   X X X X X    100.0 100.0 100.0 100.0 100.0
Table 4. Mortality data over three strata of age groups following Pocock.

          Placebo        Metoprolol               Weights
Age       X1j    n1j     X2j    n2j     θ̂j       CMH    INV    c=1
40-64     26     453     21     464     0.012    0.66   0.79   0.69
65-69     25     174     11     165     0.077    0.24   0.16   0.25
70-74     11     70      8      69      0.041    0.10   0.05   0.06

Table 5. Respiratory tract infections following Turner et al.

          Treatment I    Treatment II             Weights
Trial j   X1j    n1j     X2j    n2j     θ̂j       CMH    INV    c=1
1         25     54      7      47      0.314    0.03   0.02   0.02
2         24     41      4      38      0.480    0.02   0.02   0.03
3         37     95      20     96      0.181    0.05   0.03   0.04
4         11     17      1      14      0.576    0.01   0.01   0.01
5         26     49      10     48      0.322    0.03   0.02   0.02
6         13     84      2      101     0.135    0.05   0.07   0.07
7         38     170     12     161     0.149    0.09   0.09   0.09
8         29     60      1      28      0.448    0.02   0.02   0.03
9         9      20      1      19      0.397    0.01   0.01   0.01
10        44     47      22     49      0.487    0.03   0.02   0.03
11        30     160     25     162     0.033    0.08   0.07   0.06
12        40     185     31     200     0.061    0.10   0.08   0.07
13        10     41      9      39      0.013    0.02   0.01   0.01
14        40     185     22     193     0.102    0.10   0.09   0.09
15        4      46      0      45      0.087    0.02   0.06   0.04
16        60     140     31     131     0.192    0.07   0.04   0.05
17        12     75      4      75      0.107    0.04   0.05   0.05
18        42     225     31     220     0.046    0.12   0.11   0.09
19        26     57      7      55      0.329    0.03   0.02   0.03
20        17     92      3      91      0.152    0.05   0.07   0.07
21        23     23      14     25      0.440    0.01   0.01   0.02
22        6      68      3      65      0.042    0.03   0.07   0.05

In terms of the type I error estimates, none of the tests can control the type I error rate when the sample size is very small ($n_{1j} = 4$ or $n_{2j} = 4$). In addition, few tests can control the type I error rate when the sample size is small ($n_{1j} = 8$ or $n_{2j} = 8$). This result is consistent with the comment of Lui [23] that none of the conventional tests/weights is appropriate under sparse data. According to the present findings, this inadequacy under sparse data also applies to the minimum MSE weights. Finding appropriate tests/weights for sparse data remains a challenge for investigators, who need to develop new tests/weights or improve the existing ones.
In general, almost all tests can control the type I error rate when the sample size is moderate to large ($n_{1j} \ge 16$ or $n_{2j} \ge 16$). In terms of power, we do not evaluate the power when the sample size is very small ($n_{1j} = 4$ or $n_{2j} = 4$) because none of the tests can control the type I error rate. The results show the same pattern as the MSE. The proposed weights adjusted by $c = 1$ (and also by $c = 2$) perform best, with the highest power, in a multi-center study of size $k \ge 2$ when $n_{1j} \ge 16$ or $n_{2j} \ge 16$. The INV weight and the CMH weight achieve the highest powers in a single-center study when $n_{1j} \ge 16$ or $n_{2j} \ge 16$. When the sample size is large to very large ($n_{1j} \ge 32$ or $n_{2j} \ge 32$), all tests perform well. We
strongly recommend the minimum MSE weight as an appropriate choice because it attains the highest power.

8. Acknowledgements

We would like to thank the editors and the referees for comments which greatly improved this paper. This study was partially supported for publication by the China Medical Board (CMB), Faculty of Public Health, Mahidol University, Bangkok, Thailand.

REFERENCES

[1] A. Agresti and B. A. Coull, “Approximate Is Better than Exact for Interval Estimation of Binomial Proportions,” The American Statistician, Vol. 52, 1998, pp. 119-126.
[2] A. Agresti and B. Caffo, “Simple and Effective Confidence Intervals for Proportions and Differences of Proportions Result from Adding Two Successes and Two Failures,” The American Statistician, Vol. 54, No. 4, 2000, pp. 280-288. doi:10.2307/2685779
[3] B. K. Ghosh, “A Comparison of Some Approximate Confidence Interval for the Binomial Parameter,” Journal of the American Statistical Association, Vol. 74, No. 368, 1979, pp. 894-900. doi:10.2307/2286420
[4] R. G. Newcombe, “Two-Sided Confidence Intervals for the Single Proportion: Comparison of Seven Methods,” Statistics in Medicine, Vol. 17, No. 8, 1998, pp. 857-872. doi:10.1002/(SICI)1097-0258(19980430)17:8<857::AID-SIM777>3.0.CO;2-E
[5] R. G. Newcombe, “Interval Estimation for the Difference between Independent Proportions: Comparison of Eleven Methods,” Statistics in Medicine, Vol. 17, No. 8, 1998, pp. 873-890. doi:10.1002/(SICI)1097-0258(19980430)17:8<873::AID-SIM779>3.0.CO;2-I
[6] D. Böhning and C. Viwatwongkasem, “Revisiting Proportion Estimators,” Statistical Methods in Medical Research, Vol. 14, No. 2, 2005, pp. 147-169. doi:10.1191/0962280205sm393oa
[7] G. Casella and R. L. Berger, “Statistical Inference,” Duxbury Press, Belmont, 1990.
[8] A. E. Taylor and W. R. Mann, “Advanced Calculus,” John Wiley & Sons, New York, 1972.
[9] K. J. Lui and K. C. Chang, “Testing Homogeneity of Risk Difference in Stratified Randomized Trials with Noncompliance,” Computational Statistics and Data Analysis, Vol. 53, No. 1, 2008, pp. 209-221. doi:10.1016/j.csda.2008.07.016
[10] W. G. Cochran, “The Combination of Estimates from Different Experiments,” Biometrics, Vol. 10, No. 1, 1954, pp. 101-129. doi:10.2307/3001666
[11] W. G. Cochran, “Some Methods for Strengthening the Common Chi-Square Test,” Biometrics, Vol. 10, No. 4, 1954, pp. 417-451. doi:10.2307/3001616
[12] N. Mantel and W. Haenszel, “Statistical Aspects of the Analysis of Data from Retrospective Studies of Disease,” Journal of the National Cancer Institute, Vol. 22, 1959, pp. 719-748.
[13] J. Sanchez-Meca and F. Marin-Martinez, “Testing the Significance of a Common Risk Difference in Meta-Analysis,” Computational Statistics & Data Analysis, Vol. 33, No. 3, 2000, pp. 299-313. doi:10.1016/S0167-9473(99)00055-9
[14] J. L. Fleiss, “Statistical Methods for Rates and Proportions,” John Wiley & Sons Inc., New York, 1981.
[15] S. R. Lipsitz, K. B. G. Dear, N. M. Laird and G. Molenberghs, “Tests for Homogeneity of the Risk Difference When Data Are Sparse,” Biometrics, Vol. 54, No. 1, 1998, pp. 148-160. doi:10.2307/2534003
[16] D. G. Kleinbaum, L. L. Kupper and H. Morgenstern, “Epidemiologic Research: Principles and Quantitative Methods,” Lifetime Learning Publications, Belmont, 1982.
[17] D. B. Petitti, “Meta-Analysis, Decision Analysis and Cost-Effectiveness Analysis: Methods for Quantitative Synthesis in Medicine,” Oxford University Press, Oxford, 1994.
[18] W. G. Cochran, “The Chi-Square Test of Goodness of Fit,” Annals of Mathematical Statistics, Vol. 23, No. 3, 1952, pp. 315-345. doi:10.1214/aoms/1177729380
[19] K. J. Lui and C. Kelly, “Tests for Homogeneity of the Risk Ratio in a Series of 2 × 2 Tables,” Statistics in Medicine, Vol. 19, No. 21, 2000, pp. 2919-2932. doi:10.1002/1097-0258(20001115)19:21<2919::AID-SIM561>3.0.CO;2-D
[20] S. J. Pocock, “Clinical Trials: A Practical Approach,” Wiley Publication, New York, 1997.
[21] R. Turner, R. Omar, M. Yang, H. Goldstein and S. Thompson, “A Multilevel Model Framework for Meta-Analysis of Clinical Trials with Binary Outcome,” Statistics in Medicine, Vol. 19, No. 24, 2000, pp. 3417-3432. doi:10.1002/1097-0258(20001230)19:24<3417::AID-SIM614>3.0.CO;2-L
[22] F. Yates, “Contingency Tables Involving Small Numbers and the Chi-Squared Test,” Journal of the Royal Statistical Society (Supplement), Vol. 1, 1934, pp. 217-235.
[23] K. J. Lui, “A Simple Test of the Homogeneity of Risk Difference in Sparse Data: An Application to a Multicenter Study,” Biometrical Journal, Vol. 47, No. 5, 2005, pp. 654-661. doi:10.1002/bimj.200410150
[24] A. C. Rencher, “Linear Models in Statistics,” Wiley Series in Probability and Mathematical Statistics, New York, 2000.
[25] A. Sen and M. Srivastava, “Regression Analysis: Theory, Methods, and Applications,” Springer-Verlag, New York, 1990.
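Appendix

Under a true common risk difference $\theta$ over all $k$ centers ($j = 1, 2, \ldots, k$), the mean square error of $\hat\theta_{cw} = \sum_{j=1}^{k} f_{cj}\hat\theta_{cj}$ is given by

$$\mathrm{MSE}(\hat\theta_{cw}) = E\left(\hat\theta_{cw} - \theta\right)^2 = E\left(\sum_{j=1}^{k} f_{cj}\hat\theta_{cj} - \theta\right)^2 .$$

To obtain the optimal weights $f_{cj}$ subject to the constraint $\sum_{j=1}^{k} f_{cj} - 1 = 0$, we form the auxiliary function $\phi$ by following Lagrange's method and seek the $f_{cj}$ that minimize

$$\begin{aligned}
\phi &= E\left(\sum_{j=1}^{k} f_{cj}\hat\theta_{cj} - \theta\right)^2 + \lambda\left(\sum_{j=1}^{k} f_{cj} - 1\right) \\
&= E\left(\sum_{j=1}^{k} f_{cj}\hat\theta_{cj}\right)^2 - 2\theta\, E\left(\sum_{j=1}^{k} f_{cj}\hat\theta_{cj}\right) + \theta^2 + \lambda\left(\sum_{j=1}^{k} f_{cj} - 1\right) \\
&= V\left(\sum_{j=1}^{k} f_{cj}\hat\theta_{cj}\right) + \left[E\left(\sum_{j=1}^{k} f_{cj}\hat\theta_{cj}\right)\right]^2 - 2\theta \sum_{j=1}^{k} f_{cj} E(\hat\theta_{cj}) + \theta^2 + \lambda\left(\sum_{j=1}^{k} f_{cj} - 1\right) \\
&= \sum_{j=1}^{k} f_{cj}^2 V(\hat\theta_{cj}) + \left[\sum_{j=1}^{k} f_{cj} E(\hat\theta_{cj})\right]^2 - 2\theta \sum_{j=1}^{k} f_{cj} E(\hat\theta_{cj}) + \theta^2 + \lambda\left(\sum_{j=1}^{k} f_{cj} - 1\right),
\end{aligned}$$

where the last line uses the independence of the centers.

Let $V(\hat\theta_{cj}) = V_j$ and $E(\hat\theta_{cj}) = E_j$. The partial derivatives with respect to $\lambda$ and $f_{cj}$ yield

$$\frac{\partial \phi}{\partial \lambda} = \sum_{j=1}^{k} f_{cj} - 1, \qquad \frac{\partial \phi}{\partial f_{cj}} = 2 f_{cj} V_j + 2 E_j \sum_{m=1}^{k} f_{cm} E_m - 2\theta E_j + \lambda .$$

Setting $\partial\phi/\partial\lambda = 0$ and $\partial\phi/\partial f_{cj} = 0$ yields

$$\sum_{j=1}^{k} f_{cj} = 1, \qquad f_{cj} = \frac{E_j}{V_j}\left(\theta - \sum_{m=1}^{k} f_{cm} E_m\right) - \frac{\lambda}{2 V_j} .$$

Solving for $\lambda$ by summing $f_{cj}$ over $j$ yields

$$1 = \sum_{j=1}^{k} f_{cj} = \left(\theta - \sum_{m=1}^{k} f_{cm} E_m\right) \sum_{j=1}^{k} \frac{E_j}{V_j} - \frac{\lambda}{2} \sum_{j=1}^{k} \frac{1}{V_j} .$$

Let $a = \sum_{j=1}^{k} V_j^{-1} = \sum_{j=1}^{k} \frac{1}{V_j}$ and $b = \sum_{j=1}^{k} \frac{E_j}{V_j}$; then $\lambda/2$ can be written as

$$\frac{\lambda}{2} = \frac{b}{a}\left(\theta - \sum_{m=1}^{k} f_{cm} E_m\right) - \frac{1}{a} .$$

Hence,

$$\begin{aligned}
f_{cj} &= \frac{E_j}{V_j}\left(\theta - \sum_{m=1}^{k} f_{cm} E_m\right) - \frac{b\left(\theta - \sum_{m=1}^{k} f_{cm} E_m\right) - 1}{a V_j}, \\
a V_j f_{cj} &= a E_j\left(\theta - \sum_{m=1}^{k} f_{cm} E_m\right) - b\left(\theta - \sum_{m=1}^{k} f_{cm} E_m\right) + 1 \\
&= a E_j \theta - a E_j \sum_{m=1}^{k} f_{cm} E_m - b\theta + b \sum_{m=1}^{k} f_{cm} E_m + 1,
\end{aligned}$$

so that

$$a V_j f_{cj} + (a E_j - b) \sum_{m=1}^{k} f_{cm} E_m = (a E_j - b)\theta + 1 .$$

Let $t_j = a E_j - b$. Then

$$a V_j f_{cj} + t_j \sum_{m=1}^{k} f_{cm} E_m = t_j \theta + 1,$$

that is,

$$a V_j f_{cj} + t_j \left(f_{c1} E_1 + f_{c2} E_2 + \cdots + f_{ck} E_k\right) = t_j \theta + 1 .$$

Substituting each value of the subscript $j$ and rearranging terms gives

$$\begin{aligned}
j = 1: &\quad (a V_1 + t_1 E_1) f_{c1} + t_1 E_2 f_{c2} + t_1 E_3 f_{c3} + \cdots + t_1 E_k f_{ck} = t_1 \theta + 1, \\
j = 2: &\quad t_2 E_1 f_{c1} + (a V_2 + t_2 E_2) f_{c2} + t_2 E_3 f_{c3} + \cdots + t_2 E_k f_{ck} = t_2 \theta + 1, \\
j = 3: &\quad t_3 E_1 f_{c1} + t_3 E_2 f_{c2} + (a V_3 + t_3 E_3) f_{c3} + \cdots + t_3 E_k f_{ck} = t_3 \theta + 1, \\
&\quad \vdots \\
j = k: &\quad t_k E_1 f_{c1} + t_k E_2 f_{c2} + t_k E_3 f_{c3} + \cdots + (a V_k + t_k E_k) f_{ck} = t_k \theta + 1 .
\end{aligned}$$

This can be written in matrix form as $Hf = y$, where

$$H = \begin{pmatrix}
a V_1 + t_1 E_1 & t_1 E_2 & t_1 E_3 & \cdots & t_1 E_k \\
t_2 E_1 & a V_2 + t_2 E_2 & t_2 E_3 & \cdots & t_2 E_k \\
t_3 E_1 & t_3 E_2 & a V_3 + t_3 E_3 & \cdots & t_3 E_k \\
\vdots & \vdots & \vdots & \ddots & \vdots \\
t_k E_1 & t_k E_2 & t_k E_3 & \cdots & a V_k + t_k E_k
\end{pmatrix},$$

$$f = \left(f_{c1}, f_{c2}, f_{c3}, \ldots, f_{ck}\right)^{\top}, \qquad y = \left(t_1\theta + 1,\; t_2\theta + 1,\; t_3\theta + 1,\; \ldots,\; t_k\theta + 1\right)^{\top} .$$

The matrix $H$ can be expressed as $H = D + t e^{\top}$, where

$$D = \begin{pmatrix}
a V_1 & 0 & \cdots & 0 \\
0 & a V_2 & \cdots & 0 \\
\vdots & \vdots & \ddots & \vdots \\
0 & 0 & \cdots & a V_k
\end{pmatrix}, \qquad t = \left(t_1, t_2, \ldots, t_k\right)^{\top}, \qquad e = \left(E_1, E_2, \ldots, E_k\right)^{\top} .$$

The inverse of a matrix of this form is given in several textbooks on linear models, such as Rencher [24] and Sen and Srivastava [25]:

$$H^{-1} = \left(D + t e^{\top}\right)^{-1} = D^{-1} - \frac{D^{-1} t e^{\top} D^{-1}}{1 + e^{\top} D^{-1} t} .$$

Therefore,

$$f = H^{-1} y = \left[D^{-1} - \frac{D^{-1} t e^{\top} D^{-1}}{1 + e^{\top} D^{-1} t}\right] y
= \begin{pmatrix}
\dfrac{1 + t_1\theta}{a V_1} - \dfrac{t_1}{a V_1} \cdot \dfrac{\sum_{m=1}^{k} V_m^{-1} E_m (1 + t_m \theta)}{a + \sum_{m=1}^{k} t_m E_m V_m^{-1}} \\
\dfrac{1 + t_2\theta}{a V_2} - \dfrac{t_2}{a V_2} \cdot \dfrac{\sum_{m=1}^{k} V_m^{-1} E_m (1 + t_m \theta)}{a + \sum_{m=1}^{k} t_m E_m V_m^{-1}} \\
\vdots \\
\dfrac{1 + t_k\theta}{a V_k} - \dfrac{t_k}{a V_k} \cdot \dfrac{\sum_{m=1}^{k} V_m^{-1} E_m (1 + t_m \theta)}{a + \sum_{m=1}^{k} t_m E_m V_m^{-1}}
\end{pmatrix} .$$

Therefore, for the $j$th center,

$$f_{cj} = \frac{1 + t_j\theta}{a V_j} - \frac{t_j}{a V_j} \cdot \frac{\sum_{m=1}^{k} V_m^{-1} E_m (1 + t_m \theta)}{a + \sum_{m=1}^{k} t_m E_m V_m^{-1}},$$

where $a = \sum_{j=1}^{k} V_j^{-1} = \sum_{j=1}^{k} \frac{1}{V_j}$, $b = \sum_{j=1}^{k} \frac{E_j}{V_j}$, $t_j = a E_j - b$,

$$E_j = E(\hat\theta_{cj}) = E(\hat p_{c1j} - \hat p_{c2j}) = \frac{n_{1j} p_{1j} + c}{n_{1j} + 2c} - \frac{n_{2j} p_{2j} + c}{n_{2j} + 2c},$$

$$V_j = V(\hat\theta_{cj}) = V(\hat p_{c1j}) + V(\hat p_{c2j}) = \frac{n_{1j} p_{1j} (1 - p_{1j})}{(n_{1j} + 2c)^2} + \frac{n_{2j} p_{2j} (1 - p_{2j})}{(n_{2j} + 2c)^2} .$$

In practice, the adjusted summary estimator is computed by replacing the unknown quantities $E_j$, $V_j$, $p_{1j}$, $p_{2j}$, and $\theta$ with their sample estimates.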
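As a numerical check of the closed-form weights derived in this Appendix, the following minimal Python sketch evaluates $E_j$, $V_j$ and $f_{cj}$ and confirms that the result satisfies both the constraint $\sum_j f_{cj} = 1$ and the linear system $Hf = y$. The function names (`adjusted_moments`, `min_mse_weights`, `system_residuals`), the shorthand `t[j]` for $aE_j - b$, and the three-center inputs are hypothetical choices made for illustration only. The sketch also plugs in the true $p_{1j}$, $p_{2j}$ and $\theta$ rather than sample estimates, so it illustrates the algebra and not the estimation procedure used in the paper.

```python
def adjusted_moments(n1, p1, n2, p2, c):
    """Mean E_j and variance V_j of the adjusted risk difference in one center."""
    e = (n1 * p1 + c) / (n1 + 2 * c) - (n2 * p2 + c) / (n2 + 2 * c)
    v = (n1 * p1 * (1 - p1) / (n1 + 2 * c) ** 2
         + n2 * p2 * (1 - p2) / (n2 + 2 * c) ** 2)
    return e, v


def _abt(E, V):
    """Return a = sum(1/V_j), b = sum(E_j/V_j) and the list t_j = a*E_j - b."""
    a = sum(1.0 / v for v in V)
    b = sum(e / v for e, v in zip(E, V))
    return a, b, [a * e - b for e in E]


def min_mse_weights(E, V, theta):
    """Closed-form minimum MSE weights f_cj for given E_j, V_j and theta."""
    a, _, t = _abt(E, V)
    num = sum(e * (1 + tm * theta) / v for e, tm, v in zip(E, t, V))
    den = a + sum(tm * e / v for e, tm, v in zip(E, t, V))
    ratio = num / den
    return [(1 + tj * theta) / (a * vj) - tj * ratio / (a * vj)
            for tj, vj in zip(t, V)]


def system_residuals(f, E, V, theta):
    """Residuals of a*V_j*f_j + t_j*sum(f_m*E_m) = t_j*theta + 1 for each j."""
    a, _, t = _abt(E, V)
    s = sum(fm * em for fm, em in zip(f, E))
    return [a * vj * fj + tj * s - (tj * theta + 1)
            for fj, vj, tj in zip(f, V, t)]


# Hypothetical three-center example with a common risk difference theta = 0.1.
n1 = [16, 20, 32]
n2 = [16, 24, 32]
p2 = [0.2, 0.3, 0.4]
theta = 0.1
p1 = [p + theta for p in p2]
c = 1

moments = [adjusted_moments(a1, q1, a2, q2, c)
           for a1, q1, a2, q2 in zip(n1, p1, n2, p2)]
E = [m[0] for m in moments]
V = [m[1] for m in moments]

weights = min_mse_weights(E, V, theta)
print("weights:", weights)
print("sum of weights:", sum(weights))
print("max |residual|:", max(abs(r) for r in system_residuals(weights, E, V, theta)))
```

One consequence of the formula that the sketch makes easy to inspect: when all $E_j$ coincide (for instance with $c = 0$, where $E_j = \theta$ in every center), each $aE_j - b$ is zero and the weights reduce to the inverse variance weights $V_j^{-1} / \sum_m V_m^{-1}$.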
