Z.-Q. J. LU ET AL. 267
We may also decide on a threshold value, say 1, and assign
the decision 1, 0, or −1 for “increase”, “indecision”, or
“decrease” if the score on a patient is ≥1, between −1 and
1, or ≤−1, respectively. The contingency table for the two
readers, with rows and columns both in the order −1, 0, 1
(rows separated by semicolons), is: 7, 0, 0; 2, 5, 3; 0, 4, 2.
Pearson’s chi-squared test for independence gives a
statistic of 16.3215 with df = 4 and P-value = 0.0026.
Fisher’s exact test has a P-value of 0.0012 (for two-sided
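The thresholding rule and the chi-squared figures quoted above can be checked with a short script. The following is a minimal sketch, not the authors' code: the function names `threshold_decision` and `pearson_chi2` are hypothetical, and the tail probability uses the closed form for a chi-squared distribution with an even number of degrees of freedom, which applies here because df = 4.

```python
import math


def threshold_decision(score, threshold=1.0):
    """Map a change score to 1 (increase), 0 (indecision) or -1 (decrease)."""
    if score >= threshold:
        return 1
    if score <= -threshold:
        return -1
    return 0


def pearson_chi2(table):
    """Return (statistic, df, p_value) for Pearson's test of independence.

    Assumes every row and column total is positive. The p-value uses the
    closed-form upper tail of the chi-squared distribution, valid for even df.
    """
    row_sums = [sum(row) for row in table]
    col_sums = [sum(col) for col in zip(*table)]
    total = sum(row_sums)

    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_sums[i] * col_sums[j] / total
            stat += (observed - expected) ** 2 / expected

    df = (len(row_sums) - 1) * (len(col_sums) - 1)
    if df % 2 != 0:
        raise ValueError("closed-form tail probability here needs an even df")

    half = stat / 2.0
    p_value = math.exp(-half) * sum(
        half ** k / math.factorial(k) for k in range(df // 2)
    )
    return stat, df, p_value


# Counts quoted in the text, with categories ordered -1, 0, 1.
readers = [[7, 0, 0], [2, 5, 3], [0, 4, 2]]
stat, df, p_value = pearson_chi2(readers)
print(f"chi2 = {stat:.4f}, df = {df}, p = {p_value:.4f}")
```

With the counts from the text, this agrees with the reported statistic of 16.3215 on 4 degrees of freedom and P-value of 0.0026. Several expected cell counts are below 5, so the chi-squared approximation is rough for this table, which is the usual reason to report an exact test alongside it.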
We believe that there is a strong need to study the
reliability and statistical performance of RECIST and of
any other time-sequence tumor size measurement regime,
such as WHO or 3D volume metrics. The statistical methods
suggested in this paper demonstrate the potential of
medical decision making that explicitly takes into account
the uncertainty in the markings by expert radiologists; a
statistical decision rule for change could become available
in the future, based on realistic measurement
quantification along the lines of [6,18]. In addition,
there is a critical need to establish measurement
uncertainty, for example by accommodating the effects of
protocols and instrument settings. A statistics-based
decision rule can readily incorporate the different
components of uncertainty in therapy response decision
making. There is also a need to study biological
variability and the algorithmic factors of
computer-assisted measurements for other size measures,
such as the volume metric, which is mainly useful for
thin-slice CT scans (1.0 mm or less).
Partly because of the observed measurement bias in
absolute nodule size measurements, alternative procedures
have been investigated for direct change measurements
(e.g. [19,20]). However, we caution readers that the
latter approach raises additional issues with the
uncertainty in the change measurements themselves, and
open questions remain about how to assess measurement
uncertainty in change-measurement data, for example for
small nodules. Although there have been many developments
with RECIST, this important topic has received little
attention in the statistical literature (an exception is
). We believe there are ample opportunities for
statisticians to be engaged in this important medical image
decision analysis concerned with assessing therapeutic
The first two authors would like to thank our colleague
Qiming Wang for her work in analyzing and accessing the
DICOM images used in this paper, and our colleague Alden
Dima, who made the DICOM image database server available
to us.
[1] E. A. Eisenhauer, P. Therasse, J. Bogaerts, L. H. Schwartz,
D. Sargent, R. Ford, J. Dancey, S. Arbuck, S. Gwyther, M.
Mooney, L. Rubinstein, L. Shankar, L. Dodd, R. Kaplan,
D. Lacombe and J. Verweij, “New Response Evaluation
Criteria in Solid Tumours: Revised RECIST Guideline
(Version 1.1),” European Journal of Cancer, Vol. 45, No.
2, 2009, pp. 228-247. doi:10.1016/j.ejca.2008.10.026
[2] C. C. Jaffe, “Measures of Response: RECIST, WHO, and
New Alternatives,” Journal of Clinical Oncology, Vol. 24,
No. 20, 2006, pp. 3245-3251.
[3] H. Robbins, “Estimating Many Variances,” In: S. S. Gupta,
Ed., Statistical Decision Theory and Related Topics III,
Vol. 2, Academic Press, New York, 1982, pp. 251-261.
[4] H. Robbins, “Some Thoughts on Empirical Bayes
Estimation,” Annals of Statistics, Vol. 11, No. 3, 1983, pp.
[5] L. H. Schwartz, M. Mazumdar, W. Brown, A. Smith and
D. M. Panicek, “Variability in Response Assessment in
Solid Tumors: Effect of Number of Lesions Chosen for
Measurement,” Clinical Cancer Research, Vol. 9, No. 12,
2003, pp. 4318-4323.
[6] Z. Q. J. Lu, N. Petrick, C. Fenimore, D. Clunie, K.
Borradaile, R. Ford, M. F. McNitt-Gray, H. J. G. Kim, R.
Zeng, M. A. Gavrielides, B. Zhao and A. J. Buckler,
“Statistical Analysis of Reader Measurement Variability in
Nodule Sizing with CT Phantom Imaging Data,” NIST
Interagency Report, 2012.
[7] J. J. Erasmus, G. W. Gladish, L. Broemeling, B. S. Sabloff,
M. T. Truong, R. S. Herbst and R. F. Munden,
“Interobserver and Intraobserver Variability in Measurement
of Non-Small-Cell Carcinoma Lung Lesions: Implications
for Assessment of Tumor Response,” Journal of Clinical
Oncology, Vol. 21, No. 13, 2003, pp. 2574-2582.
[8] L. E. Dodd, R. F. Wagner, S. G. Armato III, M. F.
McNitt-Gray, S. Beiden, H.-P. Chan, D. Gur, G. McLennan,
C. E. Metz, N. Petrick, B. Sahiner and J. Sayre, “Assessment
Methodologies and Statistical Issues for Computer-Aided
Diagnosis of Lung Nodules in Computed Tomography:
Contemporary Research Topics Relevant to the Lung
Image Database Consortium,” Academic Radiology, Vol.
11, No. 4, 2004, pp. 462-475.
[9] C. R. Meyer, T. D. Johnson, G. McLennan, D. R. Aberle,
E. A. Kazerooni, H. MacMahon, B. F. Mullan, D. F.
Yankelevitz, E. J. R. van Beek, S. G. Armato III, M. F.
McNitt-Gray, A. P. Reeves, D. Gur, C. I. Henschke, E. A.
Hoffman, R. H. Bland, G. Laderach, R. Pais, D. Qing, C.
Piker, J. Guo, A. Starkey, D. Max, B. Y. Croft and L. P.
Clarke, “Evaluation of Lung MDCT Nodule Annotation
Across Radiologists and Methods,” Academic Radiology,
Vol. 13, No. 10, 2006, pp. 1254-1265.
[10] RIDER: Reference Image Database to Evaluate Response,
Copyright © 2012 SciRes. OJS