Open Journal of Applied Sciences, 2013, 3, 32-36
doi:10.4236/ojapps.2013.31B1007 Published Online April 2013 (http://www.scirp.org/journal/ojapps)
XLR: A Free Excel Add-In for Introductory Business
Statistics
Pin T. Ng
W. A. Franke College of Business, Northern Arizona University, Flagstaff, USA
Email: Pin.Ng@nau.edu
Received 2013
ABSTRACT
XLR is an Excel add-in that unifies the user friendly, widely popular interface of Excel with the powerful and robust
computational capability of the GNU statistical and graphical language R. The add- in attempts to address the American
Statistical Association’s comment that “Generic packages such as Excel are not sufficient even for the teaching of sta-
tistics, let alone for research and consulting.” R is the program of choice for researchers in statistical methodology that
is freely available under the Free Software Foundation’s GNU General Public License (GPL) Agreement. By wedding
the interactive mode of Excel with the power of statistical computing of R, XLR provides a solution to the problem of
numerical inaccuracy of using Excel and its various internal statistical functions and procedures by harnessing the
computational power of R. XLR will be distributed under the GNU GPL Agreement. The GPL puts students, instructors
and researchers in control of their usage of the software by providing them with the freedom to run, copy, distribute,
study, change and improve the software, thus, freeing them from the bondage of proprietary software. The creation of
XLR will not only have a significant impact on the teaching of an Introductory Business Statistics course b y providing a
free alternative to the commercial proprietary software but also provide researchers in all disciplines who require so-
phisticated and cuttin g edge statistical and graphical procedures with a user-friendly in teractive data analysis tool when
the current set of available commands is expanded to include more advance procedures.
Keywords: Excel Add-ins; Statistical Computing; Teaching Business Statistics; R
1. Introduction
In the last decades, there has been a trend among busi-
ness schools in the U.S. to shift the focu s of an Introduc-
tory Business Statistics course from the traditional ap-
proach of teaching statistics via formulae to an interpre-
tive approach which emphasizes interpretations of statis-
tical output obtained with the help of some statistical
software. For example, see [1-3]. This encourages and
enables students to concentrate on making more sense
out of the statistical results instead of exerting most of
their energy on the mechanics of calculating the various
statistics. A survey of existing textbooks in the market
reveals that a majority of them incorporate detailed in-
structions on Excel and its add-ins. Only a small portion
of the remaining textbooks utilize other statistical soft-
ware such as Minitab, SPSS or SAS to perform the sta-
tistical computations.
Excel and its add-ins remains the most popular route
among instructors because students in business schools
usually are expected to have a certain level of exposure
and competency in Excel by the time they enroll in an
Introductory Business Statistics course. So it makes
sense to teach the course using Excel/add-ins rather than
other statistical software to improve students’ learning
curve. However, in endorsement of the Mathematical
Association of America Guidelines for Programs and
Departments in Undergraduate Mathematical Sciences
[4], the American Statistical Association commented that
“Generic packages such as Excel are not sufficient even
for the teaching of statistics, let alone for research and
consulting.” Numerou s studies have highlighted the defi-
ciencies and dangers of using Excel as a statistical pack-
age for teaching and research. Reference [5-9] performs
extensive studies to reveal that several statistical algo-
rithms used in Excel yield erroneous results. Reference
[10] demonstrates the danger of using Excel to perform
regression analysis that involves correlated independent
variables. Reference [11,12] discuss additional problems
in using Excel while [13] recommends researchers against
using Excel for any scientific purpose.
As a result, there have been quite a few third-party
Excel add-ins written to address and attempt to solve the
problems of using Excel and its add-ins in the Microsoft
Data Analysis Toolpak. Some examples are Analyse-it®,
Data Analysis Plus, Fast Statistics©, Lumenaut©, N-
SEA©, PHStat®, PopTools, SigmaXL®, statistiXL©,
Copyright © 2013 SciRes. OJAppS
P. T. NG 33
StatTools®, UNISTAT®, and XLSTAT©. With the ex-
ception of PopTools, which is written specifically to
analyze ecological models, these add-ins are commercial
products that have an annual single user license fee that
ranges from $40.00 for statistiXL© to $300.00 for UNI-
STAT®. In light of the ever rising textbook prices and
costs of attending colleges/universities, it will be valu-
able to the students, instructors and researchers to have
the freedom of using an Excel add-in that utilizes the
familiar interface of Excel, and offers a reliable and an
extended range of statistical procedures without having
to be burdened with the licensing cost.
XLR intends to serve this purpose by being a free
software version of an Excel add-in that includes a fuller
range of statistical procedures that are commonly cov-
ered in a typical Introductory Business Statistics course
and free the users from the constraints that come with
proprietary software. It attempts to achieve this by har-
nessing the power of th e popular free sof twar e R [14 ] -- a
language and environment for statistical computing and
graphics.
2. Design Approach and Philosophy
Reference [15] provides a utility, R-Excel interface, to
connect Excel to R. The R-Excel interface uses DCOM
to embed R into Excel. Component object model (COM)
is a technology used on Microsoft Windows platforms
for server applications (e.g., R) to expo se functionality o f
a component (set of objects, e.g., R functions) to client
applications (e.g., Excel). DCOM is the distributed ver-
sion of COM that makes COM objects (e.g., a set of R
functions) transparently available across a network of
computers.
Specifically, R-Excel provides three interface modes:
(1) scratchpad and data transfer mode, (2) macro mode,
and (3) spreadsheet mode. The first scatchpad and data
transfer mode is designed for power users who have a
reasonable amount of knowledge in R. We did not adopt
this interface mode in XLR. We want to shield the users
from all the details of R computations and coding so that
the users think that they are working in Excel not R. The
macro mode allows us to put any statistical procedures
written in R on an Excel menu or button via VBA for
Excel. This is the pre-dominant interface design in XLR.
The spreadsheet mode, which retains the automatic re-
calculation feature in Excel, is adop ted to rewrite a list of
commonly used Excel functions by making functional
calls to the corresponding R functions and, hence, enable
us to solve the notorious problems of incorrect handling
of delicate numerical problems in Excel as reported in
[5,7-10].
R has an extensive collection of statistical (linear and
nonlinear modeling, classical statistical tests, time-series
analysis, classification, clustering, etc.) and graphical
(box and whisker plots, pie charts, pairs plot, coplot, 3D
plot, forest plot, etc.) procedures and is highly extensible
through the add-on packages, which number more than
4,000 to date. This makes it perfectly suitable to act as
the computational engine for Excel in the COM para-
digm so that we do not have to be concerned about rein-
venting the wheels of statistical computing procedures.
In our XLR implementation, Excel (client application)
utilizes R (server application) as a computational com-
ponent object. The users are not expected to know any
programming in R. Our goal is that they should not even
be aware that R is involved at all. The distributed version
of COM will be explored in a future proj ect.
Whenever possible, we adhere to the feel and look of
the input dialogue box and output format of the various
tools in Excel’s Analysis ToolPak. This will help make
users feel more at home in their familiar Excel comput-
ing environment. Figure 1 and Figure 2 show the dia-
logue box and output for the Simple Linear Regression in
XLR. Users can see that they look very much like the
dialogue box and output of the Regression tool in EX-
CEL’s Analysis ToolPak.
Figure 1. Dialogue box for the Simple Linear Regression
command in XLR.
Copyright © 2013 SciRes. OJAppS
P. T. NG
Copyright © 2013 SciRes. OJAppS
34
Figure 2. The output from the Simple Linear Regression command in XLR.
3. Features
3.1. Commands
The commands in XLR are grouped into 10 different
group of functionality as shown in Figure 4. Figure 5
shows the menu tree of XLR commands.
Figure 3. Scatter plot and fitted values generated by the
Simple Linear Regression command in XLR using R’s
graphic capability.
Excel’s Chart Tools are used in generating graphs so
that users can continue to use the Chart Tools to modify
their graphs. We also provide an option for the users to
generate the higher quality R graphs that can be saved in
the various graphic forms as shown in Figure 3. Figure 4. The 10 different groups of XLR commands.
P. T. NG 35
Figure 5. The menu tree of XLR commands.
3.2. Other Functions
A group of functions like RAVERAGE, RSTDEV,
RNORMINV, RNORMDIST, RCHIINV, RCHIDIST,
etc. are meant to replace the native Excel AVERAGE,
STDEV, NORMINV, NORMDIST, CHIINV, CHIDIST
commands. There is also a group of statistical functions
like pnorm, qnorm, dnorm to compute the probability,
quantile and density of a normal distribution. There are
groups for the other typical distributions like the Stu-
dent’s t, F, Chi-square, etc. as well.
4. Future Extensions
Distributed under the GNU GPL Agreement, anyone
who is interested can extend and modify the command
set and functions in XLR to include more advanced, cur-
rent and cutting edge procedures. Hence, potentially
XLR will not only have a significant impact on the
teaching of an Introductory Statistics course by providing
a free alternative to the commercial proprietary software
but also provide researchers in all disciplines who requ ire
sophisticated and cutting edge statistical and graphical
procedures with a user-friendly interactive data analysis
tool.
5. Acknowledgements
This project is partially funded by the Intramural Grant
Program at the Northern Arizona University.
REFERENCES
[1] P. C. Bell, “Teaching Business Statistics with Microsoft
Excel,” INFORMS Transactions on Education, Vol. 1, No.
1, 2000, pp. 18-26.
doi10.1287/ited.1.1.18
[2] J. Garfield, B. Hogg, C. Schau and D. Whittinghill, “First
Courses in Statistical Science: the Status of Educational
Reform Efforts,” Journal of Statistics Education, Vol. 10,
No. 2, 2002.
http://www.amstat.org/publications/jse/v10n2/garfield.ht
ml
[3] T. E. Love and D. K. Hildenbrand, “Statistics Education
and the Making Statistics More Effective in Schools of
Business Conferences,” The American Statistician, Vol.
56, No. 2, 2002, pp. 107-112.
doi10.1198/000313002317572772
[4] ASA, “Guidelines for Programs and Departments in Un-
dergraduate Mathematical Sciences,” American Statistical
Association (ASA) Endorsement of the Mathematical As-
sociation of America (MAA), 2000.
http://www07.homepage.villanova.edu/michael.posner/si
gmaastated/ASAendorsement2.html
[5] L. Knusel, “On the Accuracy of Statistical Distributions
in Microsoft Excel 97,” Computational Statistics and
Data Analysis, Vol. 26, No. 3, 1998, pp. 375-377.
doi:10.1016/S0167-9473(97)81756-2
[6] B. D. McCullough and D. A. Heiser, “On the accura cy of
statistical procedures in Microsoft Excel 2007,” Compu-
tational Statistics & Data Analysis, Vol. 52, No. 10, 2008,
pp. 4570-4578. doi:10.1016/j.csda.2008.03.004
[7] B. D. McCullough and B. Wilson, “On the Accuracy of
Statistical Procedures in Microsoft Excel 97,” Computa-
tional Statistics & Data Analysis, Vol. 31, No. 3, 1999,
pp. 27-37.
doi:10.1016/S0167-9473(99)00004-3
[8] B. D. McCullough and B. Wilson, “On the Accuracy of
Statistical Procedures in Microsoft Excel 2000 and Excel
Copyright © 2013 SciRes. OJAppS
P. T. NG
Copyright © 2013 SciRes. OJAppS
36
XP,” Computational Statistics & Data Analysis, Vol. 40,
No. 4, 2002, pp. 713-721.
doi:10.1016/S0167-9473(02)00095-6
[9] B. D. McCullough and B. Wilson, “On the accuracy of
statistical procedures in Microsoft Excel 2003,” Compu-
tational Statistics & Data Analysis, Vol. 49, No. 4, 2005,
pp. 1244-1252. doi:10.1016/j.csda.2004.06.016
[10] J. Simonoff, “Statistical Analysis Using Microsoft Ex-
cel,”2005.
http://pages.stern.nyu.edu/~jsimonof/classes/1305/pdf/exc
elreg.pdf
[11] N. Cox, “Use of Excel for Statistical Analysis,”2000.
http://users.df.uba.ar/sgil/tutoriales1/fisica_tutoriales/data
_analisis/Statistical_analysis.pdf
[12] J. D. Cryer, “Problems with Using Microsoft Excel for
Statistics,” the Joint Statistical Meetings, American Sta-
tistical As s ociation, 2001.
http://www.stat.uiowa.edu/~jcryer/JSMTalk2001.pdf
[13] A. T. Yalta, “The Accuracy of Statistical Distributions in
Microsoft Excel 2007,” Computational Statistics and
Data Analysis, Vol. 52, No. 10. 2008, pp. 4579-4586.
doi:10.1016/j.csda.2008.03.005
[14] R. Core Team, “R: A Language and Environment for
Statistical Computing,” R Foundation for Statistical Com-
puting, Vienna, Austria, 2012. http://www.R-project.org
[15] T. Baier and E. Neuwirth, “Excel::COM::R,” Computa-
tional Statistics, Vol. 22, No. 1, 2007, pp. 91-108.
doi:10.1007/s00180-007-0023-6