Vol.5 No.11(2013), Article ID:40063,5 pages DOI:10.4236/health.2013.511254

Validation of general linear modeling for identifying factors associated with Quality of Life: A comparison with structural equation modeling*

Naoko Kumagai1,2#, Motonori Hatta3, Yashiyasu Okuhara2, Hideki Origasa4

1Integrated Center for Advanced Medical Technologies, Kochi Medical School, Kochi, Japan; #Corresponding Author:

2Center of Medical Information Science, Kochi Medical School, Kochi, Japan;

3Data Management, Actelion Pharmaceuticals Japan Ltd., Tokyo, Japan;

4Division of Biostatistics and Clinical Epidemiology, School of Medicine, University of Toyama, Toyama, Japan;

Copyright © 2013 Naoko Kumagai et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Received 3 September 2013; revised 8 October 2013; accepted 21 October 2013

Keywords: General Linear Modeling; Latent Variable; Standardized Path Coefficient; Standard Partial Regression Coefficient; Structural Equation Modeling


Purpose: General linear modeling (GLM) is usually applied to investigate factors associated with the domains of Quality of Life (QOL). A summation score for a specific sub-domain is regressed on the factors thought to be associated with that sub-domain. However, using the summation score ignores the influence of individual questions. Structural equation modeling (SEM) can account for the influence of each question’s score by constructing a latent variable from the questions of a sub-domain. The objective of this study is to determine whether a conventional approach such as GLM, with its use of the summation score, is valid from the standpoint of the SEM approach. Method: We administered the Japanese version of the Maugeri Foundation Respiratory Failure Questionnaire, a QOL measure, to 94 patients with heart failure. The daily activity sub-domain of the questionnaire was selected together with four associated factors, namely, living together, occupation, gender, and the New York Heart Association’s cardiac function scale (NYHA). The association between each factor and the daily activity sub-domain was estimated using both SEM and GLM. The standard partial regression coefficients of GLM and the standardized path coefficients of SEM were compared. If these coefficients were similar (absolute value of the difference <0.05), we concluded that GLM was as valid as the SEM approach. Results: The estimates for living together were −0.06 and −0.07 for GLM and SEM, respectively. Likewise, the estimates for occupation, gender, and NYHA were −0.18 and −0.20, −0.08 and −0.08, and 0.51 and 0.54, respectively. The absolute values of the differences for the four factors were 0.01, 0.02, 0.00, and 0.03, respectively; all were less than 0.05. This means that the two approaches lead to similar conclusions. Conclusion: GLM is a valid method for exploring factors associated with a domain of QOL.


In medical treatment, QOL has been defined as a personal sense of well-being and as a multidimensional construct that generally includes physical, psychological, social, and spiritual dimensions or domains [1]. A distinctive feature of QOL research is that the focus is typically on broad questions [2]. These questions use several types of scale: binary scales, with “yes or no” questions; graded scales, with options such as “very bad,” “bad,” “average,” “good,” and “very good”; and continuous scales such as the Visual Analogue Scale (VAS).

For a variety of QOL questionnaires, general linear models, such as analysis of variance, are typically used to identify factors that are associated with a certain domain of QOL. Examples include research identifying QOL domains and related factors among HIV-positive individuals, and a study of the association between asymptomatic vertebral fractures and quality of life [3,4]. Because general linear modeling (GLM) cannot be used with multiple response variables, it relies on the summation score obtained by adding the scores on each question in a given sub-domain. However, using the summation score ignores the influence of individual questions. In contrast, structural equation modeling (SEM) can deal with multiple responses and accounts for the influence of each question’s score by constructing a latent variable from the questions of a domain. The objective of this study is to determine the validity of the conventional approach, which combines the summation score with GLM, as compared to the SEM approach.


2.1. Materials and Subjects

2.1.1. Materials

The Japanese version of the Maugeri Foundation Respiratory Failure (MRF-28) Questionnaire is a 28-item, disease-specific, health-related QOL questionnaire for patients with chronic respiratory failure due to pulmonary diseases. The questionnaire is self-administered and easy to complete, with all items requiring either a “yes” or “no” answer [5]. It consists of four domains, namely, daily activity, cognitive function, invalidity, “other,” and two general questions about the patient’s health status [5].

2.1.2. Subjects

The sample included in-patients and out-patients with symptomatic and previous, asymptomatic heart failure at the University of Toyama Hospital in Japan. Participants were recruited between December 2005 and November 2006. The study was approved by the Ethics Committee at the University of Toyama, and all participants provided written informed consent to take part [5]. We used this database, which comprised a total of 94 enrolled subjects.

2.2. Independent Variables and Response Variables

For this study, we used one of the four domains of the MRF-28 questionnaire as the response variable, namely, the daily activity domain (see Table 1). In addition, we used four factors as independent variables, namely, living together (cohabitation status), occupation, gender, and the New York Heart Association’s cardiac function scale (NYHA). The associations between the daily activity domain and the four factors were estimated using GLM and SEM. The daily activity domain consists of 11 questions that require a “yes” or “no” answer. “Yes” was assigned a score of 1, while “no” was assigned a score of 0. More “yes” answers indicated a greater burden from daily activity. A summation score was obtained by adding the scores on all 11 questions. With regard to living together, individuals living with someone were assigned a score of 1, while those living alone were assigned 0. Currently employed individuals were assigned a score of 1, while the unemployed were assigned 0. Males were assigned a score of 1, while females were assigned a score of 0. NYHA class was divided into two groups: Class 2 was assigned a score of 0, while Classes 3 and 4 were each assigned a score of 1. These variables are shown in Table 2 as patient characteristics.
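The coding scheme described above can be sketched in a few lines of Python. This is only an illustration: the function name, the dictionary keys, and the example patient are hypothetical, and only the 0/1 coding rules are taken from the text.

```python
# Hypothetical helper; only the 0/1 coding rules come from the text.
N_ITEMS = 11  # number of yes/no questions in the daily activity domain

def code_patient(answers, lives_with_someone, employed, male, nyha_class):
    """Return the summation score and the four binary factors for one patient."""
    if len(answers) != N_ITEMS:
        raise ValueError("expected 11 yes/no answers")
    item_scores = [1 if a == "yes" else 0 for a in answers]  # yes -> 1, no -> 0
    return {
        "summation": sum(item_scores),           # higher = greater burden
        "living_together": int(lives_with_someone),
        "occupation": int(employed),
        "gender": int(male),                     # male -> 1, female -> 0
        "nyha": 1 if nyha_class >= 3 else 0,     # Class 2 -> 0, Classes 3 and 4 -> 1
    }

# Made-up patient: 4 "yes" answers, cohabiting, unemployed, male, NYHA Class 3.
example = code_patient(["yes"] * 4 + ["no"] * 7, True, False, True, 3)
print(example)
```

The summation score is what GLM uses as its single response, whereas SEM works with the 11 item scores individually.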

Table 1. The 11 items in daily activity domain—Maugeri Foundation Respiratory Failure Questionnaire (MRF-28).

2.3. Statistical Analysis

2.3.1. GLM

GLM is a special case of SEM and can be expressed as shown in Figure 1. The summation score was regressed on four factors, namely, living together, occupation, gender, and NYHA. Standard partial regression coefficients were estimated.
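As a rough illustration of what a standard partial regression coefficient is, the sketch below fits ordinary least squares after z-scoring the response and every predictor. It uses NumPy rather than the SAS procedures used in the study, and the data are synthetic (randomly generated), not the study data; the function name is an assumption made for this example.

```python
import numpy as np

def standardized_coefficients(X, y):
    """OLS on z-scored predictors and response.

    The fitted slopes are the standard partial regression coefficients.
    No intercept is needed because all z-scored variables have mean zero.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    Xz = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
    yz = (y - y.mean()) / y.std(ddof=1)
    beta, *_ = np.linalg.lstsq(Xz, yz, rcond=None)
    return beta

# Synthetic data only: 94 rows, four binary factors, an arbitrary linear signal.
rng = np.random.default_rng(0)
X = rng.integers(0, 2, size=(94, 4))
y = 3 + X @ np.array([-0.5, -1.0, -0.4, 3.0]) + rng.normal(0, 2, size=94)
betas = standardized_coefficients(X, y)
print(np.round(betas, 2))
```

Equivalently, each standardized coefficient is the raw slope multiplied by the ratio of the predictor's standard deviation to the response's standard deviation, which is why the coefficients of differently scaled factors become comparable.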

2.3.2. SEM

Table 2. Patient characteristics.

Figure 1. The association between summation score of daily activity and the four factors.

Figure 2. The association between daily activity and the four factors.

We plotted a path from the latent variable to each question and made a latent variable of daily activity (see Figure 2); the association with a latent variable was estimated on the basis of Kendall’s correlation. The goodness-of-fit of the SEM was evaluated using the Standardized Root Mean Square Residual (SRMR) (good models <0.08), Goodness of Fit Index (GFI) (good models >0.95), and Normed Fit Index (NFI) (good models >0.90) [6].
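The three fit cut-offs quoted above can be written as a small check. The function name is hypothetical and the index values in the usage lines are made up purely for illustration; only the thresholds come from the text.

```python
# Cut-offs quoted in the text: SRMR < 0.08, GFI > 0.95, NFI > 0.90.
def fit_is_acceptable(srmr, gfi, nfi):
    """Return (overall verdict, per-index verdicts) for the three fit indices."""
    checks = {
        "SRMR": srmr < 0.08,  # smaller is better
        "GFI": gfi > 0.95,    # larger is better
        "NFI": nfi > 0.90,    # larger is better
    }
    return all(checks.values()), checks

# Illustrative values only, not results from the study.
ok, detail = fit_is_acceptable(srmr=0.05, gfi=0.97, nfi=0.92)
bad, bad_detail = fit_is_acceptable(srmr=0.10, gfi=0.97, nfi=0.92)
print(ok, detail)
print(bad, bad_detail)
```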

Then, we examined the difference between the standardized path coefficient from each factor to the latent variable in SEM and the corresponding standard partial regression coefficient of GLM. If the absolute value of the difference was small, that is, less than 0.05 on the 0 to 1 scale of the coefficients, we considered GLM to be as suitable as SEM. All analyses were performed with the Statistical Analysis System (SAS Institute, Cary, NC, USA).


The structural equation model presented in Figure 2 showed acceptable fit (SRMR = 0.078; GFI = 0.96; NFI = 0.93). The standard partial regression coefficient of GLM and the standardized path coefficient of SEM for each factor were, respectively, as follows: living together, −0.06 and −0.07; occupation, −0.18 and −0.20; gender, −0.08 and −0.08; and NYHA class, 0.51 and 0.54 (Table 3). The absolute values of the differences were 0.01, 0.02, 0.00, and 0.03, respectively; all were less than 0.05. Both approaches gave similar estimates, and the signs of the coefficients were the same. As NYHA class increased, indicating more severe cardiac dysfunction, so did the burden of daily activities. Further, unemployed individuals experienced more of this burden than those who were employed. This is most likely because people who are not in employment often have disabilities that, to some extent, interfere with daily activities. People without cohabitants felt more burdened than those who were cohabiting, probably due to a lack of assistance. In terms of gender, women tended to feel more burdened than men.
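The comparison rule can be checked directly against the coefficients reported above. The short Python sketch below recomputes the absolute differences and the sign agreement; the variable and function names are chosen for this example only.

```python
# Coefficients as reported in the text (GLM: standard partial regression
# coefficients; SEM: standardized path coefficients).
FACTORS = ["living together", "occupation", "gender", "NYHA"]
GLM = [-0.06, -0.18, -0.08, 0.51]
SEM = [-0.07, -0.20, -0.08, 0.54]
THRESHOLD = 0.05  # similarity criterion stated in the statistical analysis

def compare(glm, sem, threshold=THRESHOLD):
    """Return (absolute difference, below threshold?, same sign?) per factor."""
    rows = []
    for g, s in zip(glm, sem):
        diff = round(abs(g - s), 2)  # round away floating-point noise
        rows.append((diff, diff < threshold, (g < 0) == (s < 0)))
    return rows

for name, (diff, similar, same_sign) in zip(FACTORS, compare(GLM, SEM)):
    print(f"{name}: |difference| = {diff:.2f}, similar = {similar}, same sign = {same_sign}")
```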


In this study, we used real quantitative data to assess whether GLM, as compared to SEM, is appropriate for identifying factors associated with QOL domains. The associations between the factors and the daily activity domain were similar for both modeling approaches.

Table 3. Comparison between the general linear model (GLM) and structural equation model (SEM).

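The general linear model was expressed as Equation (1):

$$Y_{\text{summation}} = \beta_1 X_{1,\text{living together}} + \beta_2 X_{2,\text{occupation}} + \beta_3 X_{3,\text{gender}} + \beta_4 X_{4,\text{NYHA}} + e \tag{1}$$

The structural equation model was given as Equation (2):

$$
\begin{aligned}
Y_{\text{MRF}1} &= \gamma_1 F + e_1 \\
Y_{\text{MRF}2} &= \gamma_2 F + e_2 \\
Y_{\text{MRF}3} &= \gamma_3 F + e_3 \\
Y_{\text{MRF}4} &= \gamma_4 F + e_4 \\
Y_{\text{MRF}5} &= \gamma_5 F + e_5 \\
Y_{\text{MRF}6} &= \gamma_6 F + e_6 \\
Y_{\text{MRF}7} &= \gamma_7 F + e_7 \\
Y_{\text{MRF}8} &= \gamma_8 F + e_8 \\
Y_{\text{MRF}9} &= \gamma_9 F + e_9 \\
Y_{\text{MRF}10} &= \gamma_{10} F + e_{10} \\
Y_{\text{MRF}11} &= \gamma_{11} F + e_{11} \\
F &= \alpha_1 X_{1,\text{living together}} + \alpha_2 X_{2,\text{occupation}} + \alpha_3 X_{3,\text{gender}} + \alpha_4 X_{4,\text{NYHA}} + d
\end{aligned}
\tag{2}
$$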
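One way to see how Equations (1) and (2) relate is to sum the eleven item equations. The following is a sketch, not a derivation given by the authors, and it rests on added assumptions: the summation score is exactly the sum of the items, the item errors e_j are mutually uncorrelated and uncorrelated with F and with the factors, and asterisks denote standardized coefficients.

```latex
% Summing the item equations in (2):
Y_{\text{summation}} = \sum_{j=1}^{11} Y_{\text{MRF}j}
  = \Big(\sum_{j=1}^{11} \gamma_j\Big) F + \sum_{j=1}^{11} e_j .

% Substituting the equation for F and comparing with (1):
\beta_k = \Big(\sum_{j=1}^{11} \gamma_j\Big)\, \alpha_k , \qquad k = 1, \dots, 4 .

% Standardizing, with
% \beta_k^{*} = \beta_k \,\mathrm{sd}(X_k) / \mathrm{sd}(Y_{\text{summation}})
% and \alpha_k^{*} = \alpha_k \,\mathrm{sd}(X_k) / \mathrm{sd}(F):
\frac{\beta_k^{*}}{\alpha_k^{*}}
  = \frac{\big(\sum_j \gamma_j\big)\, \mathrm{sd}(F)}{\mathrm{sd}(Y_{\text{summation}})}
  = \sqrt{\frac{\big(\sum_j \gamma_j\big)^{2}\, \mathrm{Var}(F)}
               {\big(\sum_j \gamma_j\big)^{2}\, \mathrm{Var}(F) + \sum_j \mathrm{Var}(e_j)}} .
```

Under these assumptions the ratio is the same for every factor and lies between 0 and 1, so the standardized GLM coefficients are slightly attenuated versions of the SEM path coefficients, with the same signs. The ratio approaches 1 as the item error variances shrink relative to the variance explained by F, that is, as the sub-domain becomes more internally consistent.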

Any difference between the results of the two models is due to the use of the summation score versus the latent variable F, which is defined through the correlation of each question with F and its error term. Therefore, we assume that if each question of a given domain is strongly correlated with the domain, and the association between the factors and the domain is homogeneous, then GLM and SEM will yield similar estimates. Cronbach’s alpha for the daily activity questions was 0.9, which is considered high; the questions of the sub-domain were closely related as a group. For such a well-constructed QOL sub-domain, the associations with the factors were estimated similarly by the GLM and SEM approaches.
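For readers unfamiliar with Cronbach’s alpha, the sketch below computes it from a respondents-by-items score matrix using the usual variance formula. The function name and the simulated answers are assumptions for illustration; no study data are used.

```python
import numpy as np

def cronbach_alpha(items):
    """Cronbach's alpha for a (respondents x items) matrix of item scores.

    alpha = k / (k - 1) * (1 - sum of item variances / variance of total score)
    """
    items = np.asarray(items, dtype=float)
    k = items.shape[1]
    item_variances = items.var(axis=0, ddof=1).sum()
    total_variance = items.sum(axis=1).var(ddof=1)
    return k / (k - 1) * (1 - item_variances / total_variance)

# Simulated 0/1 answers: 94 respondents, 11 items driven by one shared trait.
rng = np.random.default_rng(1)
trait = rng.normal(size=(94, 1))
answers = (trait + rng.normal(scale=0.8, size=(94, 11)) > 0).astype(int)
print(round(cronbach_alpha(answers), 2))
```

Alpha rises toward 1 when the items move together, which is the situation in which the summation score and the latent variable carry nearly the same information.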

One limitation concerns sample size: SEM requires at least 100 cases, and 200 are preferable. Our study had 94 cases, which is fewer than the required number. However, goodness-of-fit was appropriate, which suggests that the small sample size may not have had a major influence [7]. Another limitation is that a simulation study examining various values of Cronbach’s alpha may be needed to confirm our conclusions. Further studies are therefore needed.


The QOL sub-domain is generally constructed with a high Cronbach’s alpha (>0.9) [8].

Although a high Cronbach’s alpha may not be directly related to the validity of GLM, we assume that for well-constructed sub-domains GLM and SEM both yield suitable results.


The authors would like to thank all the patients who participated so willingly in the study.


  1. Ferrell, B.R. and Hassey, D.K. (1997) Quality of life among long-term cancer survivors. Oncology, 11, 565-571.
  2. Fayers, P.M. and Machin, D. (2005) Quality of life: The assessment, analysis and interpretation of patient-reported outcomes, Japanese version. Nakayama Shoten Co., Ltd., Tokyo.
  3. Shan, D., Ge, Z., Ming, S., Wang, L., Sante, M., et al. (2011) Quality of life and related factors among HIV-positive spouses from serodiscordant couples under antiretroviral therapy in Henan Province, China. PLoS One, 6, e21839.
  4. Lopes, J.B., Fung, L.K., Cha, C.C., Gabriel, G.M., Takayama, L., Figueiredo, C.P. and Pereira, R.M. (2012) The impact of asymptomatic vertebral fractures on quality of life in older community-dwelling women: The São Paulo Ageing & Health Study. Clinics (Sao Paulo), 67, 1401-1406.
  5. Hatta, M., Joho, S., Inoue, H. and Origasa, H. (2009) A health-related quality of life questionnaire in symptomatic patients with heart failure: Validity and reliability of a Japanese version of the MRF28. Journal of Cardiology, 53, 117-126.
  6. Hu, L. and Bentler, P.M. (1999) Cutoff criteria for fit indexes in covariance structure analysis: Conventional criteria versus new alternatives. Structural Equation Modeling, 6, 1-55.
  7. Hoelter, D.R. (1983) The analysis of covariance structures: Goodness-of-fit indices. Sociological Methods and Research, 11, 325-344.
  8. Altman, D.G. and Bland, J.M. (1997) Cronbach’s alpha. British Medical Journal, 314, 572.


*Competing interests: The authors declare that they have no competing interests.

Authors’ contributions: All authors have contributed substantially to the analysis of the data and preparation of the manuscript. All authors also read and approved the final manuscript.