Background: Kidney (renal) diseases and dialysis are among the most costly disorders and represent a worldwide burden. In this study, we evaluate the medical costs for individuals with kidney diseases and the risk factors for these diseases in Japan. Data and Methods: The dataset contained 113,979 medical checkups and 3,172,066 medical cost records obtained from 48,022 individuals in one health insurance society. The sample period was April 2013 to March 2016. We evaluated the distribution of all medical costs, and of those for kidney diseases specifically. The power transformation Tobit model was then used to remove the effects of other variables. Finally, a probit analysis was used to analyze the risk factors. Results: Individuals were diagnosed with kidney diseases in 0.25% of all cases. The average medical cost of an individual with kidney disease was 14.5 times that of an individual without kidney disease. If the disease progressed into chronic kidney disease (CKD), the medical costs increased substantially. This conclusion held even after controlling for various characteristics of individuals. Important risk factors included diabetes and blood pressure problems. In particular, an individual with both factors had a high probability of developing kidney disease. Conclusion: Kidney diseases are much costlier than other diseases. Screening high-risk individuals, educating patients, and ensuring that treatment begins at an early stage are critically important to controlling medical costs. Limitations: The dataset was observational, and the sample period was only 3 years.
Kidney (renal) diseases and dialysis are among the most costly of diseases, and are a worldwide burden [
The Centers for Disease Control and Prevention [
Albertus et al. [
Reducing new cases of CKD and its complications, disability, death, and economic costs has therefore become an important goal [
Various studies have examined the effectiveness and costs of kidney disease treatments, such as sustained low-efficiency dialysis, continuous renal replacement, prolonged intermittent replacement, and intermittent hemodialysis therapies [
In Japan, according to the Ministry of Health, Welfare, and Labor, CKD patients represented 24,100 inpatients and 107,300 outpatients as of October 21-23 in 2014 [
To evaluate the total cost and economic burden, it is necessary to investigate kidney failure patients and to compare them with healthy individuals. Evaluating risk factors is also important for reducing the prevalence of kidney failure. It is therefore necessary to investigate a dataset that includes both healthy individuals and kidney disease patients. However, in most countries, it is very difficult and costly to obtain a large-scale dataset that includes many healthy individuals, because such people do not voluntarily go to hospitals or clinics.
The health insurance societies, formed by private companies and central and local governments, pay the costs for yearly mandatory medical checkups (hereafter, checkups) for employees age 40 or older [
In this paper, we first analyze the total costs and economic burden of kidney failure/dialysis using a dataset containing 113,979 checkups and 3,172,066 receipts obtained from 48,022 individuals. The distribution of medical costs has a heavy right tail, and many zero values are observed. We therefore use a model that combines the power transformation with the Tobit model. Risk factors for developing kidney failure/dialysis are then investigated using the probit model.
In this study, we analyzed an anonymized dataset combining checkups and receipts. First, we compared the distributions of medical costs for all cases and for kidney failure/dialysis cases. Since other variables might affect medical costs, we then evaluated their effects. For example, individuals with kidney diseases sometimes have complications such as cardiovascular diseases [
To remove the effects of other variables, a regression analysis was required. However, standard regression analysis poses two problems for medical cost data. First, the data contain many zero values (about 20%). Second, the distribution has a very heavy right tail. We therefore used the power transformation Tobit model for the analysis. Finally, we used the probit model to analyze the risk factors for kidney failure/dialysis. For details regarding Tobit and probit models, see Amemiya [
The dataset contained information regarding 113,979 checkups obtained from 48,022 members of the society and all their receipts from fiscal year 2013 to fiscal year 2015 (i.e., April 2013 through March 2016). It was created with the cooperation of the health insurance society of one large Japanese corporation that has offices and operational centers throughout Japan. The receipts were classified into five categories: dental; inpatients of DPC hospitals (hereafter, DPC); outpatients and inpatients of non-DPC hospitals (hereafter, outpatient & non-DPC); care-giving; and pharmacies. Of these, we used DPC hospital, outpatient & non-DPC hospital, and pharmacy receipts for the analysis of kidney failure/dialysis. The total cost for these three categories is subsequently referred to as the “medical cost”. During the sample period, there were 15,652 DPC receipts, 1,986,494 outpatient & non-DPC hospital receipts, and 1,169,920 pharmacy receipts, for a total of 3,172,066 receipts. These receipts were added up, and medical costs in each fiscal year were calculated. A total of 113,979 cases for which both the results of checkups and medical costs were available in the same fiscal year were used. Cases where body mass index (BMI) or diastolic blood pressure (DBP) values were implausibly large (over 100 and 300, respectively) were excluded, leaving 95,353 cases without missing values in any explanatory variables for use in the analyses of medical costs and risk factors.
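The data preparation described above can be sketched in a few lines of Python. This is a minimal illustration only, not the procedure actually used: the record layout (`person_id`, a receipt date, a category label, and points), the category labels, and the rule that a checkup with no matching receipts has a medical cost of zero are all assumptions made for the example.

```python
from collections import defaultdict
from datetime import date

def fiscal_year(d):
    """Japanese fiscal year: April of year Y through March of year Y+1 is FY Y."""
    return d.year if d.month >= 4 else d.year - 1

# Receipt categories summed into the "medical cost" (labels are hypothetical).
MEDICAL_CATEGORIES = {"dpc", "outpatient_non_dpc", "pharmacy"}

def annual_medical_costs(receipts):
    """Sum points per (person_id, fiscal year) over the three categories.

    receipts: iterable of (person_id, date, category, points) tuples.
    Dental and care-giving receipts are ignored.
    """
    totals = defaultdict(int)
    for person_id, d, category, points in receipts:
        if category in MEDICAL_CATEGORIES:
            totals[(person_id, fiscal_year(d))] += points
    return dict(totals)

def merge_with_checkups(checkups, totals):
    """Attach a medical cost to each checkup and drop implausible values.

    checkups: iterable of dicts with keys person_id, fiscal_year, bmi, dbp.
    Assumption: no matching receipts means a medical cost of zero.
    """
    cases = []
    for c in checkups:
        if c["bmi"] > 100 or c["dbp"] > 300:  # exclusion rule from the text
            continue
        cost = totals.get((c["person_id"], c["fiscal_year"]), 0)
        cases.append({**c, "medical_cost": cost})
    return cases
```

Under these assumptions, a receipt dated 31 March 2014 is counted in fiscal year 2013, while one dated 1 April 2014 is counted in fiscal year 2014.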
As mentioned, medical costs contain many zero values. A Tobit model (also called a censored regression model or limited dependent variable model) is widely used to deal with this. In the Tobit model, the dependent variable is observed as the value of a latent variable when that value is non-negative, and as 0 otherwise. The model is given by
$$Y_i^* = x_{1i}'\beta + u_i, \tag{1}$$

$$Y_i = \begin{cases} Y_i^* & \text{if } Y_i^* \ge 0, \\ 0 & \text{if } Y_i^* < 0, \end{cases}$$
where $x_{1i}$ and $\beta$ are vectors of explanatory variables and unknown parameters, respectively. $Y_i^*$ is not observable when its value is negative. Estimation usually assumes that the error term $u_i$ is normally distributed and uses the maximum likelihood estimator (MLE). In this case, however, the distribution was quite different from the normal distribution, so we could not use the MLE directly. Sittig, Friedel and Wasem [
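$$Y_i = MC_i^{\alpha}, \quad \alpha > 0, \tag{2}$$

where $MC_i$ is the medical cost and $\alpha$ is the transformation parameter. Combining these two models, we obtained a power transformation Tobit model, whose log-likelihood is

$$\log L(\theta) = \sum_{MC_i > 0} \left\{ -\frac{1}{2}\log 2\pi - \log\sigma - \frac{(Y_i - x_{1i}'\beta)^2}{2\sigma^2} + \log\alpha + (\alpha - 1)\log MC_i \right\} + \sum_{MC_i = 0} \log\left\{ 1 - \Phi\left(\frac{x_{1i}'\beta}{\sigma}\right) \right\}, \tag{3}$$

where $\sigma^2$ is the variance of $u_i$, $\Phi$ is the standard normal distribution function, and $\theta = (\alpha, \beta', \sigma)$. Let $\hat{\theta}$ be the MLE that maximizes $\log L(\theta)$, and $\theta_0$ be the true parameter value. Then its asymptotic distribution is given by

$$\sqrt{n}\,(\hat{\theta} - \theta_0) \to N(0, A^{-1}), \quad A = -\lim_{n \to \infty} \frac{1}{n} E\left[ \left. \frac{\partial^2 \log L}{\partial\theta\,\partial\theta'} \right|_{\theta_0} \right]. \tag{4}$$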
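As an illustration of how the log-likelihood in Equation (3) can be evaluated, the following is a minimal Python sketch using only the standard library. The function name, argument layout, and use of plain lists are assumptions made for the example; it is not the estimation code used in this study, and in practice the function would be passed to a numerical optimizer to obtain the MLE.

```python
import math
from statistics import NormalDist

def power_tobit_loglik(alpha, beta, sigma, X, mc):
    """Log-likelihood of the power transformation Tobit model, Equation (3).

    alpha : transformation parameter (> 0)
    beta  : list of coefficients
    sigma : standard deviation of the error term (> 0)
    X     : list of covariate rows, each the same length as beta
    mc    : list of non-negative medical costs, one per row of X
    """
    if alpha <= 0 or sigma <= 0:
        raise ValueError("alpha and sigma must be positive")
    std_normal = NormalDist()
    total = 0.0
    for x_i, mc_i in zip(X, mc):
        xb = sum(b * x for b, x in zip(beta, x_i))
        if mc_i > 0:
            # Normal log-density of Y = MC**alpha plus the log-Jacobian
            # log(alpha) + (alpha - 1) * log(MC) of the transformation.
            y_i = mc_i ** alpha
            total += (-0.5 * math.log(2 * math.pi) - math.log(sigma)
                      - (y_i - xb) ** 2 / (2 * sigma ** 2)
                      + math.log(alpha) + (alpha - 1) * math.log(mc_i))
        else:
            # 1 - Phi(xb / sigma) equals Phi(-xb / sigma) by symmetry;
            # the latter form is numerically more stable for large xb.
            total += math.log(std_normal.cdf(-xb / sigma))
    return total
```

The two branches correspond to the two sums in Equation (3): positive costs contribute a transformed normal density, and zero costs contribute the probability that the latent variable is negative.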
As explanatory variables that might affect medical costs, we chose the following: Age, Female, Height, BMI, SBP (systolic blood pressure), DBP (diastolic blood pressure), Eat_fast (1: eating faster than other people, 0: otherwise), Late_Supper (1: eating supper within two hours before bedtime three times or more in a week, 0: otherwise), After_Supper (1: eating snacks after supper three times or more in a week, 0: otherwise), No_Breakfast (1: not eating breakfast three times or more in a week, 0: otherwise), Exercise (1: doing exercise for 30 minutes or more two or more times a week for more than a year, 0: otherwise), Daily_activity (1: doing physical activities (walking or equivalent) for one hour or more daily, 0: otherwise), Walk_fast (1: walking faster than other people of a similar age and gender, 0: otherwise), Smoke (1: smoking, 0: otherwise), Alcohol_freq (0: not drinking alcoholic drinks, 1: sometimes, 2: every day), Alcohol_amount (0: not drinking, 1: drinking less than 180 ml of Japanese sake (about 15% alcohol) or equivalent alcohol in a day, 2: drinking 180 - 360 ml, 3: drinking 360 - 540 ml, 4: drinking 540 ml or more), Sleep (1: sleeping well, 0: otherwise), F2014 (1: fiscal year 2014, 0: otherwise), F2015 (1: fiscal year 2015, 0: otherwise), Kidney (1: with kidney failure/dialysis, 0: otherwise), Cerebrovascular (1: with cerebrovascular diseases, 0: otherwise), Cardiovascular (1: with cardiovascular diseases, 0: otherwise), Diabetes (1: with diabetes, 0: otherwise), and Anamnesis (1: with anamnesis, 0: otherwise).
Age, Female and Height represent basic characteristics of an individual. BMI represents obesity [
As a result, $Y_i^*$ in Equation (1) becomes (Model A):

$$
\begin{aligned}
Y_i^* ={} & \beta_1 + \beta_2\,\text{Age} + \beta_3\,\text{Female} + \beta_4\,\text{Height} + \beta_5\,\text{BMI} + \beta_6\,\text{SBP} + \beta_7\,\text{DBP} \\
& + \beta_8\,\text{Eat\_fast} + \beta_9\,\text{Late\_Supper} + \beta_{10}\,\text{After\_Supper} + \beta_{11}\,\text{No\_Breakfast} \\
& + \beta_{12}\,\text{Exercise} + \beta_{13}\,\text{Daily\_activity} + \beta_{14}\,\text{Walk\_fast} + \beta_{15}\,\text{Smoke} \\
& + \beta_{16}\,\text{Alcohol\_freq} + \beta_{17}\,\text{Alcohol\_amount} + \beta_{18}\,\text{Sleep} + \beta_{19}\,\text{F2014} + \beta_{20}\,\text{F2015} \\
& + \beta_{21}\,\text{Kidney} + \beta_{22}\,\text{Cerebrovascular} + \beta_{23}\,\text{Cardiovascular} + \beta_{24}\,\text{Diabetes} + \beta_{25}\,\text{Anamnesis} + u_i
\end{aligned}
\tag{5}
$$
Next, we evaluated the risk factors for kidney failure/dialysis using the probit model. Let $Z_i$ be a dummy variable that takes 1 if an individual has kidney failure/dialysis and 0 otherwise. The probit model (Model B) is given by

$$Z_i^* = x_{2i}'\gamma + u_i, \tag{6}$$

$$Z_i = \begin{cases} 1 & \text{if } Z_i^* \ge 0, \\ 0 & \text{if } Z_i^* < 0, \end{cases}$$

and

$$P(Z_i = 1) = P(Z_i^* \ge 0) = \Phi(x_{2i}'\gamma),$$
where $u_i$ follows the standard normal distribution and $\Phi$ is its distribution function. $Z_i^*$ is a latent variable, and only its sign is observable. Basic characteristics of individuals, lifestyles, and factors determined to be relevant in previous studies [
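$$
\begin{aligned}
Z_i^* ={} & \gamma_1 + \gamma_2\,\text{Age} + \gamma_3\,\text{Female} + \gamma_4\,\text{Height} + \gamma_5\,\text{BMI} + \gamma_6\,\text{SBP} + \gamma_7\,\text{DBP} \\
& + \gamma_8\,\text{Eat\_fast} + \gamma_9\,\text{Late\_Supper} + \gamma_{10}\,\text{After\_Supper} + \gamma_{11}\,\text{No\_Breakfast} \\
& + \gamma_{12}\,\text{Exercise} + \gamma_{13}\,\text{Daily\_activity} + \gamma_{14}\,\text{Walk\_fast} + \gamma_{15}\,\text{Smoke} \\
& + \gamma_{16}\,\text{Alcohol\_freq} + \gamma_{17}\,\text{Alcohol\_amount} + \gamma_{18}\,\text{Sleep} + \gamma_{19}\,\text{F2014} + \gamma_{20}\,\text{F2015} \\
& + \gamma_{21}\,\text{Diabetes} + u_i
\end{aligned}
\tag{7}
$$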
As mentioned before, Cerebrovascular, Cardiovascular and Anamnesis are considered important risk factors. However, these disease variables raise endogeneity problems: kidney disease is itself an important component of anamnesis, and the causal relationships among kidney, cerebrovascular and cardiovascular diseases are unclear. We therefore excluded these variables and considered the reduced form rather than the structural form in this study.
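To make the link between Model B and a predicted probability concrete, the short Python sketch below evaluates $\Phi(x_{2i}'\gamma)$ for a single individual. The coefficient values are copied from the probit estimates reported in the results table of this paper, and the individual is the base case used in the discussion (male, age 50, height 170 cm, BMI 24, SBP 125 mmHg, DBP 80 mmHg, all other variables zero). The dictionary layout and function name are assumptions made for the example.

```python
from statistics import NormalDist

# Probit estimates copied from the Model B results table. Variables that are
# zero in the base case are omitted because they add nothing to the index.
GAMMA = {
    "Constant": -4.2392, "Age": 0.0149, "Height": 0.0027, "BMI": -0.0113,
    "SBP": 0.0099, "DBP": -0.0096, "Diabetes": 0.3313,
}

def probit_probability(x, gamma=GAMMA):
    """Return Phi(x'gamma) for a dict x mapping variable names to values."""
    index = gamma["Constant"] + sum(gamma[k] * v for k, v in x.items())
    return NormalDist().cdf(index)

base = {"Age": 50, "Height": 170, "BMI": 24, "SBP": 125, "DBP": 80, "Diabetes": 0}
p_base = probit_probability(base)
p_diabetes = probit_probability({**base, "Diabetes": 1})
print(f"base: {p_base:.2%}, with diabetes: {p_diabetes:.2%}")
```

With these inputs the linear index is about -2.84 for the base case and about -2.51 with diabetes, so the two probabilities should be close to the 0.23% and 0.61% reported in the discussion.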
The distribution of medical costs for all cases is shown in
medical costs (1000 points) | percent of cases | percent of costs | medical costs (1000 points) | percent of cases | percent of costs |
---|---|---|---|---|---|
0 | 18.93% | 0.0% | 50 - 75 | 1.81% | 8.0% |
0 - 1 | 8.35% | 0.4% | 100 - 200 | 0.93% | 9.3% |
1 - 5 | 25.45% | 4.9% | 200 - 300 | 0.39% | 6.8% |
5 - 10 | 14.71% | 7.8% | 300 - 400 | 0.14% | 3.6% |
10 - 15 | 9.30% | 8.4% | 400 - 500 | 0.08% | 2.7% |
15 - 20 | 6.42% | 8.2% | 500 - 600 | 0.08% | 3.3% |
20 - 30 | 7.31% | 13.0% | 600 - 700 | 0.05% | 2.4% |
30 - 40 | 3.55% | 8.9% | 700 - | 0.03% | 2.2% |
40 - 50 | 1.78% | 5.8% |
In 281 cases (0.25% of all cases), individuals were diagnosed with kidney failure/dialysis as an anamnesis, and their medical costs were 3.5% of total medical costs. The average and SD of their medical costs were 191,791 and 242,490 points, respectively. Their average medical cost was thus 14.5 times that of cases without kidney failure/dialysis. Moreover, among the cases where the medical costs exceeded 100,000 and 500,000 points, kidney failure/dialysis cases accounted for 5.9% (=114/1945) and 32.1% (=61/190), respectively. Individuals underwent checkups in one to three years of our dataset. A total of 122 individuals were diagnosed with kidney failure/dialysis in one year, while 30 and 33 individuals were diagnosed in two and three years, respectively. The average medical costs per fiscal year were 63,608, 206,935 and 340,576 points for those diagnosed with kidney failure/dialysis in one, two and three years, respectively.
Variable | Summary statistics | Variable | Summary statistics |
---|---|---|---|
Age | mean: 49.4, SD: 7.1 | Walk_fast | 1: 40.1%, 0: 59.9% |
Female | 1: 22.9%, 0: 77.1% | Smoke | 1: 38.7%, 0: 61.9% |
Height (cm) | mean: 167.3, SD: 8.6 | Alcohol_freq | 0: 35.0%, 1: 27.3%, 2: 37.6% |
BMI (kg/m2) | mean: 23.7, SD: 3.7 | Alcohol_amount | 0: 35.0%, 1: 22.5%, 2: 28.1%, 3: 11.3%, 4: 3.1% |
SBP (mmHg) | mean: 124.9, SD: 16.4 | Sleep | 1: 63.0%, 0: 37.0% |
DBP (mmHg) | mean: 77.3, SD: 11.7 | F2014 | 1: 33.4%, 0: 66.6% |
Eat_fast | 1: 31.8%, 0: 68.2% | F2015 | 1: 34.7%, 0: 65.3% |
Late_Supper | 1: 42.1%, 0: 57.9% | Kidney | 1: 0.25%, 0: 99.75% |
After_Supper | 1: 13.3%, 0: 86.9% | Cerebrovascular | 1: 0.92%, 0: 99.08% |
No_Breakfast | 1: 24.0%, 0: 76.0% | Cardiovascular | 1: 2.0%, 0: 98.0% |
Exercise | 1: 18.1%, 0: 71.9% | Diabetes | 1: 2.9%, 0: 97.1% |
Daily_activity | 1: 28.3%, 0: 71.7% | Anamnesis | 1: 47.9%, 0: 52.9% |
Variable | Estimate | SE | t-value | Variable | Estimate | SE | t-value |
---|---|---|---|---|---|---|---|
α | 0.506 | 0.001 | 699.58** | Alcohol_freq | −5.055 | 0.500 | −10.11** |
Constant | −149.052 | 9.564 | −15.58** | Alcohol_amount | 3.082 | 0.365 | 8.437** |
Age | 2.006 | 0.043 | 46.83** | Sleep | −3.939 | 0.556 | −7.090** |
Female | 15.432 | 0.958 | 16.106** | F2014 | −0.354 | 0.652 | −0.543 |
Height | 0.154 | 0.047 | 3.295** | F2015 | 2.914 | 0.641 | 4.549** |
BMI | 3.368 | 0.078 | 43.03** | Kidney | 233.727 | 2.832 | 82.54** |
SBP | −0.010 | 0.024 | −0.402 | Cerebrovascular | 48.585 | 1.894 | 25.66** |
DBP | 0.115 | 0.035 | 3.249** | Cardiovascular | 60.698 | 1.416 | 42.86** |
Eat_fast | 6.267 | 0.603 | 10.40** | Diabetes | 65.955 | 11.588 | 5.691** |
Late_Supper | 1.121 | 0.569 | 1.969* | Anamnesis | 36.753 | 0.948 | 38.76** |
After_Supper | −1.714 | 0.805 | −2.129* | σ | 80.601 | 0.777 | 103.72 |
No_Breakfast | −15.265 | 0.656 | −23.29** | No. of Cases (Total) | 95,363 | ||
Exercise | −0.050 | 0.729 | −0.069 | y = 0 | 18,221 | ||
Daily_activity | −4.453 | 0.622 | −7.163** | y > 0 | 77,122 | ||
Walk_fast | −7.308 | 0.574 | −12.74** | Log likelihood | −896,846 | ||
Smoke | −13.513 | 0.590 | −22.91** |
*: significant at the 5% level, **: significant at the 1% level, SE: standard error.
costs, too much increased costs. The estimate of F2015 was positive and significant at the 1% level. Concerning the disease variables, all estimates were positive and significant at the 1% level. The values were much larger than those of other variables. If an individual has an anamnesis, the medical costs become much higher, as expected. In particular, the estimate for Kidney was 233.7, much higher than even those of other diseases; this means that kidney failure/dialysis is a very costly disease even after eliminating the effects of other variables.
Variable | Estimate | SE | t-value | Variable | Estimate | SE | t-value |
---|---|---|---|---|---|---|---|
Constant | −4.2392 | 0.7760 | −5.463** | Walk_fast | −0.1143 | 0.0488 | −2.342* |
Age | 0.0149 | 0.0033 | 4.523** | Smoke | −0.0698 | 0.0511 | −1.366 |
Female | −0.0605 | 0.0795 | −0.760 | Alcohol_freq | −0.1027 | 0.0447 | −2.297* |
Height | 0.0027 | 0.0041 | 0.672 | Alcohol_amount | −0.0112 | 0.0338 | −0.330 |
BMI | −0.0113 | 0.0066 | −1.718 | Sleep | 0.0572 | 0.0483 | 1.184 |
SBP | 0.0099 | 0.0019 | 5.134** | F2014 | −0.0054 | 0.0226 | −0.240 |
DBP | −0.0096 | 0.0029 | −3.313** | F2015 | −0.0118 | 0.0224 | −0.528 |
Eat_fast | 0.0461 | 0.0501 | 0.922 | Diabetes | 0.3313 | 0.0863 | 3.840** |
Late_Supper | −0.0338 | 0.0494 | −0.684 | No. of Cases (Total) | 95363 | ||
After_Supper | 0.0840 | 0.0630 | 1.334 | y = 1 | 215 | ||
No_Breakfast | −0.0940 | 0.0618 | −1.522 | y = 0 | 95148 | ||
Exercise | −0.0396 | 0.0610 | −0.649 | Log likelihood | −1460.9 | ||
Daily_activity | 0.0535 | 0.0519 | 1.031 |
*: significant at the 5% level, **: significant at the 1% level, SE: standard error.
The results of our analyses suggested that kidney failure/dialysis is a very costly disease, especially when it progresses (i.e., becomes CKD). The average medical costs per fiscal year for those diagnosed with kidney failure/dialysis for one, two and three years were 4.8, 15.6 and 25.6 times, respectively, those of individuals without kidney failure. As a result, the 99 cases for the 33 individuals diagnosed with kidney failure/dialysis in all three years, less than 0.1% of all cases, accounted for 2.2% of total medical costs. Prevention and early-stage treatment of kidney failure/dialysis, especially to avoid CKD, are therefore very important issues for the Japanese medical system.
The estimation results of Model A suggested that kidney failure/dialysis is a truly costly disease even when the effects of other variables are eliminated. Because the power transformation Tobit model is nonlinear in the medical cost, we calculated the average cost by computer simulation with 10,000 trials. As a base case for comparing medical costs, we considered a male aged 50 with a height of 170 cm, BMI of 24, SBP of 125 mmHg and DBP of 80 mmHg; the values of all other variables were set to zero. The average annual medical cost of this individual was about 9,200 points. If he had kidney failure/dialysis, however, the cost increased to 103,390 points, or 11.3 times the base case. The medical cost of an individual with both kidney failure/dialysis and diabetes became 147,960 points, or 16.2 times the base case.
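The simulation step can be sketched as follows. This is a minimal Python illustration under stated assumptions, not the code used in the study: it draws the latent variable $Y^* = x'\beta + u$ with normal errors, censors it at zero, and inverts the power transformation via $MC = Y^{1/\alpha}$. The coefficient values are copied from the Model A results table, while the function name, the random seed, and the use of Python's `random` module are choices made for the example.

```python
import random

ALPHA, SIGMA = 0.506, 80.601  # estimates from the Model A results table

def simulate_mean_cost(index, sigma=SIGMA, alpha=ALPHA, n_trials=10_000, seed=0):
    """Monte Carlo mean of MC = max(Y*, 0) ** (1 / alpha), Y* ~ N(index, sigma^2)."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_trials):
        y_star = index + rng.gauss(0.0, sigma)
        if y_star > 0:  # censored draws contribute a cost of zero
            total += y_star ** (1.0 / alpha)
    return total / n_trials

# Linear index x'beta for the base case: male, age 50, height 170 cm, BMI 24,
# SBP 125 mmHg, DBP 80 mmHg, every other variable set to zero.
base_index = (-149.052 + 2.006 * 50 + 0.154 * 170 + 3.368 * 24
              - 0.010 * 125 + 0.115 * 80)
kidney_index = base_index + 233.727  # add the Kidney coefficient only

base_cost = simulate_mean_cost(base_index)
kidney_cost = simulate_mean_cost(kidney_index)
print(round(base_cost), round(kidney_cost))
```

The base-case result should be of the same order as the roughly 9,200 points quoted above, up to simulation noise. The text does not state exactly which dummy variables (for example, Anamnesis) were switched on for the comparison cases, so this sketch, which changes only Kidney, need not reproduce the reported figures for those cases.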
Next, we calculated the probability of an individual developing kidney failure/dialysis using the results of Model B and the formula $\hat{P}(Z_i = 1) = \Phi(x_{2i}'\hat{\gamma})$. For the base case, the probability of an individual developing kidney failure/dialysis was 0.23%. If this individual had diabetes, the probability increased to 0.61%, or 2.7 times the base case. Nawata and Kimura [
In this paper, we evaluated the medical costs and probability of kidney diseases (kidney failure/dialysis). The distribution of total medical costs has a very heavy right tail. In a small number (0.25%) of cases, individuals were diagnosed with kidney diseases, and their medical costs amounted to 3.5% of all medical costs. Care for an individual with kidney disease cost 14.5 times as much as care for an individual without kidney disease. Moreover, if the disease progressed to CKD, the medical cost increased substantially. We then used the power transformation Tobit model to eliminate the effects of other variables that might affect the medical costs. Even after controlling for various characteristics, lifestyles and medical histories of individuals, the conclusion was the same; that is, kidney diseases are very costly. Finally, risk factors were evaluated using the probit model. The important risk factors for kidney diseases are diabetes and blood pressure problems (not only high SBP, but also low DBP). In particular, an individual with both diabetes and blood pressure problems has a very high probability of developing kidney diseases. These results could help medical personnel to identify high-risk individuals and to provide them with sound advice and/or treat them at an early stage of kidney disease.
We note two limitations to this study. First, our dataset was observational and covered only 3 years. Many individuals with kidney diseases were already receiving medical treatments that might have affected the values of explanatory variables, and individuals with severe kidney diseases might have had to leave the company and thus the health insurance society. Second, the number of individuals with kidney diseases was relatively small. It will be necessary to analyze a larger dataset covering a longer period from various insurance societies. We are currently negotiating with various health insurance societies to provide us with such datasets. In Japan, medical costs are determined by the government, and the same payment system is used regardless of region, with a few exceptions. Moreover, an individual can freely choose hospitals and clinics. Analyses of regional effects and of the characteristics of hospitals and clinics are therefore very important. We will also need to analyze other costly diseases such as heart and brain diseases. These are subjects for future study.
This study was supported by a Grant-in-Aid for Scientific Research, “Analyses of Medical Checkup Data and Possibility of Controlling Medical Expenses (Grant Number: 17H22509),” from the Japan Society of Science, and by a research grant, “Exploring Inhibition of Medical Expenditure Expansion and Health-oriented Business Management Based on Evidence-based Medicine” from the Research Institute of Economics, Trade and Industry (RIETI). The dataset was anonymized at the health insurance society. This study was approved by the Institutional Review Boards of the University of Tokyo (number: KE17-10). The authors would like to thank the health insurance society for their sincere cooperation in providing us the data. We would also like to thank an anonymous referee for his/her helpful comments and suggestions.
Nawata, K. and Kimura, M. (2017) Evaluation of Medical Costs of Kidney Diseases and Risk Factors in Japan. Health, 9, 1734-1749. https://doi.org/10.4236/health.2017.913127