
Predictive analytics for 30-day hospital readmissions

The first author is supported by MTSU FRCAC grant 2019

Abstract
  • The 30-day hospital readmission rate is the percentage of patients who are readmitted within 30 days of their most recent hospital discharge. Hospitals with high readmission rates have to pay penalties to the Centers for Medicare & Medicaid Services (CMS). Predicting readmissions can help a hospital allocate its resources to reduce its readmission rate. In this research, we use a data set of 71724 hospital admissions from a hospital in North Carolina, covering the years 2011 to 2016. We aim to provide a predictive model that helps related entities, including hospitals, health insurance actuaries, and Medicare, reduce costs and improve the clinical outcomes of the healthcare system. We used R to process the data and applied clustering, a generalized linear model (GLM), and LASSO regression to predict 30-day readmissions. The patient's age turns out to be the most important factor affecting hospital readmission. This research can help hospitals and CMS reduce costly readmissions.

    Mathematics Subject Classification: Primary: 62P10, 62J05; Secondary: 91C20.

  • Figure 1.  Histogram of days between the admissions (DBA)

    Figure 2.  Histogram of the log(DBA+1)

    Figure 3.  Box plot of patient marital status VS log(DBA+1)

    Figure 4.  The cluster dendrogram of $\texttt{Doctor.Number}$. Each vertical line at Height 0 represents one doctor. The doctors inside the same red box are clustered into one group

    Figure 5.  Box plot of PATIENT.SEX.CODE VS log(DBA+1) split by inpatient (I) and outpatient (O)

    Figure 6.  Box plot of Age vs log(DBA+1) split by inpatient (I) and outpatient (O)

    Figure 7.  Residuals vs fitted values. The left panel is for the GLM with the features selected in the preceding section; the right panel is for OLS with all predictors

    Figure 8.  Q-Q plots of GLM (left) vs ordinary least squares regression (OLS, right)

    Table 1.  The mean and median of seven levels of PATIENT.MARITAL.STATUS

    PATIENT.MARITAL.STATUS mean median n
    D 4.06432 4.20469 8122
    M 4.31861 4.48864 26568
    P 4.84009 5.01728 21
    S 4.42558 4.67283 22164
    U 3.41953 2.83321 120
    W 4.05355 4.11087 7196
    X 4.05785 4.07754 1492

    Table 2.  The largest loadings of related predictors in the first principal component (PC1)

    Variable Name PC1
    ICD.PROCEDURE.CODE1 -0.38498161
    HOSPITAL.SERVICE.CODEG -0.38284437
    ICD9_DIAGNOST_CODE3 -0.29296143
    PATIENT.DDRG1 -0.19955252
    ICD9_DIAGNOST_CODE2 0.38297941
    ICD.PROCEDURE.CODE5 0.38210699
    HOSPITAL.SERVICE.CODEB 0.36536618
    PATIENT.DRG4 0.24330584

    Table 3.  The results of GLM on training data generated by R

    Coefficients: Estimate Std. Error t value p-value Significant Code
    (Intercept) 5.99381 0.1633 36.7 < 2e-16 ***
    DOCTOR.NUMBER2 -0.60542 0.06983 -8.67 < 2e-16 ***
    DOCTOR.NUMBER3 -0.1012 0.04012 -2.52 0.01166 *
    DOCTOR.NUMBER4 0.31089 0.03528 8.81 < 2e-16 ***
    DOCTOR.NUMBER5 0.52643 0.35486 1.48 0.13795 ·
    ServCode_DRG_ICD -0.20195 0.00603 -33.48 < 2e-16 ***
    Age10-20 -0.26285 0.05948 -4.42 9.90E-06 ***
    Age100+ -1.86377 0.39675 -4.7 2.60E-06 ***
    Age20-30 -0.61615 0.05559 -11.08 < 2e-16 ***
    Age30-40 -0.66055 0.05605 -11.78 < 2e-16 ***
    Age40-50 -0.63653 0.05632 -11.3 < 2e-16 ***
    Age50-60 -0.63357 0.05637 -11.24 < 2e-16 ***
    Age60-70 -0.63019 0.05726 -11.01 < 2e-16 ***
    Age70-80 -0.60506 0.05855 -10.33 < 2e-16 ***
    Age80-90 -0.62985 0.06173 -10.2 < 2e-16 ***
    Age90-100 -0.6078 0.08093 -7.51 6.00E-14 ***
    Surgeon2 -0.27524 0.02279 -12.08 < 2e-16 ***
    Surgeon3 -0.18455 0.06706 -2.75 0.00592 **
    Surgeon4 -0.36174 0.04333 -8.35 < 2e-16 ***
    Surgeon5 0.53341 0.34909 1.53 0.12652 ·
    PATIENT.MARITAL.STATUSM 0.16705 0.01716 9.73 < 2e-16 ***
    PATIENT.MARITAL.STATUSPS 0.05911 0.02037 2.9 0.00371 **
    Nur.StatMS_CCU_ER_PC 0.02249 0.03593 0.63 0.53137 ·
    Nur.StatWS 0.2277 0.06175 3.69 0.00023 ***
    DISCHARGE.STATUSR 0.39629 0.0448 8.85 < 2e-16 ***
    DISCHARGE.STATUSY 0.43827 0.05653 7.75 9.20E-15 ***
    income_levellow -0.11351 0.02611 -4.35 1.40E-05 ***
    income_levelmeduim -0.09598 0.01506 -6.37 1.80E-10 ***
    Patient.Days20-30 -0.28387 0.04367 -6.5 8.10E-11 ***
    Patient.Days30+ -0.48626 0.32127 -1.51 0.13015 ·
    PATIENT.RACE.CODEBO -0.18993 0.14455 -1.31 0.18887 ·
    PATIENT.RACE.CODEDH -0.99651 0.20834 -4.78 1.70E-06 ***
    PATIENT.RACE.CODEWX -0.11298 0.14369 -0.79 0.43171 ·
    IO_CODEO -0.18799 0.04251 -4.42 9.80E-06 ***
    Significant codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
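The significance codes in the legend follow R's usual p-value cutpoints. As a minimal sketch (written in Python for illustration, whereas the analysis itself used R), the hypothetical helper `signif_code` maps a p-value to its code. It assumes right-closed intervals, so that for example 0.001 < p <= 0.01 gives '**'.

```python
def signif_code(p: float) -> str:
    """Map a p-value to the significance code from the table legend.

    Assumption: intervals are right-closed, e.g. 0.001 < p <= 0.01 gives '**'.
    """
    if not 0.0 <= p <= 1.0:
        raise ValueError("p-value must lie in [0, 1]")
    if p <= 0.001:
        return "***"
    if p <= 0.01:
        return "**"
    if p <= 0.05:
        return "*"
    if p <= 0.1:
        return "."
    return " "


# p-values copied from Table 3
examples = [("DOCTOR.NUMBER3", 0.01166), ("Surgeon3", 0.00592), ("Nur.StatWS", 0.00023)]
for name, p in examples:
    print(name, repr(signif_code(p)))
```

The three example rows reproduce the codes printed next to them in the table.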

    Table 4.  The R output of the generalized linear model (GLM) on all data

    Coefficients: Estimate Std. Error t value p-value Significant Code
    (Intercept) 5.84694 0.06366 91.84 < 2e-16 ***
    DOCTOR.GROUP2 -0.58933 0.06102 -9.66 < 2e-16 ***
    DOCTOR.GROUP3 -0.1105 0.03487 -3.17 0.00153 **
    DOCTOR.GROUP4 0.28857 0.03085 9.36 < 2e-16 ***
    ServCode_DRG_ICD -0.201 0.00528 -38.1 < 2e-16 ***
    Age10-20 -0.30656 0.05169 -5.93 3.00E-09 ***
    Age100+ -1.16735 0.35101 -3.33 0.00088 ***
    Age20-30 -0.63787 0.0482 -13.23 < 2e-16 ***
    Age30-40 -0.69793 0.04861 -14.36 < 2e-16 ***
    Age40-50 -0.67707 0.04885 -13.86 < 2e-16 ***
    Age50-60 -0.67009 0.04887 -13.71 < 2e-16 ***
    Age60-70 -0.66848 0.04966 -13.46 < 2e-16 ***
    Age70-80 -0.63585 0.05076 -12.53 < 2e-16 ***
    Age80-90 -0.63634 0.0536 -11.87 < 2e-16 ***
    Age90-100 -0.66369 0.07005 -9.47 < 2e-16 ***
    Surgeon2 -0.28085 0.01977 -14.21 < 2e-16 ***
    Surgeon3 -0.22311 0.0586 -3.81 0.00014 ***
    Surgeon4 -0.3631 0.03772 -9.63 < 2e-16 ***
    PATIENT.MARITAL.STATUSM 0.18697 0.01491 12.54 < 2e-16 ***
    PATIENT.MARITAL.STATUSPS 0.06817 0.01772 3.85 0.00012 ***
    Nur.StatWS 0.20997 0.04929 4.26 2.10E-05 ***
    DISCHARGE.STATUSR 0.41057 0.0387 10.61 < 2e-16 ***
    DISCHARGE.STATUSY 0.45851 0.04926 9.31 < 2e-16 ***
    income_levellow -0.08182 0.02262 -3.62 0.0003 ***
    income_levelmeduim -0.09786 0.01307 -7.49 7.20E-14 ***
    Patient.Days20-30 -0.22549 0.03781 -5.96 2.50E-09 ***
    PATIENT.RACE.CODEDH -0.57857 0.13194 -4.39 1.20E-05 ***
    PATIENT.RACE.CODEWX 0.07868 0.01669 4.71 2.40E-06 ***
    IO_CODEO -0.21764 0.02759 -7.89 3.10E-15 ***
    Significant codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

    Table 5.  Interpretation of the GLM results

    Feature Coefficients (β) exp(β)-1 Interpretation
    DOCTOR.GROUP = 2 -0.58933 -0.45 45% decrease in DBA compared to the base group DOCTOR.GROUP = 1. This means the doctors in group 2 are less effective at improving the DBA (or reducing 30-day readmissions) than those in group 1.
    DOCTOR.GROUP = 3 -0.11050 -0.10 10% decrease in DBA compared to the group DOCTOR.GROUP = 1.
    DOCTOR.GROUP = 4 0.28857 0.33 33% increase in DBA compared to the group DOCTOR.GROUP = 1. Therefore the doctors in group 4 are more effective at improving the DBA than those in group 1.
    ServCode_DRG_ICD -0.20100 -0.18 18% decrease in DBA for every 1.0 increase in the feature ServCode_DRG_ICD, which is the new artificial feature made using the PCA from the predictors PATIENT.DRG, HOSPITAL.SERVICE.CODE, ICD.PROCEDURE.CODE, ICD9_DIAGNOST_CODE
    Age = 10-20 -0.30656 -0.26 26% decrease in DBA compared to Age 0-10. This makes sense because younger people are less likely to be readmitted.
    Age = 20-30 -0.63787 -0.47 47% decrease compared to Age 0-10.
    Age = 100+ -1.16735 -0.69 69% decrease compared to Age 0-10.
    Surgeon = 2 -0.28085 -0.24 24% decrease of DBA compared with Surgeon = 1.
    PATIENT.MARITAL.STATUS = M 0.18697 0.21 21% increase of DBA for married patients compared with the base level MARITAL.STATUS = DUWXNA, the group for divorced, widowed, legally separated, or unknown status. Married people can usually be taken care of better and thus have better health.
    PATIENT.MARITAL.STATUS = PS 0.06817 0.07 7% increase of DBA for patients with a domestic partner (P) or single (S) compared with base-level MARITAL.STATUS = DUWXNA.
    income_level = low -0.08182 -0.08 8% decrease of DBA for low-income patients compared to high-income patients. This makes sense because low-income patients may not be able to afford enough healthcare to maintain good health.
    income_level = medium -0.09786 -0.09 9% decrease of DBA compared to high-income patients. Medium-income patients have an even shorter DBA than low-income patients, possibly because they work longer hours and face higher mental pressure.
    Patient.Days = 20-30 -0.22549 -0.20 20% decrease compared to its base level, the visits whose Patient.Days is smaller than 20 or greater than 30. This suggests that visits with 20-30 inpatient days are the most likely to be followed by a readmission after a short interval.
    PATIENT.RACE.CODE = WX 0.07868 0.08 8% increase of DBA when PATIENT.RACE.CODE is W or X compared to its base level PATIENT.RACE.CODE = AIMNT.
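The exp(β)-1 column can be checked with a few lines of arithmetic. The sketch below assumes, consistent with the interpretations in the table, that the response is modelled on a log scale, so that a coefficient β multiplies the expected DBA by exp(β). It is written in Python for illustration (the paper's analysis used R), and `relative_change` is a hypothetical helper name.

```python
import math


def relative_change(beta: float) -> float:
    """Relative change in expected DBA for a one-unit increase: exp(beta) - 1."""
    return math.expm1(beta)


# Coefficients copied from Table 4
coefficients = {
    "DOCTOR.GROUP2": -0.58933,
    "DOCTOR.GROUP4": 0.28857,
    "Age100+": -1.16735,
    "PATIENT.MARITAL.STATUSM": 0.18697,
}

for name, beta in coefficients.items():
    change = relative_change(beta)
    direction = "increase" if change > 0 else "decrease"
    print(f"{name}: {abs(change):.0%} {direction} in DBA")
```

A positive coefficient therefore always corresponds to an increase in DBA and a negative one to a decrease.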

    Table 6.  Data Dictionary

    Variable Name Definition Data type and values
    PATIENT.DRG Patient diagnosis-related group Integer 0-999
    NurStat Nurses Station Letter code representing the type of nurses station, with the majority of values missing.
    DOCTOR.NUMBER ID of the doctor Integer
    Surgeon ID of the surgeon Integer
    IO_CODE Inpatient or outpatient I: inpatient
    O: outpatient
    HOSPITAL.SERVICE.CODE A code representing the type of healthcare service Letters code. No missing value.
    ADMIT.SOURCE The code indicating the source of the referral for the admission or visit. 1:Physician Referral
    2:Clinic Referral
    3:HMO Referral
    4:Transfer from a Hospital
    6:Transfer from Another Health Care Facility
    8:Court/Law Enforcement
    9:Information Not Available
    DISCHARGE.STATUS Patient discharge status Integer code
    PATIENT.SEX.CODE A code indicating the sex of the patient. F:Female
    M:Male
    U: Unknown
    PATIENT.MARITAL.STATUS Marital status D: divorced
    S:single
    M:married
    W:widowed
    U:unknown
    P:partnered
    X:legally Separated
    PATIENT.RACE.CODE Code indicating the racial or ethnic background of a person. A:Asian or Pacific Islander
    B:Black
    D:Subcontinent Asian American
    H:Hispanic
    I:American Indian or Alaskan Native
    N:Black(Non-Hispanic)
    O:White(Non-Hispanic)
    W:widow
    X:legally separated
    ICD.PROCEDURE.CODE ICD-10 Procedure Coding Integer code
    ICD9_DIAGNOST_CODE ICD-9-CM Diagnosis Codes Integer code
    PATIENT ZIP Patient zip code Integer code
    DBA days between the admissions Non-negative integer

    Table 7.  Levels combined for predictors

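    Variable Name Levels before combined Levels after combined
    Nur.Stat isNA, MS, CCU, ER, PC isNA
    Nur.Stat WS WS
    Patient.Days 0-19, 30+ 0-19or30+
    Patient.Days 20-30 20-30
    HOSPITAL.SERVICE.CODE OPS, CTH, NB Y
    HOSPITAL.SERVICE.CODE DX, MED, OBS, OPB, OPV, EOB R
    HOSPITAL.SERVICE.CODE WND, IVT, SO, WWC, NP B
    HOSPITAL.SERVICE.CODE ER G
    ADMIT.SOURCE 9 9
    ADMIT.SOURCE 1, 6 16
    ADMIT.SOURCE 8 8
    ADMIT.SOURCE 4 4
    ADMIT.SOURCE 2, 5 25
    PATIENT.SEX.CODE M, U M
    PATIENT.SEX.CODE F F
    PATIENT.MARITAL.STATUS M M
    PATIENT.MARITAL.STATUS P, S PS
    PATIENT.MARITAL.STATUS D, U, W, X, NA DUWXNA
    PATIENT.RACE.CODE W, X WX
    PATIENT.RACE.CODE A, I, M, N, T, B, O AIMNT
    PATIENT.RACE.CODE D, H DH
    DISCHARGE.STATUS 1, 7, 30 R
    DISCHARGE.STATUS 2, 5, 43, 62, 63, 65, 70, 72 Y
    DISCHARGE.STATUS 3, 4, 6, 9, 21, 50, 51, 64, 81, 82, 83 G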
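Combining levels as in the table amounts to a lookup from each original level to its merged level. A minimal sketch for PATIENT.MARITAL.STATUS follows, in Python for illustration (the paper's processing was done in R). The names `MARITAL_MAP` and `combine_marital` are hypothetical, and missing values are assumed to arrive as `None`.

```python
from typing import Optional

# Mapping read off Table 7: M stays M; P and S merge into PS;
# D, U, W, X and missing values (NA) merge into DUWXNA.
MARITAL_MAP = {
    "M": "M",
    "P": "PS",
    "S": "PS",
    "D": "DUWXNA",
    "U": "DUWXNA",
    "W": "DUWXNA",
    "X": "DUWXNA",
}


def combine_marital(level: Optional[str]) -> str:
    """Return the combined level; None (missing) goes to the DUWXNA group."""
    if level is None:
        return "DUWXNA"
    return MARITAL_MAP[level]


raw = ["M", "S", None, "W", "P"]
print([combine_marital(x) for x in raw])
```

An unknown code raises a `KeyError` in this sketch, which makes unexpected levels visible instead of silently assigning them to a group.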

    Table 8.  The mean and median of log(DBA+1) within three levels of Patient.Days

    Patient.Days Mean of log(DBA+1) Median of log(DBA+1)
    0-19 4.34939 4.54329
    30+ 3.86552 3.98744
    20-30 2.59868 2.07944
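Because log(DBA+1) is a monotone transformation, a median on the log scale can be mapped back to days with exp(x)-1. The same does not hold for the mean: back-transforming a mean of logs does not give the arithmetic mean of DBA. The Python sketch below uses the medians from the table and assumes natural logarithms; `to_days` is a hypothetical helper.

```python
import math


def to_days(log_value: float) -> float:
    """Invert x = log(DBA + 1), assuming a natural logarithm."""
    return math.expm1(log_value)


# Medians of log(DBA+1) copied from Table 8
medians = {"0-19": 4.54329, "30+": 3.98744, "20-30": 2.07944}

for level, m in medians.items():
    print(f"Patient.Days {level}: median DBA of about {to_days(m):.0f} days")
```

Under these assumptions the 20-30 group has a median of about 7 days between admissions, against about 93 days for the 0-19 group.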