There is no denying that machine learning and artificial intelligence (AI) are very much in vogue across the healthcare landscape. AI was a key topic in the president’s address by Mark Okusa, MD, FASN, at last year’s ASN Kidney Week in San Diego. As more healthcare information becomes digital, it is tempting to get excited about the potential for data-backed tools despite the limited deployment of AI in the clinic. Creating risk models in healthcare takes more than just computing power and advanced algorithms; it requires a deep knowledge of the underlying medical problems and a tight integration with clinical teams and their workflows. Clinicians must understand the models to effectively and seamlessly integrate them into their everyday practice.
The Rogosin Institute is affiliated with New York-Presbyterian Weill Cornell Medical Center and specializes in the care of chronic kidney disease (CKD). In 2015, Rogosin created the Program for Education in Advanced Kidney Disease (PEAK), a multidisciplinary care team that assists patients in making a smooth transition to renal replacement therapy (RRT). The PEAK program educates patients about all their options for dialysis and encourages a higher adoption of home dialysis modalities.
Healthcare AI startup pulseData specializes in creating predictive models that provide insight into the clinical domain. Rogosin and pulseData have been collaborating for over a year to effectively deploy machine learning models in a clinical setting. Through this partnership, we discovered that a deep integration of human intelligence and AI methodology matched to customized workflows is a powerful aid to delivering preventive care.
Rogosin referred patients into the PEAK program when they were at CKD stage 4, but with an increasing number of patients and limited resources, PEAK sought a better way to identify the high-risk patients who would benefit most from the transition program. Perhaps machine learning was the answer.
Machine learning
Machine learning might sound complicated, but at its core it is fairly straightforward; it is just labeling with a probability. For example, when you search online for a picture of a cat, the resulting images you see have been labeled by an algorithm as probably being a cat. To create the algorithm, many images that are labeled “cat” or “not cat” are fed into the algorithm, and eventually a machine learns how to label an image as “cat” or “not cat” by creating rules and testing how well those rules work by using the examples it has been shown. Once the algorithm has trained itself in this way, it can then label images it has never seen before.
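To make "labeling with a probability" concrete, here is a minimal sketch in Python. It is not the model described in this article: it fits a one-feature logistic regression by gradient descent, and the feature values, the labels, and the names `train_logistic` and `predict_proba` are all made up for illustration.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def train_logistic(xs, ys, lr=0.1, epochs=2000):
    """Fit p(label = 1 | x) = sigmoid(w * x + b) by gradient descent on log loss."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        grad_w = grad_b = 0.0
        for x, y in zip(xs, ys):
            error = sigmoid(w * x + b) - y  # predicted probability minus true label
            grad_w += error * x
            grad_b += error
        w -= lr * grad_w / n
        b -= lr * grad_b / n
    return w, b

def predict_proba(x, w, b):
    """Probability that an unseen example belongs to the positive class."""
    return sigmoid(w * x + b)

# Hypothetical human-labeled training examples: one numeric feature per image,
# label 1 = "cat", label 0 = "not cat".
xs = [0.5, 1.0, 1.5, 2.0, 3.0, 3.5, 4.0, 4.5]
ys = [0, 0, 0, 0, 1, 1, 1, 1]

w, b = train_logistic(xs, ys)
p_low = predict_proba(1.0, w, b)   # resembles the "not cat" examples
p_high = predict_proba(4.0, w, b)  # resembles the "cat" examples
print(f"p(cat | x=1.0) = {p_low:.2f}, p(cat | x=4.0) = {p_high:.2f}")
```

The output is a probability rather than a hard label, so someone still has to choose the cutoff at which a score leads to action.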
Often ignored is how much human work went into teaching that algorithm what a cat looks like. This was done by feeding the machine thousands of human-labeled cat photos (a dream job for someone). To create an algorithm that would be useful for the PEAK team, we must use the same general principles and teach an algorithm what the ideal PEAK program patient looks like.
There have been previous efforts to stratify patients for risk of kidney failure; the most highly regarded and widely used is the Kidney Failure Risk Equation (KFRE) (1). The KFRE assigns a probability that a patient will progress to eGFR <15 mL/min per 1.73 m² in the next 2 years. Rogosin and pulseData wanted to define the optimal PEAK program candidate more precisely, focusing on patients likely to progress within a much shorter period (the next 6 months) and to a lower eGFR threshold (<10 mL/min per 1.73 m²). The PEAK team clinicians thought this represented a level of renal function at which there would be a reasonable indication to prepare for RRT and that 6 months allowed sufficient time to arrange for venous access placement and maturation, for workup for transplantation, or both. Using eGFR <10 mL/min per 1.73 m² as the outcome also meant the model did not depend on a patient’s or physician’s choice about when to start dialysis and could instead provide a more objective score for the patient and provider to interpret.
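The outcome definition above (eGFR below 10 mL/min per 1.73 m² within the next 6 months) could be turned into a training label along the following lines. This is a sketch under stated assumptions, not the authors' pipeline: the function name `label_outcome`, the 183-day approximation of 6 months, and the data layout (a list of date and eGFR pairs) are all hypothetical.

```python
from datetime import date, timedelta

EGFR_THRESHOLD = 10.0          # mL/min per 1.73 m², the outcome threshold in the text
HORIZON = timedelta(days=183)  # assumption: "6 months" approximated as 183 days

def label_outcome(index_date, egfr_history):
    """Return 1 if any eGFR measured after index_date and within HORIZON
    falls below EGFR_THRESHOLD, otherwise 0.

    egfr_history is a list of (measurement_date, egfr_value) pairs.
    """
    window_end = index_date + HORIZON
    for measured_on, egfr in egfr_history:
        if index_date < measured_on <= window_end and egfr < EGFR_THRESHOLD:
            return 1
    return 0

# Made-up example: eGFR drops below 10 about two months after the index date.
history = [(date(2019, 1, 15), 14.0), (date(2019, 3, 1), 9.5)]
print(label_outcome(date(2019, 1, 1), history))  # prints 1
```

Because the label depends only on a laboratory value, it does not shift with the timing of a decision to start dialysis, which is the property the text highlights.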
The model was built using longitudinal patient data collected from Rogosin’s electronic health record (EHR) system, and features were created across patient demographics, vital signs, comorbidities, laboratory values, and medication use. For those interested in the statistical aspects, the model has an area under the curve (AUC) of 0.93, with a sensitivity of 0.81 and a specificity of 0.89 at the top quintile of risk. We designed the model to emphasize clinical discrimination rather than AUC: the AI model reached a positive predictive value of 0.55 in the top quintile of risk, whereas the KFRE had a positive predictive value of 0.33 with an AUC of 0.92.
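For readers who want to see how such figures are derived, the sketch below computes sensitivity, specificity, and positive predictive value when the top 20% of patients by risk score are treated as flagged. The function name `top_quintile_metrics` and the toy scores and labels are illustrative assumptions; the values printed are not the study's results.

```python
def top_quintile_metrics(scores, labels):
    """Flag the top 20% of patients by score and compare the flags with outcomes.

    Assumes labels are 0 or 1 and that both classes are present, so no
    denominator is zero.
    """
    n = len(scores)
    k = max(1, n // 5)  # number of patients in the top quintile
    ranked = sorted(range(n), key=lambda i: scores[i], reverse=True)
    flagged = set(ranked[:k])

    tp = sum(1 for i in flagged if labels[i] == 1)  # flagged, had the outcome
    fp = k - tp                                     # flagged, did not
    fn = sum(labels) - tp                           # not flagged, had the outcome
    tn = n - k - fn                                 # not flagged, did not

    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),
    }

# Toy data: 10 patients, 2 of whom had the outcome.
scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1, 0.05]
labels = [1, 0, 1, 0, 0, 0, 0, 0, 0, 0]
metrics = top_quintile_metrics(scores, labels)
print(metrics)
```

Positive predictive value answers the question a clinician asks of an alert: of the patients flagged, what fraction actually go on to have the outcome?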
Model performance in clinical practice
Model performance statistics alone do not determine how well a model will fare in actual clinical practice. We examined the hazard decline curves for patients at various levels of risk to select a risk level at which nearly all patients would experience the outcome (eGFR declining to <10 mL/min per 1.73 m²) within a 2-year period. We then set this threshold as the point at which we would refer the patient to the PEAK team.
We examine all the data held within the EHR system once a week and calculate a fresh score for all patients with an eGFR <30 mL/min per 1.73 m² who have not yet started dialysis. For patients who are scored over the high-risk threshold level defined by the clinical team (identified with the red line on the hazard curve in Figure 1), an alert is sent to the patient’s primary nephrologist to prompt referral to the PEAK program to help the patient make a smooth transition to dialysis. The multidisciplinary team then meets each week to review the patients at high risk, ensure that the patient’s treatment plan is documented, and discuss any issues with moving the patient forward.
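The weekly workflow described above can be sketched as a simple filter-and-score loop. Everything named here is hypothetical: the `Patient` fields, the `weekly_alerts` function, the stand-in scoring function, and the threshold of 0.8 are placeholders, since the article does not give the actual threshold value or the structure of the EHR data.

```python
from dataclasses import dataclass

EGFR_ELIGIBILITY = 30.0  # mL/min per 1.73 m², the scoring cutoff from the text

@dataclass
class Patient:
    patient_id: str
    latest_egfr: float
    on_dialysis: bool
    nephrologist: str

def weekly_alerts(patients, score_fn, threshold):
    """Score every eligible patient and return (nephrologist, patient_id, score)
    for each patient whose score exceeds the threshold set by the clinical team."""
    alerts = []
    for p in patients:
        # Only patients with eGFR < 30 who have not started dialysis are scored.
        if p.latest_egfr >= EGFR_ELIGIBILITY or p.on_dialysis:
            continue
        score = score_fn(p)
        if score > threshold:
            alerts.append((p.nephrologist, p.patient_id, score))
    return alerts

# Made-up patients and a stand-in scoring function backed by fixed values.
patients = [
    Patient("p1", 12.0, False, "Dr. A"),  # eligible, high score
    Patient("p2", 25.0, False, "Dr. B"),  # eligible, low score
    Patient("p3", 45.0, False, "Dr. A"),  # eGFR too high to be scored
    Patient("p4", 8.0, True, "Dr. C"),    # already on dialysis
]
toy_scores = {"p1": 0.9, "p2": 0.3, "p3": 0.95, "p4": 0.99}

alerts = weekly_alerts(patients, lambda p: toy_scores[p.patient_id], threshold=0.8)
print(alerts)  # [('Dr. A', 'p1', 0.9)]
```

In this sketch the alert only prompts a referral; the decision and the follow-up remain with the nephrologist and the multidisciplinary team, as in the workflow the text describes.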
The PEAK team has achieved increased rates of preemptive transplantation and optimal outcomes. The PEAK team’s rate of preemptive transplantation is 12.5%, compared with a New York rate of 2.5%. Sixty-eight percent of PEAK patients began dialysis as outpatients compared with only 27% nationally (2), and 57% began with venous access in place compared with only 20% nationally (3). The PEAK team is well on the way to reaching its 25% home dialysis goal and currently achieves 20% home modality adoption, a significant lift over the New York City average of only 3.6%.
Although AI and machine learning are at the forefront of research and transformation in medicine, healthcare is a uniquely human effort, and the knowledge and actions of the multidisciplinary team are pivotal to ensuring better patient outcomes. Purpose-built AI tools can be a powerful partner in this effort: amplifying our human intelligence, precisely directing care, and deepening our understanding of patients.