Do we need to use race in assessing renal function?

Stimulated by recent events, physicians have been actively debating the role of race in medicine. We took a look at whether using race in the eGFR equation is useful in predicting renal failure events.

July 14, 2020

Stimulated by recent events, physicians have been actively debating the role of race in medicine.

Race features prominently in nephrology care because it is used to calculate the estimated glomerular filtration rate (eGFR), the gold-standard algorithm for evaluating a patient’s renal function. Both the CKD-EPI (2009) and MDRD (2008) equations include a coefficient that is applied if the individual is Black. eGFR is also the deciding factor in chronic kidney disease staging.
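For readers who want to see where race enters the calculation, here is a minimal Python sketch of the 2009 CKD-EPI creatinine equation as published by Levey et al. The function name and argument names are our own choices for illustration, and the sketch assumes serum creatinine is reported in mg/dL.

```python
def ckd_epi_2009(scr_mg_dl: float, age: float, female: bool, black: bool) -> float:
    """Estimate GFR (mL/min/1.73 m^2) with the 2009 CKD-EPI creatinine equation.

    Assumes serum creatinine in mg/dL and age in years.
    """
    kappa = 0.7 if female else 0.9          # sex-specific creatinine threshold
    alpha = -0.329 if female else -0.411    # sex-specific exponent below the threshold
    ratio = scr_mg_dl / kappa

    egfr = 141.0 * min(ratio, 1.0) ** alpha * max(ratio, 1.0) ** -1.209 * 0.993 ** age
    if female:
        egfr *= 1.018
    if black:
        egfr *= 1.159                       # the race coefficient under debate
    return egfr


# Same creatinine, age and sex; only the race flag differs.
with_race = ckd_epi_2009(1.5, 69, female=True, black=True)
without_race = ckd_epi_2009(1.5, 69, female=True, black=False)
print(round(with_race / without_race, 3))   # 1.159
```

The race flag acts as a flat multiplier on the final estimate, so it shifts every Black patient's eGFR upward by the same proportion regardless of creatinine, age or sex.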

Because the race modifier raises the estimated eGFR for a given serum creatinine, it effectively sets a higher bar for diagnosing kidney disease in African Americans than in the population as a whole. This has led to the view that African Americans may be arbitrarily excluded from analyses, making them less likely to receive the early treatment for kidney disease that they need.
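To make the "higher bar" concrete, the sketch below applies the CKD-EPI 2009 race coefficient (1.159) to a hypothetical estimate and looks up the resulting GFR category. The category cut-offs are the standard KDIGO ones; the patient value of 55 is invented purely for illustration.

```python
RACE_COEFFICIENT = 1.159  # multiplier applied to Black patients in CKD-EPI 2009


def gfr_category(egfr: float) -> str:
    """Map an eGFR in mL/min/1.73 m^2 to a KDIGO GFR category."""
    if egfr >= 90:
        return "G1"
    if egfr >= 60:
        return "G2"
    if egfr >= 45:
        return "G3a"
    if egfr >= 30:
        return "G3b"
    if egfr >= 15:
        return "G4"
    return "G5"


# Hypothetical patient: the estimate before any race adjustment is 55.
unadjusted = 55.0
adjusted = unadjusted * RACE_COEFFICIENT

print(gfr_category(unadjusted))  # G3a: below the stage 3 threshold of 60
print(gfr_category(adjusted))    # G2: above the threshold after adjustment
```

In this example the same laboratory result places the patient in stage 3 without the coefficient and outside it with the coefficient.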

We at pulseData decided to build a simple model using eGFR (which is calculated from serum creatinine, age, gender, and race) and compare it with a model using only serum creatinine, age, and gender, to see which would better assess the risk of renal failure.
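One way to write the two models down, assuming the logistic regression form described in the methodology, is shown here. The coefficient symbols and the 0/1 coding of gender are our notation for illustration, and any regularization term is omitted.

```latex
% p is the probability of renal failure within one year.
\begin{align}
\text{eGFR model:}\quad
  \ln\frac{p}{1-p} &= \beta_0 + \beta_1\,\mathrm{eGFR} \\
\text{Race-free model:}\quad
  \ln\frac{p}{1-p} &= \gamma_0 + \gamma_1\,\mathrm{SCr}
                     + \gamma_2\,\mathrm{age} + \gamma_3\,\mathrm{female}
\end{align}
```

Race reaches the first model only through the eGFR value itself, while the second model never sees it.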

Our methodology 

We used a dataset of 477,633 patients to identify individuals with CKD stage 3 or above and no prior history of renal dialysis or kidney transplant, and assessed their risk of renal failure (maintenance hemodialysis or renal transplant) within one year. Transplant events were defined using ICD, HCPCS and DRG codes. Dialysis events were defined using ICD, HCPCS, Medicare place of service and revenue codes; acute dialysis events were excluded. We partitioned each patient record into non-overlapping 12-month periods for analysis. The last serum creatinine value of each period was used for prediction. We built two simple logistic regression models: Model 1 used eGFR (calculated using the CKD-EPI equation) as its sole feature, and Model 2 used serum creatinine, age and gender (and not race) as features.
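The partitioning step can be sketched in a few lines of Python. This is a simplified illustration, not our production pipeline: it assumes periods are anchored at a per-patient start date and approximates 12 months as 365 days, and the function name and sample values are hypothetical.

```python
from datetime import date


def last_creatinine_per_period(measurements, start, period_days=365):
    """Return {period_index: (date, value)} with the last reading in each period.

    measurements: iterable of (date, serum_creatinine_mg_dl) tuples.
    start: first day of period 0 (an assumed per-patient anchor date).
    """
    last = {}
    for when, value in sorted(measurements):
        if when < start:
            continue  # ignore readings taken before the first period
        index = (when - start).days // period_days
        last[index] = (when, value)  # later readings overwrite earlier ones
    return last


# Hypothetical lab history for one patient.
labs = [
    (date(2018, 1, 10), 1.4),
    (date(2018, 11, 2), 1.6),
    (date(2019, 3, 5), 1.9),
]
print(last_creatinine_per_period(labs, start=date(2018, 1, 1)))
```

Each period then contributes one row to the modeling table: the features come from the last reading in that period.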

Figure 1. Cohort Selection

Our results

There were 2,296 patients in our final analysis (see Figure 1). The cohort had a median age of 69, was predominantly female (60%), and was 43.1% African American. The average eGFR at cohort entry was 49.1 mL/min/1.73 m².

The regularized logistic regression model using serum creatinine, age and gender (Model 2) showed similar predictive power (area under the receiver operating characteristic curve [AUROC]: 0.934; area under the precision-recall curve [AUPRC]: 0.670) to the one using eGFR (Model 1; AUROC: 0.904; AUPRC: 0.662). In the top decile of risk, Model 1 identified 30 outcomes, whereas Model 2 identified 33.
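For readers less familiar with these metrics, here is a small pure-Python sketch of how AUROC and the top-decile outcome count can be computed from predicted risks. The labels and scores are toy values made up for illustration, not our study data, and the pairwise AUROC calculation is chosen for readability rather than speed.

```python
def auroc(labels, scores):
    """Probability that a random positive outscores a random negative (ties count half)."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in positives for n in negatives)
    return wins / (len(positives) * len(negatives))


def outcomes_in_top_decile(labels, scores):
    """Count observed outcomes among the 10% of records with the highest scores."""
    ranked = sorted(zip(scores, labels), reverse=True)
    k = max(1, len(ranked) // 10)
    return sum(y for _, y in ranked[:k])


# Toy data: 1 = renal failure within one year, 0 = no event.
labels = [0, 0, 0, 0, 0, 0, 0, 1, 0, 1]
scores = [0.05, 0.10, 0.12, 0.20, 0.22, 0.30, 0.35, 0.40, 0.55, 0.90]

print(auroc(labels, scores))                   # 0.9375
print(outcomes_in_top_decile(labels, scores))  # 1
```

The top-decile count answers a practical question: if a care team can only reach the highest-risk 10% of patients, how many true events would fall inside that group.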

Comparison of area under curve of the two models

Our conclusion

The regularized logistic regression model without race performed slightly better than the traditional eGFR-only model, which uses race. Our analysis suggests that race may not need to be included as a variable when risk stratifying for renal failure. We encourage healthcare organizations to consider relying more heavily on the underlying variables when conducting analyses to ensure the best treatment for people with chronic kidney disease.

Please get in touch with any questions or requests for more information by emailing us at


  • Levey AS, Stevens LA, Schmid CH, et al. A new equation to estimate glomerular filtration rate [published correction appears in Ann Intern Med. 2011 Sep 20;155(6):408]. Ann Intern Med. 2009;150(9):604-612.
  • Vyas DA, Eisenstein LG, Jones DS. Hidden in Plain Sight—Reconsidering the Use of Race Correction in Clinical Algorithms, NEJMms 2020.
  • Fontanarosa PB, Bauchner H. Race, ancestry, and medical research. JAMA. 2018;320(15):1539-1540.
  • Eneanya ND, Yang W, Reese PP. Reconsidering the consequences of using race to estimate kidney function. JAMA. 2019;322(2):113-114.