
Estimated glomerular filtration rate (eGFR) calculation

The kidney.epi R package includes functions for calculating eGFR with different equations.

Data frames

  • ckd.data: contains synthetic data for 1000 adults and 1000 children (see the description in the package documentation).

library(kidney.epi)
#> The kidney.epi package is made with care by the research consultancy Scientific-Tools.Org.
#> Contact us at https://Scientific-Tools.Org or via 'maintainer("kidney.epi")' for data analysis or software development.
head(ckd.data)
#>       cr  cys  age    sex ethnicity height category
#> 1 108.10 1.29 73.8   Male Caucasian     NA   adults
#> 2 101.12 1.26 66.8   Male Caucasian     NA   adults
#> 3 139.99 1.63 75.9   Male Caucasian     NA   adults
#> 4 145.26 1.75 68.1 Female Caucasian     NA   adults
#> 5 148.21 1.74 51.7 Female Caucasian     NA   adults
#> 6 179.43 2.14 41.8   Male Caucasian     NA   adults

Functions to calculate eGFR by different equations

kidney.epi contains a set of functions to calculate eGFR by different equations either for a single patient or for a dataset.

The following eGFR equations are supported:

If you use these functions from the kidney.epi package for data analysis and manuscript preparation, please cite the package: “Bikbov B. kidney.epi: Kidney-Related Functions for Clinical and Epidemiological Research. Scientific-Tools.Org, https://Scientific-Tools.Org. doi:10.32614/CRAN.package.kidney.epi”.

For data analysis or software development, contact us at Scientific-Tools.Org or via maintainer("kidney.epi"). You can also connect with the author on LinkedIn.

Examples

This vignette demonstrates eGFR calculation with the CKD-EPI 2009 equation; the functions for the race-free CKD-EPI 2021 and other equations work in the same way.

Example for a single patient

To calculate eGFR for a single patient, use the following syntax:

# call the egfr.ckdepi.cr.2009 function and set the parameter values directly
egfr.ckdepi.cr.2009(
  creatinine = 1.4,  
  age = 60,  
  sex = "Male", 
  ethnicity = "White", 
  creatinine_units = "mg/dl", 
  label_afroamerican = c("Afroamerican"), 
  label_sex_male = c("Male"), 
  label_sex_female = c("Female")
)
#> [1] 54.22

# Defining the labels for sex and race is optional if your data use the same labels as the function defaults. The following call gives the same result:
egfr.ckdepi.cr.2009(
  creatinine = 1.4,  
  age = 60,  
  sex = "Male", 
  ethnicity = "White", 
  creatinine_units = "mg/dl"
)
#> [1] 54.22

# If creatinine is measured in micromol/l, you can also omit 'creatinine_units', since its default value is "micromol/l":
egfr.ckdepi.cr.2009(
  creatinine = 103, # creatinine is in micromol/l
  age = 60,  
  sex = "Male", 
  ethnicity = "White"
)
#> [1] 67.7
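
To make the numbers above easier to check by hand, the following base-R sketch writes out the published CKD-EPI 2009 creatinine equation. It is an illustration only, not the package's source code: the helper name ckdepi_2009_sketch and its arguments are hypothetical, it assumes the commonly used conversion factor of 88.4 micromol/l per mg/dl, and it omits the input checks that the package performs.

```r
# Hypothetical helper, NOT part of kidney.epi: a hand-written version of the
# published CKD-EPI 2009 creatinine equation, for checking results by hand.
ckdepi_2009_sketch <- function(creatinine, age, sex, black = FALSE,
                               units = "mg/dl") {
  # Assumption: 1 mg/dl of creatinine corresponds to 88.4 micromol/l.
  scr <- if (units == "micromol/l") creatinine / 88.4 else creatinine
  female <- sex == "Female"
  kappa <- ifelse(female, 0.7, 0.9)        # sex-specific creatinine threshold
  alpha <- ifelse(female, -0.329, -0.411)  # sex-specific exponent below the threshold
  141 *
    pmin(scr / kappa, 1)^alpha *
    pmax(scr / kappa, 1)^(-1.209) *
    0.993^age *
    ifelse(female, 1.018, 1) *
    ifelse(black, 1.159, 1)
}

# Same inputs as the single-patient examples: once in mg/dl, once in micromol/l.
egfr_mgdl <- ckdepi_2009_sketch(1.4, age = 60, sex = "Male")
egfr_umol <- ckdepi_2009_sketch(103, age = 60, sex = "Male", units = "micromol/l")
print(round(c(egfr_mgdl, egfr_umol), 2))
```

Under these assumptions the two values should agree, up to rounding, with the package output shown above (about 54.22 and 67.7).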

Example for a cohort of patients

To calculate eGFR for a cohort of patients in a dataset, use the following syntax:

# as an example, copy the package's built-in data frame ckd.data into your own data frame
mydata <- ckd.data

# calculate eGFR by CKD-EPI equation
mydata$ckdepi <- egfr.ckdepi.cr.2009(
  creatinine = mydata$cr, age = mydata$age,
  sex = mydata$sex, ethnicity = mydata$ethnicity,
  creatinine_units = "micromol/L",
  # if necessary, set the labels to those used in your data frame
  label_afroamerican = c("Black"),
  label_sex_male = c("Male"), label_sex_female = c("Female")
) 
#> Warning in service.check_plausibility.age(age, max_age): There is 1 patient with negative values for age.  This value was substituted to NA.
#> Warning in service.check_plausibility.age(age, max_age): There is 1 patient with age >100 years.  This value was substituted to NA.
#> There are 840 patients with age <18 years.  These values were substituted to NA.

# show descriptive statistics for the calculated values
# note that the synthetic data set ckd.data contains input parameters for both adults and children; since the CKD-EPI equation was developed and validated for adults only, the resulting eGFR values for children are NA. Use children-specific eGFR equations where necessary.
summary(mydata$ckdepi)
#>    Min. 1st Qu.  Median    Mean 3rd Qu.    Max.    NA's 
#>   17.78   35.44   45.19   51.80   59.85  179.11     841
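
Because the synthetic data mix adults and children, it can help to check where the missing values come from. The sketch below uses a small hypothetical data frame (the values are made up for illustration) with a category column like the one in ckd.data and a ckdepi column like the one computed above; the same calls could be applied to mydata.

```r
# Hypothetical toy data standing in for mydata; values are illustrative only.
toy <- data.frame(
  category = c("adults", "adults", "adults", "children", "children"),
  ckdepi   = c(54.2, 67.7, NA, NA, NA)
)

# Cross-tabulate the category against whether eGFR is missing.
na_by_category <- table(toy$category, is.na(toy$ckdepi),
                        dnn = c("category", "eGFR missing"))
print(na_by_category)

# Summarise eGFR only for rows where a value was actually calculated.
egfr_summary <- summary(toy$ckdepi[!is.na(toy$ckdepi)])
print(egfr_summary)
```

A table like this makes it easy to see whether the NA values are confined to children, as expected for an adult-only equation, or whether some adult rows were also dropped by the plausibility checks.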

Advantages of the kidney.epi package functions

There are several advantages of the kidney.epi package functions for calculating eGFR values:
  • Data workflow is reproducible and based on verified algorithms:
    The kidney.epi package offers a reproducible, open-source workflow built on verified methods. Research groups do not need to rewrite the same computational code from scratch and can focus on what truly matters: their data and insights.
  • Control for input values:
    If input values are implausible (negative age or creatinine, age beyond logical limits, etc.) or not suitable for a given eGFR equation (one applicable only to children or only to adults), they are substituted with NA and the function reports how many values were affected, so the output contains only robust results.
  • Possibility to use different measurement units for creatinine:
    There is no need to convert creatinine values in your data set. Just state in the 'creatinine_units' parameter whether your data contain values in micromol/L, mmol/L or mg/dL, and the function handles the rest.
  • Flexible label handling for enhanced usability:
    The functions let you define custom labels that match those used in your data frame, so the data are interpreted consistently without modifying the original dataset. You don't need to recode labels in your data frame; just define which of your labels correspond to males, females, and other parameters.
    Consider the following examples:
    • If the data frame uses only the label "Male" for males, you can skip the definition because this is already the function's default, or define label_sex_male = "Male" for clarity.
    • Labels are case-sensitive, so pay attention to differences such as "Male" versus "male".
    • If your data frame uses different labeling conventions, adjust the labels in the function parameters to match your data. For example, if your data frame contains the labels "F" for females and "M" for males, set label_sex_male = "M", label_sex_female = "F".
    • The functions also support multiple labels for non-standard or mixed data. If your data have not been fully standardized and the same category appears under different labels, you can define multiple values as valid labels for that category.
      For example, if male sex is represented by both "male" and "hombre" and female sex by both "female" and "mujer", you can define: label_sex_male = c("male", "hombre"), label_sex_female = c("female", "mujer").
      If male sex is represented by both "male" and 1, you can define: label_sex_male = c("male", 1).
      If male sex is represented by both "male" and "Male" (labels are case-sensitive), you can define: label_sex_male = c("male", "Male").
  • As a result, there is no need to modify the original dataset; just adjust the function parameters instead. This saves time, reduces data preprocessing effort, and improves code readability.
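
The label parameters can be pictured as simple membership tests. The base-R sketch below illustrates that general idea and is an assumption, not the package's internal code; the helper classify_sex is hypothetical. It shows why several labels can stand for one category, why matching is case-sensitive, and why a numeric label such as 1 can be listed next to character labels (R's c() coerces it to the string "1").

```r
# Hypothetical helper, NOT the kidney.epi implementation: map raw sex labels
# to a standard coding using user-supplied label vectors.
classify_sex <- function(sex, label_sex_male, label_sex_female) {
  out <- rep(NA_character_, length(sex))
  out[sex %in% label_sex_male] <- "male"
  out[sex %in% label_sex_female] <- "female"
  out  # labels matching neither list stay NA
}

raw <- c("male", "hombre", "mujer", "Female", "1", "F")
coded <- classify_sex(
  raw,
  label_sex_male = c("male", "hombre", 1),      # c() turns 1 into "1"
  label_sex_female = c("female", "mujer", "F")
)
print(data.frame(raw, coded))
```

In this sketch "Female" stays NA because only the lowercase "female" was listed, which mirrors the case-sensitivity point above.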

References

References for each eGFR equation are listed in the package documentation.