Fairness in healthcare machine learning is highly context-dependent. Different use cases, populations, and risks require tailored approaches to fairness, making it essential to align fairness metrics with the specific goals and impacts of each healthcare application. The most commonly used fairness notions in ML include group fairness (such as equalized odds, demographic parity, and predictive rate parity), individual fairness, and counterfactual fairness (Choraś et al., 2020).
What is Predictive Rate Parity?
Here is a definition of Predictive Rate Parity from Zhou et al. (2022):
Predictive rate parity means that the predictive value should be the same for both protected and unprotected groups.
In simpler terms, the true outcome Y (e.g., actual recidivism) should be independent of the protected attribute A (e.g., gender), given the predicted outcome Ŷ; equivalently, P(Y = 1 | Ŷ = 1, A = a) is the same for every group a.
For example, in a recidivism prediction scenario, this metric requires that, among individuals predicted to reoffend, the actual recidivism rate is the same for men and women.
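To make this concrete, here is a minimal sketch of a predictive rate parity check on a toy dataset; the column names (y_true, y_pred, group) and the numbers are hypothetical and chosen only to illustrate the per-group PPV comparison.

```python
# Minimal sketch of a predictive rate parity check on a toy dataset.
# The column names (y_true, y_pred, group) and values are hypothetical.
import pandas as pd

def ppv_by_group(df: pd.DataFrame, group_col: str = "group") -> pd.Series:
    """Positive predictive value P(Y = 1 | Y_hat = 1, A = a) for each group."""
    predicted_positive = df[df["y_pred"] == 1]
    return predicted_positive.groupby(group_col)["y_true"].mean()

# Toy recidivism-style data: among those predicted to reoffend,
# 3 of 4 men and 1 of 2 women actually reoffended.
data = pd.DataFrame({
    "group":  ["men"] * 6 + ["women"] * 4,
    "y_pred": [1, 1, 1, 1, 0, 0, 1, 1, 0, 0],
    "y_true": [1, 1, 1, 0, 0, 1, 1, 0, 0, 0],
})

ppvs = ppv_by_group(data)
print(ppvs)                                  # men: 0.75, women: 0.50
print("PPV gap:", ppvs.max() - ppvs.min())   # 0.25, so parity does not hold here
```

A gap of zero (or below whatever tolerance is acceptable for the application) would indicate that predictive rate parity holds.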
Examples of Predictive Rate Parity
Here are some examples of how Predictive Rate Parity can be applied in healthcare (Ahmad et al., 2020).
For predicting emergency department utilization, predictive rate parity would ensure that, among patients predicted to require emergency care, the proportion who genuinely need it is the same across all demographic groups.
For predicting mortality, predictive rate parity would ensure that, among patients predicted to have a high risk of dying, the actual mortality rate is the same across all groups.
The goal of predictive rate parity in these scenarios is to maintain trust and equity by ensuring that predictions are equally reliable for everyone, regardless of their demographic characteristics.
Why Predictive Rate Parity?
Predictive Rate Parity requires that the positive predictive value (PPV), the proportion of positive predictions that turn out to be correct, is the same across protected and unprotected groups. For example, in a university admissions scenario, predictive rate parity would require that, among students predicted to succeed, the actual success rate is the same for all demographic groups, such as gender or ethnicity.
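As a small worked example of this PPV comparison, the admissions counts below are made up purely for illustration:

```python
# Hypothetical admissions counts: TP = students predicted to succeed who did,
# FP = students predicted to succeed who did not.
counts = {
    "group_a": {"tp": 60, "fp": 20},
    "group_b": {"tp": 30, "fp": 10},
}

for group, c in counts.items():
    ppv = c["tp"] / (c["tp"] + c["fp"])   # PPV = TP / (TP + FP)
    print(f"{group}: PPV = {ppv:.2f}")    # 0.75 for both groups, so parity holds
```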
Predictive rate parity is particularly important in scenarios where:
- Trust in Predictions is critical: Patients and providers need to trust that predictions are reliable for all groups.
- Resource Allocation is being decided: For example, determining who gets access to limited emergency care resources.
However, it may not always be the best metric in high-risk scenarios (e.g., mortality predictions), where minimizing false negatives might take precedence over achieving parity.
Challenges of Predictive Rate Parity
Different applications, such as predicting hospital visits or patient mortality, require different fairness approaches, which can be challenging to reconcile. Here are some of the challenges:
- Conflict with Other Fairness Metrics:
Predictive Rate Parity often conflicts with Equalized Odds and Demographic Parity due to what is known as the “impossibility theorem of fairness.” This theorem states that it is mathematically impossible to satisfy all three fairness criteria simultaneously unless the base rates (i.e., the proportion of positive outcomes in each group) are equal (Chouldechova, 2017; Kleinberg et al., 2016).
- For example, in recidivism predictions, if the underlying reoffense rates differ between groups, ensuring predictive rate parity may result in unequal false positive or false negative rates, violating equalized odds (see the numerical sketch after this list).
- Limitations of Relabeling Predictions:
As highlighted by Corbett-Davies and Goel (2018), achieving fairness metrics like predictive rate parity through model adjustments (e.g., relabeling outcomes) can inadvertently harm protected groups. For example:
- Relabeling predictions to meet fairness goals might lead to granting loans to individuals unlikely to repay, which could harm the institution and undermine trust in the model.
- Adjusting for fairness can result in models that are less accurate for certain groups, undermining equitable outcomes in practice.
- Dependence on Calibration:
Predictive Rate Parity is tied to calibration (ensuring predicted probabilities align with actual outcomes). However, calibration alone does not guarantee fairness across all metrics. For example, even if predictions are calibrated, differences in error rates (false positives or negatives) may persist across groups, leading to inequitable outcomes.
- Differences in Base Rates:
In healthcare, different demographic groups often have varying baseline risks for specific outcomes (e.g., chronic disease prevalence, access to care). These differences make it difficult to achieve predictive rate parity without adjusting the model in ways that might affect overall accuracy.
- High-Stakes Decisions:
In healthcare, the cost of errors (false positives or negatives) can be life-threatening. For instance:
- A false positive for emergency care might result in unnecessary resource use.
- A false negative for mortality prediction could delay critical care for at-risk patients.
Prioritizing predictive rate parity may inadvertently increase these types of errors for certain groups.
- Practical Challenges in Implementation:
Consumer lending studies (Hardt et al., 2016) demonstrate that achieving fairness metrics like equalized odds or predictive rate parity is extremely challenging in real-world scenarios. For example:
- In credit scoring (e.g., the FICO model), achieving predictive rate parity might require separate, group-specific thresholds or randomization, which complicates implementation and raises ethical questions (the sketch after this list illustrates group-specific thresholds).
- Randomizing decisions between thresholds to balance fairness can reduce interpretability and erode trust in the system.
- Data Quality and Bias:
Ensuring predictive rate parity requires high-quality, unbiased data. Historical inequities in healthcare access and outcomes often lead to biased datasets, making it challenging to achieve parity without extensive preprocessing or adjustments.
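To make the trade-offs above concrete, the sketch below uses purely synthetic data; the risk-score model, base rates, and thresholds are all assumptions for illustration and not the FICO analysis from Hardt et al. (2016). It tunes a group-specific threshold so that the two groups have approximately equal PPV, and then shows that their false positive and false negative rates end up far apart, which is exactly the tension the impossibility results describe when base rates differ.

```python
# Synthetic illustration: with different base rates, matching PPV across
# groups generally forces unequal false positive / false negative rates.
import numpy as np

rng = np.random.default_rng(0)

def simulate_group(n, base_rate):
    """Synthetic labels and risk scores; positives score higher on average."""
    y = rng.binomial(1, base_rate, size=n)
    scores = rng.normal(loc=y * 1.5, scale=1.0)  # hypothetical risk score
    return y, scores

def rates_at_threshold(y, scores, threshold):
    """PPV, FPR and FNR of the rule 'predict 1 if score >= threshold'."""
    y_pred = (scores >= threshold).astype(int)
    tp = np.sum((y_pred == 1) & (y == 1))
    fp = np.sum((y_pred == 1) & (y == 0))
    fn = np.sum((y_pred == 0) & (y == 1))
    tn = np.sum((y_pred == 0) & (y == 0))
    ppv = tp / (tp + fp) if (tp + fp) else np.nan
    fpr = fp / (fp + tn) if (fp + tn) else np.nan
    fnr = fn / (fn + tp) if (fn + tp) else np.nan
    return ppv, fpr, fnr

# Two groups with different base rates (40% vs 20% positive outcomes).
y_a, s_a = simulate_group(20_000, base_rate=0.40)
y_b, s_b = simulate_group(20_000, base_rate=0.20)

# Fix a threshold for group A, then search for a group-specific threshold
# for group B that matches group A's PPV as closely as possible.
ppv_a, fpr_a, fnr_a = rates_at_threshold(y_a, s_a, threshold=1.0)
candidates = np.linspace(-2, 4, 601)
ppv_b_grid = np.array([rates_at_threshold(y_b, s_b, t)[0] for t in candidates])
t_b = candidates[np.nanargmin(np.abs(ppv_b_grid - ppv_a))]
ppv_b, fpr_b, fnr_b = rates_at_threshold(y_b, s_b, t_b)

print(f"Group A: PPV={ppv_a:.2f}, FPR={fpr_a:.2f}, FNR={fnr_a:.2f}")
print(f"Group B: PPV={ppv_b:.2f}, FPR={fpr_b:.2f}, FNR={fnr_b:.2f}")
# PPV is (approximately) equal by construction, but FPR and FNR diverge,
# so predictive rate parity and equalized odds cannot both be satisfied here.
```

Randomizing decisions between two thresholds, as in the lending discussion above, runs into the same arithmetic: any policy that pins down PPV for both groups will generally leave their error rates unequal whenever base rates differ.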
Summary
In certain contexts, such as university admissions, promoting group-level fairness can better align with broader societal goals like equity and diversity, rather than strictly adhering to individual fairness metrics. For instance, fostering a diverse student body may serve institutional objectives and benefit the entire community, even if it does not satisfy predictive rate parity or other fairness criteria. This underscores the critical trade-off between optimizing for fairness metrics and advancing overarching societal goals, emphasizing the need for fairness strategies that are context-sensitive and purpose-driven.
Next Steps
- If you’re interested in bespoke training or design solutions on AI fairness, feel free to reach out for a consultation.
- Check out the following resources and upcoming workshops to equip your teams with the tools and knowledge to implement fair AI systems.
Free Resources for Individual Fairness Design Considerations
Sampling Bias in Machine Learning
Social Bias in Machine Learning
Representation Bias in Machine Learning
Conditional Demographic Parity Guidance – £99
Empower your team to drive Responsible AI by fostering alignment with compliance needs and best practices.
Practical, easy-to-use guidance from problem definition to model monitoring
Checklists for every phase of the AI/ML pipeline
AI Fairness Mitigation Package – £999
The ultimate resource for organisations ready to tackle bias at scale starting from problem definition through to model monitoring to drive responsible AI practices.
Customised AI Fairness Mitigation Package – £2499
Sources
Ahmad, M.A., Patel, A., Eckert, C., Kumar, V. and Teredesai, A., 2020, August. Fairness in machine learning for healthcare. In Proceedings of the 26th ACM SIGKDD international conference on knowledge discovery & data mining (pp. 3529-3530).
Choraś, M., Pawlicki, M., Puchalski, D. and Kozik, R., 2020. Machine Learning–the results are not the only thing that matters! What about security, explainability and fairness?. In Computational Science–ICCS 2020: 20th International Conference, Amsterdam, The Netherlands, June 3–5, 2020, Proceedings, Part IV 20 (pp. 615-628). Springer International Publishing.
Chouldechova, A. (2017). Fair prediction with disparate impact: a study of bias in recidivism prediction instruments. https://arxiv.org/abs/1703.00056
Corbett-Davies, S. & Goel, S. (2018). The measure and mismeasure of fairness: a critical review of fair machine learning. https://arxiv.org/pdf/1808.00023.pdf
Hardt, M., Price, E. & Srebro, N. (2016). Equality of opportunity in supervised learning. 30th Conference on Neural Information Processing Systems (NIPS 2016), Barcelona, Spain.
Kleinberg, J., Mullainathan, S. & Raghavan, M. (2016). Inherent trade-offs in the fair determination of risk scores. https://arxiv.org/abs/1609.05807
Zhou, N., Zhang, Z., Nair, V.N., Singhal, H. and Chen, J., 2022. Bias, fairness and accountability with artificial intelligence and machine learning algorithms. International Statistical Review, 90(3), pp.468-480.
Zhao, H., Coston, A., Adel, T. and Gordon, G.J., 2019. Conditional learning of fair representations. arXiv preprint arXiv:1910.07162.