AI fairness is context-sensitive: different domains and applications require tailored approaches. Majumder et al.'s (2023) research highlights that many of the numerous available fairness metrics are highly correlated, measuring similar aspects of fairness. For example, they grouped 26 classification fairness metrics into seven clusters based on the similarity of their scores. This clustering simplifies fairness evaluation by identifying a representative metric for each cluster.
Key takeaway: Instead of evaluating every available metric, focus on representative metrics that align with the application’s fairness requirements.
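The clustering idea can be illustrated with a short, self-contained sketch. The metric names, toy scores, and the 0.5 distance threshold below are illustrative assumptions, not Majumder et al.'s actual data or procedure; the point is simply that metrics whose scores move together end up in the same cluster.

```python
# A minimal sketch of the clustering idea: given per-model scores for several
# fairness metrics, group metrics whose scores are highly correlated and keep
# one representative per group. Metric names and data are illustrative only.
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage
from scipy.spatial.distance import squareform

rng = np.random.default_rng(0)
# Rows: evaluated models; columns: fairness metric scores (toy data).
base = rng.normal(size=(50, 1))
scores = np.hstack([
    base + 0.05 * rng.normal(size=(50, 1)),   # metric A
    base + 0.05 * rng.normal(size=(50, 1)),   # metric B (correlates with A)
    rng.normal(size=(50, 1)),                 # metric C (independent)
])
names = ["stat_parity", "disparate_impact", "equal_opportunity"]

# Distance = 1 - |correlation|, so strongly correlated metrics sit close together.
corr = np.corrcoef(scores, rowvar=False)
dist = squareform(1.0 - np.abs(corr), checks=False)
clusters = fcluster(linkage(dist, method="average"), t=0.5, criterion="distance")

for name, cluster_id in zip(names, clusters):
    print(f"{name}: cluster {cluster_id}")
```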
A Framework for Fairness Metrics Selection
Majumder et al.'s (2023) framework for metric selection involves:
- Identifying Desired Fairness Types: Define the specific fairness goals for the AI system, such as demographic parity, equal opportunity, or individual fairness.
- Leveraging Metric Clusters: Use clustered groups of metrics to identify those that align with the fairness objectives.
- Simplifying Evaluation: Select and evaluate one representative metric from each relevant cluster to streamline the process.
This pragmatic approach reduces redundancy and keeps fairness assessment efficient and focused; two commonly used representative metrics are sketched below.
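As an illustration of evaluating one representative metric per fairness goal, the following sketch computes demographic parity difference and equal opportunity difference from binary predictions and a binary protected attribute. The helper functions and toy arrays are assumptions for demonstration, not part of the cited framework.

```python
# Two representative group-fairness metrics, computed with plain NumPy.
import numpy as np

def demographic_parity_diff(y_pred, group):
    """Difference in positive-prediction rates between the two groups."""
    return abs(y_pred[group == 1].mean() - y_pred[group == 0].mean())

def equal_opportunity_diff(y_true, y_pred, group):
    """Difference in true-positive rates between the two groups."""
    tpr = lambda g: y_pred[(group == g) & (y_true == 1)].mean()
    return abs(tpr(1) - tpr(0))

# Toy labels, predictions, and protected attribute.
y_true = np.array([1, 0, 1, 1, 0, 1, 0, 1])
y_pred = np.array([1, 0, 1, 0, 0, 1, 1, 1])
group  = np.array([0, 0, 0, 0, 1, 1, 1, 1])

print(demographic_parity_diff(y_pred, group))            # 0.25
print(equal_opportunity_diff(y_true, y_pred, group))     # ~0.33
```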
Supplementary Metrics for Holistic Evaluation
In addition to fairness, other dimensions such as utility, accountability, resilience, and privacy should be considered to ensure comprehensive AI evaluation (Wang et al., 2024).
1. Model Utility
Traditional machine learning metrics such as accuracy, precision, recall, and F1 score measure the predictive performance of the AI system. While these metrics do not directly address fairness, they provide a baseline for the system's overall effectiveness.
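These metrics are readily available in scikit-learn; a minimal example on toy labels follows.

```python
# Standard utility metrics as a baseline, using scikit-learn's metric functions.
from sklearn.metrics import accuracy_score, f1_score, precision_score, recall_score

y_true = [1, 0, 1, 1, 0, 1, 0, 1]
y_pred = [1, 0, 1, 0, 0, 1, 1, 1]

print("accuracy :", accuracy_score(y_true, y_pred))
print("precision:", precision_score(y_true, y_pred))
print("recall   :", recall_score(y_true, y_pred))
print("f1       :", f1_score(y_true, y_pred))
```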
2. Accountability
Accountability ensures that the AI system complies with regulations and aligns with stakeholders' expectations. Explainable AI (XAI) methods contribute to accountability by making decision-making processes transparent. Relevant metrics include:
- Correctness: Measures the execution time of XAI methods relative to the AI model’s execution time.
- Feature Importance: Evaluates the significance of each training data feature in the AI's decisions (see the sketch after this list).
- Stability, Compacity, and Consistency: Assessed using tools like the Shapash library to evaluate the robustness and clarity of XAI explanations.
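A feature-importance score can be obtained in many ways; the sketch below uses scikit-learn's permutation importance on a synthetic dataset as a generic, hedged substitute, not Wang et al.'s exact procedure or the Shapash tooling mentioned above.

```python
# A minimal feature-importance sketch using scikit-learn's permutation
# importance; the dataset and model are synthetic stand-ins.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.inspection import permutation_importance
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=500, n_features=5, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

model = RandomForestClassifier(random_state=0).fit(X_train, y_train)

# Shuffle each feature in turn and measure the drop in test accuracy:
# larger drops indicate features the model relies on more heavily.
result = permutation_importance(model, X_test, y_test, n_repeats=10, random_state=0)
for i, score in enumerate(result.importances_mean):
    print(f"feature_{i}: {score:.3f}")
```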
3. Resilience
Resilience metrics assess the AI system’s robustness against adversarial attacks:
- Impact: Proportion of successful adversarial examples.
- Complexity: Effort required to generate a successful attack.
- Detectability: Ease of identifying adversarial modifications.
- Capability Required: Privileges needed by an attacker to perform an attack.
These metrics help quantify the system's vulnerability and resistance to malicious actions; the impact metric, for example, can be computed directly from model predictions, as sketched below.
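A minimal sketch of the impact metric follows, assuming clean and adversarial predictions already produced by your own attack pipeline; only the bookkeeping is shown, and the arrays are illustrative.

```python
# "Impact" as the fraction of correctly classified inputs whose adversarial
# counterparts are misclassified (i.e., the attack success rate).
import numpy as np

def attack_impact(y_true, pred_clean, pred_adv):
    """Proportion of correctly classified inputs flipped by the attack."""
    correct = pred_clean == y_true
    flipped = correct & (pred_adv != y_true)
    return flipped.sum() / correct.sum()

y_true     = np.array([1, 0, 1, 1, 0])
pred_clean = np.array([1, 0, 1, 0, 0])  # predictions on clean inputs
pred_adv   = np.array([0, 0, 1, 0, 1])  # predictions on perturbed inputs
print(attack_impact(y_true, pred_clean, pred_adv))  # 0.5
```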
4. Privacy
Privacy-preserving metrics ensure that user data remains protected:
- Differential Privacy (ε): Bounds how much any one individual's data can influence the system's output, and hence how difficult it is for an attacker to infer private information; smaller ε means stronger privacy (see the sketch after this list).
- User Diversity: Assesses how diverse and anonymous user data is, reducing the risk of individual identification.
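The ε parameter is easiest to grasp through the classic Laplace mechanism: noise with scale sensitivity/ε is added to a query result, so smaller ε means more noise and stronger privacy. The query, sensitivity, and ε values below are illustrative assumptions.

```python
# A minimal sketch of ε-differential privacy via the Laplace mechanism.
import numpy as np

def laplace_mechanism(true_value, sensitivity, epsilon, rng):
    """Return an ε-differentially-private answer to a numeric query."""
    scale = sensitivity / epsilon
    return true_value + rng.laplace(loc=0.0, scale=scale)

rng = np.random.default_rng(0)
true_count = 42      # e.g., number of users matching a query
sensitivity = 1      # one user can change the count by at most 1
for epsilon in (0.1, 1.0, 10.0):
    print(epsilon, laplace_mechanism(true_count, sensitivity, epsilon, rng))
```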
Related Resources
Evaluation Bias in Machine Learning
Sampling Bias in Machine Learning
Measurement Bias in Machine Learning
Social Bias in Machine Learning
Representation Bias in Machine Learning
Summary
The integration of fairness metrics with supplementary dimensions such as utility, accountability, resilience, and privacy ensures a well-rounded evaluation of AI systems. By focusing on representative fairness metrics, as suggested by Majumder et al. (2023), and incorporating measures from these additional dimensions, developers can build AI systems that are not only fair but also robust, accountable, and privacy-preserving.
Sources
Majumder, S., Chakraborty, J., Bai, G.R., Stolee, K.T. and Menzies, T., 2023. Fair enough: Searching for sufficient measures of fairness. ACM Transactions on Software Engineering and Methodology, 32(6), pp.1-22.
Wang, S., Sandeepa, C., Senevirathna, T., Siniarski, B., Nguyen, M.D., Marchal, S. and Liyanage, M., 2024. Towards accountable and resilient AI-assisted networks: Case studies and future challenges. In 2024 Joint European Conference on Networks and Communications & 6G Summit (EuCNC/6G Summit), pp. 818-823. IEEE.