Understanding Measurement Bias
As artificial intelligence (AI) and machine learning (ML) increasingly integrate into society, concerns about the systematic inequalities these technologies may introduce or reinforce have grown significantly. Measurement bias, a key contributor to these disparities, arises when data collection or labelling processes fail to accurately capture the characteristics of different groups. According to Fahse et al. (2021), measurement bias arises in two key phases of the modelling process:
- BU-Phase (Business Understanding):
  - Bias occurs through subjective choices during model design, particularly when defining the target variable and selecting features.
  - Using imperfect proxies or protected attributes (e.g., race, gender) can lead to discrimination or inaccuracies.
  - Even if protected attributes are excluded, their correlation with non-protected attributes (the redlining effect) can still introduce bias; the sketch after this list shows one way to check for such correlations.
- DP-Phase (Data Preparation):
  - Bias can emerge during feature creation, derivation, or transformation, potentially omitting critical factors or introducing noise.
  - Inaccurate features or reliance on a limited number of inappropriate features may result in varying prediction accuracy across groups.
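To make the redlining effect concrete, here is a minimal sketch in Python. It uses entirely synthetic data and hypothetical column names (postcode_income, n_arrests); it simply measures how strongly each candidate feature correlates with a protected attribute that is not itself used as a feature.

```python
# Minimal sketch (synthetic data): flagging the "redlining effect" during data preparation.
# The protected attribute ("ethnicity") is NOT a model feature, but it is correlated
# with a non-protected feature, so excluding it alone does not remove its influence.
import numpy as np
import pandas as pd

rng = np.random.default_rng(42)
n = 5_000

ethnicity = rng.integers(0, 2, size=n)                                  # 0/1 group label
postcode_income = 30_000 + 10_000 * (1 - ethnicity) + rng.normal(0, 5_000, n)
n_arrests = rng.poisson(1 + 0.5 * ethnicity)                            # measurement disparity

features = pd.DataFrame({"postcode_income": postcode_income,
                         "n_arrests": n_arrests})

# Simple audit: correlation of each candidate feature with the protected attribute.
# A large absolute value warns that the feature acts as a proxy for the attribute.
for col in features.columns:
    r = np.corrcoef(features[col], ethnicity)[0, 1]
    print(f"{col:>16s}  correlation with protected attribute: {r:+.2f}")
```

A correlation screen like this is only a first signal; it should be followed up with domain knowledge and more formal fairness diagnostics appropriate to the application.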
Example of Measurement Bias in Machine Learning
Measurement bias can skew model predictions and outcomes, leading to harmful consequences (Tay et al., 2022). For instance, in healthcare, an algorithm designed to allocate resources underestimated the needs of Black patients compared to White patients, leading to inequitable healthcare access. Similarly, in hiring, discrepancies in how candidates are evaluated can perpetuate workplace inequalities. Addressing measurement bias is therefore critical to fostering fairness and equity in AI-driven decision-making. Measurement (or reporting) bias arises from the way we choose, collect, and measure specific features in a dataset.
In a crime prediction application, the feature “number of arrests” is used to predict the likelihood of future criminal activity. However, this feature can reflect biases in data collection rather than actual differences in behavior (Fahse et al., 2021). For example, consider African American and Caucasian defendants who commit the same number of drug sales and thus share a similar true risk. If arrest rates are recorded differently across ethnic groups—such as heavier policing in minority neighborhoods—this can lead to disparities in the data. African American defendants in these neighborhoods are more likely to have higher numbers of drug-related arrests. As a result, even though the true risk is similar, the machine learning application may assign a higher risk score to African American defendants compared to Caucasian defendants. This highlights how biased data can skew predictive outcomes and perpetuate inequalities.
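The small simulation below (Python with NumPy and scikit-learn, using purely illustrative numbers) shows this mechanism: both groups have the same underlying offence rate, but one group's offences are recorded as arrests twice as often, and a model trained only on arrest counts assigns that group higher average risk scores.

```python
# Illustrative simulation (synthetic data): identical true behaviour,
# unequal recording rates, and a model that only sees the biased proxy.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
n = 20_000
group = rng.integers(0, 2, size=n)            # 0 = group A, 1 = group B

true_offences = rng.poisson(2.0, size=n)      # identical true behaviour in both groups
p_record = np.where(group == 1, 0.6, 0.3)     # unequal recording/policing rates
arrests = rng.binomial(true_offences, p_record)

# Outcome label driven only by true behaviour, not by group membership.
reoffend = (rng.random(n) < 1 - np.exp(-0.3 * true_offences)).astype(int)

# The model sees only the biased proxy: "number of arrests".
model = LogisticRegression().fit(arrests.reshape(-1, 1), reoffend)
risk = model.predict_proba(arrests.reshape(-1, 1))[:, 1]

for g, name in [(0, "group A"), (1, "group B")]:
    mask = group == g
    print(f"{name}: mean true offences = {true_offences[mask].mean():.2f}, "
          f"mean predicted risk = {risk[mask].mean():.3f}")
```

The specific numbers do not matter; the point is that the gap in predicted risk is driven by the disparity in the proxy, not by any difference in behaviour.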
Implications of Measurement Bias in Machine Learning
Measurement bias often enters through proxies. For example, “creditworthiness” is an abstract construct that is typically operationalised with a measurable proxy such as a credit score. Proxies become problematic when they are poor reflections of the target construct and/or are generated differently across groups; they can contribute to bias through (Suresh and Guttag, 2021):
- Oversimplification of Complex Constructs:
A proxy like a credit score fails to capture the full complexity of creditworthiness. This oversimplification can ignore group-specific indicators of success or risk, leading to biased outcomes.
- Variability in Measurement Across Groups:
Measurement methods may differ between groups, introducing bias. For instance, stricter monitoring at certain factory locations can inflate error counts (when the observed number of errors is used as a proxy for work quality), creating feedback loops that perpetuate further monitoring of those groups.
- Accuracy Disparities Across Groups:
Structural discrimination can lead to systematic inaccuracies, such as racial or gender disparities in medical diagnoses or misclassification in criminal justice risk assessments. For example, proxies like “arrest” or “rearrest” disproportionately misrepresent minority communities due to over-policing, leading to models with higher false positive rates for these groups (Mehrabi et al., 2021); a per-group error audit such as the sketch after this list is one way to surface these gaps.
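As a starting point for detecting the third issue, the minimal sketch below (plain NumPy, with purely illustrative arrays for labels, predictions, and group membership) compares false positive rates across groups on a held-out set; a persistent gap suggests the chosen proxy or features measure the groups differently.

```python
# Minimal sketch of a per-group error audit. Assumes you already have
# ground-truth labels, model predictions, and a group indicator for a
# held-out set; the arrays below are purely illustrative.
import numpy as np

y_true = np.array([0, 0, 1, 0, 1, 0, 0, 1, 0, 0])
y_pred = np.array([1, 0, 1, 1, 1, 0, 0, 0, 0, 1])
group  = np.array(["A", "A", "A", "A", "A", "B", "B", "B", "B", "B"])

def false_positive_rate(y_true, y_pred):
    """FPR = FP / (FP + TN): how often true negatives are flagged as positive."""
    negatives = y_true == 0
    return (y_pred[negatives] == 1).mean()

for g in np.unique(group):
    mask = group == g
    print(f"Group {g}: FPR = {false_positive_rate(y_true[mask], y_pred[mask]):.2f}")
```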
Design Approaches to Mitigate Measurement Bias in Machine Learning
Our goal is not to prescribe specific statistical methods or tools to address measurement bias, as such technical details are beyond the scope of this guidance. Instead, we aim to highlight key considerations, challenges, and strategies for identifying and mitigating measurement bias in data. By fostering awareness of its implications, we encourage practitioners to adopt context-appropriate solutions informed by their application requirements and stakeholder engagement.
Tackling measurement bias requires a systematic, proactive approach. You can get started with these resources:
Free Resources for Measurement Bias Mitigation
Checklist for Measurement Bias from problem definition to model deployment (coming soon)
AI Bias Mitigation Package – £999
The ultimate resource for organisations ready to tackle bias at scale, from problem definition through to model monitoring, and drive responsible AI practices.
Customised AI Bias Mitigation Package – £2499
Sources
Fahse, T., Huber, V. and van Giffen, B., 2021. Managing bias in machine learning projects. In Innovation Through Information Systems: Volume II: A Collection of Latest Research on Technology Issues (pp. 94-109). Springer International Publishing.
Mehrabi, N., Morstatter, F., Saxena, N., Lerman, K. and Galstyan, A., 2021. A survey on bias and fairness in machine learning. ACM computing surveys (CSUR), 54(6), pp.1-35.
Shahbazi, N., Lin, Y., Asudeh, A. and Jagadish, H.V., 2022. A survey on techniques for identifying and resolving representation bias in data. CoRR, abs/2203.11852.
Suresh, H. and Guttag, J., 2021, October. A framework for understanding sources of harm throughout the machine learning life cycle. In Proceedings of the 1st ACM Conference on Equity and Access in Algorithms, Mechanisms, and Optimization (pp. 1-9).
Tay, L., Woo, S.E., Hickman, L., Booth, B.M. and D’Mello, S., 2022. A conceptual framework for investigating and mitigating machine-learning measurement bias (MLMB) in psychological assessment. Advances in Methods and Practices in Psychological Science, 5(1).