Historical Bias in Machine Learning

Understanding Historical Bias

Algorithms often fail even when teams follow systematic processes and best practices for sampling data and building models. This happens because the model reflects past societal or systemic inequalities, such as gender, racial, or economic biases. This is referred to as historical bias: it stems from the way the world was, or still is, rather than from errors in the data itself.

Example of Historical Bias in Machine Learning

In 2018, women made up just 5% of Fortune 500 CEOs, and a Google image search for “CEO” at the time predominantly returned pictures of men. The potential harm of such representation calls for thoughtful assessment by a diverse range of stakeholders, including those directly affected, to evaluate its societal impact and make informed decisions. These considerations may sometimes challenge the accuracy of the underlying data, even when it reflects reality. Google, for example, later revised its Image Search results for “CEO” to include a higher proportion of women, a deliberate effort to address biases in representation (Suresh and Guttag, 2021).

Word embeddings are numerical representations of words that encode their meanings, making them useful for various natural language processing (NLP) tasks. These embeddings are generated by analyzing vast amounts of text data, such as news articles, web pages, and Wikipedia. However, studies have revealed that word embeddings often mirror the human biases present in the data from which they are derived.

For instance, a study by Fahse et al. (2021) revealed that word embeddings trained on text from a specific time period reflect the biases prevalent during that era. Gendered occupations, such as “nurse” and “engineer,” were strongly associated with women and men, respectively. As a result, NLP systems that rely on these embeddings—such as chatbots, machine translation tools, or speech recognition systems—may inadvertently encode and perpetuate harmful stereotypes related to gender and ethnicity, leading to biased outcomes.
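To make this concrete, the following minimal sketch probes gender associations in off-the-shelf embeddings. It assumes the gensim library and its downloadable GloVe vectors; the occupation and pronoun word lists are illustrative choices, not the ones used in the cited studies.

```python
# Sketch: probing gender associations in pretrained word embeddings.
# Assumes gensim and its downloadable GloVe vectors; word lists are illustrative.
import gensim.downloader as api

model = api.load("glove-wiki-gigaword-100")  # pretrained GloVe embeddings

occupations = ["nurse", "engineer", "receptionist", "programmer"]
for word in occupations:
    she = model.similarity(word, "she")
    he = model.similarity(word, "he")
    lean = "female-leaning" if she > he else "male-leaning"
    print(f"{word:>14}: sim(she)={she:.3f}  sim(he)={he:.3f}  -> {lean}")
```

Occupations whose vectors sit closer to “she” than to “he” (or vice versa) hint at the kind of historical association that downstream NLP systems can absorb.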

Historical bias: social biases can be encoded in data. If they are not accounted for, a machine learning model will reproduce these biases, resulting in unfair outcomes.

Types of Historical Bias in Machine Learning

Historical bias comes in two flavors (Weerts, 2021).

  1. Bias from Social Biases in Human Decision-Making
    Historical bias can occur when social biases influence human decisions, particularly when target labels (e.g., hiring decisions) are based on human judgment. For example, if more men have been hired than women in the past, a model trained on these historical decisions is likely to reproduce this gender imbalance (illustrated in the sketch that follows this list). This type of bias is known as construct validity bias because the historical hiring decisions are an inaccurate reflection of an applicant’s actual suitability for the job. Additionally, inaccurate stereotypes embedded in text, images, or annotations created by people can lead to systems that reinforce these biases.
  2. Bias from Data Representing a Biased Reality
    A second type of historical bias arises when the data accurately represents the real world, but the reality itself is biased. In the hiring example, the observed bias may reflect actual differences in candidate suitability, but these differences could stem from deeper structural inequalities in society. For instance, people from lower socioeconomic backgrounds might have fewer opportunities for quality education, making them less suitable for jobs that require such qualifications. This kind of bias is rooted in broader social inequalities rather than flawed data collection.

Similarly, some stereotypes are accurate at an aggregate level (even if they can be very inaccurate at an individual level!). For example, in many societies female nurses still greatly outnumber male nurses. Depending on your worldview, you might have a different definition of what is fair in each of these scenarios. In practice, it is usually impossible to distinguish between these two types of historical bias from observed data alone. Moreover, they can also occur simultaneously. 
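To illustrate the first flavor, the sketch below trains a model on synthetic “historical” hiring decisions in which equally experienced women were hired less often, then compares selection rates by group. All data, column names, and rates here are assumptions made for illustration; this is not a real audit.

```python
# Sketch: a model trained on historical hiring decisions reproduces the
# gender imbalance encoded in those decisions. Data is synthetic and illustrative.
import numpy as np
import pandas as pd
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
n = 5000
gender = rng.choice(["female", "male"], size=n)
experience = rng.normal(5, 2, size=n)

# Historical labels: equally experienced women were hired less often (assumed rates).
p_hire = 1 / (1 + np.exp(-(experience - 5))) * np.where(gender == "male", 1.0, 0.6)
hired = rng.random(n) < p_hire

X = pd.get_dummies(pd.DataFrame({"gender": gender, "experience": experience}))
model = LogisticRegression(max_iter=1000).fit(X, hired)
pred = model.predict(X)

df = pd.DataFrame({"gender": gender, "historical": hired, "predicted": pred})
print(df.groupby("gender")[["historical", "predicted"]].mean())
```

The group-wise means show the model’s predicted selection rates mirroring the historical gap, which is exactly how historical bias propagates when past decisions become training labels.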

Designing Mitigation for Historical Bias in Machine Learning

Mitigating historical bias requires a nuanced approach that combines technical solutions with an understanding of the social and cultural context. Socio-technical and user-centred design (UCD) approaches can be crucial in addressing historical bias. Here’s how these frameworks can be applied:

Product teams often focus on tackling representation bias during the data collection phase. However, addressing this issue requires a much broader approach. Historical bias starts creeping in right from the problem definition stage, and mitigating it effectively means taking thoughtful action at every step of the AI/ML pipeline, all through to ongoing monitoring and refinement.

Tackling historical bias requires a systematic, proactive approach.
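As one concrete example of a pre-processing intervention, the sketch below implements reweighing (Kamiran and Calders), a standard technique rather than anything specific to this guide: each (group, label) combination is weighted so that group membership and the outcome become statistically independent in the training data. The column names are hypothetical.

```python
# Sketch: "reweighing" (Kamiran & Calders) as one pre-processing mitigation.
# Each (group, label) combination is weighted so group membership and the
# outcome become statistically independent in the training data.
# Column names below are hypothetical, for illustration only.
import pandas as pd

def reweigh(df: pd.DataFrame, group_col: str, label_col: str) -> pd.Series:
    """Return a weight per row: P(group) * P(label) / P(group, label)."""
    p_group = df[group_col].value_counts(normalize=True)
    p_label = df[label_col].value_counts(normalize=True)
    p_joint = df.groupby([group_col, label_col]).size() / len(df)

    def weight(row):
        return (p_group[row[group_col]] * p_label[row[label_col]]
                / p_joint[(row[group_col], row[label_col])])

    return df.apply(weight, axis=1)

# Usage: pass the weights to any estimator that accepts sample_weight, e.g.
# LogisticRegression().fit(X, y, sample_weight=reweigh(df, "gender", "hired"))
```

Reweighing only counteracts imbalances that are visible in the observed labels; it does not, on its own, resolve the deeper question of whether those labels were a valid measure of suitability in the first place.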

You can get started with these resources:

Free Resources for Historical Bias Mitigation

Best practices and design considerations for mitigating Historical Bias from problem definition to model deployment. (coming soon)

AI Bias Mitigation Package – £999
The ultimate resource for organisations ready to tackle bias at scale, from problem definition through to model monitoring, to drive responsible AI practices.
Mitigate and resolve 15 types of bias specific to your project with detailed guidance from problem definition to model monitoring.
Packed with practical methods, research-based strategies, and critical questions to guide your team.
Comprehensive checklists with 75+ design cards for every phase of the AI/ML pipeline.
Get Bias Mitigation Package (delivery within 2-3 days)

Customised AI Bias Mitigation Package – £2499
We’ll customise the design cards and checklists to meet your specific use case and compliance requirements, ensuring the toolkit aligns with your goals and industry standards.
Mitigate and resolve 15 types of bias specific to your project with detailed guidance from problem definition to model monitoring.
Packed with practical methods, research-based strategies, and critical questions specific to your use case.
Customised checklists and 75+ design cards for every phase of the AI/ML pipeline.
Get Customised AI Bias Mitigation Package (delivery within 7 days)

Conclusion

Addressing historical bias is critical for building fair and equitable machine learning systems. Because this bias reflects past or present social inequalities rather than flaws in data collection alone, mitigating it requires deliberate choices at every stage of the AI/ML pipeline, from problem definition and data collection through labelling, modelling, and deployment. Continuous monitoring, refinement, and stakeholder involvement are essential to ensure these efforts remain aligned with ethical and societal values. Through deliberate and informed actions, it is possible to reduce the impact of historical bias, fostering trust and fairness in AI systems.

Sources

Catania, B., Guerrini, G. and Janpih, Z., 2023, December. Mitigating Representation Bias in Data Transformations: A Constraint-based Optimization Approach. In 2023 IEEE International Conference on Big Data (BigData) (pp. 4127-4136). IEEE.

Fahse, T., Huber, V. and van Giffen, B., 2021. Managing bias in machine learning projects. In Innovation Through Information Systems: Volume II: A Collection of Latest Research on Technology Issues (pp. 94-109). Springer International Publishing.

Kim, J., 2024. Towards Algorithmic Justice: Human Centered Approaches to Artificial Intelligence Design to Support Fairness and Mitigate Bias in the Financial Services Sector.

Mehrabi, N., Morstatter, F., Saxena, N., Lerman, K. and Galstyan, A., 2021. A survey on bias and fairness in machine learning. ACM Computing Surveys (CSUR), 54(6), pp. 1-35.

Mousavi, M., Shahbazi, N. and Asudeh, A., 2023. Data coverage for detecting representation bias in image datasets: A crowdsourcing approach. arXiv preprint arXiv:2306.13868.

Shahbazi, N., Lin, Y., Asudeh, A. and Jagadish, H.V., 2022. A survey on techniques for identifying and resolving representation bias in data. CoRR, abs/2203.11852.

Suresh, H. and Guttag, J.V., 2019. A framework for understanding unintended consequences of machine learning. arXiv preprint arXiv:1901.10002.

Suresh, H. and Guttag, J., 2021. Understanding potential sources of harm throughout the machine learning life cycle.

Weerts, H.J., 2021. An introduction to algorithmic fairness. arXiv preprint arXiv:2105.05595.
