Recent research has highlighted the vulnerability of modern machine learning-based systems to bias, especially towards segments of society that are under-represented in training data. In this work, we develop a novel, tunable algorithm for mitigating the hidden, and ...
Various forms of artificial intelligence (AI) applications are being deployed and used in many healthcare systems. As the use of these applications increases, we are learning about the failures of these models and how they can perpetuate bias. With these new lessons, we need to prioritize bias evaluat...
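To make the notion of bias evaluation concrete, the sketch below computes two standard group-fairness metrics, the demographic parity gap and the equal opportunity (true-positive-rate) gap, for a hypothetical binary classifier. The labels, predictions, and group attribute are synthetic placeholders, not data or methods from the work excerpted above.

```python
# Minimal group-fairness audit for a binary classifier, using only the
# standard library. All data below is synthetic and illustrative; in practice
# these arrays would come from a held-out test set with demographic annotations.

def rate(values):
    """Fraction of 1s in a list (returns 0.0 for an empty list)."""
    return sum(values) / len(values) if values else 0.0

def demographic_parity_gap(y_pred, groups):
    """Absolute difference in positive-prediction rates between the two groups."""
    g0 = [p for p, g in zip(y_pred, groups) if g == 0]
    g1 = [p for p, g in zip(y_pred, groups) if g == 1]
    return abs(rate(g0) - rate(g1))

def equal_opportunity_gap(y_true, y_pred, groups):
    """Absolute difference in true-positive rates (recall) between the two groups."""
    tpr = {}
    for g in (0, 1):
        preds = [p for p, t, gg in zip(y_pred, y_true, groups) if gg == g and t == 1]
        tpr[g] = rate(preds)
    return abs(tpr[0] - tpr[1])

if __name__ == "__main__":
    # Synthetic example: true outcomes, model predictions, binary group attribute.
    y_true = [1, 0, 1, 1, 0, 1, 0, 0, 1, 1]
    y_pred = [1, 0, 1, 0, 0, 1, 1, 0, 0, 1]
    groups = [0, 0, 0, 0, 0, 1, 1, 1, 1, 1]
    print("Demographic parity gap:", demographic_parity_gap(y_pred, groups))
    print("Equal opportunity gap: ", equal_opportunity_gap(y_true, y_pred, groups))
```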
Integrating human feedback, especially in the fine-tuning phase, can address bias in LLMs. Reinforcement learning from human feedback (RLHF) is an advanced technique that stands at the frontier of bridging the gap between artificial intelligence and human intuition. It aims to adjust LLM behavior ...
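As a rough illustration of the mechanics alluded to here, the sketch below implements the two ingredients that descriptions of RLHF typically combine: a reward model fit from pairwise human preferences (a Bradley-Terry style logistic fit) and a KL-regularized tilt of a reference policy toward higher reward. The candidate responses, features, and hyperparameters are illustrative assumptions, not any particular system's pipeline.

```python
import math

# Toy sketch of two core RLHF ingredients over a tiny discrete "response" space:
#   1) fit a Bradley-Terry reward model from pairwise human preferences;
#   2) tilt a reference policy toward higher reward under a KL penalty,
#      using the closed form pi(x) ~ pi_ref(x) * exp(reward(x) / beta).

# Hypothetical candidate responses, each described by two hand-crafted
# features: (helpfulness proxy, biased-language proxy). Purely illustrative.
responses = {
    "neutral_answer":     (0.8, 0.1),
    "curt_answer":        (0.4, 0.2),
    "stereotyped_answer": (0.7, 0.9),
}

# Pairwise human judgments: (preferred response, rejected response).
preferences = [
    ("neutral_answer", "stereotyped_answer"),
    ("neutral_answer", "curt_answer"),
    ("curt_answer", "stereotyped_answer"),
]

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def reward(weights, name):
    return sum(w * f for w, f in zip(weights, responses[name]))

# Step 1: logistic (Bradley-Terry) fit of linear reward weights, maximizing
# log P(winner preferred) = log sigmoid(r(winner) - r(loser)).
weights = [0.0, 0.0]
lr = 0.5
for _ in range(200):
    for winner, loser in preferences:
        p_win = sigmoid(reward(weights, winner) - reward(weights, loser))
        for i in range(2):
            weights[i] += lr * (1.0 - p_win) * (responses[winner][i] - responses[loser][i])

# Step 2: KL-regularized policy improvement against a uniform reference policy.
beta = 0.5  # smaller beta -> stronger pull toward the reward model
ref_prob = 1.0 / len(responses)
unnorm = {name: ref_prob * math.exp(reward(weights, name) / beta) for name in responses}
total = sum(unnorm.values())
policy = {name: p / total for name, p in unnorm.items()}

for name, prob in sorted(policy.items(), key=lambda kv: -kv[1]):
    print(f"{name:20s} reward={reward(weights, name):+.2f} policy_prob={prob:.2f}")
```

The penalty strength beta controls the trade-off described in the excerpt: a large beta keeps the adjusted policy close to the reference model, while a small beta lets the human-preference reward dominate.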
These concerns include, for instance, models producing stereotypical and derogatory content [3] and exhibiting gender and racial biases [10, 24, 38, 41]. Subsequently, approaches have been developed to, e.g., reduce the level of bias in these models [6, 39]....
Machine learning algorithms have become common in everyday decision making, and decision-assistance systems are ubiquitous in our lives. Hence, research on preventing and mitigating potential bias and unfairness in the predictions made by these algorithms has been increasing in recent ...
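As one concrete, generic example of mitigation at prediction time, the post-processing sketch below selects a separate decision threshold per demographic group so that positive-prediction rates are approximately equalized. The scores, group split, and target rate are synthetic assumptions and do not correspond to any specific method in the excerpted work.

```python
# Minimal post-processing sketch: choose per-group decision thresholds so the
# positive-prediction rate is roughly equal across groups. Scores and groups
# below are synthetic; in practice they would come from a validation set.

def positive_rate(scores, threshold):
    return sum(s >= threshold for s in scores) / len(scores)

def threshold_for_rate(scores, target_rate):
    """Pick the score value that flags roughly target_rate of the scores as positive."""
    k = max(1, round(target_rate * len(scores)))
    return sorted(scores, reverse=True)[k - 1]

if __name__ == "__main__":
    # Synthetic risk scores for two demographic groups from the same model.
    scores_a = [0.9, 0.8, 0.7, 0.6, 0.4, 0.3, 0.2, 0.1]
    scores_b = [0.7, 0.6, 0.5, 0.5, 0.4, 0.3, 0.2, 0.1]

    # Target: the overall positive rate obtained with a single 0.5 threshold.
    target = positive_rate(scores_a + scores_b, 0.5)

    t_a = threshold_for_rate(scores_a, target)
    t_b = threshold_for_rate(scores_b, target)
    print(f"target positive rate: {target:.2f}")
    print(f"group A threshold: {t_a:.2f} -> rate {positive_rate(scores_a, t_a):.2f}")
    print(f"group B threshold: {t_b:.2f} -> rate {positive_rate(scores_b, t_b):.2f}")
```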