When discussing Generative Artificial Intelligence, the results of these models are often compared to the corresponding human performance. But who are the humans being referred to when such comparisons are made? What culture do they belong to? What values do they uphold?

Harvard’s answer

A group of researchers at Harvard University decided to look for an answer by analysing the “cultural proximity” of a chatbot, ChatGPT in this case, to different countries. The study starts from a simple and understandable premise: talking generically about human beings is misleading. Human beings are not a monolithic structure; culture, socio-economic context and spoken language shape who we are, the decisions we make and the values we hold. Drawing on data from the World Values Survey, one of the most comprehensive collections in the social sciences on the values that characterise different cultures, and asking ChatGPT the same questions as in the survey, the researchers investigated which countries are “culturally closest” to ChatGPT, finding results that were hardly surprising.
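To make the approach concrete, here is a minimal sketch of the general idea in Python (not the authors' exact method, and with invented numbers): give the chatbot the same questionnaire, average each country's answers, and rank countries by their distance from the chatbot's answers.

```python
# Illustrative sketch only: hypothetical mean answers (on a 1-10 scale)
# to the same four survey questions, per country, plus the chatbot's
# own answers. The real study uses the World Values Survey items.
import numpy as np

country_means = {
    "United States": np.array([7.1, 3.2, 8.4, 6.0]),
    "Netherlands":   np.array([7.4, 2.9, 8.1, 6.3]),
    "Pakistan":      np.array([3.0, 7.8, 4.2, 2.5]),
    "Ethiopia":      np.array([2.8, 8.1, 3.9, 2.2]),
}
chatbot_answers = np.array([7.3, 3.0, 8.2, 6.1])

# Euclidean distance as a simple stand-in for "cultural proximity".
distances = {
    country: float(np.linalg.norm(chatbot_answers - means))
    for country, means in country_means.items()
}

for country, d in sorted(distances.items(), key=lambda kv: kv[1]):
    print(f"{country}: {d:.2f}")
```

The smaller the distance, the “closer” a country's average values are to the chatbot's answers; the paper's actual analysis is more sophisticated, but the intuition is the same.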

The chatbot's answers place it in a cluster of countries comprising the United States, Uruguay, Canada, Northern Ireland, New Zealand, Great Britain, Australia, Andorra, Germany and the Netherlands. Among the most culturally distant countries are Ethiopia and Pakistan. Investigating further, the researchers found that the greatest similarity in values with ChatGPT occurs in the so-called “WEIRD” cultures: Western, Educated, Industrialised, Rich and Democratic, of which the United States is the main reference point.

This result is not surprising, because the problem has been known for a long time: Large Language Models are trained on texts and data found on the web, where Western countries are significantly overrepresented. In Common Crawl, one of the main public data sources used for LLM training, more than 44% of the content is in English (a language spoken by 17% of the world's population), while Hindi, spoken by 7.5% of the world's population, accounts for less than 0.2% of the data. On top of this, after initial training, models are further refined through fine-tuning procedures that incorporate norms, preferences and evaluation criteria often rooted in Western cultural contexts, reinforcing the imbalance.
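A back-of-the-envelope calculation makes the imbalance tangible. Using the figures quoted above, one can compare each language's share of the corpus with its share of speakers (the exact Common Crawl numbers vary by snapshot, so treat these as rough orders of magnitude):

```python
# Rough comparison of corpus share vs. speaker share, using the
# figures quoted in the text (Hindi's 0.2% is an upper bound).
corpus_share = {"English": 0.44, "Hindi": 0.002}
speaker_share = {"English": 0.17, "Hindi": 0.075}

for lang in corpus_share:
    ratio = corpus_share[lang] / speaker_share[lang]
    print(f"{lang}: {ratio:.2f}x its share of speakers")
# English: about 2.6x over-represented.
# Hindi: about 0.03x, i.e. under-represented by a factor of ~37.
```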

How AI processes inputs

This imbalance is then amplified by artificial intelligence systems through the very way they work: artificially generated content reflects what is statistically most likely. Several studies show that outputs tend not to reproduce the distribution of the input data faithfully, but to over-represent the most likely patterns and under-represent the least likely ones, eventually making the rarest disappear altogether.
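A toy simulation shows the mechanism (an illustration, not taken from the cited studies): sampling with a temperature below 1, a common setting in practice, systematically inflates likely outcomes and squeezes out rare ones.

```python
# Toy illustration of distribution sharpening: with temperature 0.5,
# probabilities are effectively squared and renormalised, so frequent
# outcomes grow and rare ones shrink toward zero.
import numpy as np

rng = np.random.default_rng(0)
p = np.array([0.5, 0.3, 0.15, 0.05])    # "true" distribution in the data

def sharpen(p, temperature):
    logits = np.log(p) / temperature    # rescale log-probabilities
    q = np.exp(logits)
    return q / q.sum()                  # renormalise

q = sharpen(p, temperature=0.5)
samples = rng.choice(len(p), size=100_000, p=q)
freq = np.bincount(samples, minlength=len(p)) / len(samples)

print("data:     ", p)     # [0.5, 0.3, 0.15, 0.05]
print("generated:", freq)  # roughly [0.69, 0.25, 0.06, 0.01]
```

Repeat this over many generations, as happens when generated content flows back onto the web and into future training data, and the rarest categories can vanish entirely.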

With the growing use of artificial intelligence tools in content creation and moderation, and in screening and selection processes, these hidden biases need to be studied in depth to avoid unfair treatment. A study by two researchers at the University of Zurich attempted to map some of these biases by focusing on political values. They analysed the judgements of four Large Language Models (ChatGPT, Mistral, DeepSeek and Grok) on various social and political statements. They found that when no source is specified the chatbots show little bias, but that they are strongly influenced by the attributed source when one is given. For example, when presented with the same text attributed to different authors, all four chatbots rated it worse when the author was identifiable as Chinese (even DeepSeek!). The chatbots also rated texts attributed to other automatic content generation systems negatively, revealing a lack of confidence in these systems on the part of their own designers.
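The setup can be sketched in a few lines (a hypothetical reconstruction: `query_model` stands in for whatever API each chatbot exposes, and the attributions are examples, not the study's exact wording):

```python
# Hold the statement fixed and vary only the attributed author: any
# difference in ratings then measures source bias, not content quality.
ATTRIBUTIONS = ["an American journalist", "a Chinese journalist",
                "another AI language model"]
PROMPT = ('The following statement was written by {author}:\n'
          '"{statement}"\n'
          'Rate its quality from 1 (poor) to 10 (excellent). '
          'Answer with the number only.')

def rate_with_attributions(query_model, statement):
    """Return one rating per attributed author for the same statement."""
    ratings = {}
    for author in ATTRIBUTIONS:
        reply = query_model(PROMPT.format(author=author, statement=statement))
        ratings[author] = int(reply.strip())
    return ratings
```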

Why reflect on these issues?

These studies help us understand how much work still needs to be done before we can consider Artificial Intelligence a reliable and fair tool, capable of respecting the cultural and value diversity of the real world. Understanding the biases of models, their origins and how they are amplified is not a theoretical exercise, but a necessary condition for avoiding discrimination or unfair treatment when AI is applied to content creation, selection, moderation or social decisions.

In the meantime, however, we can simply rejoice with Sam Altman that, for the past few days, if we ask ChatGPT to write without using the famous dashes (—), it now actually does what we ask!