Our 2025 hackathon has ended

The fourth edition of the SomosNLP hackathon has come to an end. What an experience!

🚀 Projects

The focus of this hackathon was the generation of open resources for evaluating and improving the cultural adequacy of LLMs for Ibero-American countries.

Curious to see the projects developed during the SomosNLP 2025 Hackathon? Here they are!

🎦 The presentation videos are available in this YouTube playlist along with the workshops and expert talks held during the hackathon.

🤗 All resources are available on the Hugging Face Hub: hf.co/somosnlp-hackathon-2025

We hope you enjoy them and that many applications emerge using these new open resources 💛

📚 Cultural knowledge benchmark: INCLUDE

This challenge consisted of collecting multiple-choice exams and extracting questions to generate a large LLM evaluation benchmark focused on regional knowledge.

In total, we collected more than 38,000 questions from 23 countries 🔥

In particular, we obtained more than 1,000 questions for each of México, Colombia, Perú, Argentina, Bolivia, España, and Ecuador.

Thank you so much for your effort!

The people who extracted the most questions were...

Rank	Name	Questions extracted
🥇	Francisco-Javier Rodrigo-Ginés	4599
🥈	Pablo Carrera	2830 *
🥉	Alfonso Amayuelas	2300
4	Naira Paola Arnez Jordan	1581
5	Oscar Cumbicus	1280
6	Jorge Vallego	927
7	Juan Calderón	902 *
8	Reewos Talla	608 *
9	Carlos Arriaga	598
10	Andrea Parra	577
11	Jorge Téllez	561 *
12	Susana Zhou	560
13	Enrique Paiva	502
14	David Quispe	449 *
15	Gonzalo Martínez	436
16	Guido Ivetta	393
17	Javier Conde	377
18	Fabian Perez	372
19	Andrés Sebastian	370
20	Gerardo Huerta	353
21	Marcos J. Gómez	348
22	David Nazareno Campo	303
23	Roverico	303 *
24	Henry Mantilla	302
25	Constanza Jeldres	300
26	Rasel Agüero Fernández	300
27	Rosabel F. Medina Sarmiento	300
28	Adrián Sáez	227 *
29	Gabriela Palomeque	120

The table shows the number of questions extracted (not collected) by each participant. An asterisk indicates that the participant must confirm the license of some exams before compensation can be paid. Everyone with more than 300 questions will be a co-author of the INCLUDE paper.

📚 Cultural knowledge benchmark: BLEND

In this challenge, participants answered questions about their own country to extend BLEND, an open benchmark for evaluating the cultural knowledge of LLMs.

The countries with the highest participation were España, México, Chile, Cuba, and Perú. Great work! 👏

The annotation space is still open, so join in!

📚 Stereotype validation

This challenge consisted of collecting and validating stereotypes about different nationalities. In total, we obtained nearly 1,000 stereotypes that will help us mitigate biases in LLMs.

The people who validated the most stereotypes were...

Rank	Discord ID	Stereotypes validated
🥇	bea esparcia	126
🥈	neovalleltd	122
🥉	dreamripper1	85
4	andres_seba	70
5	alexis_castillo	68
6	elena w.	57
7	alebravo	30
8	jedzill4	27
9	gonznm	24
10	agumeister	21
11	adriszmar	20
12	jorge.vallego	14
13	jorgeav	13
14	maria isabel ll	12
15	clauvallory	5
16	dramos7	5
17	enpaiva93	3
18	lucase#5596	3
19	alvaro8gb	2
20	mcdaqc	2
21	xat.	2
22	freddyalfonsoboulton	1
23	roverico	1
24	valaery	1
25	yee51	1

📚 Preference dataset

This challenge consisted of designing prompts that evaluated cultural adequacy for each country, followed by choosing the best response in an LLM Arena.

🤗 The dataset with the prompt collection is available on Hugging Face: hf.co/datasets/somosnlp-hackathon-2025/dataset-preferencias-dpo-v0
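If you would like to explore the prompts programmatically, here is a minimal sketch using the Hugging Face `datasets` library. The repository ID comes from the link above. The helper name `load_preferences` is hypothetical, and the sketch assumes the dataset is public and loads with its default configuration. Split and column names are not assumed here, so inspect the returned object to see them.

```python
# Repository ID taken from the link above; the rest of this sketch is an assumption.
REPO_ID = "somosnlp-hackathon-2025/dataset-preferencias-dpo-v0"


def load_preferences(repo_id: str = REPO_ID):
    """Download the dataset from the Hub.

    Requires `pip install datasets` and network access.
    """
    # Imported lazily so this file can be imported without the dependency.
    from datasets import load_dataset

    # Without a `split` argument, load_dataset returns a DatasetDict
    # keyed by split name.
    return load_dataset(repo_id)
```

Calling `print(load_preferences())` should list the available splits and their columns.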

The countries with the highest participation were Colombia, Chile, España, Perú, Paraguay, Nicaragua, and México.

The people who contributed the most prompts were...

Rank	Discord ID	Preferences
🥇	rasel3132	430
🥈	bel21093	206
🥉	conilinguist	196
4	roverico	164
5	pablo.ce	153
6	steminism	133
7	andres_seba	120
8	mcdaqc	118
9	susanazhou	111
10	enpaiva93	107
11	dreamripper1	83
12	bea esparcia	80
13	angustias22	63
14	henry mantilla	58
15	luceldasilva	58
16	fabianpp	50
17	alvaro8gb	42
18	ghuerta170	35
19	edmenciab	30
20	adriszmar	22
21	diegoacheve	21
22	danielcavilla	19
23	helenpy	19
24	gonzalo_40146	8

The number of preferences is the number of prompts each participant submitted to the Arena and for which they voted on the best LLM-generated response. This number may differ from the number of prompts each team designed and uploaded to the Hugging Face dataset, since not every prompt was necessarily submitted to the Arena.
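To make the counting rule concrete, here is a small illustrative sketch. The log format, the field names (`discord_id`, `prompt`, `voted`), and the sample records are all hypothetical. This is not the organizers' actual pipeline, only one way a count like this could be computed.

```python
from collections import Counter

# Hypothetical Arena log: one record per prompt a participant submitted.
# `voted` marks whether they also picked the best response.
arena_log = [
    {"discord_id": "user_a", "prompt": "p1", "voted": True},
    {"discord_id": "user_a", "prompt": "p2", "voted": True},
    {"discord_id": "user_b", "prompt": "p3", "voted": True},
    {"discord_id": "user_b", "prompt": "p4", "voted": False},  # submitted, never voted on
]


def count_preferences(log):
    """Count, per participant, the prompts that were submitted and voted on."""
    return Counter(record["discord_id"] for record in log if record["voted"])


preferences = count_preferences(arena_log)
for discord_id, total in preferences.most_common():
    print(discord_id, total)
```

Under this reading, a prompt that was uploaded to the dataset but never sent through the Arena does not appear in the log at all, which is why the two numbers can differ.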

And the three best corpora were… 🥁🥁🥁

🥇 TralaleloTralala-MemeAlign
🥈 IberoTales
🥉 HoCV-COL

Congratulations to the finalist teams (in alphabetical order):

👏 Comida Colombia + Ecuador
👏 Cresia
👏 Equipo LeIA
👏 Falsos Amigos
👏 Refranero Afro-Cubano
👏 Sabiduría Popular Castellana
👏 Think Paraguayo

Congratulations to all the teams!

🎁 Prizes and next steps

In August, we will share more information about honorable mentions and contact all teams to deliver the corresponding prizes.
If you have any questions about the point count, don't hesitate to ask. Emails were mapped to Discord IDs using the data from the registration form.
If you want to keep contributing to the mini challenges and take a more active part in the papers we are going to write, let us know in the #compare-tu-proyecto channel and we will invite you to the corresponding private channels.
If you said in the submission form that you were interested in publishing a paper about your project, we will contact you in September about the mentoring sessions. In the meantime, you can start writing up your experiments in article format (introduction/motivation, methodology, results, and analysis).

💛 Thank you so much and see you next time!