Challenges #HackathonSomosNLP 2025

This year’s hackathon focuses on creating resources that enable the evaluation and alignment of language models with the culture of LATAM and Iberian Peninsula countries.

The hackathon consists of a main challenge and several mini-challenges. The mini-challenges also earn points toward the final prizes and can win extra prizes. The maximum total score is 10 points.

Before starting:

✅ Join the SomosNLP Discord server
✅ Create a Hugging Face account
✅ Fill out the registration form
✅ Join the hackathon’s Hugging Face organization, where datasets, models, and demos will be shared
✅ Create or join a team. Creating a thread in the #encuentra-equipo channel is how you register your team for the hackathon

If you have any questions:

Check the #anuncios channel. We recommend enabling channel notifications; we post at most once a day
Ask your questions in the Discord #ask-for-help channel so everyone can benefit from the answer
Events are announced in the #eventos channel and added to the Google Calendar
You can give us feedback to improve the challenge guides with this form (anonymous)

Let’s do this! 🚀

✨ Mini-challenges

✅ Exams (INCLUDE)

Look for multiple-choice exams from your country to evaluate LLMs’ knowledge. Prioritize exams in languages other than Spanish and/or focused on cultural topics (e.g., history, literature). We will use these questions and answers to extend the open INCLUDE benchmark.
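As a rough illustration of how collected exam questions can be used to evaluate a model, the sketch below stores each question as a small record and computes accuracy for a predictor. The field names (`question`, `options`, `answer_index`, `language`, `subject`), the two sample items, and the `always_first` baseline are assumptions made for this example; they are not the INCLUDE schema or the tooling in the linked repository.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class ExamQuestion:
    """One multiple-choice item (hypothetical schema for illustration)."""
    question: str
    options: list[str]
    answer_index: int  # position of the correct option in `options`
    language: str      # e.g. "es", "ca", "gl", "eu", "pt"
    subject: str       # e.g. "history", "literature"


QUESTIONS = [
    ExamQuestion(
        question="¿En qué año comenzó la Guerra Civil Española?",
        options=["1914", "1936", "1945", "1975"],
        answer_index=1,
        language="es",
        subject="history",
    ),
    ExamQuestion(
        question="Qui va escriure «La plaça del Diamant»?",
        options=["Mercè Rodoreda", "Josep Pla", "Salvador Espriu", "Joan Maragall"],
        answer_index=0,
        language="ca",
        subject="literature",
    ),
]


def accuracy(questions: list[ExamQuestion],
             predict: Callable[[ExamQuestion], int]) -> float:
    """Fraction of questions for which `predict` returns the correct index."""
    if not questions:
        return 0.0
    correct = sum(1 for q in questions if predict(q) == q.answer_index)
    return correct / len(questions)


def always_first(question: ExamQuestion) -> int:
    """Trivial baseline that always picks the first option."""
    return 0


baseline_accuracy = accuracy(QUESTIONS, always_first)
```

In practice `predict` would wrap a call to the LLM under evaluation and map its answer back to an option index.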

April 9 - April 28 | max 1 point

Requirements: Know how to search the internet

Resources:

GitHub repo amayuelas/corpus-automation

👀 Stereotypes

Share and evaluate stereotypes to help mitigate LLMs’ biases.

April 9 - May 7 | max 1 point

Requirements: Have lived in society

🔥 Main Challenge

⚙️ Option A: LLM Alignment

Process, filter, and extend the v0 preferences dataset, adapting it to your use case. Then use it to align an LLM with efficient training and alignment techniques such as LoRA, quantization, and Direct Preference Optimization (DPO). For this challenge, each team will have access to $500 worth of Cohere API credits and a Hugging Face L40S GPU.
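The processing and filtering step might look something like the following minimal sketch. It assumes each row has `prompt`, `chosen`, and `rejected` text fields, a layout commonly used for preference data; the actual columns of the v0 dataset may differ. The function name `clean_preferences`, the `min_chars` threshold, and the sample rows are hypothetical.

```python
def clean_preferences(rows, min_chars=10):
    """Drop unusable preference pairs and exact-duplicate prompts.

    Assumes each row is a dict with "prompt", "chosen" and "rejected" strings.
    """
    seen_prompts = set()
    kept = []
    for row in rows:
        prompt = row["prompt"].strip()
        chosen = row["chosen"].strip()
        rejected = row["rejected"].strip()

        if len(prompt) < min_chars:
            continue  # prompt too short to be informative
        if not chosen or not rejected:
            continue  # one side of the pair is missing
        if chosen == rejected:
            continue  # no preference signal if both answers are identical

        key = prompt.casefold()
        if key in seen_prompts:
            continue  # keep only the first occurrence of each prompt
        seen_prompts.add(key)

        kept.append({"prompt": prompt, "chosen": chosen, "rejected": rejected})
    return kept


sample_rows = [
    {"prompt": "¿Qué se celebra el 5 de mayo en México?",
     "chosen": "La victoria en la Batalla de Puebla de 1862.",
     "rejected": "El Día de la Independencia."},
    {"prompt": "  ¿QUÉ SE CELEBRA EL 5 DE MAYO EN MÉXICO?  ",  # duplicate prompt
     "chosen": "La Batalla de Puebla.",
     "rejected": "La Independencia."},
    {"prompt": "Hola",  # too short
     "chosen": "Hola, ¿en qué puedo ayudarte?",
     "rejected": "Adiós."},
    {"prompt": "¿Cuál es la capital de Uruguay?",  # identical answers
     "chosen": "Montevideo.",
     "rejected": "Montevideo."},
]

cleaned = clean_preferences(sample_rows)
```

Only the first sample row survives here. Real filtering for a given use case would likely add topic, language, or quality criteria on top of these basic checks.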

April 21 - May 5 | max 3 points

Requirements: Know how to program

Guidelines and support material:

Example notebook for aligning an LLM with DPO
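For intuition about what DPO optimizes, here is a small sketch of the standard DPO loss for a single preference pair, written in plain Python. It is not taken from the example notebook. The inputs are assumed to be summed log-probabilities of the chosen and rejected responses under the model being trained (the policy) and under a frozen reference model, and `beta=0.1` is just an illustrative value.

```python
import math


def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """DPO loss for one (chosen, rejected) pair.

    loss = -log(sigmoid(beta * margin)), where
    margin = (policy_chosen - ref_chosen) - (policy_rejected - ref_rejected)
    """
    margin = ((policy_chosen_logp - ref_chosen_logp)
              - (policy_rejected_logp - ref_rejected_logp))
    z = beta * margin
    # -log(sigmoid(z)) equals softplus(-z); this form is numerically stable
    return max(-z, 0.0) + math.log1p(math.exp(-abs(z)))


# Policy identical to the reference: the margin is zero and the loss is log(2)
loss_at_start = dpo_loss(-5.0, -7.0, -5.0, -7.0)

# Policy now favours the chosen answer more than the reference does
loss_after_improvement = dpo_loss(-4.0, -8.0, -5.0, -7.0)
```

The loss falls as the policy raises the probability of the chosen response relative to the rejected one, measured against the reference model. Training libraries compute the same quantity over batches of token-level log-probabilities.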

Incentives:

Add up to 3 points to your team’s total score

Many thanks to:

Cohere: API credits worth $500 for each team
Hugging Face: L40S GPUs for each team (L40S = 8 vCPU, 62 GB RAM, 48 GB VRAM)

🎨 Option B: Cultural Multimodal Project

Create a multimodal model that generates image descriptions taking context into account. For this challenge, each team will have access to $500 worth of Cohere API credits and a Hugging Face L40S GPU.

April 21 - May 5 | max 3 points

Requirements: Have experience in NLP. There will be less support material for this challenge than for Option A

Guidelines and support material:

Example notebook for training an image description generation model

Incentives:

Add up to 3 points to your team’s total score

Many thanks to:

Cohere: API credits worth $500 for each team
Hugging Face: L40S GPUs for each team (L40S = 8 vCPU, 62 GB RAM, 48 GB VRAM)

🎥 Creating a Demo

Create a demo of your project in a Hugging Face Space so everyone can see your work.
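A demo in a Space is often a small Gradio app. The sketch below is one possible shape, not the official example code. `respond` is a hypothetical placeholder for your model call, and Gradio is imported inside `build_demo` so the placeholder can be tested without Gradio installed.

```python
def respond(prompt: str) -> str:
    """Placeholder for the team's model call (hypothetical)."""
    cleaned = prompt.strip()
    if not cleaned:
        return "Please enter a prompt."
    # Replace this line with a call to your aligned or multimodal model.
    return f"[model output for: {cleaned}]"


def build_demo():
    """Build a simple text-in, text-out Gradio interface around `respond`."""
    import gradio as gr  # lazy import: only needed when the demo is built

    return gr.Interface(
        fn=respond,
        inputs="text",
        outputs="text",
        title="Hackathon demo",
    )
```

In the Space’s `app.py`, calling `build_demo().launch()` at the end of the file would start the interface.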

April 21 - May 5 | max 0.5 points

Guidelines and support material:

Example code for creating a demo on Hugging Face

Incentives:

Add up to 0.5 points to your team’s total score
The best 2 or 3 demos receive a ZeroGPU time extension
Required to consider the project finished and be eligible for prizes

Many thanks to:

Hugging Face: ZeroGPU for demos

🎥 5-Minute Video Presenting the Project

Record a 5-minute video presenting your project.

May 7 | max 0.5 points

Guidelines and support material:

Recommendations for creating a presentation

Incentives:

Add up to 0.5 points to your team’s total score
Required by Mistral to give credits to the winning team
Required to consider the project finished and be eligible for prizes

📝 Optional: Paper Writing

With the help of PhD students and professors, write a paper presenting your project and submit it to the LatinX in NLP workshop at NeurIPS, one of the most important conferences in the field.

Incentives:

Gain research experience
If your paper is accepted, you’ll have the opportunity to travel to Vancouver to present it!

Many thanks to:

LatinX in AI: Mentoring for paper writing