This year’s hackathon focuses on creating resources that enable the evaluation and alignment of language models with the culture of LATAM and Iberian Peninsula countries.
The hackathon consists of a main challenge and several mini-challenges through which you can also accumulate points for the final prizes and win extra prizes. The maximum total score is 10 points.
Before starting:
- ✅ Join the SomosNLP Discord server
- ✅ Create a Hugging Face account
- ✅ Fill out the registration form
- ✅ Join the hackathon’s Hugging Face organization, where datasets, models, and demos will be shared
- ✅ Create or join a team: creating a thread in the #encuentra-equipo channel is how you register your team for the hackathon
If you have any questions:
- Check the #anuncios channel. We recommend enabling channel notifications; we post at most once a day
- Ask your questions in the Discord #ask-for-help channel so everyone can benefit from the answer
- Events are announced in the #eventos channel and added to the Google Calendar
- You can give us feedback to improve the challenge guides with this form (anonymous)
Let’s do this! 🚀
✨ Mini-challenges
✅ Exams (INCLUDE)
Look for multiple-choice exams from your country to evaluate LLMs’ knowledge. Prioritize exams in languages other than Spanish and/or focused on cultural topics (e.g., history, literature). We will use these questions and answers to extend the open INCLUDE benchmark.
April 9 - April 28 | max 1 point
Requirements: Know how to search the internet
Resources:
- GitHub repo amayuelas/corpus-automation
👀 Stereotypes
Share and evaluate stereotypes to help mitigate LLMs’ biases.
April 9 - May 7 | max 1 point
Requirements: Have lived in society
🔥 Main Challenge
⚙️ Option A: LLM Alignment
Process, filter, and extend the v0 preferences dataset adapting it to your use case. Use it to align an LLM using optimized training and alignment techniques such as LoRA, quantization, and Direct Preference Optimization (DPO). For this challenge, each team will have access to $500 worth of Cohere API credits and a Hugging Face L40S GPU.
April 21 - May 5 | max 3 points
Requirements: Know how to program
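To give a feel for what Direct Preference Optimization optimizes, here is a minimal sketch of the per-example DPO loss in plain Python. It is only an illustration, not the code from the hackathon's example notebook: the function names, the argument names, and the default `beta=0.1` are assumptions chosen for this example. The inputs are the summed log-probabilities that the model being trained (the policy) and a frozen reference model assign to the preferred ("chosen") and dispreferred ("rejected") responses for the same prompt.

```python
import math


def log_sigmoid(x: float) -> float:
    """Numerically stable log(sigmoid(x))."""
    if x >= 0:
        return -math.log1p(math.exp(-x))
    return x - math.log1p(math.exp(x))


def dpo_loss(
    policy_chosen_logp: float,
    policy_rejected_logp: float,
    ref_chosen_logp: float,
    ref_rejected_logp: float,
    beta: float = 0.1,  # assumed value; controls how far the policy may drift from the reference
) -> float:
    """Per-example DPO loss for one (prompt, chosen, rejected) triple."""
    # How much more the policy prefers the chosen response over the rejected one.
    policy_margin = policy_chosen_logp - policy_rejected_logp
    # The same preference margin under the frozen reference model.
    ref_margin = ref_chosen_logp - ref_rejected_logp
    # The loss shrinks as the policy's margin grows beyond the reference's margin.
    return -log_sigmoid(beta * (policy_margin - ref_margin))


# Hypothetical log-probabilities, for illustration only.
same_as_reference = dpo_loss(-5.0, -7.0, -5.0, -7.0)
prefers_chosen_more = dpo_loss(-4.0, -8.0, -5.0, -7.0)
print(same_as_reference, prefers_chosen_more)
```

When the policy matches the reference, the loss equals log 2; it decreases as the policy favors the chosen response more strongly than the reference does. In practice you would most likely rely on a training library rather than hand-writing this, and compute the log-probabilities in batches on the GPU.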
More information
Guidelines and support material:
- Example notebook for aligning an LLM with DPO
Incentives:
- Add up to 3 points to your team’s total score
Many thanks to:
- Cohere: API credits worth $500 for each team
- Hugging Face: L40S GPUs for each team (L40S = 8 vCPU, 62 GB RAM, 48 GB VRAM)
🎨 Option B: Cultural Multimodal Project
Create a multimodal model that generates image descriptions taking context into account. For this challenge, each team will have access to $500 worth of Cohere API credits and a Hugging Face L40S GPU.
April 21 - May 5 | max 3 points
Requirements: Have experience in NLP. There will be less support material for this challenge than for Option A.
More information
Guidelines and support material:
- Example notebook for training an image description generation model
Incentives:
- Add up to 3 points to your team’s total score
Many thanks to:
- Cohere: API credits worth $500 for each team
- Hugging Face: L40S GPUs for each team (L40S = 8 vCPU, 62 GB RAM, 48 GB VRAM)
🎥 Creating a Demo
Create a demo of your project in a Hugging Face Space so everyone can see your work.
April 21 - May 5 | max 0.5 points
More information
Guidelines and support material:
- Example code for creating a demo on Hugging Face
Incentives:
- Add up to 0.5 points to your team’s total score
- Best 2 or 3 demos = ZeroGPU time extension
- Required to consider the project finished and be eligible for prizes
Many thanks to:
- Hugging Face: ZeroGPU for demos
🎥 5-Minute Video Presenting the Project
Record a 5-minute video presenting your project.
May 7 | max 0.5 points
More information
Guidelines and support material:
- Recommendations for creating a presentation
Incentives:
- Add up to 0.5 points to your team’s total score
- Required by Mistral to give credits to the winning team
- Required to consider the project finished and be eligible for prizes
📝 Optional: Paper Writing
With the help of PhD students and professors, write a paper presenting your project and submit it to the LatinX in NLP workshop at NeurIPS, one of the most important conferences in the field.
More information
Incentives:
- Gain research experience
- If your paper is accepted, you’ll have the opportunity to travel to Vancouver to present it!
Many thanks to:
- LatinX in AI: Mentoring for paper writing