Main Challenge #HackathonSomosNLP 2026: LLM and VLLM Alignment

How to participate in this challenge and help improve the cultural knowledge of language and vision-language models


🎯 Challenge objective

  • Choose one of the following options:
    • A. Align a language model (LLM) to generate text in a culturally appropriate way
    • B. Adapt a multimodal vision-language model (VLLM) to generate image descriptions that take cultural context into account
  • In Spanish, Portuguese, or any other language of the Iberian Peninsula or LATAM
  • Adapt an existing model (don’t pre-train one from scratch); we recommend starting from models around 7B, e.g. Salamandra, Mistral or Gemma (see the loading sketch after this list)
  • Generate the dataset with the help of 500 USD in Cohere API credits! We recommend filtering and extending the v0 preferences dataset generated collectively in the Arena: somosnlp-hackathon-2025/dataset-preferencias-dpo-v0
  • Train your model directly in JupyterLab on the Hugging Face Hub — we have GPUs sponsored by 🤗!
  • Upload the model(s) along with all the notebooks used to hf.co/somosnlp-hackathon-2026
  • Write the Model Card; include links to the dataset and the notebooks used (e.g. preprocessing, training)
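
As a quick-start reference, below is a minimal sketch of loading one of the suggested ~7B models together with the collective v0 preferences dataset. The Salamandra repo id and the split name are assumptions; check the Hub for the exact names of the model you pick.

```python
from datasets import load_dataset
from transformers import AutoModelForCausalLM, AutoTokenizer

# Collective v0 preferences dataset mentioned above (the "train" split is an assumption)
prefs = load_dataset("somosnlp-hackathon-2025/dataset-preferencias-dpo-v0", split="train")
print(prefs[0])

# One of the suggested ~7B starting points; the repo id is illustrative,
# check hf.co for the model you actually choose (Salamandra, Mistral, Gemma, ...)
model_id = "BSC-LT/salamandra-7b-instruct"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype="auto", device_map="auto")
```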

Guide

✅ Preparation

Requirements per team
  1. Contribute 100 quality prompts to the preferences dataset
  2. Answer 200 questions from the evaluation dataset (BLEND)
  3. Request the 500 USD Cohere API credits (after completing points 1 and 2, mention @mariagrandury in your team’s channel for instructions)
  4. Create a Space in the organization hf.co/somosnlp-hackathon-2026 with the JupyterLab template
  5. Complete the registration form

📚 Dataset

Data is the most important part of developing a model, and we will also give it extra weight when evaluating the projects 👀

  • Generate a dataset for your project:
    • Use the one generated collectively in the Arena as the initial version for your dataset: somosnlp-hackathon-2025/dataset-preferencias-dpo-v0
    • Take advantage of the 500 USD in Cohere API credits that each team receives to filter, improve, and extend it with more prompts and responses designed specifically for your use case (see the Cohere sketch after this list)
    • Keep in mind that, since this is about cultural topics, it’s very important that everything generated synthetically is reviewed by a person (you can use Argilla)
  • Upload the dataset to hf.co/somosnlp-hackathon-2026 and iterate (a push_to_hub sketch follows the naming conventions below)
  • Upload all the notebooks and scripts used to generate and process the dataset to the dataset repo
    • If you prefer to create a GitHub repo with all the code, you can — just don’t forget to include a link in the Dataset Card
  • Fill out the Dataset Card properly
    • “Dataset Card” is the name of the documentation for Hugging Face datasets — it’s the README.md of the dataset repository
    • NOTE: This is taken into account when evaluating the project
    • Include the project motivation and impact in the introduction
    • Detail the generation and processing pipeline: list the libraries used, mention the tests performed, and include links to the code
    • Specify the license: preferably apache-2.0; if not, explain why
    • Evaluate the dataset’s biases, whether it’s balanced, what language varieties or opinions it represents, etc.
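
To make the Cohere step concrete, here is a minimal sketch that generates one extra candidate response per prompt. It assumes the v2 Python SDK (`pip install cohere`), a `COHERE_API_KEY` environment variable, and a `prompt` column in the dataset; the Cohere model name is illustrative.

```python
import os

import cohere
from datasets import load_dataset

co = cohere.ClientV2(api_key=os.environ["COHERE_API_KEY"])

# Start from the collective v0 dataset (the "prompt" column name is an assumption)
prefs = load_dataset("somosnlp-hackathon-2025/dataset-preferencias-dpo-v0", split="train")

def add_candidate(example):
    # Generate an extra candidate response for each prompt (model name is illustrative)
    response = co.chat(
        model="command-r-plus-08-2024",
        messages=[{"role": "user", "content": example["prompt"]}],
    )
    example["candidate"] = response.message.content[0].text
    return example

extended = prefs.map(add_candidate)
```

Remember that every synthetic response should still go through human review (e.g. in Argilla) before it enters the final dataset.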

How to name datasets:

  • The name of the dataset with the (minimum 100) prompts you submitted to the LLM Arena must contain prompt. For example: normas_culturales_colombia_prompts
  • The names of preference datasets must contain the name of the main algorithm they can be used for (dpo or kto). For example: normas_culturales_colombia_dpo
  • If the dataset is multimodal, it must contain image. For example: utensilios_ecuador_images_kto
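
Once the data is processed, pushing it to the organization under a convention-compliant name is a one-liner. A sketch, assuming you are logged in with `huggingface-cli login`; the local file name is illustrative and the repo name reuses the example above:

```python
from datasets import load_dataset

# Load your processed preference data (local JSONL file name is illustrative)
dataset = load_dataset("json", data_files="normas_culturales_colombia_dpo.jsonl")

# The repo name follows the convention above: a DPO preference dataset contains "dpo"
dataset.push_to_hub("somosnlp-hackathon-2026/normas_culturales_colombia_dpo")
```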

⚙️ Model

  1. Create a Space in the organization hf.co/somosnlp-hackathon-2026 with the JupyterLab template
  2. The Hugging Face team will assign an L40S grant to the Space
    • Set the “auto-sleep” time to 5 minutes to ensure responsible use 🌱
  3. Design the training notebook (a DPO training sketch follows this list)
    • Save the resulting model directly to hf.co/somosnlp-hackathon-2026
    • Use the CodeCarbon library to assess the climate impact
  4. Run tests with small models and dataset subsets to verify the code is correct, so you don’t run into bugs after several hours of training.
  5. Launch the training, review the results and iterate
    • You can try e.g. different algorithms or base models
    • You don’t need to create a different repo for each model — if you push to the same repo, the updated model will be saved as a new commit (which you can link to from the Model Card if you want)
  6. Download the dataset processing and model training notebooks, upload them to the model repo (VERY IMPORTANT) and delete the JupyterLab Space
  7. Fill out the Model Card properly
    • “Model Card” is the name of the documentation for Hugging Face models — it’s the README.md of the model repository
    • NOTE: This is taken into account when evaluating the project
    • Recommendation: describe the tests as you do them, as well as the dataset improvement and model training process
    • Include the project motivation and impact in the introduction
    • Detail the training process: list the libraries used, mention the tests performed, and include links to the code
    • Specify the license: preferably apache-2.0; if not, explain why
    • Evaluate the model’s biases
    • Evaluate the environmental impact
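
To make step 3 concrete, here is a minimal DPO training sketch with 🤗 TRL. It assumes a preference dataset with the standard prompt/chosen/rejected columns; the model and dataset ids are illustrative. If CodeCarbon is installed, the underlying 🤗 Trainer can also log emissions automatically (see the Climate impact resources below).

```python
from datasets import load_dataset
from transformers import AutoModelForCausalLM, AutoTokenizer
from trl import DPOConfig, DPOTrainer

# Illustrative ids: swap in your base model and your team's preference dataset
model_id = "BSC-LT/salamandra-7b-instruct"
dataset_id = "somosnlp-hackathon-2026/normas_culturales_colombia_dpo"

model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype="auto")
tokenizer = AutoTokenizer.from_pretrained(model_id)
train_dataset = load_dataset(dataset_id, split="train")

# push_to_hub=True saves checkpoints straight to the org repo as new commits (step 5)
args = DPOConfig(
    output_dir="normas-culturales-colombia-dpo",
    hub_model_id="somosnlp-hackathon-2026/normas-culturales-colombia-dpo",
    push_to_hub=True,
    per_device_train_batch_size=1,
    gradient_accumulation_steps=8,
    num_train_epochs=1,
    logging_steps=10,
)

trainer = DPOTrainer(
    model=model,
    args=args,
    train_dataset=train_dataset,
    processing_class=tokenizer,  # older TRL versions use `tokenizer=` instead
)
trainer.train()
trainer.push_to_hub()  # final model to hf.co/somosnlp-hackathon-2026
```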

Resources

Below we share plenty of resources so you can develop high-quality projects. Resources marked with ⭐ correspond to talks and workshops given during the hackathon and specifically designed to help you in this edition.

📚 Dataset

The Cohere API:

Dataset creation:

Inspiration:

⚙️ Model

Creating the training Space:

  • Docs: JupyterLab on Spaces, where you can run your notebooks as usual. Be careful: Space storage is not persistent by default, so your files can be lost when the Space restarts. Save your notebooks! (A sketch for backing them up to the Hub follows.)
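
Because the Space’s disk is the only copy of your notebooks, it is worth pushing them to the Hub regularly. A minimal sketch with `huggingface_hub`; the repo id and file names are illustrative:

```python
from huggingface_hub import HfApi

api = HfApi()  # assumes you are logged in (huggingface-cli login)
api.upload_file(
    path_or_fileobj="train_dpo.ipynb",         # notebook on the Space's disk
    path_in_repo="notebooks/train_dpo.ipynb",  # destination inside the repo
    repo_id="somosnlp-hackathon-2026/normas-culturales-colombia-dpo",
    repo_type="model",
)
```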

LLM Alignment:

Multimodal models:

LLM Fine-tuning:

Climate impact:

  • To evaluate the carbon footprint of your model training, you can use tools like CodeCarbon (recommended, since it is integrated into 🤗 Transformers) or ML CO2 Impact.
  • We recommend this video for motivation, this article from the HF blog, and the documentation section of 🤗 Transformers that covers this topic.
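
If you are not relying on the 🤗 Trainer integration, CodeCarbon can also be used standalone as a context manager; a minimal sketch (the project name is illustrative):

```python
from codecarbon import EmissionsTracker

# Everything inside the block is tracked; results are also written to emissions.csv
with EmissionsTracker(project_name="hackathon-dpo-training") as tracker:
    ...  # your training here, e.g. trainer.train()

print(f"Estimated emissions: {tracker.final_emissions} kg CO2eq")
```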

📝 Documentation
