Automatic translation of a Hugo website with AI

This is an automatic translation generated by artificial intelligence. It may contain errors.

Technical guide: how we added multilingual support (English) to the OfiLibre website

In this guide we explain, step by step, how we implemented English support on the OfiLibre website, based on the navigator-hugo template.

1. Multilingual configuration in config.toml

The first step was to declare the supported languages in the config.toml file. We added a new [languages] section with the configuration for es (Spanish) and en (English). Here is a simplified example:

# Languages
[languages]

# ---------- Spanish ----------
[languages.es]
weight = 1
languageCode = "es-es"
title = "OfiLibre URJC"
contentDir = "content/spanish"

[languages.es.params]
home = "Inicio"
dateFormat = "2 January 2006"

[languages.es.params.hero]
enable = true
heading = "La OfiLibre en dos minutos"
description = "En la OfiLibre queremos ayudar a la comunidad universitaria a comprender la cultura libre, la publicación libre, los datos abiertos y el software libre. Para ello nos dedicamos a explicar qué son, cómo funcionan, qué características tienen, y qué beneficios pueden producir, de forma que cualquiera pueda decidir cómo le conviene relacionarse con ellos."
button = true
btnText = "Saber más"
btnURL = "ofilibre"
videoURL = "https://tv.urjc.es/iframe/5d022dedd68b14cb308b6ae5"

[languages.es.params.blog]
enable = true
topTitle = "Últimos artículos publicados"
title = "Blog"

[languages.es.params.contacto]
enable = true
topTitle = "Déjanos un mensaje"
title = "Contacto"
subtitle = "La OfiLibre es una iniciativa que trata de contar con la participación de toda la comunidad universitaria para que, entre todos, podamos encontrar nuestro camino en el mundo del conocimiento y la cultura libres."
address = "Despacho 011, planta baja edificio Rectorado. C/ Tulipán s/n, 28933 Móstoles (Madrid)"
email = "ofilibre@urjc.es"
x = "https://x.com/OfiLibreURJC"
mastodon = "https://floss.social/@OfiLibreURJC"
telegram = "https://t.me/ofilibreurjc"
instagram = "https://www.instagram.com/ofilibreurjc/"
gitlab = "https://gitlab.etsit.urjc.es/ofilibre"
mapLatitude = "40.337887"
mapLongitude = "-3.874810"
mapMarker = "/contact/marker.png"

[languages.es.params.cta]
enable = true
bg = "/images/about/call-to-action/call-to-action-bg-2.jpg"
title = "Canales de información"
subtitle = "La OfiLibre mantiene varios canales de información."
btnText = "Contacto"
btnURL = "contacto"

# ---------- English ----------
[languages.en]
weight = 2
languageCode = "en-us"
title = "OfiLibre URJC english"
contentDir = "content/english"

[languages.en.params]
home = "Home"
dateFormat = "January 2, 2006"

[languages.en.params.hero]
enable = true
heading = "OfiLibre in Two Minutes"
description = "At OfiLibre, we aim to help the university community understand free culture, open publishing, open data, and free software. We explain what they are, how they work, what their characteristics are, and what benefits they can bring, so that everyone can decide how they want to engage with them."
button = true
btnText = "Learn more"
btnURL = "ofilibre/"
videoURL = "https://tv.urjc.es/iframe/5d022dedd68b14cb308b6ae5"

[languages.en.params.blog]
enable = true
topTitle = "Latest Published Articles"
title = "Blog"

[languages.en.params.contact]
enable = true
topTitle = "Leave Us a Message"
title = "Contact"
subtitle = "OfiLibre is an initiative that seeks the participation of the entire university community so that, together, we can find our path in the world of free knowledge and culture."
address = "Office 011, ground floor, Rectorate Building. Tulipán Street, 28933 Móstoles (Madrid), Spain"
email = "ofilibre@urjc.es"
x = "https://x.com/OfiLibreURJC"
mastodon = "https://floss.social/@OfiLibreURJC"
telegram = "https://t.me/ofilibreurjc"
instagram = "https://www.instagram.com/ofilibreurjc/"
gitlab = "https://gitlab.etsit.urjc.es/ofilibre"
mapLatitude = "40.337887"
mapLongitude = "-3.874810"
mapMarker = "/contact/marker.png"

[languages.en.params.cta]
enable = true
bg = "/images/about/call-to-action/call-to-action-bg-2.jpg"
title = "Information Channels"
subtitle = "OfiLibre maintains several information channels."
btnText = "Contact"
btnURL = "contacto"

This tells Hugo to generate two versions of the site: the default one in Spanish, built from content/spanish, and one in English, built from content/english.

2. Translated data files

Inside the data/ directory we created a data/en/ folder containing copies of the .yml files, adapted for the new language. These files have exactly the same structure as the originals; only the text is translated.

For example, the resources.yml file below generates the same page that lists the website's resources, but in English.

enable  : true
topTitle: Material produced by OfiLibre and URJC staff
title   : OfiLibre Resources
item    :

  - icon    : tf-ion-monitor
    title   : Presentations
    link    : /pres/
    description : >
      Slides from talks and workshops previously held by OfiLibre.      

  - icon    : tf-ion-ios-world-outline
    title   : Catalog of Open Teaching Materials
    link    : /catalogo/
    description : >
      This catalog includes open access teaching materials produced by URJC staff. You can also add your own published works.      

  - icon    : tf-ion-clipboard
    title   : Sheets
    link    : /fichas/
    description : >
      Sheets on free software and resources. Information about programs and tutorials to help you get started.      

  - icon    : tf-ion-ios-paper-outline
    title   : Guides
    link    : /guias/
    description : >
      Guides and tutorials produced by OfiLibre.      

  - icon    : tf-ion-code
    title   : OfiLibre Code
    link    : https://gitlab.etsit.urjc.es/ofilibre/code
    description : >
      Public code repository related to OfiLibre’s work.      
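Because the translated data files must keep the structure of the originals, it can help to compare their shape automatically. The sketch below is one possible approach, not the method used on the site: it works on data already parsed by a YAML loader, the helper names are hypothetical, and the Spanish excerpt is an invented example.

```python
def structure(node):
    """Reduce parsed YAML data to its shape: keys and nesting, no text."""
    if isinstance(node, dict):
        return {key: structure(value) for key, value in node.items()}
    if isinstance(node, list):
        return [structure(item) for item in node]
    return type(node).__name__  # e.g. "str" or "bool"


def same_structure(original, translated) -> bool:
    """True when both documents have identical keys, nesting and value types."""
    return structure(original) == structure(translated)


# Hypothetical excerpts, in the form a YAML loader would return them.
spanish = {
    "enable": True,
    "title": "Recursos de la OfiLibre",
    "item": [{"icon": "tf-ion-monitor", "title": "Presentaciones", "link": "/pres/"}],
}
english = {
    "enable": True,
    "title": "OfiLibre Resources",
    "item": [{"icon": "tf-ion-monitor", "title": "Presentations", "link": "/pres/"}],
}
broken = {"enable": True, "title": "OfiLibre Resources", "item": []}

print(same_structure(spanish, english))
print(same_structure(spanish, broken))
```

A translated file that drops an item or renames a key fails the comparison, which is usually the kind of mistake that breaks a template silently.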

3. Translation of labels and strings

Hugo allows defining translations for static texts (buttons, menus, titles, etc.) using .toml files in the /i18n folder. We created a file called en.toml with the equivalents for all the texts defined in es.toml.

[home]
other = "Home"

[about]
other = "About"

[activities]
other = "Activities"

[resources]
other = "Resources"

[contact]
other = "Contact"

[search]
other = "Search"

[blog]
other = "Blog"

[graphicdesign]
other = "Graphic Design"
[webdesign]
other = "Web Design"
[webdevelopment]
other = "Web Development"

[partners]
other = "Partners"
[faqs]
other = "FAQ's"
[badges]
other = "Badges"

[category_base]
other = "categories"

# Sheets

[logoOf]
other = "Logo of"

[licenses]
other = "Licenses"

[website]
other = "Website"

[websiteEs]
other = "Website (Spanish)"

[installGuides]
other = "Installation guides"

[tutorials]
other = "Tutorials"

[sourceCode]
other = "Source code"

[additionalLinks]
other = "Additional links"

[myapps]
other = "Available on MyApps URJC"

[lastUpdated]
other = "Last updated"

# Presentations

[description]
other = "Description"

[download]
other = "Download"

[moreInfo]
other = "More information"

# Search

[searchTitle]
other = "Search"

[searchPlaceholder]
other = "Type to search..."

[internalSearchTitle]
other = "OfiLibre web"

[externalSearchTitle]
other = "Internet Archive"

# Contact

[ubicacion]
other = "Location"

[correo]
other = "Email"

[redes]
other = "Social networks"

# Buttons

[learnMore]
other = "Learn more"

With this file in place, Hugo knows which string to display for each key when it renders the English version of the site.

4. Automated Translation Using an LLM

The content translation to English wasn’t done manually. To speed up the process while preserving each file’s technical structure (metadata and body text), we used a language model (LLM) through a Python script that automates the entire workflow. In our case, we used the Kimi K2 LLM with ID moonshotai/kimi-k2-instruct.

The goal is to translate .md files written in Spanish and generate their English versions while maintaining the original Markdown formatting and metadata (such as title, description, tags, etc.).

How Does This System Work?

The script performs three main tasks:

  1. Configuration Loading: Through a config.yml file, we define:

    • The language model access key (API Key)
    • The model to use (e.g., llama3-70b)
    • Source and target directories
    • Which fields to translate (e.g., title, description, or even nested fields like installs.name)
  2. Frontmatter Translation: This is the header section of Markdown files containing key properties like title, categories, or tags. The translation is done field-by-field, ensuring:

    • No content invention
    • Format preservation
    • Proper handling of lists or nested fields when specified
  3. Content Body Translation: The main article text is translated while respecting Markdown formatting, links, lists, and code blocks. Specific rules keep links and proper nouns such as “OfiLibre” untranslated.
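The field-by-field handling of the frontmatter, including nested fields such as installs.name, can be sketched as follows. This is an illustration of the idea rather than the actual script: all function names are hypothetical, the sample metadata is invented, and fake_translate stands in for the call to the language model.

```python
from typing import Any, Callable

Translate = Callable[[str], str]


def translate_value(value: Any, translate: Translate) -> Any:
    """Translate a string, or each string in a list; leave other types alone."""
    if isinstance(value, str):
        return translate(value)
    if isinstance(value, list):
        return [translate_value(item, translate) for item in value]
    return value


def translate_field(node: Any, path: list[str], translate: Translate) -> None:
    """Translate in place the value found at a dotted path such as installs.name."""
    if isinstance(node, list):
        # A list of mappings: apply the same path to every element.
        for item in node:
            translate_field(item, path, translate)
        return
    if not isinstance(node, dict) or path[0] not in node:
        return  # Field absent in this file: nothing to translate.
    key, rest = path[0], path[1:]
    if rest:
        translate_field(node[key], rest, translate)
    else:
        node[key] = translate_value(node[key], translate)


def translate_frontmatter(metadata: dict, fields: list[str], translate: Translate) -> dict:
    """Translate only the configured fields; everything else is left untouched."""
    for field in fields:
        translate_field(metadata, field.split("."), translate)
    return metadata


def fake_translate(text: str) -> str:
    """Stand-in for the LLM call, so the sketch runs offline."""
    return f"[en] {text}"


metadata = {
    "title": "Guía de instalación",
    "tags": ["software libre", "guías"],
    "installs": [{"name": "Instalación en Linux", "url": "https://example.org/linux"}],
}
result = translate_frontmatter(metadata, ["title", "tags", "installs.name"], fake_translate)
print(result)
```

Restricting translation to an explicit list of fields is what keeps values such as URLs, dates, or layout names from being altered by the model.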

Tools Used

  • Python: The scripting language
  • frontmatter: Library for reading/writing Markdown files with metadata
  • langchain and langchain_groq: For easy connection to the LLM hosted on Groq (a platform optimized for fast LLM execution)
  • tqdm: For displaying translation progress
  • yaml: For reading .yml configuration files
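To show how these pieces can fit together, here is a minimal sketch of a single translation call through langchain_groq. It is not the script from the repository: the prompt wording and the function names build_messages and translate_markdown are assumptions made for this example, and running translate_markdown requires the langchain_groq package and a valid Groq API key.

```python
# Assumed prompt; the real script may phrase its rules differently.
SYSTEM_PROMPT = (
    "You are a translator. Translate the user's Markdown from Spanish to English. "
    "Preserve Markdown formatting, links, lists and code blocks exactly. "
    "Do not translate URLs or proper nouns such as OfiLibre. "
    "Do not add or invent content. Return only the translation."
)


def build_messages(markdown_text: str) -> list[tuple[str, str]]:
    """Build the (role, content) pairs sent to the chat model."""
    return [("system", SYSTEM_PROMPT), ("human", markdown_text)]


def translate_markdown(
    markdown_text: str,
    api_key: str,
    model: str = "moonshotai/kimi-k2-instruct",
) -> str:
    """Send one Markdown text to the model hosted on Groq and return its reply."""
    # Imported here so build_messages can be used without the package installed.
    from langchain_groq import ChatGroq

    llm = ChatGroq(model=model, api_key=api_key, temperature=0)
    response = llm.invoke(build_messages(markdown_text))
    return response.content
```

A temperature of 0 is chosen here to make the output as repeatable as possible, which suits translation better than creative variation.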

Usage Instructions

Configure the config.yml file, for example:

GROQ_API_KEY: "your_api_key"
MODEL_NAME: "moonshotai/kimi-k2-instruct"
SOURCE_DIR: "content/spanish/guides"
TARGET_DIR: "content/english/guides"
fields:
  - title
  - description
  - categories
  - tags

Run the script with Python:

python3 translator.py

The script will:

  • Read all .md files from the source directory
  • Translate specified fields in the frontmatter
  • Translate content while preserving formatting
  • Save translated files to the target directory
  • Skip any file whose frontmatter includes a translate: false field
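The file-selection part of these steps can be sketched with the standard library alone. This is a simplified illustration, not the repository's code: the names is_opted_out and plan_translations are hypothetical, the frontmatter is inspected with a regular expression instead of the frontmatter library, and searching subdirectories recursively is an assumption.

```python
import re
from pathlib import Path

# A leading block delimited by --- lines, as used for YAML frontmatter.
FRONTMATTER = re.compile(r"\A---\s*\n(.*?)\n---\s*(?:\n|\Z)", re.DOTALL)
# A top-level "translate: false" line inside that block.
SKIP_FLAG = re.compile(r"^translate:\s*false\s*$", re.MULTILINE | re.IGNORECASE)


def is_opted_out(text: str) -> bool:
    """True when the file's frontmatter contains translate: false."""
    match = FRONTMATTER.match(text)
    return bool(match and SKIP_FLAG.search(match.group(1)))


def plan_translations(source_dir: str, target_dir: str) -> list[tuple[Path, Path]]:
    """Pair each translatable .md file with its destination under target_dir."""
    source, target = Path(source_dir), Path(target_dir)
    jobs = []
    for path in sorted(source.rglob("*.md")):
        if is_opted_out(path.read_text(encoding="utf-8")):
            continue  # The author asked for this file to be left alone.
        jobs.append((path, target / path.relative_to(source)))
    return jobs
```

Each (source, destination) pair would then be translated and written out, keeping the same relative path so that Hugo can match the Spanish and English versions of a page.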

Results

This process allowed us to translate dozens of files with high fidelity, minimizing human errors while maintaining the website’s technical style. Each translated file includes an initial note stating it’s an automated translation, so visitors are aware.

For details of the code, visit this repository, in particular the /automatizacion/traduccion-llm/ directory.