Validating Image Links and Shortcodes in Hugo Markdown Files
Building a reliable content pipeline for Hugo can be challenging, especially when it comes to maintaining valid image links, internal page references, and shortcode usage across a growing site. Missing images and broken links can negatively impact user experience and SEO, but existing tools don’t fully address these validation needs.
This post contains affiliate links, which means I may receive a small commission, at no extra cost to you, if you make a purchase using these links.
That’s why we developed a custom GitHub Action to automate the process of validating Hugo Markdown files.
Common Challenges When Managing Hugo Sites
When publishing content with Hugo, creators often encounter the following issues:
- Missing or mislinked images: Especially when using content bundles (index.md) with featured_image front matter.
- Broken internal hyperlinks: Links between pages may reference bundles that don’t exist.
- Shortcode validation: A missing or misspelled shortcode can cause build failures or unexpected content rendering.
If you’ve ever struggled with these, you’re not alone. After searching for solutions, we found few tools capable of handling these problems directly—so we built our own.
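To make the image problem concrete: in a content bundle, an image referenced from index.md usually sits beside that file, while a root-relative path such as /images/logo.png is typically served from static/. The sketch below shows one way a validator might decide where to look. Both function names are hypothetical, and the static/ lookup assumes a default Hugo project layout rather than anything specific to our site.

```python
import os


def candidate_image_paths(base_path, md_path, ref):
    """Return filesystem locations where an image reference might live.

    Hypothetical helper: assumes a default Hugo layout in which page
    resources sit beside index.md and site-wide files sit in static/.
    """
    if ref.startswith("/"):
        # Root-relative URL: Hugo serves files in static/ from the site root.
        return [os.path.join(base_path, "static", ref.lstrip("/"))]
    # Relative reference: look inside the bundle directory that holds the page.
    bundle_dir = os.path.dirname(md_path)
    return [os.path.normpath(os.path.join(bundle_dir, ref))]


def image_exists(base_path, md_path, ref):
    """Return True if any candidate location for the reference exists on disk."""
    return any(os.path.exists(p) for p in candidate_image_paths(base_path, md_path, ref))
```

Returning a list of candidates keeps the lookup rules in one place, so extra locations (a theme's static/ directory, for example) could be added later without touching the caller.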
Automating Content Validation with GitHub Actions
Our solution uses a custom Python script that does the following:
- Image validation: Ensures that any images referenced in front matter (featured_image, recipe.image) or the Markdown body actually exist.
- Internal page validation: Detects broken internal hyperlinks (e.g., [Post 2](/post-2/)) and verifies the corresponding content bundle exists (/content/post-2/index.md).
- Shortcode validation: Confirms that every shortcode used in the Markdown has a corresponding .html file in the layouts/shortcodes/ directory.
- Verbose output: When enabled, prints details for every file reviewed, including found images, shortcodes, and their status.
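The front matter part of the image check can be sketched in two small steps: separate the front matter from the body, then look up keys such as featured_image or recipe.image in the parsed data. The helpers below are hypothetical and use only the standard library. They assume YAML front matter delimited by --- lines (TOML front matter delimited by +++ is not handled), and the front matter string they return would still need to be parsed, for example with PyYAML's yaml.safe_load.

```python
import re

# YAML front matter: a block delimited by '---' lines at the very top of the file.
FRONT_MATTER = re.compile(r"\A---[ \t]*\n(.*?)\n---[ \t]*(?:\n|\Z)", re.DOTALL)


def split_front_matter(text):
    """Return (front_matter, body); front_matter is '' when none is present."""
    match = FRONT_MATTER.match(text)
    if match is None:
        return "", text
    return match.group(1), text[match.end():]


def lookup(data, dotted_key):
    """Follow a dotted key such as 'recipe.image' through nested dicts.

    Returns None when any part of the path is missing.
    """
    for part in dotted_key.split("."):
        if not isinstance(data, dict) or part not in data:
            return None
        data = data[part]
    return data
```

With the front matter parsed into a dict, lookup(parsed, "recipe.image") yields the image path to check, or None when the page does not set one.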
How to Set Up the GitHub Action
To get started, follow these steps:
1. Add the Script to Your Repository
Place the following Python script in your project’s scripts/ directory, saved as validate_hugo_content.py (the file name the workflow in step 2 runs):
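import argparse
import os
import re
import sys
import yaml  # PyYAML (installed by the workflow); reserved for front matter checks

def validate_content(base_path, verbose=False):
    """Validate every Markdown file under content/; return the number of problems."""
    content_dir = os.path.join(base_path, "content")
    shortcodes_dir = os.path.join(base_path, "layouts", "shortcodes")
    # A site without custom shortcodes may not have this directory at all.
    names = os.listdir(shortcodes_dir) if os.path.isdir(shortcodes_dir) else []
    available_shortcodes = {os.path.splitext(f)[0] for f in names if f.endswith(".html")}
    errors = 0
    for root, _, files in os.walk(content_dir):
        for file in files:
            if file.endswith(".md"):
                md_path = os.path.join(root, file)
                errors += validate_md_file(md_path, available_shortcodes, base_path, verbose)
    return errors

def validate_md_file(md_path, shortcodes, base_path, verbose):
    """Run all checks on one file; return the number of problems."""
    with open(md_path, "r", encoding="utf-8") as file:
        content = file.read()
    if verbose:
        print(f"Checking {md_path}")
    errors = validate_images(md_path, content, base_path, verbose)
    errors += validate_shortcodes(md_path, content, shortcodes, verbose)
    errors += validate_internal_links(md_path, content, base_path, verbose)
    return errors

def validate_images(md_path, content, base_path, verbose):
    """Check that local images referenced in the Markdown body exist."""
    image_pattern = re.compile(r'!\[.*?\]\((.*?)\)')
    errors = 0
    for match in image_pattern.findall(content):
        if match.startswith("http"):
            continue  # Remote images are not checked.
        # Paths are resolved against the project root.
        image_path = os.path.normpath(os.path.join(base_path, match.strip("/")))
        if not os.path.exists(image_path):
            print(f"Missing relative image '{match}' in {md_path}")
            errors += 1
        elif verbose:
            print(f"  Found image '{match}'")
    return errors

def validate_shortcodes(md_path, content, shortcodes, verbose):
    """Check that each shortcode used has a template in layouts/shortcodes/."""
    # Matches opening and closing tags: {{< name ... >}} and {{< /name >}}.
    shortcode_pattern = re.compile(r'\{\{<\s*/?\s*([\w-]+).*?>\}\}')
    errors = 0
    # Deduplicate so a paired shortcode is reported once, not twice.
    for name in sorted(set(shortcode_pattern.findall(content))):
        if name not in shortcodes:
            print(f"Missing shortcode '{name}' in {md_path}")
            errors += 1
        elif verbose:
            print(f"  Found shortcode '{name}'")
    return errors

def validate_internal_links(md_path, content, base_path, verbose):
    """Check that each root-relative link points at an existing content bundle."""
    # (?<!!) skips image syntax, which would otherwise match the link pattern too.
    link_pattern = re.compile(r'(?<!!)\[.*?\]\((/.*?)\)')
    errors = 0
    for match in link_pattern.findall(content):
        # Drop any ?query or #fragment before mapping the URL to a bundle path.
        url_path = re.split(r'[?#]', match, maxsplit=1)[0]
        target_path = os.path.join(base_path, "content", url_path.strip("/"), "index.md")
        if not os.path.exists(target_path):
            print(f"Missing internal bundle '{match}' referenced in {md_path}")
            errors += 1
        elif verbose:
            print(f"  Found internal bundle '{match}'")
    return errors

if __name__ == "__main__":
    # The script lives in scripts/, so the project root is one directory up.
    default_base = os.path.dirname(os.path.dirname(os.path.abspath(__file__)))
    parser = argparse.ArgumentParser(description="Validate Hugo content for images, shortcodes, and internal links.")
    parser.add_argument("--base-path", default=default_base, help="Base path of the Hugo project.")
    parser.add_argument("--verbose", action="store_true", help="Enable verbose output.")
    args = parser.parse_args()
    error_count = validate_content(args.base_path, args.verbose)
    # Exit non-zero so the CI step fails when problems are found.
    sys.exit(1 if error_count else 0)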
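One caveat with checking shortcodes only against layouts/shortcodes/: Hugo ships some shortcodes of its own, and those have no .html file in your project, so a plain directory lookup would flag them as missing. A possible refinement, sketched below, is to subtract a set of known embedded names before reporting. The function name and the EMBEDDED_SHORTCODES set are assumptions for illustration. The list is partial and varies by Hugo version, and shortcodes provided by a theme would need similar handling.

```python
import re

# Assumption: a partial, non-exhaustive list of shortcodes that ship with Hugo.
# The exact set varies by Hugo version, so adjust it for your site.
EMBEDDED_SHORTCODES = {
    "figure", "gist", "highlight", "instagram",
    "param", "ref", "relref", "vimeo", "youtube",
}

# Matches the start of {{< name >}}, {{< /name >}}, {{% name %}} and {{% /name %}}.
SHORTCODE_TAG = re.compile(r"\{\{[<%]\s*/?\s*([\w-]+)")


def unknown_shortcodes(content, custom_shortcodes):
    """Return the sorted shortcode names that are neither custom nor embedded."""
    used = set(SHORTCODE_TAG.findall(content))
    return sorted(used - set(custom_shortcodes) - EMBEDDED_SHORTCODES)
```

For a page that uses youtube, ref, and a custom callout shortcode, unknown_shortcodes(content, ["callout"]) would return an empty list, while a misspelled name would be returned for reporting.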
2. Create a GitHub Action Workflow
Create .github/workflows/validate-content.yml:
name: Validate Hugo Content

on:
  push:
    branches:
      - main
  pull_request:

jobs:
  validate:
    runs-on: ubuntu-latest
    steps:
      - name: Check out code
        uses: actions/checkout@v3
      - name: Set up Python
        uses: actions/setup-python@v4
        with:
          python-version: '3.9'
      - name: Install dependencies
        run: pip install pyyaml
      - name: Validate Hugo content
        # Point the script at the repository root, where content/ and layouts/ live.
        run: python scripts/validate_hugo_content.py --base-path . --verbose
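A workflow step fails only when its command exits with a non-zero status, so the validator should end with something like sys.exit(1) when it finds problems. Optionally, problems can also be printed as GitHub Actions error annotations so they show up against the affected file in the pull request. The helper below is a hypothetical sketch of that output format, not part of the script above. It escapes only the message text, which is enough for simple file paths.

```python
def format_error_annotation(path, message, line=None):
    """Build a GitHub Actions '::error' workflow command for one problem."""
    properties = f"file={path}"
    if line is not None:
        properties += f",line={line}"
    # Workflow commands are single-line, so percent-encode characters
    # that would otherwise break the command.
    safe_message = (
        message.replace("%", "%25").replace("\r", "%0D").replace("\n", "%0A")
    )
    return f"::error {properties}::{safe_message}"


if __name__ == "__main__":
    # Printing the command to stdout is what makes the runner pick it up.
    print(format_error_annotation("content/post-1/index.md", "Missing image 'cover.jpg'"))
```

Plain print output still appears in the job log either way, so annotations are an addition to the existing messages rather than a replacement.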
Benefits of Automating Content Validation
Running this workflow automatically on pull requests and pushes helps:
- Prevent broken links and missing images from slipping into production.
- Catch shortcode errors before they cause rendering issues.
- Save time on manual reviews for content issues.
By leveraging GitHub Actions, your Hugo site becomes more resilient, scalable, and easier to maintain.
Conclusion
If you’re tired of broken links and content errors slowing down your Hugo development, this validation script and GitHub Action can save you time and frustration. With automated checks for images, internal links, and shortcodes, your content pipeline becomes more reliable—and your audience gets a better experience.
Looking for more Hugo or automation tips? Follow our GitHub and stay tuned for more solutions.