Extension Use Cases

This page provides a catalog of simple examples to showcase how you can enhance the capabilities of Antora through the use of extensions. Each section introduces a different use case and presents the extension code you can build on as a starting point.

You can also reference official extension projects provided by the Antora project to study more complex examples.

Set global AsciiDoc attributes

If you want to define global AsciiDoc attributes that have dynamic values, you can do so using an extension. Antora builds the site-wide AsciiDoc config object (siteAsciiDocConfig) from the playbook, and that object contains the global AsciiDoc attributes. An extension can listen for the beforeProcess event and add attributes to this map.

Example 1. set-global-asciidoc-attributes-extension.js

module.exports.register = function () {
  this.on('beforeProcess', ({ siteAsciiDocConfig }) => {
    const buildDate = new Date().toISOString()
    siteAsciiDocConfig.attributes['build-date'] = buildDate
  })
}

The extension could read these values from a file or environment variables as well.

If you need to set AsciiDoc attributes that are scoped to a component version, then you’ll need to listen for the contentClassified event instead. From there, you can access the AsciiDoc attributes from the asciidoc property on a component version object. You can look up a component version by name and version using the getComponentVersion method on the content catalog object. Alternately, you can access component versions from the versions property on each component returned by the getComponents method on the content catalog object.

Print AsciiDoc attributes

If you’re troubleshooting your site, you can use an extension to generate a report of AsciiDoc attributes at the site level and those per component version. When making this report, you have a choice of whether you want to show the AsciiDoc attributes as they would be available to a page (aka compiled) or as defined (aka uncompiled).

You can use the following extension to print all the AsciiDoc attributes compiled for each component version. The extension also prints all the attributes compiled from the playbook, though keep in mind these are integrated into the attributes for each component version.

Example 2. print-compiled-asciidoc-attributes-extension.js

module.exports.register = function () {
  this.once('contentClassified', ({ siteAsciiDocConfig, contentCatalog }) => {
    console.log('site-wide attributes (compiled)')
    console.log(siteAsciiDocConfig.attributes)
    contentCatalog.getComponents().forEach((component) => {
      component.versions.forEach((componentVersion) => {
        console.log(`${componentVersion.version}@${componentVersion.name} attributes (compiled)`)
        if (componentVersion.asciidoc === siteAsciiDocConfig) {
          console.log('same as site-wide attributes')
        } else {
          console.log(componentVersion.asciidoc.attributes)
        }
      })
    })
  })
}

You can use the following extension to print all the AsciiDoc attributes as defined in the playbook and in the antora.yml file for each component version (by origin).

Example 3. print-defined-asciidoc-attributes-extension.js

module.exports.register = function () {
  this.once('contentClassified', ({ playbook, contentCatalog }) => {
    console.log('site-wide attributes (as defined in playbook)')
    console.log(playbook.asciidoc.attributes)
    contentCatalog.getComponents().forEach((component) => {
      component.versions.forEach((componentVersion) => {
        getUniqueOrigins(contentCatalog, componentVersion).forEach((origin) => {
          console.log(`${componentVersion.version}@${componentVersion.name} attributes (as defined in antora.yml)`)
          console.log(origin.descriptor.asciidoc?.attributes || {})
        })
      })
    })
  })
}

function getUniqueOrigins (contentCatalog, componentVersion) {
  return contentCatalog
    .findBy({ component: componentVersion.name, version: componentVersion.version })
    .reduce((origins, file) => {
      const origin = file.src.origin
      if (origin && !origins.includes(origin)) origins.push(origin)
      return origins
    }, [])
}

You may find it useful to make use of these collections of AsciiDoc attributes when writing other extensions.

Exclude private content sources

If some contributors or CI jobs don’t have permission to the private content sources in the playbook, you can use an extension to filter them out instead of having to modify the playbook file.

This extension runs during the playbookBuilt event. It retrieves the playbook, iterates over the content sources, and removes any content source that it detects as private and thus requires authentication. We’ll rely on a convention to communicate to the extension which content source is private. That convention is to use an SSH URL that starts with git@. Antora automatically converts SSH URLs to HTTP URLs, so the use of this syntax merely serves as a hint to users and extensions that the URL is private and is going to request authentication.

Example 4. exclude-private-content-sources-extension.js

module.exports.register = function () {
  this.on('playbookBuilt', function ({ playbook }) {
    playbook.content.sources = playbook.content.sources.filter(({ url }) => !url.startsWith('git@'))
    this.updateVariables({ playbook })
  })
}

This extension works because the playbook is mutable until the end of this event, at which point Antora freezes it. The call to this.updateVariables to replace the playbook variable in the generator context is not required, but is used here to express intent and to future-proof the extension.

Support remote repositories with git lfs enabled

Antora can only work with the worktree of a local repository if the repository has git lfs enabled. In other words, the git lfs objects have to already be inflated by the time Antora uses the worktree. Otherwise, Antora will publish the pointer file instead of the object referenced by that pointer.

We can use an extension to work around this limitation. The following extension looks for all content sources in the playbook marked with the lfs key set to true. For example:

content:
  sources:
  - url: https://githost/example/repo-with-lfs.git
    lfs: true

The extension then uses the native git command to clone the marked repositories. It then replaces the reference to the remote repository in the playbook with the local one.

Code: gitlab.com/opendevise/oss/antora-binary-files-extension-suite/-/blob/main/packages/git-lfs-extension/lib/index.js

Note that this extension assumes that each lfs content source is configured with a single branch.

Unpublish flagged pages

If you don’t want a page to ever be published, you can prefix the filename with an underscore (e.g., _hidden.adoc). However, if you only want the page to be unpublished conditionally, then you need to reach for an extension.

When using this extension, any page that sets the page-unpublish page attribute will not be published. For example:

= Secret Page
:page-unpublish:

This page will not be published.

You can set the page-unpublish page attribute based on the presence (or absence) of another AsciiDoc attribute, perhaps one set in the playbook or as a CLI option. For example:

= Secret Page
ifndef::include-secret[:page-unpublish:]

This page will not be published.

This extension runs during the documentsConverted event. This is the earliest event that provides access to the AsciiDoc metadata on the virtual file. The extension iterates over all publishable pages in the content catalog and unpublishes any page that sets the page-unpublish attribute. To unpublish the page, the extension removes the out property on the virtual file. If the out property is absent, the page will not be published.

Example 5. page-unpublish-tag-extension.js

module.exports.register = function () {
  this.on('documentsConverted', ({ contentCatalog }) => {
    contentCatalog.getPages((page) => {
      if (page.out && page.asciidoc?.attributes['page-unpublish'] != null) delete page.out
    })
  })
}

Keep in mind that there may be references to the unpublished page. While they will be resolved by Antora, the target of the reference will not be available, which will result in a 404 response from the web server.

For more fine-grained control over when a page is unpublished, you could write an extension that replaces the convertDocument or convertDocuments functions. Doing so would allow you to unpublish the page before references to it from other pages are resolved so that they appear as warnings.

Report unlisted pages

After you create a new page, it’s easy to forget to add it to the navigation so that the reader can access it. We can use an extension to identify pages which are not in the navigation and report them using the logger.

This extension runs during the navigationBuilt event. It iterates over each component version, retrieves a flattened list of its internal navigation entries, then checks to see if there are any pages that are not in that list, comparing pages by URL. If it finds any such pages, it creates a report of them, optionally adding them to the navigation.

Example 6. unlisted-pages-extension.js

module.exports.register = function ({ config }) {
  const { addToNavigation, unlistedPagesHeading = 'Unlisted Pages' } = config
  const logger = this.getLogger('unlisted-pages-extension')
  this
    .on('navigationBuilt', ({ contentCatalog }) => {
      contentCatalog.getComponents().forEach(({ versions }) => {
        versions.forEach(({ name: component, version, navigation: nav, url: defaultUrl }) => {
          const navEntriesByUrl = getNavEntriesByUrl(nav)
          const unlistedPages = contentCatalog
            .findBy({ component, version, family: 'page' })
            .filter((page) => page.out)
            .reduce((collector, page) => {
              if ((page.pub.url in navEntriesByUrl) || page.pub.url === defaultUrl) return collector
              logger.warn({ file: page.src, source: page.src.origin }, 'detected unlisted page')
              return collector.concat(page)
            }, [])
          if (unlistedPages.length && addToNavigation) {
            nav.push({
              content: unlistedPagesHeading,
              items: unlistedPages.map((page) => {
                const title = 'navtitle' in page.asciidoc
                  ? page.asciidoc.navtitle
                  : (page.src.module === 'ROOT' ? '' : page.src.module + ':') + page.src.relative
                return { content: title, url: page.pub.url, urlType: 'internal' }
              }),
              root: true,
            })
          }
        })
      })
    })
}

function getNavEntriesByUrl (items = [], accum = {}) {
  items.forEach((item) => {
    if (item.urlType === 'internal') accum[item.url.split('#')[0]] = item
    getNavEntriesByUrl(item.items, accum)
  })
  return accum
}

You can read more about this extension and how to configure it in the Extension Tutorial.
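
To see what the flattened list of navigation entries looks like, the helper from this extension can be run against a small navigation tree. The tree below is hypothetical sample data, not output from Antora.

```javascript
function getNavEntriesByUrl (items = [], accum = {}) {
  items.forEach((item) => {
    // only internal entries point at pages; the fragment is dropped so entries are keyed by page URL
    if (item.urlType === 'internal') accum[item.url.split('#')[0]] = item
    getNavEntriesByUrl(item.items, accum)
  })
  return accum
}

// hypothetical navigation tree for one component version
const nav = [
  {
    content: 'Guide',
    items: [
      { content: 'Install', url: '/docs/1.0/install.html', urlType: 'internal' },
      { content: 'Upgrade notes', url: '/docs/1.0/install.html#upgrade', urlType: 'internal' },
      { content: 'Project site', url: 'https://example.org', urlType: 'external' },
    ],
  },
]

const listedUrls = Object.keys(getNavEntriesByUrl(nav))
console.log(listedUrls)
```

The two internal entries collapse into the single key /docs/1.0/install.html, while the text-only entry and the external link are ignored. A page counts as listed if its URL is one of these keys.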

Unpublish unlisted pages

Instead of reporting unlisted pages, you could remove those pages from publishing. This is one way you can use the navigation to drive which pages are published.

Example 7. unpublish-unlisted-pages-extension.js

module.exports.register = function () {
  this.on('navigationBuilt', ({ contentCatalog }) => {
    contentCatalog.getComponents().forEach(({ versions }) => {
      versions.forEach(({ name: component, version, navigation: nav, url: defaultUrl }) => {
        const navEntriesByUrl = getNavEntriesByUrl(nav)
        const unlistedPages = contentCatalog
          .findBy({ component, version, family: 'page' })
          .filter((page) => page.out)
          .reduce((collector, page) => {
            if ((page.pub.url in navEntriesByUrl) || page.pub.url === defaultUrl) return collector
            return collector.concat(page)
          }, [])
        if (unlistedPages.length) unlistedPages.forEach((page) => delete page.out)
      })
    })
  })
}

function getNavEntriesByUrl (items = [], accum = {}) {
  items.forEach((item) => {
    if (item.urlType === 'internal') accum[item.url.split('#')[0]] = item
    getNavEntriesByUrl(item.items, accum)
  })
  return accum
}

Removing the out property from the page prevents the page from being published, but the page is still referenceable using an include directive. Alternately, you could choose to remove the page entirely from the content catalog.

List discovered component versions

When you’re setting up your playbook, you may find that Antora is not discovering some of your component versions. Using an extension, it’s possible to list the component versions Antora discovers during content aggregation along with the content sources it took them from.

Example 8. discovered-component-versions-extension.js

module.exports.register = function () {
  this.once('contentAggregated', ({ contentAggregate }) => {
    console.log('Discovered the following component versions')
    contentAggregate.forEach((bucket) => {
      const sources = bucket.origins.map(({ url, refname }) => ({ url, refname }))
      console.log({ name: bucket.name, version: bucket.version, files: bucket.files.length, sources })
    })
  })
}

If an entry is missing, then you know you may need to tune the content source definitions in your playbook.

For more information, you can print the whole bucket entry.

Generate report of all pages

You can generate additional pages using an Antora extension. This offers a way to generate report pages that summarize information about the site.

In this example, we’ll generate a page that lists all the other pages in the same component version. This extension listens for the documentsConverted event, which is emitted once all the AsciiDoc-based pages have been converted to (embedded) HTML, but before the HTML layout has been applied. The reason for using this event is twofold. First, it provides access to the page title of each page. Second, the page layout will be applied to the newly generated page.

Example 9. all-pages-report-extension.js

module.exports.register = function () {
  const relativize = this.require('@antora/asciidoc-loader/util/compute-relative-url-path')
  this.once('documentsConverted', ({ contentCatalog }) => {
    contentCatalog.getComponents().forEach(({ versions }) => {
      versions.forEach(({ name: component, version, url }) => {
        const pageList = ['<ul>']
        const pages = contentCatalog
          .findBy({ component, version, family: 'page' })
          .sort((a, b) => a.title.localeCompare(b.title))
        for (const page of pages) {
          pageList.push(`<li><a href="${relativize(url, page.pub.url)}">${page.title}</a></li>`)
        }
        pageList.push('</ul>')
        const pageListFile = contentCatalog.addFile({
          contents: Buffer.from(pageList.join('\n') + '\n'),
          src: { component, version, module: 'ROOT', family: 'page', relative: 'all-pages.html' },
        })
        pageListFile.asciidoc = { doctitle: 'All Pages' }
        // use the following assignment instead to use a separate layout (e.g., report.hbs)
        //pageListFile.asciidoc = { doctitle: 'All Pages', attributes: { 'page-layout': 'report' } }
      })
    })
  })
}

The key step of this extension is the call to contentCatalog.addFile. This call adds a new file to the content catalog, in this case a page. When generating the list of links, we use the relativize function from the AsciiDoc Loader to compute the relative URL from the start page of the component version to the target page, emulating the behavior of the xref macro in AsciiDoc. The resulting report is written to the file all-pages.html at the root of the component version (adjacent to the start page).
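
To illustrate what computing a relative URL between two pages means, here is a rough approximation built on Node.js path utilities. It is not the AsciiDoc Loader function and only handles URLs that end in a file name such as .html. The function name relativizeUrl and the sample URLs are hypothetical.

```javascript
const path = require('node:path')

// approximation for illustration only; the extension should keep using the AsciiDoc Loader function
function relativizeUrl (from, to) {
  // a relative URL is resolved against the directory that contains the "from" page
  return path.posix.relative(path.posix.dirname(from), to)
}

console.log(relativizeUrl('/docs/1.0/index.html', '/docs/1.0/install/setup.html'))
console.log(relativizeUrl('/docs/1.0/install/setup.html', '/docs/1.0/index.html'))
```

The first call yields install/setup.html and the second yields ../index.html.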

Redirect from component to latest version

If a component only has non-empty versions (e.g., 1.0, 2.0), Antora will not create a redirect from the component URL (e.g., /component-name/) to the latest version of that component (e.g., /component-name/2.0/ or /component-name/latest). The reasoning is that doing so would be an overreach that assumes too much about what URLs the site should respond to.

One possible solution is to use a page alias so the index page (or start page) for the latest version claims the index page for the empty version.

:page-aliases: _@component-name::index.adoc

However, this page alias would have to be moved each time the latest version changes to avoid a conflict. Fortunately, this need can be addressed using an Antora extension instead. The extension can find all components without an empty (i.e., versionless) component version and add a page alias that routes from the index page of that component, specifically the versionless component version, to the start page of the latest version.

Example 10. alias-component-to-latest-version-extension.js

'use strict'

module.exports.register = function () {
  this.once('contentClassified', ({ contentCatalog }) => {
    contentCatalog.getComponents().forEach((component) => {
      if (component.versions.find((it) => !it.version)) return
      const rel = contentCatalog.resolvePage('index.adoc', { component: component.name })
      if (!rel) return
      contentCatalog.addFile({
        src: { component: component.name, version: '', module: 'ROOT', family: 'alias', relative: 'index.adoc' },
        rel,
      })
    })
  })
}

If the component only has non-empty versions, and the start page can be resolved for the component, an alias is created from the component to that start page. The redirect Antora creates is based on which redirect facility is used. This extension is effectively the same as the explicit page alias shown above.

Resolve attribute references in attachments

Files in the attachment family are passed directly through to the output site. Antora does not resolve AsciiDoc attribute references in attachment files. (Asciidoctor, on the other hand, will resolve AsciiDoc attribute references in the attachment’s contents only if the attachment is included in an AsciiDoc page where the attribute substitution is enabled.) You can use an Antora extension to have Antora resolve attribute references in the attachment file before that file is published.

This extension runs during the contentClassified event, which is when attachment files are first identified and classified. It iterates over all attachments and resolves any references to attributes scoped to that attachment’s component version. If any changes were made to the contents of the file, it replaces the contents on the virtual file with the updated value.

Example 11. resolve-attributes-references-in-attachments-extension.js

module.exports.register = function () {
  this.on('contentClassified', ({ contentCatalog }) => {
    const componentVersionTable = contentCatalog.getComponents().reduce((componentMap, component) => {
      componentMap[component.name] = component.versions.reduce((versionMap, componentVersion) => {
        versionMap[componentVersion.version] = componentVersion
        return versionMap
      }, {})
      return componentMap
    }, {})
    contentCatalog.findBy({ family: 'attachment' }).forEach((attachment) => {
      const componentVersion = componentVersionTable[attachment.src.component][attachment.src.version]
      let attributes = componentVersion.asciidoc?.attributes
      if (!attributes) return
      // strip the soft-set modifier (trailing @) from string values; other value types are kept as is
      attributes = Object.entries(attributes).reduce((accum, [name, val]) => {
        accum[name] = typeof val === 'string' && val.endsWith('@') ? val.slice(0, -1) : val
        return accum
      }, Object.create(null))
      let modified
      const result = attachment.contents.toString().replace(/\{([\p{Alpha}\d_][\p{Alpha}\d_-]*)\}/gu, (match, name) => {
        const value = attributes[name]
        // leave the reference as is if the attribute is not defined or is unset (null or false)
        if (value == null || value === false) return match
        modified = true
        return String(value)
      })
      if (modified) attachment.contents = Buffer.from(result)
    })
  })
}

This extension is only known to work with text-based attachments. You may need to modify this extension for it to work with binary files.
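
One conservative modification is to restrict the extension to known text formats so that binary attachments pass through untouched. The following sketch uses a hypothetical allowlist of file extensions and a hypothetical helper named isTextAttachment; adjust the list to the formats used on your site.

```javascript
// hypothetical allowlist; adjust it to the text formats used in your attachments
const TEXT_EXTENSIONS = new Set(['.txt', '.yml', '.yaml', '.json', '.xml', '.properties', '.sh'])

// returns true if the file extension of the attachment is on the allowlist
function isTextAttachment (file) {
  return TEXT_EXTENSIONS.has((file.src.extname || '').toLowerCase())
}

console.log(isTextAttachment({ src: { extname: '.yml' } }))
console.log(isTextAttachment({ src: { extname: '.pdf' } }))
```

In the extension, the helper could be applied as a filter before the loop, such as contentCatalog.findBy({ family: 'attachment' }).filter(isTextAttachment).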

Convert office attachments to PDF

Much like AsciiDoc files (.adoc) are converted to HTML (.html) by Antora, you can do the same with attachments. This extension runs during the contentClassified event, which is when attachment files are first identified and classified. It iterates over all attachments in an office format (i.e., .docx, .odt, .fodt) and uses the libreoffice command (LibreOffice in server mode) on Linux or docto.exe command (Microsoft Office via docto) on Windows to convert each file to PDF.

Code: gitlab.com/opendevise/oss/antora-binary-files-extension-suite/-/blob/main/packages/office-to-pdf-extension/lib/index.js

By converting the files and updating the metadata, it’s possible to reference the source document using the xref macro. That reference will automatically translate to a link to the PDF in the generated site.

Warehouse large files

If your site has a lot of large attachment files, this can place strain on the build and publishing process for the site. To alleviate this problem, you can siphon off attachments to a warehouse and reroute incoming requests to those files. By doing so, the amount of memory required for building and publishing the site is dramatically reduced since what remains is mostly just HTML pages and images.

The following extension reuses the playbook repository as the attachment warehouse. All attachment files are stored as git lfs objects in the files branch. The secondary benefit of using this approach is that only files that have been modified need to be pushed (git automatically tracks and manages changes).

The extension assumes you are running Antora on GitLab CI and GitLab Pages. Changes will be needed if you are using a different CI environment. You will also need to set up an orphan files branch on the playbook repository before using the extension for the first time.

Code: gitlab.com/opendevise/oss/antora-binary-files-extension-suite/-/blob/main/packages/object-storage-extension/lib/index.js

The URL of each attachment remains unchanged. The extension relies on GitLab Pages’ redirect engine to reroute the request to the raw attachment file in the git repository.

This extension can be used in concert with the office-to-pdf-extension above.

Export content to file

If you are integrating with a search or AI engine, you may want to extract the plain text of the pages to a file along with the page URL, title, and navigation path. You can use the following extension to do that as part of the site build.

Example 12. export-content-extension.js

const { parse: parseHTML } = require('node-html-parser')

/**
 * An Antora extension that exports the content of publishable pages in plain text to a JSON
 * file along with the page URL, title, and navigation path.
 */
module.exports.register = function () {
  this.once('navigationBuilt', ({ playbook, contentCatalog, siteCatalog }) => {
    const siteUrl = playbook.site.url
    const component = 'dfcs' // name of the component to export; change it to match your site
    const version = ''
    const componentVersion = contentCatalog.getComponentVersion(component, version)
    const dfcsNavEntriesByUrl = getNavEntriesByUrl(componentVersion.navigation)
    const pages = contentCatalog
      .getPages((it) => it.src.component === component && it.src.version === version && it.pub)
      .map((page) => {
        const siteRelativeUrl = page.pub.url
        const articleDom = parseHTML(`<article>${page.contents}</article>`)
        // TODO might want to apply the sentence newline replacement per paragraph
        const text = articleDom.textContent.trim().replace(/\n(\s*\n)+/g, '\n\n').replace(/\.\n(?!\n)/g, '. ')
        const path = [componentVersion.title]
        path.push(...(dfcsNavEntriesByUrl[siteRelativeUrl]?.path?.map((it) => it.content) || []))
        return { url: siteUrl + siteRelativeUrl, title: page.title, text, path }
      })
    siteCatalog.addFile({
      contents: Buffer.from(JSON.stringify({ pages }, null, '  ')),
      out: { path: 'site-content.json' },
    })
  })
}

function getNavEntriesByUrl (items = [], accum = {}, path = []) {
  items.forEach((item) => {
    if (item.urlType === 'internal') accum[item.url.split('#')[0]] = { item, path: path.concat(item) }
    getNavEntriesByUrl(item.items, accum, item.content ? path.concat(item) : path)
  })
  return accum
}

Note that this extension relies on the node-html-parser package. You will need to include that in your site package.json file in order to use this extension. In the future, Antora may provide a built-in HTML parser for extensions to use.
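
The text cleanup step in this extension can be tried in isolation. The sketch below applies the same two replacements to a hypothetical sample string; the function name normalizeText is introduced here only for illustration.

```javascript
function normalizeText (text) {
  return text
    .trim()
    .replace(/\n(\s*\n)+/g, '\n\n') // collapse runs of blank lines into a single blank line
    .replace(/\.\n(?!\n)/g, '. ') // join a line that ends with a period to the line that follows it
}

// hypothetical sample of text extracted from a page
const sample = '\nFirst sentence.\nSecond sentence.\n\n  \n\nNext paragraph.\n'
console.log(JSON.stringify(normalizeText(sample)))
```

The two sentences end up on one line separated by a space, and a single blank line remains before the next paragraph.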