Extension Use Cases

This page provides a catalog of simple examples to showcase how you can enhance the capabilities of Antora through the use of extensions. Each section introduces a different use case and presents the extension code you can build on as a starting point.

You can also reference official extension projects provided by the Antora project to study more complex examples.

Set global AsciiDoc attributes

If you want to define global AsciiDoc attributes that have dynamic values, you can do so using an extension. The playbook holds the AsciiDoc config object, which itself contains the global AsciiDoc attributes. An extension can listen for the playbookBuilt event and add attributes to this map through the playbook or, as in the example below, listen for the beforeProcess event and add them to the siteAsciiDocConfig variable.

Example 1. set-global-asciidoc-attributes-extension.js
module.exports.register = function () {
  this.on('beforeProcess', ({ siteAsciiDocConfig }) => {
    const buildDate = new Date().toISOString()
    siteAsciiDocConfig.attributes['build-date'] = buildDate
  })
}

The extension could read these values from a file or environment variables as well.

If you need to set AsciiDoc attributes that are scoped to a component version, then you’ll need to listen for the contentClassified event instead. From there, you can access the AsciiDoc attributes from the asciidoc property on a component version object. You can look up a component version by name and version using the getComponentVersion method on the content catalog object. Alternately, you can access component versions from the versions property on each component returned by the getComponents method on the content catalog object.

If you’re troubleshooting your site, you can use an extension to generate a report of AsciiDoc attributes at the site level and those per component version. When making this report, you have a choice of whether you want to show the AsciiDoc attributes as they would be available to a page (aka compiled) or as defined (aka uncompiled).

You can use the following extension to print all the AsciiDoc attributes compiled for each component version. The extension also prints all the attributes compiled from the playbook, though keep in mind these are integrated into the attributes for each component version.

Example 2. print-compiled-asciidoc-attributes-extension.js
module.exports.register = function () {
  this.once('contentClassified', ({ siteAsciiDocConfig, contentCatalog }) => {
    console.log('site-wide attributes (compiled)')
    console.log(siteAsciiDocConfig.attributes)
    contentCatalog.getComponents().forEach((component) => {
      component.versions.forEach((componentVersion) => {
        console.log(`${componentVersion.version}@${componentVersion.name} attributes (compiled)`)
        if (componentVersion.asciidoc === siteAsciiDocConfig) {
          console.log('same as site-wide attributes')
        } else {
          console.log(componentVersion.asciidoc.attributes)
        }
      })
    })
  })
}

You can use the following extension to print all the AsciiDoc attributes as defined in the playbook and in the antora.yml file for each component version (by origin).

Example 3. print-defined-asciidoc-attributes-extension.js
module.exports.register = function () {
  this.once('contentClassified', ({ playbook, contentCatalog }) => {
    console.log('site-wide attributes (as defined in playbook)')
    console.log(playbook.asciidoc.attributes)
    contentCatalog.getComponents().forEach((component) => {
      component.versions.forEach((componentVersion) => {
        getUniqueOrigins(contentCatalog, componentVersion).forEach((origin) => {
          console.log(`${componentVersion.version}@${componentVersion.name} attributes (as defined in antora.yml)`)
          console.log(origin.descriptor.asciidoc?.attributes || {})
        })
      })
    })
  })
}

function getUniqueOrigins (contentCatalog, componentVersion) {
  return contentCatalog
    .findBy({ component: componentVersion.name, version: componentVersion.version })
    .reduce((origins, file) => {
      const origin = file.src.origin
      if (origin && !origins.includes(origin)) origins.push(origin)
      return origins
    }, [])
}

You may find it useful to make use of these collections of AsciiDoc attributes when writing other extensions.

Exclude private content sources

If some contributors or CI jobs don’t have permission to the private content sources in the playbook, you can use an extension to filter them out instead of having to modify the playbook file.

This extension runs during the playbookBuilt event. It retrieves the playbook, iterates over the content sources, and removes any content source that it detects as private and thus requires authentication. We’ll rely on a convention to communicate to the extension which content source is private. That convention is to use an SSH URL that starts with git@. Antora automatically converts SSH URLs to HTTP URLs, so the use of this syntax merely serves as a hint to users and extensions that the URL is private and is going to request authentication.

Example 4. exclude-private-content-sources-extension.js
module.exports.register = function () {
  this.on('playbookBuilt', function ({ playbook }) {
    playbook.content.sources = playbook.content.sources.filter(({ url }) => !url.startsWith('git@'))
    this.updateVariables({ playbook })
  })
}

This extension works because the playbook is mutable until the end of this event, at which point Antora freezes it. The call to this.updateVariables, which replaces the playbook variable in the generator context, is not required, but is used here to express intent and to future-proof the extension.

Unpublish flagged pages

If you don’t want a page to ever be published, you can prefix the filename with an underscore (e.g., _hidden.adoc). However, if you only want the page to be unpublished conditionally, then you need to reach for an extension.

When using this extension, any page that sets the page-unpublish page attribute will not be published. For example:

= Secret Page
:page-unpublish:

This page will not be published.

You can set the page-unpublish page attribute based on the presence (or absence) of another AsciiDoc attribute, perhaps one set in the playbook or as a CLI option. For example:

= Secret Page
ifndef::include-secret[:page-unpublish:]

This page will not be published.

This extension runs during the documentsConverted event. This is the earliest event that provides access to the AsciiDoc metadata on the virtual file. The extension iterates over all publishable pages in the content catalog and unpublishes any page that sets the page-unpublish attribute. To unpublish the page, the extension removes the out property on the virtual file. If the out property is absent, the page will not be published.

Example 5. page-unpublish-tag-extension.js
module.exports.register = function () {
  this.on('documentsConverted', ({ contentCatalog }) => {
    contentCatalog.getPages((page) => {
      if (page.out && page.asciidoc?.attributes['page-unpublish'] != null) delete page.out
    })
  })
}

Keep in mind that there may be references to the unpublished page. While they will be resolved by Antora, the target of the reference will not be available, which will result in a 404 response from the web server.

For more fine-grained control over when a page is unpublished, you could write an extension that replaces the convertDocument or convertDocuments functions. Doing so would allow you to unpublish the page before references to it from other pages are resolved so that they appear as warnings.

Report unlisted pages

After you create a new page, it’s easy to forget to add it to the navigation so that the reader can access it. We can use an extension to identify pages which are not in the navigation and report them using the logger.

This extension runs during the navigationBuilt event. It iterates over each component version, retrieves a flattened list of its internal navigation entries, then checks to see if there are any pages that are not in that list, comparing pages by URL. If it finds any such pages, it creates a report of them, optionally adding them to the navigation.

Example 6. unlisted-pages-extension.js
module.exports.register = function ({ config }) {
  const { addToNavigation, unlistedPagesHeading = 'Unlisted Pages' } = config
  const logger = this.getLogger('unlisted-pages-extension')
  this
    .on('navigationBuilt', ({ contentCatalog }) => {
      contentCatalog.getComponents().forEach(({ versions }) => {
        versions.forEach(({ name: component, version, navigation: nav, url: defaultUrl }) => {
          const navEntriesByUrl = getNavEntriesByUrl(nav)
          const unlistedPages = contentCatalog
            .findBy({ component, version, family: 'page' })
            .filter((page) => page.out)
            .reduce((collector, page) => {
              if ((page.pub.url in navEntriesByUrl) || page.pub.url === defaultUrl) return collector
              logger.warn({ file: page.src, source: page.src.origin }, 'detected unlisted page')
              return collector.concat(page)
            }, [])
          if (unlistedPages.length && addToNavigation) {
            nav.push({
              content: unlistedPagesHeading,
              items: unlistedPages.map((page) => {
                const title = 'navtitle' in page.asciidoc
                  ? page.asciidoc.navtitle
                  : (page.src.module === 'ROOT' ? '' : page.src.module + ':') + page.src.relative
                return { content: title, url: page.pub.url, urlType: 'internal' }
              }),
              root: true,
            })
          }
        })
      })
    })
}

function getNavEntriesByUrl (items = [], accum = {}) {
  items.forEach((item) => {
    if (item.urlType === 'internal') accum[item.url.split('#')[0]] = item
    getNavEntriesByUrl(item.items, accum)
  })
  return accum
}

You can read more about this extension and how to configure it in the Extension Tutorial.
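
To make the flattening step concrete, here is the getNavEntriesByUrl helper from the example applied to a small navigation tree. The tree is invented for illustration. In a real build it comes from the navigation property of a component version.

```javascript
function getNavEntriesByUrl (items = [], accum = {}) {
  items.forEach((item) => {
    // only internal entries point at pages; the fragment is dropped so anchors map to their page
    if (item.urlType === 'internal') accum[item.url.split('#')[0]] = item
    getNavEntriesByUrl(item.items, accum)
  })
  return accum
}

const sampleNav = [
  {
    content: 'Guide',
    items: [
      { content: 'Install', url: '/docs/install.html', urlType: 'internal' },
      { content: 'Options', url: '/docs/config.html#options', urlType: 'internal' },
      { content: 'Antora', url: 'https://antora.org', urlType: 'external' },
    ],
  },
]

const listedUrls = Object.keys(getNavEntriesByUrl(sampleNav))
console.log(listedUrls) // [ '/docs/install.html', '/docs/config.html' ]
```

Any publishable page whose URL is not one of these keys, and is not the start page of the component version, is what the extension reports as unlisted.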

Unpublish unlisted pages

Instead of reporting unlisted pages, you could remove those pages from publishing. This is one way you can use the navigation to drive which pages are published.

This extension runs during the navigationBuilt event. It iterates over each component version, retrieves a flattened list of its internal navigation entries, then checks to see if there are any pages that are not in that list, comparing pages by URL. If it finds any such pages, it unpublishes them.

Example 7. unpublish-unlisted-pages-extension.js
module.exports.register = function () {
  this.on('navigationBuilt', ({ contentCatalog }) => {
    contentCatalog.getComponents().forEach(({ versions }) => {
      versions.forEach(({ name: component, version, navigation: nav, url: defaultUrl }) => {
        const navEntriesByUrl = getNavEntriesByUrl(nav)
        const unlistedPages = contentCatalog
          .findBy({ component, version, family: 'page' })
          .filter((page) => page.out)
          .reduce((collector, page) => {
            if ((page.pub.url in navEntriesByUrl) || page.pub.url === defaultUrl) return collector
            return collector.concat(page)
          }, [])
        if (unlistedPages.length) unlistedPages.forEach((page) => delete page.out)
      })
    })
  })
}

function getNavEntriesByUrl (items = [], accum = {}) {
  items.forEach((item) => {
    if (item.urlType === 'internal') accum[item.url.split('#')[0]] = item
    getNavEntriesByUrl(item.items, accum)
  })
  return accum
}

Removing the out property from the page prevents the page from being published, but the page is still referenceable using an include directive. Alternately, you could choose to remove the page entirely from the content catalog.

List discovered component versions

When you’re setting up your playbook, you may find that Antora is not discovering some of your component versions. Using an extension, it’s possible to list the component versions Antora discovers during content aggregation along with the content sources it took them from.

Example 8. discovered-component-versions-extension.js
module.exports.register = function () {
  this.once('contentAggregated', ({ contentAggregate }) => {
    console.log('Discovered the following component versions')
    contentAggregate.forEach((bucket) => {
      const sources = bucket.origins.map(({ url, refname }) => ({ url, refname }))
      console.log({ name: bucket.name, version: bucket.version, files: bucket.files.length, sources })
    })
  })
}

If an entry is missing, then you know you may need to tune the content source definitions in your playbook.

For more information, you can print the whole bucket entry.

Generate report of all pages

You can generate additional pages using an Antora extension. This offers a way to generate report pages that summarize information about the site.

In this example, we’ll generate a page that lists all the other pages in the same component version. This extension listens for the documentsConverted event, which is emitted once all the AsciiDoc-based pages have been converted to (embedded) HTML, but before the HTML layout has been applied. The reason for using this event is twofold. First, it provides access to the page title of each page. Second, the page layout will be applied to the newly generated page.

Example 9. all-pages-report-extension.js
module.exports.register = function () {
  const relativize = this.require('@antora/asciidoc-loader/util/compute-relative-url-path')
  this.once('documentsConverted', ({ contentCatalog }) => {
    contentCatalog.getComponents().forEach(({ versions }) => {
      versions.forEach(({ name: component, version, url }) => {
        const pageList = ['<ul>']
        const pages = contentCatalog
          .findBy({ component, version, family: 'page' })
          .sort((a, b) => a.title.localeCompare(b.title))
        for (const page of pages) {
          pageList.push(`<li><a href="${relativize(url, page.pub.url)}">${page.title}</a></li>`)
        }
        pageList.push('</ul>')
        const pageListFile = contentCatalog.addFile({
          contents: Buffer.from(pageList.join('\n') + '\n'),
          src: { component, version, module: 'ROOT', family: 'page', relative: 'all-pages.html' },
        })
        pageListFile.asciidoc = { doctitle: 'All Pages' }
        // use the following assignment instead to use a separate layout (e.g., report.hbs)
        //pageListFile.asciidoc = { doctitle: 'All Pages', attributes: { 'page-layout': 'report' } }
      })
    })
  })
}

The key step of this extension is the call to contentCatalog.addFile. This call adds a new file to the content catalog, in this case a page. When generating the list of links, we use the relativize function from the AsciiDoc Loader to compute the relative URL from the start page of the component version to the target page, emulating the behavior of the xref macro in AsciiDoc. The resulting report is written to the file all-pages.html at the root of the component version (adjacent to the start page).

Resolve attribute references in attachments

Files in the attachment family are passed directly through to the output site. Antora does not resolve AsciiDoc attribute references in attachment files. (Asciidoctor, on the other hand, will resolve AsciiDoc attribute references in the attachment’s contents only if the attachment is included in an AsciiDoc page where the attribute substitution is enabled.) You can use an Antora extension to have Antora resolve attribute references in the attachment file before that file is published.

This extension runs during the contentClassified event, which is when attachment files are first identified and classified. It iterates over all attachments and resolves any references to attributes scoped to that attachment’s component version. If any changes were made to the contents of the file, it replaces the contents on the virtual file with the updated value.

Example 10. resolve-attributes-references-in-attachments-extension.js
module.exports.register = function () {
  this.on('contentClassified', ({ contentCatalog }) => {
    const componentVersionTable = contentCatalog.getComponents().reduce((componentMap, component) => {
      componentMap[component.name] = component.versions.reduce((versionMap, componentVersion) => {
        versionMap[componentVersion.version] = componentVersion
        return versionMap
      }, {})
      return componentMap
    }, {})
    contentCatalog.findBy({ family: 'attachment' }).forEach((attachment) => {
      const componentVersion = componentVersionTable[attachment.src.component][attachment.src.version]
      let attributes = componentVersion.asciidoc?.attributes
      if (!attributes) return
      attributes = Object.entries(attributes).reduce((accum, [name, val]) => {
        // strip the trailing @ (soft-set marker) from string values; keep other values as is
        accum[name] = typeof val === 'string' && val.endsWith('@') ? val.slice(0, -1) : val
        return accum
      }, {})
      let modified
      const result = attachment.contents.toString().replace(/\{([\p{Alpha}\d_][\p{Alpha}\d_-]*)\}/gu, (match, name) => {
        const value = attributes[name]
        // leave the reference untouched unless the attribute is set to a string value
        if (typeof value !== 'string') return match
        modified = true
        return value
      })
      if (modified) attachment.contents = Buffer.from(result)
    })
  })
}

This extension is only known to work with text-based attachments. You may need to modify this extension for it to work with binary files.
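
One simple modification is to skip attachments that look binary so their bytes are never round-tripped through a string. The helper below is a heuristic sketch, not something Antora provides: it treats a file as binary if a NUL byte appears near the start. It will misclassify some text encodings, such as UTF-16.

```javascript
// Heuristic: a NUL byte within the first sampleSize bytes suggests binary content.
function isProbablyBinary (buffer, sampleSize = 8000) {
  return buffer.subarray(0, sampleSize).includes(0)
}

// Inside the forEach callback of the extension above, a guard could look like:
//   if (isProbablyBinary(attachment.contents)) return

console.log(isProbablyBinary(Buffer.from('version: {release-version}\n'))) // false
console.log(isProbablyBinary(Buffer.from([0x50, 0x4b, 0x03, 0x04, 0x00, 0x00]))) // true
```

An allowlist of file extensions (for example, only .txt, .yml, and .json) would be a stricter alternative if you know which attachment types your site uses.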

Convert word processor attachments to PDF

Just as Antora converts AsciiDoc files (.adoc) to HTML (.html), an extension can convert attachments to another format. This extension runs during the contentClassified event, which is when attachment files are first identified and classified. It iterates over all attachments in a word processor format (e.g., .docx, .odt, .fodt) and uses the libreoffice command (LibreOffice in server mode) to convert each file to PDF.

Example 11. doc-to-pdf-extension.js
const fsp = require('fs/promises')
const ospath = require('path')
const { posix: path } = ospath
const { execFile } = require('child_process')

const IS_WIN = process.platform === 'win32'
const LIBREOFFICE_WIN = 'C:\\Program Files\\LibreOffice\\program\\soffice.exe'

module.exports.register = function ({ config }) {
  this.once('contentClassified', async ({ playbook, contentCatalog }) => {
    const logger = this.getLogger('doc-to-pdf-extension')
    let command = config.command || { windows: './docto.exe', linux: 'libreoffice' }
    if (command.constructor === Object) command = command[IS_WIN ? 'windows' : 'linux']
    if (IS_WIN) {
      if (command === 'libreoffice') {
        command = LIBREOFFICE_WIN
      } else if (!command.endsWith('.exe')) {
        command += '.exe'
      }
    }
    if (command.startsWith('./')) command = ospath.join(playbook.dir, command)
    const libreoffice = ospath.basename(command) !== 'docto.exe'
    const docExtnames = (config.extensions || ['.docx']).reduce((accum, it) => Object.assign(accum, { [it]: true }), {})
    const batchSize = (config.batchSize ?? 150) || Number.MAX_SAFE_INTEGER
    const batchDirs = []
    const filesToConvert = contentCatalog.getFiles().filter((file) => {
      const { src } = file
      if (!(src.family === 'attachment' && docExtnames[src.extname])) return
      const staticPdfFileId = Object.assign({}, src, { relative: src.relative.slice(0, -src.extname.length) + '.pdf' })
      const staticPdfFile = contentCatalog.getById(staticPdfFileId)
      if (!staticPdfFile) return true
      coerceToPdf(file, staticPdfFile.contents)
      contentCatalog.removeFile(staticPdfFile)
    })
    if (!filesToConvert.length) return
    const batchDirBase = ospath.join(playbook.dir, 'build/doc-to-pdf')
    const convertArgs = libreoffice
      ? ['--writer', '--headless', '--norestore']
          .concat(IS_WIN ? ['--safe-mode', '--print-to-file', '--printer-name', 'pdf', '--outdir', '.'] : ['--convert-to', 'pdf'])
      : ['-WD', '-f', '.', '-o', '.', '-FX', '.docx', '-t', 'wdFormatPDF']
    try {
      let i = 0
      for (let offset = 0, tot = filesToConvert.length; offset < tot; offset += batchSize) {
        const batchDir = batchDirBase + `-${++i}`
        batchDirs.push(batchDir)
        const convertOpts = { cwd: batchDir }
        if (IS_WIN) Object.assign(convertOpts, { windowsHide: true })
        await fsp.mkdir(batchDir, { recursive: true })
        const filesToConvertBatch = filesToConvert.slice(offset, offset + batchSize)
        const filepathsToConvert = await Promise.all(filesToConvertBatch.map((file) => {
          const sourceRelpath = makeSourceFilename(file.src)
          return fsp.writeFile(ospath.join(batchDir, sourceRelpath), file.contents).then(() => sourceRelpath)
        }))
        const convertArgsBatch = libreoffice ? convertArgs.slice().concat(filepathsToConvert) : convertArgs
        await new Promise((resolve, reject) => {
          logger.info([`[${batchDir}]`, command].concat(convertArgsBatch).join(' '))
          execFile(command, convertArgsBatch, convertOpts, (err, stdout, stderr) => {
            if (!err) return resolve()
            const splitIdx = stderr.indexOf('Usage: ')
            if (~splitIdx) stderr = stderr.slice(0, splitIdx).trimEnd()
            if (stderr) err.message += stderr
            reject(err)
          })
        })
        await Promise.all(filesToConvertBatch.map((file) => {
          const pdfRelpath = makeSourceFilename(file.src).slice(0, -file.src.extname.length) + '.pdf'
          return fsp.readFile(ospath.join(batchDir, pdfRelpath)).then(
            (contents) => coerceToPdf(file, contents),
            (err) => logger.error(err, 'failed to convert ' + makeSourceFilename(file.src))
          )
        }))
      }
    } finally {
      for (const batchDir of batchDirs) await fsp.rm(batchDir, { recursive: true, force: true })
    }
  })
}

function coerceToPdf (file, contents) {
  file.contents = contents
  const extnameLength = file.src.extname.length
  file.out.basename = file.out.basename.slice(0, -extnameLength) + '.pdf'
  file.out.path = path.join(file.out.dirname, file.out.basename)
  file.pub.url = file.pub.url.slice(0, -extnameLength) + '.pdf'
}

function makeSourceFilename (src) {
  return `${src.component}-${src.module}-${src.relative.replaceAll('/', '-')}`
}

By converting the files and updating the metadata, it’s possible to reference the source document using the xref macro. That reference will automatically translate to a link to the PDF in the generated site.

Export content to file

If you are integrating with a search or AI engine, you may want to extract the plain text of the pages to a file along with the page URL, title, and navigation path. You can use the following extension to do that as part of the site build.

Example 12. export-content-extension.js
const { parse: parseHTML } = require('node-html-parser')

/**
 * An Antora extension that exports the content of publishable pages in plain text to a JSON
 * file along with the page URL and title.
 */
module.exports.register = function () {
  this.once('navigationBuilt', ({ playbook, contentCatalog, siteCatalog }) => {
    const siteUrl = playbook.site.url
    // component version to export; adjust the name and version to match your site
    const component = 'dfcs'
    const version = ''
    const componentVersion = contentCatalog.getComponentVersion(component, version)
    const dfcsNavEntriesByUrl = getNavEntriesByUrl(componentVersion.navigation)
    const pages = contentCatalog
      .getPages((it) => it.src.component === component && it.src.version === version && it.pub)
      .map((page) => {
        const siteRelativeUrl = page.pub.url
        const articleDom = parseHTML(`<article>${page.contents}</article>`)
        // TODO might want to apply the sentence newline replacement per paragraph
        const text = articleDom.textContent.trim().replace(/\n(\s*\n)+/g, '\n\n').replace(/\.\n(?!\n)/g, '. ')
        const path = [componentVersion.title]
        path.push(...(dfcsNavEntriesByUrl[siteRelativeUrl]?.path?.map((it) => it.content) || []))
        return { url: siteUrl + siteRelativeUrl, title: page.title, text, path }
      })
    siteCatalog.addFile({
      contents: Buffer.from(JSON.stringify({ pages }, null, '  ')),
      out: { path: 'site-content.json' },
    })
  })
}

function getNavEntriesByUrl (items = [], accum = {}, path = []) {
  items.forEach((item) => {
    if (item.urlType === 'internal') accum[item.url.split('#')[0]] = { item, path: path.concat(item) }
    getNavEntriesByUrl(item.items, accum, item.content ? path.concat(item) : path)
  })
  return accum
}

Note that this extension relies on the node-html-parser package. You will need to include that in your site package.json file in order to use this extension. In the future, Antora may provide a built-in HTML parser for extensions to use.