Collector Use Cases

This page presents realistic use cases that can be accomplished using Collector. Each section introduces a different use case and presents the configuration and commands you can use as a starting point.

Import example code

You can use Collector to import files that are stored outside the content source root so they can be included in pages in the site.

Let’s assume that the branch of our content repository has the following structure:

pom.xml
docs/
  antora.yml
  modules/
    ROOT/
      pages/
        index.adoc
src/
  test/
    java/
      org/
        example/
          HelloWorld.java

To include the file HelloWorld.java into the page index.adoc, your instinct may be to use a relative path reference.

include::../../../../src/test/java/org/example/HelloWorld.java[]

However, Antora does not permit references to files that are stored outside of a content source root (as these files are outside of Antora’s virtual file system). Instead, we can use Collector to import those files into Antora’s content catalog so they can be referenced.

Here’s the configuration to add to antora.yml to set this up:

ext:
  collector:
    scan:
      dir: src/test/java
      into: modules/ROOT/examples

This configuration imports all the files under src/test/java into the modules/ROOT/examples virtual path in the component version bucket. (You can use the files key to filter the files that are matched.) As a result, these files are classified into the example family of the ROOT module.

Now you can include the HelloWorld.java file as follows:

include::example$org/example/HelloWorld.java[]

Collector has effectively copied the files from src/test/java into docs/modules/ROOT/examples at runtime without having to physically copy them.
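
To make that mapping concrete, here’s a minimal sketch in JavaScript of how a file under the scan dir relates to its virtual path and resource ID. The toResource function is hypothetical and only mirrors the dir and into values from the configuration above; it is not part of Collector.

```javascript
'use strict'

const path = require('node:path').posix

// Hypothetical helper (not part of Collector): shows where a scanned file lands
// in the virtual file system and which resource ID can be used to reference it.
function toResource (file, { dir, into }) {
  // path of the file relative to the scan dir (e.g., org/example/HelloWorld.java)
  const relative = path.relative(dir, file)
  // into has the form modules/<module>/<family directory>
  const [, module, familyDir] = into.split('/')
  // the family name is the singular form of the directory (examples -> example)
  const family = familyDir.replace(/s$/, '')
  return {
    virtualPath: path.join(into, relative),
    resourceId: `${module}:${family}$` + relative,
  }
}

console.log(
  toResource('src/test/java/org/example/HelloWorld.java', {
    dir: 'src/test/java',
    into: 'modules/ROOT/examples',
  })
)
```

The sketch returns the module-qualified form of the resource ID. From a page in the ROOT module, the module can be dropped, which gives the shorter example$org/example/HelloWorld.java form used in the include above.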

Generate API docs as attachments

You may want to embed API docs into your Antora-generated site. The challenge is that since API docs are generated from the code, you don’t want to have to commit these files to the repository. Collector can generate the files from the code that’s in the same branch as the content while Antora is running and import them as attachments. This works even when the content repository is remote since Collector can check out each git reference into a worktree if necessary.

Let’s assume that the branch of our content source repository has the following structure:

pom.xml
docs/
  antora.yml
  modules/
    ROOT/
      pages/
        index.adoc
src/
  main/
    java/
      com/
        acme/
          Application.java

We’ll also assume that we can build the API docs (in this case Javadoc) using the command mvn javadoc:javadoc.

Here’s the configuration to add to antora.yml to set this up:

ext:
  collector:
    run: mvn javadoc:javadoc
    scan:
      clean: true (1)
      dir: target/apidocs
      into: modules/api/attachments
1 Instructs Collector to clean the scan directory before running the command.

This Collector instance first runs the mvn javadoc:javadoc command (making the assumption that the mvn command is available on the current user’s PATH). It then scans for and imports all the files under the generated target/apidocs directory into the modules/api/attachments virtual path in the component version bucket. As a result, these files are classified into the attachment family of the api module.

We can now link the reader to the API docs using the following xref:

Consult the xref:api:attachment$index.html[API].

The same workflow applies to API docs produced by other tools, as long as a command can generate them into a directory that Collector can scan.

Replace version with value from build

The documentation version (i.e., the component version) is typically stored in the component version descriptor (antora.yml) or derived from the git reference. A common need is to pick up the version from the project’s build. Replacing the documentation version with the value from the build can be orchestrated using Collector.

We’ll want to start by binding the documentation version to the git reference. This ensures that the version is still unique when Collector is not enabled. To do so, we’ll set the version to the value of true in the component descriptor.

docs/antora.yml
name: my-project
version: true
title: My Project
nav:
- modules/ROOT/nav.adoc

Next, we’ll want to add a task to the build that passes on the project version from the build to Antora. This hand-off is done by having the build generate an antora.yml to update the value of the version key.

Let’s assume that we’re using a Gradle build. First, we’ll define the project version in the gradle.properties file, where the version will be managed.

gradle.properties
version: 1.0.1

Next, let’s add a new task in build.gradle that generates the antora.yml file. It’s only necessary to include the keys you want updated, in this case version.

build.gradle
// ...

tasks.register('generateAntoraYml') {
  group = 'Documentation'
  description = 'Generates antora.yml to populate version'
  doLast {
    def toFile = new File("$buildDir/generated-docs/antora.yml")
    toFile.getParentFile().mkdirs()
    toFile.createNewFile()
    toFile.setText("version: '${project.version}'\n")
  }
}
This task could abbreviate the project version to the major.minor form (e.g., 1.0), as is customary for documentation sites.
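
As a rough illustration of that abbreviation, here’s the transformation sketched in JavaScript. In the Gradle task itself you’d express the same logic in Groovy; the toMajorMinor function name is hypothetical.

```javascript
'use strict'

// Hypothetical helper: reduces a project version to its major.minor form.
// Versions that don't start with <digits>.<digits> are returned unchanged.
function toMajorMinor (version) {
  const match = /^(\d+)\.(\d+)/.exec(version)
  return match ? `${match[1]}.${match[2]}` : version
}

console.log(toMajorMinor('1.0.1'))
console.log(toMajorMinor('2.3.0-SNAPSHOT'))
```

Anchoring the pattern to the start of the string means any patch number or qualifier that follows the minor number is dropped.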

Now we need to configure Collector to run the generateAntoraYml task and scan for the result. We’ll configure the project with the Gradle wrapper so Gradle can be run from the worktree (i.e., locally) using gradlew (which will be translated by Collector to gradlew.bat on Windows).

docs/antora.yml
name: my-project
version: true
# ...
ext:
  collector:
    run:
      command: gradlew -q generateAntoraYml
      local: true
    scan: build/generated-docs

This content source is now ready to be used in an Antora site. By the time Antora classifies the component version, the version value will be 1.0.1 rather than the name of the git reference.

Inject AsciiDoc attributes from build

In order to keep your configuration DRY, you may want to inject configuration values from the build into your documentation. This information can be passed through AsciiDoc attributes.

The handoff of this information is managed by Collector. Collector runs a command that assigns configuration values from the build to corresponding AsciiDoc attributes defined in a generated component version descriptor (antora.yml). The documentation can then read this information using a relevant attribute reference.

To start, we’ll want to add a task to the build that passes configuration values from the build to Antora.

Let’s assume that we’re using a Gradle build. First, let’s add a new task in build.gradle that generates the antora.yml file. It’s only necessary to include the keys you want updated, in this case asciidoc.

build.gradle
// ...

tasks.register('generateAntoraYml') {
  group = 'Documentation'
  description = 'Generates antora.yml to populate AsciiDoc attributes'
  doLast {
    def toFile = new File("$buildDir/generated-docs/antora.yml")
    toFile.getParentFile().mkdirs()
    toFile.createNewFile()
    def text = """asciidoc:
  attributes:
    project-artifact-id: ${project.name}
    project-group-id: ${project.group}
    project-description: ${project.description}"""
    project
      .configurations
      .testRuntimeClasspath
      .resolvedConfiguration
      .resolvedArtifacts
      .each { artifact ->
        text += "\n    version-${artifact.name.toLowerCase()}: ${artifact.moduleVersion.id.version}"
      }
    toFile.setText(text + "\n")
  }
}

As an example, you can now use {project-description} to access the project description defined in the build from the documentation.

If you want to be able to easily read these attributes from a page template, add the page- prefix to the attribute name.

Now we need to configure Collector to run the generateAntoraYml task and scan for the result. We’ll configure the project with the Gradle wrapper so Gradle can be run from the worktree (i.e., locally) using gradlew (which will be translated by Collector to gradlew.bat on Windows).

docs/antora.yml
name: my-project
version: '1.0'
# ...
ext:
  collector:
    run:
      command: gradlew -q generateAntoraYml
      local: true
    scan: build/generated-docs

This content source is now ready to be used in an Antora site.

You can combine the example from the previous section if you want to set the version from the build as well.
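
To picture what a combined generated file might contain, here’s a minimal sketch in JavaScript that renders both the version key and the AsciiDoc attributes. The renderDescriptor function and the sample values are hypothetical; in the Gradle build, the generateAntoraYml task would produce equivalent text.

```javascript
'use strict'

// Hypothetical helper: renders the text of a generated antora.yml that sets
// both the version and a map of AsciiDoc attributes.
function renderDescriptor ({ version, attributes }) {
  // JSON strings are valid YAML double-quoted scalars, so JSON.stringify is
  // used to quote each value safely.
  const lines = [`version: ${JSON.stringify(version)}`, 'asciidoc:', '  attributes:']
  for (const [name, value] of Object.entries(attributes)) {
    lines.push(`    ${name}: ${JSON.stringify(String(value))}`)
  }
  return lines.join('\n') + '\n'
}

process.stdout.write(
  renderDescriptor({
    version: '1.0.1',
    attributes: {
      'project-artifact-id': 'my-project',
      'project-description': 'Acme: the demo app',
    },
  })
)
```

Quoting the values is a deliberate choice in this sketch. An unquoted value that contains a colon followed by a space, such as the sample description, would not parse as a plain YAML scalar.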

Vale integration

Vale is an open source, command-line tool that automatically enforces an editorial style guide on your site’s content. Collector offers a solution for integrating Vale into an Antora site build.

Typically, Vale runs on files present on a local file system (often under a specified directory). The challenge with using Vale on the content of an Antora site is that the content may be sourced from multiple git repositories and branches. Source files in an Antora site may only exist in a git tree (represented as virtual files), not necessarily as files on the local file system. In other words, Vale has to be run on files in a worktree.

Enter Collector. Collector ensures that each content root has a worktree, setting up a temporary one if needed. That makes it possible to run a command within that worktree, such as Vale.

To avoid having to configure Vale in every content repository, we’ll rely on shared configuration stored in the playbook repository. Let’s start by creating the directory .vale relative to the Antora playbook file, where we’ll store all the Vale configuration and styles.

$ mkdir .vale

You’ll likely need to create a vocabulary to declare proper nouns used in your site, so we’ll go ahead and set that up now.

$ mkdir -p .vale/styles/config/vocabularies/Local
$ echo Antora > .vale/styles/config/vocabularies/Local/accept.txt

You can add any additional words to the accept.txt file that Vale complains about.

Next, let’s create a Vale configuration file named .vale/config.ini relative to the Antora playbook file to control how Vale works.

We’re creating the configuration file under the .vale directory instead of the default location at .vale.ini to make it easier to share the Vale configuration and styles.

Since we’re working with AsciiDoc, we’ll add the AsciiDoc style package. We’ll also set up some ignores (e.g., IgnoredClasses, TokenIgnores) so Vale doesn’t bother us about technical terms and paths.

.vale/config.ini
StylesPath = styles
MinAlertLevel = suggestion
Packages = AsciiDoc
IgnoredClasses = path
Vocab = Local

[asciidoctor]
experimental = YES
attribute-missing = drop

[*.adoc]
BasedOnStyles = Vale, AsciiDoc
AsciiDoc.UnsetAttributes = NO
TokenIgnores = (\+[^+\n]+\+), (\b[a-z][a-z_]* key\b)

We’ll rely on npx to install Vale and Asciidoctor.js so we don’t need to worry about installing them manually. Let’s use npx now to sync the packages listed in the config, which is a prerequisite.

$ npx -y --package @jti/vale --package @asciidoctor/cli vale --config .vale/config.ini sync

The @jti/vale package retrieves the vale binary. The @asciidoctor/cli package adds the asciidoctor bin script to the PATH so Vale can locate it to parse AsciiDoc. The sync command downloads and installs any packages listed in the config to the Vale styles directory.

We don’t want to have to configure Collector manually in every content source root to run Vale. Instead, we’ll use another extension, which follows, to automatically configure Collector. This extension assumes that Collector is not otherwise being used.

.vale/lib/antora-vale-integration-extension.js
'use strict'

module.exports.register = function () {
  this.once('contentAggregated', ({ contentAggregate }) => {
    for (const { origins } of contentAggregate) {
      for (const origin of origins) {
        addCollectorStep((origin.descriptor.ext ??= {}), {
          run: [
            'echo Running Vale (url: ${{origin.url}} | refname: ${{origin.refname}} | start_path: ${{origin.startPath}})',
            '$NPX -y --prefer-offline --package @jti/vale --package @asciidoctor/cli vale --config "${{playbook.dir}}/.vale/config.ini" --glob \'!**/.*.adoc\' --no-wrap ./${{origin.startPath}}'
          ],
        })
      }
    }
  })

  this.require('@antora/collector-extension').register.call(this, { config: {} })
}

function addCollectorStep (ext, step) {
  let append = Array.isArray(ext.collector)
  if (!append && ext.collector && (append = true)) ext.collector = [ext.collector]
  append ? ext.collector.push(step) : (ext.collector = step)
}

This extension iterates over all the origins (the content source roots) in an Antora site and injects Collector configuration into the descriptor to run Vale on that origin (using its worktree).
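
The addCollectorStep helper is compact, so it may help to see how it behaves for each shape of existing configuration. The following sketch copies the helper from the extension and applies it to three hypothetical ext objects: one with no collector key, one with a single entry, and one with an array of entries. The run values are placeholders.

```javascript
'use strict'

// Copied from the extension above.
function addCollectorStep (ext, step) {
  let append = Array.isArray(ext.collector)
  if (!append && ext.collector && (append = true)) ext.collector = [ext.collector]
  append ? ext.collector.push(step) : (ext.collector = step)
}

const step = { run: 'vale' }

// no existing configuration: the step becomes the sole collector entry
const none = {}
addCollectorStep(none, step)

// a single existing entry: it is wrapped in an array and the step is appended
const single = { collector: { run: 'mvn javadoc:javadoc' } }
addCollectorStep(single, step)

// an existing array: the step is appended to it
const many = { collector: [{ run: 'first' }, { run: 'second' }] }
addCollectorStep(many, step)

console.log(JSON.stringify({ none, single, many }, null, 2))
```

In each case, any existing entries are preserved and the Vale step runs last.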

Notice we’re using a context variable to identify the config file relative to the Antora playbook file. The glob option ignores all hidden AsciiDoc files. We’re also using a context variable to specify the scan directory. By specifying the directory in this way, Vale will report the git path, which is the root-relative path in the repository branch (e.g., docs/modules/ROOT/pages/index.adoc). To provide additional context to the output, we use an echo command to identify the current URL, refname, and start path for the messages that follow. Finally, we require the Collector extension after our custom extension is registered so it discovers the newly added Collector configurations.
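
If the context variable syntax is unfamiliar, the following sketch shows the general idea of how a reference such as ${{playbook.dir}} could be resolved against a context object. This is a simplified illustration, not Collector’s actual implementation. The interpolate function and the sample context values are hypothetical.

```javascript
'use strict'

// Illustrative only: a simplified stand-in for resolving ${{...}} references.
// Each reference is a dotted path that is looked up in the context object.
function interpolate (template, context) {
  return template.replace(/\$\{\{\s*([\w.]+)\s*\}\}/g, (_, ref) => {
    const value = ref.split('.').reduce((obj, key) => obj?.[key], context)
    // in this sketch, unresolved references are replaced with an empty string
    return value ?? ''
  })
}

// hypothetical context values
const context = {
  playbook: { dir: '/work/site' },
  origin: { url: 'https://git.example.org/docs.git', refname: 'main', startPath: 'docs' },
}

console.log(interpolate('--config "${{playbook.dir}}/.vale/config.ini" ./${{origin.startPath}}', context))
```

When the start path is empty, the last argument in this sketch resolves to ./, which refers to the root of the worktree.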

To run Antora, we have to add the Collector extension package and enable our custom extension (which, in turn, will enable the Collector extension). If you’re not using npx to install packages, declare these packages in your package.json file instead.

$ npx -y --package antora --package @antora/collector-extension antora --extension .vale/lib/antora-vale-integration-extension.js antora-playbook.yml

When Antora runs, Vale will check all the AsciiDoc files in the site and report any violations and/or suggestions to the terminal (i.e., STDOUT).

To simplify the Vale call, you can create a wrapper script named .vale/run for Collector to call. This approach also gives us the opportunity to redirect the Vale output from the terminal to a file. We’ll start with a Bash script. You can add a batch script for Windows later.

.vale/run
#!/bin/bash

if [ -z "$VALE_CONFIG_PATH" ]; then
  VALE_CONFIG_PATH="$ANTORA_PLAYBOOK_DIR/.vale/config.ini"
fi

echo "Running Vale (url: $ANTORA_ORIGIN_URL | refname: $ANTORA_ORIGIN_REFNAME | start_path: ${ANTORA_ORIGIN_START_PATH:-.})" >> "$ANTORA_PLAYBOOK_DIR/vale.log"

NO_COLOR=1 \
"$NPX" -y --prefer-offline --package @jti/vale --package @asciidoctor/cli vale \
--config "$VALE_CONFIG_PATH" --glob '!**/.*.adoc' --no-wrap "${ANTORA_ORIGIN_START_PATH:-.}" \
>> "$ANTORA_PLAYBOOK_DIR/vale.log"

After creating the wrapper script, be sure to make it executable:

$ chmod 755 .vale/run

We now need to reconfigure the run entry in the custom extension to call our wrapper script. We’ll also add a contextStarted listener to remove any stale Vale log files.

.vale/lib/antora-vale-integration-extension.js
'use strict'

const fsp = require('node:fs/promises')
const ospath = require('node:path')

module.exports.register = function () {
  this.once('contextStarted', async ({ playbook }) => {
    for (const it of await fsp.readdir(playbook.dir, { withFileTypes: true })) {
      if (it.isFile() && it.name.startsWith('vale.')) await fsp.rm(ospath.join(playbook.dir, it.name))
    }
  })

  this.once('contentAggregated', ({ contentAggregate }) => {
    for (const { origins } of contentAggregate) {
      for (const origin of origins) {
        addCollectorStep((origin.descriptor.ext ??= {}), {
          run: {
            command: '${{playbook.dir}}/.vale/run',
            env: {
              ANTORA_PLAYBOOK_DIR: '${{playbook.dir}}',
              ANTORA_ORIGIN_REFNAME: '${{origin.refname}}',
              ANTORA_ORIGIN_START_PATH: '${{origin.startPath}}',
              ANTORA_ORIGIN_URL: '${{origin.url}}',
              ANTORA_ORIGIN_WORKTREE: '${{origin.worktree}}',
            },
          },
        })
      }
    }
  })

  this.require('@antora/collector-extension').register.call(this, { config: {} })
}

function addCollectorStep (ext, step) {
  let append = Array.isArray(ext.collector)
  if (!append && ext.collector && (append = true)) ext.collector = [ext.collector]
  append ? ext.collector.push(step) : (ext.collector = step)
}

When we run Antora as before, the Vale output will now be written to the vale.log file adjacent to the playbook file. You could also choose to output JSON instead of plain text by passing the --output json CLI option.

We can take this one step further. If the content source has the same worktree as the playbook (meaning the same branch of the same repository), we can integrate those messages with reviewdog so they appear as a check in the pull request on GitHub, merge request on GitLab, and so forth. However, we have to keep any messages that pertain to a different repository or branch in a separate report since they are out of scope for that check. We can rely on our wrapper script to handle this partitioning.

First, let’s set up an output style template (rdjsonl format) so it can be ingested by reviewdog.

.vale/templates/rdjsonl.tmpl
{{- range .Files}}
{{- $path := .Path -}}
{{- range .Alerts -}}
{{- $error := "" -}}
{{- if eq .Severity "error" -}}
  {{- $error = "ERROR" -}}
{{- else if eq .Severity "warning" -}}
  {{- $error = "WARNING" -}}
{{- else -}}
  {{- $error = "INFO" -}}
{{- end}}
{{- $line := printf "%d" .Line -}}
{{- $col := printf "%d" (index .Span 0) -}}
{{- $check := printf "%s" .Check -}}
{{- $message := printf "%s" .Message -}}
{"message": "[{{ $check }}] {{ $message | jsonEscape }}", "location": {"path": "{{ $path }}", "range": {"start": {"line": {{ $line }}, "column": {{ $col }}}}}, "severity": "{{ $error }}"}
{{end -}}
{{end -}}
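
To check what the template is expected to emit, here’s a minimal sketch in JavaScript that builds the same line for a single alert. The field names (Check, Message, Line, Span, Severity) are the ones the template reads. The toRdjsonl function and the sample alert are hypothetical.

```javascript
'use strict'

// maps a Vale severity to the severity name used in the rdjsonl line
const SEVERITIES = { error: 'ERROR', warning: 'WARNING' }

// Hypothetical helper: builds one rdjsonl line from a file path and an alert,
// mirroring the fields the template above reads.
function toRdjsonl (path, alert) {
  return JSON.stringify({
    message: `[${alert.Check}] ${alert.Message}`,
    location: { path, range: { start: { line: alert.Line, column: alert.Span[0] } } },
    // any severity other than error or warning is reported as INFO
    severity: SEVERITIES[alert.Severity] ?? 'INFO',
  })
}

// sample alert with made-up values
const alert = {
  Check: 'Vale.Spelling',
  Message: 'Did you really mean "Antroa"?',
  Line: 3,
  Span: [5, 10],
  Severity: 'error',
}

console.log(toRdjsonl('docs/modules/ROOT/pages/index.adoc', alert))
```

Each alert becomes one JSON object on its own line, which is the shape passed to reviewdog with the -f=rdjsonl option.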

Next, we’ll update the wrapper script to check whether the origin’s worktree matches the playbook dir. If so, we’ll output Vale messages formatted using the custom template to the file vale.rdjsonl. Otherwise, we’ll output the Vale messages in the CLI format to the file vale.log.

.vale/run
#!/bin/bash

if [ -z "$VALE_CONFIG_PATH" ]; then
  VALE_CONFIG_PATH="$ANTORA_PLAYBOOK_DIR/.vale/config.ini"
fi

OUTPUT_STYLE='CLI'
OUTPUT_FILE="$ANTORA_PLAYBOOK_DIR/vale.log"

if [ "$ANTORA_ORIGIN_WORKTREE" == "$ANTORA_PLAYBOOK_DIR" ]; then
  OUTPUT_STYLE="$ANTORA_PLAYBOOK_DIR/.vale/templates/rdjsonl.tmpl"
  OUTPUT_FILE="$ANTORA_PLAYBOOK_DIR/vale.rdjsonl"
else
  echo "Running Vale (url: $ANTORA_ORIGIN_URL | refname: $ANTORA_ORIGIN_REFNAME | start_path: ${ANTORA_ORIGIN_START_PATH:-.})" >> "$OUTPUT_FILE"
fi

NO_COLOR=1 \
"$NPX" -y --prefer-offline --package @jti/vale --package @asciidoctor/cli vale \
--config "$VALE_CONFIG_PATH" --glob '!**/.*.adoc' --no-wrap --output "$OUTPUT_STYLE" "${ANTORA_ORIGIN_START_PATH:-.}" \
>> "$OUTPUT_FILE"
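
The partitioning rule in the wrapper script can be restated as a small function, which may be handy if you later port the script to Node.js (for example, to support Windows). This is only a sketch of the same branching logic; the selectOutput function is hypothetical.

```javascript
'use strict'

// Hypothetical helper: picks the Vale output style and report file for an
// origin, following the same rule as the wrapper script.
function selectOutput ({ worktree, playbookDir }) {
  // same worktree as the playbook: emit rdjsonl so reviewdog can ingest it
  if (worktree === playbookDir) {
    return { style: `${playbookDir}/.vale/templates/rdjsonl.tmpl`, file: `${playbookDir}/vale.rdjsonl` }
  }
  // any other origin: keep the plain CLI output in a separate log
  return { style: 'CLI', file: `${playbookDir}/vale.log` }
}

console.log(selectOutput({ worktree: '/work/site', playbookDir: '/work/site' }))
console.log(selectOutput({ worktree: '/tmp/collector/other', playbookDir: '/work/site' }))
```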

The final step is to pass the rdjsonl messages to reviewdog. We’ll use a GitHub Action that runs on a pull request for demonstration purposes.

jobs:
  build:
    runs-on: ubuntu-latest
    steps:
    # ...
    - name: Generate Site
      run: npx --package antora --package @antora/collector-extension antora --extension .vale/lib/antora-vale-integration-extension.js antora-playbook.yml
    - name: Install reviewdog
      uses: reviewdog/action-setup@v1
      with:
        reviewdog_version: latest
    - name: Run reviewdog
      env:
        REVIEWDOG_GITHUB_API_TOKEN: ${{ secrets.GITHUB_TOKEN }}
      run: |
        [ -f vale.log ] && cat vale.log
        [ -f vale.rdjsonl ] && cat vale.rdjsonl | reviewdog -f=rdjsonl -reporter=github-pr-check

When Antora runs, it may generate the vale.rdjsonl file, which is then piped to reviewdog if it exists. The reviewdog command routes those messages to the GitHub checks system.

reviewdog can only account for paths in the current repository and branch, which is why the Vale output has to be split between vale.rdjsonl and vale.log.

In order to use this setup on individual content repositories, you’ll probably want to distribute the .vale directory as a repository or package. For example, in a worktree for a content repository that has an Antora playbook file for authoring, you can use the following command to retrieve the Vale configuration:

$ git clone https://gitlab.com/opendevise/oss/antora-dot-vale .vale

You can then run Antora with the Vale integration extension as shown earlier. In this way, the Vale integration works equally well on an individual content source as it does on a whole site.

If you want to use a custom Vale configuration, you can override the config file location by setting the VALE_CONFIG_PATH environment variable when running Antora.

For more information on using Vale with AsciiDoc, refer to the page AsciiDoc style for Vale in the Red Hat Documentation.