Configure a Scan Action
The scan action is the most important part of Collector because it’s what imports additional files or metadata into the content aggregate. (Those files, in turn, get added to the content catalog.) The scan action looks for matching files under the specified directory and imports them into a component version bucket. The scan action can also be used to update the metadata in antora.yml, or even change its identity.
You can configure one or more scan actions using the scan key.
This page explains how to use the scan key and the keys it accepts.
File import
The files discovered by a scan operation are added to the component version bucket in the content aggregate before the content is classified. If a file being imported already matches the identity of a file in the bucket, the contents and stat of the existing file are updated.
If an antora.yml file (often a generated one) is discovered at the root of the scanned directory, the contents of that file are parsed.
By default, the parsed data is overlaid onto the current component version bucket.
That means a generated antora.yml file can be used to update the name, version, and prerelease of the bucket.
If the bucket has the prerelease property set, but the imported file is missing this key, the prerelease property is removed from the bucket.
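As a minimal sketch of the overlay behavior described above, a generated antora.yml at the root of the scan directory might look like the following. The values are hypothetical and chosen only for illustration.

```yaml
# Hypothetical generated antora.yml at the root of the scan directory.
# With the default behavior, these values are overlaid onto the current bucket.
name: colorado
version: '5.7.0'
# If this key were omitted while the bucket had prerelease set,
# the prerelease property would be removed from the bucket.
prerelease: true
```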
To learn how to configure Collector to target a different bucket (creating it, if necessary) instead of updating the current one, refer to the section Target a different component version bucket.
The imported files are indistinguishable from ones discovered in a static Antora content root. It’s as though the discovered files are in the repository branch that Antora scans, only they are added after the fact.
scan key
A scan action is configured using the scan configuration key for the Collector instance.
The scan key must be nested under the collector key.
ext:
  collector:
    scan:
      dir: build-foo
If the value of the collector key is an array, the scan key must be specified as a key on an array entry.
The scan key may be used in more than one entry in that array.
ext:
  collector:
  - scan:
    - dir: build-foo
    - dir: build-bar
  - scan: build-baz
Here’s a real-world example that shows how to configure a scan action:
name: colorado
title: Colorado
version: '5.6.0'
ext:
  collector:
    scan:
    - dir: build/generated
      files: '**/*.adoc'
      clean: true
    - dir: build/log
      clean: true
The value of the scan key can be a map, an array, or a string.
If the value is a string, it’s assumed to be the value of the dir key of a map.
If the value is an array (i.e., a list of entries), each entry must be a string or a map consisting of built-in key-value pairs.
Each scan entry is invoked sequentially, in the order specified in the array.
If the value is a map (i.e., the leading - marker is dropped), it’s assumed to be a single-entry array.
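To illustrate the rules above, the following sketch shows three forms that should be treated the same way according to this description. Each form is written as a separate YAML document, and in a real configuration the scan key would be nested under the collector key. The directory name is reused from the earlier example.

```yaml
# Form 1: a string, taken as the value of the dir key
scan: build-foo
---
# Form 2: a map (a single entry without the leading - marker)
scan:
  dir: build-foo
---
# Form 3: an array that contains one map entry
scan:
- dir: build-foo
```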
Acceptable keys for the map value are listed in the table below.
Key | Default | Type | Description |
---|---|---|---|
dir* | undefined | String (absolute path, path relative to start path if it starts with ./, or path relative to worktree) | The directory to scan. |
files | **/* | String (micromatch pattern) | The pattern to use for filtering files. |
into | empty | String (a base path) or Map | The base path to prepend to imported files or a destination component version. |
clean | false | Boolean | Whether to clean the scan directory before running commands. |
* required
dir key
The dir key specifies a directory to scan.
This is the only required key for a scan action.
The scan typically looks for files generated by a command that was run previously, though it can be used to discover any file in the worktree.
The extension then imports any files discovered into the current bucket in the content aggregate.
The value of the dir key is a string path for a single directory.
If the path is absolute, that value is used as is.
If the path is . or starts with ./, it’s resolved starting from the start path (i.e., the root of the current origin).
Otherwise, the path is resolved relative to the worktree directory.
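The three resolution rules can be sketched side by side as follows. All paths here are hypothetical and only serve to show which rule applies to which form.

```yaml
# Sketch only: all paths are hypothetical.
scan:
# Absolute path: used as is.
- dir: /var/cache/docs-build
# Starts with ./ : resolved from the start path (the root of the current origin).
- dir: ./build/generated
# Anything else: resolved relative to the worktree directory.
- dir: build/generated
```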
files key
The files key specifies which files to import from the scan directory.
The files key accepts a micromatch pattern (i.e., glob), supporting the same syntax as the worktrees key on a content source in Antora.
scan:
  dir: build
  files: '**/*.adoc'
By default, all files found in the scan directory are imported (though not necessarily classified by Antora).
into key
For each file discovered by a scan (other than antora.yml), a virtual file is added to the bucket in the content aggregate for the current component and version (as specified in antora.yml) by default.
The file is added using the path relative to the scan dir.
Thus, there’s an assumption that the scanned files are organized according to the standard Antora structure (e.g., modules/ROOT/pages/generated.adoc).
The into key allows this assumption to be broken.
If the files are not organized as an Antora content root, they can be remapped using the scan configuration.
The into key specifies a path to prepend to the relative path of all files discovered by the current scan operation.
This path should be a relative directory (e.g., modules/ROOT/pages) and should use / to separate path segments, since it represents a virtual path.
This key is not set by default.
For example:
scan:
  dir: build/pages
  into: modules/ROOT/pages
The into key accepts a base path as a string or a map.
When the value is a map, it can also be used to specify the target component name and/or version.
By using the into key, the scanned files do not have to be organized according to the standard Antora structure.
Instead, the relative path is synthetic.
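To make the remapping concrete, here is a sketch that traces one file through a scan that uses a string value for the into key. The file name in the comments is hypothetical.

```yaml
# Sketch only: the file name in the comments is hypothetical.
scan:
  dir: build/pages
  files: '**/*.adoc'
  into: modules/ROOT/pages
# A file found at:        build/pages/cli/options.adoc
# has the relative path:  cli/options.adoc
# and is imported as:     modules/ROOT/pages/cli/options.adoc
```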
Target a different component version bucket
The into key also provides a way to import the files into a component version bucket that’s different from the one in which Collector is running.
To do so, the value of the into key must be a map.
Acceptable keys when the value of the into key is a map are listed in the table below.
Key | Default | Type | Description |
---|---|---|---|
name | undefined | String | The name of the target component version. |
version | undefined | String | The version of the target component version. |
dir | empty | String (a base path) | The base path to prepend to imported files. |
The name and/or version of the target component version bucket can be specified using the name and version keys, respectively.
(If only one of the keys is specified, the value of the other key is inherited from the current component version descriptor.)
For example:
scan:
  dir: build/apidocs
  into:
    name: apidocs
    version: ~
The same can be achieved by specifying the target name and version in a scanned antora.yml file and also setting the create: true key.
In this case, the into key is not needed on the scan action.
For example:
name: apidocs
version: ~
create: true
Setting the create: true key in the generated antora.yml file tells Collector to use the target bucket, creating it if necessary, instead of updating the identity of the current bucket.
If you want to prepend a base directory path to the files while also specifying a target component version bucket, the path should be specified in the dir key of the into map.
For example:
scan:
  dir: build/apidocs
  into:
    name: apidocs
    version: ~
    dir: modules/ROOT/attachments
The dir key nested under the into key is effectively the same as an into key with a string value, except that a target component version bucket is also specified.
clean key
If the clean key is set to true on the entry, that implicitly creates a separate clean entry using the same dir.
The clean is scoped to the current step (the same step that defines the run key).
The clean action is only needed in the following two cases:
- The Collector instance is running on an existing worktree. The worktree may have build files generated from a separate process or from a Collector instance from a previous Antora run. If you want to ensure that no residual files are discovered, the scan directory (and possibly other directories) should be cleaned.
- The temporary worktree for the Collector instance is kept indefinitely. In this case, the Collector instance will be recycling a worktree it has used on a previous Antora run and thus may require cleaning.
If Collector creates a new worktree in which to run the Collector instance, there’s usually no need to clean. However, if the Collector instance has multiple steps, and those steps generate files into the same scan directory, it may be necessary to clean the scan directory at the start of each step. You’ll need to put some thought into when you want the scan directory to be cleaned.
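The multi-step case might be configured along the lines of the sketch below. The directory names are hypothetical, and the run action that each step would normally define to generate the files is omitted, so treat this as an outline of where clean goes and not as a complete configuration.

```yaml
# Sketch only: directory names are hypothetical, and the run action
# that each step would normally define is omitted.
ext:
  collector:
  # Step 1: clean build/generated, then import its .adoc files.
  - scan:
      dir: build/generated
      files: '**/*.adoc'
      clean: true
  # Step 2: reuses the same scan directory, so it is cleaned again
  # at the start of this step to avoid picking up files left by step 1.
  - scan:
      dir: build/generated
      files: '**/*.adoc'
      clean: true
```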