How to Organize Your Content Files

Antora employs both convention and configuration to aggregate content and generate your site. Before setting up or migrating your repositories, let’s review some key concepts that could impact how you organize your documentation projects and content files to work with Antora.

Storing your content source files

Antora can retrieve content source files from numerous git repositories by searching for files or their symlinks under a start path or multiple start paths in branches, tags, or the local worktree. The repositories Antora uses don’t have to be reserved exclusively for storing documentation (hence the start path). Antora can retrieve files from repositories that also host application code, tests, and other materials in sibling hierarchies. Antora relies on both convention and configuration to identify the documentation content.

In order to fetch source files from multiple and multi-use repositories, Antora requires that the documentation files be:

Located under a content source root
Labeled with a file named antora.yml
Organized into a standard set of directories

Classifying your content source files

Once Antora collects the source files from all content source roots, it classifies each file by assigning metadata to it, which is used to uniquely identify the file within the site. The file’s identifier, called a resource ID, is used for creating references from pages, other resources, and the configuration. This step also implicitly partitions the source files into component versions.

Antora’s virtual filesystem

Antora decouples source files from their storage locations after it collects them. For all intents and purposes, the origin of each file is irrelevant. In other words, Antora never goes back to the filesystem or git repository to read the file once it’s discovered and loaded. Antora bases all of its file operations on the virtual filesystem (VFS) it creates after it collects the files.

The only aspect of a file that maps back to the location on the filesystem is the family-relative path. And even this association is maintained merely as a convenience for the author. Aside from the family-relative path, all other parts of the file’s identity are based on associative metadata, such as the component name, version, module name, and family.

File metadata

So how does a file get this metadata? All files in the same content source root inherit the component name and version from the component version descriptor file, named antora.yml. These descriptor files help Antora sort and organize all of the collected source files into component versions. You can think of a component version as all of the documentation for a version of a project. For example, you’re reading a page in the Antora 3.0 component version right now.

These antora.yml files are how content that belongs to the same version of a project can be identified by Antora. It’s also how component versions are defined and populated implicitly.

Inside a content source root, files are further grouped into module and family folders, which provide two more facets of a source file’s identity. Finally, the family-relative path is captured to uniquely identify a source file within a family, even across multiple repositories or git references.

File locations and URLs

The location of a source file doesn’t dictate the location of the published file. Once a source file is loaded into Antora’s VFS, the file’s metadata is manipulated, which includes computing the file’s output location and URL. Each family of files has different rules for how these values are computed. The association between where the source file is found, where the published file is placed in the site, or how that file is accessed isn’t hardwired.

See What’s a component version? and What’s antora.yml? to learn how to assign a component name, version and other optional information to groups of content source files.