Apply Tags from CSV¶
The -apply-tags-from-csv <file> mode applies reviewed annotation decisions back to source files (Java and C#). It reads a MethodAtlas CSV that a human has already reviewed and edited, then writes the tag and display-name annotations recorded in that CSV.
Language support¶
Source write-back is implemented only for Java and C# — the two languages whose discovery plugins ship a SourcePatcher SPI implementation. See the table in Source Write-back — Language support for the full matrix.
Files in any other discovered language (TypeScript/JS, Go, Python, PowerShell, SAP ABAP, COBOL) are skipped during the apply phase. The engine prints a per-file notice such as:
Apply-tags-from-csv: skipped tests/test_login.py — source write-back is not supported for this language (currently Java and C# only)
and appends the aggregate skip count to the summary line:
Apply-tags-from-csv complete: 6 change(s) in 2 file(s); 0 mismatch(es) skipped. 3 file(s) skipped (no source write-back support for the language).
CSV rows describing tests in unsupported languages are reported as mismatches (the source-method inventory only knows about files for which a SourcePatcher is available). When you scan a polyglot tree, set a strict -mismatch-limit (for example 1) only after auditing which rows are expected to be unwritable, because those rows count toward the limit.
When to use this mode¶
- Your organisation requires human sign-off before any source annotation is written — the CSV is the review artefact.
- You want a permanent, version-controlled record of every annotation decision (the committed CSV serves this purpose).
- You are applying bulk annotation decisions across a large test suite and want a single review step before any file is touched.
- You want to integrate with a spreadsheet-based review workflow: export the CSV, distribute it for review, collect the approved version, apply it.
If you want AI annotations applied immediately without a separate review step, use -apply-tags instead.
This mode is the recommended approach for teams that want human oversight before touching source code.
Typical workflow¶
# 1. Produce a CSV (optionally with AI suggestions)
./methodatlas -ai src/test/java > review.csv
# 2. Open review.csv in a spreadsheet or text editor.
# Adjust the tags and display_name columns to reflect the desired state.
# Save the file.
# 3. Apply the decisions back to source files
./methodatlas -apply-tags-from-csv review.csv src/test/java
After step 3, re-running MethodAtlas on the same source tree would produce a CSV
that matches the tags and display_name columns of review.csv.
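One way to check this round trip is to compare the two CSVs on the curated columns. The Python sketch below is an illustration and not part of MethodAtlas: the helper names load_decisions and diff_decisions are hypothetical, the column names are taken from the example CSV on this page, and treating tag order as irrelevant is an assumption.

```python
import csv
import io


def load_decisions(csv_text):
    """Map '<fqcn>::<method>' to (set of tags, display_name) for each row."""
    decisions = {}
    for row in csv.DictReader(io.StringIO(csv_text)):
        key = f"{row['fqcn']}::{row['method']}"
        # Assumption: tags are compared as a set, so order does not matter.
        tags = frozenset(
            t.strip() for t in (row.get("tags") or "").split(";") if t.strip()
        )
        decisions[key] = (tags, row.get("display_name") or "")
    return decisions


def diff_decisions(reviewed_text, rescanned_text):
    """Return the keys whose tags or display_name differ between two CSVs."""
    reviewed = load_decisions(reviewed_text)
    rescanned = load_decisions(rescanned_text)
    all_keys = reviewed.keys() | rescanned.keys()
    return sorted(k for k in all_keys if reviewed.get(k) != rescanned.get(k))
```

Pass it the text of review.csv and of a fresh scan; an empty list means the two files agree on every row.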
What the engine does¶
For each test method found in the source tree that has a matching row in the CSV:
| CSV column | Value | Effect on source |
|---|---|---|
| tags | semicolon-separated list | Removes all existing @Tag and @Tags annotations; adds one @Tag("…") per entry |
| display_name | non-empty text | Replaces any existing @DisplayName with the given text |
| display_name | empty string | Removes @DisplayName if present |
| display_name | column absent from CSV | Leaves @DisplayName unchanged (backward compatibility with old CSV files) |
Required imports (org.junit.jupiter.api.Tag, org.junit.jupiter.api.DisplayName)
are added only to files where they become necessary.
All other formatting — whitespace, comments, blank lines, import order — is preserved by JavaParser's lexical-preserving printer.
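To make the tags and display_name rules concrete, the sketch below renders the annotation lines that a single row asks for. It is a Python illustration under stated assumptions, not the engine's code: render_annotations is a hypothetical helper, and the trimming of entries, the skipping of blank entries, the string escaping, and the output order are all assumptions.

```python
def render_annotations(tags_cell, display_name):
    """Return the Java annotation lines implied by one CSV row.

    tags_cell    -- the semicolon-separated tags value, e.g. "security;auth"
    display_name -- the display name text, or None/"" for no @DisplayName
    """
    lines = []
    if display_name:
        # Assumption: backslashes and double quotes are escaped for Java.
        escaped = display_name.replace("\\", "\\\\").replace('"', '\\"')
        lines.append(f'@DisplayName("{escaped}")')
    for tag in tags_cell.split(";"):
        tag = tag.strip()
        if tag:  # Assumption: blank entries are ignored.
            lines.append(f'@Tag("{tag}")')
    return lines


for line in render_annotations("security;auth", "Login is rejected"):
    print(line)
```

A row with an empty tags cell and no display name yields no lines at all, which corresponds to removing every existing @Tag and @Tags annotation without adding new ones.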
The display_name three-way contract¶
The three behaviours of the display_name column are illustrated by this example CSV:
fqcn,method,loc,tags,display_name
com.example.AuthTest,loginWithExpiredToken,8,security;auth,SECURITY: auth — login is rejected when the token has expired
com.example.AuthTest,loginWithValidCredentials,12,,""
com.example.PaymentTest,chargeCard,15,payment,
| Row | display_name value | What happens in source |
|---|---|---|
| loginWithExpiredToken | SECURITY: auth — login is rejected… (non-empty) | @DisplayName is added with this text, or existing value is replaced |
| loginWithValidCredentials | "" (empty string; note the quoted empty field) | Any existing @DisplayName is removed from the source |
| chargeCard | (column present but cell is empty, no quotes) | @DisplayName in source is left exactly as it is |
The distinction between an empty string ("") and an absent value (empty cell without quotes) is significant: the former is an explicit instruction to remove the annotation; the latter means "do not touch it". This allows you to produce a CSV that only specifies tags for some methods while leaving display names alone.
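The contract can be sketched as a small decision function over the raw, unparsed text of the cell. This is a Python illustration only: display_name_action is a hypothetical name, and how MethodAtlas itself reads the cell is not described here. The sketch also shows a practical pitfall for review scripts: Python's csv.reader, with default settings, returns an empty string for both the quoted and the unquoted empty cell, so a script that reads and rewrites the CSV needs to take care to keep the quoting.

```python
import csv


def display_name_action(raw_cell):
    """Classify the raw text of a display_name cell.

    raw_cell is the cell exactly as written in the file, or None when the
    CSV has no display_name column at all.
    """
    if raw_cell is None or raw_cell == "":
        return "keep"    # absent value: leave @DisplayName untouched
    if raw_cell == '""':
        return "remove"  # explicit empty string: remove @DisplayName
    return "set"         # any other text: add or replace @DisplayName


# With default settings, csv.reader does not preserve the distinction.
quoted_row = next(csv.reader(['com.example.AuthTest,loginWithValidCredentials,12,,""']))
bare_row = next(csv.reader(["com.example.PaymentTest,chargeCard,15,payment,"]))
print(repr(quoted_row[-1]), repr(bare_row[-1]))
```

Because many spreadsheet tools and CSV libraries normalise empty cells in the same way, it is worth checking the raw file in a text editor before applying it.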
For flags and full format documentation, see CLI reference — -apply-tags-from-csv.
Promoting AI suggestions into the curated columns¶
A scan run with -ai populates the advisory ai_tags and ai_display_name columns. The apply engine never reads those columns by default: it applies only the human-curated tags and display_name columns. This separation is intentional — it is the human review step, and it keeps unvalidated AI output from reaching source code.
The recommended way to act on AI suggestions is therefore to review them and copy the approved values into the curated columns yourself (in a spreadsheet, a script, or by hand), exactly as the end-to-end scenario below describes. The curated columns are your sign-off.
-promote-ai is risky and not recommended
The -promote-ai flag short-circuits the review step: where a curated column
is blank, it copies the AI suggestion straight into source. This writes
unvalidated AI output into your codebase and defeats the purpose of the
apply-from-csv workflow. Do not enable it unless the promotion has been
deliberately rethought and approved for your environment.
When -promote-ai is supplied (on the command line, or via promoteAi: true in a YAML config file), promotion is applied per field, independently, and only where the curated value is blank and a non-blank AI value exists:
| Curated value | AI value | Outcome |
|---|---|---|
| present | anything | curated value is used (no promotion) |
| blank | present | AI value is promoted into source |
| blank | blank/absent | nothing to promote |
Every promoted value is counted on the summary line so the run leaves an audit trail of what came from AI:
Apply-tags-from-csv complete: 6 change(s) in 1 file(s); 0 mismatch(es) skipped. 6 value(s) promoted from AI columns (unvalidated — used with -promote-ai).
Add -verbose to itemise each promotion as a [promote-ai] line. See the CLI reference — -promote-ai for the full contract.
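The per-field rule in the table above can be sketched in a few lines of Python. This is an illustration, not the engine's implementation: resolve_field is a hypothetical name, and treating None and whitespace-only text as blank is an assumption.

```python
def resolve_field(curated, ai, promote_ai):
    """Return (value_to_apply, was_promoted) for one field of one row."""

    def is_blank(value):
        # Assumption: None and whitespace-only strings both count as blank.
        return value is None or value.strip() == ""

    if not is_blank(curated):
        return curated, False  # curated value always wins
    if promote_ai and not is_blank(ai):
        return ai, True        # blank curated value, AI value promoted
    return curated, False      # nothing to promote


# Each field is resolved independently of the other.
tags, tags_promoted = resolve_field("", "security;auth", promote_ai=True)
name, name_promoted = resolve_field("Login is rejected", "AI wording", promote_ai=True)
print(tags, tags_promoted, name, name_promoted)
```

Counting the rows where the second element is true gives the kind of figure the summary line reports as values promoted from AI columns.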
The CSV as desired state¶
The CSV is a complete desired-state specification, not an incremental patch. Every test method currently present in the source tree must have a corresponding row in the CSV, and every row in the CSV must correspond to a method in the source tree. Deviations in either direction are counted as mismatches.
This invariant is intentional: it prevents silent drift between the reviewed CSV and the codebase. If a method was added to or deleted from the source tree after the CSV was produced, the mismatch is surfaced before any source file is touched.
Mismatch handling¶
A mismatch occurs when:
- A row in the CSV has no matching method in the current source tree (method was deleted or renamed).
- A test method in the source tree has no matching row in the CSV (method was added after the CSV was produced).
The -mismatch-limit <n> flag controls the response:
| Setting | Behaviour |
|---|---|
| -1 (default) | Log each mismatch as a warning; proceed with all matched methods |
| 1 | Abort without making any changes as soon as one mismatch is detected; recommended for CI |
| n | Abort when the mismatch count reaches or exceeds n |
When the limit is reached, MethodAtlas prints each mismatch and exits with code 1:
MISMATCH (in CSV, not in source): com.example.LoginTest::removedMethod
Apply-tags-from-csv aborted: 1 mismatch(es) >= limit 1. No source files were modified.
The mismatch count is computed before any file is written, so the run is all-or-nothing: either the limit is hit and no source file is modified, or every matched method is updated.
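The two mismatch directions and the limit check can be sketched with plain set arithmetic. The Python below is an illustration, not the tool's code: find_mismatches and should_abort are hypothetical names, keys use the fqcn::method form shown in the mismatch message above, and treating every non-positive limit as "no limit" is an assumption (only -1 is documented).

```python
def find_mismatches(csv_keys, source_keys):
    """Return (in CSV but not in source, in source but not in CSV)."""
    csv_keys, source_keys = set(csv_keys), set(source_keys)
    return sorted(csv_keys - source_keys), sorted(source_keys - csv_keys)


def should_abort(mismatch_count, limit):
    """Abort when the count reaches or exceeds a positive limit."""
    # Assumption: any limit <= 0 means "never abort"; only -1 is documented.
    return limit > 0 and mismatch_count >= limit


csv_only, src_only = find_mismatches(
    {"com.example.LoginTest::removedMethod", "com.example.LoginTest::login"},
    {"com.example.LoginTest::login", "com.example.LoginTest::addedMethod"},
)
total = len(csv_only) + len(src_only)
print(csv_only, src_only, should_abort(total, 1), should_abort(total, -1))
```

Running the check before touching any file is what makes the abort safe: the decision depends only on the two key sets, not on any write that has already happened.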
End-to-end scenario: human-reviewed annotation campaign¶
A security team wants to annotate 40 test methods across a legacy codebase. They require sign-off on every annotation before any source file is touched.
# Step 1: produce AI suggestions as a CSV
./methodatlas \
-ai \
-ai-provider openai \
-ai-api-key-env OPENAI_API_KEY \
src/test/java > review.csv
The review.csv file now contains AI-suggested ai_display_name and ai_tags values for every test method. The security team opens it in a spreadsheet application and:
- Copies ai_display_name values they agree with into the display_name column.
- Copies ai_tags values they agree with into the tags column (replacing security;auth etc. as appropriate).
- Leaves display_name blank (not "") for methods where they do not want to set a display name.
- Removes rows for methods they do not want to annotate.
After review, the security team saves the file as approved.csv and sends it back to the engineering team.
# Step 2: apply approved decisions with a permissive mismatch limit (this modifies files; there is no dry-run mode)
./methodatlas \
-apply-tags-from-csv approved.csv \
-mismatch-limit -1 \
src/test/java
# Step 3: review the diff
git diff src/test/java
# Step 4: commit
git add src/test/java approved.csv
git commit -m "chore: apply security team approved annotations"
Committing approved.csv alongside the source changes preserves the full record of what was approved, by whom, and when (via git log).
CI integration¶
Use a strict mismatch limit in automated pipelines to guard against a stale CSV. With -mismatch-limit 1, the run aborts on the first mismatch, before any file is modified.
A non-zero exit code fails the pipeline if the codebase has diverged from the reviewed CSV, requiring the team to re-run the review cycle before re-applying.
-mismatch-limit examples¶
The following examples illustrate the three meaningful settings:
| Command | Behaviour |
|---|---|
| ./methodatlas -apply-tags-from-csv r.csv -mismatch-limit -1 src/test/java | Apply all matched rows; log mismatches as warnings and continue. Use during initial adoption or exploratory runs. |
| ./methodatlas -apply-tags-from-csv r.csv -mismatch-limit 1 src/test/java | Abort immediately on the first mismatch, before modifying any file. Use in CI to enforce that the CSV is always current. |
| ./methodatlas -apply-tags-from-csv r.csv -mismatch-limit 5 src/test/java | Allow up to 4 mismatches; abort if 5 or more are detected. Use when a small number of pending method additions is acceptable during a transition period. |
Summary output¶
After a run, MethodAtlas prints a summary line to standard output. Each modified file is listed on its own line beforehand:
Modified: src/test/java/com/example/LoginTest.java (+3 change(s))
Modified: src/test/java/com/example/TokenTest.java (+4 change(s))
Apply-tags-from-csv complete: 7 change(s) in 2 file(s); 0 mismatch(es) skipped.
No CSV is produced. Standard output is consumed entirely by the summary.
Modifies source files in place
-apply-tags-from-csv edits Java and C# source files directly. There is no dry-run
mode. Commit or back up your work before running.
Troubleshooting: nothing was applied¶
If a run completes with 0 change(s) in 0 file(s), the CSV rows did not line up with any method discovered in the source tree. The engine matches on a lookup key of the form <fqcn>::<method>, built independently from the CSV's fqcn/method columns and from the source the patcher parses. When the two keys differ, the row is silently treated as a mismatch.
Re-run the same command with -verbose added to see exactly what was compared.
The output prints the working directory and resolved scan roots, every CSV key, every source key, and the key-by-key match result (CSV-only / SRC-only). Common causes it exposes:
- Wrong working directory: the scan root resolves to a tree without the expected files; the printed absolute scan root and an empty source-key list make this obvious.
- Fully qualified class name mismatch: the CSV's fqcn differs from the package the source declares (for example the CSV was generated against a different module or a relocated package).
- Method-name mismatch: the CSV method column holds something other than the method identifier the patcher discovers.
See the -verbose reference for an annotated example of the diagnostic output.
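When comparing against the -verbose output by eye, it can help to list the keys your CSV would produce. The Python sketch below is an illustration that assumes the fqcn and method column names and the fqcn::method key form described above; csv_lookup_keys is a hypothetical helper, not a MethodAtlas API. It deliberately does not trim the values, and prints each key with repr so that stray whitespace or other invisible characters stand out.

```python
import csv
import io

# Sample rows modelled on the example CSV earlier on this page.
SAMPLE = (
    "fqcn,method,loc,tags,display_name\n"
    "com.example.AuthTest,loginWithValidCredentials,12,,\n"
    "com.example.PaymentTest,chargeCard,15,payment,\n"
)


def csv_lookup_keys(csv_text):
    """Build '<fqcn>::<method>' for every row, without trimming anything."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [f"{row['fqcn']}::{row['method']}" for row in reader]


for key in csv_lookup_keys(SAMPLE):
    print(repr(key))
```

Any key in this list that does not appear verbatim among the source keys in the diagnostic output points at one of the causes listed above.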
Reviewing changes¶
After write-back, inspect the diff before committing, for example with git diff src/test/java as in the end-to-end scenario above.
See CLI reference for the full flag reference.