fj run view: distinguish private-repo log auth failure from a missing run, instead of masking it as "no logs" #91

Closed
opened 2026-06-08 13:58:26 +00:00 by stephen · 1 comment
Owner

What

When fj run view <n> --log/--log-failed (and the plain fj run view <n> summary) cannot read a run's logs, distinguish a 404-because-private/unauthenticated response from a genuinely-missing run/job, and print an actionable error for the former instead of the misleading "no logs for that run/job/attempt".

Root cause: fj reads logs from the Forgejo web frontend route (POST/GET /{owner}/{repo}/actions/runs/{run}/jobs/{job}/attempt/{attempt}, built in web_run_url, src/api/workflow_view.rs), which authenticates with a session cookie + CSRF token, not an API token. On a private repo that route returns 404 to a token-only request to hide the resource. log_route_error (src/api/workflow_view.rs, ~line 226) maps every 404 to "no logs for that run/job/attempt", so an auth/permission failure is reported as a missing run, including for known-successful runs.

Shippable approach: the 404 alone cannot tell auth-rejection from a real miss, but the token-accessible /api/v1/repos/{owner}/{repo}/actions/tasks list can. When the web route 404s, probe that list (already the confirmed-working token surface, see the module header in src/api/workflow_view.rs):

  • run exists in /actions/tasks and the requested --job is in range, but the web route 404s -> auth/permission case. Print something like: "Cannot read logs for run N: the Forgejo web log route rejected token auth. Log retrieval on private repos needs session auth (cookie/CSRF); an API token is not accepted by this route. See rasterstate/fj#103." A run that exists but was requested with an out-of-range --job is not this case; it falls under the next bullet.
  • run absent from /actions/tasks (or job index out of range) -> keep the existing "check the run number against fj run list, and that --job exists" message.
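
A minimal sketch of the decision the two bullets describe, written as a pure function so it can be unit-tested without HTTP. The names `LogFailure` and `classify_log_404` are hypothetical and not taken from `src/api/workflow_view.rs`. The sketch assumes the tasks list yields one entry per job (so the number of entries carrying a run number is that run's job count) and that `--job` is a zero-based index; both assumptions would need checking against the real response shape. The fallback message text is an approximation of the existing one.

```rust
/// Why a log fetch failed, decided only after the web route has returned 404.
/// Hypothetical type for illustration; not part of the existing code.
#[derive(Debug, PartialEq, Eq)]
pub enum LogFailure {
    /// The token can see the run and the job index is valid, yet the web route 404s.
    AuthRejected { run: u64 },
    /// The token can see the run, but the requested job index does not exist.
    JobOutOfRange { run: u64, job: usize, job_count: usize },
    /// The run does not appear in the tasks list at all.
    RunMissing { run: u64 },
}

/// `task_run_numbers` holds the run number of every entry in the tasks list.
/// ASSUMPTION: one entry per job, and `job` is a zero-based index.
pub fn classify_log_404(run: u64, job: usize, task_run_numbers: &[u64]) -> LogFailure {
    let job_count = task_run_numbers.iter().filter(|&&n| n == run).count();
    if job_count == 0 {
        LogFailure::RunMissing { run }
    } else if job >= job_count {
        LogFailure::JobOutOfRange { run, job, job_count }
    } else {
        LogFailure::AuthRejected { run }
    }
}

impl LogFailure {
    /// User-facing text. Only the auth case gets the new actionable message;
    /// the other two keep the existing "check the run number / job" wording.
    pub fn message(&self) -> String {
        match self {
            LogFailure::AuthRejected { run } => format!(
                "Cannot read logs for run {run}: the Forgejo web log route rejected token auth. \
                 Log retrieval on private repos needs session auth (cookie/CSRF); \
                 an API token is not accepted by this route. See rasterstate/fj#103."
            ),
            LogFailure::JobOutOfRange { .. } | LogFailure::RunMissing { .. } => String::from(
                "no logs for that run/job/attempt: check the run number against `fj run list`, \
                 and that `--job` exists",
            ),
        }
    }
}
```

Keeping the classification separate from the probe means the 404 handler only has to fetch the tasks list and pass the run numbers in. The out-of-range variant carries `job_count` so a later change could name the valid range without reshaping the type.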

This is the error-quality + detection slice only. Actually retrieving private-repo logs is split into rasterstate/fj#103.

Priority

p2. Not a crash, and a fallback exists (/actions/tasks for pass/fail, web UI for logs), but the misleading "no logs" message actively misdirects real CI triage (see rasterstate/fj#92, where triage chased runner/autoscaler limits for an ordinary step failure whose logs existed).

Why

Reproduced on rasterhub.com against private repos rasterstate/fjord-ios (run 395 failed, 384 succeeded) and rasterstate/flux (run 171): every form of log retrieval fails identically for green and red runs. --debug shows the web-route POST returning 404 under token auth. The "no logs for that run/job/attempt" text implies a bad run/job number, but the numbers are valid (they appear in fj run list) and the logs exist (confirmed via action_task_step.log_length server-side). The wrong error is worse than no error: it points at a nonexistent cause. Duplicate report with the same root cause: rasterstate/fj#92.

Acceptance

  • On a private repo where the web log route 404s under token auth and the run is present in /api/v1/.../actions/tasks, fj run view <n> --log/--log-failed prints an actionable auth/permission error (naming session-vs-token auth and linking rasterstate/fj#103), not "no logs for that run/job/attempt".
  • A genuinely-missing run (absent from the tasks list) or an out-of-range --job still gets the existing "check the run number / job index" message.
  • The plain fj run view <n> summary path surfaces the same distinction when it hits the same web route.
  • Wiremock coverage in src/client/integration_tests.rs: a private-repo case (web route 404 + tasks list 200 that contains the run) asserts the auth-distinguishing message; a missing-run case (tasks list 200 without the run) asserts the existing message.
  • cargo fmt --check, cargo clippy --all-targets --all-features -- -D warnings, and cargo test --all pass.

Dependencies

None for this slice; it lands independently. It is the prerequisite for rasterstate/fj#103 (actually retrieving private-repo logs), whose documented-limitation path reuses the actionable error introduced here.

Out of scope

  • Retrieving private-repo logs via a token-accepting endpoint or cookie/CSRF auth: rasterstate/fj#103 (parked, pending confirmation that this Forgejo version exposes any token-auth log endpoint at all).
  • Public-repo log retrieval, which already works through the web route.

Size

S

Author
Owner

While researching adoption blockers I found the auth-masking pattern this issue describes is not limited to run view --log; the sibling Actions commands hand-roll the same error construction and share the same blind spot, so it's worth fixing as one class rather than per-command.

In src/api/workflow_run.rs, several handlers do let body = res.text().await.unwrap_or_default(); and then build a message that (a) drops the HTTP status on the empty-body branch and (b) never special-cases 401/403:

  • dispatch (~:102-117): on any non-success it falls back to "check the workflow file name and ref" when the body is empty, so an auth failure reads as a bad workflow name.
  • list_artifacts (~:152-161): non-404 errors surface as "could not list artifacts ..." with an empty detail when the body is empty.
  • download_artifact (~:186-195): emits "(HTTP 403): " with nothing after the colon.
  • post_run_action (rerun/cancel, ~:276-292): non-404 errors surface as "could not {action} run #N: " with no status/diagnostic.

Same user-facing failure mode as this issue: a private-repo/permission problem looks like "your run/workflow is wrong." The fix in ask #1 here (distinguish unauthenticated/forbidden from genuinely-missing, and always include the status) would cover these too if applied at the shared layer. Filing here rather than as a new issue since it's the same root cause.
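
To make "applied at the shared layer" concrete, here is a rough sketch of one formatter the four handlers could call instead of hand-rolling their messages. `describe_failure` is a hypothetical name, and the hint wording is illustrative only. It takes the status as a plain `u16` so the sketch stays independent of whichever HTTP client type the handlers actually hold.

```rust
/// Build an error message that always carries the HTTP status, adds an
/// auth-specific hint for 401/403, and never ends in a dangling ": ".
/// Hypothetical helper for illustration; not an existing fj function.
pub fn describe_failure(action: &str, status: u16, body: &str) -> String {
    // An empty or whitespace-only body contributes nothing to the message.
    let detail = body.trim();

    // Special-case the statuses that the current handlers fold into
    // "your run/workflow is wrong".
    let hint = match status {
        401 => "unauthenticated; the token is missing or was rejected",
        403 => "forbidden; the token lacks permission for this repository or action",
        _ => "",
    };

    let mut msg = format!("could not {action} (HTTP {status})");
    for part in [hint, detail] {
        if !part.is_empty() {
            msg.push_str(": ");
            msg.push_str(part);
        }
    }
    msg
}
```

With this shape, `download_artifact` on an empty 403 body would report the status plus the permission hint rather than "(HTTP 403): " with nothing after it. Command-specific advice such as "check the workflow file name and ref" could still be appended by the caller, ideally only for statuses where it applies.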

stephen changed title from fj run view --log returns 404 "no logs" for every run on private repos to fj run view: distinguish private-repo log auth failure from a missing run, instead of masking it as "no logs" 2026-06-10 23:50:28 +00:00