feat/e-diagnostics

younes-benmoussa commented

2026-06-13 00:00:55 +00:00

Owner

No description provided.

younes-benmoussa added 14 commits

2026-06-13 00:00:55 +00:00

fix(import): close the lstat-then-read TOCTOU window dbba3710cf

The external manifest source is now opened with O_NOFOLLOW (kernel
refuses symlinks, ELOOP mapped to the same user-facing error) and
O_NONBLOCK (a FIFO open cannot hang), then fstat'd and read THROUGH
THE SAME DESCRIPTOR — the previous check-by-path-then-read sequence
could be raced by substituting the file in between (CWE-367). The
regular-file and size-cap checks and the B3 test matrix (symlink,
symlink-to-FIFO, FIFO) are unchanged and stay green; UnicodeDecodeError
still propagates per the import contract.
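
A minimal sketch of the open, fstat, and read-through-one-descriptor sequence described above, assuming a POSIX platform. The names `read_manifest_source` and `ManifestSourceError`, and the 1 MiB cap, are hypothetical stand-ins rather than the project's actual identifiers or limits.

```python
import errno
import os
import stat

MAX_MANIFEST_BYTES = 1024 * 1024  # hypothetical cap, not the project's value


class ManifestSourceError(Exception):
    """Hypothetical user-facing error for an unusable manifest source."""


def read_manifest_source(path, max_bytes=MAX_MANIFEST_BYTES):
    # O_NOFOLLOW: the kernel refuses a symlink in the final path component.
    # O_NONBLOCK: opening a FIFO with no writer returns instead of hanging.
    flags = os.O_RDONLY | os.O_NOFOLLOW | os.O_NONBLOCK
    try:
        fd = os.open(path, flags)
    except OSError as exc:
        if exc.errno == errno.ELOOP:
            # Symlink: report it like any other non-regular file.
            raise ManifestSourceError("not a regular file") from exc
        raise
    try:
        # Every check from here on inspects the object we actually opened,
        # so swapping the path afterwards cannot change what is read.
        st = os.fstat(fd)
        if not stat.S_ISREG(st.st_mode):
            raise ManifestSourceError("not a regular file")
        if st.st_size > max_bytes:
            raise ManifestSourceError("file too large")
        with open(fd, "rb", closefd=False) as fh:
            data = fh.read(max_bytes + 1)
    finally:
        os.close(fd)
    if len(data) > max_bytes:
        # The file grew between fstat and read.
        raise ManifestSourceError("file too large")
    # A UnicodeDecodeError is deliberately left to propagate.
    return data.decode("utf-8")
```

The sketch treats ELOOP as the symlink signal, which matches Linux; some other systems report a different errno for O_NOFOLLOW.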

Signed-off-by: Younes Benmoussa <younes.benmoussa@pm.me>

fix(integrity): surface bcintegrity stderr when a partial capture lacks its trailer ab0cf76cfd

Exit 74 (IOERR) is classified partial-success for the capture runner,
so a truncated manifest lands in the parse path and the user only saw
'integrity manifest missing trailer' — the same hole B1 fixed for the
hash runner (reproduced RED with a fake bcintegrity emitting stderr at
exit 74). The B1 pattern is factored into a shared
_FileOutputRunner._parser_error_message used by both runners.
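
As an illustration of the shared pattern, the sketch below prefers the tool's stderr over the bare parser error when a partial capture fails to parse. The real signature and message format of `_FileOutputRunner._parser_error_message` are not shown in this PR, so the function name, parameters, and output format here are assumptions.

```python
def parser_error_message(parse_error: str, stderr: str, exit_code: int) -> str:
    """Pick the message to show when parsing a runner's output file fails.

    Hypothetical helper: if the tool wrote a diagnostic to stderr, that
    diagnostic explains the failure better than the parser's own error.
    """
    detail = stderr.strip()
    if not detail:
        # Nothing from the tool: the parser error is all we have.
        return parse_error
    return f"exit code {exit_code}: {detail}"
```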

The analogous duplicate-scanner branches stay as-is: the scanner's
parse path is only reachable at exit 0, where the 'file not found' emit
is dead defensive code and stderr carries no diagnostic value. The fake
bcintegrity fixture now finds the subcommand anywhere in argv (global
options precede it) and can emit stderr.

Signed-off-by: Younes Benmoussa <younes.benmoussa@pm.me>

style: format test_integrity_backend.py 03a4566cf8

Signed-off-by: Younes Benmoussa <younes.benmoussa@pm.me>

refactor(ui): single binary-unit size formatter, translated severity fallback d2b4ee8bdd

duplicate_scan_page carried its own size formatter dividing by 1024
but labelling KB/MB/GB (binary maths, decimal labels); it now uses the
shared _utils._format_size (B/KiB/MiB/GiB) like every other surface.
The integrity change-detail line falls back through _severity_label
instead of echoing the raw severity key.
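
For reference, a binary-unit formatter of the kind described could look like the sketch below. It is not the project's `_utils._format_size`; the rounding to one decimal place and the function name are assumptions.

```python
def format_size(num_bytes: int) -> str:
    """Format a byte count with binary units (B, KiB, MiB, GiB)."""
    value = float(num_bytes)
    unit = "B"
    for next_unit in ("KiB", "MiB", "GiB"):
        if value < 1024:
            break
        # Divide by 1024 and label with the matching binary unit.
        value /= 1024
        unit = next_unit
    if unit == "B":
        return f"{int(value)} B"
    return f"{value:.1f} {unit}"
```

With this sketch, 1536 bytes renders as `1.5 KiB`, so the label now agrees with the division by 1024.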

Signed-off-by: Younes Benmoussa <younes.benmoussa@pm.me>

feat(db): schema v41 — verifies.error_message via the first real migration c342faecbb

The verifies table gains a nullable error_message column (the raw tool
diagnostic, the WHY; the encoded verdict_summary stays the WHAT).
First shipped migration: ALTER TABLE ADD COLUMN, no data rewrite; the
baseline spells the new column exactly as ALTER rewrites sqlite_master
so the frozen-v40 anti-drift test proves fresh ≡ migrated. Round-trip
test: populated v40 file opens at 41 with rows intact and NULL
error_message on pre-v41 rows. record_verify gains the kwarg;
list_verifies/list_other_verifies/latest_verify and models.Verify
expose the field.
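
A rough sketch of an additive migration of this shape, using the standard `sqlite3` module. The column list of `verifies` is heavily simplified and hypothetical, and `PRAGMA user_version` is used here only as a stand-in for however the project actually records its schema version.

```python
import sqlite3

# Simplified, hypothetical v40 and v41 baselines for the verifies table.
V40_SCHEMA = """
CREATE TABLE verifies (
    id INTEGER PRIMARY KEY,
    target_id INTEGER NOT NULL,
    verdict_summary TEXT
)
"""

V41_SCHEMA = """
CREATE TABLE verifies (
    id INTEGER PRIMARY KEY,
    target_id INTEGER NOT NULL,
    verdict_summary TEXT,
    error_message TEXT
)
"""


def migrate_40_to_41(conn: sqlite3.Connection) -> None:
    """Add the nullable column; existing rows are not rewritten."""
    conn.execute("BEGIN")
    try:
        conn.execute("ALTER TABLE verifies ADD COLUMN error_message TEXT")
        conn.execute("PRAGMA user_version = 41")
        conn.execute("COMMIT")
    except Exception:
        # Leave the database at v40 if anything fails.
        conn.execute("ROLLBACK")
        raise


def verifies_columns(conn: sqlite3.Connection) -> list:
    """Column layout as SQLite reports it, for fresh-vs-migrated checks."""
    return conn.execute("PRAGMA table_info(verifies)").fetchall()
```

Comparing `verifies_columns` on a migrated v40 database and on a fresh v41 database is one way to approximate the anti-drift check; rows that predate the migration read back with `error_message` as NULL.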

A backup test crafting 'CURRENT - 1' as an invalid version now uses
'CURRENT + 1': since v41 exists, CURRENT - 1 is the floor and a v40
backup is legitimately restorable (migrated on reopen via the existing
floor..current range).

Documented decisions (docs/schema-migrations.md): pre-migration
snapshot deferred with an explicit trigger criterion (first
data-rewriting migration); PRAGMA application_id deferred to its own
future migration.

Signed-off-by: Younes Benmoussa <younes.benmoussa@pm.me>

feat(jobs): persist the failure diagnostic into verifies.error_message ae326e6548

JobRecord gains error_message (single enrichment point: every failure
handler sets job.error_message before building the record), the four
verify-failure recorders pass it through to the new column, and
_job_record persists NULL — never the empty-string property default.
Success/baseline/empty/cancelled paths are untouched;
record_scan_failure keeps its preexisting message parameter (scans
table contract).
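
The NULL-versus-empty-string rule can be pictured with the small sketch below. The `Job` class and `error_message_for_storage` are hypothetical and only mimic a property whose default is the empty string; they are not the project's `JobRecord` or `_job_record`.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Job:
    """Hypothetical job whose error_message property defaults to ''."""

    _error_message: Optional[str] = None

    @property
    def error_message(self) -> str:
        return self._error_message or ""


def error_message_for_storage(job: Job) -> Optional[str]:
    # Collapse the empty-string default to None so the column stores NULL
    # whenever no diagnostic was recorded.
    return job.error_message or None
```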

Signed-off-by: Younes Benmoussa <younes.benmoussa@pm.me>

feat(ui): failure diagnostic excerpt in history rows 8f8e9536f4

Failed verify rows on the target and integrity history pages compose
their subtitle with the first line of the persisted diagnostic
(diagnostic_excerpt helper, 80 chars) through a translatable
'{summary} — {detail}' pattern, and carry the full raw text in the row
tooltip — the same gesture as the job-queue row. The hash branch also
renders the encoded verdict kind (capture/compare/kernel-gate) instead
of a hardcoded 'Verification failed'. Pre-v41 rows (NULL diagnostic)
keep their current subtitle, no tooltip. fr catalog updated.

Signed-off-by: Younes Benmoussa <younes.benmoussa@pm.me>

feat(ui): full failure diagnostic on the detail pages c186b5063f

The integrity changes page and the hash verify-detail page show the
complete raw diagnostic under the failed status page, in a selectable
wrapped monospace label (hidden for pre-v41 rows with no diagnostic).
The hash detail page gains a dedicated 'failed' stack page: a failed
verify previously fell through to the baseline page whose 'This is a
baseline — no diff is available' text was a lie. Reuses the existing
'Verification failed' msgid; no new translatable strings.

Signed-off-by: Younes Benmoussa <younes.benmoussa@pm.me>

feat(ui): startup-error screen replaces silent log-and-exit 5fabb4402c

The three fatal database-open failures used to log and exit with status
1, which is invisible from a .desktop launch. app.run() now always starts; activate routes
to MainWindow or to a minimal StartupErrorWindow (HIG placeholder-page
pattern: ApplicationWindow + StatusPage, selectable monospace path
label kept out of the translatable strings):

- pre-0.4.0 data: 'Stored Data Is Too Old' with a destructive
  'Reset Data…' action behind an AlertDialog confirmation, backed by
  the new manifest_paths.reset_data_files (db + sidecars + archive
  contents, no openable Database required); a successful reset boots
  the real UI in the same process.
- newer data: 'Update Required', quit only — the message says not to
  delete and no reset is offered.
- failed migration: 'Could Not Update Stored Data' with a safe
  'Try Again' (database left unchanged by the rolled-back migration).

The destructive action is never the default widget; Ctrl+Q works in
error mode (on_quit guards the absent job_queue); the process still
exits 1 when the session never recovered. Verified under xvfb: both a
v999 and a v13 database now keep the app alive showing the screen
instead of exiting. Catalogs regenerated; fr complete (440 messages,
0 fuzzy), en mirrored.
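
The three cases can be summarised in a small, GTK-free sketch. The enum, the table, and `classify_stored_version` are hypothetical; the floor of 40 and current version of 41 are inferred from the schema commit earlier in this PR, and the screen titles and action labels are the ones quoted above.

```python
from enum import Enum
from typing import Optional


class StartupFailure(Enum):
    TOO_OLD = "too_old"
    TOO_NEW = "too_new"
    MIGRATION_FAILED = "migration_failed"


# Screen title and the single recovery action offered (None = quit only).
SCREENS = {
    StartupFailure.TOO_OLD: ("Stored Data Is Too Old", "Reset Data…"),
    StartupFailure.TOO_NEW: ("Update Required", None),
    StartupFailure.MIGRATION_FAILED: ("Could Not Update Stored Data", "Try Again"),
}


def classify_stored_version(
    version: int, floor: int = 40, current: int = 41
) -> Optional[StartupFailure]:
    """Return the version-based failure case, or None if the file is openable."""
    if version < floor:
        return StartupFailure.TOO_OLD
    if version > current:
        return StartupFailure.TOO_NEW
    # floor..current is migrated on open; a failed migration is reported
    # separately as MIGRATION_FAILED by whatever runs the migration.
    return None
```

Under these assumptions the two databases used in the xvfb check, v13 and v999, map to the too-old and update-required screens respectively.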

Signed-off-by: Younes Benmoussa <younes.benmoussa@pm.me>

fix(jobs): drop the symbolic exit-name prefix from user-facing errors 33febe7a1d

'[generic] exit code 1: …' leaked the --describe exit-table jargon
into every failure message (QA feedback). The prefix carries no value
for users; the symbolic name stays in the _log lines where it helps
diagnosis. Verified against the real bchash: messages now read
'exit code 1: bchash: stat: …: not found'. Rows persisted by earlier
builds keep their old prefix (opaque pass-through).

Signed-off-by: Younes Benmoussa <younes.benmoussa@pm.me>

fix(jobs): strip tool-name and syscall prefixes from displayed diagnostics 5595000a79

QA feedback: 'bchash: stat: /path: not found' kept two layers of
tool-internal framing. The runner now strips its own binary's name
prefix (redundant inside Vigil) and a lone syscall token directly
before a path; multi-word causes ('cannot read manifest …') are left
intact and raw stderr stays in the logs. Verified against the real
bchash: the user now reads 'exit code 1: /path: not found'.
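
One way to express the two stripping rules is sketched below. This is not the project's `_clean_stderr`; the function name, the regular expression, and the assumption that the path is absolute (starts with `/`) are illustrative choices.

```python
import re

# A single lowercase token followed by ": " and then an absolute path.
_SYSCALL_BEFORE_PATH = re.compile(r"^[a-z_0-9]+: (?=/)")


def clean_stderr(stderr: str, tool_name: str) -> str:
    """Strip the tool's own name prefix and a lone syscall token before a path."""
    text = stderr.strip()
    prefix = f"{tool_name}: "
    if text.startswith(prefix):
        text = text[len(prefix):]
    # Multi-word causes do not match the single-token pattern and are kept.
    return _SYSCALL_BEFORE_PATH.sub("", text, count=1)
```

Applied to `bchash: stat: /path: not found` with `tool_name="bchash"`, the sketch yields `/path: not found`, while a multi-word cause that begins with `cannot read manifest` only loses the tool name.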

Signed-off-by: Younes Benmoussa <younes.benmoussa@pm.me>

style: format test_hash_runner.py 2e7b8ee5f4

Signed-off-by: Younes Benmoussa <younes.benmoussa@pm.me>

fix: audit follow-ups on the diagnostics branch 90549cfc48

- diagnostic_excerpt takes the first NON-EMPTY line (a stderr starting
  with a newline produced an empty excerpt).
- _open_database catches unexpected failures (permissions, full disk)
  so a Reset/Try Again recovery can never escape into the GTK
  dispatch; both recovery actions share a _recover helper that boots
  BEFORE closing the error window (GApplication use-count never dips
  to zero) and rebuilds the error screen when the failure case
  changed between retries.
- reset_data_files and wipe_and_reinit also remove a -journal sidecar
  (orphan rollback journal next to a recreated database is a SQLite
  corruption risk; pre-existing gap in wipe_and_reinit).
- the parse-path failure logs the RAW stderr before cleaning it for
  display, as the previous commit message already claimed.
- drop the dead getattr guard in _clean_stderr; truthful comments on
  what error_message actually contains.
- new e2e composition proof: a genuine v40 backup (frozen baseline +
  schema_version=40 metadata) restores under this v41 build and comes
  back migrated with its data intact.
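
The first follow-up can be illustrated with a short sketch of the excerpt helper. The 80-character limit comes from the history-row commit earlier in this PR; how the real `diagnostic_excerpt` truncates is not shown, so the trailing ellipsis here is an assumption.

```python
def diagnostic_excerpt(diagnostic: str, limit: int = 80) -> str:
    """Return the first non-empty line of a diagnostic, capped at `limit` chars."""
    for line in diagnostic.splitlines():
        line = line.strip()
        if not line:
            # Skip leading blank lines, e.g. stderr that starts with a newline.
            continue
        if len(line) > limit:
            return line[: limit - 1] + "…"
        return line
    return ""
```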

Signed-off-by: Younes Benmoussa <younes.benmoussa@pm.me>

fix(targets): algorithm lock also engages on imported baselines e1913fabd6

pr / lint (pull_request) Successful in 5s

pr / build-test (pull_request) Successful in 11s

Both the edit-dialog lock and the update_target backend guard keyed on
last_run_id, which external baseline imports never set (by design,
test_import_does_not_set_target_last_run): a sha256 import could be
re-edited to blake3 and every subsequent verify then died on
'bchash: algorithm mismatch' (exit 65, verified against the real
tool). New shared predicate Database.target_has_baseline_artifact
(archived run OR verify row with a non-empty manifest) drives both
layers; the duplicate tool stays exempt — its scans are stateless and
algorithm changes between scans are harmless. RED-first at both the
backend and dialog levels.
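
The predicate's logic can be sketched in memory as follows. The real `Database.target_has_baseline_artifact` presumably queries the database; the `Target` and `VerifyRow` classes, their fields, the tool name `"duplicate"`, and `algorithm_locked` are hypothetical.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class VerifyRow:
    manifest: Optional[str] = None


@dataclass
class Target:
    tool: str
    archived_runs: int = 0
    verifies: List[VerifyRow] = field(default_factory=list)


def target_has_baseline_artifact(target: Target) -> bool:
    """True if an archived run exists OR any verify row has a non-empty manifest."""
    return target.archived_runs > 0 or any(v.manifest for v in target.verifies)


def algorithm_locked(target: Target) -> bool:
    # The duplicate tool is exempt: its scans are stateless.
    if target.tool == "duplicate":
        return False
    return target_has_baseline_artifact(target)
```

Because the check no longer depends on `last_run_id`, an imported baseline (a verify row with a manifest but no run) locks the algorithm just as a locally produced one does.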

Signed-off-by: Younes Benmoussa <younes.benmoussa@pm.me>

younes-benmoussa merged commit 6415299c47 into main

2026-06-13 00:25:44 +00:00

younes-benmoussa referenced this pull request from a commit

2026-06-13 00:25:44 +00:00

Merge pull request 'feat/e-diagnostics' (#10) from feat/e-diagnostics into main
