Skip to content

fabricgov diff — Snapshot Comparison

The fabricgov diff command compares two fabricgov output snapshots and generates a diff.json file containing all differences found between the two points in time.

diff.json is designed to be consumed in the future by fabricgov report, adding comparison sections to the HTML report.


What is compared?

Dimension What is detected
Workspaces Added, removed, and changed (name, type, state, capacity)
Artifacts Reports, datasets, dataflows, lakehouses and others added or removed per workspace
Access Permissions granted, revoked, and roles changed (4 sources: workspace, report, dataset, dataflow)
Refresh (schedules) Schedules added or removed
Refresh (health) Datasets degraded (more failures) or improved (fewer failures)
Findings New findings, resolved findings, and findings with changed counts

Basic usage

# Automatically compares the 2 most recent runs in output/
fabricgov diff

# Explicit snapshots
fabricgov diff --from output/20260301_120000 --to output/20260309_143000

# Save diff to a different location
fabricgov diff --output ~/reports/diff.json

Available options

Option Default Description
--from PATH second-to-last run Base snapshot (the older one)
--to PATH latest run Current snapshot (the newer one)
--output-dir DIR output Root directory to search for automatic runs
--output FILE <to>/diff.json Path of the generated diff.json file

Terminal output

Displays an executive summary with totals per section:

━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
  fabricgov diff — Executive Summary
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
  Interval: 8 days between snapshots

  Workspaces    +3 added  -1 removed  ~2 changed
  Artifacts     +12 added  -2 removed
  Access        +5 granted  -2 revoked  ~1 role changed
  Refresh       ↓ 2 degraded  ↑ 1 improved
  Findings      ⚠ 1 new  ✓ 2 resolved

✅ diff.json saved to: output/20260309_143000/diff.json

diff.json structure

{
  "meta": {
    "snapshot_from": "output/20260301_120000",
    "snapshot_to":   "output/20260309_143000",
    "from_ts": "2026-03-01T12:00:00",
    "to_ts":   "2026-03-09T14:30:00",
    "days_between": 8,
    "generated_at": "2026-03-13T10:00:00"
  },
  "workspaces": {
    "available": true,
    "added":   [{"id": "...", "name": "New Workspace", "type": "Workspace", ...}],
    "removed": [...],
    "changed": [{"id": "...", "name": "...", "changes": ["state: 'Active' → 'Inactive'"]}]
  },
  "artifacts": {
    "available": true,
    "added":   [{"type": "reports", "workspace_id": "...", "artifact_id": "...", "name": "Sales KPI"}],
    "removed": [...]
  },
  "access": {
    "available": true,
    "granted":      [{"source": "workspace_access", "resource_name": "...", "user_email": "...", "role": "Admin"}],
    "revoked":      [...],
    "role_changed": [{"source": "...", "resource_name": "...", "user_email": "...", "role_before": ["Member"], "role_after": ["Admin"]}]
  },
  "refresh": {
    "schedules_available": true,
    "schedules_added":   [{"dataset_id": "...", "dataset_name": "..."}],
    "schedules_removed": [...],
    "health_available": true,
    "degraded": [{"name": "Sales Data", "workspace": "Marketing", "failures_before": 0, "failures_after": 3}],
    "improved": [...]
  },
  "findings": {
    "new":           [...],
    "resolved":      [...],
    "count_changed": [{"severity": "HIGH", "message": "...", "count_before": 2, "count_after": 5, "delta": 3}]
  },
  "summary": {
    "workspaces_added": 3,
    "workspaces_removed": 1,
    "workspaces_changed": 2,
    "artifacts_added": 12,
    "artifacts_removed": 2,
    "access_granted": 5,
    "access_revoked": 2,
    "access_role_changed": 1,
    "schedules_added": 1,
    "schedules_removed": 0,
    "datasets_degraded": 2,
    "datasets_improved": 1,
    "findings_new": 1,
    "findings_resolved": 2,
    "findings_count_changed": 1
  }
}

Data dependencies per section

Section Required CSV files
workspaces workspaces.csv
artifacts All artifact CSVs (reports, datasets, lakehouses, etc.)
access workspace_access.csv, report_access.csv, dataset_access.csv, dataflow_access.csv
refresh.schedules refresh_schedules.csv
refresh.health refresh_history.csv
findings All available data (runs InsightsEngine on each snapshot)

If a file does not exist in one or both snapshots, the corresponding section is marked "available": false and lists are empty — no error is raised.


Usage as a Python library

from fabricgov.diff import DiffEngine, Snapshot, find_run_dirs

# Auto-detect the 2 most recent runs
runs = find_run_dirs("output")
snap_from = Snapshot(runs[-2])
snap_to   = Snapshot(runs[-1])

engine = DiffEngine(snap_from, snap_to)
result = engine.run()

# Access the summary
print(result.summary)

# Save diff.json
from pathlib import Path
result.save(Path("output/diff.json"))

# Or convert to dict (for integration with other systems)
diff_dict = result.to_dict()

Use cases

Weekly access audit

from fabricgov.diff import DiffEngine, Snapshot, find_run_dirs

runs = find_run_dirs("output")
result = DiffEngine(Snapshot(runs[-2]), Snapshot(runs[-1])).run()

for entry in result.access["granted"]:
    if "#EXT#" in entry.get("user_email", ""):
        print(f"⚠ External access granted: {entry['user_email']} on {entry['resource_name']}")

Detect refresh degradation

for ds in result.refresh.get("degraded", []):
    print(f"↓ {ds['name']} ({ds['workspace']}): {ds['failures_before']}{ds['failures_after']} failures")

Check tenant growth

s = result.summary
print(f"Workspaces: {'+' if s['workspaces_added'] >= s['workspaces_removed'] else ''}{s['workspaces_added'] - s['workspaces_removed']}")
print(f"Artifacts:  {'+' if s['artifacts_added'] >= s['artifacts_removed'] else ''}{s['artifacts_added'] - s['artifacts_removed']}")

Common errors

Error Cause Solution
At least 2 output folders required Fewer than 2 runs in output/ Run fabricgov collect all twice, or use explicit --from/--to
Folder not found Provided path does not exist Check the path with ls output/
Section with "available": false CSV not present in snapshot Run the corresponding collector before generating the diff

← Back: Findings Analysis | Next: Authentication →