
Governance Report — Complete Guide

Documentation for fabricgov report: how to generate it, what each section shows, where data comes from, and what rules are applied.


Generation

fabricgov report                                       # most recent folder in output/
fabricgov report --from output/20260227_143000/        # specific folder
fabricgov report --from output/20260227_143000/ --open # generate and open in browser

The command generates two standalone HTML files in the output folder:

| File | Language |
| --- | --- |
| report.html | Portuguese |
| report.en.html | English |

The files are self-contained — no server required, shareable via email or cloud storage. Plotly and Bootstrap are loaded via CDN.

Required data: at least one .csv or .json file in the output folder. Sections whose data doesn't exist are hidden automatically.


Report Sections

1. Executive Summary

Tenant overview in colored KPI cards. Available whenever any data is present.

| KPI | Source | Rule |
| --- | --- | --- |
| Workspaces | summary.json → total_workspaces; fallback: len(workspaces.csv) | Total collected workspaces |
| Total Artifacts | summary.json → total_items; fallback: sum of all artifact CSVs | Sum of all items across all artifact types |
| Unique Users | Union of user_email across workspace_access.csv, report_access.csv, dataset_access.csv, dataflow_access.csv | Distinct email addresses across all access files |
| External Users | Same access files | Emails containing #EXT# (Azure AD guest pattern) |
| Datasets without Owner | datasets.csv → configuredBy column | Rows where configuredBy is null or empty |
| Refresh Success Rate | refresh_history.csv → status column | status == "Completed" ÷ total rows × 100 |
| Workspaces on Dedicated Capacity | workspaces.csv → isOnDedicatedCapacity column | Values true, 1, or yes (case-insensitive, type-agnostic) |

Card color coding:

  • Blue: inventory metrics (counts)
  • Yellow / Red: risk metrics (external users, no owner, failures)
  • Green: health metrics (success rate, dedicated capacity)
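The KPI rules in the table above can be sketched with pandas. This is a minimal illustration, not the tool's actual code: the function name `executive_kpis` and its arguments are hypothetical, and it assumes the CSVs have already been loaded into DataFrames with the column names listed above.

```python
import pandas as pd


def executive_kpis(access_frames, datasets, refresh_history):
    """Compute four Executive Summary KPIs from already-loaded DataFrames."""
    # Union of user_email across every access file that was collected.
    emails = pd.concat(
        [frame["user_email"] for frame in access_frames], ignore_index=True
    ).dropna().astype(str).str.strip()
    emails = emails[emails != ""]

    # Guest accounts carry the literal #EXT# marker in their address.
    external = emails[emails.str.contains("#EXT#", regex=False)]

    # A null or blank configuredBy value means the dataset has no owner.
    owner = datasets["configuredBy"].fillna("").astype(str).str.strip()

    # Share of refresh rows whose status is exactly "Completed".
    total = len(refresh_history)
    completed = int((refresh_history["status"] == "Completed").sum())

    return {
        "unique_users": int(emails.nunique()),
        "external_users": int(external.nunique()),
        "datasets_without_owner": int((owner == "").sum()),
        "refresh_success_rate": completed / total * 100 if total else 0.0,
    }
```

Returning 0.0 for an empty refresh history is a choice made for this sketch; the report may instead hide the card when no refresh data exists.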


2. Inventory

Overview of artifacts and workspaces with three charts.

Chart: Artifacts by Type (horizontal bar)

  • Source: summary.json → items_by_type; fallback: row count of each artifact CSV present in the folder
  • Rule: top 12 types by count, sorted descending
  • Recognized artifact CSVs: reports, datasets, dataflows, dashboards, datamarts, lakehouses, warehouses, notebooks, datasourceInstances, paginatedReports, Eventstream, Eventhouse, KQLDatabase, KQLDashboard, Reflex, DataPipeline, MirroredDatabase, SQLAnalyticsEndpoint

Chart: Workspace Types (donut)

  • Source: workspaces.csv → type column
  • Rule: value_counts() — distribution by type (Workspace, PersonalGroup, etc.)

Chart: Dedicated vs Shared (donut)

  • Source: workspaces.csv → isOnDedicatedCapacity column
  • Rule: values true/1/yes = Dedicated; all others = Shared
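One way to implement the Dedicated/Shared rule is to normalise each cell to a lowercase string before comparing. The helper names `is_dedicated` and `capacity_split` are hypothetical, and the exact normalisation the tool applies is an assumption.

```python
def is_dedicated(value):
    """Return True when a raw isOnDedicatedCapacity cell means 'Dedicated'."""
    # str() makes the check type-agnostic: True, 1, "1" and "YES" all match.
    return str(value).strip().lower() in {"true", "1", "yes"}


def capacity_split(values):
    """Count Dedicated vs Shared workspaces; anything unrecognised is Shared."""
    dedicated = sum(1 for value in values if is_dedicated(value))
    return {"Dedicated": dedicated, "Shared": len(values) - dedicated}
```

Under this sketch a float such as 1.0 becomes "1.0" and counts as Shared, so a column read with missing values may need converting to integers first.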

3. Workspaces — Full Detail

Dedicated section with details for all collected workspaces. Requires workspaces.csv.

Internal KPI cards

  • Total Workspaces
  • On Dedicated Capacity
  • On Shared Capacity
  • Total Artifacts

Table: All Workspaces

  • Source: workspaces.csv + cross-count with artifact CSVs
  • Columns: Name, Type, State, Capacity (Dedicated/Shared), Capacity ID, Artifacts
  • Ordering: descending by artifact count
  • How artifact count is calculated: for each artifact CSV that contains a workspace_id column, count rows per workspace and accumulate

Chart: Top 10 Workspaces (horizontal bar)

  • Source: cross-reference workspace_id → row count from artifact CSVs
  • Rule: top 10 by total artifacts; names truncated at 35 characters
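The per-workspace artifact count and the Top 10 ranking described above could be sketched as follows. The function names are hypothetical, rows are assumed to be dictionaries (for example from `csv.DictReader`), and truncation is shown as a plain slice because the source does not say whether an ellipsis is appended.

```python
from collections import Counter


def artifacts_per_workspace(artifact_tables):
    """Accumulate row counts per workspace_id across artifact tables.

    artifact_tables maps an artifact type to its rows (a list of dicts).
    """
    totals = Counter()
    for rows in artifact_tables.values():
        # Tables without a workspace_id column cannot be attributed; skip them.
        if not rows or "workspace_id" not in rows[0]:
            continue
        totals.update(row["workspace_id"] for row in rows)
    return totals


def top_workspaces(totals, names, limit=10, max_len=35):
    """Return (label, count) pairs for the workspaces with the most artifacts."""
    # most_common sorts by count, descending; labels are cut at max_len chars.
    return [
        (names.get(ws_id, ws_id)[:max_len], count)
        for ws_id, count in totals.most_common(limit)
    ]
```

Here `names` maps a workspace id to its display name; ids missing from the mapping fall back to the raw id, which is a choice made for this sketch.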

Cards: Artifacts by Type

  • Source: artifacts_by_type (same as Inventory section)
  • Individual cards per type with count

4. Access & Governance

Permission analysis and access exposure. Requires at least one *_access.csv file.

Files read:

  • workspace_access.csv: workspace roles
  • report_access.csv: report permissions
  • dataset_access.csv: dataset permissions
  • dataflow_access.csv: dataflow permissions

Chart: Role Distribution (donut)

  • Source: workspace_access.csv → role column
  • Rule: value_counts() — shows Admin, Member, Contributor, Viewer, and others

Chart: Principal Type (donut)

  • Source: workspace_access.csv → principal_type column
  • Rule: value_counts() — User, Group, App, ServicePrincipal, etc.

Table: External Users with Access (#EXT#)

  • Source: all *_access.csv files → user_email column
  • Rule: emails containing #EXT# (Azure AD B2B guests)
  • Columns: Email, Roles (union across all files), workspace count
  • Limit: top 50 by workspace count
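A possible pandas sketch of the external users table follows. It assumes every access frame passed in carries `user_email`, `role`, and `workspace_id` columns; the source only confirms those columns for workspace_access.csv, so the other files may need renaming first. The function name is hypothetical.

```python
import pandas as pd


def external_users_table(access_frames, limit=50):
    """Summarise guest (#EXT#) accounts across all access frames."""
    combined = pd.concat(access_frames, ignore_index=True)
    emails = combined["user_email"].fillna("").astype(str)
    # Keep only rows whose email contains the literal #EXT# marker.
    guests = combined[emails.str.contains("#EXT#", regex=False)]
    return (
        guests.groupby("user_email")
        .agg(
            # Union of roles seen for this guest in any file.
            roles=("role", lambda s: ", ".join(sorted(set(s.dropna())))),
            # Number of distinct workspaces the guest can reach.
            workspaces=("workspace_id", "nunique"),
        )
        .reset_index()
        .sort_values("workspaces", ascending=False)
        .head(limit)
    )
```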

Top 10 Users by Access

  • Source: workspace_access.csv
  • Rule: groupby("user_email")["workspace_id"].nunique() — count of distinct workspaces per user
  • Columns: Email, Workspaces (count)

Workspaces with Only 1 User (Single Point of Failure)

  • Source: workspace_access.csv
  • Rule: groupby("workspace_id")["user_email"].nunique() == 1 — identifies workspaces where only one email is listed
  • Limit: top 20
  • Risk: if that user leaves the organization, the workspace will have no administrator
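The two groupby rules above (top users and single-user workspaces) can be wrapped as small helpers. The function names are hypothetical, and the order in which the 20 single-user workspaces are picked is an assumption (here: sorted by workspace id, as groupby returns them).

```python
import pandas as pd


def top_users(workspace_access, limit=10):
    """Users ranked by the number of distinct workspaces they can reach."""
    counts = workspace_access.groupby("user_email")["workspace_id"].nunique()
    return counts.sort_values(ascending=False).head(limit)


def single_user_workspaces(workspace_access, limit=20):
    """Workspace ids whose access list contains exactly one distinct email."""
    users = workspace_access.groupby("workspace_id")["user_email"].nunique()
    return users[users == 1].index.tolist()[:limit]
```

`nunique()` ignores missing values, so principals without an email (groups, for instance) do not count toward a workspace's user total in this sketch.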

5. Refresh Health

Analysis of dataset and dataflow execution history. Requires refresh_history.csv.

Chart: Refresh Status (donut)

  • Source: refresh_history.csv → status column
  • Colors: Completed = green, Failed = red, Unknown/Disabled = gray/orange
  • Rule: value_counts() across all history records

Chart: Refreshes per Day — Last 30 Days (line chart)

  • Source: refresh_history.csv → start_time column
  • Rule: pd.to_datetime(start_time) → grouped by date; filtered to the last 30 days from report generation time
  • Records without valid start_time: silently ignored
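The daily grouping could look roughly like this. The function name and the `now` parameter are hypothetical, and treating every timestamp as UTC is an assumption made so that naive and offset-aware values can be compared.

```python
import pandas as pd


def refreshes_per_day(refresh_history, now=None, days=30):
    """Count refreshes per calendar day within the trailing window."""
    # `now` is expected to be a naive date string or datetime, read as UTC.
    now = pd.Timestamp.now(tz="UTC") if now is None else pd.Timestamp(now, tz="UTC")
    start = pd.to_datetime(refresh_history["start_time"], errors="coerce", utc=True)
    # Unparseable or missing values became NaT above and are dropped here.
    start = start.dropna()
    recent = start[start >= now - pd.Timedelta(days=days)]
    return recent.dt.date.value_counts().sort_index()
```

Passing `now` explicitly makes the window reproducible, for example when regenerating a report from an older output folder.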

Table: Failed Refreshes

  • Source: refresh_history.csv
  • Rule: status in ["Failed", "Error", "Disabled"] (case-insensitive)
  • Columns: Artifact, Workspace, Start, Status, Error (when available in service_exception_json)
  • Limit: top 50 records
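A case-insensitive status filter matching the rule above might be written as follows. The function name is hypothetical, and taking the first 50 matching rows in file order is an assumption, since the source does not state how the table is sorted.

```python
import pandas as pd

FAILURE_STATUSES = {"failed", "error", "disabled"}


def failed_refreshes(refresh_history, limit=50):
    """Return refresh rows whose status counts as a failure."""
    # Lower-case the status so "FAILED", "Failed" and "failed" all match.
    status = refresh_history["status"].fillna("").astype(str).str.strip().str.lower()
    return refresh_history[status.isin(FAILURE_STATUSES)].head(limit)
```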

Table: Datasets without Refresh in the Last 30 Days

  • Source: refresh_history.csv
  • Rule: for each artifact (artifact_name), takes the most recent start_time; if it is older than 30 days from generation date, it appears in this list
  • Limit: top 50 artifacts
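The stale-artifact rule reduces to "latest start_time per artifact, then compare against a cutoff". A minimal sketch, with a hypothetical function name and the same UTC assumption for timestamps:

```python
import pandas as pd


def stale_artifacts(refresh_history, now=None, days=30, limit=50):
    """Artifacts whose most recent refresh started more than `days` ago."""
    # `now` is expected to be a naive date string or datetime, read as UTC.
    now = pd.Timestamp.now(tz="UTC") if now is None else pd.Timestamp(now, tz="UTC")
    start = pd.to_datetime(refresh_history["start_time"], errors="coerce", utc=True)
    # Most recent valid start per artifact; artifacts with no valid date drop out.
    latest = start.groupby(refresh_history["artifact_name"]).max().dropna()
    stale = latest[latest < now - pd.Timedelta(days=days)]
    # Oldest first, so the longest-neglected artifacts lead the list.
    return stale.sort_values().head(limit)
```

An artifact with no rows at all in refresh_history.csv cannot appear in this result, because the rule only inspects artifacts that have at least one history record.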

6. Infrastructure

Analysis of tenant capacities and workloads. Requires capacities.csv.

Table: Capacities

  • Source: capacities.csv
  • Available columns: Name (displayName), SKU, State (state), Region (region)

Chart: Workspaces by Capacity (bar)

  • Source: workspaces.csv → capacityId; cross-reference with capacities.csv → displayName
  • Rule: value_counts() of capacityId in workspaces, mapped to capacity name
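The cross-reference can be illustrated without pandas. The function name is hypothetical, and `capacity_names` is assumed to be a mapping from capacity id to displayName that the caller builds from capacities.csv; the name of the id column in that file is not stated in this guide.

```python
from collections import Counter


def workspaces_by_capacity(capacity_ids, capacity_names):
    """Count workspaces per capacity and label each count with a display name."""
    # Empty or missing capacityId values are skipped.
    counts = Counter(cid for cid in capacity_ids if cid)
    # Ids missing from the mapping keep the raw id as their label.
    return {capacity_names.get(cid, cid): n for cid, n in counts.most_common()}
```

If two capacities share a display name their counts would collide in this sketch, so a real implementation may want to key on the id and attach the name only for display.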

Chart: Capacities by SKU (bar)

  • Source: capacities.csv → sku column
  • Rule: value_counts() — P1, P2, F2, F64, A1, etc.

Workloads by State

  • Source: workloads.csv → state column
  • Rule: value_counts() — Enabled, Disabled, Unsupported

Note: Workloads are only collected for Gen1 capacities (P-SKU and A-SKU). Fabric capacities (F-SKU) do not expose workloads via the API.


7. Tenant Domains

List of configured domains and sub-domains. Requires domains.csv.

  • Source: domains.csv
  • Columns: Name (displayName), Description (description), Parent Domain ID (parentDomainId)
  • Hierarchy rule: if parentDomainId is null/empty = Root; otherwise = Sub-domain
  • Limit: top 100 domains
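The hierarchy rule is a simple null-or-empty check. The helper below is a hypothetical sketch; the NaN branch is included because pandas reads empty CSV cells as float NaN.

```python
import math


def domain_level(parent_domain_id):
    """Classify a domain as 'Root' or 'Sub-domain' from its parentDomainId."""
    if parent_domain_id is None:
        return "Root"
    # Empty CSV cells loaded through pandas arrive as float NaN.
    if isinstance(parent_domain_id, float) and math.isnan(parent_domain_id):
        return "Root"
    return "Sub-domain" if str(parent_domain_id).strip() else "Root"
```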

8. Governance Findings

Prioritized list of alerts generated automatically based on collected data. Findings are ordered by severity and displayed even when data is partial.

| Severity | Color | Finding | Rule |
| --- | --- | --- | --- |
| CRITICAL | Red | Datasets without owner | configuredBy null or empty in datasets.csv |
| HIGH | Orange | External users with access | Emails with #EXT# in any access file |
| HIGH | Orange | Failed refreshes | status == "Failed" in refresh_history.csv |
| MEDIUM | Blue | Workspaces with 1 unique user | nunique(user_email) == 1 per workspace in workspace_access.csv |
| MEDIUM | Blue | Datasets without refresh in 30+ days | Most recent start_time > 30 days in refresh_history.csv |
| OK | Green | No critical findings | Displayed only when none of the above are detected |

Findings only appear when the required data is available. If refresh_history.csv was not collected, refresh-related findings will not be generated.
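The findings logic can be sketched as a small rule table evaluated against precomputed counts. Everything here is illustrative: `RULES`, `build_findings`, and the metric keys are hypothetical names, and a metric is simply left out (or set to `None`) when its source file was not collected.

```python
SEVERITY_ORDER = {"CRITICAL": 0, "HIGH": 1, "MEDIUM": 2, "OK": 3}

# (severity, finding title, key of the precomputed count)
RULES = [
    ("CRITICAL", "Datasets without owner", "datasets_without_owner"),
    ("HIGH", "External users with access", "external_users"),
    ("HIGH", "Failed refreshes", "failed_refreshes"),
    ("MEDIUM", "Workspaces with 1 unique user", "single_user_workspaces"),
    ("MEDIUM", "Datasets without refresh in 30+ days", "stale_datasets"),
]


def build_findings(metrics):
    """Turn metric counts into a severity-ordered list of findings."""
    findings = [
        {"severity": severity, "finding": title, "count": metrics[key]}
        for severity, title, key in RULES
        # Missing data (None or absent) and a zero count both yield no finding.
        if metrics.get(key)
    ]
    if not findings:
        findings.append({"severity": "OK", "finding": "No critical findings", "count": 0})
    # sorted() is stable, so findings of equal severity keep the RULES order.
    return sorted(findings, key=lambda f: SEVERITY_ORDER[f["severity"]])
```

For example, `build_findings({"failed_refreshes": 3, "datasets_without_owner": 1})` yields the CRITICAL owner finding first, followed by the HIGH refresh finding.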


Data Requirements by Section

| Section | Required file(s) |
| --- | --- |
| Executive Summary | Any file in the folder |
| Inventory | workspaces.csv, summary.json, or artifact CSVs |
| Workspaces (Detail) | workspaces.csv |
| Access & Governance | workspace_access.csv (minimum) |
| Refresh Health | refresh_history.csv |
| Infrastructure | capacities.csv |
| Domains | domains.csv |
| Findings | Any combination of the above |

Output

Two HTML files are generated in the data source folder (or the folder specified via --output):

output/20260227_143000/
├── report.html         # Portuguese
└── report.en.html      # English

Each HTML file is a single standalone document. Plotly and Bootstrap CSS are loaded via CDN rather than embedded, and Plotly JS is included once in the document <head>.


Limitations

  • Missing sections: if a data file doesn't exist, the corresponding section displays a subtle notice ("data not available") instead of an error
  • Table limits: 50 rows for failures and stale datasets; 20 for single-user workspaces; 10 for top workspaces/users
  • Refresh history: the Power BI API returns only the most recent refreshes per dataset (see limitations)
  • Workloads: available only for Gen1 capacities (P-SKU, A-SKU)
  • Charts: require CDN connectivity for Plotly and Bootstrap to render correctly
