Governance Report — Complete Guide¶
Documentation for fabricgov report: how to generate it, what each section shows, where the data comes from, and which rules are applied.
Generation¶
fabricgov report # most recent folder in output/
fabricgov report --from output/20260227_143000/ # specific folder
fabricgov report --from output/20260227_143000/ --open # generate and open in browser
The command generates two standalone HTML files in the output folder:
| File | Language |
|---|---|
| report.html | Portuguese |
| report.en.html | English |
The files need no server and can be shared via email or cloud storage. Plotly and Bootstrap are loaded via CDN, so charts and styling need an internet connection when a file is opened.
Required data: at least one .csv or .json file in the output folder. Sections whose data doesn't exist are hidden automatically.
Report Sections¶
1. Executive Summary¶
Tenant overview in colored KPI cards. Available whenever any data is present.
| KPI | Source | Rule |
|---|---|---|
| Workspaces | summary.json → total_workspaces; fallback: len(workspaces.csv) | Total collected workspaces |
| Total Artifacts | summary.json → total_items; fallback: sum of all artifact CSVs | Sum of all items across all artifact types |
| Unique Users | Union of user_email across workspace_access.csv, report_access.csv, dataset_access.csv, dataflow_access.csv | Distinct email addresses across all access files |
| External Users | Same access files | Emails containing #EXT# (Azure AD guest pattern) |
| Datasets without Owner | datasets.csv → configuredBy column | Rows where configuredBy is null or empty |
| Refresh Success Rate | refresh_history.csv → status column | status == "Completed" ÷ total rows × 100 |
| Workspaces on Dedicated Capacity | workspaces.csv → isOnDedicatedCapacity column | Values true, 1, or yes (case-insensitive, type-agnostic) |
Card color coding:
- Blue: inventory metrics (counts)
- Yellow / Red: risk metrics (external users, no owner, failures)
- Green: health metrics (success rate, dedicated capacity)
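The user and refresh KPI rules in the table above can be sketched in a few lines of plain Python. This is an illustration under assumptions, not fabricgov's actual code: the function names are hypothetical, and rows are assumed to be dicts as produced by csv.DictReader, with rows from all access files concatenated into one list.

```python
def user_kpis(access_rows):
    """Count distinct emails and external (#EXT#) emails across access rows."""
    emails = {
        row["user_email"].strip()
        for row in access_rows
        if row.get("user_email") and row["user_email"].strip()
    }
    external = {email for email in emails if "#EXT#" in email}
    return len(emails), len(external)


def refresh_success_rate(refresh_rows):
    """Percentage of rows whose status is exactly "Completed"."""
    if not refresh_rows:
        return None  # no history collected, so there is no rate to show
    completed = sum(1 for row in refresh_rows if row.get("status") == "Completed")
    return completed / len(refresh_rows) * 100


# Hypothetical sample data: one user listed twice, one guest, one empty email.
access_rows = [
    {"user_email": "ana@contoso.com"},
    {"user_email": "ana@contoso.com"},
    {"user_email": "guest_partner.com#EXT#@contoso.onmicrosoft.com"},
    {"user_email": ""},
]
refresh_rows = [{"status": "Completed"}] * 3 + [{"status": "Failed"}]

print(user_kpis(access_rows))              # (2, 1)
print(refresh_success_rate(refresh_rows))  # 75.0
```

Using a set for the emails gives the "distinct" behaviour directly, and the external count is taken from the same set so a guest listed in several files is counted once.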
2. Inventory¶
Overview of artifacts and workspaces with three charts.
Chart: Artifacts by Type (horizontal bar)¶
- Source: summary.json → items_by_type; fallback: row count of each artifact CSV present in the folder
- Rule: top 12 types by count, sorted descending
- Recognized artifact CSVs: reports, datasets, dataflows, dashboards, datamarts, lakehouses, warehouses, notebooks, datasourceInstances, paginatedReports, Eventstream, Eventhouse, KQLDatabase, KQLDashboard, Reflex, DataPipeline, MirroredDatabase, SQLAnalyticsEndpoint
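The source-with-fallback rule for this chart could look roughly like the sketch below. The function name and the csv_row_counts argument (a dict of row counts per artifact CSV) are assumptions made for illustration.

```python
from collections import Counter


def top_artifact_types(summary, csv_row_counts, limit=12):
    """Prefer summary["items_by_type"]; otherwise fall back to CSV row counts."""
    counts = summary.get("items_by_type") if summary else None
    if not counts:
        counts = csv_row_counts
    # most_common() sorts by count, descending, and keeps only `limit` entries.
    return Counter(counts).most_common(limit)


# No summary.json available, so the row counts are used.
fallback = {"notebooks": 3, "reports": 40, "datasets": 25}
print(top_artifact_types(None, fallback))
# [('reports', 40), ('datasets', 25), ('notebooks', 3)]
```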
Chart: Workspace Types (donut)¶
- Source: workspaces.csv → type column
- Rule: value_counts(), giving the distribution by type (Workspace, PersonalGroup, etc.)
Chart: Dedicated vs Shared (donut)¶
- Source: workspaces.csv → isOnDedicatedCapacity column
- Rule: values true / 1 / yes = Dedicated; all others = Shared
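One way to make the Dedicated/Shared rule case-insensitive and type-agnostic is to compare the lowercased string form of each value. The helper name below is hypothetical, and the tool may normalize values differently.

```python
def is_dedicated(value):
    """True for true / 1 / yes in any case or type; everything else is Shared."""
    return str(value).strip().lower() in {"true", "1", "yes"}


samples = [True, "TRUE", 1, "1", "Yes", False, "no", "", None, 0]
print([is_dedicated(v) for v in samples])
# [True, True, True, True, True, False, False, False, False, False]
```

Note that a float such as 1.0 becomes the string "1.0" and would count as Shared in this sketch.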
3. Workspaces — Full Detail¶
Dedicated section with details for all collected workspaces. Requires workspaces.csv.
Internal KPI cards¶
- Total Workspaces
- On Dedicated Capacity
- On Shared Capacity
- Total Artifacts
Table: All Workspaces¶
- Source: workspaces.csv + cross-count with artifact CSVs
- Columns: Name, Type, State, Capacity (Dedicated/Shared), Capacity ID, Artifacts
- Ordering: descending by artifact count
- How artifact count is calculated: for each artifact CSV that contains a workspace_id column, count rows per workspace and accumulate
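The accumulation described above can be sketched with a Counter. The function name and the artifact_tables structure (a dict mapping each CSV name to its rows as dicts) are assumptions for illustration.

```python
from collections import Counter


def count_artifacts_per_workspace(artifact_tables):
    """Accumulate row counts per workspace_id across artifact tables."""
    totals = Counter()
    for rows in artifact_tables.values():
        # Skip tables that are empty or have no workspace_id column.
        if not rows or "workspace_id" not in rows[0]:
            continue
        totals.update(row["workspace_id"] for row in rows if row["workspace_id"])
    return totals


artifact_tables = {
    "reports": [
        {"workspace_id": "ws-1"},
        {"workspace_id": "ws-1"},
        {"workspace_id": "ws-2"},
    ],
    "datasets": [{"workspace_id": "ws-1"}],
    "capacities": [{"id": "cap-1"}],  # no workspace_id column, ignored
}
totals = count_artifacts_per_workspace(artifact_tables)
print(totals.most_common())  # [('ws-1', 3), ('ws-2', 1)]
```

most_common() already returns the workspaces in descending order of artifact count, which matches the table ordering.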
Chart: Top 10 Workspaces (horizontal bar)¶
- Source: cross-reference workspace_id → row count from artifact CSVs
- Rule: top 10 by total artifacts; names truncated at 35 characters
Cards: Artifacts by Type¶
- Source: artifacts_by_type (same as Inventory section)
- Individual cards per type with count
4. Access & Governance¶
Permission analysis and access exposure. Requires at least one *_access.csv file.
Files read:
- workspace_access.csv — workspace roles
- report_access.csv — report permissions
- dataset_access.csv — dataset permissions
- dataflow_access.csv — dataflow permissions
Chart: Role Distribution (donut)¶
- Source: workspace_access.csv → role column
- Rule: value_counts(), showing Admin, Member, Contributor, Viewer, and others
Chart: Principal Type (donut)¶
- Source: workspace_access.csv → principal_type column
- Rule: value_counts() over User, Group, App, ServicePrincipal, etc.
Table: External Users with Access (#EXT#)¶
- Source: all *_access.csv files → user_email column
- Rule: emails containing #EXT# (Azure AD B2B guests)
- Columns: Email, Roles (union across all files), workspace count
- Limit: top 50 by workspace count
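A minimal sketch of how this table could be assembled follows. It assumes every access file exposes role and workspace_id columns alongside user_email, which the source does not confirm for the report, dataset, and dataflow files; the function name is hypothetical.

```python
def external_users(access_rows, limit=50):
    """Group #EXT# emails, union their roles, and count distinct workspaces."""
    found = {}
    for row in access_rows:
        email = row.get("user_email") or ""
        if "#EXT#" not in email:
            continue
        entry = found.setdefault(email, {"roles": set(), "workspaces": set()})
        if row.get("role"):
            entry["roles"].add(row["role"])
        if row.get("workspace_id"):
            entry["workspaces"].add(row["workspace_id"])
    table = [
        {"email": email, "roles": sorted(e["roles"]), "workspaces": len(e["workspaces"])}
        for email, e in found.items()
    ]
    # Highest workspace count first; email breaks ties for a stable order.
    table.sort(key=lambda r: (-r["workspaces"], r["email"]))
    return table[:limit]


guest_a = "guest_a.com#EXT#@contoso.onmicrosoft.com"
guest_b = "guest_b.com#EXT#@contoso.onmicrosoft.com"
access_rows = [
    {"user_email": guest_a, "role": "Viewer", "workspace_id": "ws-1"},
    {"user_email": guest_a, "role": "Member", "workspace_id": "ws-2"},
    {"user_email": guest_b, "role": "Viewer", "workspace_id": "ws-1"},
    {"user_email": "ana@contoso.com", "role": "Admin", "workspace_id": "ws-1"},
]
for entry in external_users(access_rows):
    print(entry["email"], entry["roles"], entry["workspaces"])
```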
Top 10 Users by Access¶
- Source: workspace_access.csv
- Rule: groupby("user_email")["workspace_id"].nunique(), the count of distinct workspaces per user
- Columns: Email, Workspaces (count)
Workspaces with Only 1 User (Single Point of Failure)¶
- Source: workspace_access.csv
- Rule: groupby("workspace_id")["user_email"].nunique() == 1, which identifies workspaces where only one email is listed
- Limit: top 20
- Risk: if that user leaves the organization, the workspace will have no administrator
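The two groupby rules in this section (distinct workspaces per user, and workspaces with exactly one distinct user) can be mirrored without pandas by collecting sets. The function names are hypothetical, and the tie-breaking order is an assumption.

```python
from collections import defaultdict


def top_users(access_rows, limit=10):
    """Distinct workspaces per user, highest first."""
    workspaces = defaultdict(set)
    for row in access_rows:
        if row.get("user_email") and row.get("workspace_id"):
            workspaces[row["user_email"]].add(row["workspace_id"])
    ranked = [(email, len(ws)) for email, ws in workspaces.items()]
    ranked.sort(key=lambda pair: (-pair[1], pair[0]))
    return ranked[:limit]


def single_user_workspaces(access_rows, limit=20):
    """Workspaces where exactly one distinct email is listed."""
    users = defaultdict(set)
    for row in access_rows:
        if row.get("user_email") and row.get("workspace_id"):
            users[row["workspace_id"]].add(row["user_email"])
    singles = [(ws, next(iter(emails))) for ws, emails in users.items() if len(emails) == 1]
    return sorted(singles)[:limit]


access_rows = [
    {"workspace_id": "ws-1", "user_email": "ana@contoso.com"},
    {"workspace_id": "ws-1", "user_email": "bob@contoso.com"},
    {"workspace_id": "ws-2", "user_email": "ana@contoso.com"},
    {"workspace_id": "ws-3", "user_email": "carla@contoso.com"},
    {"workspace_id": "ws-3", "user_email": "carla@contoso.com"},  # duplicate row
]
print(top_users(access_rows))
print(single_user_workspaces(access_rows))
```

Because sets hold each value once, the duplicate row for ws-3 still counts as a single user, which is the same effect nunique() has.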
5. Refresh Health¶
Analysis of dataset and dataflow execution history. Requires refresh_history.csv.
Chart: Refresh Status (donut)¶
- Source: refresh_history.csv → status column
- Colors: Completed = green, Failed = red, Unknown/Disabled = gray/orange
- Rule: value_counts() across all history records
Chart: Refreshes per Day — Last 30 Days (line chart)¶
- Source: refresh_history.csv → start_time column
- Rule: pd.to_datetime(start_time) → grouped by date; filtered to the last 30 days from report generation time
- Records without valid start_time: silently ignored
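The parse, filter, and group steps above can be sketched with the standard library. This is an approximation: datetime.fromisoformat accepts fewer formats than pd.to_datetime, the function names are hypothetical, and now is assumed to be a naive UTC timestamp.

```python
from collections import Counter
from datetime import datetime, timedelta, timezone


def parse_start_time(value):
    """Return a naive UTC datetime, or None when the value cannot be parsed."""
    try:
        parsed = datetime.fromisoformat(value or "")
    except ValueError:
        return None
    if parsed.tzinfo is not None:
        parsed = parsed.astimezone(timezone.utc).replace(tzinfo=None)
    return parsed


def refreshes_per_day(refresh_rows, now, days=30):
    """Count refreshes per calendar day within the last `days` days before `now`."""
    cutoff = now - timedelta(days=days)
    per_day = Counter()
    for row in refresh_rows:
        started = parse_start_time(row.get("start_time"))
        if started is None:
            continue  # invalid or missing start_time is silently ignored
        if cutoff <= started <= now:
            per_day[started.date()] += 1
    return dict(sorted(per_day.items()))


now = datetime(2026, 2, 27, 14, 30)
refresh_rows = [
    {"start_time": "2026-02-26T08:00:00"},
    {"start_time": "2026-02-26T09:00:00+00:00"},
    {"start_time": "2026-02-01T10:00:00"},
    {"start_time": "2025-12-01T10:00:00"},  # older than 30 days
    {"start_time": ""},
    {"start_time": "not a date"},
]
for day, count in refreshes_per_day(refresh_rows, now).items():
    print(day, count)
```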
Table: Failed Refreshes¶
- Source: refresh_history.csv
- Rule: status in ["Failed", "Error", "Disabled"] (case-insensitive)
- Columns: Artifact, Workspace, Start, Status, Error (when available in service_exception_json)
- Limit: top 50 records
Table: Datasets without Refresh in the Last 30 Days¶
- Source: refresh_history.csv
- Rule: for each artifact (artifact_name), take the most recent start_time; if it is older than 30 days from the generation date, the artifact appears in this list
- Limit: top 50 artifacts
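The staleness rule can be sketched as "keep the latest start_time per artifact, then compare it with a cutoff". The function name is hypothetical, timestamps are assumed to be naive ISO 8601 strings, and sorting oldest-first is an assumption since the source does not state the ordering.

```python
from datetime import datetime, timedelta


def stale_artifacts(refresh_rows, now, days=30, limit=50):
    """Artifacts whose most recent refresh started more than `days` days ago."""
    latest = {}
    for row in refresh_rows:
        name = row.get("artifact_name")
        try:
            started = datetime.fromisoformat(row.get("start_time") or "")
        except ValueError:
            continue  # rows without a usable start_time are skipped
        if name and (name not in latest or started > latest[name]):
            latest[name] = started
    cutoff = now - timedelta(days=days)
    stale = [(name, ts) for name, ts in latest.items() if ts < cutoff]
    stale.sort(key=lambda pair: pair[1])  # oldest first (assumed ordering)
    return stale[:limit]


now = datetime(2026, 2, 27)
refresh_rows = [
    {"artifact_name": "Sales", "start_time": "2026-02-20T06:00:00"},
    {"artifact_name": "Sales", "start_time": "2025-11-01T06:00:00"},
    {"artifact_name": "Finance", "start_time": "2025-12-15T06:00:00"},
    {"artifact_name": "HR", "start_time": "2026-01-10T06:00:00"},
]
for name, last_refresh in stale_artifacts(refresh_rows, now):
    print(name, last_refresh.date())
```

Sales has an old record but is not listed, because only its most recent refresh is compared with the cutoff.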
6. Infrastructure¶
Analysis of tenant capacities and workloads. Requires capacities.csv.
Table: Capacities¶
- Source: capacities.csv
- Available columns: Name (displayName), SKU, State (state), Region (region)
Chart: Workspaces by Capacity (bar)¶
- Source: workspaces.csv → capacityId; cross-reference with capacities.csv → displayName
- Rule: value_counts() of capacityId in workspaces, mapped to capacity name
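The count-then-map step might look like the sketch below. It assumes capacities.csv has an id column that matches capacityId in workspaces.csv, which the source does not state; the function name is hypothetical, and unknown IDs fall back to the raw ID.

```python
from collections import Counter


def workspaces_by_capacity(workspace_rows, capacity_rows):
    """Count workspaces per capacityId and label each count with the capacity name."""
    # Assumption: capacities.csv exposes the capacity identifier as "id".
    names = {c["id"]: c["displayName"] for c in capacity_rows}
    counts = Counter(w["capacityId"] for w in workspace_rows if w.get("capacityId"))
    return {names.get(cap_id, cap_id): n for cap_id, n in counts.most_common()}


workspace_rows = [
    {"capacityId": "cap-1"},
    {"capacityId": "cap-1"},
    {"capacityId": "cap-2"},
    {"capacityId": ""},  # shared capacity, no ID
]
capacity_rows = [{"id": "cap-1", "displayName": "Prod F64"}]
print(workspaces_by_capacity(workspace_rows, capacity_rows))
# {'Prod F64': 2, 'cap-2': 1}
```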
Chart: Capacities by SKU (bar)¶
- Source: capacities.csv → sku column
- Rule: value_counts() over P1, P2, F2, F64, A1, etc.
Workloads by State¶
- Source: workloads.csv → state column
- Rule: value_counts() over Enabled, Disabled, Unsupported
Note: Workloads are only collected for Gen1 capacities (P-SKU and A-SKU). Fabric capacities (F-SKU) do not expose workloads via the API.
7. Tenant Domains¶
List of configured domains and sub-domains. Requires domains.csv.
- Source: domains.csv
- Columns: Name (displayName), Description (description), Parent Domain ID (parentDomainId)
- Hierarchy rule: if parentDomainId is null/empty = Root; otherwise = Sub-domain
- Limit: top 100 domains
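The hierarchy rule reduces to a check on parentDomainId. The sketch below uses a hypothetical function name and assumes rows are dicts with the column names listed above.

```python
def classify_domains(domain_rows, limit=100):
    """Label each domain as Root or Sub-domain based on parentDomainId."""
    classified = []
    for row in domain_rows[:limit]:
        parent = (row.get("parentDomainId") or "").strip()
        classified.append(
            {
                "name": row.get("displayName", ""),
                "level": "Sub-domain" if parent else "Root",
            }
        )
    return classified


domain_rows = [
    {"displayName": "Finance", "parentDomainId": ""},
    {"displayName": "Accounts Payable", "parentDomainId": "d-001"},
    {"displayName": "Sales", "parentDomainId": None},
]
print([d["level"] for d in classify_domains(domain_rows)])
# ['Root', 'Sub-domain', 'Root']
```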
8. Governance Findings¶
Prioritized list of alerts generated automatically based on collected data. Findings are ordered by severity and displayed even when data is partial.
| Severity | Color | Finding | Rule |
|---|---|---|---|
| CRITICAL | Red | Datasets without owner | configuredBy null or empty in datasets.csv |
| HIGH | Orange | External users with access | Emails with #EXT# in any access file |
| HIGH | Orange | Failed refreshes | status == "Failed" in refresh_history.csv |
| MEDIUM | Blue | Workspaces with 1 unique user | nunique(user_email) == 1 per workspace in workspace_access.csv |
| MEDIUM | Blue | Datasets without refresh in 30+ days | Most recent start_time > 30 days in refresh_history.csv |
| OK | Green | No critical findings | Displayed only when none of the above are detected |
Findings only appear when the required data is available. If refresh_history.csv was not collected, refresh-related findings will not be generated.
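The findings table and the availability rule can be combined in a small sketch. The metrics dict, its keys, and the function name are hypothetical: each key maps to a count, or to None (or is absent) when the underlying file was not collected.

```python
SEVERITY_ORDER = {"CRITICAL": 0, "HIGH": 1, "MEDIUM": 2, "OK": 3}

RULES = [
    ("CRITICAL", "Datasets without owner", "datasets_without_owner"),
    ("HIGH", "External users with access", "external_users"),
    ("HIGH", "Failed refreshes", "failed_refreshes"),
    ("MEDIUM", "Workspaces with 1 unique user", "single_user_workspaces"),
    ("MEDIUM", "Datasets without refresh in 30+ days", "stale_datasets"),
]


def build_findings(metrics):
    """Return findings ordered by severity; OK only when nothing was detected."""
    findings = [
        {"severity": severity, "finding": label, "count": metrics[key]}
        for severity, label, key in RULES
        if metrics.get(key)  # None (no data) and 0 (nothing found) are skipped
    ]
    if not findings:
        findings.append({"severity": "OK", "finding": "No critical findings", "count": 0})
    return sorted(findings, key=lambda f: SEVERITY_ORDER[f["severity"]])


metrics = {
    "datasets_without_owner": 0,
    "external_users": 4,
    "failed_refreshes": None,  # refresh_history.csv not collected
    "stale_datasets": 2,
}
for finding in build_findings(metrics):
    print(finding["severity"], finding["finding"], finding["count"])
```

Python's sort is stable, so findings with the same severity keep the order in which the rules are listed.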
Data Requirements by Section¶
| Section | Required file(s) |
|---|---|
| Executive Summary | Any file in the folder |
| Inventory | workspaces.csv, summary.json, or artifact CSVs |
| Workspaces — Detail | workspaces.csv |
| Access & Governance | workspace_access.csv (minimum) |
| Refresh Health | refresh_history.csv |
| Infrastructure | capacities.csv |
| Domains | domains.csv |
| Findings | Any combination of the above |
Output¶
Two HTML files, report.html (Portuguese) and report.en.html (English), are generated in the data source folder (or the folder specified via --output).
Each HTML file is standalone in the sense that it needs no server: Plotly JS and Bootstrap CSS are loaded via CDN, and Plotly JS is included once in the document <head>.
Limitations¶
- Missing sections: if a data file doesn't exist, the corresponding section displays a subtle notice ("data not available") instead of an error
- Table limits: 50 rows for failures and stale datasets; 20 for single-user workspaces; 10 for top workspaces/users
- Refresh history: the Power BI API returns only the most recent refreshes per dataset (see limitations)
- Workloads: available only for Gen1 capacities (P-SKU, A-SKU)
- Charts: require CDN connectivity for Plotly and Bootstrap to render correctly