Contributing Guide
Thank you for considering contributing to fabricgov! This guide will help you understand the project structure and how to add new collectors, exporters, or other improvements.
Table of Contents
- Environment Setup
- Project Structure
- Code Conventions
- How to Add a New Collector
- How to Add Tests
- Review Process
- Commit Conventions
Environment Setup
Prerequisites
- Python 3.12+
- Poetry 1.8+
- Git
Setup
# Clone the repository
git clone https://github.com/luhborba/fabricgov.git
cd fabricgov
# Install dependencies
poetry install
# Activate the virtual environment
poetry shell
# Run tests
poetry run pytest tests/ -v
Credential Configuration
Create a .env file in the project root:
FABRICGOV_TENANT_ID=your-tenant-id
FABRICGOV_CLIENT_ID=your-client-id
FABRICGOV_CLIENT_SECRET=your-client-secret
Project Structure
fabricgov/
├── fabricgov/
│   ├── auth/                     # Authentication module
│   │   ├── base.py               # AuthProvider protocol
│   │   ├── service_principal.py
│   │   └── device_flow.py
│   ├── cli/                      # CLI via Click
│   │   ├── main.py               # Main `fabricgov` group
│   │   ├── auth.py               # `fabricgov auth` commands
│   │   ├── collect.py            # `fabricgov collect` commands
│   │   ├── report.py             # `fabricgov report` command
│   │   ├── analyze.py            # `fabricgov analyze` command
│   │   └── session.py            # Session management (`collect all`)
│   ├── reporters/                # HTML report and data analysis
│   │   ├── insights.py           # InsightsEngine: reads CSVs and computes metrics
│   │   ├── html_reporter.py      # HtmlReporter: Plotly charts + Jinja2
│   │   └── templates/            # Jinja2 templates
│   ├── collectors/               # Data collectors (11 total)
│   │   ├── base.py               # BaseCollector (retry, pagination, rate limiting)
│   │   ├── workspace_inventory.py
│   │   ├── workspace_access.py
│   │   ├── report_access.py
│   │   ├── dataset_access.py
│   │   ├── dataflow_access.py
│   │   ├── refresh_history.py
│   │   ├── refresh_schedule.py
│   │   ├── domain.py
│   │   ├── tag.py
│   │   ├── capacity.py
│   │   └── workload.py
│   ├── exporters/                # Result exporters
│   │   └── file_exporter.py      # JSON/CSV with run_dir support
│   ├── config.py                 # Auth preference system
│   ├── progress.py               # ProgressManager (rich)
│   ├── checkpoint.py             # Checkpoint system
│   └── exceptions.py             # Custom exceptions
├── tests/
│   ├── auth/                     # Unit tests for the auth module
│   ├── manual/                   # Manual tests for development
│   └── pytest.ini
├── docs/                         # Documentation
│   ├── en/                       # English docs
│   │   ├── authentication.md
│   │   ├── collectors.md
│   │   ├── exporters.md
│   │   ├── limitations.md
│   │   └── contributing.md
│   ├── authentication.md         # Portuguese docs
│   ├── collectors.md
│   ├── exporters.md
│   ├── limitations.md
│   └── contributing.md
├── pyproject.toml                # Dependencies and Poetry configuration
└── README.md
Code Conventions
Code Style
We follow PEP 8 with some adaptations:
- Indentation: 4 spaces
- Max line length: 88 characters (Black default)
- Imports: grouped as stdlib → third-party → local
- Type hints: required on all public functions
Formatting
# Auto-format code
poetry run black fabricgov/ tests/
# Check style
poetry run flake8 fabricgov/ tests/
Docstrings
We use Google Style docstrings:
def collect(self) -> dict[str, Any]:
    """
    Runs the full workspace inventory collection.

    Returns:
        Dictionary with workspaces, artifacts, and summary.

    Raises:
        ForbiddenError: if the SP lacks Admin permissions.
        TimeoutError: if the scan exceeds max_poll_time.
    """
    pass
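For a complete, runnable illustration of these conventions (type hints on a public function, a Google Style docstring with Args, Returns, and Raises, and lines within 88 characters), consider the sketch below. The function `summarize_items` is a hypothetical example and does not exist in fabricgov.

```python
from typing import Any


def summarize_items(
    items: list[dict[str, Any]],
    key: str = "type",
) -> dict[str, int]:
    """Counts items per value of the given key.

    Args:
        items: Records to summarize.
        key: Field to group by.

    Returns:
        Mapping from each value of `key` to its number of occurrences.

    Raises:
        KeyError: if any record lacks `key`.
    """
    counts: dict[str, int] = {}
    for item in items:
        value = str(item[key])
        counts[value] = counts.get(value, 0) + 1
    return counts
```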
How to Add a New Collector
Step 1: Define the Domain
First, identify:
- Which API will be used? (Fabric REST, Power BI REST, DAX query)
- What data will be collected?
- What is the recommended frequency? (daily, weekly, on-demand)
Step 2: Create the File
Create fabricgov/collectors/your_collector.py, the module that Step 4 imports.
Step 3: Implement the Collector
Basic template:
from datetime import datetime
from typing import Any

from fabricgov.auth.base import AuthProvider
from fabricgov.collectors.base import BaseCollector


class YourCollector(BaseCollector):
    """
    Brief description of what this collector does.

    API used: [API name]
    Main endpoint: [endpoint]

    Usage:
        collector = YourCollector(auth=auth)
        result = collector.collect()
    """

    # Required OAuth2 scope
    SCOPE = "https://api.fabric.microsoft.com/.default"
    # or "https://analysis.windows.net/powerbi/api/.default"

    def __init__(
        self,
        auth: AuthProvider,
        **kwargs
    ):
        """
        Args:
            auth: Authentication provider
        """
        # Set the correct base_url for the API
        super().__init__(
            auth=auth,
            base_url="https://api.fabric.microsoft.com",  # or powerbi.com
            **kwargs
        )

    def collect(self) -> dict[str, Any]:
        """
        Executes data collection.

        Returns:
            Structured dictionary with the collected data.
        """
        # Simple GET example (single request, shown for reference)
        response = self._get(
            endpoint="/v1/your-endpoint",
            scope=self.SCOPE,
            params={"$top": 1000}
        )

        # GET with pagination example (follows all pages)
        items = self._paginate(
            endpoint="/v1/your-endpoint",
            scope=self.SCOPE,
            params={"$top": 1000}
        )

        # Structure the result
        return {
            "items": items,
            "summary": {
                "total_items": len(items),
                "collection_time": datetime.now().isoformat(),
            }
        }
Step 4: Expose in __init__.py
Edit fabricgov/collectors/__init__.py:
from fabricgov.collectors.base import BaseCollector
from fabricgov.collectors.workspace_inventory import WorkspaceInventoryCollector
from fabricgov.collectors.your_collector import YourCollector  # Add

__all__ = [
    "BaseCollector",
    "WorkspaceInventoryCollector",
    "YourCollector",  # Add
]
Step 5: Create a Manual Test
Create tests/manual/test_your_collector.py:
from fabricgov.auth import ServicePrincipalAuth
from fabricgov.collectors import YourCollector
auth = ServicePrincipalAuth.from_env()
collector = YourCollector(auth=auth)
result = collector.collect()
print(f"Total items: {result['summary']['total_items']}")
Run it:
poetry run python tests/manual/test_your_collector.py
Step 6: Add Documentation
Add a section to docs/en/collectors.md describing:
- What the collector does
- Constructor parameters
- Output structure
- Usage examples
- Known limitations
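When documenting the output structure, it can help to state it as types. The sketch below uses `TypedDict` to describe the dictionary returned by the template collector in Step 3. The class names `CollectionSummary` and `CollectionResult` and the `build_result` helper are hypothetical and not part of fabricgov.

```python
from datetime import datetime
from typing import Any, TypedDict


class CollectionSummary(TypedDict):
    total_items: int
    collection_time: str  # ISO 8601 timestamp


class CollectionResult(TypedDict):
    items: list[dict[str, Any]]
    summary: CollectionSummary


def build_result(items: list[dict[str, Any]]) -> CollectionResult:
    """Wraps collected items in the structure used by the template collector."""
    return {
        "items": items,
        "summary": {
            "total_items": len(items),
            "collection_time": datetime.now().isoformat(),
        },
    }
```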
How to Add Tests
Unit Tests
Create tests/collectors/test_your_collector.py:
from unittest.mock import MagicMock

import httpx
import pytest

from fabricgov.collectors import YourCollector
from fabricgov.exceptions import ForbiddenError


# The `mocker` fixture is provided by the pytest-mock plugin
@pytest.fixture(autouse=True)
def mock_http_client(mocker):
    """Mocks the HTTP client to avoid real API calls."""
    mock_client = MagicMock()
    mocker.patch("httpx.Client", return_value=mock_client)
    return mock_client


class TestYourCollector:
    def test_collect_returns_correct_structure(self, mock_http_client):
        """Validates that collect() returns the expected structure."""
        # Arrange
        mock_http_client.get.return_value.json.return_value = {
            "value": [{"id": "item-1", "name": "Item 1"}]
        }
        mock_http_client.get.return_value.status_code = 200
        auth = MagicMock()
        auth.get_token.return_value = "fake-token"
        collector = YourCollector(auth=auth)

        # Act
        result = collector.collect()

        # Assert
        assert "items" in result
        assert "summary" in result
        assert result["summary"]["total_items"] == 1

    def test_collect_raises_on_403(self, mock_http_client):
        """Validates 403 error handling."""
        # Arrange
        mock_response = MagicMock()
        mock_response.status_code = 403
        mock_response.text = '{"error": "Forbidden"}'
        mock_http_client.get.return_value = mock_response
        mock_response.raise_for_status.side_effect = httpx.HTTPStatusError(
            "Forbidden", request=MagicMock(), response=mock_response
        )
        auth = MagicMock()
        auth.get_token.return_value = "fake-token"
        collector = YourCollector(auth=auth)

        # Act & Assert
        with pytest.raises(ForbiddenError):
            collector.collect()
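A test can also avoid depending on which HTTP library `BaseCollector` uses by patching the collector's own `_paginate` method. The sketch below is self-contained, so it defines a stand-in `DemoCollector` instead of importing from fabricgov. The class, its `_paginate` signature, and the endpoint are hypothetical.

```python
from typing import Any
from unittest.mock import patch


class DemoCollector:
    """Stand-in for a real collector; not part of fabricgov."""

    def _paginate(self, endpoint: str) -> list[dict[str, Any]]:
        raise RuntimeError("network access is not available in unit tests")

    def collect(self) -> dict[str, Any]:
        items = self._paginate("/v1/your-endpoint")
        return {"items": items, "summary": {"total_items": len(items)}}


def test_collect_counts_items() -> None:
    fake_items = [{"id": "item-1"}, {"id": "item-2"}]
    # Replace the method on the class so no request is ever attempted.
    with patch.object(DemoCollector, "_paginate", return_value=fake_items) as mocked:
        result = DemoCollector().collect()

    mocked.assert_called_once_with("/v1/your-endpoint")
    assert result["summary"]["total_items"] == 2
```

Patching at this level tests only the logic inside `collect()`, which may be enough when retry and pagination are already covered by tests for the base class.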
Run tests:
poetry run pytest tests/ -v
Review Process
Before Opening a Pull Request
- Run tests: `poetry run pytest tests/ -v`
- Format code: `poetry run black fabricgov/ tests/`
- Validate type hints
- Test manually with real credentials
Pull Request Checklist
- [ ] Code is formatted (black)
- [ ] Unit tests added and passing
- [ ] Manual test executed successfully
- [ ] Documentation updated (`docs/en/collectors.md` or similar)
- [ ] `__init__.py` updated to expose new modules
- [ ] Commit follows convention (see below)
What We Look for in a Review
- Clarity: Code is easy to understand
- Reuse: Makes use of `BaseCollector` features
- Error handling: Raises appropriate custom exceptions
- Performance: No unnecessary API calls
- Documentation: Complete docstrings and usage examples
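One way to satisfy the error-handling point is to translate HTTP status codes into exceptions in a single place. The sketch below is only an illustration: it defines local stand-ins rather than importing from `fabricgov.exceptions`. Apart from `ForbiddenError`, which this guide mentions, the class names and the `raise_for_status_code` helper are hypothetical.

```python
class CollectorError(Exception):
    """Hypothetical base class for collector failures."""


class UnauthorizedError(CollectorError):
    """Hypothetical: raised for HTTP 401."""


class ForbiddenError(CollectorError):
    """Local stand-in for the ForbiddenError mentioned in this guide."""


class TooManyRequestsError(CollectorError):
    """Hypothetical: raised for HTTP 429."""


# Status codes with a dedicated exception; anything else falls back to the base.
_STATUS_TO_ERROR: dict[int, type[CollectorError]] = {
    401: UnauthorizedError,
    403: ForbiddenError,
    429: TooManyRequestsError,
}


def raise_for_status_code(status_code: int, endpoint: str) -> None:
    """Raises a mapped exception for 4xx/5xx codes; does nothing otherwise."""
    if status_code < 400:
        return
    error_cls = _STATUS_TO_ERROR.get(status_code, CollectorError)
    raise error_cls(f"HTTP {status_code} from {endpoint}")
```

Callers can then catch `ForbiddenError` specifically, or `CollectorError` for any failure, without inspecting status codes themselves.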
Commit Conventions
We follow Conventional Commits. Each commit message has the form `<type>(<scope>): <description>`.
Types
- `feat`: New feature
- `fix`: Bug fix
- `docs`: Documentation changes
- `test`: Adds or fixes tests
- `refactor`: Refactoring without changing functionality
- `chore`: Maintenance tasks (build, CI, etc.)
Scopes
- `auth`: Authentication module
- `reporters`: InsightsEngine, HtmlReporter, templates
- `analyze`: `fabricgov analyze` command
- `collectors`: Data collectors
- `exporters`: Exporters
- `cli`: Command-line interface
- `exceptions`: Custom exceptions
- `docs`: Documentation
Examples
# New feature
feat(collectors): add CapacityConsumptionCollector
# Bug fix
fix(auth): handle token expiration in ServicePrincipalAuth
# Documentation
docs(collectors): add examples for WorkspaceInventoryCollector
# Tests
test(auth): add unit tests for DeviceFlowAuth
# Refactoring
refactor(collectors): extract pagination logic to BaseCollector
# Maintenance
chore(deps): update msal to 1.35.0
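To check a message against this convention before committing, a small regular expression is usually enough. The sketch below validates only the type and the overall shape; it accepts any lowercase scope because the examples above also use scopes such as `deps` that are not in the scope list. The names `COMMIT_PATTERN` and `is_conventional` are hypothetical and not part of the project tooling.

```python
import re

# Types listed in this guide.
COMMIT_TYPES = ("feat", "fix", "docs", "test", "refactor", "chore")

# <type>, optional (<scope>), then ": " and a non-empty description.
COMMIT_PATTERN = re.compile(
    r"^(?P<type>" + "|".join(COMMIT_TYPES) + r")"
    r"(?:\((?P<scope>[a-z0-9_-]+)\))?"
    r": (?P<description>\S.*)$"
)


def is_conventional(message: str) -> bool:
    """Checks the first line of a commit message against the pattern."""
    lines = message.splitlines()
    return bool(lines) and COMMIT_PATTERN.match(lines[0]) is not None


if __name__ == "__main__":
    for example in (
        "feat(collectors): add CapacityConsumptionCollector",
        "updated some files",
    ):
        print(f"{example!r}: {is_conventional(example)}")
```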
Reporting Bugs
Open an issue on GitHub with:
- Descriptive title: "ForbiddenError when collecting workspaces with SP"
- Python and fabricgov version
- Steps to reproduce
- Expected vs actual behavior
- Full traceback (without exposing credentials)
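The version details for a report can be gathered with the standard library. The sketch below is one way to do it, assuming the package is installed under the distribution name `fabricgov`; the `environment_report` function is a hypothetical helper, not a fabricgov command.

```python
import platform
from importlib import metadata


def environment_report(package: str = "fabricgov") -> str:
    """Builds the Environment section of a bug report."""
    try:
        package_version = metadata.version(package)
    except metadata.PackageNotFoundError:
        # The package is not installed in the current environment.
        package_version = "not installed"
    return "\n".join(
        [
            f"- Python: {platform.python_version()}",
            f"- {package}: {package_version}",
            f"- OS: {platform.platform()}",
        ]
    )


print(environment_report())
```

The printed lines can be pasted directly under the Environment heading of the issue.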
Template:
### Description
[short description of the problem]
### Environment
- Python: 3.12.2
- fabricgov: 0.6.0
- OS: Ubuntu 24.04
### Reproduction
1. Run `collector.collect()`
2. Observe 403 error
### Expected behavior
Should collect data without error
### Actual behavior
Contribution Ideas
Areas where contributions are especially welcome:
New governance findings (v0.9.0+)
- New finding types in `InsightsEngine._build_findings()`
- Snapshot comparison (`fabricgov diff`)
- Azure Key Vault integration for credentials
Exporters
- Export to Excel (.xlsx) with multiple sheets
- Azure Blob Storage integration
Documentation
- More real-world usage examples
- Troubleshooting guide
Tests
- Increase unit test coverage for collectors
- Integration tests with mocked API
Contact
- Issues: github.com/luhborba/fabricgov/issues
- Discussions: github.com/luhborba/fabricgov/discussions
License
By contributing, you agree that your contributions will be licensed under the MIT License.
Thank you for contributing to fabricgov!