Contributing Guide

Thank you for considering contributing to fabricgov! This guide will help you understand the project structure and how to add new collectors, exporters, or other improvements.


📋 Table of Contents

  1. Environment Setup
  2. Project Structure
  3. Code Conventions
  4. How to Add a New Collector
  5. How to Add Tests
  6. Review Process
  7. Commit Conventions

🛠️ Environment Setup

Prerequisites

  • Python 3.12+
  • Poetry 1.8+
  • Git

Setup

# Clone the repository
git clone https://github.com/luhborba/fabricgov.git
cd fabricgov

# Install dependencies
poetry install

# Activate the virtual environment
poetry shell

# Run tests
poetry run pytest tests/ -v

Credential Configuration

Create a .env file in the project root:

FABRICGOV_TENANT_ID=your-tenant-id
FABRICGOV_CLIENT_ID=your-client-id
FABRICGOV_CLIENT_SECRET=your-client-secret


🏗️ Project Structure

fabricgov/
├── fabricgov/
│   ├── auth/                  # Authentication module
│   │   ├── base.py            # AuthProvider protocol
│   │   ├── service_principal.py
│   │   └── device_flow.py
│   ├── cli/                   # CLI via Click
│   │   ├── main.py            # Main `fabricgov` group
│   │   ├── auth.py            # `fabricgov auth` commands
│   │   ├── collect.py         # `fabricgov collect` commands
│   │   ├── report.py          # `fabricgov report` command
│   │   ├── analyze.py         # `fabricgov analyze` command
│   │   └── session.py         # Session management (`collect all`)
│   ├── reporters/             # HTML report and data analysis
│   │   ├── insights.py        # InsightsEngine: reads CSVs and computes metrics
│   │   ├── html_reporter.py   # HtmlReporter: Plotly charts + Jinja2
│   │   └── templates/         # Jinja2 templates
│   ├── collectors/            # Data collectors (11 total)
│   │   ├── base.py            # BaseCollector (retry, pagination, rate limiting)
│   │   ├── workspace_inventory.py
│   │   ├── workspace_access.py
│   │   ├── report_access.py
│   │   ├── dataset_access.py
│   │   ├── dataflow_access.py
│   │   ├── refresh_history.py
│   │   ├── refresh_schedule.py
│   │   ├── domain.py
│   │   ├── tag.py
│   │   ├── capacity.py
│   │   └── workload.py
│   ├── exporters/             # Result exporters
│   │   └── file_exporter.py   # JSON/CSV with run_dir support
│   ├── config.py              # Auth preference system
│   ├── progress.py            # ProgressManager (rich)
│   ├── checkpoint.py          # Checkpoint system
│   └── exceptions.py          # Custom exceptions
├── tests/
│   ├── auth/                  # Unit tests for the auth module
│   ├── manual/                # Manual tests for development
│   └── pytest.ini
├── docs/                      # Documentation
│   ├── en/                    # English docs
│   │   ├── authentication.md
│   │   ├── collectors.md
│   │   ├── exporters.md
│   │   ├── limitations.md
│   │   └── contributing.md
│   ├── authentication.md      # Portuguese docs
│   ├── collectors.md
│   ├── exporters.md
│   ├── limitations.md
│   └── contributing.md
├── pyproject.toml             # Dependencies and Poetry configuration
└── README.md

📝 Code Conventions

Code Style

We follow PEP 8 with some adaptations:

  • Indentation: 4 spaces
  • Max line length: 88 characters (Black default)
  • Imports: grouped as stdlib → third-party → local
  • Type hints: required on all public functions

Formatting

# Auto-format code
poetry run black fabricgov/ tests/

# Check style
poetry run flake8 fabricgov/ tests/

Docstrings

We use Google Style docstrings:

def collect(self) -> dict[str, Any]:
    """
    Runs the full workspace inventory collection.

    Returns:
        Dictionary with workspaces, artifacts, and summary.

    Raises:
        ForbiddenError: if the SP lacks Admin permissions.
        TimeoutError: if the scan exceeds max_poll_time.
    """
    pass


🔧 How to Add a New Collector

Step 1: Define the Domain

First, identify:

  • Which API will be used? (Fabric REST, Power BI REST, DAX query)
  • What data will be collected?
  • What is the recommended frequency? (daily, weekly, on-demand)

Step 2: Create the File

touch fabricgov/collectors/your_collector.py

Step 3: Implement the Collector

Basic template:

from datetime import datetime
from typing import Any
from fabricgov.auth.base import AuthProvider
from fabricgov.collectors.base import BaseCollector


class YourCollector(BaseCollector):
    """
    Brief description of what this collector does.

    API used: [API name]
    Main endpoint: [endpoint]

    Usage:
        collector = YourCollector(auth=auth)
        result = collector.collect()
    """

    # Required OAuth2 scope
    SCOPE = "https://api.fabric.microsoft.com/.default"
    # or "https://analysis.windows.net/powerbi/api/.default"

    def __init__(
        self,
        auth: AuthProvider,
        **kwargs
    ):
        """
        Args:
            auth: Authentication provider
        """
        # Set the correct base_url for the API
        super().__init__(
            auth=auth,
            base_url="https://api.fabric.microsoft.com",  # or powerbi.com
            **kwargs
        )

    def collect(self) -> dict[str, Any]:
        """
        Executes data collection.

        Returns:
            Structured dictionary with the collected data.
        """
        # Simple GET example
        response = self._get(
            endpoint="/v1/your-endpoint",
            scope=self.SCOPE,
            params={"$top": 1000}
        )

        # GET with pagination example
        items = self._paginate(
            endpoint="/v1/your-endpoint",
            scope=self.SCOPE,
            params={"$top": 1000}
        )

        # Structure the result
        return {
            "items": items,
            "summary": {
                "total_items": len(items),
                "collection_time": datetime.now().isoformat(),
            }
        }

Step 4: Expose in __init__.py

Edit fabricgov/collectors/__init__.py:

from fabricgov.collectors.base import BaseCollector
from fabricgov.collectors.workspace_inventory import WorkspaceInventoryCollector
from fabricgov.collectors.your_collector import YourCollector  # Add

__all__ = [
    "BaseCollector",
    "WorkspaceInventoryCollector",
    "YourCollector",  # Add
]

Step 5: Create a Manual Test

Create tests/manual/test_your_collector.py:

from fabricgov.auth import ServicePrincipalAuth
from fabricgov.collectors import YourCollector

auth = ServicePrincipalAuth.from_env()
collector = YourCollector(auth=auth)
result = collector.collect()

print(f"Total items: {result['summary']['total_items']}")

Run it:

poetry run python tests/manual/test_your_collector.py

Step 6: Add Documentation

Add a section to docs/en/collectors.md describing:

  • What the collector does
  • Constructor parameters
  • Output structure
  • Usage examples
  • Known limitations


🧪 How to Add Tests

Unit Tests

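Create tests/collectors/test_your_collector.py. The mocker fixture used in this example is provided by the pytest-mock plugin, so that plugin must be installed in the environment: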

import pytest
from unittest.mock import MagicMock
from fabricgov.collectors import YourCollector


@pytest.fixture(autouse=True)
def mock_http_client(mocker):
    """Mocks the HTTP client to avoid real API calls."""
    mock_client = MagicMock()
    mocker.patch("httpx.Client", return_value=mock_client)
    return mock_client


class TestYourCollector:

    def test_collect_returns_correct_structure(self, mock_http_client):
        """Validates that collect() returns the expected structure."""
        # Arrange
        mock_http_client.get.return_value.json.return_value = {
            "value": [{"id": "item-1", "name": "Item 1"}]
        }
        mock_http_client.get.return_value.status_code = 200

        auth = MagicMock()
        auth.get_token.return_value = "fake-token"

        collector = YourCollector(auth=auth)

        # Act
        result = collector.collect()

        # Assert
        assert "items" in result
        assert "summary" in result
        assert result["summary"]["total_items"] == 1

    def test_collect_raises_on_403(self, mock_http_client):
        """Validates 403 error handling."""
        # Arrange
        import httpx
        mock_response = MagicMock()
        mock_response.status_code = 403
        mock_response.text = '{"error": "Forbidden"}'

        mock_http_client.get.return_value = mock_response
        mock_http_client.get.return_value.raise_for_status.side_effect = (
            httpx.HTTPStatusError("Forbidden", request=MagicMock(), response=mock_response)
        )

        auth = MagicMock()
        auth.get_token.return_value = "fake-token"

        collector = YourCollector(auth=auth)

        # Act & Assert
        from fabricgov.exceptions import ForbiddenError
        with pytest.raises(ForbiddenError):
            collector.collect()

Run tests:

poetry run pytest tests/collectors/test_your_collector.py -v


🔍 Review Process

Before Opening a Pull Request

  1. Run tests:

       poetry run pytest tests/ -v

  2. Format code:

       poetry run black fabricgov/ tests/

  3. Validate type hints:

       poetry run mypy fabricgov/

  4. Test manually with real credentials

Pull Request Checklist

  • [ ] Code is formatted (black)
  • [ ] Unit tests added and passing
  • [ ] Manual test executed successfully
  • [ ] Documentation updated (docs/en/collectors.md or similar)
  • [ ] __init__.py updated to expose new modules
  • [ ] Commit follows convention (see below)

What We Look for in a Review

  • Clarity: Code is easy to understand
  • Reuse: Makes use of BaseCollector features
  • Error handling: Raises appropriate custom exceptions
  • Performance: No unnecessary API calls
  • Documentation: Complete docstrings and usage examples
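
To make the error-handling criterion concrete, here is a minimal sketch of translating HTTP status codes into specific exceptions instead of letting a generic error escape. The classes below are local stand-ins defined for this example. Only the name `ForbiddenError` appears in this guide, and the real classes in fabricgov/exceptions.py may differ in name, hierarchy, and signature.

```python
class CollectorError(Exception):
    """Stand-in base class for this sketch (hypothetical)."""


class ForbiddenError(CollectorError):
    """Stand-in for the 403 case."""


class NotFoundError(CollectorError):
    """Hypothetical exception for the 404 case."""


# Known status codes and the exception each one should surface as.
_STATUS_TO_ERROR: dict[int, type[CollectorError]] = {
    403: ForbiddenError,
    404: NotFoundError,
}


def raise_for_status(status_code: int, endpoint: str) -> None:
    """Raise a specific exception for known error codes, generic otherwise."""
    if status_code < 400:
        return
    error_cls = _STATUS_TO_ERROR.get(status_code, CollectorError)
    raise error_cls(f"[{status_code}] request to {endpoint} failed")
```

Callers can then catch `ForbiddenError` to report a permissions problem while still catching the base class for everything else.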

📝 Commit Conventions

We follow Conventional Commits:

<type>(<scope>): <short description>

<optional body>

Types

  • feat: New feature
  • fix: Bug fix
  • docs: Documentation changes
  • test: Adds or fixes tests
  • refactor: Refactoring without changing functionality
  • chore: Maintenance tasks (build, CI, etc.)

Scopes

  • auth: Authentication module
  • reporters: InsightsEngine, HtmlReporter, templates
  • analyze: fabricgov analyze command
  • collectors: Data collectors
  • exporters: Exporters
  • cli: Command-line interface
  • exceptions: Custom exceptions
  • docs: Documentation

Examples

# New feature
feat(collectors): add CapacityConsumptionCollector

# Bug fix
fix(auth): handle token expiration in ServicePrincipalAuth

# Documentation
docs(collectors): add examples for WorkspaceInventoryCollector

# Tests
test(auth): add unit tests for DeviceFlowAuth

# Refactoring
refactor(collectors): extract pagination logic to BaseCollector

# Maintenance
chore(deps): update msal to 1.35.0
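
If you want to check a message locally before committing, the header format can be approximated with a regular expression. This is a hedged sketch, not a tool shipped with fabricgov: `is_valid_header` is a hypothetical helper, it treats the scope as required because the format above shows one, and it accepts any lowercase scope because the `chore(deps)` example uses a scope outside the list.

```python
import re

# Commit types copied from the "Types" list in this guide.
TYPES = ("feat", "fix", "docs", "test", "refactor", "chore")

# <type>(<scope>): <short description>
HEADER_RE = re.compile(
    r"^(?P<type>" + "|".join(TYPES) + r")"
    r"\((?P<scope>[a-z0-9_-]+)\): "
    r"(?P<description>\S.*)$"
)


def is_valid_header(message: str) -> bool:
    """Check the first line of a commit message against the convention."""
    lines = message.splitlines()
    first_line = lines[0] if lines else ""
    return HEADER_RE.match(first_line) is not None


if __name__ == "__main__":
    for candidate in (
        "feat(collectors): add CapacityConsumptionCollector",
        "added a collector",
    ):
        print(candidate, "->", is_valid_header(candidate))
```

Only the first line is checked, so an optional body after a blank line does not affect the result.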

🐛 Reporting Bugs

Open an issue on GitHub with:

  1. Descriptive title: "ForbiddenError when collecting workspaces with SP"
  2. Python and fabricgov version
  3. Steps to reproduce
  4. Expected vs actual behavior
  5. Full traceback (without exposing credentials)

Template:

### Description
[short description of the problem]

### Environment
- Python: 3.12.2
- fabricgov: 0.6.0
- OS: Ubuntu 24.04

### Reproduction
1. Run `collector.collect()`
2. Observe 403 error

### Expected behavior
Should collect data without error

### Actual behavior
ForbiddenError: [403] Access denied...

### Additional context
The SP has Tenant.Read.All permissions configured.


💡 Contribution Ideas

Areas where contributions are especially welcome:

New governance findings (v0.9.0+)

  • New finding types in InsightsEngine._build_findings()
  • Snapshot comparison (fabricgov diff)
  • Azure Key Vault integration for credentials

Exporters

  • Export to Excel (.xlsx) with multiple sheets
  • Azure Blob Storage integration

Documentation

  • More real-world usage examples
  • Troubleshooting guide

Tests

  • Increase unit test coverage for collectors
  • Integration tests with mocked API

📞 Contact


📄 License

By contributing, you agree that your contributions will be licensed under the MIT License.


Thank you for contributing to fabricgov!

