
A Logging Framework for Microsoft Fabric


I was working on a client project with multiple Fabric notebooks running ETL pipelines, and we had no good way to track what was happening across the different processes. Each notebook was doing its own thing, and when something went wrong, it took forever to figure out where the problem started.

The client kept asking basic questions: How many records did we process yesterday? Which tables failed to load? What's our success rate this week? We had no answers because the logging was scattered everywhere.

So I built a simple framework that creates everything you need for monitoring in one go. One Python file, drop it into your notebook's Resources folder, and you get complete operational visibility. The concept behind this approach is to create logging for one project at a time rather than implementing monitoring for an entire tenant or workspace: the code is driven by the project name, while the solution functions as a reusable framework.


The problem I kept running into

Working with different clients on Fabric projects, I see the same monitoring issues over and over:

  • Teams build custom logging tables for each project

  • No standard way to track operations across notebooks

  • When a data pipeline fails, there's no central place to see what happened

  • Building monitoring infrastructure takes weeks of development time

  • Each developer does logging differently

I wanted something I could deploy on day one of any Fabric project that would immediately give the team visibility into their data operations.


What it creates

The framework is one Python file that you upload to your notebook's Resources folder. When you initialise it, it automatically creates:

  • A lakehouse for monitoring data (or uses your default lakehouse if one is attached)

  • Tables for tracking operations and time dimensions

  • A Power BI semantic model with relationships already set up

  • DAX measures for common monitoring needs

Here's how simple it is to use:

from builtin.fabric_logging_utils import FabricLogger

# Creates monitoring infrastructure automatically
logger = FabricLogger("MyProject")

# Log any data operation
logger.log_operation(
    notebook_name="DailyETL",
    table_name="sales_data",
    operation_type="INSERT",
    rows_before=1000,
    rows_after=1500,
    execution_time=2.3,
    message="Daily load completed"
)

# See what happened recently
logger.show_recent(10)

That's it. Minimal configuration, no setup wizards.
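
One thing the snippet above doesn't show is failure logging. Here's a sketch of the pattern I use: wrap each step in try/except so failures land in the log too. Note that run_daily_load() and the "ERROR:" message prefix are my own conventions for this example, not part of the framework.

import time

start = time.time()
rows_before = spark.read.table("sales_data").count()

try:
    run_daily_load()  # hypothetical ETL step
    logger.log_operation(
        notebook_name="DailyETL",
        table_name="sales_data",
        operation_type="INSERT",
        rows_before=rows_before,
        rows_after=spark.read.table("sales_data").count(),
        execution_time=time.time() - start,
        message="Daily load completed",
    )
except Exception as exc:
    # Log the failure too, so the dashboard shows it instead of a silent gap
    logger.log_operation(
        notebook_name="DailyETL",
        table_name="sales_data",
        operation_type="INSERT",
        rows_before=rows_before,
        rows_after=rows_before,
        execution_time=time.time() - start,
        message=f"ERROR: {exc}",
    )
    raise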


What gets created automatically

Component        Name                           Description
Lakehouse        LH_{ProjectName}_Monitoring    Storage for monitoring data
Fact Table       monitoring_log                 All logged operations
Date Dimension   dim_date                       4 years of dates with attributes
Time Dimension   dim_time                       1,440 time slots (every minute)
Semantic Model   SM_{ProjectName}_Monitoring    Direct Lake model with relationships and measures

The semantic model comes with 8 pre-built measures: Total Operations, Total Rows Changed, Average Execution Time, Error Count, Success Rate, Operations Today, Unique Tables, and Unique Notebooks. You can, of course, extend both the measures and the framework itself.
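
A quick way to sanity-check the model after initialisation is to query those measures with semantic link (sempy ships in Fabric notebooks). This is a minimal sketch: the model name follows the SM_{ProjectName}_Monitoring convention above, and "dim_date[Date]" as the column name is my assumption, so check the generated model for the actual names.

import sempy.fabric as fabric

# Evaluate the framework's pre-built measures, grouped by date
df = fabric.evaluate_measure(
    dataset="SM_MyProject_Monitoring",
    measure=["Total Operations", "Error Count", "Success Rate"],
    groupby_columns=["dim_date[Date]"],
)
display(df)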



Use cases I see most often


  • ETL pipeline health monitoring: One client runs 15 different notebooks for their daily data refresh. Now they have a dashboard showing which processes completed successfully, how long each took, and where failures occurred. The operations team checks this first thing every morning. Instead of adding the resource file to each notebook, you can add it to a shared Fabric environment.

  • Data quality tracking: Another client uses this to monitor data quality across their medallion architecture. They log row counts at each bronze, silver, and gold layer transformation. When row counts don't match expectations, they get immediate visibility into where the issue happened (a sketch of this pattern follows this list).

  • Performance optimisation: Teams use the execution time tracking to identify slow-running processes. I had one client discover that a transformation expected to take 5 minutes was taking 45. The logging data helped them pinpoint exactly which notebook was the bottleneck and when the slowdown started.
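
Here's what the medallion-layer pattern from the data quality use case might look like in practice. The table names and clean_and_conform() are illustrative, not part of the framework; only log_operation comes from it.

import time

start = time.time()
bronze_df = spark.read.table("bronze_sales")
silver_df = clean_and_conform(bronze_df)  # hypothetical transformation
silver_df.write.mode("overwrite").saveAsTable("silver_sales")

# Log row counts at the hop so mismatches show up on the dashboard
logger.log_operation(
    notebook_name="SalesMedallion",
    table_name="silver_sales",
    operation_type="OVERWRITE",
    rows_before=bronze_df.count(),
    rows_after=spark.read.table("silver_sales").count(),
    execution_time=time.time() - start,
    message="Bronze -> Silver transformation; counts should match 1:1",
)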


Getting started

  1. Download fabric_logging_utils.py from the GitHub repo

  2. Upload it to your notebook's Resources folder

  3. Import and initialise:

from builtin.fabric_logging_utils import FabricLogger
logger = FabricLogger("YourProjectName")

That's all the setup you need. The framework handles creating tables, relationships, and semantic models. You focus on your data processing, and it takes care of the monitoring.





Potential enhancements

The framework includes 8 standard measures, but you can easily add project-specific ones. I often add measures for business-specific metrics like "Revenue Impact of Failed Loads" or "Customer Records Processed."
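
If you'd rather script those additions than edit the model by hand, something like the semantic-link-labs package can do it from a notebook. This is only a sketch: the measure name and DAX expression are illustrative, [Total Rows Changed] is one of the framework's built-ins, and you should verify the API against the package docs.

from sempy_labs.tom import connect_semantic_model

# Open the framework's model for writing and add a custom measure
with connect_semantic_model(dataset="SM_MyProject_Monitoring", readonly=False) as tom:
    tom.add_measure(
        table_name="monitoring_log",
        measure_name="Rows Changed Today",
        expression="CALCULATE ( [Total Rows Changed], dim_date[Date] = TODAY () )",
    )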

Other ideas I've seen teams implement:

  • Connect to Data Activator or Power Automate for automatic notifications when error rates spike (a rough sketch of such a check follows this list)

  • Combine execution time data with Fabric compute costs to understand which processes are most expensive

  • Deploy across dev, test, and production to compare performance across environments
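
As a starting point for the error-rate idea, here's a rough sketch of a check you could wire up to Activator or Power Automate. The column names (log_timestamp, status) and the 5% threshold are assumptions; inspect your monitoring_log schema for the real ones.

from pyspark.sql import functions as F

log = spark.read.table("LH_MyProject_Monitoring.monitoring_log")

# Operations and errors over the last 24 hours
last_day = log.filter(
    F.col("log_timestamp") >= F.expr("current_timestamp() - INTERVAL 24 HOURS")
)
rates = last_day.agg(
    F.count("*").alias("ops"),
    F.sum(F.when(F.col("status") == "ERROR", 1).otherwise(0)).alias("errors"),
).first()

if rates["ops"] and rates["errors"] / rates["ops"] > 0.05:
    print("Error rate above 5% in the last 24 hours - fire a notification here")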


What teams tell me

The most common feedback is that this changes how teams think about their data processes. Instead of finding out about problems when users complain, they have proactive visibility into what's happening.


Operations teams love having a single place to check pipeline health. Data engineers appreciate not having to build monitoring infrastructure for every project. Business stakeholders finally have visibility into the reliability of their data processes.


What began as a fix for one client's monitoring problem has evolved into a tool I use on every Fabric project. I've made it open source in the hope that others can draw inspiration from it.


The Microsoft Fabric Logging Framework is available at: github.com/prathyusha-kamasani/Microsoft-Fabric-Logging-Framework


Until next time,

Prathy 🙂



