
Naming Convention Ideas for Microsoft Fabric Developments

Naming conventions are essential when creating data solutions, no matter the platform or artifact. I've seen various approaches in my career, but the one I followed most was Jamie Thomson's recommendations for SSIS, and I think these can also be applied to Fabric.


Anyone familiar with SSIS has likely encountered Jamie Thomson's naming conventions blog post. Although I couldn't locate the original post, Koen has blogged about it and enhanced the original list: https://sqlkover.com/ssis-naming-conventions-2-0/. When I worked with SSIS, I strictly followed these naming conventions. The charm of the approach was its simplicity. Let's explore how it can be applied within Fabric.


Example

For example, let's assume my use case is an analytics project for a retail company. I have three tables: Sales, Product, and Customer. Imagine it's a data platform project with several data transformation stages, such as ingestion, transformation, and presentation layers. Below are my sample tables.

Product Table

  • Product_Name

  • Product_ID

Sales Table

  • Sales_ID

  • Product_ID

  • Customer_ID

  • DateTime

  • Amount

Customer Table

  • Customer_ID

  • Customer_Name

  • Customer_Address


Folder Naming

Workspace naming is probably more important than folder naming, but it falls nicely under governance, so I will discuss it in another post. Let's assume all transformations occur in a single workspace, and I have Folders, Notebooks, Pipelines, Lakehouses, Semantic Models, and Reports. I will start with three folders, one for each data transformation stage: Ingest, Validate, and Curate. I am calling my project Data Nova Analytics (DNA) Sales. So, my folder naming will be something like this:

  • ING_DNA_Sales

  • VAL_DNA_Sales

  • CUR_DNA_Sales

Each of these folders will contain all items related to activities happening in that stage.


In typical situations, we have environments such as DEV, UAT, and PROD. In these cases, I will prefix the above with the appropriate environment name.

  • DEV_ING_DNA_Sales

  • DEV_VAL_DNA_Sales

  • DEV_CUR_DNA_Sales

But for this post, I am going to ignore environments.
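The folder pattern above is simple enough to generate rather than type by hand. Here is a minimal Python sketch of that idea; the function name `folder_name` and the `STAGE_PREFIXES` mapping are my own illustrative helpers, not anything Fabric provides.

```python
from typing import Optional

# Stage names and their prefixes, as used in this post.
STAGE_PREFIXES = {"Ingest": "ING", "Validate": "VAL", "Curate": "CUR"}


def folder_name(stage: str, project: str, environment: Optional[str] = None) -> str:
    """Build a folder name such as ING_DNA_Sales or DEV_ING_DNA_Sales."""
    parts = [STAGE_PREFIXES[stage], project]
    if environment:
        # The environment, when given, goes in front of the stage prefix.
        parts.insert(0, environment.upper())
    return "_".join(parts)


print(folder_name("Ingest", "DNA_Sales"))         # ING_DNA_Sales
print(folder_name("Curate", "DNA_Sales", "dev"))  # DEV_CUR_DNA_Sales
```

An unknown stage raises a `KeyError`, which is deliberate here: a typo in the stage name should fail loudly instead of producing a folder that breaks the convention.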


Naming Conventions for Items

Next-level naming depends on the items I will be working with. Below are a few examples:

  • Lakehouse: ING_LH_DNA

  • Pipeline: ING_DP_DNA_Product_Sales

  • Notebook: ING_NB_DNA_Product_Dedupe

  • Semantic Model: ING_SM_DNA_Sales_MI

I am using capital letters for prefixes and key identifiers (e.g., ING, VAL, CUR) and capitalising the first letter of each word in the descriptive parts of the name (e.g., Product_Sales, Product_Dedupe). I think this keeps names consistent and easy to spot while still being readable. Also, it works well in the presentation layer.
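A convention like this can be checked automatically, for example before an item is deployed. The sketch below is one possible way to do it with a regular expression. The pattern is my own reading of the examples above (stage prefix, item type acronym, upper-case project code, then capitalised words), so treat it as an assumption and adjust it to your own rules.

```python
import re

# Assumed shape: <STAGE>_<ITEM TYPE>_<PROJECT>[_<Word>...]
# Stage and item type acronyms are the ones used in this post.
ITEM_NAME_PATTERN = re.compile(
    r"(ING|VAL|CUR)_(LH|DP|NB|SM)_[A-Z]+(_[A-Z][A-Za-z0-9]*)*"
)


def is_valid_item_name(name: str) -> bool:
    """Return True when the whole name matches the assumed convention."""
    return ITEM_NAME_PATTERN.fullmatch(name) is not None


for candidate in ["ING_LH_DNA", "ING_NB_DNA_Product_Dedupe", "ing_lh_dna"]:
    print(candidate, is_valid_item_name(candidate))
```

The first two names pass and the lower-case one fails, because both the prefixes and the first letter of each descriptive word must be capitalised.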


Items Within Lakehouse or Warehouse

The next level is the items within a Lakehouse or Warehouse. The same logic applies there, too. However, at this level, I tend to remove the data transformation layer prefixes like ING. Naming conventions help us understand what an item is doing, and they are also useful when we view the items in logging applications. Inside a Lakehouse or Warehouse, though, developers already know which context they are working in, so I think we can skip the data transformation layer prefix.


Let's look into an example Lakehouse: ING_LH_DNA

Simple table names would be something like the one below.

  • DNA_Dim_Products

  • DNA_Fact_Sales

For column names, I will go for snake case with "_".

  • Product_Name

  • Product_ID


For some organisations with Mesh architectures or solutions spanning multiple domains or products, I would prefix the names with the domain and the product.


E.g. Domain_Product_[Lakehouse representation]_[table type representation]


Data Pipelines Naming

When it comes to Pipelines, I would blindly follow what I was doing with SSIS, which is the acronym of the task name followed by a description.

E.g.

  • Copy Data Activity: CD_SourceName_To_DestinationName

  • Semantic Model Refresh: SMDR_PROD_DNA_Sales
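One benefit of the acronym prefix is that activity names become easy to interpret when they show up in a logging table. As a rough sketch, the Python below splits an activity name back into its activity type and description. The `ACRONYM_TO_ACTIVITY` mapping only covers the two acronyms from this post, and the helper name `describe_activity` is hypothetical.

```python
from typing import Tuple

# Acronyms from the examples above; extend this with your own list.
ACRONYM_TO_ACTIVITY = {
    "CD": "Copy Data Activity",
    "SMDR": "Semantic Model Refresh",
}


def describe_activity(name: str) -> Tuple[str, str]:
    """Split an activity name into (activity type, description)."""
    acronym, _, description = name.partition("_")
    activity_type = ACRONYM_TO_ACTIVITY.get(acronym, "Unknown")
    return activity_type, description


print(describe_activity("CD_SourceName_To_DestinationName"))
print(describe_activity("SMDR_PROD_DNA_Sales"))
```

Only the first underscore is treated as the separator, so the description keeps its own underscores intact.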


Presentation Layer Naming

I generally adhere to a straightforward rule: anything shared outside the developers' zone should not use developer naming conventions but should adhere to business naming conventions. Therefore, aside from the presentation layer, I consistently use full naming conventions. What you refer to as the presentation layer largely depends on your distribution approach.

Here are a few scenarios that come to mind:

  • Distribution solely through Power BI or Fabric App with view-only access, preventing updates to any reports or visualisations

  • Reports shared with Personalised Visuals or Data Exploration functionalities

  • General Self-Service BI where end users can create Semantic Models or reports


For the last scenario, it is common for end users to want to create their own semantic models on top of a Lakehouse or develop reports on existing semantic models. They want to understand which tables are dimensions, facts, bridges, factless facts, etc. I prefer to offer documentation and include additional information in descriptions rather than incorporating Dim or Fact in table names.


Presentation Layer Lakehouses/Warehouses Naming

When it comes to Lakehouses/Warehouses, I use "_" in table and column names; however, I remove them in Semantic Models. I use Semantic Link Labs to update the names. I will post a blog about it soon. In the meantime, here is a reference: https://github.com/microsoft/semantic-link-labs/blob/main/notebooks/Tabular%20Object%20Model.ipynb
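The renaming rule itself is just string handling, so it can be prepared before touching the model. The sketch below only builds the mapping from technical names to business names in plain Python. It does not call Semantic Link Labs, and the helper `to_business_name` and its `drop_prefixes` parameter are my own assumptions for illustration; applying the mapping to a model is left to whichever tool you use.

```python
from typing import Iterable


def to_business_name(technical_name: str, drop_prefixes: Iterable[str] = ()) -> str:
    """Turn Product_Name into 'Product Name', optionally dropping leading prefixes."""
    prefixes = set(drop_prefixes)
    parts = [part for part in technical_name.split("_") if part]
    # Remove developer prefixes such as DNA, Dim, or Fact from the front only.
    while parts and parts[0] in prefixes:
        parts.pop(0)
    return " ".join(parts)


columns = ["Product_ID", "Product_Name", "Customer_ID"]
rename_map = {column: to_business_name(column) for column in columns}
print(rename_map)

print(to_business_name("DNA_Fact_Sales", drop_prefixes=("DNA", "Dim", "Fact")))
```

Keeping the mapping as a dictionary makes it easy to review the proposed names with the business before any rename is applied.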


For a self-service BI scenario, I would name the Lakehouse: LH_DNA

For Fabric Org App distribution with Notebooks or view-only Notebooks, I would name the Notebook: DNA Sales Data

For report development scenarios, I will name Semantic Models and the items within the model as below:

Semantic Model: DNA Sales MI

Tables Naming

  • Product

  • Sales

  • Customer

Columns Naming

  • Product ID

  • Product Name

  • Sales ID

  • Sale Date Time

  • Customer ID

  • Customer Name


Here are some examples of how it may look in Fabric:



Closing Thoughts

Naming conventions are a small but crucial part of creating effective data solutions. They help you maintain consistency. As a user, when you see an item, whether in a workspace or in a logging table, the name should work as an affordance. Johnny has a good blog post about why naming conventions are important, along with some recommendations: https://www.advancinganalytics.co.uk/blog/2023/8/16/whats-in-a-name-naming-your-fabric-artifacts#identifying_artifact_types.


I didn't talk about each available item in Fabric, but I hope this blog post gives some ideas. Let me know what kind of naming conventions you use in your developments. Happy naming :)

2 comments


Greg Low
3 Feb

In your semantic layer, why mix singular and plural names?

Prathy
24 Feb

More of laziness for the blog post