Direct Lake on OneLake just went GA. Here's how I'd audit and migrate.
A client asked me about Direct Lake and the SQL endpoint connection, and it reminded me that Direct Lake on OneLake went generally available this month and I hadn't explored it much. I was prepping the newsletter and felt this needed a blog post rather than a section in it. It also gives me something to point at next time it comes up in a client meeting.
Two flavours of Direct Lake
There are two ways a Direct Lake semantic model can connect to its data.
Direct Lake on SQL endpoint is the original. The model goes through the lakehouse or warehouse SQL analytics endpoint to discover Delta tables and check permissions. If a query can't be served by Direct Lake, it falls back to DirectQuery via that same SQL endpoint. You can control the fallback behaviour in the semantic model settings.
Direct Lake on OneLake reads Delta tables directly from OneLake. It doesn't fall back to DirectQuery via the SQL endpoint. It also can't connect to SQL views unless they're materialised as Delta tables. Queries are processed by the VertiPaq engine, with only the columns needed for a query loaded into memory.

One more thing worth flagging here. Power BI's April release introduced user-context-aware calculated columns in preview, where calculated columns can respond to USERPRINCIPALNAME(), USERCULTURE(), and CUSTOMDATA() at query time. It works on Direct Lake on OneLake, Import, and DirectQuery. More on it in the May edition, once I've had a proper play with it.
The lag question
The client's first question was about the lag. New tables take ages to show up in the semantic model. What about that?
That delay is the SQL analytics endpoint metadata sync. When a new Delta table is written to the lakehouse, the SQL endpoint has to discover it via a background sync before it becomes visible. Microsoft says this is usually under a minute, but in practice, with heavy ETL or multiple lakehouses in one workspace, it can run to many minutes.
Direct Lake on SQL inherits this delay because it goes through the SQL endpoint. Direct Lake on OneLake doesn't; it talks to OneLake directly. Whether new tables actually show up faster in practice, I can't say yet: I don't have a Microsoft benchmark, and I haven't tested it on a workspace with large models. If you have, I'd love to hear what you saw.
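If you're staying on Direct Lake on SQL for now and the lag bites, you don't have to wait for the background sync. A minimal sketch using the SQL endpoint metadata refresh REST API, which is in preview at the time of writing, so treat the path and response handling as assumptions to verify; the endpoint id placeholder is hypothetical:

import sempy.fabric as fabric

client = fabric.FabricRestClient()

workspace_id = fabric.get_workspace_id()
sql_endpoint_id = "<sql-endpoint-item-id>"  # hypothetical: the SQL analytics endpoint's item id

# Trigger an on-demand metadata sync instead of waiting for the background one.
# Preview API at the time of writing; verify the path before relying on it.
response = client.post(
    f"/v1/workspaces/{workspace_id}/sqlEndpoints/{sql_endpoint_id}/refreshMetadata?preview=true",
    json={},
)
print(response.status_code)  # 200/202 means the sync request was accepted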
The security question
This is the bigger one. Skipping the SQL endpoint isn't just a performance thing; it changes the security model. That's my take.
Direct Lake on SQL uses the SQL analytics endpoint for permission checks. If the underlying lakehouse or warehouse has SQL endpoint OLS or RLS defined, Direct Lake on SQL falls back to DirectQuery to honour those rules. If DirectQuery fallback is disabled, those queries fail.
Direct Lake on OneLake doesn't use the SQL endpoint for permission checks. It uses OneLake Security. When OneLake Security is enabled, Direct Lake on OneLake resolves OneLake Security roles using the current user (or a fixed identity) and enforces OLS and RLS on the underlying Fabric artifact. If OneLake Security isn't enabled, Direct Lake on OneLake requires the effective identity to have Read and ReadAll permissions on the target artifact to access its Delta tables.
In practice, moving a model from DL on SQL to DL on OneLake means moving from SQL endpoint security to OneLake security. If the model relied on SQL endpoint OLS or RLS, those rules don't follow it across. You need OneLake Security configured (or RLS defined inside the semantic model itself) for the equivalent protection.
This is what you actually need to look at before migrating. Performance is easy to swap and roll back; security needs more thought.
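One practical pre-migration check: see whether a model carries its own RLS, since roles defined inside the model move with it while SQL endpoint rules don't. A minimal sketch via TOM, using the same connect_semantic_model helper the audit below imports; "Sales" is a hypothetical model name:

import sempy.fabric as fabric
from sempy_labs.tom import connect_semantic_model

workspace = fabric.get_workspace_id()

# List RLS filter expressions defined inside the semantic model itself.
# "Sales" is a hypothetical model name.
with connect_semantic_model(dataset="Sales", workspace=workspace) as tom:
    for role in tom.model.Roles:
        for tp in role.TablePermissions:
            if tp.FilterExpression:
                print(f"{role.Name}: {tp.Table.Name} -> {tp.FilterExpression}")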
When you'd move, and when you wouldn't
Moving to Direct Lake on OneLake is the right call when:
- You want multi-source models. Direct Lake on OneLake can mix tables from Lakehouse A, Lakehouse B, and Warehouse C in one model. Direct Lake on SQL is single-item only. What I've been doing is creating shortcuts from the source lakehouses into a single lakehouse, then using just that one in my semantic model (there's a sketch of that workaround after these lists). Multi-source removes that middle step, which I can see being the cleaner option for a lot of people.
- You want a clean Direct Lake path without SQL endpoint quirks (sync lag, RLS fallback to DirectQuery, view fallback to DirectQuery).
- Your security model is OneLake Security based, or RLS lives inside the semantic model itself.
You'd stay on Direct Lake on SQL when:
- You rely on SQL views in the model. Direct Lake on OneLake won't see views unless they're materialised as Delta tables (and even on DL on SQL, views fall back to DirectQuery, so this matters when DirectQuery fallback is what you actually want).
- You need DirectQuery fallback. Direct Lake on OneLake has none. If a query can't fit Direct Lake, it fails.
- You depend on SQL endpoint OLS or RLS (column-level or row-level security defined at the warehouse SQL layer) and you don't want to redefine those rules in OneLake Security or in the semantic model.
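For reference, the shortcut middle step mentioned above looks like this with semantic-link-labs. A sketch only: all the lakehouse and workspace names are hypothetical, and it's worth checking the current create_shortcut_onelake signature in the SLL docs before running it.

import sempy_labs as labs

# The single-item workaround: shortcut tables from several sources into one
# consolidation lakehouse, then build the Direct Lake model off that lakehouse.
# All names below are hypothetical.
for table in ["DimDate", "FactSales"]:
    labs.create_shortcut_onelake(
        table_name=table,
        source_lakehouse="LakehouseA",
        source_workspace="Sales Workspace",
        destination_lakehouse="ConsolidatedLakehouse",
    )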
Migrating an existing model
Two options.
Manual, in Power BI Desktop. Live-edit the model, open the TMDL view, find the connection expression at the bottom of the script, and swap it. Microsoft documented this path when announcing GA. Fine for one or two models.
Programmatic, with `semantic-link-labs`. This scales. The `update_direct_lake_model_connection()` function takes a `use_sql_endpoint` parameter. Set it to `False` and the model is re-pointed to the OneLake connection.
Behind the scenes, it generates a fresh M expression, `AzureStorage.DataLake("https://onelake.dfs.fabric.microsoft.com/{workspace_id}/{item_id}")` instead of `Sql.Database(...)`, and writes it back to the model via TOM.
From here on, it's the actual workflow: audit, validate, migrate.

Step 1: audit, find what's where
The audit lists every Direct Lake model in your workspace and tells you which ones are still on the SQL endpoint. It's safe to copy-paste and run as-is; it makes no changes.
%pip install semantic-link-labs --quiet

import sempy.fabric as fabric
import sempy_labs as labs
from sempy_labs.tom import connect_semantic_model
import pandas as pd
import time

WORKSPACE = fabric.get_workspace_id()
VERBOSE = True     # set False for quieter output on large workspaces
MAX_MODELS = None  # set to a number (e.g. 10) for first-run testing

datasets = fabric.list_datasets(workspace=WORKSPACE, mode="rest")
if MAX_MODELS:
    datasets = datasets.head(MAX_MODELS)

total = len(datasets)
cap_note = f" (capped at {MAX_MODELS})" if MAX_MODELS else ""
print(f"Auditing {total} semantic models in workspace {WORKSPACE}{cap_note}")
print("-" * 60)

audit = []
counts = {"dl_sql": 0, "dl_onelake": 0, "not_dl": 0, "errored": 0}
errors = []
start_all = time.time()

for i, (_, row) in enumerate(datasets.iterrows(), start=1):
    name = row["Dataset Name"]
    t0 = time.time()
    try:
        sources = labs.directlake.get_direct_lake_sources(
            dataset=name, workspace=WORKSPACE
        )
        elapsed = time.time() - t0
        if not sources:
            counts["not_dl"] += 1
            if VERBOSE:
                print(f"[{i}/{total}] {name}: not Direct Lake, skipped ({elapsed:.1f}s)")
            continue
        for s in sources:
            uses_sql = s["usesSqlEndpoint"]
            counts["dl_sql" if uses_sql else "dl_onelake"] += 1
            audit.append({
                "model": name,
                "source": s["itemName"],
                "sourceType": s["itemType"],
                "usesSqlEndpoint": uses_sql,
            })
        if VERBOSE:
            flavour = "SQL endpoint" if any(s["usesSqlEndpoint"] for s in sources) else "OneLake"
            print(f"[{i}/{total}] {name}: Direct Lake on {flavour} ({elapsed:.1f}s)")
    except Exception as e:
        counts["errored"] += 1
        errors.append({"model": name, "error": str(e)})
        elapsed = time.time() - t0
        if VERBOSE:
            print(f"[{i}/{total}] {name}: ⚠️ errored ({elapsed:.1f}s): {e}")

elapsed_all = time.time() - start_all
print("-" * 60)
print(f"Done in {elapsed_all:.1f}s. Summary:")
print(f"  Direct Lake on SQL endpoint : {counts['dl_sql']} sources")
print(f"  Direct Lake on OneLake      : {counts['dl_onelake']} sources")
print(f"  Not Direct Lake (skipped)   : {counts['not_dl']} models")
print(f"  Errored                     : {counts['errored']} models")

audit_df = pd.DataFrame(audit)
errors_df = pd.DataFrame(errors)
print("\nAudit:")
display(audit_df)
if not errors_df.empty:
    print("\nErrors (worth a look, usually default semantic models or models in an unusual state):")
    display(errors_df)
A few notes on the knobs:
- `VERBOSE` prints one line per model. On a large workspace you'll want it off; the summary still prints.
- `MAX_MODELS` caps the loop for first-run testing. Leave it `None` for a full pass.
- The errored bucket usually contains default semantic models (auto-created on lakehouses and warehouses). They throw on TOM property lookups. Skipping them is the right behaviour.
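If you want a paper trail, persisting the audit is one extra line. A sketch assuming a default lakehouse is attached to the notebook, so the standard /lakehouse/default/Files mount is available:

# Keep a dated copy of the audit in the attached lakehouse's Files area.
audit_df.to_csv(f"/lakehouse/default/Files/dl_audit_{time.strftime('%Y%m%d')}.csv", index=False)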
Step 2: validate, define a helper before you migrate anything
Before any migration, define this helper. Both migration blocks call it. It checks that the swap actually took effect by inspecting both the SLL discovery layer and the M expression body in the model itself.
def validate_migration(dataset, workspace, expected_sql_endpoint=False):
    """
    Confirms a Direct Lake model is on the expected flavour after a migration.
    Returns (ok: bool, details: dict).
    """
    try:
        sources = labs.directlake.get_direct_lake_sources(
            dataset=dataset, workspace=workspace
        )
    except Exception as e:
        return False, {"check": "get_direct_lake_sources", "error": str(e)}
    if not sources:
        return False, {"check": "get_direct_lake_sources", "error": "no DL sources found"}

    flag_ok = all(s["usesSqlEndpoint"] == expected_sql_endpoint for s in sources)

    expected_marker = "Sql.Database" if expected_sql_endpoint else "AzureStorage.DataLake"
    expression_ok = True
    expression_bodies = {}
    with connect_semantic_model(dataset=dataset, workspace=workspace) as tom:
        for e in tom.model.Expressions:
            expression_bodies[e.Name] = e.Expression
            if expected_marker not in e.Expression:
                expression_ok = False

    return (flag_ok and expression_ok), {
        "flagOk": flag_ok,
        "expressionOk": expression_ok,
        "sources": sources,
        "expressions": expression_bodies,
    }
Two checks. `flagOk` confirms the SLL discovery layer reports the expected flag (`usesSqlEndpoint=False` once you've moved to OneLake). `expressionOk` confirms the actual M expression body contains `AzureStorage.DataLake`. Both pass = the swap is real.
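The helper doubles as a pre-migration baseline: run it with `expected_sql_endpoint=True` to confirm a model really is still on the SQL endpoint flavour before you touch it. "ReleasePlan" is the same hypothetical model name step 3 uses:

# Baseline: before migrating, the model should validate as Direct Lake on SQL.
ok, details = validate_migration("ReleasePlan", WORKSPACE, expected_sql_endpoint=True)
print("still on SQL endpoint" if ok else f"unexpected state: {details}")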
Step 3: migrate, one model first, always
Pick a single model from `audit_df` (one with `usesSqlEndpoint=True`). Do this first, validate it, look at the bound report, and only then think about going wider.
# ⚠️ READ FIRST.
# - Pick a single model from audit_df above (usesSqlEndpoint=True)
# - Run labs.directlake.show_unsupported_direct_lake_objects(MODEL, WORKSPACE) first
# - Refresh and spot-check post-migration before doing more
#
# MODEL_TO_MIGRATE = "ReleasePlan"  # ← set this
#
# row = audit_df[audit_df["model"] == MODEL_TO_MIGRATE].iloc[0]
# labs.directlake.update_direct_lake_model_connection(
#     dataset=row["model"],
#     workspace=WORKSPACE,
#     source=row["source"],
#     source_type=row["sourceType"],
#     use_sql_endpoint=False,
# )
#
# ok, details = validate_migration(MODEL_TO_MIGRATE, WORKSPACE, expected_sql_endpoint=False)
# if ok:
#     print(f"✅ {MODEL_TO_MIGRATE} validated on Direct Lake on OneLake")
# else:
#     print(f"⚠️ {MODEL_TO_MIGRATE} validation failed:")
#     print(f"   flagOk={details.get('flagOk')}, expressionOk={details.get('expressionOk')}")
#     print(f"   {details}")
I've kept the migration commented out deliberately. Anyone copy-pasting gets the audit by default. The migration is an active choice.
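The rollback is the same call with `use_sql_endpoint=True`, plus the mirrored validation. Kept commented out for the same reason as the migration itself:

# row = audit_df[audit_df["model"] == MODEL_TO_MIGRATE].iloc[0]
# labs.directlake.update_direct_lake_model_connection(
#     dataset=row["model"],
#     workspace=WORKSPACE,
#     source=row["source"],
#     source_type=row["sourceType"],
#     use_sql_endpoint=True,  # back to Direct Lake on SQL
# )
# ok, details = validate_migration(MODEL_TO_MIGRATE, WORKSPACE, expected_sql_endpoint=True)
# print("✅ rolled back to SQL endpoint" if ok else f"⚠️ rollback validation failed: {details}")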
Step 4: migrate in bulk, only after a successful single
Once you've migrated one or two models manually, validated them, and confirmed bound reports still work, the bulk loop is a reasonable next step. Same safety pattern: commented out, with a per-model validation built in.
# ⚠️ READ FIRST. THIS IS A BULK CHANGE.
# - Confirm every model in to_migrate is OK to flip
# - Run a single-model migration first to validate the pattern in your env
# - Multi-expression models will print a message and skip themselves
# - Cross-workspace sources need source_workspace passed explicitly
#
# to_migrate = audit_df[audit_df["usesSqlEndpoint"] == True]
# print(f"About to migrate {len(to_migrate)} sources across "
#       f"{to_migrate['model'].nunique()} models.")
#
# results = {"migrated_ok": 0, "migrated_validate_failed": 0, "migration_failed": 0}
# fail_log = []
# validation_log = []
#
# for i, (_, row) in enumerate(to_migrate.iterrows(), start=1):
#     model = row["model"]
#     try:
#         labs.directlake.update_direct_lake_model_connection(
#             dataset=model,
#             workspace=WORKSPACE,
#             source=row["source"],
#             source_type=row["sourceType"],
#             use_sql_endpoint=False,
#         )
#         ok, details = validate_migration(model, WORKSPACE, expected_sql_endpoint=False)
#         if ok:
#             results["migrated_ok"] += 1
#             print(f"[{i}/{len(to_migrate)}] ✅ {model} migrated and validated")
#         else:
#             results["migrated_validate_failed"] += 1
#             validation_log.append({
#                 "model": model,
#                 "flagOk": details.get("flagOk"),
#                 "expressionOk": details.get("expressionOk"),
#             })
#             print(f"[{i}/{len(to_migrate)}] ⚠️ {model} migrated but validation failed")
#     except Exception as e:
#         results["migration_failed"] += 1
#         fail_log.append({"model": model, "source": row["source"], "error": str(e)})
#         print(f"[{i}/{len(to_migrate)}] ⚠️ {model} migration error: {e}")
#
# print("\nDone.")
# print(f"  Migrated and validated   : {results['migrated_ok']}")
# print(f"  Migrated, validate failed: {results['migrated_validate_failed']}")
# print(f"  Migration errored        : {results['migration_failed']}")
#
# if fail_log:
#     print("\nMigration failures:")
#     display(pd.DataFrame(fail_log))
# if validation_log:
#     print("\nValidation failures (the swap ran but didn't take):")
#     display(pd.DataFrame(validation_log))
A few things to note
Expression name stays the same. After migration, the expression keeps its old name (`DatabaseQuery` is conventional for the SQL endpoint expression), but the body inside it has been rewritten to the OneLake connector. Cosmetic only, but the leftover name can confuse you when you open the model in TMDL view.
Multi-expression models need the `tables` parameter. A model with more than one source expression (multiple lakehouses or warehouses) won't migrate with a single function call. The function returns a message asking you to pass the table names explicitly.
Cross-workspace sources. `get_direct_lake_sources()` returns a `workspaceName` per source. A model in Workspace A can have sources in Workspace B. The migration call needs `source_workspace` set explicitly in that case.
Validate measures and relationships post-migration. The SLL source has a fix specifically for the reverse direction (OneLake to SQL) where measures and relationships can otherwise disappear. The forward direction looks safer, but I haven't tested that path properly. Refresh the model and run a few representative DAX queries before you trust it.
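A minimal smoke test for that last step, using sempy's refresh and DAX evaluation calls. `[Total Sales]` is a hypothetical measure, and the refresh request is asynchronous, so give it time to finish before comparing numbers against a known-good value:

import sempy.fabric as fabric

# Refresh the migrated model, then run one representative query.
# refresh_dataset returns once the request is accepted, not when it finishes.
fabric.refresh_dataset(dataset=MODEL_TO_MIGRATE, workspace=WORKSPACE)

result = fabric.evaluate_dax(
    dataset=MODEL_TO_MIGRATE,
    dax_string='EVALUATE ROW("Total Sales", [Total Sales])',  # hypothetical measure
    workspace=WORKSPACE,
)
display(result)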
Pre-flight helpers. SLL exports `show_unsupported_direct_lake_objects()` and `list_direct_lake_model_calc_tables()`. Run both per model before migrating to flag what won't survive the move.
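Both slot naturally in front of the bulk cell. A sketch over the audit result; exactly what each helper returns varies by SLL version, so eyeball the output rather than scripting against it:

# Pre-flight every remaining SQL endpoint model before the bulk migration.
for model in audit_df.loc[audit_df["usesSqlEndpoint"], "model"].unique():
    print(f"--- {model} ---")
    display(labs.directlake.show_unsupported_direct_lake_objects(model, WORKSPACE))
    display(labs.directlake.list_direct_lake_model_calc_tables(model, WORKSPACE))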
Where I've landed
After that client conversation, I'd start new semantic models on Direct Lake on OneLake. I'd only step back to Direct Lake on SQL when the security model or SQL views force me to.
For existing models, audit before you decide; don't bulk flip. The security change is what decides whether you can flip a given model at all.
If you run the audit on one of your workspaces and find something that surprises you, drop me a note. I'll add the pattern to a future post.
Until next time,
Prathy :)


