Resource Types

Supported Types

Data Engineering

| Resource Type | Key | API Type | Definitions |
| --- | --- | --- | --- |
| Lakehouse | lakehouses | Lakehouse | Shortcuts, tables, schemas |
| Notebook | notebooks | Notebook | .py, .ipynb, .sql, .scala, .r |
| Environment | environments | SparkEnvironment | Runtime, libraries, Spark config |
| Spark Job Definition | spark_job_definitions | SparkJobDefinition | .py, .jar |
| GraphQL API | graphql_apis | GraphQLApi | Schema file |
| Snowflake Database | snowflake_databases | SnowflakeDatabase | Connection-based |

Data Factory

| Resource Type | Key | API Type | Definitions |
| --- | --- | --- | --- |
| Data Pipeline | pipelines | DataPipeline | YAML activities or JSON |
| Copy Job | copy_jobs | CopyJob | JSON definition |
| Mounted Data Factory | mounted_data_factories | MountedDataFactory | Metadata |
| Apache Airflow Job | airflow_jobs | ApacheAirflowJob | DAG file |
| dbt Job | dbt_jobs | DataBuildToolJob | dbt project |

Data Warehouse

| Resource Type | Key | API Type | Definitions |
| --- | --- | --- | --- |
| Warehouse | warehouses | Warehouse | SQL scripts |
| SQL Database | sql_databases | SQLDatabase | SQL scripts |
| Mirrored Database | mirrored_databases | MirroredDatabase | Connection-based |
| Mirrored Warehouse | mirrored_warehouses | MirroredWarehouse | List-only (cannot be created via API) |
| Mirrored Databricks Catalog | mirrored_databricks_catalogs | MirroredAzureDatabricksCatalog | Connection-based |
| Cosmos DB Database | cosmosdb_databases | CosmosDBDatabase | Connection-based |
| Datamart | datamarts | Datamart | List-only (cannot be created via API) |

Power BI

| Resource Type | Key | API Type | Definitions |
| --- | --- | --- | --- |
| Semantic Model | semantic_models | SemanticModel | TMDL or TMSL |
| Report | reports | Report | PBIR format |
| Paginated Report | paginated_reports | PaginatedReport | List-only (cannot be created via API) |
| Dashboard | dashboards | Dashboard | List-only (cannot be created via API) |
| Dataflow | dataflows | Dataflow | Not supported by Fabric API |

Data Science

| Resource Type | Key | API Type | Definitions |
| --- | --- | --- | --- |
| ML Model | ml_models | MLModel | MLflow model |
| ML Experiment | ml_experiments | MLExperiment | Metadata |

Real-Time Intelligence

| Resource Type | Key | API Type | Definitions |
| --- | --- | --- | --- |
| Eventhouse | eventhouses | Eventhouse | KQL scripts |
| Eventstream | eventstreams | Eventstream | JSON definition |
| KQL Database | kql_databases | KQLDatabase | KQL scripts |
| KQL Dashboard | kql_dashboards | KQLDashboard | Definition file |
| KQL Queryset | kql_querysets | KQLQueryset | Definition file |
| Reflex (Data Activator) | reflex | Reflex | JSON definition |
| Digital Twin Builder | digital_twin_builders | DigitalTwinBuilder | Definition file |
| Digital Twin Builder Flow | digital_twin_builder_flows | DigitalTwinBuilderFlow | Definition file |
| Event Schema Set | event_schema_sets | EventSchemaSet | Definition file |
| Graph Query Set | graph_query_sets | GraphQuerySet | Definition file |

AI & Knowledge

| Resource Type | Key | API Type | Definitions |
| --- | --- | --- | --- |
| Data Agent | data_agents | DataAgent | Instructions + examples |
| Operations Agent | operations_agents | OperationsAgent | Instructions |
| Anomaly Detector | anomaly_detectors | AnomalyDetector | Configuration |
| Ontology | ontologies | Ontology | Definition file |

Other

| Resource Type | Key | API Type | Definitions |
| --- | --- | --- | --- |
| Variable Library | variable_libraries | VariableLibrary | Key-value pairs |
| User Data Function | user_data_functions | UserDataFunction | Function definition |
| Graph | graphs | Graph | Definition file |
| Graph Model | graph_models | GraphModel | Definition file |
| Map | map_items | Map | Definition file |
| HLS Cohort | hls_cohorts | HLSCohort | Definition file |

OneLake Shortcuts

Shortcuts are not a separate item type — they are sub-resources of Lakehouses:

lakehouses:
  bronze_lakehouse:
    shortcuts:
      - name: external_data
        target: "adls://storageaccount/container/path"
        path: Tables
        connection_id: "optional-connection-guid"
      - name: s3_data
        target: "s3://bucket-name/prefix"
      - name: cross_workspace
        target: "onelake://workspace-id/item-id/Tables/my_table"

Supported shortcut targets:

  • adls:// — Azure Data Lake Storage Gen2
  • s3:// — Amazon S3
  • onelake:// — Cross-workspace OneLake reference

Shortcut Transformations

Transformations automatically convert source files into managed Delta tables that stay in sync with the source, with no pipelines required.

File transformations convert CSV, Parquet, JSON, or Excel files into Delta tables:

lakehouses:
  bronze_lakehouse:
    shortcuts:
      - name: csv_sales_data
        target: "adls://datalake/sales/*.csv"
        path: Files
        transformation:
          type: file
          source_format: csv
          destination_table: raw_sales
          sync: true
          flatten: false

      - name: nested_json_events
        target: "adls://datalake/events/*.json"
        path: Files
        transformation:
          type: file
          source_format: json
          destination_table: raw_events
          flatten: true
          compression: gzip

      - name: excel_reports
        target: "adls://datalake/finance/*.xlsx"
        path: Files
        transformation:
          type: file
          source_format: excel
          destination_table: finance_reports

AI-powered transformations apply summarization, translation, or classification:

lakehouses:
  documents_lakehouse:
    shortcuts:
      - name: support_tickets
        target: "adls://datalake/tickets/*.json"
        path: Files
        transformation:
          type: ai
          ai_skill: summarize
          destination_table: ticket_summaries

      - name: multilingual_docs
        target: "adls://datalake/docs/*.json"
        path: Files
        transformation:
          type: ai
          ai_skill: translate
          ai_prompt: "Translate to English"
          destination_table: docs_english

      - name: email_classification
        target: "adls://datalake/emails/*.json"
        path: Files
        transformation:
          type: ai
          ai_skill: classify
          ai_prompt: "Classify as: complaint, inquiry, feedback, spam"
          destination_table: classified_emails

YAML Reference — All Resource Types

Data Engineering

Lakehouse

lakehouses:
  bronze_lakehouse:
    description: "Raw data landing zone"
    enable_schemas: true
    tables:
      raw_orders:
        schema_path: ./schemas/orders.json
        partition_by: [order_date]
    shortcuts:
      - name: external_data
        target: "adls://account/container/path"
        path: Tables
        connection_id: "optional-guid"
        transformation:
          type: file
          source_format: csv
          destination_table: raw_external

Notebook

notebooks:
  etl_pipeline:
    path: ./notebooks/etl.py
    description: "ETL pipeline"
    environment: spark_env
    default_lakehouse: bronze_lakehouse
    parameters:
      batch_size: 1000
      source_table: orders
    folder: ETL/Bronze

Environment

environments:
  spark_env:
    runtime: "1.3"
    libraries:
      - semantic-link-labs
      - delta-spark
    conda_dependencies:
      - numpy=1.24
    spark_properties:
      spark.sql.shuffle.partitions: "200"

Spark Job Definition

spark_job_definitions:
  distributed_training:
    path: ./spark_jobs/train.py
    description: "Distributed model training"
    environment: spark_env
    default_lakehouse: feature_store
    args: ["--epochs", "10", "--batch-size", "256"]
    conf:
      spark.executor.memory: "8g"
      spark.executor.cores: "4"

GraphQL API

graphql_apis:
  product_api:
    description: "GraphQL API over product data"
    path: ./graphql/schema.graphql
    data_source: gold_lakehouse

Snowflake Database

snowflake_databases:
  snowflake_mirror:
    description: "Mirrored Snowflake data"
    connection: snowflake_conn

Data Factory

Data Pipeline

pipelines:
  daily_refresh:
    description: "Daily ETL pipeline"
    schedule:
      cron: "0 6 * * *"
      timezone: America/Chicago
      enabled: true
    activities:
      - name: ingest
        notebook: ingest_notebook
      - name: transform
        notebook: transform_notebook
        depends_on: [ingest]
      - name: load
        notebook: load_notebook
        depends_on: [transform]

Copy Job

copy_jobs:
  copy_sales_data:
    description: "Copy sales data from Azure SQL"
    path: ./copy_jobs/sales_copy.json

Mounted Data Factory

mounted_data_factories:
  legacy_adf:
    description: "Mounted Azure Data Factory for legacy pipelines"
    data_factory_id: "/subscriptions/.../resourceGroups/.../providers/Microsoft.DataFactory/factories/my-adf"

Apache Airflow Job

airflow_jobs:
  airflow_etl:
    description: "Airflow DAG for complex orchestration"
    path: ./dags/etl_dag.py

dbt Job

dbt_jobs:
  dbt_transform:
    description: "dbt transformation project"
    path: ./dbt_project/
    environment: spark_env

Data Warehouse

Warehouse

warehouses:
  analytics_warehouse:
    description: "SQL analytics warehouse"
    sql_scripts:
      - ./sql/create_views.sql
      - ./sql/create_procedures.sql

SQL Database

sql_databases:
  operational_db:
    description: "Operational SQL database"
    sql_scripts:
      - ./sql/schema.sql
      - ./sql/seed_data.sql

Mirrored Database

mirrored_databases:
  azure_sql_mirror:
    description: "Mirrored Azure SQL database"
    source_type: "Azure SQL"
    connection: azure_sql_conn

Mirrored Warehouse

mirrored_warehouses:
  synapse_mirror:
    description: "Mirrored Synapse warehouse"
    source_type: "Synapse"

Mirrored Databricks Catalog

mirrored_databricks_catalogs:
  databricks_catalog:
    description: "Mirrored Databricks Unity Catalog"
    connection: databricks_conn

Cosmos DB Database

cosmosdb_databases:
  cosmos_mirror:
    description: "Mirrored Cosmos DB data"
    connection: cosmos_conn

Datamart

datamarts:
  sales_datamart:
    description: "Self-service sales datamart"
    path: ./datamarts/sales_definition.json

Power BI

Semantic Model

semantic_models:
  analytics_model:
    path: ./semantic_model/
    description: "Semantic model over gold lakehouse"
    default_lakehouse: gold_lakehouse
    auto_refresh: true
    refresh_timeout: 600
    folder: Models

Report

reports:
  executive_dashboard:
    path: ./reports/dashboard/
    description: "Executive dashboard (PBIR format)"
    semantic_model: analytics_model
    folder: Reports

Paginated Report

paginated_reports:
  monthly_invoice:
    description: "Monthly invoice report (RDL)"
    path: ./reports/invoice.rdl
    data_source: analytics_warehouse

Dashboard

dashboards:
  overview_dashboard:
    description: "High-level KPI dashboard"

Dataflow

dataflows:
  customer_transform:
    description: "Dataflow Gen2 for customer data"
    path: ./dataflows/customer_transform.json

Data Science

ML Model

ml_models:
  churn_model:
    path: ./models/churn_model/
    description: "Customer churn prediction model"
    framework: xgboost

ML Experiment

ml_experiments:
  churn_experiment:
    description: "Churn prediction experiment tracking"

Real-Time Intelligence

Eventhouse

eventhouses:
  telemetry_eventhouse:
    description: "IoT telemetry eventhouse"
    kql_scripts:
      - ./kql/create_tables.kql
      - ./kql/create_functions.kql
    retention_days: 365
    cache_days: 31

Eventstream

eventstreams:
  device_events:
    description: "Real-time device event stream"
    path: ./eventstreams/device_config.json
    sources:
      - type: event_hub
        name: iot-hub-events
    destinations:
      - type: eventhouse
        name: telemetry_eventhouse

KQL Database

kql_databases:
  telemetry_db:
    description: "KQL database for device telemetry"
    parent_eventhouse: telemetry_eventhouse
    kql_scripts:
      - ./kql/create_tables.kql

KQL Dashboard

kql_dashboards:
  ops_dashboard:
    description: "Real-time operations dashboard"
    path: ./dashboards/ops_dashboard.json
    data_source: telemetry_db

KQL Queryset

kql_querysets:
  telemetry_queries:
    description: "Pre-built KQL queries for analysis"
    path: ./kql/querysets/
    data_source: telemetry_db

Reflex (Data Activator)

reflex:
  anomaly_alerts:
    description: "Trigger alerts on anomalous readings"
    path: ./reflex/anomaly_rules.json

Digital Twin Builder

digital_twin_builders:
  factory_twin:
    description: "Digital twin of factory floor"
    path: ./twins/factory_definition.json

Digital Twin Builder Flow

digital_twin_builder_flows:
  factory_flow:
    description: "Data flow for factory twin"
    path: ./twins/factory_flow.json
    twin_builder: factory_twin

Event Schema Set

event_schema_sets:
  device_schemas:
    description: "Schema definitions for IoT events"
    path: ./schemas/device_events.json

Graph Query Set

graph_query_sets:
  network_queries:
    description: "Graph queries for network analysis"
    path: ./graph/queries/
    data_source: telemetry_db

AI & Knowledge

Data Agent

data_agents:
  analytics_agent:
    description: "Natural language interface to your data"
    sources:
      - gold_lakehouse
      - analytics_warehouse
    instructions: ./agent/instructions.md
    few_shot_examples: ./agent/examples.yaml
    tables_in_scope:
      - daily_order_summary
      - customer_360

Operations Agent

operations_agents:
  ops_agent:
    description: "Operations monitoring agent"
    sources:
      - telemetry_eventhouse
    instructions: ./agent/ops_instructions.md

Anomaly Detector

anomaly_detectors:
  revenue_detector:
    description: "Detect revenue anomalies"
    data_source: gold_lakehouse
    path: ./detectors/revenue_config.json

Ontology

ontologies:
  business_ontology:
    description: "Business domain knowledge graph"
    path: ./ontology/definition.json
    data_sources:
      - gold_lakehouse
      - analytics_warehouse

Other

Variable Library

variable_libraries:
  shared_config:
    description: "Shared configuration variables"
    variables:
      environment: production
      region: us-east
      log_level: info
      max_retries: "3"

User Data Function

user_data_functions:
  custom_transform:
    description: "Custom data transformation function"
    path: ./functions/transform.py
    runtime: python

Graph

graphs:
  knowledge_graph:
    description: "Product knowledge graph"
    path: ./graph/definition.json
    data_source: gold_lakehouse

Graph Model

graph_models:
  supply_chain_model:
    description: "Supply chain graph model"
    path: ./graph/supply_chain.json
    data_source: analytics_warehouse

Map

map_items:
  geo_mapping:
    description: "Geographic data mapping"
    path: ./maps/geo_config.json

HLS Cohort

hls_cohorts:
  patient_cohort:
    description: "Patient cohort for clinical analytics"
    path: ./cohorts/patient_definition.json