Microsoft Fabric Synapse Data Warehouse setup
This guide covers Synapse Data Warehouse, a new product within Microsoft Fabric (preview).
To learn how to set up dbt with Azure Synapse Dedicated Pools, see Microsoft Azure Synapse DWH setup.
Overview of dbt-fabric
- Maintained by: Microsoft
- Authors: [Microsoft](https://github.com/Microsoft)
- GitHub repo: Microsoft/dbt-fabric
- PyPI package: dbt-fabric
- Slack channel:
- Supported dbt Core version: 1.4.0 and newer
- dbt Cloud support: Not Supported
Installing dbt-fabric
pip is the easiest way to install the adapter:
pip install dbt-fabric
Installing dbt-fabric will also install dbt-core and any other dependencies.
Configuring dbt-fabric
For Microsoft Fabric-specific configuration, please refer to Microsoft Fabric Configuration.
For further info, refer to the GitHub repository: Microsoft/dbt-fabric
Prerequisites
On Debian/Ubuntu, make sure you have the ODBC header files before installing:
sudo apt install unixodbc-dev
Download and install the Microsoft ODBC Driver 18 for SQL Server. If you already have ODBC Driver 17 installed, then that one will work as well.
Supported configurations
- The adapter is tested with Microsoft Fabric Synapse Data Warehouse.
- We test all combinations with Microsoft ODBC Driver 17 and Microsoft ODBC Driver 18.
- The collation we run our tests on is Latin1_General_100_BIN2_UTF8.
The adapter support is not limited to the matrix of the above configurations. If you notice an issue with any other configuration, let us know by opening an issue on GitHub.
Authentication methods & profile configuration
Common configuration
For all the authentication methods, refer to the following configuration options that can be set in your profiles.yml file.
A complete reference of all options can be found at the end of this page.
Configuration option | Description | Required or optional | Example |
---|---|---|---|
driver | The ODBC driver to use | Required | ODBC Driver 18 for SQL Server |
server | The server hostname | Required | localhost |
port | The server port | Required | 1433 |
database | The database name | Required | Not applicable |
schema | The schema name | Required | dbo |
retries | The number of times to automatically retry a query before failing. Defaults to 1. Queries with syntax errors will not be retried. This setting can be used to overcome intermittent network issues. | Optional | Not applicable |
login_timeout | The number of seconds to wait while establishing a connection before failing. Defaults to 0, which means that the timeout is disabled or uses the default system settings. | Optional | Not applicable |
query_timeout | The number of seconds to wait for a query before failing. Defaults to 0, which means that the timeout is disabled or uses the default system settings. | Optional | Not applicable |
schema_authorization | Optionally set this to the principal who should own the schemas created by dbt. Read more about schema authorization. | Optional | Not applicable |
encrypt | Whether to encrypt the connection to the server. Defaults to true. Read more about connection encryption. | Optional | Not applicable |
trust_cert | Whether to trust the server certificate. Defaults to false. Read more about connection encryption. | Optional | Not applicable |
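As a rough illustration of how the optional settings in the table sit alongside the required ones, a profile might look like the sketch below. The values for retries, login_timeout, and query_timeout are arbitrary examples rather than recommendations, the server and database values are placeholders, and authentication: CLI is just one of the methods described later on this page.

```yaml
your_profile_name:
  target: dev
  outputs:
    dev:
      type: fabric
      driver: 'ODBC Driver 18 for SQL Server'
      server: hostname or IP of your server # placeholder
      port: 1433
      database: exampledb # placeholder
      schema: dbo
      authentication: CLI # any supported method works here
      retries: 3 # example value; the default is 1
      login_timeout: 30 # example value, in seconds; the default is 0
      query_timeout: 600 # example value, in seconds; the default is 0
```

Leaving the three optional lines out keeps the defaults listed in the table.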
Connection encryption
Microsoft made several changes in the release of ODBC Driver 18 that affect how connection encryption is configured.
To accommodate these changes, the default values of encrypt and trust_cert changed starting in dbt-sqlserver 1.2.0.
Both settings are now always included in the connection string to the server, regardless of whether you've left them out of your profile configuration.
- The default value of encrypt is true, meaning that connections are encrypted by default.
- The default value of trust_cert is false, meaning that the server certificate will be validated. By setting this to true, a self-signed certificate will be accepted.
More details about how these values affect your connection and how they are used differently in versions of the ODBC driver can be found in the Microsoft documentation.
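For illustration, a profile that states both encryption settings explicitly, using their default values, could look like the following sketch. The server and database values are placeholders, and the authentication method shown is only an example.

```yaml
your_profile_name:
  target: dev
  outputs:
    dev:
      type: fabric
      driver: 'ODBC Driver 18 for SQL Server'
      server: hostname or IP of your server # placeholder
      port: 1433
      database: exampledb # placeholder
      schema: schema_name
      authentication: CLI # example only; any supported method works
      encrypt: true # the default; connections are encrypted
      trust_cert: false # the default; the server certificate is validated
```

Changing trust_cert to true in this sketch would make the connection accept a self-signed certificate.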
Standard SQL Server authentication
SQL Server and Windows authentication are not supported by Microsoft Fabric Synapse Data Warehouse.
Azure Active Directory Authentication (AAD)
Azure Active Directory authentication is the default authentication mechanism in Microsoft Fabric Synapse Data Warehouse.
The following additional methods are available to authenticate to Azure SQL products:
- AAD username and password
- Service principal (a.k.a. AAD Application)
- Environment-based authentication
- Azure CLI authentication
- VS Code authentication (available through the automatic option below)
- Azure PowerShell module authentication (available through the automatic option below)
- Automatic authentication
The automatic authentication setting is in most cases the easiest choice and works for all of the above.
AAD username and password
your_profile_name:
  target: dev
  outputs:
    dev:
      type: fabric
      driver: 'ODBC Driver 18 for SQL Server' # (The ODBC Driver installed on your system)
      server: hostname or IP of your server
      port: 1433
      database: exampledb
      schema: schema_name
      authentication: ActiveDirectoryPassword
      user: bill.gates@microsoft.com
      password: iheartopensource
Service principal
Client ID is often also referred to as Application ID.
your_profile_name:
  target: dev
  outputs:
    dev:
      type: fabric
      driver: 'ODBC Driver 18 for SQL Server' # (The ODBC Driver installed on your system)
      server: hostname or IP of your server
      port: 1433
      database: exampledb
      schema: schema_name
      authentication: ServicePrincipal
      tenant_id: 00000000-0000-0000-0000-000000001234
      client_id: 00000000-0000-0000-0000-000000001234
      client_secret: S3cret!
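To keep the client secret out of the profile file, one option is dbt's env_var function, which reads a value from an environment variable when the profile is parsed. The sketch below assumes three environment variables whose names (FABRIC_TENANT_ID, FABRIC_CLIENT_ID, FABRIC_CLIENT_SECRET) are hypothetical; use whatever names suit your environment.

```yaml
your_profile_name:
  target: dev
  outputs:
    dev:
      type: fabric
      driver: 'ODBC Driver 18 for SQL Server'
      server: hostname or IP of your server # placeholder
      port: 1433
      database: exampledb # placeholder
      schema: schema_name
      authentication: ServicePrincipal
      # The variable names below are hypothetical examples.
      tenant_id: "{{ env_var('FABRIC_TENANT_ID') }}"
      client_id: "{{ env_var('FABRIC_CLIENT_ID') }}"
      client_secret: "{{ env_var('FABRIC_CLIENT_SECRET') }}"
```

With this layout, the variables need to be set in the shell or CI environment that runs dbt; dbt raises an error if a referenced variable is missing and no default is given.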
Environment-based authentication
This authentication option allows you to dynamically select an authentication method depending on the available environment variables.
The Microsoft docs on EnvironmentCredential explain the available combinations of environment variables you can use.
your_profile_name:
  target: dev
  outputs:
    dev:
      type: fabric
      driver: 'ODBC Driver 18 for SQL Server' # (The ODBC Driver installed on your system)
      server: hostname or IP of your server
      port: 1433
      database: exampledb
      schema: schema_name
      authentication: environment
Azure CLI authentication
First, install the Azure CLI, then log in:
az login
your_profile_name:
  target: dev
  outputs:
    dev:
      type: fabric
      driver: 'ODBC Driver 18 for SQL Server' # (The ODBC Driver installed on your system)
      server: hostname or IP of your server
      port: 1433
      database: exampledb
      schema: schema_name
      authentication: CLI
Automatic authentication
This authentication option will automatically try to use all available authentication methods.
The following methods are tried in order:
- Environment-based authentication
- Managed Identity authentication. Managed Identity is not supported at this time.
- Visual Studio authentication (Windows only, ignored on other operating systems)
- Visual Studio Code authentication
- Azure CLI authentication
- Azure PowerShell module authentication
your_profile_name:
  target: dev
  outputs:
    dev:
      type: fabric
      driver: 'ODBC Driver 18 for SQL Server' # (The ODBC Driver installed on your system)
      server: hostname or IP of your server
      port: 1433
      database: exampledb
      schema: schema_name
      authentication: auto
Additional options for AAD on Windows
On Windows systems, the following additional authentication methods are also available for Azure SQL:
- AAD interactive
- AAD integrated
- Visual Studio authentication (available through the automatic option above)
AAD interactive
This setting can optionally show Multi-Factor Authentication prompts.
your_profile_name:
  target: dev
  outputs:
    dev:
      type: fabric
      driver: 'ODBC Driver 18 for SQL Server' # (The ODBC Driver installed on your system)
      server: hostname or IP of your server
      port: 1433
      database: exampledb
      schema: schema_name
      authentication: ActiveDirectoryInteractive
      user: bill.gates@microsoft.com
user: bill.gates@microsoft.com
AAD integrated
This uses the credentials you're logged in with on the current machine.
your_profile_name:
  target: dev
  outputs:
    dev:
      type: fabric
      driver: 'ODBC Driver 18 for SQL Server' # (The ODBC Driver installed on your system)
      server: hostname or IP of your server
      port: 1433
      database: exampledb
      schema: schema_name
      authentication: ActiveDirectoryIntegrated
Automatic AAD principal provisioning for grants
Automatic AAD principal provisioning is not supported by Microsoft Fabric Synapse Data Warehouse at this time. In dbt 1.2 or newer, you can use the grants config block to automatically grant or revoke permissions on your models to users or groups, but the data warehouse does not yet support this feature.
You need to add the service principal or AAD identity to a Fabric Workspace as an admin.
Schema authorization
You can optionally set the principal who should own all schemas created by dbt. This is then used in the CREATE SCHEMA statement like so:
CREATE SCHEMA [schema_name] AUTHORIZATION [schema_authorization]
A common use case is authenticating with a principal whose permissions come from a group, such as an AAD group. When that principal creates a schema, the server first tries to create an individual login for the principal and then links the schema to it. With Azure AD, this fails because Azure SQL can't automatically create logins for individuals who are part of an AD group.
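As a minimal sketch of how this setting might appear in a profile, the example below adds schema_authorization to an otherwise ordinary target. The principal name dbt_schema_owner is hypothetical, as are the server and database placeholders.

```yaml
your_profile_name:
  target: dev
  outputs:
    dev:
      type: fabric
      driver: 'ODBC Driver 18 for SQL Server'
      server: hostname or IP of your server # placeholder
      port: 1433
      database: exampledb # placeholder
      schema: schema_name
      authentication: CLI # example only; any supported method works
      schema_authorization: dbt_schema_owner # hypothetical principal name
```

Following the statement template above, this configuration would be expected to produce CREATE SCHEMA [schema_name] AUTHORIZATION [dbt_schema_owner] when dbt creates the schema.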
Reference of all connection options
Configuration option | Description | Required | Default value |
---|---|---|---|
driver | The ODBC driver to use. | ✅ | |
host | The hostname of the database server. | ✅ | |
port | The port of the database server. | | 1433 |
database | The name of the database to connect to. | ✅ | |
schema | The schema to use. | ✅ | |
authentication | The authentication method to use. This is not required for Windows authentication. | | 'sql' |
UID | Username used to authenticate. This can be left out depending on the authentication method. | | |
PWD | Password used to authenticate. This can be left out depending on the authentication method. | | |
tenant_id | The tenant ID of the Azure Active Directory instance. This is only used when connecting to Azure SQL with a service principal. | | |
client_id | The client ID of the Azure Active Directory service principal. This is only used when connecting to Azure SQL with an AAD service principal. | | |
client_secret | The client secret of the Azure Active Directory service principal. This is only used when connecting to Azure SQL with an AAD service principal. | | |
encrypt | Set this to false to disable the use of encryption. See above. | | true |
trust_cert | Set this to true to trust the server certificate. See above. | | false |
retries | The number of times to retry a failed connection. | | 1 |
schema_authorization | Optionally set this to the principal who should own the schemas created by dbt. Details above. | | |
login_timeout | The number of seconds to wait for a response from the server when establishing a connection. 0 means that the timeout is disabled. | | 0 |
query_timeout | The number of seconds to wait for a response from the server when executing a query. 0 means that the timeout is disabled. | | 0 |
Valid values for authentication:
- ActiveDirectoryPassword: Active Directory authentication using username and password
- ActiveDirectoryInteractive: Active Directory authentication using a username and MFA prompts
- ActiveDirectoryIntegrated: Active Directory authentication using the current user's credentials
- ServicePrincipal: Azure Active Directory authentication using a service principal
- CLI: Azure Active Directory authentication using the account you're logged in with in the Azure CLI
- environment: Azure Active Directory authentication using environment variables, as documented in the Microsoft docs on EnvironmentCredential
- auto: Azure Active Directory authentication trying the previous authentication methods until it finds one that works