dbt Core
An overview of the dbt Core integration with Secoda
Getting Started with dbt Core
dbt is a secondary integration that adds metadata to your data warehouse or relational database tables. Before connecting dbt, connect a data warehouse or relational database first, such as Snowflake, BigQuery, Postgres, or Redshift.
The metadata extracted from dbt Core is: Descriptions, Lineage, SQL Query, and Tags (optional).
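To see what this metadata looks like before connecting, the sketch below reads a manifest.json and lists the description, upstream dependencies, SQL, and tags of each model. It only illustrates where these fields live in a dbt manifest; how Secoda itself reads them is not specified here, and the function name summarize_manifest is hypothetical.

```python
from __future__ import annotations

import json
from pathlib import Path


def summarize_manifest(manifest: dict) -> list[dict]:
    """Collect description, lineage inputs, SQL, and tags for each model node."""
    rows = []
    for unique_id, node in manifest.get("nodes", {}).items():
        if node.get("resource_type") != "model":
            continue  # skip tests, seeds, snapshots, etc.
        rows.append({
            "id": unique_id,
            "description": node.get("description", ""),
            # Upstream nodes are the raw material for lineage.
            "upstream": node.get("depends_on", {}).get("nodes", []),
            # dbt 1.3+ stores the model SQL as "raw_code"; older versions use "raw_sql".
            "sql": node.get("raw_code") or node.get("raw_sql", ""),
            "tags": node.get("tags", []),
        })
    return rows


def load_manifest(path: str = "target/manifest.json") -> dict:
    """Load a manifest from disk; dbt writes it to target/ by default."""
    return json.loads(Path(path).read_text())
```

Calling summarize_manifest(load_manifest()) from the root of a dbt project after a run returns one dictionary per model.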
There are several options to connect dbt Core with Secoda:
Connect an AWS S3 or GCP GCS bucket (Recommended)
Upload a manifest.json and run_results.json through the UI
Upload a manifest.json and run_results.json through the Secoda API
Option 1 -> Connect a Bucket (Recommended)
This option is recommended to ensure that Secoda always has the latest manifest.json and run_results.json files from dbt Core. Secoda will only sync these files from the bucket.
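Keeping the bucket current means copying both files there after every dbt run. A minimal sketch of that step follows, assuming boto3 is installed and that the job running dbt has its own write credentials (separate from the read-only access granted to Secoda). The bucket name, prefix, and function names are hypothetical, and this page does not specify where in the bucket Secoda expects the files.

```python
from __future__ import annotations

from pathlib import Path

# The two dbt artifacts Secoda syncs from the bucket.
ARTIFACTS = ("manifest.json", "run_results.json")


def plan_uploads(target_dir: str = "target", prefix: str = "") -> list[tuple[str, str]]:
    """Return (local_path, object_key) pairs for the two dbt artifacts."""
    clean = prefix.strip("/")
    pairs = []
    for name in ARTIFACTS:
        key = f"{clean}/{name}" if clean else name
        pairs.append((str(Path(target_dir) / name), key))
    return pairs


def upload_artifacts(bucket: str, target_dir: str = "target", prefix: str = "",
                     endpoint_url: str | None = None) -> None:
    """Upload both artifacts to the bucket using credentials from the environment."""
    import boto3  # third-party; imported lazily so plan_uploads works without it

    # For a GCS bucket accessed with HMAC keys, pass
    # endpoint_url="https://storage.googleapis.com".
    s3 = boto3.client("s3", endpoint_url=endpoint_url)
    for local_path, key in plan_uploads(target_dir, prefix):
        s3.upload_file(local_path, bucket, key)
```

Running upload_artifacts("my-dbt-artifacts") as the last step of a dbt job would overwrite the previous copies, so the bucket always holds the most recent run.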
Connect an AWS S3 bucket
You can connect to the AWS S3 bucket using either an AWS IAM user or an AWS IAM role.
Connect a GCP GCS bucket
Log in to the GCP Cloud Console.
Create a service account.
Grant access to the service account from the Bucket page as “Storage Object Viewer”.
Turn on interoperability on the bucket. Generate HMAC keys for a service account with read access to the bucket. Both located here:
Set up CORS on the bucket. GCP requires this to be done through the CLI, using a cors.json configuration file.
Save the HMAC keys to use in the connection form, which takes the following fields:
Access Key Id
Secret
Region: the bucket region for GCP
S3 Endpoint: must be added and set to https://storage.googleapis.com
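The CORS step above can be sketched as follows. The original cors.json contents are not reproduced on this page, so the policy below is an assumption: a read-only (GET) rule for a single origin, in the JSON shape that GCS accepts. The origin value and the helper name build_cors_config are illustrative and should be checked against your Secoda workspace.

```python
import json


def build_cors_config(origins):
    """Build a read-only CORS policy in the JSON structure GCS expects."""
    return [{
        "origin": list(origins),
        "method": ["GET"],
        "responseHeader": ["Content-Type"],
        "maxAgeSeconds": 3600,
    }]


def write_cors_file(path, origins):
    """Write the policy to disk so it can be applied with the CLI."""
    with open(path, "w") as fh:
        json.dump(build_cors_config(origins), fh, indent=2)


# Assumed usage (origin and bucket name are placeholders):
#   write_cors_file("cors.json", ["https://app.secoda.co"])
# then, from a shell:
#   gsutil cors set cors.json gs://BUCKET_NAME
```

The policy is limited to GET because Secoda only needs to read the two files from the bucket.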
Connect your S3 bucket to Secoda
Navigate to https://app.secoda.co/integrations/new and click dbt Core
Choose the Access Key tab and add the HMAC keys saved above to the relevant fields.
Test the Connection - if successful you'll be prompted to run your initial sync
Option 2 -> Upload manifest.json
This is a one-time sync with your manifest.json file. You can upload the file by following these steps:
Navigate to https://app.secoda.co/integrations/new and click dbt Core
Choose the File Upload tab and select your manifest.json and run_results.json files using the file selector
Test the Connection - if successful you'll be prompted to run your initial sync
Option 3 -> Secoda API
The API provides endpoints to upload your manifest.json and run_results.json files. This is convenient if you run dbt with Airflow because you can upload the manifest.json at the end of a dbt run. Follow these instructions to upload your manifest.json via the API:
Create a blank dbt Core integration: go to https://app.secoda.co/integrations/new, select the "dbt Core" integration, click "Test Connection", and run the initial extraction. This extraction will fail, which is expected.
Return to https://app.secoda.co/integrations and click on the dbt Core integration that was just created. Save the ID, which is contained in the URL.
Use the endpoints below to upload your files. This will trigger an extraction to run on the integration you created in step #1.
Endpoints ->
Manifest.json: https://api.secoda.co/integration/dbt/manifest/
Run_results.json: https://api.secoda.co/integration/dbt/run_results/
Method -> POST
Sample Request for Manifest file (Python) ->
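The original sample is not reproduced here, so the following is a minimal sketch rather than the official request. It builds a multipart POST with the standard library only. The endpoint URLs come from this page; Bearer-token authentication and the form field names ("integration_id", and a file field such as "manifest_file") are assumptions that should be checked against the Secoda API reference.

```python
from __future__ import annotations

import urllib.request
import uuid

MANIFEST_URL = "https://api.secoda.co/integration/dbt/manifest/"
RUN_RESULTS_URL = "https://api.secoda.co/integration/dbt/run_results/"


def build_upload_request(url: str, file_field: str, filename: str, content: bytes,
                         integration_id: str, api_key: str) -> urllib.request.Request:
    """Build (but do not send) a multipart/form-data POST for one dbt artifact."""
    boundary = uuid.uuid4().hex
    head = (
        f"--{boundary}\r\n"
        # Assumed field name for the ID saved from the integration URL.
        'Content-Disposition: form-data; name="integration_id"\r\n\r\n'
        f"{integration_id}\r\n"
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="{file_field}"; filename="{filename}"\r\n'
        "Content-Type: application/json\r\n\r\n"
    ).encode("utf-8")
    tail = f"\r\n--{boundary}--\r\n".encode("utf-8")
    return urllib.request.Request(
        url,
        data=head + content + tail,
        method="POST",
        headers={
            "Authorization": f"Bearer {api_key}",  # assumed auth scheme
            "Content-Type": f"multipart/form-data; boundary={boundary}",
        },
    )


def send(request: urllib.request.Request) -> tuple[int, str]:
    """Send the request and return the HTTP status and response body."""
    with urllib.request.urlopen(request) as resp:
        return resp.status, resp.read().decode("utf-8")
```

To upload a manifest, read target/manifest.json as bytes, pass it to build_upload_request with MANIFEST_URL and your integration ID, then call send on the result. The same helper works for run_results.json with RUN_RESULTS_URL and the corresponding file field name.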
Sample Response ->