Monitors

Create data monitors for visibility into the health of your data stack

Introduction

Monitoring plays a crucial role in maintaining data quality by allowing you to configure alerts for changes in your data. Automatically schedule monitors and set thresholds to track run history and visualize monitor performance.

Admins and Editors can open the Monitors page from the side panel. There, you can view all monitors and incidents across the platform and create new monitors.

To learn about how our current customers are using Monitors in Secoda to improve their data quality, check out this list of Monitoring Use Cases.

Note: Read permissions for the source data (in addition to the metadata) are required for the monitoring feature.

Types of Monitors

Select from a variety of Monitors to suit your needs:

Tables

Row Count - The number of rows over time
Freshness - The time elapsed since last update

Columns

Cardinality - The number of distinct values of a given column
Maximum - The highest value of a numeric column
Minimum - The lowest value of a numeric column
Mean - The arithmetic mean of a numeric column
Null Percentage - The percentage of values in a column that are null
Unique Percentage - The percentage of values in a column that are unique
Custom SQL - Define a monitor by writing your own SQL query

Pipelines

Job duration - Average length of job runs
Job error rate - Percentage of failed job runs
Job success rate - Percentage of successful job runs

Snowflake

Cost - Daily total cost
Query Volume - Daily queries run
Compute Credits - Daily compute credits consumed
Storage Usage - Total storage usage

The monitor will alert if any of these values are higher or lower than expected.

Creating Monitors

Monitors can be created via the Monitors section in the sidebar or through the Monitors tab on the resource page:

Navigate to "Monitors" and click "Create monitor."

Choose the monitor type and select the integration. If adding a new Monitor from the resource itself, the integration will be pre-selected.

Select one or multiple resources that you'd like to add the monitor to.

Adjust the Threshold and Schedule to your preferred configuration.

Schedule Options: Daily, every 12, 6, or 3 hours, or Hourly
Threshold: Automatic or Manual
- Note: Automatic thresholds need a learning period before they are set: about 4 days for hourly monitors, 6 days for monitors that run multiple times a day, and 8-9 days for daily monitors.

Once configured, click add monitor, and the monitor should now show up in the list of monitors. You can view and edit its configuration from the sidebar on the monitor page.

Note: You can only add a monitor type that every selected column supports.

For example, if you have 3 numeric columns selected, you can add a "MIN" or "MAX" monitor, but you cannot add either one if even a single string column is also selected in the modal.
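
The schedule and threshold choices above can also be written in the dbt YAML format described under Monitors as Code. The following is a minimal sketch, not a confirmed mapping: it assumes that the "Every 6 hours" option corresponds to `cadence: hourly` with `frequency: 6` (inferred from the Configuration Options reference), it reuses the `stg_orders` model from the complete example, and the threshold values are purely illustrative.

```yaml
version: 2
models:
  - name: stg_orders
    meta:
      secoda:
        monitors:
          row_count:
            schedule:
              cadence: hourly
              frequency: 6      # assumption: run every 6 hours
            thresholds:
              method: manual    # manual thresholds need no learning period
              min: 50           # illustrative lower limit
              max: 500          # illustrative upper limit
```

With `method: manual`, both `min` and `max` are supplied, since the configuration reference marks them as required for manual thresholds.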

Custom SQL Monitors

You can create a monitor that runs custom SQL. The only requirement is that the final output of the query is a single value.

Follow the same steps as above, but choose "Custom SQL" as the Monitor type. After creating the monitor, open it and add your query in the right side panel.
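
For teams that manage monitors through dbt, a Custom SQL monitor can be sketched in the YAML format described under Monitors as Code. The table and column names below are borrowed from the complete example in that section, and the monitor name and the check itself are hypothetical. The point to note is that the query reduces to one value (a single row with a single column).

```yaml
version: 2
models:
  - name: stg_orders
    meta:
      secoda:
        monitors:
          custom_sql:
            name: "Negative Order Amounts"   # hypothetical monitor name
            # COUNT(*) returns exactly one value, which satisfies
            # the single-value requirement for custom SQL monitors.
            query: "SELECT COUNT(*) FROM analytics.dbt_prod.stg_orders WHERE order_amount < 0"
```

A query that returns several rows or columns, such as a `GROUP BY` without further aggregation, would not meet the single-value requirement.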

WHERE clause

Standard monitors, such as null percentage and row count, can be narrowed with custom SQL that is added as a WHERE clause to the monitor's standard query.

Managing Monitors

View Status, Last and Next Run details, and a Chart Visualization of the monitor's historical performance.

Thresholds and Incidents

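If you've chosen Automatic thresholds, it can take from a few days to more than a week, depending on the monitor's schedule (see the note under Creating Monitors), for Secoda to finish learning what the right thresholds should be for your monitors.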

The lighter green band surrounding the main line represents the threshold limits. Once a measurement passes a threshold, a red dot appears on the chart to indicate an incident.

At the bottom of the Monitor page, you will find a history of all measurements, including those that have triggered incidents. You can easily filter this list to view only incidents and specify a particular time period.

When an incident occurs, you have two primary options for handling it: you can either acknowledge the incident to recognize that it has occurred or resolve it once appropriate actions have been taken.

The incident page provides visibility into all resources impacted downstream of the affected table or column. Additionally, the activity log allows team members to comment on the incident, tag others, and add relevant context for resolution.

If you have an existing integration with Jira, you can create a ticket for the incident directly from the Secoda incident page, streamlining your workflow.

Incidents are automatically acknowledged if a comment is made or a Jira ticket is created. They are automatically resolved if the measurement returns to range after three consecutive monitoring runs.

This incident management system is designed to enhance team collaboration and streamline the resolution process.

Annotations

The Monitoring Annotations feature provides a way to mark specific data points within monitoring incidents, giving teams a deeper context and helping refine automatic thresholds over time. This feature includes a Normal button, which allows users to indicate that a particular data point is not an anomaly, but rather expected behavior.

Using Monitoring Annotations with the Normal Button

Navigate to the Incident that you wish to annotate.
Select the Normal button for the data point that you want to classify as expected.
This action marks the data point as typical, contributing to the accuracy of Secoda's automatic thresholding over time.

By tagging data points through Monitoring Annotations, you help Secoda’s monitoring become more attuned to your data’s unique patterns, reducing false alerts and improving the identification of true anomalies.

Errors

You may receive an error on your Monitors for various reasons. The error will appear under Status, and you can click into it to see exactly what went wrong with the Monitor. For example, if a Custom SQL monitor is created but a query is never provided, the monitor will error out.

Best Practices

To optimize the effectiveness of data monitoring and manage resource utilization, consider these best practices:

Selective Monitoring: Focus on the most critical data elements. Prioritize columns and tables that are essential for your business operations to avoid unnecessary strain on resources.
Optimize Frequency: Set monitoring frequencies that balance timeliness and resource consumption. For many applications, configuring monitors to run daily is sufficient to catch issues without incurring excessive costs.
Regular Reviews: Periodically review data quality monitoring configurations. This ensures that your monitoring strategies stay aligned with evolving business needs and data landscapes.
Workflow Integration: Embed monitoring alerts into your team’s daily workflows using tools like Slack or email (see Monitoring Notifications). This ensures that the right personnel are promptly notified, enabling swift action.
Documentation and Training: Keep detailed documentation of your monitor setups and procedures. Train your team on the importance of monitoring and the actions required when specific alerts are triggered.
Trend Analysis: Leverage historical data from your monitoring activities to identify trends and patterns. This analysis can help refine your data management practices and predictive monitoring over time.

By following these guidelines, you can ensure your monitoring processes are both efficient and effective, providing critical insights while maintaining control over costs and resource use.

Monitoring Permissions

Monitoring functionality is primarily intended for users with Edit or Admin roles, rather than end business users who typically have Viewer permissions.

Creating:
- Any Admin or Editor can create monitors, without requiring specific integration permissions.
Editing and Owning:
- Initially, any Admin or Editor can edit monitors. However, once an owner is designated for a monitor, only the owner and Admins will have editing rights.
- Owners and Admins will also have the ability to invite additional owners to the monitors.
Deleting, Running, and Resolving:
- The permissions for deleting, running, and resolving monitors will mirror those for editing. This means that only the monitor's owner and Admins can perform these actions once an owner is assigned.
Viewing:
- Any Admin or Editor can view monitors. Future updates are planned to allow more granular control over who can view monitor outputs, enhancing privacy and data security.

Monitoring Notifications

Stay informed about the status of your monitors by adjusting your Notification settings. Specify your preferred channels for receiving alerts—whether through Slack DMs, email, or directly within the app.

Notifications for monitor incidents are issued only after the first occurrence following a successful run. The same incident will not trigger new alerts unless the issue has been resolved and another incident occurs. This policy minimizes repetitive alerts and ensures that notifications remain meaningful and actionable.

Configuring Slack Channel Notifications

Admins can direct monitoring notifications to specific Slack channels, distinct from other notification settings. This ensures that the right team members are alerted promptly. For detailed steps on setting this up, visit here.

Email Monitoring Notifications

Email notifications provide direct links to the relevant sections in Secoda. Clicking the "Open Secoda" link takes you to the Inbox notification, while other links direct you to specific incidents or tables.

Monitors as Code

You can declare Secoda monitors directly in your dbt model YAML files. These monitors are managed entirely through code and cannot be modified through the Secoda UI. Changes are applied when the dbt integration syncs with Secoda.

Monitor Configuration Formats

Monitors can be specified using two formats:

List Format (Default Configuration)

monitors:
  - mean
  - max
  - null_percentage

When using list format, each monitor uses these default settings:

Automatic thresholds (sensitivity: 5)
Daily schedule (runs once per day at UTC midnight)
Both upper and lower bounds
Auto-generated monitor name

Dictionary Format (Custom Configuration)

monitors:
  mean: {}  # Empty dict for default configuration (same defaults as list format)
  null_percentage:
    name: "Custom Null Check"
    thresholds:
      sensitivity: 8
      bounds: lower

You can mix both formats across different models and columns, but each individual monitors section must use either list or dictionary format, not both. Use list format for simplicity when default settings are sufficient, and dictionary format when you need to customize any monitor settings.

Available Monitor Types

Table-Level Monitors

row_count: Tracks number of rows
freshness: Monitors last update time (Snowflake/Redshift only)
custom_sql: Execute custom SQL queries

Column-Level Monitors

mean: Average value
max: Maximum value
min: Minimum value
cardinality: Number of distinct values
null_percentage: Percentage of null values
unique_percentage: Percentage of unique values

Configuration Options

Here are all available configuration properties with their default values:

monitors:
  metric_name:
    # Threshold Configuration
    thresholds:
      method: automatic   # or manual
      min: null           # required if method is manual
      max: null           # required if method is manual
      sensitivity: 5      # scale 1-10, used for automatic method
      bounds: both        # or upper or lower
    
    # Schedule Configuration
    schedule:
      cadence: daily     # or hourly
      hour_utc: 0        # hour of day (0-23) for daily cadence
      frequency: 1       # run frequency in hours for hourly cadence
    
    # General Configuration
    name: ""             # custom monitor name
    description: ""      # optional description
    query: null          # required for custom_sql monitors
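
As one illustration of these properties, here is a sketch of a column monitor that keeps automatic thresholds but overrides the sensitivity, bounds, and run time. The model and column names are borrowed from the complete example, and the values are illustrative. Reading `bounds: upper` as "alert only when the value is higher than expected" is an assumption, and the direction of the sensitivity scale is not stated here.

```yaml
version: 2
models:
  - name: stg_orders
    columns:
      - name: order_amount
        meta:
          secoda:
            monitors:
              null_percentage:
                description: "Watch for a rising share of nulls"
                thresholds:
                  method: automatic
                  sensitivity: 8   # scale 1-10; default is 5
                  bounds: upper    # assumption: ignore values below the expected range
                schedule:
                  cadence: daily
                  hour_utc: 6      # run once per day at 06:00 UTC
```

Any property left out, such as `name`, falls back to the defaults listed above.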

Complete Example

version: 2
models:
  - name: stg_orders
    meta:
      secoda:
        monitors:
          custom_sql:
            query: "SELECT AVG(amount) FROM analytics.dbt_prod.stg_orders"
            schedule:
              cadence: hourly
          row_count:
            name: "Orders Volume Monitor"
            thresholds:
              method: manual
              min: 50
              max: 500
    columns:
      - name: order_amount
        meta:
          secoda:
            monitors:
              - mean
              - max
              - null_percentage
      - name: status
        meta:
          secoda:
            monitors:
              unique_percentage: {}
              cardinality:
                thresholds:
                  method: manual
                  min: 3
                  max: 10