Column Profiling

Secoda provides high-level insight into the columns of your datasets with Column Profiling.
Column profiling is the process of analyzing the characteristics and patterns of the data in a specific column or field of a dataset. It can be used to identify data quality issues and help organizations improve the quality of their data.
Column profiling offers several data quality benefits:
  1. Improved data accuracy: Column profiling can help identify data errors or inconsistencies within a specific column, allowing organizations to correct these issues and improve the overall accuracy of their data.
  2. Enhanced data integrity: By identifying data quality issues within a specific column, organizations can take steps to ensure that data is being entered and stored correctly, improving data integrity.
  3. Improved data governance: Column profiling can help organizations identify and address data quality issues, improving data governance and reducing the risk of errors or misunderstandings.
  4. Enhanced data trustworthiness: By identifying and correcting data quality issues, organizations can improve the trustworthiness of their data, making it more useful and reliable for decision-making and analysis.

How to run column profiling

You can view the distribution of your data, the column count, and how many unique columns you have. If you hover over the distribution visualization, you can see the distribution percentage and the column name.
To check how up to date the column profile is, look above the table in the right-hand corner, where Secoda shows when the profile was last updated.
Results of the column profiler
Double-click the results of the column profiler for more information.
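
To make the distribution percentages shown in the visualization more concrete, here is a minimal Python sketch of how such percentages could be derived from a column's values. It is an illustration only, not Secoda's implementation: the function name `value_distribution` and the sample `statuses` column are hypothetical.

```python
from collections import Counter


def value_distribution(values):
    """Return each distinct value's share of the column as a percentage.

    Hypothetical helper for illustration; not part of Secoda.
    """
    counts = Counter(values)
    total = len(values)
    # most_common() orders values from most to least frequent.
    return {value: round(100 * n / total, 1) for value, n in counts.most_common()}


# Hypothetical sample column.
statuses = ["shipped", "shipped", "pending", "shipped", "cancelled"]

distribution = value_distribution(statuses)
unique_count = len(distribution)  # number of distinct values in the column

print(distribution)
print(unique_count)
```

Here each percentage is simply the value's frequency divided by the total number of entries, which is the kind of figure a hover tooltip on a distribution chart would typically report.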

How it works

Column profiling runs a SELECT query directly on the database or data warehouse and processes the data to determine the minimum, maximum, range, and other statistics of the columns. The processed data is not saved: once the calculations are complete, we save only the resulting metadata, and the data the calculation was run on is not persisted anywhere in our database.
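
The approach described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the query Secoda actually runs: it uses an in-memory SQLite database in place of a data warehouse, and the `profile_column` function, the `orders` table, and the `amount` column are all hypothetical. The point is that a single aggregate SELECT returns only summary statistics, so no row-level data needs to be stored.

```python
import sqlite3


def profile_column(conn, table, column):
    """Compute summary statistics for one numeric column with a single SELECT.

    Hypothetical helper for illustration. The table and column names are
    assumed to be trusted identifiers (they are interpolated into the query).
    """
    query = (
        f'SELECT MIN("{column}"), MAX("{column}"), '
        f'COUNT("{column}"), COUNT(DISTINCT "{column}") '
        f'FROM "{table}"'
    )
    minimum, maximum, non_null, distinct = conn.execute(query).fetchone()
    # Only these aggregates (the profile metadata) are returned and kept.
    return {
        "min": minimum,
        "max": maximum,
        "range": None if minimum is None else maximum - minimum,
        "non_null_count": non_null,
        "distinct_count": distinct,
    }


# Hypothetical sample data standing in for a warehouse table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (amount INTEGER)")
conn.executemany(
    "INSERT INTO orders VALUES (?)",
    [(10,), (25,), (25,), (40,), (None,)],
)

profile = profile_column(conn, "orders", "amount")
print(profile)
```

Because the aggregation happens inside the database, the caller only ever receives one row of results, which mirrors the idea that the metadata is saved while the underlying data is not.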