Links

Redshift

An overview of the Redshift integration with Secoda

Getting Started with Redshift

There are three steps to connect Redshift with Secoda:
  1. 1.
    Create a database user
  2. 2.
    Connect Redshift to Secoda
  3. 3.
    Whitelist Secoda IP Address

Create a Database User

The username and password you’ve already created for your cluster is your admin password, which you should keep for your own usage. For Secoda, and any other 3rd-parties, it is best to create distinct users. This will allow you to isolate queries from one another using WLM and perform audits easier.
To create a new user, you’ll need to log into the Redshift database directly and run the following SQL commands:
Redshift doesn't allow for non-super users to access the system tables, which is where we pull the metadata from. Below is a workaround so you aren't giving Secoda superuser access. Secoda only uses the system tables for our metadata extraction, the extraction query can be viewed here.
-- Create a user named "secoda" that Secoda will use when connecting to your Redshift cluster.
CREATE USER secoda PASSWORD '<enter password here>';
-- Allows the non super user "secoda" to query metadata
-- Explaination of query here -> https://stackoverflow.com/questions/48567440/granting-permissions-on-redshift-system-tables-to-non-superusers
ALTER USER secoda SYSLOG ACCESS UNRESTRICTED;
-- Complete this query for any schemas you would like Secoda to extract
GRANT SELECT ON ALL TABLES IN SCHEMA <schema_name> TO secoda
GRANT SELECT ON svv_table_info TO secoda;
When connecting to Redshift in Secoda, use the username/password you’ve created here instead of your admin account.

Connect Redshift to Secoda

After creating a Redshift warehouse, the next step is to connect Secoda:
  1. 1.
    In the Secoda App, select ‘Add Integration’ on the Integrations tab
  2. 2.
    Search for and select ‘Redshift’
  3. 3.
    Enter your Redshift credentials
  4. 4.
    Click 'Connect'

Security

VPCs keep servers inaccessible to traffic from the internet. With VPC, you’re able to designate specific web servers access to your servers. In this case, you will be whitelisting the Secoda IPs to read from your data warehouse.

Best Practice

Networking

Redshift clusters can either be in a EC2 Classic subnet or VPC subnet.
If your cluster has a field called Cluster Security Groups, proceed to EC2 Classic
Or if your cluster has a field called VPC Security Groups, proceed to EC2 VPC

EC2-Classic

  1. 1.
    Navigate to your Redshift Cluster settings: Redshift Dashboard > Clusters > Select Your Cluster
  2. 2.
    Click on the Cluster Security Groups
  3. 3.
    Open the Cluster Security Group
  4. 4.
    Click on “Add Connection Type”
  5. 5.
    Choose Connection Type CIDR/IP and authorize Secoda to read into your Redshift Port using
    35.175.75.15/32
    34.230.160.245/32
    54.175.53.193/32

EC2-VPC

  1. 1.
    Navigate to your Redshift Dashboard > Clusters > Select Your Cluster
  2. 2.
    Click on the VPC Security Groups
  3. 3.
    Select the “Inbound” tab and then “Edit”
  4. 4.
    Allow Secoda to read into your Redshift Port using
    35.175.75.15/32
    34.230.160.245/32
    54.175.53.193/32
You can find more information on that here.
  1. 1.
    Navigate back to your Redshift Cluster Settings: Redshift Dashboard > Clusters > Select Your Cluster
  2. 2.
    Select the “Cluster” button and then “Modify”
  3. 3.
    Make sure the “Publicly Accessible” option is set to “Yes”