Connecting Amazon Redshift as destination using Bold Data Hub

Introduction

This guide walks you through the steps to connect Amazon Redshift as a destination using Bold Data Hub.

Prerequisites
  • Credentials: Ensure that the Amazon Redshift credentials are correct and have the necessary permissions.
Steps
Step 1: Open Bold Data Hub

Launch the Bold Data Hub application.

DataHub.png

Step 2: Set up the Amazon Redshift Credentials
  • Navigate to the Settings section in Bold Data Hub.
  • Click on the settings icon.

settings.png

Step 3: Select Connection Type
  • Choose the connection type as New.

3.png

Step 4: Configure Amazon Redshift
  • Select the server type as Amazon Redshift.
  • Fill in your Amazon Redshift credentials as follows:
    • Datastore Name: Enter a meaningful name; this is how the Amazon Redshift credentials will be stored in Bold Data Hub.
    • Server: Enter the server name.
      Example: redshift-cluster-1706.cldsfy4.us-east-1.redshift.amazonaws.com
    • Username: Enter your Amazon Redshift username.
    • Password: Enter your Amazon Redshift password.
    • Database: Enter your Amazon Redshift database name.
    • Staging: Optionally, enable staging support so that data is loaded into Redshift through an Amazon S3 bucket, which is typically faster for large loads.
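
Before entering the Server value, it can help to confirm that it has the expected shape. The following is a minimal sketch, not part of Bold Data Hub, that splits a provisioned-cluster endpoint of the form shown in the example above (cluster-id.unique-id.region.redshift.amazonaws.com) into its parts. The function name parse_redshift_server and the optional ":port" suffix are assumptions made for illustration; Redshift Serverless endpoints use a different format and are not covered.

```python
from typing import NamedTuple

REDSHIFT_SUFFIX = ".redshift.amazonaws.com"
DEFAULT_PORT = 5439  # default Amazon Redshift port


class RedshiftEndpoint(NamedTuple):
    cluster_id: str
    region: str
    host: str
    port: int


def parse_redshift_server(server: str) -> RedshiftEndpoint:
    """Split 'cluster.unique.region.redshift.amazonaws.com[:port]' into parts.

    Hypothetical helper for illustration; raises ValueError on other shapes.
    """
    host, sep, port_text = server.strip().partition(":")
    port = int(port_text) if sep else DEFAULT_PORT
    host = host.lower()
    if not host.endswith(REDSHIFT_SUFFIX):
        raise ValueError(f"not a Redshift cluster endpoint: {server!r}")
    # What remains should be exactly: cluster id, unique id, region.
    labels = host[: -len(REDSHIFT_SUFFIX)].split(".")
    if len(labels) != 3 or not all(labels):
        raise ValueError(f"unexpected endpoint format: {server!r}")
    cluster_id, _unique_id, region = labels
    return RedshiftEndpoint(cluster_id, region, host, port)


endpoint = parse_redshift_server(
    "redshift-cluster-1706.cldsfy4.us-east-1.redshift.amazonaws.com"
)
print(endpoint.cluster_id, endpoint.region, endpoint.port)
```

The region extracted here should normally match the region of the cluster shown in the AWS console, which is a quick way to catch a copy-and-paste mistake.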
Connecting Redshift using S3 Bucket

Fill in your Amazon S3 credentials as follows:

  • Bucket Name: Enter the bucket name.
  • Key Name: Enter the key name (folder name).
  • Role: Specify the IAM role that grants Redshift permission to access the S3 bucket. The role must allow Redshift to perform COPY operations, which typically requires:
    • s3:GetObject to access the files in the specified S3 bucket.
    • sts:AssumeRole, granted in the role's trust policy, to allow the Redshift cluster to assume the role.
  • Secret Access Key: Enter the secret access key for Amazon Redshift.
  • Access Key ID: Enter the access key ID for Amazon Redshift.
  • Region: Select the region of the S3 bucket from the list of available regions.
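
The role requirements above can be sketched as two IAM policy documents: a permissions policy for the bucket and a trust policy that lets the Redshift service assume the role. This is an illustrative sketch only. The bucket name "my-bucket" and folder "staging" are placeholders, the function names are hypothetical, and the s3:ListBucket statement is an addition not listed in this article (AWS documents that COPY needs list access to the bucket as well as read access to its objects). Adjust the policy to your own security requirements.

```python
import json


def s3_read_policy(bucket: str, key_name: str = "") -> dict:
    """Permissions policy: read objects under the folder and list the bucket."""
    folder = key_name.strip("/")
    object_path = f"{folder}/*" if folder else "*"
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": ["s3:GetObject"],
                "Resource": f"arn:aws:s3:::{bucket}/{object_path}",
            },
            {
                # Assumption: added because COPY also lists the bucket contents.
                "Effect": "Allow",
                "Action": ["s3:ListBucket"],
                "Resource": f"arn:aws:s3:::{bucket}",
            },
        ],
    }


def redshift_trust_policy() -> dict:
    """Trust policy: allows the Redshift service to assume the role."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": {"Service": "redshift.amazonaws.com"},
                "Action": "sts:AssumeRole",
            }
        ],
    }


# Placeholder bucket and folder names, for illustration only.
print(json.dumps(s3_read_policy("my-bucket", "staging"), indent=2))
print(json.dumps(redshift_trust_policy(), indent=2))
```

The printed JSON can be pasted into the IAM console's policy editor (permissions policy) and the role's trust relationship editor (trust policy) respectively.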
How to Assign the Role

If a role hasn’t been created yet for Redshift to access S3, follow these steps:

  1. Create a new IAM role with the necessary S3 permissions (as described above).
  2. Attach this role to your Redshift cluster through the Redshift Console under Cluster Properties -> Cluster Permissions -> Manage IAM roles.
Retrieve the IAM Role Name

To retrieve the IAM role name required for the COPY command, follow these steps:

  1. Navigate to the IAM Console:
    • Go to the AWS Management Console.
    • In the search bar, type IAM and select IAM (Identity and Access Management) from the services list.
  2. Find the Role:
    • In the IAM dashboard, click on Roles in the left-hand menu.
    • Look for the role that was created to grant Redshift access to the S3 bucket. This role should be associated with your Redshift cluster.
  3. Copy the Role Name and ARN:
    • In the IAM console, click on the role to open its details.
    • You will find the Role name and the ARN (Amazon Resource Name) on the summary page of the role.
    • The ARN format is:
      arn:aws:iam::account-id:role/role-name
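
To show how the ARN is used, here is a minimal sketch that checks a role ARN against the format above and places it in a Redshift COPY statement through the IAM_ROLE parameter. The exact COPY command that Bold Data Hub issues is not documented here, so the table name, bucket, folder, CSV format, and account ID 123456789012 are all placeholder assumptions, and the helper names are hypothetical. The pattern covers only the standard "aws" partition.

```python
import re

# arn:aws:iam::<12-digit account id>:role/<optional path/><role name>
ROLE_ARN_PATTERN = re.compile(
    r"^arn:aws:iam::(?P<account_id>\d{12}):role/(?P<role>[\w+=,.@/-]+)$"
)


def parse_role_arn(arn: str) -> tuple[str, str]:
    """Return (account_id, role_name); raise ValueError if the ARN is malformed."""
    match = ROLE_ARN_PATTERN.match(arn.strip())
    if match is None:
        raise ValueError(f"not an IAM role ARN: {arn!r}")
    # The role name is the last segment after any optional path.
    role_name = match.group("role").rsplit("/", 1)[-1]
    return match.group("account_id"), role_name


def build_copy_statement(table: str, bucket: str, key_name: str, role_arn: str) -> str:
    """Build an illustrative COPY statement; inputs are trusted, not escaped."""
    parse_role_arn(role_arn)  # fail early on a malformed ARN
    folder = key_name.strip("/")
    return (
        f"COPY {table}\n"
        f"FROM 's3://{bucket}/{folder}/'\n"
        f"IAM_ROLE '{role_arn}'\n"
        "FORMAT AS CSV;"
    )


# Placeholder values, for illustration only.
arn = "arn:aws:iam::123456789012:role/RedshiftS3Access"
print(parse_role_arn(arn))
print(build_copy_statement("public.orders", "my-bucket", "staging", arn))
```

If the COPY fails with an authorization error, the usual causes are a role that is not attached to the cluster or a bucket policy that does not cover the folder being read.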

5.png

Step 5: Save the Credentials

Click Save to store the credentials.

6.png

Step 6: Add Pipeline

Click on Add Pipeline, give the pipeline a meaningful name, and click the tick icon or press Enter.

7.png

8.png

Note: In Bold BI, the data source will be created under the given pipeline name.

Step 7: Add Template Details

Click the pipeline, and then choose the connector from which you want to extract data.
Click the Add Template button and fill in the details in the template.

Picture9.png

Picture10.png

Step 8: Save Template Details

Click the Save button. Choose the data store name from the dropdown and click Yes. Once saved, validation will be completed, and the pipeline will start.

Picture11.png

Picture12.png

Picture13.png

Step 9: Check Logs

Navigate to the logs page to verify whether the data source has been created in Bold BI.

Picture14.png

  • Note: When running the same pipeline again, a new property called isDropTable is added in the YAML template. By default, it is set to true.
    • If isDropTable is set to true, the existing table will be dropped and recreated before the data is moved, ensuring that no duplicate rows are added.
    • If isDropTable is set to false, the data will be moved again without deleting or modifying the existing data.
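
The difference between the two isDropTable settings can be illustrated with a small in-memory simulation. This is not Bold Data Hub code: the destination is modelled as a plain dictionary of table name to rows, and the run_pipeline function and its is_drop_table parameter are hypothetical stand-ins for a pipeline run and the isDropTable property.

```python
def run_pipeline(destination: dict, table: str, rows: list, is_drop_table: bool = True) -> list:
    """Simulate one pipeline run against an in-memory 'destination'."""
    if is_drop_table:
        # Drop and recreate: start again from an empty table.
        destination[table] = []
    # Move the extracted rows into the (possibly pre-existing) table.
    destination.setdefault(table, []).extend(rows)
    return destination[table]


source_rows = [{"id": 1}, {"id": 2}]

# is_drop_table=True: running twice leaves a single copy of the data.
dest = {}
run_pipeline(dest, "orders", source_rows, is_drop_table=True)
run_pipeline(dest, "orders", source_rows, is_drop_table=True)
print(len(dest["orders"]))  # 2

# is_drop_table=False: the second run appends, so rows are duplicated.
dest = {}
run_pipeline(dest, "orders", source_rows, is_drop_table=False)
run_pipeline(dest, "orders", source_rows, is_drop_table=False)
print(len(dest["orders"]))  # 4
```

In practice this means false is only appropriate when each run extracts rows that are not already in the destination table.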

Picture17.png

Step 10: Verify Data Source

Check the data source in Bold BI.

Picture15.png

Picture16.png

Additional References

  • Bold BI Documentation

Written by Sangavi Eswaramoorthi