Connecting Amazon Redshift as destination using Bold Data Hub
Introduction
This guide walks you through the steps to connect Amazon Redshift as a destination using Bold Data Hub.
Prerequisites
- Credentials: Ensure that the Amazon Redshift credentials are correct and have the necessary permissions.
Steps
Step 1: Open Bold Data Hub
Launch the Bold Data Hub application.
Step 2: Set up the Amazon Redshift Credentials
- Navigate to the Settings section in Bold Data Hub.
- Click on the settings icon.
Step 3: Select Connection Type
- Choose the connection type as New.
Step 4: Configure Amazon Redshift
- Select the server type as Amazon Redshift.
- Fill in your Amazon Redshift credentials as follows:
- Datastore Name: Enter a meaningful name; this is how the Amazon Redshift credentials will be stored in Bold Data Hub.
- Server: Enter the server name.
Example: redshift-cluster-1706.cldsfy4.us-east-1.redshift.amazonaws.com
- Username: Enter your Amazon Redshift username.
- Password: Input your Amazon Redshift password.
- Database: Enter your Amazon Redshift database name.
- Staging: Optionally configure Redshift with Staging support for faster data access.
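Before typing the values into the form, it can help to sanity-check them. The following is a minimal sketch, not part of Bold Data Hub: the `RedshiftSettings` class and its `problems` method are hypothetical names that simply mirror the fields listed above and flag empty values or a server entered as a URL instead of a bare host name.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class RedshiftSettings:
    """Hypothetical container mirroring the form fields (not a Bold Data Hub API)."""
    datastore_name: str
    server: str
    username: str
    password: str
    database: str

    def problems(self) -> List[str]:
        """Return human-readable problems; an empty list means the fields look plausible."""
        issues = []
        for field_name in ("datastore_name", "server", "username", "password", "database"):
            if not getattr(self, field_name).strip():
                issues.append(f"{field_name} is empty")
        # The Server field expects a host name such as the example above, not a URL.
        if "://" in self.server or "/" in self.server:
            issues.append("server should be a bare host name, not a URL")
        return issues


if __name__ == "__main__":
    settings = RedshiftSettings(
        datastore_name="warehouse",  # placeholder values for illustration
        server="redshift-cluster-1706.cldsfy4.us-east-1.redshift.amazonaws.com",
        username="admin",
        password="secret",
        database="dev",
    )
    print(settings.problems())
```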
Connecting Redshift using S3 Bucket
Fill in your Amazon S3 credentials as follows:
- Bucket Name: Enter the bucket name.
- Key Name: Enter the key name (folder name).
- Role: Specify the IAM role that grants Redshift permission to access the S3 bucket. The IAM role must have the necessary permissions to perform COPY operations, which typically include:
  - s3:GetObject to access the files in the specified S3 bucket.
  - sts:AssumeRole to allow the Redshift cluster to assume the role.
- Secret Access Key: Enter the secret access key for Amazon Redshift.
- Access Key ID: Enter the access key ID for Amazon Redshift.
- Region: Select the region of the S3 bucket from the list of available regions.
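To see how the bucket, key, role, and region fit together, here is a rough sketch of the kind of Redshift COPY statement that loads staged files from S3. The exact statement Bold Data Hub issues is not documented here, so treat this as an illustration only: `build_copy_statement`, the table name, and the CSV format are assumptions.

```python
def build_copy_statement(table, bucket, key_name, role_arn, region, file_format="CSV"):
    """Build an illustrative Redshift COPY statement (hypothetical helper).

    bucket and key_name correspond to the Bucket Name and Key Name fields,
    role_arn to the Role field, and region to the Region field.
    """
    # Normalize the key so the path has exactly one slash on each side of it.
    s3_path = f"s3://{bucket}/{key_name.strip('/')}/"
    return (
        f"COPY {table}\n"
        f"FROM '{s3_path}'\n"
        f"IAM_ROLE '{role_arn}'\n"
        f"REGION '{region}'\n"
        f"FORMAT AS {file_format};"
    )


if __name__ == "__main__":
    # All values below are placeholders.
    print(build_copy_statement(
        table="public.orders",
        bucket="my-bucket",
        key_name="staging",
        role_arn="arn:aws:iam::123456789012:role/RedshiftCopyRole",
        region="us-east-1",
    ))
```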
How to Assign the Role
If a role hasn’t been created yet for Redshift to access S3, follow these steps:
- Create a new IAM role with the necessary S3 permissions (as described above).
- Attach this role to your Redshift cluster through the Redshift Console under Cluster Properties -> Cluster Permissions -> Manage IAM roles.
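The role described above needs two policy documents: a permissions policy that allows reading from the bucket, and a trust policy that lets the Redshift service assume the role. The sketch below generates both as JSON that can be pasted into the IAM console. The function names, bucket, and key are placeholders, and only the permissions named in this guide are included; your environment may require more.

```python
import json


def s3_read_policy(bucket, key_name):
    """Permissions policy: allow reading objects under the staging key (folder)."""
    return {
        "Version": "2012-10-17",
        "Statement": [{
            "Effect": "Allow",
            "Action": ["s3:GetObject"],
            "Resource": f"arn:aws:s3:::{bucket}/{key_name.strip('/')}/*",
        }],
    }


def redshift_trust_policy():
    """Trust policy: allow the Redshift service to assume the role."""
    return {
        "Version": "2012-10-17",
        "Statement": [{
            "Effect": "Allow",
            "Principal": {"Service": "redshift.amazonaws.com"},
            "Action": "sts:AssumeRole",
        }],
    }


if __name__ == "__main__":
    # "my-bucket" and "staging" are placeholder values.
    print(json.dumps(s3_read_policy("my-bucket", "staging"), indent=2))
    print(json.dumps(redshift_trust_policy(), indent=2))
```

Note that sts:AssumeRole belongs in the role's trust policy rather than in its S3 permissions policy, which is why the two documents are kept separate here.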
Retrieve the IAM Role Name
To retrieve the IAM role name required for the COPY command, follow these steps:
- Navigate to the IAM Console:
- Go to the AWS Management Console.
- In the search bar, type IAM and select IAM (Identity and Access Management) from the services list.
- Find the Role:
- In the IAM dashboard, click on Roles in the left-hand menu.
- Look for the role that was created to grant Redshift access to the S3 bucket. This role should be associated with your Redshift cluster.
- Copy the Role Name and ARN:
- In the IAM console, click on the role to open its details.
- You will find the Role name and the ARN (Amazon Resource Name) on the summary page of the role.
- The ARN format is:
arn:aws:iam::account-id:role/role-name
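A copied ARN is easy to truncate or mistype, so a quick format check can catch mistakes before the value is entered in the Role field. This is a minimal sketch: `parse_role_arn` is a hypothetical helper, and the pattern assumes a 12-digit AWS account ID and one of the standard partitions.

```python
import re

# arn:<partition>:iam::<12-digit account id>:role/<optional path/><role name>
ROLE_ARN_PATTERN = re.compile(r"^arn:(aws|aws-cn|aws-us-gov):iam::(\d{12}):role/(.+)$")


def parse_role_arn(arn):
    """Split an IAM role ARN into its parts, or raise ValueError if it is malformed."""
    match = ROLE_ARN_PATTERN.match(arn.strip())
    if match is None:
        raise ValueError(f"not a valid IAM role ARN: {arn!r}")
    partition, account_id, role_path = match.groups()
    return {
        "partition": partition,
        "account_id": account_id,
        # A role may live under a path; the role name is the last segment.
        "role_name": role_path.rsplit("/", 1)[-1],
    }


if __name__ == "__main__":
    # Placeholder account ID and role name.
    print(parse_role_arn("arn:aws:iam::123456789012:role/RedshiftCopyRole"))
```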
Step 5: Save the Credentials
Click Save to store the credentials.
Step 6: Add Pipeline
Click on Add Pipeline, give the pipeline a meaningful name, and click the tick icon or press Enter.
Note: In Bold BI, the data source will be created under the given pipeline name.
Step 7: Add Template Details
Click the pipeline, then choose the connector from which you want to extract data.
Click the Add Template button and fill in the details in the template.
Step 8: Save Template Details
Click the Save button. Choose the data store name from the dropdown and click Yes. Once saved, validation will be completed, and the pipeline will start.
Step 9: Check Logs
Navigate to the logs page to verify whether the data source has been created in Bold BI.
- Note: When the same pipeline is run again, a new property called isDropTable is added to the YAML template. By default, it is set to true.
  - If isDropTable is set to true, the existing table is dropped and recreated before the data is moved, ensuring that no duplicate rows are added.
  - If isDropTable is set to false, the data is moved again without deleting or modifying the existing data.
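The effect of the two isDropTable settings on repeated runs can be illustrated with a small simulation. This sketch uses an in-memory SQLite database as a stand-in for Redshift and a hypothetical `load_rows` function; it does not reflect how Bold Data Hub performs the load internally, only the resulting row counts.

```python
import sqlite3


def load_rows(conn, rows, is_drop_table=True):
    """Simulate one pipeline run and return the table's row count afterwards."""
    if is_drop_table:
        # isDropTable = true: start from an empty table on every run.
        conn.execute("DROP TABLE IF EXISTS orders")
    conn.execute("CREATE TABLE IF NOT EXISTS orders (id INTEGER, amount REAL)")
    conn.executemany("INSERT INTO orders VALUES (?, ?)", rows)
    conn.commit()
    return conn.execute("SELECT COUNT(*) FROM orders").fetchone()[0]


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    rows = [(1, 9.5), (2, 20.0)]
    print(load_rows(conn, rows, is_drop_table=True))   # first run
    print(load_rows(conn, rows, is_drop_table=True))   # rerun: table recreated, no duplicates
    print(load_rows(conn, rows, is_drop_table=False))  # rerun: same rows added again
```

With isDropTable set to false, rerunning the pipeline on unchanged source data leaves the earlier rows in place and adds the same rows a second time, so set it to false only when that is the intended behavior.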
Step 10: Verify Data Source
Check the data source in Bold BI.