Working with pipeline using custom query in Data Hub
Data Hub allows the creation of pipelines using custom queries to obtain aggregated or filtered data from a large database, rather than extracting all of the data.
How to create pipeline using custom query in Data Hub
Steps on how to create the pipeline using a custom query:
- Click Data Hub.
- Click ‘+’ icon to Add Pipeline and provide the name for the pipeline. Save the pipeline.
-
Select the newly created pipeline and choose the SQL / MySQL / Oracle / PostgreSQL connector. Double-click or click on the Add Template option to add the template.
-
Add Query under Properties and include the query. Please note that when using Query in the template, the Select parameter is used to create the table for storing the query result instead of extracting the data from the name given in it.
Sample YAML Template
version: 1.0.1
encrypt_credentials: false
direct_target_import: false
union_all_tables: true
add_dbname_column: false
direct_load_to_destination: true
plugins:
extractors:
- name: MSSQL
connectorname: MSSQL
schemaname: dbo
config:
host: localhost
port: 1433
username: sa
database: Retail
password: Syncfusion@123
drivername: mssql+pyodbc
driver: ODBC+Driver+17+for+SQL+Server
properties:
query: SELECT AVG(unit_cost) AS average_unit_cost FROM [Retail].[dbo].[Product_Details];
metadata:
select:
- aggregated_table
- Click Save and choose the desired destination to save the pipeline.
-
Now the pipeline will be saved and started automatically.
-
Click on Logs to see if the run is completed and a data source is created in Bold BI/Bold Reports.
-
An equivalent live data source will be created by Data Hub using the datastore details in Bold BI.
-
Click Edit DataSource Option to view the created tables.
-
To configure interval-based scheduling, click on the schedules tab and click on the schedule icon and configure it.
Limitations:
- Support only select queries.