Utilizing materialized views in Apache Doris
Materialized views in Apache Doris are different from those in traditional databases like PostgreSQL and MSSQL. They are pre-calculated datasets stored in a special table in Doris, which can significantly enhance dashboard performance by optimizing query execution. Bold BI does not offer a direct connector for Apache Doris. However, you can connect Apache Doris data to Bold BI using the MySQL connector.
Key Features of Materialized Views in Apache Doris
- Automatic Maintenance: Doris automatically maintains the data of the materialized view, ensuring consistency with the base table during import or delete operations.
- Supported Aggregate Functions:
- Version 0.12:
SUM
,MIN
,MAX
- Version 0.13:
COUNT
,BITMAP_UNION
,HLL_UNION
- Version 0.12:
- Query Matching: Once the materialized view is created, the user’s query remains unchanged, meaning it still refers to the base table. Doris will automatically choose the most suitable materialized view based on the query, retrieve data from it, and perform the calculations
- Users may utilize the EXPLAIN command to determine if the current query employs a materialized view.
Explain [query]
Managing Materialized Views
Checking Existing Materialized Views
To view the materialized views associated with a table and it’s structure, use the following command:
desc <Table_name> all;
Creating a Materialized View
To create a materialized view, use the CREATE MATERIALIZED VIEW
statement:
CREATE MATERIALIZED VIEW [MV name] AS [query];
- MV name: The name of the materialized view (must be unique for the same table).
- query: The SQL query that defines the data for the materialized view.
Deleting a Materialized View
If a materialized view is no longer needed, it can be deleted using the DROP
statement:
DROP MATERIALIZED VIEW <view_name> ON <table_name>;
Note: There is no Direct connector for Apache Doris in Bold BI. We can utilize the MySQL connector to connect the Apache Doris data in Bold BI.
Performance Improvement
The following dashboard was developed using data from Apache Doris and contains 1 billion rows. Find the dashboard loading metrics below for the dashboard with 1 billion data, with and without materialized views.
Without Materialized Views | With Materialized Views |
---|---|
2 mins | 12 secs |
NYC Taxi Dashboard:
Important Considerations
- Limitations on Aggregate Functions: The parameters of aggregate functions in materialized views must be single columns. For example,
SUM(a+b)
is not supported. - Unique Aggregate Functions: The same column cannot be used with different aggregate functions in a single materialized view. For instance,
SELECT SUM(a), MIN(a) FROM table
is not allowed. - Delete Operations: If a conditional column in a delete statement exists in the materialized view, the delete operation cannot proceed. The materialized view must be deleted first.
- Performance Impact: Having too many materialized views on a single table can slow down data imports. If a table has more than 10 materialized views, the import speed may be significantly affected.
- Unique Key Data Model: In the Unique Key data model, materialized views can only change the column order and cannot perform aggregation.