Articles in this section
Category / Section

Materialized Views in MySQL: Optimizing Dashboard Performance

Published:

Introduction to Materialized Views

What is a Materialized View?

A materialized view is a physical representation of the result set of a query that is stored in a table-like structure. It captures the output of a query and holds the data persistently, eliminating the need to recompute the results every time the query is executed. This pre-aggregated or pre-joined data is particularly beneficial when dealing with complex calculations and aggregations, as it significantly reduces the processing time required for each query.

Understanding the Benefits

  • Accelerated Query Performance: Materialized views in MySQL significantly boost query execution times by storing precomputed results. This leads to faster data retrieval, particularly beneficial when dealing with complex calculations and aggregations.

  • Reduced Resource Consumption: By persistently storing pre-aggregated data, materialized views alleviate the computational load on the database server. This reduction in resource consumption enhances overall system performance and scalability.

  • Support for Large Datasets: Materialized views are particularly advantageous when dealing with large datasets. They ensure that queries are performed on pre-aggregated data, making them well-suited for handling the computational demands of extensive and intricate datasets.

  • Optimized Reporting and Dashboard Performance: Materialized views play a crucial role in enhancing the performance of reporting tools and dashboards. They provide a reliable and quick source of data, contributing to the responsiveness of analytical and reporting applications.

Optimizing Dashboards with Materialized Views

Materialized views play a crucial role in enhancing dashboard performance for several reasons:

  • Query Optimization: By precomputing and storing the results of complex queries, materialized views eliminate the need for recalculating these results each time a user accesses the dashboard. This leads to significantly faster response times.

  • Reduced Load on Source Database: Dashboards often require aggregations and summaries of large datasets. Materialized views allow the offloading of resource-intensive computations from the main database, reducing the load on the source system and improving overall system performance.

  • Caching Frequently Accessed Data: Materialized views act as a cache for frequently accessed data, minimizing the need to execute resource-intensive queries repeatedly. This caching mechanism contributes to a more responsive and efficient dashboard experience.

  • Support for Historical Data: Materialized views are especially beneficial when dealing with historical data or snapshots at specific points in time. They provide a way to efficiently store and retrieve historical information, which is valuable for trend analysis and reporting.

  • Offline Analytics: Materialized views can be refreshed periodically, allowing for offline analytics on a precomputed dataset. This is particularly advantageous when real-time data is not critical, and users can work with the most recent precomputed results.

Example Dashboard Implementation

Please refer to the Marketing Analytics Dashboard mentioned in the article Optimizing Database Performance with Materialized Views and Pre-Aggregated Tables for further information. In this article, we will provide examples based on the dashboard.

image.png

Support for Materialized Views in MySQL

MySQL does not have native support for materialized views. However, some alternative approaches and workarounds can be used to achieve similar functionality.

  1. Triggers and Temporary Tables: Use triggers to capture changes in the underlying data and update temporary tables with the results of the desired queries.

  2. Scheduled Refresh with Stored Procedures: Utilize stored procedures scheduled with MySQL’s event scheduler to periodically refresh the materialized view based on the original query.

  3. External Tools: Employ third-party tools and frameworks, such as Flexviews, that offer materialized view support and handle the complexity of maintaining them.

Implementing Materialized Views with Summary Tables

In MySQL, creating materialized views can be achieved through the use of summary tables. Summary tables store aggregated or precomputed data that can be easily queried instead of recalculating the results each time a query is executed.

Steps to Create Materialized Views with Summary Tables:

  1. Identify the Query to Optimize:
    Begin by identifying the query or set of queries that you want to optimize using materialized views. These queries should involve aggregations or calculations that are resource-intensive and do not change frequently.

  2. Design the Summary Table:
    Create a summary table that mirrors the structure of the result set from the identified query. This table will store the precomputed results. Ensure that it is appropriately indexed to facilitate quick data retrieval.

Consider we have the following table named “marketing_details” in our MySQL database.

CREATE TABLE `marketing_details` (
 `date` timestamp(6) NULL DEFAULT NULL,
 `destinationurl` varchar(500) DEFAULT NULL,
 `adcost` int DEFAULT NULL,
 `visitors` int DEFAULT NULL,
 `visitors_after_landing_page` int DEFAULT NULL,
 `sessions` int DEFAULT NULL,
 `impressions` int DEFAULT NULL,
 `clicks` int DEFAULT NULL,
 `revenue` int DEFAULT NULL,
 `expense` int DEFAULT NULL,
 `channel_id` int DEFAULT NULL,
 `channel_name` varchar(100) DEFAULT NULL,
 `product_used` int DEFAULT NULL,
 `leads` int DEFAULT NULL,
 `wons` int DEFAULT NULL
);

Now, create the summary table with the required pre-aggregated fields and calculated fields:

CREATE TABLE marketing_details_mv (
 month_year VARCHAR(20),
 destinationurl VARCHAR(500),
 channel_name VARCHAR(100),
 total_unique_destinations INT,
 total_adcost INT,
 average_adcost DECIMAL(10, 2),
 total_visitors INT,
 total_visitors_after_landing_page INT,
 total_sessions INT,
 total_impressions INT,
 total_clicks INT,
 total_revenue INT,
 average_revenue DECIMAL(10, 2),
 total_expense INT,
 average_expense DECIMAL(10, 2),
 total_unique_channel_ids INT,
 total_unique_channels INT,
 total_unique_products_used INT,
 total_leads INT,
 average_leads DECIMAL(10, 2),
 total_wons INT,
 average_wons DECIMAL(10, 2),
 conversion_rate DECIMAL(10, 2),
 click_through_rate DECIMAL(10, 2),
 cost_per_click DECIMAL(10, 2),
 return_on_investment DECIMAL(10, 2),
 average_session_duration DECIMAL(10, 2),
 lead_conversion_rate DECIMAL(10, 2),
 average_revenue_per_visitor DECIMAL(10, 2),
 channel_engagement_rate DECIMAL(10, 2),
 PRIMARY KEY (month_year, destinationurl, channel_name)
);

Populate the summary table with the data from the raw table.

INSERT INTO marketing_details_mv
SELECT 
  DATE_FORMAT(date, '%b %Y') AS month_year,
  destinationurl,
  channel_name,
  COUNT(DISTINCT destinationurl) AS total_unique_destinations,
  SUM(adcost) AS total_adcost,
  AVG(adcost) AS average_adcost,
  SUM(visitors) AS total_visitors,
  SUM(visitors_after_landing_page) AS total_visitors_after_landing_page,
  SUM(sessions) AS total_sessions,
  SUM(impressions) AS total_impressions,
  SUM(clicks) AS total_clicks,
  SUM(revenue) AS total_revenue,
  AVG(revenue) AS average_revenue,
  SUM(expense) AS total_expense,
  AVG(expense) AS average_expense,
  COUNT(DISTINCT channel_id) AS total_unique_channel_ids,
  COUNT(DISTINCT channel_name) AS total_unique_channels,
  COUNT(DISTINCT product_used) AS total_unique_products_used,
  SUM(leads) AS total_leads,
  AVG(leads) AS average_leads,
  SUM(wons) AS total_wons,
  AVG(wons) AS average_wons,
  (SUM(leads) / SUM(visitors)) AS conversion_rate,
  (SUM(clicks) / SUM(impressions)) AS click_through_rate,
  (SUM(adcost) / NULLIF(SUM(clicks), 0)) AS cost_per_click,
  (SUM(revenue) / NULLIF(SUM(expense), 0)) AS return_on_investment,
  (SUM(sessions) / NULLIF(SUM(visitors), 0)) AS average_session_duration,
  (SUM(wons) / NULLIF(SUM(leads), 0)) AS lead_conversion_rate,
  (SUM(revenue) / NULLIF(SUM(visitors), 0)) AS average_revenue_per_visitor,
  (SUM(clicks + leads + wons) / NULLIF(SUM(impressions), 0)) AS channel_engagement_rate
FROM marketing_details
GROUP BY month_year, destinationurl, channel_name;

Refreshing Materialized Views

Materialized views contain precomputed data stored in summary tables, providing faster query performance. The need for refreshing arises as this data isn’t live; it reflects a snapshot at a specific point in time. Refreshing ensures that the materialized view stays up-to-date with changes in the underlying base tables. This process is crucial to maintain the accuracy and relevance of the summarized information.

For detailed instructions on different methods to refresh materialized views, please refer to the following sections.

On-Demand Refresh Strategies with Stored Procedures

A stored procedure can be employed to truncate and update the entire summary table of a materialized view. This approach is well-suited for scenarios where the underlying data changes less frequently, as it involves a complete refresh of the summary table.

Develop a stored procedure named “refresh_marketing_details_mv” in this scenario that truncates the existing summary table and repopulates it with the latest aggregated results from the base table.

DELIMITER $$

CREATE PROCEDURE refresh_marketing_details_mv (
   OUT rc INT
)
BEGIN

 -- Clear existing data in the materialized view
 TRUNCATE TABLE marketing_details_mv;

 -- Insert new data into the materialized view
 INSERT INTO marketing_details_mv
 SELECT 
    DATE_FORMAT(date, '%b %Y') AS month_year,
    destinationurl,
    channel_name,
    COUNT(DISTINCT destinationurl) AS total_unique_destinations,
    SUM(adcost) AS total_adcost,
    AVG(adcost) AS average_adcost,
    SUM(visitors) AS total_visitors,
    SUM(visitors_after_landing_page) AS total_visitors_after_landing_page,
    SUM(sessions) AS total_sessions,
    SUM(impressions) AS total_impressions,
    SUM(clicks) AS total_clicks,
    SUM(revenue) AS total_revenue,
    AVG(revenue) AS average_revenue,
    SUM(expense) AS total_expense,
    AVG(expense) AS average_expense,
    COUNT(DISTINCT channel_id) AS total_unique_channel_ids,
    COUNT(DISTINCT channel_name) AS total_unique_channels,
    COUNT(DISTINCT product_used) AS total_unique_products_used,
    SUM(leads) AS total_leads,
    AVG(leads) AS average_leads,
    SUM(wons) AS total_wons,
    AVG(wons) AS average_wons,
    (SUM(leads) / NULLIF(SUM(visitors), 0)) AS conversion_rate,
    (SUM(clicks) / NULLIF(SUM(impressions), 0)) AS click_through_rate,
    (SUM(adcost) / NULLIF(SUM(clicks), 0)) AS cost_per_click,
    (SUM(revenue) / NULLIF(SUM(expense), 0)) AS return_on_investment,
    (SUM(sessions) / NULLIF(SUM(visitors), 0)) AS average_session_duration,
    (SUM(wons) / NULLIF(SUM(leads), 0)) AS lead_conversion_rate,
    (SUM(revenue) / NULLIF(SUM(visitors), 0)) AS average_revenue_per_visitor,
    (SUM(clicks + leads + wons) / NULLIF(SUM(impressions), 0)) AS channel_engagement_rate
 FROM marketing_details
 GROUP BY month_year, destinationurl, channel_name;

 -- Set the return code
 SET rc = 0;

END;
$$

DELIMITER ;

Once you have created the stored procedure for refreshing the materialized view summary table, you can call it using the following steps:

CALL refresh_marketing_details_mv(@rc);

Periodic Refresh Using Scheduled Events

Scheduled events in MySQL are a mechanism for automating the execution of tasks at specified intervals. They allow you to schedule the execution of SQL statements, including stored procedures, to automate maintenance tasks, data updates, or other recurring operations.

Use the CREATE EVENT statement to define a scheduled event. Specify the event’s name, schedule, and the SQL statement or stored procedure to be executed.

Refer to the MySQL Create Events documentation for syntax:

CREATE
   [DEFINER = user]
   EVENT
   [IF NOT EXISTS]
   event_name
   ON SCHEDULE schedule
   [ON COMPLETION [NOT] PRESERVE]
   [ENABLE | DISABLE | DISABLE ON SLAVE]
   [COMMENT 'string']
   DO event_body;

schedule: {
   AT timestamp [+ INTERVAL interval] ...
 | EVERY interval
   [STARTS timestamp [+ INTERVAL interval] ...]
   [ENDS timestamp [+ INTERVAL interval] ...]
}

interval:
   quantity {YEAR | QUARTER | MONTH | DAY | HOUR | MINUTE |
             WEEK | SECOND | YEAR_MONTH | DAY_HOUR | DAY_MINUTE |
             DAY_SECOND | HOUR_MINUTE | HOUR_SECOND | MINUTE_SECOND}

Scheduled events are suitable when data changes are not too frequent, and periodic updates are sufficient to maintain the accuracy of the materialized view. Here’s an example for our use case:


-- Create the event to refresh the materialized view
DELIMITER $$

CREATE EVENT IF NOT EXISTS refresh_marketing_details_mv_event
ON SCHEDULE EVERY 1 HOUR
DO
BEGIN
 DECLARE rc INT;
 CALL refresh_marketing_details_mv(rc);
END $$

DELIMITER ;

Immediate Refresh Methods Using Triggers

In MySQL, a trigger is a set of instructions that automatically execute in response to specific events on a particular table or view. Triggers can be employed to capture and respond to row-level changes such as inserts, updates, or deletes. Leveraging triggers for materialized views involves updating the summary table whenever there is a relevant change in the underlying base tables.

Refer to the Triggers help documentation for syntax:

CREATE TRIGGER trigger_name
{BEFORE | AFTER} {INSERT | UPDATE | DELETE}
ON table_name
FOR EACH ROW
BEGIN
   -- Trigger body, containing SQL statements
END;

Develop triggers for BEFORE or AFTER each relevant event (INSERT, UPDATE, DELETE) on the base table. For instance:

CREATE TRIGGER after_insert_trigger
AFTER INSERT
ON marketing_details
FOR EACH ROW
BEGIN
   -- Logic to update the materialized view summary table
   
END;

Considerations and Limitations:

  • Data Freshness: Triggers provide real-time updates, ensuring materialized view accuracy immediately after row-level changes.
  • Complexity: Triggers can become complex, especially if the update logic involves intricate calculations or aggregations.
  • Resource Usage: Frequent row-level changes might lead to more frequent updates, potentially impacting database performance. Carefully evaluate the system load.
  • Visibility and Debugging: Triggers operate automatically, making their execution less visible. Implement logging mechanisms to monitor and troubleshoot trigger behavior.

Performance Comparison: Raw Tables vs. Materialized Views

We have constructed identical dashboards in two modes:

Dashboard using raw data, with KPIs created through Bold BI expressions and summary types.
Dashboard using pre-aggregated materialized views summary table, where required KPIs are calculated within the summary table.

Data Count (10 Million Records):

Query Execution Time (in seconds) - Widget Names:

Widget Name Widget Type Query execution time for Raw table Query execution time for summary table Total rows retrieved for the widget
Revenue vs Expenses KPI Card 00:00:50.598 00:00:00.005 1
Select Channel Combo Box 00:00:50.69 00:00:00.020 6
Select Period Combo Box 00:01:01.318 00:00:00.005 13
Conversion Rate Number Card 00:00:50.791 00:00:00.011 1
Cost Per Click Number Card 00:00:50.850 00:00:00.005 1
Total Visitors Numbe rCard 00:00:01.012 00:00:00.017 1
Average Revenue Per Visitor Channel wise Chart 00:01:26.660 00:00:00.010 6
Top 5 Channal Based on Engagement Rate Proportin Chart 00:04:27.858 00:00:00.005 5
Average Click Through Rate Number Card 00:00:50.712 00:00:00.011 1
Total Sessions Number Card 00:00:50.479 00:00:00.011 1
Average Leads Number Card 00:00:50.536 00:00:00.008 1
Average Session Duration Heat Map 00:01:34.131 00:00:00.013 36
Summary Details Grid 00:04:52.772 00:00:00.006 100

Summary

Utilizing materialized views with summary tables in MySQL brings about a substantial enhancement in dashboard performance, evident through notable reductions in query execution times for various widgets. The inherent nature of pre-aggregated tables within materialized views accelerates data retrieval, particularly in scenarios involving intricate calculations and aggregations. However, it is crucial to carefully weigh the trade-offs related to storage space and update frequency when implementing materialized views. The choice of refreshing mechanisms, whether through stored procedures, scheduled events, or triggers, plays a pivotal role and should align with the specific use case requirements and performance priorities. Selecting the most suitable approach ensures optimal performance while considering the frequency at which data changes and needs to be refreshed, thus tailoring the solution to meet the unique demands of each scenario.

Was this article useful?
Like
Dislike
Help us improve this page
Please provide feedback or comments
VS
Written by Venkataprasad Subramani
Updated
Comments (0)
Please  to leave a comment
Access denied
Access denied