Leveraging AWS Redshift Materialized Views for Telemetry Data Aggregation

In many industries, especially energy and utilities, efficiently managing large volumes of telemetry data from diverse IoT devices is critical for managing operations and making data-informed decisions. With more and more types of devices becoming interconnected every day, properly aggregating telemetry data becomes key to derive insights and ultimately gain value from your data.

Once you have this data streaming into your database, how to efficiently allow applications to access and display this data to users becomes challenging, especially when you want to see aggregated data across many devices. There are a few different aspects of this challenge:

  • As data volumes increase, performing large aggregation queries directly against base tables creates long load times for users, and is expensive in terms of resource requirements.
  • Storing aggregations of this data in additional tables has limitations and drawbacks, including:
    • Keeping base table data changes up to date in the aggregation tables can be complicated, especially when working with late-arriving data and reprocessing historical records.
    • Adapting the aggregation tables for changes in business logic and user requirements is time-consuming.

AWS Redshift Materialized Views for Telemetry Data Aggregation

First, let’s define what materialized views in AWS Redshift are. They are essentially stored query results, based on SQL queries performed on one or more base tables, which can then be queried by your application without having to access the base tables.

Materialized views offer a practical solution to the telemetry aggregation challenge for a few key reasons:

1. Faster Access

  • They store data that’s already aggregated. So, when your application needs data, it gets it quickly, without waiting for calculations each time or needing to query large volumes of data from base tables.

2. Smart Refreshes

  • For time series data from IoT devices that is continuously streaming, incremental refreshes are used to only perform calculations from new data, thus reducing resource requirements.
  • In the case that changes are made to historical data that has already been aggregated, the materialized view will automatically perform a full refresh to make sure everything is aligned.
  • Rather than having aggregations of base table data stored in other tables that need to be reconciled, materialized views provide a way to have a single source of truth.

3. Handling Changing Business Logic

  • For example, if you add a new type of sensor to your IoT network, updating your data aggregation to include this new source is straightforward. Adjust the view’s logic and refresh it to incorporate this new data stream.

Getting Started with Materialized Views

Here are the basic steps to keep in mind as you start using materialized views:

1. Identify Your Query Patterns

  • Look for where your application is performing queries that are both predictable and repeated.
  • Features like dashboards are usually a great fit for materialized views, while rarely used and changing calculations are not.

2. Build the Views

  • Create materialized views based on the query patterns you identified.
    • Use each materialized view for a specific narrow purpose. If you have another purpose, build a different view for it. Try not to bloat the view since that defeats the purpose of having a predictable result set.
  • Remember that you can define “materialized” not just in terms of base tables, but also in terms of other materialized views.
    • This strategy is beneficial for data aggregation use cases that involve many hierarchies.

3. Keep Them Updated

  • Use scheduled queries to refresh views at regular intervals.
  • Configure materialized views to auto-refresh when base tables are updated.
    • The auto-refresh operation can be configured to factor in the availability of cluster resources to reduce disruptions.

Final Thoughts

Using materialized views in AWS Redshift for aggregation of IoT device data is a great option to improve your application performance and keep your data consistent and up to date. We have helped clients achieve AWS cost savings of up to 40% through the implementation of a telemetry streaming architecture that leverages materialized views.

To make sure you’re getting the most value out of your IoT device data, selecting an AWS Advanced Tier Services Partner like 27Global is an excellent choice to provide you access to experts in telemetry data processing, aggregation, and insights. Contact one of our solution architects here.

Mladen Gajic is a Consulting Manager at 27Global’s Denver office. He has a background in energy and utilities and has successfully delivered software solutions across various industries.

Share this post