Amazon Redshift Serverless automatically scales compute capacity to match workload demands, measuring this capacity in Redshift Processing Units (RPUs). Although traditional scaling primarily responds to query queue times, the new AI-driven scaling and optimization feature offers a more sophisticated approach by considering multiple factors, including query complexity and data volume. Intelligent scaling addresses key data warehouse challenges by preventing both over-provisioning (paying for capacity you don't need) and under-provisioning (sacrificing performance to save costs), particularly for workloads that fluctuate with daily patterns or monthly cycles.

Amazon Redshift Serverless now offers enhanced flexibility in configuring workgroups through two primary methods. You can either set a base capacity, specifying the baseline RPUs for query execution (from 8 to 1024 RPUs, with each RPU providing 16 GB of memory), or you can opt for a price-performance target. With a price-performance target, Amazon Redshift Serverless AI-driven scaling and optimization adapts more precisely to diverse workload requirements and employs intelligent resource management, automatically adjusting resources during query execution for optimal performance. Consider using AI-driven scaling and optimization if your current workload requires 32 to 512 base RPUs; we don't recommend this feature for workloads below 32 or above 512 base RPUs.
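As a rough sketch of the second option, the following shows how a workgroup with a price-performance target might be configured through the AWS SDK for Python (boto3). The field names follow the Redshift Serverless CreateWorkgroup API, but treat the exact parameter shapes, and the workgroup and namespace names, as assumptions to verify against the current SDK documentation.

```python
# Hypothetical workgroup configuration with a price-performance target.
# Field names assumed from the Redshift Serverless CreateWorkgroup API;
# "reporting-wg" and "reporting-ns" are invented example names.
params = {
    "workgroupName": "reporting-wg",
    "namespaceName": "reporting-ns",
    # Slider position 50 corresponds to the Balanced profile.
    "pricePerformanceTarget": {"status": "ENABLED", "level": 50},
}

# With AWS credentials configured, the call would look like:
# import boto3
# boto3.client("redshift-serverless").create_workgroup(**params)
```

Setting base capacity instead would replace `pricePerformanceTarget` with a `baseCapacity` integer between 8 and 1024.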

In this post, we demonstrate how Amazon Redshift Serverless AI-driven scaling and optimization impacts performance and cost across different optimization profiles.

Options in AI-driven scaling and optimization

Amazon Redshift Serverless AI-driven scaling and optimization offers an intuitive slider interface that lets you balance price and performance goals. You can select from five optimization profiles, ranging from Optimized for Cost to Optimized for Performance, as shown in the following diagram. Your slider position determines how Amazon Redshift allocates resources and implements AI-driven scaling and optimizations to achieve your desired price-performance target.

Sliding bar

The slider offers the following options:

  1. Optimized for Cost (1)
    • Prioritizes cost savings over performance
    • Allocates minimum resources in favor of saving on costs
    • Best for workloads where performance isn’t time-critical
  2. Cost-Balanced (25)
    • Balances towards cost savings while maintaining reasonable performance
    • Allocates moderate resources
    • Suitable for mixed workloads with some flexibility in query time
  3. Balanced (50)
    • Provides equal emphasis on cost efficiency and performance
    • Allocates optimal resources for most use cases
    • Ideal for general-purpose workloads
  4. Performance-Balanced (75)
    • Favors performance while maintaining some cost control
    • Allocates additional resources when needed
    • Suitable for workloads requiring consistently fast query elapsed time
  5. Optimized for Performance (100)
    • Maximizes performance regardless of cost
    • Provides maximum available resources
    • Best for time-critical workloads requiring fastest possible query delivery
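The five stops above can be captured in a small lookup. The following is an illustrative sketch only (not a Redshift API), mapping a slider level to the nearest profile:

```python
# Slider positions and the optimization profiles they select, per the list above.
PROFILES = {
    1: "Optimized for Cost",
    25: "Cost-Balanced",
    50: "Balanced",
    75: "Performance-Balanced",
    100: "Optimized for Performance",
}

def profile_name(level: int) -> str:
    """Return the profile for a slider level, snapping to the nearest stop."""
    nearest = min(PROFILES, key=lambda stop: abs(stop - level))
    return PROFILES[nearest]

print(profile_name(50))  # Balanced
```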

Which workloads to consider for AI-driven scaling and optimizations

The Amazon Redshift Serverless AI-driven scaling and optimization capabilities can be applied to almost every analytical workload. Amazon Redshift will assess and apply optimizations according to your price-performance target—cost, balance, or performance.

Most analytical workloads operate on millions or even billions of rows and perform aggregations and complex calculations. These workloads have high variability in query patterns and query volume. Amazon Redshift Serverless AI-driven scaling and optimization can improve price, performance, or both because it learns the patterns (the repeatability of your workload) and allocates more resources toward performance improvements if you're performance-focused, or fewer resources if you're cost-focused.

Cost-effectiveness of AI-driven scaling and optimization

To determine the effectiveness of Amazon Redshift Serverless AI-driven scaling and optimization, you first need to measure your current state of price-performance. We encourage you to do this by using sys_query_history to calculate the total elapsed time of your workload, noting the start time and end time. Then use sys_serverless_usage to calculate the cost; you can use the query from the Amazon Redshift documentation with the same start and end times. This establishes your current price-performance and gives you a baseline to compare against.
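As a sketch of the arithmetic behind that baseline, the following assumes you have already pulled total elapsed seconds from sys_query_history and charged RPU-seconds from sys_serverless_usage for the same window. The per-RPU-hour price and the input numbers are placeholders; substitute your Region's current rate and your own measurements.

```python
def workload_cost(charged_rpu_seconds: float, price_per_rpu_hour: float) -> float:
    """Redshift Serverless bills by RPU-hour: charged RPU-seconds / 3600 * rate."""
    return charged_rpu_seconds / 3600 * price_per_rpu_hour

def price_performance(cost_usd: float, elapsed_seconds: float) -> float:
    """A simple baseline metric: dollars per hour of workload elapsed time."""
    return cost_usd / (elapsed_seconds / 3600)

# Placeholder numbers (not from the benchmark in this post):
cost = workload_cost(charged_rpu_seconds=1_152_000, price_per_rpu_hour=0.375)
print(round(cost, 2))                                            # 120.0
print(round(price_performance(cost, elapsed_seconds=7_200), 2))  # 60.0
```

Recording this pair of numbers before and after enabling AI-driven scaling and optimization makes the comparison concrete.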

If such a measurement isn't practical because your workloads run continuously and you can't determine a fixed start and end time, you can instead compare holistically: check your month-over-month cost, user sentiment toward performance and system stability, improvements in data delivery, and reductions in overall monthly processing times.

Benchmark conducted and results

We evaluated the optimization options using the TPC-DS 3 TB dataset from the AWS Labs GitHub repository (amazon-redshift-utils). We deployed this dataset across three Amazon Redshift Serverless workgroups configured as Optimized for Cost, Balanced, and Optimized for Performance. To create a realistic reporting environment, we configured three Amazon Elastic Compute Cloud (Amazon EC2) instances with JMeter (one per endpoint) and ran 15 selected TPC-DS queries concurrently for approximately 1 hour, as shown in the following screenshot.

We disabled the result cache to make sure Amazon Redshift Serverless ran all queries directly, providing accurate measurements. This setup helped us capture authentic performance characteristics across each optimization profile. Also, we designed our test environment without setting the Amazon Redshift Serverless workgroup max capacity parameter—a key configuration that controls the maximum RPUs available to your data warehouse. By removing this limit, we could clearly showcase how different configurations affect scaling behavior in our test endpoints.

JMeter

Our comprehensive test plan included running each of the 15 queries 355 times, generating 5,325 queries per test cycle. The AI-driven scaling and optimization needs multiple iterations to identify patterns and optimize RPUs, so we ran this workload 10 times. Through these repetitions, the AI learned and adapted its behavior, processing a total of 53,250 queries throughout our testing period.

The testing revealed how the AI-driven scaling and optimization system adapts and optimizes performance across three distinct configuration profiles: Optimized for Cost, Balanced, and Optimized for Performance.

Queries and elapsed time

Although we ran the same core workload repeatedly, we used variable parameters in JMeter to generate different values for the WHERE clause conditions. This approach created similar but not identical workloads, introducing natural variations that showed how the system handles real-world scenarios with varying query patterns.
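The effect of those JMeter variables can be sketched in a few lines of Python: the same query template is reused while the WHERE-clause values change per iteration. The template and value ranges below are invented for illustration, not the actual benchmark queries.

```python
import random

# Hypothetical TPC-DS-style template; only the WHERE values change per run.
TEMPLATE = (
    "SELECT ss_store_sk, SUM(ss_net_paid) "
    "FROM store_sales JOIN date_dim ON ss_sold_date_sk = d_date_sk "
    "WHERE d_year = {year} AND d_moy = {month} "
    "GROUP BY ss_store_sk"
)

def render_query(rng: random.Random) -> str:
    """Substitute randomized parameters, as JMeter variables would."""
    return TEMPLATE.format(year=rng.randint(1998, 2002), month=rng.randint(1, 12))

rng = random.Random(7)  # fixed seed so the variation is reproducible
queries = [render_query(rng) for _ in range(3)]
print(len(queries))  # similar-but-not-identical statements
```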

Our elapsed time analysis demonstrates how each configuration achieved its performance objectives, as reflected in the average consumption metrics for each endpoint shown in the following screenshot.

Average Elapsed Time per Endpoint

The results matched our expectations: the Optimized for Performance configuration delivered significant speed improvements, running queries approximately twice as fast as the Balanced configuration and four times as fast as the Optimized for Cost setup.

The following screenshots show the elapsed time breakdown for each test.

Optimized for Cost - Elapsed Time Balanced - Elapsed Time Optimized for Performance - Elapsed Time

The following screenshot of the tenth and final test iteration demonstrates distinct performance differences across configurations.

Per Configuration - Elapsed Time

To illustrate further, we categorized our query elapsed times into three groups:

  • Short queries – Less than 10 seconds
  • Medium queries – From 10 seconds to 10 minutes
  • Long queries – More than 10 minutes
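Sketched in code, the bucketing is a simple threshold check (thresholds from the list above; the sample elapsed times are placeholders, not the benchmark data):

```python
from collections import Counter

def bucket(elapsed_seconds: float) -> str:
    """Classify a query by elapsed time: short (<10 s), medium, or long (>10 min)."""
    if elapsed_seconds < 10:
        return "short"
    if elapsed_seconds <= 600:
        return "medium"
    return "long"

# Placeholder elapsed times in seconds.
sample = [2.5, 8.0, 45.0, 599.0, 601.0, 1200.0]
print(Counter(bucket(s) for s in sample))
```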

Considering our last test, the analysis shows:

| Duration per configuration | Optimized for Cost | Balanced | Optimized for Performance |
|---|---|---|---|
| Short queries (<10 sec) | 1488 | 1743 | 3290 |
| Medium queries (10 sec – 10 min) | 3633 | 3579 | 2035 |
| Long queries (>10 min) | 204 | 3 | 0 |
| TOTAL | 5325 | 5325 | 5325 |

The configuration’s capacity directly impacts query elapsed time. The Optimized for Cost configuration limits resources to save money, resulting in longer query times, making it best suited for workloads that aren’t time critical, where cost savings are prioritized. The Balanced configuration provides moderate resource allocation, striking a middle ground by effectively handling medium-duration queries and maintaining reasonable performance for short queries while nearly eliminating long-running queries. In contrast, the Optimized for Performance configuration allocates more resources, which increases costs but delivers faster query results, making it best for latency-sensitive workloads where query speed is critical.

Capacity used during the tests

Our comparison of the three configurations reveals how Amazon Redshift Serverless AI-driven scaling and optimization technology adapts resource allocation to meet user expectations. The monitoring showed both Base RPU variations and distinct scaling patterns across configurations—scaling up aggressively for faster performance or maintaining lower RPUs to optimize costs.

The Optimized for Cost configuration starts at 128 RPUs and increases to 256 RPUs after three tests. To maintain cost-efficiency, this setup limits the maximum RPU allocation during scaling, even when facing query queuing.

In the following table, we can observe the costs for this Optimized for Cost configuration.

| Test # | Starting RPUs | Scaled up to | Cost incurred |
|---|---|---|---|
| 1 | 128 | 1408 | $254.17 |
| 2 | 128 | 1408 | $258.39 |
| 3 | 128 | 1408 | $261.92 |
| 4 | 256 | 1408 | $245.57 |
| 5 | 256 | 1408 | $247.11 |
| 6 | 256 | 1408 | $257.25 |
| 7 | 256 | 1408 | $254.27 |
| 8 | 256 | 1408 | $254.27 |
| 9 | 256 | 1408 | $254.11 |
| 10 | 256 | 1408 | $256.15 |

The strategic RPU allocation by Amazon Redshift Serverless helps optimize costs, as demonstrated between tests 3 and 4, where the base RPU adjustment produced significant cost savings. This is shown in the following graph.

Optimized for Cost - Cost Average

Whereas the Optimized for Cost configuration changed its base RPUs, the Balanced configuration kept its base at 192 RPUs but scaled up to 2176 RPUs, beyond the 1408 RPU maximum used by the cost-optimized setup. The following table shows the figures for the Balanced configuration.

| Test # | Starting RPUs | Scaled up to | Cost incurred |
|---|---|---|---|
| 1 | 192 | 2176 | $261.48 |
| 2 | 192 | 2112 | $270.90 |
| 3 | 192 | 2112 | $265.26 |
| 4 | 192 | 2112 | $260.20 |
| 5 | 192 | 2112 | $262.12 |
| 6 | 192 | 2112 | $253.18 |
| 7 | 192 | 2112 | $272.80 |
| 8 | 192 | 2112 | $272.80 |
| 9 | 192 | 2112 | $263.72 |
| 10 | 192 | 2112 | $243.28 |

The Balanced configuration, averaging $262.57 per test, delivered significantly better performance while costing only 3% more than the Optimized for Cost configuration, which averaged $254.32 per test. As demonstrated in the previous section, this performance advantage is evident in the elapsed time comparisons. The following graph shows the costs for the Balanced configuration.

Balanced - Cost Average

As expected, the Optimized for Performance configuration used more resources to deliver higher performance. In this configuration, we can also observe that after two tests, the engine adapted itself to start with a higher number of RPUs to serve queries faster.

| Test # | Starting RPUs | Scaled up to | Cost incurred |
|---|---|---|---|
| 1 | 512 | 2753 | $295.07 |
| 2 | 512 | 2327 | $280.29 |
| 3 | 768 | 2560 | $333.52 |
| 4 | 768 | 2991 | $295.36 |
| 5 | 768 | 2479 | $308.72 |
| 6 | 768 | 2816 | $324.08 |
| 7 | 768 | 2413 | $300.45 |
| 8 | 768 | 2413 | $300.45 |
| 9 | 768 | 2107 | $321.07 |
| 10 | 768 | 2304 | $284.93 |

Despite a roughly 19% cost increase in the third test, costs in subsequent tests mostly settled back around or below the $304.39 average.

Optimized for Performance - Cost Average

The Optimized for Performance configuration maximizes resource usage to achieve faster query times, prioritizing speed over cost efficiency.

The final cost-performance analysis reveals compelling results:

  • The Balanced configuration delivered twofold better performance while costing only 3.25% more than the Optimized for Cost setup
  • The Optimized for Performance configuration achieved fourfold faster elapsed time with a 19.69% cost increase compared to the Optimized for Cost option
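Those percentages fall directly out of the per-test cost tables above. Reproducing the arithmetic (costs copied from the tables):

```python
# Per-test costs for each configuration, from the tables above.
cost_optimized = [254.17, 258.39, 261.92, 245.57, 247.11, 257.25, 254.27, 254.27, 254.11, 256.15]
balanced       = [261.48, 270.90, 265.26, 260.20, 262.12, 253.18, 272.80, 272.80, 263.72, 243.28]
performance    = [295.07, 280.29, 333.52, 295.36, 308.72, 324.08, 300.45, 300.45, 321.07, 284.93]

def avg(xs):
    return sum(xs) / len(xs)

def pct_increase(new, base):
    """Percentage cost increase of one configuration over another."""
    return (new / base - 1) * 100

base = avg(cost_optimized)                             # ≈ $254.32 per test
print(round(pct_increase(avg(balanced), base), 2))     # ≈ 3.25
print(round(pct_increase(avg(performance), base), 2))  # ≈ 19.69
```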

The following chart illustrates our cost-performance findings:

Average Billing and Elapsed Time per Endpoint

It’s important to note that these results reflect our specific test scenario. Each workload has unique characteristics, and the performance and cost differences between configurations might vary significantly in other use cases. Our findings serve as a reference point rather than a universal benchmark. Additionally, we didn’t test the two intermediate configurations available in Amazon Redshift Serverless: Cost-Balanced (between Optimized for Cost and Balanced) and Performance-Balanced (between Balanced and Optimized for Performance).

Conclusion

The test results demonstrate the effectiveness of Amazon Redshift Serverless AI-driven scaling and optimization across different workload requirements. These findings highlight how Amazon Redshift Serverless AI-driven scaling and optimization can help organizations find their ideal balance between cost and performance. Although our test results serve as a reference point, each organization should evaluate their specific workload requirements and price-performance targets. The flexibility of five different optimization profiles, combined with intelligent resource allocation, enables teams to fine-tune their data warehouse operations for optimal efficiency.

To get started with Amazon Redshift Serverless AI-driven scaling and optimization, we recommend:

  1. Establishing your current price-performance baseline
  2. Identifying your workload patterns and requirements
  3. Testing different optimization profiles with your specific workloads
  4. Monitoring and adjusting based on your results

By using these capabilities, organizations can achieve better resource utilization while meeting their specific performance and cost objectives.

Ready to optimize your Amazon Redshift Serverless workloads? Visit the AWS Management Console today to create an Amazon Redshift Serverless workgroup with AI-driven scaling and optimization and start exploring the different optimization profiles. For more information, check out our documentation on Amazon Redshift Serverless AI-driven scaling and optimization, or contact your AWS account team to discuss your specific use case.


About the Authors

Ricardo Serafim is a Senior Analytics Specialist Solutions Architect at AWS. He has been helping companies with Data Warehouse solutions since 2007.

Milind Oke is a Data Warehouse Specialist Solutions Architect based out of New York. He has been building data warehouse solutions for over 15 years and specializes in Amazon Redshift.

Andre Hass is a Senior Technical Account Manager at AWS, specializing in AWS Data Analytics workloads. With more than 20 years of experience in databases and data analytics, he helps customers optimize their data solutions and navigate complex technical challenges. When not immersed in the world of data, Andre can be found pursuing his passion for outdoor adventures. He enjoys camping, hiking, and exploring new destinations with his family on weekends or whenever an opportunity arises.