Parallel Queries with Amazon Aurora

Updated: Mar 26, 2019

Parallel query execution is my favorite non-existent feature in MySQL. In all versions of MySQL, at least at the time of writing, a single query runs in one thread, effectively using only one CPU core. Multiple queries running at the same time use different threads and can utilize more than one CPU core.

On multi-core machines, which is the majority of hardware nowadays, and in the cloud, we have multiple cores available for use. With faster disks (e.g. SSD) we can’t utilize the full potential of IOPS with just one thread.

AWS Aurora (based on MySQL 5.6) now has a version which supports parallelism for SELECT queries (utilizing the read capacity of the storage nodes underneath the Aurora cluster). In this article, we will look at how this can improve reporting/analytical query performance in MySQL. I will compare AWS Aurora with MySQL (Percona Server) 5.6 running on an EC2 instance of the same class.

In Brief

Aurora Parallel Query response time (for queries which cannot use indexes) can be 5x-10x better compared to non-parallel, fully cached operations. This is a significant improvement for slow queries.

Test data and types

For my test, I need to choose:

Aurora instance type and comparison
Dataset
Queries

Aurora instance type and comparison

According to Jeff Barr’s excellent article (https://aws.amazon.com/blogs/aws/new-parallel-query-for-amazon-aurora/), the following instance classes support parallel query (PQ):

“The instance class determines the number of parallel queries that can be active at a given time:

db.r*.large – 1 parallel query session

db.r*.xlarge – 2 parallel query sessions

db.r*.2xlarge – 4 parallel query sessions

db.r*.4xlarge – 8 parallel query sessions

db.r*.8xlarge – 16 parallel query sessions

db.r4.16xlarge – 16 parallel query sessions”

As I want to maximize the concurrency of parallel query sessions, I have chosen db.r4.8xlarge. For the EC2 instance I will use the same class: r4.8xlarge.
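The quoted limits can be written down as a small lookup, which is handy when sizing a test. This is only a sketch: the table copies the numbers from the quote above, while the names `PQ_SESSIONS_BY_SIZE` and `max_pq_sessions` are hypothetical helpers and not part of any AWS API.

```python
# Parallel query sessions per instance size, copied from the quoted list.
# Note: the quote lists 16xlarge only for db.r4; this sketch does not check
# whether a given family/size combination actually exists.
PQ_SESSIONS_BY_SIZE = {
    "large": 1,
    "xlarge": 2,
    "2xlarge": 4,
    "4xlarge": 8,
    "8xlarge": 16,
    "16xlarge": 16,
}


def max_pq_sessions(instance_class: str) -> int:
    """Return the parallel query session limit for a class such as 'db.r4.8xlarge'."""
    parts = instance_class.split(".")
    # Expect exactly three parts: 'db', a family starting with 'r', and a size.
    if len(parts) != 3 or parts[0] != "db" or not parts[1].startswith("r"):
        raise ValueError(f"not a db.r* instance class: {instance_class!r}")
    size = parts[2]
    if size not in PQ_SESSIONS_BY_SIZE:
        raise ValueError(f"size not in the quoted list: {size!r}")
    return PQ_SESSIONS_BY_SIZE[size]


if __name__ == "__main__":
    print(max_pq_sessions("db.r4.8xlarge"))
```

For the class chosen here, `max_pq_sessions("db.r4.8xlarge")` returns 16, the same limit the list gives for db.r4.16xlarge.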

Aurora:

MySQL on EC2:

Dataset

I’m using the “Airlines On-Time Performance” database from http://www.transtats.bts.gov/DL_SelectFields.asp?Table_ID=236&DB_Short_Name=On-Time (You can find the scripts I used here: https://github.com/Percona-Lab/ontime-airline-performance).

Working with Aurora Parallel Query

Documentation: https://docs.aws.amazon.com/AmazonRDS/latest/AuroraUserGuide/aurora-mysql-parallel-query.html

Aurora PQ works by doing a full table scan (parallel reads are done on the storage level). The InnoDB buffer pool is not used when Parallel Query is utilized.

For the purposes of the test I turned PQ on and off (normally AWS Aurora uses its own heuristics to determine if PQ will be helpful or not):

Turn on and force
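As a minimal sketch of such a toggle, assuming the session variables `aurora_pq` and `aurora_pq_force` described in the Aurora documentation linked above for MySQL 5.6-compatible clusters (variable names can differ between Aurora versions, so check the documentation for your engine version), a small helper can build the statements to run on a connection. The function name `pq_session_statements` is hypothetical; it only returns SQL strings and does not talk to a database.

```python
from typing import List


def pq_session_statements(enabled: bool, force: bool = False) -> List[str]:
    """Build SQL statements that switch Parallel Query for the current session.

    Assumption: the cluster exposes the aurora_pq and aurora_pq_force
    session variables (Aurora MySQL 5.6-compatible).
    """
    if not enabled:
        # Turn PQ off for this session.
        return ["SET SESSION aurora_pq = 0"]
    # Turn PQ on; with force=True the optimizer heuristics are overridden,
    # which is meant for testing only.
    return [
        "SET SESSION aurora_pq = 1",
        f"SET SESSION aurora_pq_force = {1 if force else 0}",
    ]


if __name__ == "__main__":
    for stmt in pq_session_statements(enabled=True, force=True):
        print(stmt + ";")
```

The returned statements can be executed with any MySQL client or driver before running the test query; calling the helper with `enabled=False` gives the statement to turn PQ back off.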
