
Parallel Queries with Amazon Aurora

Updated: Mar 26, 2019

Parallel query execution is my favorite non-existent feature in MySQL. In all versions of MySQL, at least at the time of writing, a single query runs in one thread, effectively using only one CPU core. Multiple queries running at the same time use different threads and can use more than one CPU core.


On multi-core machines, which is the majority of hardware nowadays, and in the cloud, we have multiple cores available for use. With faster disks (i.e. SSD) we can’t utilize the full potential of IOPS with just one thread.


AWS Aurora (based on MySQL 5.7) now has a version which supports parallelism for SELECT queries (utilizing the read capacity of the storage nodes underneath the Aurora cluster). In this article, we will look at how this can improve reporting/analytical query performance in MySQL. I will compare AWS Aurora with MySQL (Percona Server) 5.6 running on an EC2 instance of the same class.


In Brief

Aurora Parallel Query response time (for queries which cannot use indexes) can be 5x-10x better compared to non-parallel, fully cached operations. This is a significant improvement for slow queries.


Test data and types

For my test, I need to choose:

  • Aurora instance type and comparison

  • Dataset

  • Queries


Aurora instance type and comparison

According to Jeff Barr’s excellent article (https://aws.amazon.com/blogs/aws/new-parallel-query-for-amazon-aurora/), the following instance classes support parallel query (PQ):


“The instance class limits the number of parallel queries that can be active at a given time:

db.r*.large – 1 parallel query session

db.r*.xlarge – 2 parallel query sessions

db.r*.2xlarge – 4 parallel query sessions

db.r*.4xlarge – 8 parallel query sessions

db.r*.8xlarge – 16 parallel query sessions

db.r4.16xlarge – 16 parallel query sessions”


As I wanted to maximize the concurrency of parallel query sessions, I have chosen db.r4.8xlarge. For the EC2 instance I will use the same class: r4.8xlarge.


Aurora:


MySQL on EC2:


Table

I’m using the “Airlines On-Time Performance” database from http://www.transtats.bts.gov/DL_SelectFields.asp?Table_ID=236&DB_Short_Name=On-Time (you can find the scripts I used here: https://github.com/Percona-Lab/ontime-airline-performance).


Working with Aurora Parallel Query

Documentation: https://docs.aws.amazon.com/AmazonRDS/latest/AuroraUserGuide/aurora-mysql-parallel-query.html


Aurora PQ works by doing a full table scan (the parallel reads are done at the storage level). The InnoDB buffer pool is not used when Parallel Query is utilized.
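
As a rough sketch of how to check whether a given statement would be executed with a parallel scan, EXPLAIN can be run against it. The table name `ontime` and the columns `Carrier` and `DepDelay` are assumptions for illustration only (this post does not confirm them); per the AWS documentation linked above, a parallel plan is reported in the Extra column as "Using parallel query".

```sql
-- Hypothetical table and column names; substitute your own.
-- DepDelay is assumed to be unindexed, so the filter needs a full table scan.
EXPLAIN
SELECT Carrier, COUNT(*) AS delayed_flights
FROM ontime
WHERE DepDelay > 60
GROUP BY Carrier;
```

If the Extra column does not mention parallel query, the optimizer has chosen the regular (buffer pool based) execution path for that statement.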


For the purposes of the test I turned PQ on and off (normally AWS Aurora uses its own heuristics to determine whether PQ will be helpful or not):

Turn on and force
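
A minimal sketch of those settings, assuming the session variables `aurora_pq` and `aurora_pq_force` described in the AWS documentation for the Aurora MySQL releases available at the time of writing (newer releases may expose different variable names, so check the documentation for your version):

```sql
-- Turn parallel query on for this session and force it,
-- bypassing Aurora's own heuristics (intended for testing only).
SET SESSION aurora_pq = 1;
SET SESSION aurora_pq_force = 1;

-- Turn parallel query off again for the non-parallel baseline.
SET SESSION aurora_pq_force = 0;
SET SESSION aurora_pq = 0;
```

Forcing PQ is only useful for comparisons like this one; in normal operation it is usually better to leave the decision to Aurora.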
