Cloud computing provides us with the ability to deploy infrastructure as code. Not so long ago, to deploy a new database server you had to buy physical hardware: a hard disk, CPU, RAM, power supply, etc. Afterward, you had to install the database server, make it run once the machine starts, provision replicas, set up a backup policy and allocate space for it, handle errors, and routinely update it with security patches and new versions. Wow, it took a while even to write what’s required if you manage databases yourself. Just think about how much work is required to actually do it.
AWS RDS in a Nutshell
Amazon RDS is a database as a service (DBaaS) that gives you a working database in the cloud while reducing about 90 percent of the setup and maintenance overhead. In this post I’ll explain what Amazon RDS is and how it differs from AWS EC2, and I’ll go into detail about the different types of RDS instances and how to choose the right one.
What Is AWS RDS
AWS RDS supports most of the popular relational databases and their latest versions: MySQL, MariaDB, Oracle, PostgreSQL, and Microsoft SQL Server. In most cases, this makes migrating from an on-premises database simple. It provides out-of-the-box database provisioning (with a DB version of your choice), high availability and redundancy (by deploying replicas of the DB in different physical data centers with automatic data synchronization and failover), automatic backups, manual snapshots, etc.
It also provides some additional benefits over self-hosted databases such as the following:
- Pay-as-you-go pricing. As with other AWS services, you only pay for what you use. You can seamlessly start small and grow big over time as needed. But if you manage your own database, you’ll pay a fixed cost no matter how much you use it, and scaling an on-premises DB requires a great deal of effort.
- Deep integration with other AWS products. AWS RDS integrates very well with other AWS products. For instance, you can use Amazon CloudWatch to monitor your RDS instances, AWS IAM to control access to them, and Amazon VPC to manage the internal networking.
- Easy analysis of your data via external services that provide insights, such as Scalyr Event Data Cloud.
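To make the provisioning features above more concrete, here is a minimal sketch of how such a database might be described as code. It only builds the keyword arguments for boto3’s `create_db_instance` call and does not contact AWS. The helper name `build_db_instance_request`, the default values, and the identifier `orders-db` are assumptions made for illustration.

```python
def build_db_instance_request(
    identifier,
    engine="postgres",
    instance_class="db.t3.micro",
    storage_gib=20,
    multi_az=True,
    backup_retention_days=7,
    master_username="app_admin",
):
    """Return keyword arguments shaped for boto3's RDS create_db_instance call.

    Hypothetical helper: the defaults are illustrative, not recommendations.
    """
    if storage_gib <= 0:
        raise ValueError("storage_gib must be positive")
    # RDS keeps automated backups for 0 (disabled) to 35 days.
    if not 0 <= backup_retention_days <= 35:
        raise ValueError("backup_retention_days must be between 0 and 35")
    return {
        "DBInstanceIdentifier": identifier,
        "Engine": engine,
        "DBInstanceClass": instance_class,
        "AllocatedStorage": storage_gib,
        "StorageType": "gp2",  # general purpose SSD
        "MultiAZ": multi_az,  # standby replica in another availability zone
        "BackupRetentionPeriod": backup_retention_days,
        "MasterUsername": master_username,
    }


request = build_db_instance_request("orders-db")

# With AWS credentials configured, the request could be submitted like this
# (the password should come from your own secret store):
#   import boto3
#   boto3.client("rds").create_db_instance(
#       **request, MasterUserPassword=password_from_your_secret_store
#   )
```

Keeping the request in a plain dictionary makes it easy to review and version alongside the rest of your infrastructure code.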
What Is the Difference Between EC2 and RDS
EC2 is a general purpose compute instance that provides the equivalent of a (usually) Linux virtual machine with a CPU, RAM, and storage with SSH access. On the other hand, RDS is a dedicated database service. You don’t get SSH access to the underlying machines that run your database, and you (usually) can’t tune them, but you get all the goodies described above. It’s possible to deploy a database server in EC2, but it will mostly be like deploying a database on premises. You won’t get the service aspects of RDS as defined above (backups, replicas, monitoring, etc.).
AWS RDS Instance Types
The DB instance is the basic unit of work in your RDS workload. Each instance can be viewed as a dedicated VM for database purposes. It can host multiple database schemas. Each instance has a tag (name) and a DNS entry. You can select one of three types of storage for each instance: general purpose SSD, provisioned IOPS SSD, and legacy magnetic storage (more on this later).
Amazon RDS supports three types of instance classes: Standard, Memory Optimized, and Burstable Performance. Instance types comprise varying combinations of CPU, memory, storage, and networking capacity and give you the flexibility to choose the appropriate mix of resources for your database. Each instance type includes several instance sizes, allowing you to scale your database to the requirements of your target workload.
Why It’s Important to Choose the Right Type of Instance
The first reason is price. Choosing an instance that is too large will cost money that could have been spent more productively elsewhere. Another reason is that, as stated above, choosing the proper type of instance is important to meet your database workload needs. For instance, if you are reading a lot of rows from the database (hundreds of thousands), those are stored in memory before they are sent to the client. For this kind of workload, you’ll want to have a memory optimized instance. Choosing the wrong type of instance can result in increased latency, service denial, and a generally bad customer experience. In the worst case, it can cause database commits to fail, which will result in corruption of data and data loss. Last but not least, not all databases listed above are available on all instance types.
Old RDS Instance Types vs. New RDS Instance Types
Each class of instance (burstable, memory optimized, etc.) contains newer types of instances and older (legacy) types. As a rule of thumb, it’s better to choose the latest types as they usually run on better hardware and provide better performance for the same price. For burstable instances it’s better to choose the T3 over the T2, for general purpose types it’s better to choose the M5 over the M4, and so on.
Burstable Performance Instances (T Type Instances)
Burstable performance instances are ideal for workloads that usually require a constant baseline performance with occasional peaks in demand for a limited time. For instance, an e-commerce site will usually have a relatively constant load on the database that will increase dramatically on Cyber Monday and Black Friday. For this kind of workload, the T type instances are ideal as they provide a baseline performance with the ability to “burst” when needed. Note, however, that only the CPU is burstable: CPU performance will increase with spikes in read/write operations, but the RAM, network, and storage capacity won’t. So you still need to choose an instance that’s large enough to handle such spikes.
When Is Burst Needed
In most cases, burst performance is handled out of the box. T type instances can handle occasional spikes automatically without needing to burst if the average performance is at or below the baseline CPU level for a twenty-four-hour window. In other words, if during a twenty-four-hour period you had a spike that lasted thirty minutes, in most cases the instance won’t need to burst. On the other hand, if you have a five-day period of increased usage, burst performance will be needed to cope with this load.
As stated in the documentation: “T type instances’ baseline performance and ability to burst are governed by CPU Credits. Each T type instance receives CPU Credits continuously, the rate of which depends on the instance size. T type instances accrue CPU Credits when they are idle, and use CPU credits when they are active. A CPU Credit provides the performance of a full CPU core for one minute.” To put it in simpler terms, you build up a “burst budget” while your instance is underused. The further your usage stays below the baseline, the more CPU credits you’ll save for future burst activity. If the instance does not use the credits it receives, they’ll be stored in the credit balance up to a maximum that depends on the instance size (for example, 288 CPU Credits for a db.t3.micro).
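The credit mechanics quoted above can be sketched as a small hour-by-hour simulation. This is a simplified model, not AWS’s exact accounting: the function name `simulate_credit_balance` is hypothetical, and the default figures (12 credits earned per hour, a cap of 288 credits, 2 vCPUs) are assumptions that roughly match a micro-sized T3 instance.

```python
def simulate_credit_balance(
    hourly_utilization,
    credits_per_hour=12.0,
    max_credits=288.0,
    vcpus=2,
    start_balance=0.0,
):
    """Return the CPU credit balance at the end of each simulated hour.

    hourly_utilization holds one average CPU utilization value per hour,
    expressed as a fraction between 0.0 and 1.0 across all vCPUs.
    """
    balance = start_balance
    history = []
    for utilization in hourly_utilization:
        # One credit = one vCPU at 100% for one minute.
        spent = utilization * vcpus * 60
        balance = balance + credits_per_hour - spent
        # The balance cannot go below zero or above the cap. In this model,
        # hitting zero means the instance can no longer burst for free.
        balance = max(0.0, min(max_credits, balance))
        history.append(balance)
    return history


# A day of idling fills the balance, then one hour at 100% CPU drains part of it.
day = [0.0] * 24 + [1.0]
print(simulate_credit_balance(day)[-1])
```

Under these assumptions the break-even utilization is 12 / (2 × 60) = 10 percent: below that the balance grows, above it the balance shrinks.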
The Problem With Burst Credits
Truth be told, this is a difficult calculation to make. The number of credits that accumulate in a certain time frame changes per instance size and can change for each of your instances. In addition, note that for the old type T2 instance, if you are out of credits, you can’t pay for whatever burst you might need. With the new T3 instances, if you are out of your burst budget, you can just pay for it with an additional fee.
When Is This Useful?
If you don’t want to actively monitor your burst budget and instance usage, a T2 or T3 instance is only useful if you know for sure that it’s underutilized most of the time and that you won’t need to cope with a significant load for a few days, like with the e-commerce example above.
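If you do decide to monitor the burst budget, CloudWatch exposes it as the `CPUCreditBalance` metric for RDS instances. The sketch below only builds the arguments for boto3’s CloudWatch `get_metric_statistics` call and does not contact AWS. The helper name `build_credit_balance_query`, the five-minute period, and the identifier `orders-db` are assumptions made for illustration.

```python
from datetime import datetime, timedelta, timezone


def build_credit_balance_query(db_instance_identifier, hours=24, now=None):
    """Return keyword arguments for CloudWatch get_metric_statistics.

    Hypothetical helper: asks for the lowest credit balance seen in each
    five-minute window over the last `hours` hours.
    """
    end = now or datetime.now(timezone.utc)
    start = end - timedelta(hours=hours)
    return {
        "Namespace": "AWS/RDS",
        "MetricName": "CPUCreditBalance",
        "Dimensions": [
            {"Name": "DBInstanceIdentifier", "Value": db_instance_identifier}
        ],
        "StartTime": start,
        "EndTime": end,
        "Period": 300,  # seconds
        "Statistics": ["Minimum"],
    }


query = build_credit_balance_query("orders-db")

# With AWS credentials configured, the query could be run like this:
#   import boto3
#   response = boto3.client("cloudwatch").get_metric_statistics(**query)
#   lowest = min(point["Minimum"] for point in response["Datapoints"])
```

A balance that keeps trending toward zero is a hint that the workload has outgrown a burstable instance.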
General Purpose Instances
General purpose instances provide a balance of compute, memory, and network resources. They are a good choice for standard database workloads. The general purpose instances are split into two main subtypes: the M5 and the M6g.
M5 is the latest generation of the general purpose instances running regular Xeon CPUs. It replaces the M4 subtype instances. This should be the default choice when you don’t have any specific hardware requirements.
The M6g subtype boasts the new AWS custom-built Graviton2 processors. Amazon claims that the Graviton2 processors provide 40 percent better performance than the standard Xeon CPUs for the same instance size. However, this claim has been contested in recent benchmarks. Moreover, because real-world database performance depends not only on the CPU but even more on RAM, disk, and network performance, you might not notice any performance gains. Finally, before switching to a Graviton2 CPU, make sure that your database engine of choice and the corresponding database version support it.
Memory Optimized Instances
Memory optimized instances are best for scenarios where you need to load a lot of rows (hundreds of thousands) into memory, such as after running a SELECT query on a really large table. The main subtypes for this kind of instance are the R6g and the R5. Additional subtypes exist (such as the Z1d), but those are for niche use cases and won’t be discussed here.
The R6g is the same as the M6g described above but includes a lot more memory. For example, the basic instance comes with 16 GB of RAM. However, the same caveats described regarding the M6g hold true here as well.
The R5 is the same as the M5 described above, but it’s memory optimized. It starts with 16 GB of RAM for the smallest instance.
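The guidance on the three instance classes can be condensed into a rough rule of thumb. The sketch below is only a summary of this post’s advice, not an official sizing method. The function name `suggest_instance_class` and its parameters are hypothetical.

```python
def suggest_instance_class(
    loads_large_result_sets,
    mostly_below_baseline,
    sustained_load_days=0,
):
    """Suggest an RDS instance class from a few coarse workload traits.

    loads_large_result_sets: queries regularly pull hundreds of thousands
        of rows into memory.
    mostly_below_baseline: CPU usage usually sits at or below the baseline.
    sustained_load_days: longest expected stretch of elevated load, in days.
    """
    if loads_large_result_sets:
        return "memory optimized (R5 / R6g)"
    if mostly_below_baseline and sustained_load_days < 1:
        return "burstable (T3)"
    # Default when there are no specific hardware requirements.
    return "general purpose (M5 / M6g)"


print(suggest_instance_class(False, True))       # short spikes only
print(suggest_instance_class(False, True, 5))    # multi-day peaks
print(suggest_instance_class(True, False))       # large result sets
```

Treat the result as a starting point, and confirm it against real usage metrics before committing to an instance size.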
For each of the instance types, there are three types of storage available: general purpose SSD storage, provisioned IOPS SSD storage, and magnetic storage.
General Purpose SSD Storage
General purpose SSD storage is the standard storage for most workloads and use cases. As stated in the documentation regarding the performance of general purpose SSD: “Baseline I/O performance for General Purpose SSD storage is 3 IOPS for each GiB, with a minimum of 100 IOPS. This relationship means that larger volumes have better performance. For example, baseline performance for a 100-GiB volume is 300 IOPS. Baseline performance for a 1-TiB volume is 3,000 IOPS.” Like the T type instances with their burstable CPU performance, general purpose SSD storage has burst capabilities as well. However, as with the T type instances, the calculation of the “burstable budget” is tricky and hence not very useful for most organizations.
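The baseline rule quoted from the documentation is simple enough to express directly. This sketch covers only the quoted relationship (3 IOPS per GiB with a floor of 100 IOPS) and deliberately ignores burst behavior and any upper limits. The function name `gp2_baseline_iops` is hypothetical.

```python
def gp2_baseline_iops(volume_gib):
    """Baseline IOPS for a general purpose SSD volume of the given size.

    Implements only the quoted rule: 3 IOPS per GiB, never below 100.
    """
    if volume_gib <= 0:
        raise ValueError("volume_gib must be positive")
    return max(100, 3 * volume_gib)


for size in (20, 100, 1000):
    print(size, "GiB ->", gp2_baseline_iops(size), "IOPS")
```

Note that a small 20 GiB volume still gets the 100 IOPS floor, so growing a volume only improves baseline performance once it exceeds roughly 33 GiB.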
Provisioned SSD Storage
Provisioned SSD storage costs more than regular storage, but it should provide better and more consistent performance. Consistent storage performance is important for workloads that frequently read from and write to the disk, such as online casinos and e-commerce marketplaces. However, benchmarks didn’t find a difference between general purpose and provisioned SSD performance.
Magnetic Storage
Legacy magnetic disk (HDD) storage is much slower than SSD storage, but it’s also cheaper. It’s not recommended for any new applications and should be used in legacy scenarios only.
On-Demand vs. Reserved Instances
As with EC2 instances, RDS instances are priced by the hour and billed by the second. If an instance is stopped, you stop paying for it. However, you still pay for any storage you’re using. Also as with EC2 instances, you can pay as you go with no up-front commitment for each instance. These are called on-demand instances.
However, you’ll usually know a database’s lifespan up front. It’s rare for a database to become obsolete out of the blue. Since you’ll usually know if you need the database long-term or as a sample database instance for testing, it might make sense to use reserved instances.
You can reserve an instance for a year or three years and thus receive a discount on the regular on-demand price. The downside, of course, is that you have to pay for the full one- or three-year period even if you no longer need it before the period is over.
You can pay for reserved instances in three ways: no up-front payment, a partial up-front payment, or a full up-front payment. As expected, the last option will give you the largest discount on the hourly rate.
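Whether a reservation pays off depends on how long you actually keep the database. The sketch below works through that arithmetic. The hourly rates used here (0.10 on demand, 0.07 effective reserved rate) are made-up numbers for illustration and not real AWS prices. The function names and the 730-hour month are also assumptions.

```python
HOURS_PER_MONTH = 730  # rough average, used only for illustration


def on_demand_cost(hourly_rate, months_used):
    """Total on-demand cost for running an instance for months_used months."""
    return hourly_rate * HOURS_PER_MONTH * months_used


def reserved_cost(effective_hourly_rate, term_months):
    """Total cost of a reservation: you pay for the whole term regardless of use."""
    return effective_hourly_rate * HOURS_PER_MONTH * term_months


def break_even_months(on_demand_hourly, reserved_effective_hourly, term_months):
    """Months of usage after which the reservation becomes the cheaper option."""
    if on_demand_hourly <= 0:
        raise ValueError("on_demand_hourly must be positive")
    return term_months * reserved_effective_hourly / on_demand_hourly


# Hypothetical rates: 0.10/hour on demand vs. 0.07/hour effective reserved rate.
months = break_even_months(0.10, 0.07, term_months=12)
print(f"Reservation pays off after about {months:.1f} months")
```

With these made-up rates, a one-year reservation only saves money if the database runs for more than about eight and a half months of the term.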
As you have now seen, there are many options to choose from when it comes to RDS instances, and the right choice will affect your bottom line. In addition, choosing the wrong instance type can negatively affect your end users’ experience. Once you’ve chosen an instance type that works for you―general purpose, memory optimized or burstable―and after you’ve chosen the right instance size and applicable storage type, it makes sense to consider using a reserved instance to save even more.
This post was written by Alexander Fridman. Alexander is a veteran in the software industry with over 11 years of experience. He worked his way up the corporate ladder and has held the positions of Senior Software Developer, Team Leader, Software Architect, and CTO. Alexander is experienced in frontend development and DevOps, but he specializes in backend development.