Notebook

AWS Fundamentals: RDS, Aurora, ElastiCache

AWS RDS Overview

  • RDS stands for Relational Database Service
  • It’s a managed DB service for databases that use SQL as a query language
  • It allows you to create databases in the cloud that are managed by AWS:
      • Postgres
      • MySQL
      • MariaDB
      • Oracle
      • Microsoft SQL Server
      • Aurora (AWS proprietary database)

    Advantages of using RDS versus deploying a DB on EC2

  • RDS is a managed service:
      • Automated provisioning, OS patching
      • Continuous backups and restore to a specific timestamp (Point in Time Restore)
      • Monitoring dashboards
      • Read replicas for improved read performance
      • Multi AZ setup for DR (Disaster Recovery)
      • Maintenance windows for upgrades
      • Scaling capability (vertical and horizontal)
      • Storage backed by EBS (gp2 or io1)
  • BUT you can’t SSH into your instances

    RDS Backups

  • Backups are automatically enabled in RDS
  • Automated backups:
      • Daily full backup of the database (during the backup window)
      • Transaction logs are backed up by RDS every 5 minutes
      • => ability to restore to any point in time (from oldest backup to 5 minutes ago)
      • 7 days retention (can be increased to 35 days)
  • DB Snapshots:
      • Manually triggered by the user
      • Retention of backup for as long as you want
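The restore window described above can be sketched as simple date arithmetic. This is a minimal illustration, assuming the 5-minute log interval and the retention limits from these notes; the function name `restorable_window` is hypothetical and not part of any AWS API.

```python
from datetime import datetime, timedelta, timezone

# Values taken from the notes above.
LOG_BACKUP_INTERVAL = timedelta(minutes=5)
MAX_RETENTION_DAYS = 35


def restorable_window(now, retention_days=7):
    """Return (earliest, latest) timestamps a point-in-time restore could target.

    Simplification: assumes the oldest automated backup is exactly
    `retention_days` old.
    """
    if not 1 <= retention_days <= MAX_RETENTION_DAYS:
        raise ValueError("retention must be between 1 and 35 days")
    earliest = now - timedelta(days=retention_days)
    latest = now - LOG_BACKUP_INTERVAL
    return earliest, latest


now = datetime(2024, 1, 31, 12, 0, tzinfo=timezone.utc)
earliest, latest = restorable_window(now)
print("restorable from", earliest.isoformat(), "to", latest.isoformat())
```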

    RDS Read Replicas for read scalability

  • Up to 5 Read Replicas
  • Within AZ, Cross AZ or Cross Region
  • Replication is ASYNC, so reads are eventually consistent
  • Replicas can be promoted to their own DB
  • Applications must update the connection string to leverage read replicas

    RDS Read Replicas – Use Cases

  • You have a production database that is taking on normal load
  • You want to run a reporting application to run some analytics
  • You create a Read Replica to run the new workload there
  • The production application is unaffected
  • Read replicas are used for SELECT (=read) only kind of statements (not INSERT, UPDATE, DELETE)
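Because each read replica has its own endpoint, the application has to decide where each statement goes. The sketch below is one simple way to do that, assuming a round-robin choice among replicas; the endpoint names and the `endpoint_for` helper are hypothetical.

```python
import itertools

# Hypothetical endpoints; real ones come from the RDS console or API.
PRIMARY_ENDPOINT = "mydb.example.internal"
REPLICA_ENDPOINTS = [
    "mydb-replica-1.example.internal",
    "mydb-replica-2.example.internal",
]

# Rotate through the replicas so reads are spread across them.
_replica_cycle = itertools.cycle(REPLICA_ENDPOINTS)


def endpoint_for(sql):
    """Send SELECT statements to a replica and everything else to the primary."""
    words = sql.split()
    first_word = words[0].upper() if words else ""
    if first_word == "SELECT":
        return next(_replica_cycle)
    return PRIMARY_ENDPOINT


print(endpoint_for("SELECT * FROM news"))
print(endpoint_for("INSERT INTO news (title) VALUES ('hello')"))
```

Since replication is asynchronous, a read that must see a write made a moment earlier should still go to the primary; this sketch does not handle that case.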

    RDS Read Replicas – Network Cost

  • In AWS there’s a network cost when data goes from one AZ to another
  • To reduce the cost, you can have your Read Replicas in the same AZ

    RDS Multi AZ (Disaster Recovery)

  • SYNC replication
  • One DNS name – automatic app failover to standby
  • Increase availability
  • Failover in case of loss of AZ, loss of network, instance or storage failure
  • No manual intervention in apps
  • Not used for scaling
  • Note: Read Replicas can be set up as Multi AZ for Disaster Recovery (DR)

    RDS Security - Encryption

  • At rest encryption
      • Possibility to encrypt the master & read replicas with AWS KMS (AES-256 encryption)
      • Encryption has to be defined at launch time
      • If the master is not encrypted, the read replicas cannot be encrypted
      • Transparent Data Encryption (TDE) available for Oracle and SQL Server
  • In-flight encryption
      • SSL certificates to encrypt data to RDS in flight
      • Provide SSL options with trust certificate when connecting to database
      • To enforce SSL:
          • PostgreSQL: rds.force_ssl=1 in the AWS RDS Console (Parameter Groups)
          • MySQL: within the DB: GRANT USAGE ON *.* TO 'mysqluser'@'%' REQUIRE SSL;

    RDS Encryption Operations

  • Encrypting RDS backups
      • Snapshots of un-encrypted RDS databases are un-encrypted
      • Snapshots of encrypted RDS databases are encrypted
      • Can copy a snapshot into an encrypted one
  • To encrypt an un-encrypted RDS database:
      • Create a snapshot of the un-encrypted database
      • Copy the snapshot and enable encryption for the snapshot
      • Restore the database from the encrypted snapshot
      • Migrate applications to the new database, and delete the old database
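The first three steps above can be written down as an ordered plan. The sketch below only builds the plan and executes nothing. The operation and parameter names follow the boto3 RDS client as I understand it and should be checked against the boto3 documentation; the `encryption_plan` helper and the identifier suffixes are hypothetical.

```python
def encryption_plan(instance_id, kms_key_id):
    """Return the ordered (operation, parameters) steps; nothing is executed."""
    plain_snapshot = f"{instance_id}-plain"          # hypothetical naming
    encrypted_snapshot = f"{instance_id}-encrypted"  # hypothetical naming
    return [
        # 1. Snapshot the un-encrypted database.
        ("create_db_snapshot", {
            "DBInstanceIdentifier": instance_id,
            "DBSnapshotIdentifier": plain_snapshot,
        }),
        # 2. Copy the snapshot; supplying a KMS key makes the copy encrypted.
        ("copy_db_snapshot", {
            "SourceDBSnapshotIdentifier": plain_snapshot,
            "TargetDBSnapshotIdentifier": encrypted_snapshot,
            "KmsKeyId": kms_key_id,
        }),
        # 3. Restore a new instance from the encrypted copy.
        ("restore_db_instance_from_db_snapshot", {
            "DBInstanceIdentifier": f"{instance_id}-new",
            "DBSnapshotIdentifier": encrypted_snapshot,
        }),
    ]


for operation, params in encryption_plan("mydb", "alias/my-key"):
    print(operation, params)
```

In a real run, each step has to wait until the resource created by the previous step is available. The final step (migrating applications and deleting the old database) is left manual here.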

    RDS Security – Network & IAM

  • Network Security
      • RDS databases are usually deployed within a private subnet, not in a public one
      • RDS security works by leveraging security groups (the same concept as for EC2 instances), which control which IPs / security groups can communicate with RDS
  • Access Management
      • IAM policies help control who can manage AWS RDS (through the RDS API)
      • Traditional username and password can be used to log in to the database
      • IAM-based authentication can be used to log in to RDS MySQL & PostgreSQL

    RDS - IAM Authentication

  • IAM database authentication works with MySQL and PostgreSQL
  • You don’t need a password, just an authentication token obtained through IAM & RDS API calls
  • Auth token has a lifetime of 15 minutes
  • Benefits:
      • Network in/out must be encrypted using SSL
      • IAM to centrally manage users instead of DB
      • Can leverage IAM Roles and EC2 Instance profiles for easy integration
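Because the token is short-lived, an application that opens connections regularly may want to reuse a token and refresh it before the 15 minutes are up. The sketch below is one possible approach, not a prescribed one. `AuthTokenCache` and the one-minute refresh margin are assumptions, and `fetch_token` stands in for whatever call obtains the token (for example boto3's `generate_db_auth_token` on the RDS client).

```python
import time

TOKEN_LIFETIME_SECONDS = 15 * 60  # lifetime stated in the notes above
REFRESH_MARGIN_SECONDS = 60       # assumption: refresh one minute early


class AuthTokenCache:
    """Reuse an auth token until it is close to expiring."""

    def __init__(self, fetch_token, clock=time.monotonic):
        self._fetch_token = fetch_token  # callable returning a new token
        self._clock = clock              # injectable so the logic can be tested
        self._token = None
        self._fetched_at = None

    def get(self):
        now = self._clock()
        max_age = TOKEN_LIFETIME_SECONDS - REFRESH_MARGIN_SECONDS
        if self._token is None or now - self._fetched_at >= max_age:
            self._token = self._fetch_token()
            self._fetched_at = now
        return self._token


# Usage with a dummy token source in place of a real IAM/RDS call.
cache = AuthTokenCache(lambda: "dummy-token")
print(cache.get())
```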

    RDS Security – Summary

  • Encryption at rest:
      • Is done only when you first create the DB instance
      • or: unencrypted DB => snapshot => copy snapshot as encrypted => create DB from snapshot

  • Your responsibility:
      • Check the ports / IP / security group inbound rules in DB’s SG
      • In-database user creation and permissions, or manage through IAM
      • Creating a database with or without public access
      • Ensure parameter groups or DB is configured to only allow SSL connections
  • AWS responsibility:
      • No SSH access
      • No manual DB patching
      • No manual OS patching
      • No way to audit the underlying instance

    Amazon Aurora

  • Aurora is a proprietary technology from AWS (not open sourced)
  • Postgres and MySQL are both supported as Aurora DB (that means your drivers will work as if Aurora was a Postgres or MySQL database)
  • Aurora is “AWS cloud optimized” and claims 5x performance improvement over MySQL on RDS, over 3x the performance of Postgres on RDS
  • Aurora storage automatically grows in increments of 10GB, up to 64 TB.
  • Aurora can have 15 replicas while MySQL has 5, and the replication process is faster (sub 10 ms replica lag)
  • Failover in Aurora is near-instantaneous (automated, in less than 30 seconds). It’s HA (High Availability) native.
  • Aurora costs more than RDS (20% more) – but is more efficient

    Aurora High Availability and Read Scaling

  • 6 copies of your data across 3 AZ:
      • 4 copies out of 6 needed for writes
      • 3 copies out of 6 needed for reads
  • Self healing with peer-to-peer replication
  • Storage is striped across 100s of volumes
  • One Aurora Instance takes writes (master)
  • Automated failover for master in less than 30 seconds
  • Master + up to 15 Aurora Read Replicas serve reads
  • Support for Cross Region Replication
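The 4-of-6 write and 3-of-6 read rule above determines how many copies can be lost before writes or reads stop. A minimal sketch of that arithmetic follows; the `availability` helper is hypothetical and models only the counts given in these notes, not Aurora's actual storage protocol.

```python
COPIES = 6        # copies of the data across 3 AZ
WRITE_QUORUM = 4  # copies needed for writes
READ_QUORUM = 3   # copies needed for reads


def availability(healthy_copies):
    """Report whether writes and reads can proceed with this many healthy copies."""
    if not 0 <= healthy_copies <= COPIES:
        raise ValueError("healthy_copies must be between 0 and 6")
    return {
        "writes": healthy_copies >= WRITE_QUORUM,
        "reads": healthy_copies >= READ_QUORUM,
    }


for lost in range(4):
    print(f"{lost} copies lost:", availability(COPIES - lost))
```

With two copies per AZ, losing a whole AZ leaves 4 healthy copies, so writes continue; losing a third copy leaves the data readable but not writable.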

    Features of Aurora

  • Automatic fail-over
  • Backup and Recovery
  • Isolation and security
  • Industry compliance
  • Push-button scaling
  • Automated Patching with Zero Downtime
  • Advanced Monitoring
  • Routine Maintenance
  • Backtrack: restore data at any point of time without using backups

    Aurora Security

  • Similar to RDS because it uses the same engines
  • Encryption at rest using KMS
  • Automated backups, snapshots and replicas are also encrypted
  • Encryption in flight using SSL (same process as MySQL or Postgres)
  • Possibility to authenticate using IAM token (same method as RDS)
  • You are responsible for protecting the instance with security groups
  • You can’t SSH

    Aurora Serverless

  • Automated database instantiation and autoscaling based on actual usage
  • Good for infrequent, intermittent or unpredictable workloads
  • No capacity planning needed
  • Pay per second, can be more cost-effective

    Global Aurora

  • Aurora Cross Region Read Replicas:
      • Useful for disaster recovery
      • Simple to put in place
  • Aurora Global Database (recommended):
      • 1 Primary Region (read / write)
      • Up to 5 secondary (read-only) regions, replication lag is less than 1 second
      • Up to 16 Read Replicas per secondary region
      • Helps for decreasing latency
      • Promoting another region (for disaster recovery) has an RTO of < 1 minute

    Amazon ElastiCache Overview

  • Just as RDS provides managed relational databases, ElastiCache provides managed Redis or Memcached
  • Caches are in-memory databases with really high performance, low latency
  • Helps take load off databases for read-intensive workloads
  • Helps make your application stateless
  • AWS takes care of OS maintenance / patching, optimizations, setup, configuration, monitoring, failure recovery and backups
  • Using ElastiCache involves heavy application code changes

    ElastiCache Solution Architecture - DB Cache

  • The application queries ElastiCache first; on a cache miss, it reads from RDS and stores the result in ElastiCache.
  • Helps relieve load in RDS
  • The cache must have an invalidation strategy so that only the most current data is served from it.

    ElastiCache Solution Architecture – User Session Store

  • The user logs into any instance of the application
  • The application writes the session data into ElastiCache
  • The user hits another instance of our application
  • The instance retrieves the data and the user is already logged in
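The flow above can be sketched with a shared dictionary standing in for ElastiCache. This is an illustration only: `AppInstance`, `login` and `current_user` are hypothetical names, and a real application would talk to Redis or Memcached through a client library instead of a dict.

```python
import uuid

# Stand-in for ElastiCache: one store shared by every application instance.
session_store = {}


class AppInstance:
    """One instance of the application; it keeps no session state of its own."""

    def __init__(self, name, store):
        self.name = name
        self._store = store

    def login(self, username):
        # Write the session data into the shared store and return its id.
        session_id = uuid.uuid4().hex
        self._store[session_id] = {"user": username}
        return session_id

    def current_user(self, session_id):
        # Any instance can look the session up, so the user stays logged in.
        session = self._store.get(session_id)
        return session["user"] if session else None


instance_a = AppInstance("instance-a", session_store)
instance_b = AppInstance("instance-b", session_store)

sid = instance_a.login("alice")      # user logs in on instance A
print(instance_b.current_user(sid))  # a later request lands on instance B
```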

    ElastiCache – Redis vs Memcached

  • REDIS
      • Multi AZ with Auto-Failover
      • Read Replicas to scale reads and have high availability
      • Data Durability using AOF persistence
      • Backup and restore features
  • MEMCACHED
      • Multi-node for partitioning of data (sharding)
      • Non persistent
      • No backup and restore
      • Multi-threaded architecture

    Caching Implementation Considerations

  • Read more at: https://aws.amazon.com/caching/implementation-considerations/
  • Is it safe to cache data? Data may be out of date, eventually consistent
  • Is caching effective for that data?
      • Pattern: data changing slowly, few keys are frequently needed
      • Anti-patterns: data changing rapidly, a large key space where all keys are frequently needed
  • Is data structured well for caching?
      • Example: key-value caching, or caching of aggregation results
  • Which caching design pattern is the most appropriate?

    Lazy Loading / Cache-Aside / Lazy Population

  • Pros
      • Only requested data is cached (the cache isn’t filled up with unused data)
      • Node failures are not fatal (just increased latency to warm the cache)
  • Cons
      • Cache miss penalty that results in 3 round trips, noticeable delay for that request
      • Stale data: data can be updated in the database and outdated in the cache
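A minimal sketch of the lazy loading pattern, assuming a plain dict as the cache and a small fake database class in place of RDS. `FakeDatabase` and `get_user` are hypothetical names used only for illustration.

```python
class FakeDatabase:
    """Stand-in for RDS; counts reads so cache hits are visible."""

    def __init__(self, rows):
        self.rows = dict(rows)
        self.reads = 0

    def get(self, key):
        self.reads += 1
        return self.rows.get(key)


def get_user(user_id, cache, db):
    value = cache.get(user_id)      # round trip 1: ask the cache
    if value is None:               # cache miss
        value = db.get(user_id)     # round trip 2: read from the database
        if value is not None:
            cache[user_id] = value  # round trip 3: populate the cache
    return value


db = FakeDatabase({1: "alice"})
cache = {}

get_user(1, cache, db)  # miss: goes to the database
get_user(1, cache, db)  # hit: served from the cache
print("database reads:", db.reads)

db.rows[1] = "alice-updated"   # the database changes behind the cache
print(get_user(1, cache, db))  # the cache still returns the old value
```

The last two lines show the stale-data drawback: until the cached entry is invalidated or expires, readers keep getting the old value.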

    Write Through – Add or update cache when database is updated

  • Pros:
      • Data in cache is never stale, reads are quick
      • The penalty is paid on writes rather than on reads (each write requires 2 calls)
  • Cons:
      • Missing data until it is added / updated in the DB. Mitigation is to implement the Lazy Loading strategy as well
      • Cache churn – a lot of the data will never be read
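A minimal sketch of write-through, with the lazy loading fallback mentioned as mitigation. Both the cache and the database are plain dicts here, and `save_user` / `read_user` are hypothetical names.

```python
def save_user(user_id, value, cache, db):
    """Write-through: every write makes two calls."""
    db[user_id] = value     # call 1: write to the database
    cache[user_id] = value  # call 2: update the cache


def read_user(user_id, cache, db):
    """Read from the cache, falling back to lazy loading for missing data."""
    value = cache.get(user_id)
    if value is None:
        value = db.get(user_id)
        if value is not None:
            cache[user_id] = value
    return value


db = {7: "written-before-the-cache-existed"}
cache = {}

save_user(1, "alice", cache, db)
save_user(1, "alice-v2", cache, db)
print(read_user(1, cache, db))  # served from the cache, never stale
print(read_user(7, cache, db))  # not in the cache yet: lazy loading fills it
```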

    Cache Evictions and Time-to-live (TTL)

  • Cache eviction can occur in three ways:
      • You delete the item explicitly in the cache
      • Item is evicted because the memory is full and it’s not recently used (LRU)
      • You set an item time-to-live (or TTL)
  • TTLs are helpful for any kind of data:
      • Leaderboards
      • Comments
      • Activity streams
  • TTL can range from a few seconds to hours or days
  • If too many evictions happen due to memory, you should scale up or out
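The three eviction paths above can be shown in a toy in-process cache. This is a simplified sketch, not how Redis or Memcached implement eviction: `TTLLRUCache` is a hypothetical class, capacity is counted in items rather than memory, and expired items are only removed when they are read.

```python
import time
from collections import OrderedDict


class TTLLRUCache:
    """Toy cache with explicit delete, LRU eviction and per-item TTL."""

    def __init__(self, max_items, clock=time.monotonic):
        self._max_items = max_items
        self._clock = clock          # injectable so expiry can be tested
        self._items = OrderedDict()  # key -> (value, expires_at), oldest first

    def set(self, key, value, ttl_seconds):
        self._items[key] = (value, self._clock() + ttl_seconds)
        self._items.move_to_end(key)
        while len(self._items) > self._max_items:
            self._items.popitem(last=False)  # evict the least recently used item

    def get(self, key):
        entry = self._items.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if self._clock() >= expires_at:
            del self._items[key]             # TTL reached
            return None
        self._items.move_to_end(key)         # mark as recently used
        return value

    def delete(self, key):
        self._items.pop(key, None)           # explicit deletion


cache = TTLLRUCache(max_items=2)
cache.set("leaderboard", [("alice", 10)], ttl_seconds=30)
cache.set("comments:42", ["nice post"], ttl_seconds=300)
cache.get("leaderboard")                        # leaderboard is now most recent
cache.set("activity:7", ["login"], ttl_seconds=60)
print(cache.get("comments:42"))                 # evicted as least recently used
```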

    Final words of wisdom

  • Lazy Loading / Cache aside is easy to implement and works for many situations as a foundation, especially on the read side
  • Write-through is usually combined with Lazy Loading, targeted at the queries or workloads that benefit from this optimization
  • Setting a TTL is usually not a bad idea, except when you’re using Write-through. Set it to a sensible value for your application
  • Only cache the data that makes sense (user profiles, blogs, etc…)
  • Quote: There are only two hard things in Computer Science: cache invalidation and naming things

    Questions

  • My company would like to have a MySQL database that is going to be available even in case of a disaster in the AWS Cloud. What should I set up?
      • Answer: Multi AZ
  • Our RDS database struggles to keep up with the demand of the users from our website. Our million users mostly read news, and we don’t post news very often. Which solution will NOT help fix this problem?
      • Answer: RDS Multi AZ
  • We have set up read replicas on our RDS database, but our users are complaining that upon updating their social media posts, they do not see the update right away. Why?
      • Answer: Read Replicas have asynchronous replication and therefore it’s likely our users will only observe eventual consistency
  • Which RDS Classic (not Aurora) feature does not require us to change our SQL connection string?
      • Answer: Multi AZ keeps the same connection string regardless of which database is up. Read Replicas imply we need to reference them individually in our application as each read replica will have its own DNS name
  • You want to ensure your Redis cluster will always be available
  • Your application functions on an ASG behind an ALB. Users have to constantly log back in and you’d rather not enable stickiness on your ALB as you fear it will overload some servers. What should you do?
      • Answer: Store session data in ElastiCache
  • One analytics application is currently performing its queries against your main production database. These queries slow down the database which impacts the main user experience. What should you do to improve the situation?
      • Answer: Read Replicas will help as our analytics application can now perform queries against them, and these queries won’t impact the main production database
  • You have a requirement to use TDE (Transparent Data Encryption) on top of KMS. Which database technology does NOT support TDE on RDS?
      • Answer: PostgreSQL
  • Which RDS database technology does NOT support IAM authentication?
      • Answer: Oracle
  • You would like to ensure you have a database available in another region if a disaster happens to your main region. Which database do you recommend?
      • Answer: Global Databases allow you to have cross region replication
  • You are managing a PostgreSQL database and for security reasons, you would like to ensure users are authenticated using short-lived credentials. What do you suggest doing?
      • Answer: Use PostgreSQL for RDS and authenticate using a token obtained through the RDS service. In this case, IAM is leveraged to obtain the RDS service token, so this is the IAM authentication use case
  • Your organisation wants to enforce SSL connections on your MySQL database. What should you do?
      • Answer: Apply a ‘REQUIRE SSL’ statement to all your users in your SQL database
  • You are implementing a caching strategy with ElastiCache and would like to ensure that only the data that is often requested will be loaded in ElastiCache, as your cache size is small. Which caching strategy should you implement?
      • Answer: Lazy Loading would only cache data that is actively requested from the database
  • You are serving web pages for a very dynamic website and you have a requirement to keep latency to a minimum for every single user when they do a read request. Writes can take longer to happen. Which caching strategy do you recommend?
      • Answer: Write-through. This has longer writes, but the reads are quick and the data is always updated in the cache