AWS : Database Specialty Exam — Part 4

A Quick Review for Monitoring and Troubleshooting

Pisit J.
Sum up As A Service

--

Part 4: Monitoring and Troubleshooting (18% of exam)

  • Determine monitoring and alerting strategies
  • Evaluate the configuration, schema design, queries, and infrastructure to improve database performance

1 — The application uses the read replica from Amazon RDS for MySQL database for generating business reports to avoid performance issues on the primary DB. However, the queries to the read replica still have a high replication lag.

What can be root cause of this issue ?

  • Long-running queries on the primary DB instance
  • binlog_format on the primary DB instance to ROW
  • Insufficient replica DB instance class size or storage
  • Binary logs synced to the disk on the replica DB instance
  • Creation of new replica DB instance

Note — When using Amazon RDS for MySQL read replica with binary log file position based replication, you can monitor replication lag in Amazon CloudWatch by viewing the Amazon RDS ReplicaLag metric, in term of Seconds_Behind_Master.

https://aws.amazon.com/premiumsupport/knowledge-center/rds-mysql-high-replica-lag/

2 — What should the Database Specialist consider to enable Amazon RDS Performance Insights ?

  • The AWS account used to build the cluster is granted proper AWS KMS key and IAM policies.
  • Amazon RDS Performance Insights is configured on a per-instance level.

Note — Turning Performance Insights on and off doesn’t cause downtime, a reboot, or a failover.

The Performance Insights agent consumes limited CPU and memory on the DB host. When the DB load is high, the agent limits the performance impact by collecting data less frequently.

https://docs.aws.amazon.com/AmazonRDS/latest/UserGuide/USER_PerfInsights.Enabling.html

3 — What should the Database Specialist do to fix when Amazon RDS abruptly stopped working because the database is in STORAGE_FULL state ?

  • Add more storage space to DB instance.
  • Increase the allocated storage property.

Note — If the DB instance is in a STORAGE_FULL state, the instance accepts only allocated storage modifications. Any modifications to other values are rejected.

You must increase allocated storage by at least 10%. An increase less of than 10% results in an error.

When the DB instance is in a storage-optimization status, the instance is operational, but you can’t make other storage modifications for six hours or until the DB instance’s status is no longer storage-optimization.

In most cases, a small increase to the Allocated Storage allows you to reconnect to the instance so that you can perform additional troubleshooting

https://aws.amazon.com/premiumsupport/knowledge-center/rds-out-of-storage/

4 — The Database Specialist needs to view the real-time metrics of the operating system (OS) that the Amazon RD instance to analyze how the processes or threads use the CPU.

What should the Database Specialist do to view the required metrics ?

Enable Enhanced Monitoring

https://docs.aws.amazon.com/AmazonRDS/latest/UserGuide/USER_Monitoring.OS.overview.html

5 — How should the Database Manager take to investigate the Amazon RDS during an increase in CPU load ?

Use Performance Insights to assess the database load and filter the load by waits, SQL statements, hosts, or users.

https://docs.aws.amazon.com/AmazonRDS/latest/UserGuide/USER_PerfInsights.html

--

--