RDS Postgres Multi-AZ forced failover
I recently tested a Multi-AZ failover for RDS Postgres. The AWS CLI command aws rds reboot-db-instance, run with the --force-failover option, can be used to trigger one.
When Multi-AZ is enabled, AWS automatically creates a primary database (DB) instance and a synchronous standby replica in a different Availability Zone (AZ).
aws rds reboot-db-instance --db-instance-identifier DB_INSTANCE_IDENTIFIER --force-failover
Evidence of the failover is available in the RDS console: the Events list under the Logs & events tab records the failover, and the Networking section of the Connectivity & security tab shows which AZ the instance is now running in.
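The same evidence can be pulled from the command line. This is a minimal sketch, assuming the AWS CLI is configured with credentials and a default region. The function names `current_az` and `recent_events` are my own, and `mydb` in the usage note is a placeholder identifier.

```shell
# Print the AZ that currently hosts the primary for the given instance identifier.
current_az() {
  aws rds describe-db-instances \
    --db-instance-identifier "$1" \
    --query 'DBInstances[0].AvailabilityZone' \
    --output text
}

# List RDS events for the instance from the last N minutes (default: 60).
recent_events() {
  aws rds describe-events \
    --source-identifier "$1" \
    --source-type db-instance \
    --duration "${2:-60}" \
    --query 'Events[].[Date,Message]' \
    --output text
}
```

Running `current_az mydb` before and after the reboot should show the AZ switching, and `recent_events mydb` should list the failover events with their timestamps.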
After failover
- The RDS DNS endpoint does not change; during failover the DNS record is updated and the change propagates
- The underlying IP address associated with the endpoint changes to the IP address of the new primary instance
https://repost.aws/questions/QU4DYhqh2yQGGmjE_x0ylBYg/what-happens-after-failover-in-rds
- DB downtime: it took me nearly 4 minutes for a medium-sized DB
- The failed instance is recovered and used as the standby (it resynchronizes with the current primary node)
- During an automated failover, in-flight transactions and queries are terminated, so the application should have retry logic
https://repost.aws/knowledge-center/rds-connections-reboot-failover
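As an illustration of the retry logic mentioned in the list above, here is a minimal sketch of a retry wrapper with exponential backoff. The `retry` function is hypothetical, not something AWS provides, and it is only safe for statements that can be re-run without side effects.

```shell
# Retry a command with exponential backoff.
# Usage: retry MAX_ATTEMPTS INITIAL_DELAY_SECONDS command [args...]
retry() {
  local max_attempts="$1" delay="$2" attempt=1
  shift 2
  # Re-run the command until it succeeds or the attempt budget is spent.
  until "$@"; do
    if [ "$attempt" -ge "$max_attempts" ]; then
      echo "giving up after $attempt attempts" >&2
      return 1
    fi
    echo "attempt $attempt failed, retrying in ${delay}s" >&2
    sleep "$delay"
    attempt=$((attempt + 1))
    delay=$((delay * 2))
  done
}
```

For example, `retry 6 5 psql "host=$DB_ENDPOINT dbname=app user=app" -c 'SELECT 1'` would keep trying for roughly two and a half minutes (`DB_ENDPOINT`, `app` and the attempt counts are placeholders). Connecting by the endpoint hostname rather than a cached IP matters here, because the IP behind the endpoint changes after failover.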
The RPO for an Amazon RDS Multi-AZ instance failover is zero because replication to the standby DB instance is synchronous. Failover usually takes 1 to 2 minutes. Several factors can prolong it: long recovery due to rollback of uncommitted transactions or roll-forward of in-memory committed transactions, limits on the instance class's I/O throughput, lazy loading from Amazon S3 to Amazon EBS volumes, and the amount of transaction logs that must be copied.
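To see how long a failover takes in your own setup, you can probe the endpoint while the reboot runs. This is a rough sketch, assuming `pg_isready` (from the PostgreSQL client tools) and `dig` are installed. The function names `probe` and `watch_endpoint` are my own.

```shell
# Print "up" or "down" plus the IP the endpoint currently resolves to.
probe() {
  local host="$1" port="${2:-5432}" state=down ip
  if pg_isready -h "$host" -p "$port" -t 2 >/dev/null 2>&1; then
    state=up
  fi
  # The last line of `dig +short` is the final A record after any CNAMEs.
  ip=$(dig +short "$host" | tail -n 1)
  echo "$state ${ip:-unresolved}"
}

# Poll once per second and log a timestamped line whenever the result changes.
watch_endpoint() {
  local last="" now
  while true; do
    now=$(probe "$@")
    if [ "$now" != "$last" ]; then
      echo "$(date -u +%H:%M:%S) $now"
      last="$now"
    fi
    sleep 1
  done
}
```

Start `watch_endpoint <your-endpoint-hostname>` in one terminal, trigger the forced failover in another, and stop the loop with Ctrl-C once the endpoint is back. The gap between the "down" line and the next "up" line approximates the downtime a client would observe. The IP on the new "up" line should differ from the one logged before the failover.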