Custom configuration of Amazon RDS instances

Amazon RDS instances are convenient and scalable. However, whilst running long-running workflows with large datasets, the servers can come under considerable strain at times. Sometimes, we have noticed that certain transactions return with an error – not because something is wrong with the transaction syntax, rather that the MySQL server believes that the transaction is taking too long. This behaviour appears to be specific to MySQL – such behaviour has not been observed with other RDBMS software, such as PostgreSQL.

In order to fix these unnecessary database exceptions, the MySQL configuration must be modified in order to increase a number of timeout values. By default, some of these timeouts are quite small (10 seconds or so), which is the cause of the observed problems. We have several responders that input large amounts of data in the same transaction and sometimes data upload to a remote server can exceed the default values.
The configuration of Amazon RDS instances is handled slightly differently to a typical RDBMS installation. Unlike a local MySQL installation where the configuration file is easily modifiable, the configuration file used for the RDS service are not readily accessible.

Creating a parameter group

Log into the Amazon Management Console and click the ‘RDS‘ tab. Make sure your intended region is selected in the ‘Region‘ combo box. Select ‘DB Parameter Groups‘ from the menu on the left. Then click the ‘Create DB Parameter Group‘ button from the toolbar. A dialogue box should appear. For ‘DB Parameter Group Family‘, choose ‘mysql5.5‘. Then choose a name and appropriate description for your parameter group:
Currently, it is not possible to modify parameters using the web interface. Continue to the next section to find out how to set new values.

Setting parameter group values

Setting values of items in the parameter group must (as of time of writing) be performed using the Amazon RDS command line tools. We assume that you have installed these already.
Once you have the RDS command line tools on your machine, you can start to modify the default MySQL values. Note that some values require a reboot of the RDS instance to take effect.
Run the following command lines to do the following:
  • Increase max_allowed_packet from the default 1MB to 512MB. Some databases (including various Microbase tables) make use of the MySQL BLOB type. In order to send large BLOBs over the network, this value needs to be set larger than the default.
  • Increase innodb_lock_wait_timeout – the default value is 50 seconds. Increase lock wait time to a maximum of 3600. Useful for long-running transactions.
  • Increase net_write_timeout – the default value is 60. Increases the timeout to 300 seconds. Allows for large or congested network transfers.
Remember to insert your own parameter group name instead of ‘microbase‘.
rds-modify-db-parameter-group microbase --parameters "name=max_allowed_packet, value=536870912, method=immediate"
 
rds-modify-db-parameter-group microbase --parameters "name=innodb_lock_wait_timeout, value=3600, method=pending-reboot"
 
rds-modify-db-parameter-group microbase --parameters "name=net_write_timeout, value=300, method=immediate"
Additionally, if you wish to increase the performance of the database, you may optionally run the following two commands:
rds-modify-db-parameter-group microbase --parameters "name=innodb_flush_log_at_trx_commit, value=0, method=immediate"
 
rds-modify-db-parameter-group microbase --parameters "name=sync_binlog, value=0, method=immediate"
These commands alter the way that data is flushed to disk. These parameters give a greater speedup, but increase the risk of data loss in the case that the database server crashes. Use these parameters at your own risk.
  • innodb_flush_log_at_trx_commit – sets the behaviour of syncing on commit. The default value is 1(?). There are several possible values:
    • 0 – if MySQL crashes, last second of txns lost;
    • 1 – flush to disk each update/commit;
    • 2 – flush to OS (if OS crashes, 1 sec txns are lost)
  • sync_binlog – sets when the transaction log is synchronised. Default is 0, but we set it here to make sure.
    • 0 – does no synchronising. If the DB crashes, you will lose whatever transaction log entries were in memory waiting to be written to disk.
    • 1 – synchronises after every write. You lose at most 1 statement from the binary log in the event of a crash.

 

Revisions

No comments yet.

Leave a Reply