Tuesday, September 13, 2011

WSO2 cloud with Amazon RDS

WSO2 Cloud Platform decided to go with Amazon Relational Database Services for it's relational data services requirements. If you plan to leverage Amazon RDS for similar purposes you can share our experience.
Date: Mon, 1st Nov, 2010
Level: Intermediate
Reads: 1821 Comments: 0 | Login or register to post comments
Damitha Kumarage

WSO2 Inc.
damitha's picture

Content

Applies To

WSO2 Cloud ESB 3.0.1
Amazon RDS version
Ubuntu version
   

Introduction

This document explain how WSO2 cloud platform could benefit from Amazon Relational Database Service(RDS). RDS is the relational database services provided by Amazon. Currently it supports only MySQL database. To understand the importance of RDS let's consider a scenario. Suppose you have your application which uses a database, is installed in an EC2 instance. Suppose the database is installed in a separate EC2 instance. Your application connects to that EC2 instance where database installed for database access. You need a database administrator to do all maintenance work like applying security patches, version upgrades, database backups, scaling etc. on your EC2 database instance. Amazon RDS relieves you from all such burden. You just need to request a new RDS instance that suite your specific requirements and you are good to go with your database needs.


Getting started with RDS

Let's go through a step by step procedure to get our hands wet with RDS. We will create an RDS instance and then as a first step of testing our instance try importing an existing database into it. We initially have our WSO2 Enterprise Service Bus installed in an Amazon EC2 instance which use two relational databases called registry and userstore installed in another EC2 instance. We will import our Registry and Userstore database into our newly created RDS instance. Then we try connecting to it from WSO2 ESB. We also use the Amazon Cloudwatch monitoring facility to monitor our Database instance. We will try scaling our database using the faclity provided by Amazon.

    Getting an instance up

    In the web page here[1] sign up for RDS as a new user. To do that you have to create an account with Amazon. You may use your existing Amazon account for that. Once you connect you can either use the cloudwatch management[2] console or RDS command line tools which can be downloaded from here[3] to create a new RDS instance. We use a command line tool in this document to manage our RDS instance. Once you download your tool unpack it into your preferred place and do the following.
    • set environment variable AWS_RDS_HOME to the unpacked directory
    • set the path to it's bin directory.
    • Set JAVA_HOME environment variable.
    • Save your certificate and private key to .pem files
    • set the following environment variables pointing to your keys/certificates
      EC2_CERT=/path/to/my-cert.pem
      EC2_PRIVATE_KEY=/path/to/my-pk.pem
    Now to create a new instance execute following
    rds-create-db-instance --db-instance-identifier registry --allocated-storage 10 
    --db-instance-class db.m1.small --engine mysql5.1 --availability-zone us-east-1c 
    --master-username damitha --master-user-password dbpass
    Here I give the name registry to my new RDS instance. I initially allocate 10GB storage to it and my database user is damitha with password dbpass. When this instance is created you will see a message like following in your shell.
    DBINSTANCE  registry  db.m1.small  mysql5.1  10  master  creating  1  ****  n
          SECGROUP  default  active
          PARAMGRP  default.mysql5.1  in-sync
    
    Now after while execute
    rds-describe-db-instances
    This will show all of your RDS instances. There under the instance name registry you will see it's status as available and it's amazon rds public dns name. Now you can connect to this instance using a mysql command line shell as following.
    mysql --host=<rds instance public dns name> -u damitha -p
    
    However before accessing the RDS instance using above command, you need to allow the EC2 machine you are connecting from, to access that RDS instance. To do that you need to make sure that both of these instances belong to the same security group. You can add a security group as below.
    rds-authorize-db-security-group-ingress default --ec2-security-group-name your_sec_group_name 
    --ec2-security-group-owner-id your_owner_id
    

    Importing Registry and Userstore database

    Now we will connect to our RDS instance from another EC2 instance. First you need to install mysql client in that machine.
    sudo apt-get install mysql-client-core-5.1
    Now you can connect to the RDS instance's mysql shell. We will import an existing WSO2 registry database and userstore database into our RDS instance.
    mysql --host=<rds instance public dns name> -u damitha -p
    If registry and userstore database are already created first drop them
    drop dataabse registry;
    drop database userstore
    create database userstore;
    create database registry; 
    Now log into the existing mysql server machine where registry and userstore database is installed.
    mysqldump --max_allowed_packet=16M --opt --compress --user=registry --password='password_for_registry' | mysql --user=damitha --password=dbpass --host=<rds instance public dns name> registry
    Note that we are overriding the default max_allowed_packet size of the database we are dumping from. If you don't do this there could be an error telling packet size more than max_allowed_packet is not allowed. We also need to make the max_allowed_packet size of the target database greater than or equal to that value.
    Normally the default value in RDS for this parameter is 8M. So we need to increase that value to be greater than or equal to 16M!!!. How we can do that?. We don't have shell access to RDS instance, nor we can log into the RDS mysql instance with super user. Here the rds-modify-db-parameter-group command come into our help. Using that command add a parameter to the RDS instance.
    rds-modify-db-parameter-group default \
     --parameters="name=max_allowed_packet, value=16M, method=immediate"
    You are adding here a new parameter to the parameter group default. But it is not good practice to modify existing parameter groups. So best practice is to create a custom parameter group for your RDS instance.  
    rds-create-db-parameter-group mygroup -f MySQL5.1 -d "wso2"
    So your new parameter group name is mygroup. Now you can add the parameter to that parameter group.
    rds-modify-db-parameter-group mygroup \
     --parameters="name=max_allowed_packet, value=32000000, method=immediate"

    Now you need to modify your RDS instance with the new parameter group
    ./rds-modify-db-instance registry --db-parameter-group-name=mygroup --apply-immediately
    
    Note that your parameter group mygroup has value max_allowed_packet set to 32M. All other parameters of this group has defaults values.
    Similary we can import the userstore database into our RDS instance.
    mysqldump --max_allowed_packet=16M --opt --compress --user=userstore 
     --password='password_for_userstore' userstore | mysql --user=damitha --password=dbpass 
     --host=<rds instance public dns name > userstore

    Now you have successfully imported two databases into your new RDS instance. In the next section we will configure WSO2 ESB to connect to your RDS instance.

     

    Connecting from WSO2 ESB

    Now we will configure our WSO2 ESB to connect to our newly imported registry and userstore database.
    Get an Amazon EC2 instance and insatll WSO2 ESB[4].
    Unarchive it into appropriate directory. WSO2 ESB by default uses an internal database. We will configure ESB to point to our RDS registry and userstore database by editing the registry.xml and user-mgt.xml files respectively
    Edit repository/conf/registry.xml as following

    <dbConfig name="wso2registry">
        <url>jdbc:mysql://<rds instance public dns name>:3306/
            registry?autoReconnect=true</url>
        <userName>registry</userName>
        <password>password_for_registry</password>
        <driverName>com.mysql.jdbc.Driver</driverName>
        <maxActive>50</maxActive>
        <maxWait>60000</maxWait>
        <minIdle>5</minIdle>
        <validationQuery>SELECT 1</validationQuery>
    </dbConfig>

    And configure user-mgt.xml as following
    <Property name="url">jdbc:mysql://<rds instance public dns name>:3306/userstore?autoReconnect=true</Property>
    <Property name="userName">userstore</Property>
    <Property name="password">password_for_registry</Property>
    <Property name="driverName">com.mysql.jdbc.Driver</Property>
    <Property name="maxActive">50</Property>
    <Property name="maxWait">60000</Property>
    <Property name="minIdle">5</Property>
    <Property name="MultiTenantRealmConfigBuilder">org.wso2.carbon.user.core.
        config.multitenancy.SimpleRealmConfigBuilder</Property>
    <Property name="validationQuery">SELECT 1</Property> 
    Also copy the mysql java connector jar into the repository/components/lib directory of your ESB.
    scp -i /home/damitha/.ssh/stratos.pem mysql-connector-java-5.1.12-bin.jar \
        ubuntu@ec2-174-129-91-98.compute-1.amazonaws.com:/opt/wso2esb/repository/components/lib 
    
    Now start the servers from the esb bin directory by executing
    ./wso2server.sh 
    If everything goes well your ESB should start without any problem.

 

Monitoring

You need to sign up for the cloudwatch to monitor your RDS instance. Although there is a command line tool set to download[5], using the management console of cloudwatch is what I recommend.
 

Scaling


First, I created a RDS small instance. But when I test with a big load it's cpu came close to 70% and memory went to 100%. Then I upgraded my instance to a large instance through RDS management console.
Note: choosing to change your DB Instance Class and/or Backup Retention Period along with selecting "Apply Immediately" causes your DB Instance to reboot immediately.
I enabled Multi-AZ replication without rebooting the instance. Also I increased my storage from 10GB to 15GB without rebooting the RDS instance.

Backup

Your backup strategy is very important when designing your databases. Amazon provides several backup strategies to keep you justified in choosing RDS.
  • Taking a snapshot


    rds-create-db-snapshot registry --db-snapshot-identifier=test_snap
    
    
  • Restoring from a snapshot


    rds-restore-db-instance-from-db-snapshot test_restore \ 
     --db-snapshot-identifier=test_snap --db-instance-class=db.m1.large
    

  • Automatic Backups

    You can configure backup-retention-period(in days) and backup window(in hours) so that you can restore to point in time backups in case of a mishap. That means you can restore to any backup window during that retention period. The default window is 2 hours. If you wish you can disable automatic backups(to improve performance?)

    rds-modify-db-instance registry --backup-retention-period=1 --apply-immediately
    

    To restore to a point in time you need to call
    RestoreDBInstanceToPointInTime

Performance

Suppose you have an EC2 instance running WSO2 ESB. Suppose you have another EC2 instance which is configured to run a MySQL database. And also suppose your WSO2 ESB is configured to connect to that database running in EC2. We have already discussed benefits of moving your database into an RDS instance. But how about the implications on performance?. To find an answer I used a large RDS instance and a large ec2 instace with mysql server to compare. Both of them has somewhat similar mysql configurations. I used mysqlslap performance test tool to test with auto generation sql for large tables of 7 char columns and 4 int columns. Following are the two reports. I sent 10,000 database requests(schema, insert, retrieve) using mysqlslap perf test tool with 20 concurrency.
 
ubuntu@domU-12-31-39-0F-20-12:~/rds_test$ cat rds_report 
Benchmark
	Average number of seconds to run all queries: 89.252 seconds
	Minimum number of seconds to run all queries: 87.491 seconds
	Maximum number of seconds to run all queries: 91.013 seconds
	Number of clients running queries: 20
	Average number of queries per client: 500

ubuntu@domU-12-31-39-0F-20-12:~/rds_test$ cat mysql_report 
Benchmark
	Average number of seconds to run all queries: 80.300 seconds
	Minimum number of seconds to run all queries: 79.268 seconds
	Maximum number of seconds to run all queries: 81.333 seconds
	Number of clients running queries: 20
	Average number of queries per client: 500
It seems RDS instance is somewhat slow, but we cannot come to conclusion that it is slow, because there could be slight configuration changes.
To see how my RDS instance handle heavy loads, I loaded the RDS instance from an ec2 large instance running mysqlslap, in the same Amazon zone as the rds instance. I sent 1million requests with concurrency 500 with autogeneration sql for large tables of 7 char columns and 4 int columns. From this I could verify that RDS instance could reliably sustain the load.

 

Some important facts when using RDS

  • It is recommneded to use one parameter group for each instance. Using one parameter group for mulitiple instances could lead to chaos.
  • Transferring data between two regions could be very expensive.
  • If you use parameter groups to modify mysql configurations make sure that you know what you are doing. For most of the time default configuraion is sufficient.
  • Use Cloud watch management console to clearly plan your maintaincance strategy.

 

Limitations

  • Amaozon does not provide a shell access the to the machine where RDS is running. That means we cannot install our own software in that machine. For example we cannot install our own favourite monitoring software in that machine.
  • Have to depend on the cloudwatch RDS monitor provided by Amazon. Cannot install our own favourite monitoring software
  • Replication support is still at preliminary stage
  • Clustering is stilll not supported
  • We don't have direct access to MySQL configuration. Can configure MySQL through parameters using command line tool which is step more work to do than direct configuration.
  • Autoscaling is still not supported.
  • Accessing RDS instance is not provided through same API's as EC2. For example you cannot use Elastifox to access your RDS instance.
  • When a DB Instance changes database parameter groups (i.e. it stops using its current DB Parameter Group and starts using another), it must be restarted. However when adding or modifying a parameter the rebooting may depend on whether the add/modify parameter is dynamic or static. 

 

Reference

[1]http://aws.amazon.com/rds/
[2]https://console.aws.amazon.com/ec2/home
[3]http://developer.amazonwebservices.com/connect/entry.jspa?externalID=2936&ref=featured
[4]http://wso2.com/products/enterprise-service-bus/
[5]http://developer.amazonwebservices.com/connect/entry.jspa?externalID=2534&categoryID=88

Author: Damitha Kumarage
Technical Lead, WSO2 Inc.

1 comment:

  1. I get a lot of great information here and this is what I am searching for. Thank you for your sharing. I have bookmark this page for my future reference. Search aws jobs in hyderabad.

    ReplyDelete