Set up a new MySQL service on port 3307 without touching the existing database

1. Create directories for MySQL_3307
$ sudo mkdir -p /opt/mysql_3307/{data,tmp,run,binlogs,log}
$ sudo chown mysql:mysql /opt/mysql_3307/{data,tmp,run,binlogs,log}

2. Create configuration file for MySQL_3307
$ sudo vim /etc/my_3307.cnf

[mysqld]
# basic settings
datadir = /opt/mysql_3307/data
tmpdir = /opt/mysql_3307/tmp
socket = /opt/mysql_3307/run/mysqld.sock
port = 3307
pid-file = /opt/mysql_3307/run/mysqld.pid

# innodb settings
default-storage-engine = INNODB
innodb_file_per_table = 1
log-bin = /opt/mysql_3307/binlogs/bin-log-mysqld
log-bin-index = /opt/mysql_3307/binlogs/bin-log-mysqld.index
innodb_data_home_dir = /opt/mysql_3307/data
innodb_data_file_path = ibdata1:10M:autoextend
innodb_log_group_home_dir = /opt/mysql_3307/data

# server id
server-id=33071

# other settings
max_allowed_packet = 64M
max_connections = 1024
expire_logs_days = 14
character-set-server = utf8

[mysqld_safe]
log-error = /opt/mysql_3307/log/mysqld.log
pid-file = /opt/mysql_3307/run/mysqld.pid
open-files-limit = 8192

[mysqlhotcopy]
interactive-timeout

[client]
port = 3307
socket = /opt/mysql_3307/run/mysqld.sock
default-character-set = utf8
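
To double-check which options mysqld will actually pick up from the new file, my_print_defaults (shipped with the MySQL packages) can parse it:

$ my_print_defaults --defaults-file=/etc/my_3307.cnf mysqld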

3. Initialize MySQL_3307
$ sudo -i
# su - mysql
$ mysql_install_db --defaults-file=/etc/my_3307.cnf --user=mysql --datadir=/opt/mysql_3307/data/
$ exit
# exit

4. Start MySQL_3307
$ sudo mysqld_safe --defaults-file=/etc/my_3307.cnf --user=mysql >/dev/null 2>&1 &
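
To verify that the new instance is up and listening on port 3307 (either check works):

$ mysqladmin -S /opt/mysql_3307/run/mysqld.sock status
$ sudo netstat -lnpt | grep 3307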

5. Stop MySQL_3307
$ sudo pkill -f "/etc/my_3307.cnf"

6. Connect to MySQL_3307
On the local machine you must connect through the socket file.
"-P 3307" and "--port=3307" have no effect there: when the host is localhost (the default), the mysql client ignores the port and connects over the default Unix socket, i.e. to the 3306 instance. This is documented client behavior, not a bug.
From other servers, "-P 3307" works as expected.

$ alias mysql_3307='mysql -S /opt/mysql_3307/run/mysqld.sock'
$ mysql_3307 -uroot -p
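
If you prefer TCP, pointing the client at 127.0.0.1 (or forcing the protocol) bypasses the localhost socket shortcut; either of these should work:

$ mysql -h 127.0.0.1 -P 3307 -uroot -p
$ mysql --protocol=TCP -P 3307 -uroot -p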

7. Create service script for MySQL_3307
$ sudo vim /etc/init.d/mysql_3307

#!/bin/bash
#
# MySQL daemon on Port 3307 start/stop/status script.
# by Dong Guo at 2014-03-19
#

CONF="/etc/my_3307.cnf"
PIDFILE="/opt/mysql_3307/run/mysqld.pid"

function check_root(){
    if [ $EUID -ne 0 ]; then
        echo "This script must be run as root" 1>&2
        exit 1
    fi
}

status(){
  if test -s "${PIDFILE}"; then
    read mysqld_pid < "${PIDFILE}"
    if kill -0 ${mysqld_pid} 2>/dev/null ; then
      echo "MySQL (on Port 3307) running (${mysqld_pid})"
      exit 0
    else
      echo "MySQL (on Port 3307) is not running, but PID file exists"
      exit 1
    fi
  else
    echo "MySQL (on Port 3307) is not running"
    exit 2
  fi
}

start(){
  if test -s "${PIDFILE}"; then
    read mysqld_pid < "${PIDFILE}"
    if kill -0 ${mysqld_pid} 2>/dev/null ; then
      echo "MySQL (on Port 3307) is already running (${mysqld_pid})"
      exit 0
    else
      echo "MySQL (on Port 3307) is not running, but PID file exists"
      exit 1
    fi
  else
    echo "Starting MySQL (on Port 3307)"
    mysqld_safe --defaults-file=${CONF} --user=mysql >/dev/null 2>&1 &
  fi
}

stop(){
  if test -s "${PIDFILE}"; then
    read mysqld_pid < "${PIDFILE}"
    if kill -0 ${mysqld_pid} 2>/dev/null ; then
      echo "Stopping MySQL (on Port 3307)"
      if pkill -f "${CONF}" ; then
        rm ${PIDFILE}
      fi
    else
      echo "MySQL (on Port 3307) is not running, but PID file exists"
      exit 1
    fi
  else
    echo "MySQL (on Port 3307) is not running"
    exit 2
  fi
}

check_root
case "$1" in
    start)
        start
        sleep 2
        status
        ;;
    stop)
        stop
        sleep 2
        status
        ;;
    status)
        status
        ;;
    *)
        echo $"Usage: $0 {start|stop|status}"
        exit 2
esac

$ sudo chmod +x /etc/init.d/mysql_3307
$ sudo /etc/init.d/mysql_3307 status
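
To start the instance at boot, the script could also be registered with chkconfig; this assumes you first add the standard chkconfig header comments (e.g. "# chkconfig: 2345 64 36" and a "# description:" line) to the top of the script:

$ sudo chkconfig --add mysql_3307
$ sudo chkconfig mysql_3307 on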


BBCP Test Report

Environment:
Tested 10 times between west-server1 and east-server1.

Speed:
The transfers ran between the two sites over the internet, on a link that was sometimes busy and sometimes not.
bbcp's speed was more stable and always faster than scp; it usually took only about 30% of scp's time, sometimes even less.

System Load:
During the whole transfer process:
CPU load changed very little, staying around "0.5-1".
The "-/+ buffers/cache" value also changed very little, by only around 100M, sometimes even less.

Big file vs. small files:
Tested the models directory (1.9G, 204 directories, 82 files) against a single big file (file.2g).
The single big file was faster: the models directory took 9m25s and file.2g took 6m12s, both still much faster than "scp -r" (24m8s).

Stopping mid-transfer and resuming:
By default, files only appear on the remote side once they are complete.
So if a file with the same name already existed on the remote side, bbcp skipped it with a message:

"bbcp: File /home/heydevops/file.2g already exists."

The "-a" option gives bbcp the ability to pick up where it left off; it creates a checkpoint file on the remote side:
[heydevops.guo@east-server1 heydevops]$ bbcp -k -a /home/heydevops/ -r -P 2 -V -f -w 9m -s 16 -i /home/heydevops/.ssh/id_rsa file.256m west-server1:/home/heydevops/
[heydevops@west-server1 ~]$ cat bbcp.172.16.4.11.905004000a0.file.256m

0 f 278176444449211 644 268435456 531fd087 531fcffd serverymp file.256m

[heydevops@east-server1 ~]$ bbcp -k -a /home/heydevops/ -r -P 2 -V -f -w 9m -s 16 -i /home/heydevops/.ssh/id_rsa file.256m west-server1:/home/heydevops/

...
bbcp: Will try to complete copying /home/heydevops/file.256m
bbcp: Appending to /home/heydevops/file.256m at offset 199233536
...

The command explained:
# bbcp -k -a /home/heydevops/ -r -P 2 -V -f -w 9m -s 16 -i /home/heydevops/.ssh/id_rsa file.256m west-server1:/home/heydevops/

-k keeps any partially created target files and allows full recovery after a copy failure.
-a /home/heydevops/ appends data to the end of the target file if the target is found to be incomplete due to a previously failed copy.
-r performs a recursive copy by copying all files starting at the source directory
-P 2 produces progress messages every 2 seconds.
-V produces verbose output, including detailed transfer-speed statistics.
-f forces the copy by erasing the target prior to copying the source file.
-w 9m sets the TCP window size for the transfer (see the quick calculation after this list).
(window = bandwidth/8 × RTT = 1000Mb/s ÷ 8 × 74ms = 125MB/s × 0.074s ≈ 9.25MB)
[http://www.slac.stanford.edu/~abh/bbcp/#_Toc332986061]
-i specifies the name of the ssh identity file.
-s 16 sets the number of parallel network streams to 16 as suggested:
[http://www.slac.stanford.edu/~abh/bbcp/#_Streams_(-s)]
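
As a quick sanity check on the window formula above, the bandwidth-delay product can be computed for any link with a one-liner (same assumed 1000Mb/s bandwidth and 74ms RTT as above):

$ awk 'BEGIN { bw_mbps=1000; rtt_s=0.074; printf "%.2f MB\n", bw_mbps/8*rtt_s }'
9.25 MB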

System optimization:
Increase the OS default maximum window size.
Otherwise bbcp warns that autotuning may be misconfigured for the given window size:

"bbcp: Target autotuning may be misconfigured; max set to 245760 bytes(240K)."

# vim /etc/sysctl.conf

net.core.rmem_max = 536870912
net.core.wmem_max = 536870912
net.core.rmem_default = 536870912
net.core.wmem_default = 536870912
net.core.optmem_max = 536870912
net.core.netdev_max_backlog = 1000000
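
Then load the new values without a reboot:

# sysctl -p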


Try BBCP to speed up data transfers over the internet

[heydevops@east-server1 ~]$ sudo wget http://www.slac.stanford.edu/~abh/bbcp/bin/amd64_rhel60/bbcp -O /usr/bin/bbcp
[heydevops@east-server1 ~]$ sudo chmod +x /usr/bin/bbcp

[heydevops@west-server1 ~]$ sudo wget http://www.slac.stanford.edu/~abh/bbcp/bin/amd64_rhel60/bbcp -O /usr/bin/bbcp
[heydevops@west-server1 ~]$ sudo chmod +x /usr/bin/bbcp

[heydevops@east-server1 ~]$ which bbcp

/usr/bin/bbcp

[heydevops@east-server1 ~]$ ssh west-server1 which bbcp

/usr/bin/bbcp

[heydevops@east-server1 ~]$ cd heydevops
[heydevops@east-server1 heydevops]$ sudo dd if=/dev/zero of=/home/heydevops/heydevops/file.2g bs=1024M count=2

2+0 records in
2+0 records out
2147483648 bytes (2.1 GB) copied, 45.9129 s, 46.8 MB/s

[heydevops@east-server1 heydevops]$ ls -lh

total 2.0G
-rw-r--r-- 1 root   root   2.0G Mar  4 06:40 file.2g

[heydevops@east-server1 heydevops]$ time bbcp -r -P 2 -V -w 8m -s 16 file.2g west-server1:/home/heydevops/heydevops/

bbcp: Window size reduced to 245760 bytes.
bbcp: Indexing files to be copied...
bbcp: Copying 0 files in 0 directories.
Source east-server1.heylinux.com using initial send window of 18700
Target west-server1.heylinux.com using initial recv window of 87380
bbcp: Creating /home/heydevops/heydevops/file.2g
bbcp: 140304 06:46:12  0% done; 8.5 MB/s, avg 8.5 MB/s
bbcp: 140304 06:46:14  1% done; 6.9 MB/s, avg 7.5 MB/s
...
bbcp: 140304 06:51:46  99% done; 7.7 MB/s, avg 6.1 MB/s
bbcp: 140304 06:51:48  99% done; 3.3 MB/s, avg 6.1 MB/s
Source cpu=3.643 (sys=3.552 usr=0.091).
File /home/heydevops/heydevops/file.2g created; 2147483648 bytes at 6.0 MB/s
48 buffers used with 0 reorders; peaking at 0.
Source east-server1.heylinux.com using a final send window of 433840
Target cpu=15.149 (sys=14.505 usr=0.644).
Target west-server1.heylinux.com using a final recv window of 2298624
1 file copied at effectively 6.0 MB/s

real    5m42.236s
user    0m0.104s
sys     0m3.567s

[heydevops@east-server1 heydevops]$ time scp file.2g west-server1:/home/heydevops/heydevops/

file.2g   100%   2048MB   2.1MB/s   16:06    

real    16m8.448s
user    0m43.497s
sys     0m7.548s



2013 Annual Review

I still remember most of the conversation from when 'Tom' interviewed me last February; we had a good talk about Linux and many operations topics.
At that time, my life was not going well: I was working overtime until 11pm+ every day. I wanted a challenging job that was not too busy, with more time for my family.

Now I believe I have found it. In the past year I have gained a lot from my work: the knowledge, the team, a better life and a trip to the US.
The knowledge I’ve learned:
1. Python
I had heard of Python but never used it before. For our project, I studied Python for about two weeks, then wrote some useful tools to improve our operations. The most helpful ones are:
"xenserver.py", which makes creating local virtual machines much easier; all the VMs (200+) in our new colo were created with it.
"racktables.py", which makes managing the server inventory much easier; it collects the information and automatically audits it into RackTables along with the rack spaces.
2. Puppet, Salt, Ansible
I had used Chef for IT automation before; our project uses Puppet, so I learned it and wrote modules to install Hadoop, Graphite, StatsD and MooseFS. I then learned Salt and Ansible and wrote modules to do the same things as in Puppet. I also learned the HA solution for Puppet servers.
3. Hadoop
I had experience operating Hadoop, but only on small clusters. Our project has two big clusters; I learned many troubleshooting skills from our team and the runbooks, and helped fix some serious incidents while on-call. I also learned how to integrate LDAP/Kerberos with Hadoop, upgrade CDH3 to CDH4, and set up Impala.
4. Nagios
I had used Zabbix as a monitoring system before; our project uses Nagios. I learned how Nagios works, then wrote some scripts to check new services, such as the Dyn QPS report, disk errors, web API connections, the time server and DNS.
5. Database
I learned MySQL auto-failover and Percona XtraDB Cluster to improve high availability, improved the backup scripts and fixed a backup issue.
6. Others
I also learned much other interesting stuff, like MooseFS, DRBD, rpmbuild, Jenkins CI, Couchbase, Redis HA and BTSync.

The team I’ve gained:
1. Good team leader
'Tom' has a broad view, leads us to learn new technologies and improve our operations work, and is open-minded about suggestions, so we can enjoy the work and improve our skills.
2. Warm-hearted co-workers
Our team is small, but it is warm. We learn from each other, and we help each other. Especially at the beginning of my on-call rotation, 'Jack' patiently helped me a lot. I love this team.

A better life I've gained:
I have had more time to stay with my family. Before, when I was single, I thought that if I got married I would not work as hard, because I would have to take care of my family. But I was totally wrong: now I have the responsibility to work harder, to make sure I can give them a better life.

A trip to the US:
Maybe this is quite normal for many people, but it was amazing for me. Working in the US with you guys for half a month, I experienced many different things: the culture, the company, the people. In China, few companies are like that.

2013 was a great year for me, but I know clearly that I did not do well in some areas:
1. Hadoop cluster operation
When the Hadoop clusters have incidents involving our log systems or Oozie workflows, I find it difficult to locate the root causes. The runbooks do not cover every situation, and if I do not understand these systems well, I cannot resolve the incidents. I need to learn more about these two areas.
2. Suggestions for operation
I should bring more useful suggestions, not just follow the tickets, emails and on-call duty. For example, compared with Zabbix, Nagios's graphing tools are poor. When something better exists, I should learn it and push for it to improve our project.

Thanks. I will keep enjoying the work, work harder, and keep learning to make our company better and my life better.


How to create the first VM on XenServer automatically without XenCenter

Environment:
httprepo: 192.168.92.128
xenserver: 192.168.92.140
vm: 192.168.92.142

1. Create a http repo
# yum install httpd
# wget http://linux.mirrors.es.net/centos/6/isos/x86_64/CentOS-6.4-x86_64-minimal.iso

# mkdir -p /var/www/html/repo/centos/6
# mkdir /var/www/html/repo/ks
# mount -o loop CentOS-6.4-x86_64-minimal.iso /var/www/html/repo/centos/6/

# service httpd start

2. Create the kickstart configuration file
# vi /var/www/html/repo/ks/centos-6.4-x86_64-minimal.ks

cmdline
skipx
install
cdrom
lang en_US.UTF-8
keyboard us

network --onboot yes --device eth0 --bootproto=static --ip=192.168.92.142 --netmask=255.255.255.0 --gateway=192.168.92.2 --nameserver=192.168.92.2 --noipv6

rootpw mypasswd

firewall --service=ssh
authconfig --enableshadow --passalgo=sha512
selinux --disabled
timezone --utc Etc/UTC

bootloader --location=mbr --driveorder=xvda --append="crashkernel=auto"

zerombr
clearpart --all --initlabel
autopart

reboot

%packages --nobase
@core
%end
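
If the pykickstart package is installed, you can syntax-check the kickstart file before any install attempt (optional; ksvalidator ships with pykickstart):

# yum install pykickstart
# ksvalidator /var/www/html/repo/ks/centos-6.4-x86_64-minimal.ks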

3. Install the first vm automatically
Get the Local storage uuid
# xe sr-list | grep -C 1 "Local"

uuid ( RO) : fbeda99f-b5a7-3100-5e3d-fbb48a46fca0
name-label ( RW): Local storage
name-description ( RW):

Create an empty vm
# xe vm-install new-name-label=centos6_template sr-uuid=fbeda99f-b5a7-3100-5e3d-fbb48a46fca0 template=Other\ install\ media

2fe3c706-9506-50d5-a557-0d61ebde651b

Configure the CPU and memory of the VM
# xe vm-param-set VCPUs-max=1 uuid=2fe3c706-9506-50d5-a557-0d61ebde651b
# xe vm-param-set VCPUs-at-startup=1 uuid=2fe3c706-9506-50d5-a557-0d61ebde651b

# xe vm-param-set memory-dynamic-max=512MiB uuid=2fe3c706-9506-50d5-a557-0d61ebde651b
# xe vm-param-set memory-static-max=512MiB uuid=2fe3c706-9506-50d5-a557-0d61ebde651b
# xe vm-param-set memory-dynamic-min=512MiB uuid=2fe3c706-9506-50d5-a557-0d61ebde651b
# xe vm-param-set memory-static-min=512MiB uuid=2fe3c706-9506-50d5-a557-0d61ebde651b

Configure the bootloader, http repo and kickstart for the VM installation
# xe vm-param-set HVM-boot-policy="" uuid=2fe3c706-9506-50d5-a557-0d61ebde651b
# xe vm-param-set PV-bootloader="eliloader" uuid=2fe3c706-9506-50d5-a557-0d61ebde651b
# xe vm-param-set other-config:install-repository="http://192.168.92.128/repo/centos/6/" uuid=2fe3c706-9506-50d5-a557-0d61ebde651b
# xe vm-param-set PV-args="ip=192.168.92.142 netmask=255.255.255.0 gateway=192.168.92.2 ns=192.168.92.2 noipv6 ks=http://192.168.92.128/repo/ks/centos-6.4-x86_64-minimal.ks ksdevice=eth0" uuid=2fe3c706-9506-50d5-a557-0d61ebde651b

Add a virtual disk to the vm
# xe vm-disk-add uuid=2fe3c706-9506-50d5-a557-0d61ebde651b sr-uuid=fbeda99f-b5a7-3100-5e3d-fbb48a46fca0 device=0 disk-size=20GiB

Make the virtual disk bootable
# xe vbd-list vm-uuid=2fe3c706-9506-50d5-a557-0d61ebde651b userdevice=0 params=uuid --minimal
d304bbbd-f4e2-d648-a668-fe6a803bc301

# xe vbd-param-set bootable=true uuid=d304bbbd-f4e2-d648-a668-fe6a803bc301

Create a network connection of the vm
# xe network-list bridge=xenbr0 --minimal
a6fcd4a1-fb61-6f73-2b31-2a20ad45e0cc

# xe vif-create vm-uuid=2fe3c706-9506-50d5-a557-0d61ebde651b network-uuid=a6fcd4a1-fb61-6f73-2b31-2a20ad45e0cc mac=random device=0
aaf0a04d-c721-fae8-aca1-eb63e047ea93

Start the VM; it will be installed automatically with the basic packages and will start sshd
# xe vm-start uuid=2fe3c706-9506-50d5-a557-0d61ebde651b
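
For repeated use, the whole sequence can be wrapped in a small shell sketch. This is a hypothetical helper, not part of the original procedure: it assumes the same SR name ("Local storage"), repo/kickstart URLs and install-time network settings as above.

#!/bin/sh
# Hypothetical wrapper around the xe steps above; adjust names and URLs to your setup.
REPO_URL="http://192.168.92.128/repo/centos/6/"
KS_URL="http://192.168.92.128/repo/ks/centos-6.4-x86_64-minimal.ks"

# Look up the local SR and create an empty VM from the bare-metal template
SR_UUID=$(xe sr-list name-label="Local storage" --minimal)
VM_UUID=$(xe vm-install new-name-label=centos6_vm sr-uuid=$SR_UUID template="Other install media")

# CPU and memory (fixed at 1 vCPU / 512MiB, as above)
xe vm-param-set VCPUs-max=1 uuid=$VM_UUID
xe vm-param-set VCPUs-at-startup=1 uuid=$VM_UUID
xe vm-param-set memory-dynamic-max=512MiB uuid=$VM_UUID
xe vm-param-set memory-static-max=512MiB uuid=$VM_UUID
xe vm-param-set memory-dynamic-min=512MiB uuid=$VM_UUID
xe vm-param-set memory-static-min=512MiB uuid=$VM_UUID

# PV boot via eliloader, pointing at the http repo and the kickstart file
xe vm-param-set HVM-boot-policy="" uuid=$VM_UUID
xe vm-param-set PV-bootloader="eliloader" uuid=$VM_UUID
xe vm-param-set other-config:install-repository="$REPO_URL" uuid=$VM_UUID
xe vm-param-set PV-args="ip=192.168.92.142 netmask=255.255.255.0 gateway=192.168.92.2 ns=192.168.92.2 noipv6 ks=$KS_URL ksdevice=eth0" uuid=$VM_UUID

# Disk, bootable flag and NIC
xe vm-disk-add uuid=$VM_UUID sr-uuid=$SR_UUID device=0 disk-size=20GiB
VBD_UUID=$(xe vbd-list vm-uuid=$VM_UUID userdevice=0 params=uuid --minimal)
xe vbd-param-set bootable=true uuid=$VBD_UUID
NET_UUID=$(xe network-list bridge=xenbr0 --minimal)
xe vif-create vm-uuid=$VM_UUID network-uuid=$NET_UUID mac=random device=0

# Kick off the unattended install
xe vm-start uuid=$VM_UUID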

4. Wait about 20 minutes, then try to log in to the VM
# ssh root@192.168.92.142

The authenticity of host '192.168.92.142 (192.168.92.142)' can't be established.
RSA key fingerprint is d5:de:ac:6a:7f:33:0c:ca:84:aa:d5:62:43:d2:9a:23.
Are you sure you want to continue connecting (yes/no)? yes
Warning: Permanently added '192.168.92.142' (RSA) to the list of known hosts.
root@192.168.92.142's password: mypasswd
Last login: Thu Oct 10 22:17:48 2013 from 192.168.92.140
[root@localhost ~]# ls
anaconda-ks.cfg install.log install.log.syslog
[root@localhost ~]#



How to set up Percona XtraDB Cluster

Environment:
servers: demoenv-trial-1 demoenv-trial-2 demoenv-trial-3

1. Install Percona Server, on all servers:
$ sudo yum install http://www.percona.com/downloads/percona-release/percona-release-0.0-1.x86_64.rpm
$ sudo yum install http://dl.fedoraproject.org/pub/epel/6/x86_64/epel-release-6-8.noarch.rpm
$ sudo yum install Percona-Server-shared-compat
$ sudo yum install Percona-Server-server-55 Percona-Server-client-55

2. Configure the /etc/my.cnf, on all servers:
$ sudo vim /etc/my.cnf

[mysqld]
# basic settings
datadir = /opt/mysql/data
tmpdir = /opt/mysql/tmp
socket = /opt/mysql/run/mysqld.sock
port = 3306
pid-file = /opt/mysql/run/mysqld.pid

# innodb settings
default-storage-engine = INNODB
innodb_file_per_table = 1
log-bin = /opt/mysql/binlogs/bin-log-mysqld
log-bin-index = /opt/mysql/binlogs/bin-log-mysqld.index
innodb_data_home_dir = /opt/mysql/data
innodb_data_file_path = ibdata1:10M:autoextend
innodb_log_group_home_dir = /opt/mysql/data
binlog-do-db = testdb

# server id
server-id=1

# other settings
max_allowed_packet = 64M
max_connections = 1024
expire_logs_days = 14
character-set-server = utf8

[mysqld_safe]
log-error = /opt/mysql/log/mysqld.log
pid-file = /opt/mysql/run/mysqld.pid
open-files-limit = 8192

[mysqlhotcopy]
interactive-timeout

[client]
port = 3306
socket = /opt/mysql/run/mysqld.sock
default-character-set = utf8

3. Create directories, on all servers:
$ sudo mkdir -p /opt/mysql/{data,tmp,run,binlogs,log}
$ sudo chown mysql:mysql /opt/mysql/{data,tmp,run,binlogs,log}

4. Initialize the database, on all servers:
$ sudo -i
# su - mysql
$ mysql_install_db --user=mysql --datadir=/opt/mysql/data/
$ exit
# exit
$ sudo /etc/init.d/mysql start
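
Optionally verify that each node's base install works before swapping in the cluster packages (this uses the socket path from the config above):

$ mysql -S /opt/mysql/run/mysqld.sock -uroot -e "SELECT VERSION();"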

5. Remove the Percona Server packages (without dependency checks), on all servers:
Otherwise, when we install XtraDB Cluster, we will get error messages like:
Error: Percona-XtraDB-Cluster-client conflicts with Percona-Server-client-55-5.5.34-rel32.0.591.rhel6.x86_64
Error: Percona-XtraDB-Cluster-server conflicts with Percona-Server-server-55-5.5.34-rel32.0.591.rhel6.x86_64
Error: Percona-XtraDB-Cluster-shared conflicts with Percona-Server-shared-55-5.5.34-rel32.0.591.rhel6.x86_64

$ sudo /etc/init.d/mysql stop
$ sudo rpm -qa | grep Percona-Server | grep -v compat | xargs sudo rpm -e --nodeps



How to upgrade Percona Server 55 to 56

1. Stop Percona Server, on all servers:
$ sudo service mysql stop

2. Check installed packages, on all servers:
$ sudo rpm -qa | grep Percona-Server | grep -v compat

Percona-Server-client-55-5.5.34-rel32.0.591.rhel6.x86_64
Percona-Server-shared-55-5.5.34-rel32.0.591.rhel6.x86_64
Percona-Server-server-55-5.5.34-rel32.0.591.rhel6.x86_64

Remove them without dependencies, on all servers:
$ sudo rpm -qa | grep Percona-Server | grep -v compat | xargs sudo rpm -e --nodeps

3. Install new packages, on all servers:
$ sudo yum install Percona-Server-server-56

4. Start without reading the grant tables, on all servers (you may see some ERROR messages):
$ sudo /usr/sbin/mysqld --skip-grant-tables --user=mysql &
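
The usual follow-up at this point (beyond what this excerpt shows) is to run mysql_upgrade so the system tables are converted to the 5.6 format; a sketch, assuming the socket path from the earlier config:

$ mysql_upgrade -S /opt/mysql/run/mysqld.sock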


Replication and auto-failover made easy with MySQL Utilities

Reference:
http://www.clusterdb.com/mysql/replication-and-auto-failover-made-easy-with-mysql-utilities

Environment:
master: demoenv-trial-1
slaves: demoenv-trial-2 demoenv-trial-3

1. Install Percona Server, on all servers:
$ sudo yum install http://www.percona.com/downloads/percona-release/percona-release-0.0-1.x86_64.rpm
$ sudo yum install Percona-Server-shared-compat
$ sudo yum install Percona-Server-server-56

$ sudo yum install http://dl.fedoraproject.org/pub/epel/6/x86_64/epel-release-6-8.noarch.rpm
$ sudo yum install mysql-utilities

2. Configure /etc/my.cnf, on all servers:
Ensure that "server-id" is different on each server and that "report-host" matches the server's own hostname.
$ sudo vim /etc/my.cnf

[mysqld]
# basic setting
datadir = /opt/mysql/data
tmpdir = /opt/mysql/tmp
socket = /opt/mysql/run/mysqld.sock
port = 3306
pid-file = /opt/mysql/run/mysqld.pid

# innodb setting
default-storage-engine = INNODB
innodb_file_per_table = 1
log-bin = /opt/mysql/binlogs/bin-log-mysqld
log-bin-index = /opt/mysql/binlogs/bin-log-mysqld.index
innodb_data_home_dir = /opt/mysql/data
innodb_data_file_path = ibdata1:10M:autoextend
innodb_log_group_home_dir = /opt/mysql/data
binlog-do-db = testdb

# server id
server-id=1

# gtids setting
binlog-format = ROW
log-slave-updates = true
gtid-mode = on
enforce-gtid-consistency = true
report-host = demoenv-trial-1
report-port = 3306
master-info-repository = TABLE
relay-log-info-repository = TABLE
sync-master-info = 1

# other settings
max_allowed_packet = 64M
max_connections = 1024
expire_logs_days = 14
character-set-server = utf8

[mysqld_safe]
log-error = /opt/mysql/log/mysqld.log
pid-file = /opt/mysql/run/mysqld.pid
open-files-limit = 8192

[mysqlhotcopy]
interactive-timeout

[client]
port = 3306
socket = /opt/mysql/run/mysqld.sock
default-character-set = utf8

3. Create directories, on all servers:
$ sudo mkdir -p /opt/mysql/{data,tmp,run,binlogs,log}
$ sudo chown mysql:mysql /opt/mysql/{data,tmp,run,binlogs,log}

4. Initialize the database, on all servers:
$ sudo -i
# su - mysql
$ mysql_install_db --user=mysql --datadir=/opt/mysql/data/
$ exit
# exit
$ sudo /etc/init.d/mysql start

5. Create remote access for root@'%' so that "mysqlreplicate" can create the replication settings, on all servers:
$ mysql -uroot
mysql> grant all on *.* to root@'%' identified by 'pass' with grant option;
mysql> quit;

6. Create replication user, on all servers:
$ mysql -uroot
mysql> grant replication slave on *.* to 'rpl'@'%' identified by 'rpl';
mysql> quit;

7. Set up replication, on any one server:
[dong.guo@demoenv-trial-1 ~]$ mysql -uroot
mysql> use mysql;
mysql> drop user root@'demoenv-trial-1';
mysql> quit;
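
The remaining wiring (beyond this excerpt) is done with the mysqlreplicate tool from mysql-utilities; a sketch, run once per slave, using the hostnames and the 'rpl' user created above:

$ mysqlreplicate --master=root:pass@demoenv-trial-1:3306 --slave=root:pass@demoenv-trial-2:3306 --rpl-user=rpl:rpl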



How to install Impala on CDH4.2.1 in pseudo-distributed mode

Reference:
http://www.cloudera.com/content/cloudera-content/cloudera-docs/CDH4/latest/CDH4-Quick-Start/cdh4qs_topic_3_3.html
http://www.cloudera.com/content/cloudera-content/cloudera-docs/Impala/latest/Installing-and-Using-Impala/Installing-and-Using-Impala.html
http://blog.cloudera.com/blog/2013/02/from-zero-to-impala-in-minutes/

1. Install JDK
$ sudo yum install jdk-6u41-linux-amd64.rpm

2. Install CDH4
$ cd /etc/yum.repos.d/
$ sudo wget http://archive.cloudera.com/cdh4/redhat/6/x86_64/cdh/cloudera-cdh4.repo
$ sudo yum install hadoop-conf-pseudo

Step 1: Format the NameNode.
$ sudo -u hdfs hdfs namenode -format

Step 2: Start HDFS
$ for x in `cd /etc/init.d ; ls hadoop-hdfs-*` ; do sudo service $x start ; done
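
To confirm the daemons actually came up, the same loop works with status:

$ for x in `cd /etc/init.d ; ls hadoop-hdfs-*` ; do sudo service $x status ; done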

Step 3: Create the /tmp Directory
Remove the old /tmp if it exists:
$ sudo -u hdfs hadoop fs -rm -r /tmp

Create a new /tmp directory and set permissions:
$ sudo -u hdfs hadoop fs -mkdir /tmp
$ sudo -u hdfs hadoop fs -chmod -R 1777 /tmp

Step 4: Create Staging and Log Directories
Create the staging directory and set permissions:
$ sudo -u hdfs hadoop fs -mkdir /tmp/hadoop-yarn/staging
$ sudo -u hdfs hadoop fs -chmod -R 1777 /tmp/hadoop-yarn/staging

Create the done_intermediate directory under the staging directory and set permissions:
$ sudo -u hdfs hadoop fs -mkdir /tmp/hadoop-yarn/staging/history/done_intermediate
$ sudo -u hdfs hadoop fs -chmod -R 1777 /tmp/hadoop-yarn/staging/history/done_intermediate

Change ownership on the staging directory and subdirectory:
$ sudo -u hdfs hadoop fs -chown -R mapred:mapred /tmp/hadoop-yarn/staging

Create the /var/log/hadoop-yarn directory and set ownership:
$ sudo -u hdfs hadoop fs -mkdir /var/log/hadoop-yarn
$ sudo -u hdfs hadoop fs -chown yarn:mapred /var/log/hadoop-yarn

Step 5: Verify the HDFS File Structure:
Run the following command:
$ sudo -u hdfs hadoop fs -ls -R /



How to migrate the VMs' IP addresses in CloudStack

Background:
We had two CloudStack servers in Wuhan running over 30 VMs, all with IP addresses like 172.16.x.x. After we moved them to the Chengdu office, the network became a big problem: Chengdu uses 10.6.x.x, so we had to migrate all the VMs' IP addresses to keep them reachable.

We searched a lot on Google and the official forum, and asked the official technical support team. We learned that CloudStack does not provide this function yet, but they suggested we could try changing all the VMs' IP addresses in the database and then restarting CloudStack.

Solution:
In the end, we made it work with the following steps:

1. First, back up all the configuration files and the database.
Log in to the management server and stop all VMs on the "Instances" page.
Wait until all VMs have stopped; the time this takes depends on how many are running.
Then stop the management server and agent:

$ sudo /etc/init.d/cloud-management stop
$ sudo /etc/init.d/cloud-agent stop

Backup the configuration files:

$ sudo cp -rpa /etc/cloud /etc/cloud.bak
$ sudo cp -rpa /var/lib/cloud /var/lib/cloud.bak

Backup the Database:

$ mysqldump -uroot -p123456 cloud > cloud.sql
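
If anything goes wrong later, the dump can be restored with:

$ mysql -uroot -p123456 cloud < cloud.sql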

2. Change the IP addresses of the OS
Backup the network settings:

$ sudo cp -p /etc/network/interfaces /etc/network/interfaces.bak

Update the network settings:

$ sudo vi /etc/network/interfaces
auto lo
iface lo inet loopback

auto eth0
iface eth0 inet static
address 10.6.8.200
netmask 255.255.0.0
gateway 10.6.255.1

auto cloudbr0
iface cloudbr0 inet static
bridge_ports eth0
address 10.6.8.201
netmask 255.255.0.0

Restart the OS:

$ sudo sync
$ sudo reboot

After the OS restarts, we may need to run the following steps to bring the network up:

$ sudo ifdown cloudbr0
$ sudo ifup cloudbr0

$ sudo ifdown eth0 
$ sudo ifup eth0 

3. Update the configuration files. 
Check all IP settings in configuration files:

$ sudo grep 172.16 -r /etc/cloud/

Update all settings to 10.6.x.x:

$ sudo vi /etc/cloud/management/db.properties
cluster.node.IP=10.6.8.201

$ sudo vi /etc/cloud/agent/agent.properties
host.ip=10.6.8.201

$ sudo vi /etc/cloud/agent/agent.properties
host=10.6.8.201

4. Update all the IP settings in the database:
There is a lot of IP-related data in the database, so we need some scripts.

Update the IP addresses in the "user_ip_address" table:

#!/bin/sh
mysql -uroot -p123456 cloud -e "select * from user_ip_address" > cloud.user_ip_address.list
sed -i '1d' cloud.user_ip_address.list

for public_ip_address in `cat cloud.user_ip_address.list|awk '{print $5}'`
do
        value3=`echo $public_ip_address | awk -F "." '{print $3}'`
        value4=`echo $public_ip_address | awk -F "." '{print $4}'`
        id=`grep -w $public_ip_address cloud.user_ip_address.list | awk '{print $1}'`
        echo mysql -uroot -p123456 -e "update cloud.user_ip_address set public_ip_address='10.6.$value3.$value4' where id='$id';"
        mysql -uroot -p123456 -e "update cloud.user_ip_address set public_ip_address='10.6.$value3.$value4' where id='$id';"
done

Update the IP addresses in the "nics" table:

#!/bin/sh
mysql -uroot -p123456 cloud -e "select * from nics" | grep 172.16 > cloud.nics.list

for ip4_address in `cat cloud.nics.list|awk '{print $5}'`
do
        value3=`echo $ip4_address | awk -F "." '{print $3}'`
        value4=`echo $ip4_address | awk -F "." '{print $4}'`
        for id in `grep -w $ip4_address cloud.nics.list | awk '{print $1}'`
        do
                echo mysql -uroot -p123456 -e "update cloud.nics set ip4_address='10.6.$value3.$value4' where id='$id';"
                echo mysql -uroot -p123456 -e "update cloud.nics set gateway='10.6.255.1' where id='$id';"
                mysql -uroot -p123456 -e "update cloud.nics set ip4_address='10.6.$value3.$value4' where id='$id';"
                mysql -uroot -p123456 -e "update cloud.nics set gateway='10.6.255.1' where id='$id';"
        done
done

Update the IP addresses in the "op_dc_ip_address_alloc" table:

#!/bin/sh
mysql -uroot -p123456 cloud -e "select * from op_dc_ip_address_alloc" > cloud.op_dc_ip_address_alloc.list
sed -i '1d' cloud.op_dc_ip_address_alloc.list

for ip_address in `cat cloud.op_dc_ip_address_alloc.list|awk '{print $2}'`
do
    value3=`echo $ip_address | awk -F "." '{print $3}'`
    value4=`echo $ip_address | awk -F "." '{print $4}'`
    id=`grep -w $ip_address cloud.op_dc_ip_address_alloc.list | awk '{print $1}'`
    echo mysql -uroot -p123456 -e "update cloud.op_dc_ip_address_alloc set ip_address="10.6.$value3.$value4" where id="$id";"
    mysql -uroot -p123456 -e "update cloud.op_dc_ip_address_alloc set ip_address='10.6.$value3.$value4' where id='$id';"
done

Update the IP addresses in the "vm_instance" table:

#!/bin/sh
mysql -uroot -p123456 cloud -e "select id,private_ip_address from vm_instance;" > cloud.vm_instance.list
sed -i '1d' cloud.vm_instance.list

for private_ip_address in `cat cloud.vm_instance.list|awk '{print $2}'`
do
    value3=`echo $private_ip_address | awk -F "." '{print $3}'`
    value4=`echo $private_ip_address | awk -F "." '{print $4}'`
    echo mysql -uroot -p123456 -e "update cloud.vm_instance set private_ip_address="10.6.$value3.$value4" where private_ip_address="172.16.$value3.$value4";"
    mysql -uroot -p123456 -e "update cloud.vm_instance set private_ip_address='10.6.$value3.$value4' where private_ip_address='172.16.$value3.$value4';"
done

Then we should update the IP addresses in the other tables manually.

We can use this script to find the tables that contain IP address data:

#!/bin/sh
mysql -uroot -p123456 cloud -e "show tables;" > tables.list
# drop the "Tables_in_cloud" header line so it is not queried as a table
sed -i '1d' tables.list

for table in `cat tables.list`
do
    tabledata=`mysql -uroot -p123456 cloud -e "select * from $table;"`
    echo $tabledata | grep --color=auto "$1" && echo ============= $table
done

Save the script as check.sh, then run it like this:

$ ./check.sh 172.16

All the data in the DB related to 172.16 will be displayed; we should then update it all, except for these tables:

"alert", "usage_event"

It takes a lot of time, but it is really necessary.

5. Check the consistency of the settings (pod, zone, cluster).
If we started the agent or did anything else before finishing all the DB IP address updates, the settings in /etc/cloud/agent/agent.properties could have been changed by the agent itself.

So we should check that the settings in the file and the database are consistent:

$ mysql -uroot -p123456 cloud -e "select * from pod_vlan_map;"
+----+--------+------------+
| id | pod_id | vlan_db_id |
+----+--------+------------+
|  1 |      1 |          1 |
+----+--------+------------+

$ mysql -uroot -p123456 cloud -e "select * from cluster;"
+----+--------+--------------------------------------+------+--------+----------------+-----------------+--------------+------------------+---------------+---------+
| id | name   | uuid                                 | guid | pod_id | data_center_id | hypervisor_type | cluster_type | allocation_state | managed_state | removed |
+----+--------+--------------------------------------+------+--------+----------------+-----------------+--------------+------------------+---------------+---------+
|  1 | beluga | 51b5c695-8c94-4dfd-9f22-017dc9cf9af6 | NULL |      1 |              1 | KVM             | CloudManaged | Enabled          | Managed       | NULL    |
+----+--------+--------------------------------------+------+--------+----------------+-----------------+--------------+------------------+---------------+---------+

$ sudo cat /etc/cloud/agent/agent.properties
#Storage
#Fri Sep 14 10:04:27 CST 2012
guest.network.device=cloudbr0
workers=5
private.network.device=cloudbr0
port=8250
resource=com.cloud.agent.resource.computing.LibvirtComputingResource
pod=1
host.mac.address=c8\:60\:00\:6d\:55\:ad
vm.migrate.speed=0
zone=1
guid=c075ffc5-d6da-382b-826a-e2011c7badd1
host.ip=10.6.8.201
public.network.device=cloudbr0
cluster=1
local.storage.uuid=5ddd48d8-717b-4ba7-96b3-8690a6182c5f
domr.scripts.dir=scripts/network/domr/kvm
host=10.6.8.201
LibvirtComputingResource.id=1

6. Start the agent and management:

$ sudo /etc/init.d/cloud-agent start
$ sudo /etc/init.d/cloud-management start

7. Start the VMs 

Go to the page "Infrastructure" - "Zones" - "wuhan" - "System VMs".
Start the VMs "s-1-VM" and "v-2-VM".

Go to the page "Infrastructure" - "Zones" - "wuhan" - "Physical Network" - "PhysicalNetworkInBasicZone" - "Network Service Providers" - "Virtual Router" - "Instances". Start the VM "r-4-VM".

Go to the page "Infrastructure" - "Zones" - "wuhan" - "Compute and Storage" - "Hosts".
Check the state of the host: if it shows "Up", it is OK. Otherwise, check the log (/var/log/cloud/management/management-server.log) and fix the problem.

After the host is up, try to restart the guest network:
"Network" - "guestNetworkForBasicZone" - "Restart Network"

Then we can try to start the VMs on the "Instances" page, one by one.

If we meet any errors, we should keep investigating through the logs:
/var/log/cloud/management/management-server.log 
/var/log/cloud/agent/agent.log 

8. Things that could help
We can log in to the VMs locally from the management web UI via "View console".
The root password of the system VMs is "6m1ll10n".


