Hadoop Ops Notes: Adjusting the nice Priority of hdfs and yarn User Processes

Recently we added a batch of new datanodes to our production CDH5 cluster, but some of them showed a noticeably higher load than the others.
Looking at the Ganglia graphs, we saw that the CPU %ni (nice) on these datanodes was very high. Checking the processes, we found quite a few hdfs and yarn processes running at elevated priority (nice < 0, which can be checked with ps -lp <pid>), stealing resources from important system processes and causing problems. So we reset the priority of all hdfs and yarn user processes to 0 on the fly, which solved the issue. The steps are as follows:

vim /etc/security/limits.d/hdfs.conf
Add the following:

hdfs    hard    priority        20
hdfs    hard    nice            0

vim /etc/security/limits.d/yarn.conf
Add the following:

yarn    hard    priority  20
yarn    hard    nice    0

vim /etc/security/limits.d/mapred.conf
Add the following:

mapred    hard    priority  20
mapred    hard    nice    0

Run the following command to adjust the nice priority of the hdfs, yarn and mapred user processes on the fly:
renice 0 -u hdfs yarn mapred

# Note: ulimit only affects the current shell and cannot set limits for another
# user, so the "priority 20" / "nice 0" hard limits above take effect for new
# hdfs, yarn and mapred sessions via /etc/security/limits.d/. They can be
# verified per user, for example:
su -s /bin/bash hdfs -c 'ulimit -H -a'
su -s /bin/bash yarn -c 'ulimit -H -a'
su -s /bin/bash mapred -c 'ulimit -H -a'
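After the renice, a quick check that every running hdfs, yarn and mapred process is back at normal priority (the NI column should show 0):

ps -o pid,ni,comm -u hdfs,yarn,mapred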


Sharing a Handy .bashrc Trick

When working in a command-line terminal on Mac or Linux, I often switch back and forth between the same few directories, so after looking at some references I wrote a few small functions in .bashrc to make this easier.

Examples
1. Enter a directory and bookmark it with the mark command
[dong@Dong-MacBookPro ~]$ cd Downloads/Pictures/
[dong@Dong-MacBookPro Pictures]$ mark pictures

2. Run the jump command and pick the bookmark to jump to from the Tab-completion list
[dong@Dong-MacBookPro Pictures]$ cd
[dong@Dong-MacBookPro ~]$ jump
chrome github pictures tmp
[dong@Dong-MacBookPro ~]$ jump pictures
[dong@Dong-MacBookPro Pictures]$ pwd
/Users/dong/Downloads/Pictures

3. Run the unmark command and pick the bookmark to delete from the Tab-completion list
[dong@Dong-MacBookPro Pictures]$ cd
[dong@Dong-MacBookPro ~]$ unmark
chrome github pictures tmp
[dong@Dong-MacBookPro ~]$ unmark pictures
remove /Users/dong/.marks/pictures? y
[dong@Dong-MacBookPro ~]$ jump
chrome github tmp

The code:

# Filesystem Markers & Jump
export MARKPATH=$HOME/.marks
function jump(){
  cd -P "$MARKPATH/$1" 2>/dev/null || echo "No such mark: $1"
}
function mark(){
  mkdir -p "$MARKPATH"; ln -s "$(pwd)" "$MARKPATH/$1"
}
function unmark(){
  rm -i "$MARKPATH/$1"
}

function _marks(){
  COMPREPLY=()
  local cur=${COMP_WORDS[COMP_CWORD]}
  local com=${COMP_WORDS[COMP_CWORD-1]}
  case $com in
    'jump'|'unmark')
      local marks=($(ls "${MARKPATH}"))
      COMPREPLY=($(compgen -W '${marks[@]}' -- $cur))
      ;;
  esac
}
complete -F _marks jump
complete -F _marks unmark
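To make the functions and their completion available in the current shell after adding them to .bashrc:

source ~/.bashrc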


Making Ops Work More Fun with a Chatbot

Ops work can actually be fun too.
Slack has a very rich API, and we are experimenting with slackbot and hubot to trigger part of our ops work, such as deployments, rollbacks and scheduled tasks, through a chatbot.
That way, whenever something needs to be done, you just tell @slackbot "Deploy the app servers with the latest packages." or "Rollback the app servers with the last backup.", and slackbot replies "Yes sir." and later "I've done the deployment, everything is OK."
It looks fancy, but it is really just a new wrapper around the commands and clicks we used to perform by hand. The benefit is that we no longer have to notify everyone by tedious emails, because everyone can see the corresponding messages in the chat tool.
Once we have it working, I will write a proper article to share the details.


Deploying an OpenVPN Server on CentOS 6

References:
https://www.digitalocean.com/community/tutorials/how-to-setup-and-configure-an-openvpn-server-on-centos-6
http://www.unixmen.com/setup-openvpn-server-client-centos-6-5/
http://docs.ucloud.cn/software/vpn/OpenVPN4CentOS.html

Background:
The GFW has recently started blocking VPNs, and the PPTP/L2TP VPN I set up on my VPS has become unstable at times.
So I decided to also set up an OpenVPN server on the VPS, just in case.

Environment:
OS: CentOS 6.4 x86_64 Minimal

1. Install the EPEL repository
# yum install http://dl.fedoraproject.org/pub/epel/6/i386/epel-release-6-8.noarch.rpm

2. Install the required dependencies
# yum install -y openssl openssl-devel lzo lzo-devel pam pam-devel automake pkgconfig

3. Install OpenVPN
# yum install openvpn

4. Download easy-rsa 2.x
# wget https://github.com/OpenVPN/easy-rsa/archive/release/2.x.zip
# unzip 2.x.zip
# cd easy-rsa-release-2.x
# cp -rf easy-rsa /etc/openvpn/

5. Configure the easy-rsa vars file
# cd /etc/openvpn/easy-rsa/2.0/
# ln -s openssl-1.0.0.cnf openssl.cnf
# chmod +x vars

Modify the following settings in the vars file:
# vim vars

...
# Increase this to 2048 if you
# are paranoid.  This will slow
# down TLS negotiation performance
# as well as the one-time DH parms
# generation process.
export KEY_SIZE=1024

...
# These are the default values for fields
# which will be placed in the certificate.
# Don't leave any of these fields blank.
export KEY_COUNTRY="JP"
export KEY_PROVINCE="JP"
export KEY_CITY="Tokyo"
export KEY_ORG="heylinux.com"
export KEY_EMAIL="guosuiyu@gmail.com"
export KEY_OU="MyOrganizationalUnit"
...

Source the vars file to load the environment variables:
# source ./vars

NOTE: If you run ./clean-all, I will be doing a rm -rf on /etc/openvpn/easy-rsa/2.0/keys

6. Generate the certificates
Clear out any old certificates:
# ./clean-all

Generate the server-side CA certificate; since the defaults were set in the vars file, just press Enter at every prompt:
# ./build-ca

Generating a 1024 bit RSA private key
..............................++++++
.....................................++++++
writing new private key to 'ca.key'
-----
You are about to be asked to enter information that will be incorporated
into your certificate request.
What you are about to enter is what is called a Distinguished Name or a DN.
There are quite a few fields but you can leave some blank
For some fields there will be a default value,
If you enter '.', the field will be left blank.
-----
Country Name (2 letter code) [JP]:
State or Province Name (full name) [JP]:
Locality Name (eg, city) [Tokyo]:
Organization Name (eg, company) [heylinux.com]:
Organizational Unit Name (eg, section) [MyOrganizationalUnit]:
Common Name (eg, your name or your server's hostname) [heylinux.com CA]:
Name [EasyRSA]:
Email Address [guosuiyu@gmail.com]:

Generate the server certificate; again press Enter through the prompts, and answer y to the [y/n] questions at the end:
# ./build-key-server heylinux.com

Generating a 1024 bit RSA private key
............++++++
................++++++
writing new private key to 'heylinux.com.key'
-----
You are about to be asked to enter information that will be incorporated
into your certificate request.
What you are about to enter is what is called a Distinguished Name or a DN.
There are quite a few fields but you can leave some blank
For some fields there will be a default value,
If you enter '.', the field will be left blank.
-----
Country Name (2 letter code) [JP]:
State or Province Name (full name) [JP]:
Locality Name (eg, city) [Tokyo]:
Organization Name (eg, company) [heylinux.com]:
Organizational Unit Name (eg, section) [MyOrganizationalUnit]:
Common Name (eg, your name or your server's hostname) [heylinux.com]:
Name [EasyRSA]:
Email Address [guosuiyu@gmail.com]:

Please enter the following 'extra' attributes
to be sent with your certificate request
A challenge password []:
An optional company name []:
Using configuration from /etc/openvpn/easy-rsa/2.0/openssl-1.0.0.cnf
Check that the request matches the signature
Signature ok
The Subject's Distinguished Name is as follows
countryName           :PRINTABLE:'JP'
stateOrProvinceName   :PRINTABLE:'JP'
localityName          :PRINTABLE:'Tokyo'
organizationName      :PRINTABLE:'heylinux.com'
organizationalUnitName:PRINTABLE:'MyOrganizationalUnit'
commonName            :PRINTABLE:'heylinux.com'
name                  :PRINTABLE:'EasyRSA'
emailAddress          :IA5STRING:'guosuiyu@gmail.com'
Certificate is to be certified until Jan 26 09:49:38 2025 GMT (3650 days)
Sign the certificate? [y/n]:y


1 out of 1 certificate requests certified, commit? [y/n]y
Write out database with 1 new entries
Data Base Updated

Generate the DH parameters file:
# ./build-dh

Generating DH parameters, 1024 bit long safe prime, generator 2
This is going to take a long time
................................+.............++*++*++*

Generate the TLS auth key:
# cd keys
# openvpn --genkey --secret ta.key
# cd ..

Generate the client certificates, e.g. for the two users eric and rainbow:
# ./build-key eric

Generating a 1024 bit RSA private key
.++++++
..........................................................................++++++
writing new private key to 'eric.key'
-----
You are about to be asked to enter information that will be incorporated
into your certificate request.
What you are about to enter is what is called a Distinguished Name or a DN.
There are quite a few fields but you can leave some blank
For some fields there will be a default value,
If you enter '.', the field will be left blank.
-----
Country Name (2 letter code) [JP]:
State or Province Name (full name) [JP]:
Locality Name (eg, city) [Tokyo]:
Organization Name (eg, company) [heylinux.com]:nginxs.com
Organizational Unit Name (eg, section) [MyOrganizationalUnit]:
Common Name (eg, your name or your server's hostname) [eric]:
Name [EasyRSA]:
Email Address [guosuiyu@gmail.com]:eric@nginxs.com

Please enter the following 'extra' attributes
to be sent with your certificate request
A challenge password []:
An optional company name []:
Using configuration from /etc/openvpn/easy-rsa/2.0/openssl-1.0.0.cnf
Check that the request matches the signature
Signature ok
The Subject's Distinguished Name is as follows
countryName           :PRINTABLE:'JP'
stateOrProvinceName   :PRINTABLE:'JP'
localityName          :PRINTABLE:'Tokyo'
organizationName      :PRINTABLE:'nginxs.com'
organizationalUnitName:PRINTABLE:'MyOrganizationalUnit'
commonName            :PRINTABLE:'eric'
name                  :PRINTABLE:'EasyRSA'
emailAddress          :IA5STRING:'eric@nginxs.com'
Certificate is to be certified until Jan 26 09:52:03 2025 GMT (3650 days)
Sign the certificate? [y/n]:y


1 out of 1 certificate requests certified, commit? [y/n]y
Write out database with 1 new entries
Data Base Updated

# ./build-key rainbow

Generating a 1024 bit RSA private key
......................++++++
......................++++++
writing new private key to 'rainbow.key'
-----
You are about to be asked to enter information that will be incorporated
into your certificate request.
What you are about to enter is what is called a Distinguished Name or a DN.
There are quite a few fields but you can leave some blank
For some fields there will be a default value,
If you enter '.', the field will be left blank.
-----
Country Name (2 letter code) [JP]:
State or Province Name (full name) [JP]:
Locality Name (eg, city) [Tokyo]:
Organization Name (eg, company) [heylinux.com]:
Organizational Unit Name (eg, section) [MyOrganizationalUnit]:
Common Name (eg, your name or your server's hostname) [rainbow]:
Name [EasyRSA]:
Email Address [guosuiyu@gmail.com]:

Please enter the following 'extra' attributes
to be sent with your certificate request
A challenge password []:
An optional company name []:
Using configuration from /etc/openvpn/easy-rsa/2.0/openssl-1.0.0.cnf
Check that the request matches the signature
Signature ok
The Subject's Distinguished Name is as follows
countryName           :PRINTABLE:'JP'
stateOrProvinceName   :PRINTABLE:'JP'
localityName          :PRINTABLE:'Tokyo'
organizationName      :PRINTABLE:'heylinux.com'
organizationalUnitName:PRINTABLE:'MyOrganizationalUnit'
commonName            :PRINTABLE:'rainbow'
name                  :PRINTABLE:'EasyRSA'
emailAddress          :IA5STRING:'guosuiyu@gmail.com'
Certificate is to be certified until Jan 26 09:52:49 2025 GMT (3650 days)
Sign the certificate? [y/n]:y


1 out of 1 certificate requests certified, commit? [y/n]y
Write out database with 1 new entries
Data Base Updated

7. Edit the server configuration file:
# vim /etc/openvpn/server.conf

port 1194
proto udp
dev tun
ca /etc/openvpn/easy-rsa/2.0/keys/ca.crt
cert /etc/openvpn/easy-rsa/2.0/keys/heylinux.com.crt
key /etc/openvpn/easy-rsa/2.0/keys/heylinux.com.key
dh /etc/openvpn/easy-rsa/2.0/keys/dh1024.pem
server 10.192.170.0 255.255.255.0
ifconfig-pool-persist ipp.txt
push "redirect-gateway def1 bypass-dhcp"
push "dhcp-option DNS 172.31.0.2"
push "dhcp-option DOMAIN-SEARCH ap-northeast-1.compute.internal"
push "dhcp-option DOMAIN-SEARCH ec2.drawbrid.ge"
client-to-client
keepalive 10 120
comp-lzo
user nobody
group nobody
persist-key
persist-tun
status /var/log/openvpn/openvpn-status.log
log         /var/log/openvpn/openvpn.log
log-append  /var/log/openvpn/openvpn.log
verb 3

Notes on the configuration above:
UDP is used instead of TCP, since it behaves better over poor network connections;
the explicit paths of the ca, cert, key and dh files are specified;
the virtual IP range 10.192.170.0/24 is assigned to VPN clients;
ipp.txt is used as the client-to-virtual-IP mapping table, so a reconnecting client gets the same IP again;
the redirect-gateway push option is enabled, so after connecting, clients route all traffic through the server by default;
the dhcp-option push options are enabled, so the default EC2 DNS settings are pushed to clients and their resolver configuration (e.g. /etc/resolv.conf on MacOS) is set up automatically;
client-to-client is enabled so that clients can talk to each other directly;
nobody is used as the user and group to drop OpenVPN's runtime privileges;
TLS authentication is enabled;
lzo compression is enabled;
a dedicated log file is specified.
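Note: the client configuration in step 10 below references ta.key via tls-auth with direction 1; to actually enforce the extra HMAC handshake protection on both ends, the server.conf above would presumably also need the matching directive with direction 0, along these lines:

tls-auth /etc/openvpn/easy-rsa/2.0/keys/ta.key 0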

Create the log directory:
# mkdir -p /var/log/openvpn
# chown openvpn:openvpn /var/log/openvpn

8. Start the OpenVPN service
# /etc/init.d/openvpn start
# chkconfig openvpn on
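A quick sanity check that the daemon is up, listening on UDP 1194 and has created the tun interface:

# netstat -nulp | grep 1194
# ifconfig tun0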

9. Enable NAT forwarding on the server and open the required port
# vim /etc/sysctl.conf

...
net.ipv4.ip_forward = 1
...

# sysctl -p

# iptables -t nat -A POSTROUTING -s 10.192.170.0/24 -o eth0 -j MASQUERADE

# iptables -A INPUT -p udp --dport 1194 -j ACCEPT
# iptables -A INPUT -m state --state ESTABLISHED,RELATED -j ACCEPT

# /etc/init.d/iptables save
# /etc/init.d/iptables restart
# chkconfig iptables on
Note: on a cloud host such as EC2, skip the port-filtering rules above and configure them in the Security Group instead.
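Either way, IP forwarding and the NAT rule can be double-checked with:

# sysctl net.ipv4.ip_forward
# iptables -t nat -L POSTROUTING -n -v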

10. Configure the OpenVPN clients
Copy the certificates generated on the server into one place, e.g. for the user rainbow:
# mkdir -p /home/rainbow/tmp/openvpn_heylinux
# cd /home/rainbow/tmp/openvpn_heylinux
# cp -rpa /etc/openvpn/easy-rsa/2.0/keys/ta.key .
# cp -rpa /etc/openvpn/easy-rsa/2.0/keys/ca.crt .
# cp -rpa /etc/openvpn/easy-rsa/2.0/keys/rainbow.crt .
# cp -rpa /etc/openvpn/easy-rsa/2.0/keys/rainbow.key .

Create the ovpn configuration file for the rainbow user:
# vim rainbow.ovpn

client
dev tun
proto udp
remote 54.238.131.140 1194
resolv-retry infinite
nobind
persist-key
persist-tun
ca ca.crt
cert rainbow.crt
key rainbow.key
remote-cert-tls server
tls-auth ta.key 1
comp-lzo
verb 3

Package the certificates together with the ovpn configuration:
# cd /home/rainbow/tmp
# tar cf openvpn_heylinux.tar openvpn_heylinux

Download the resulting openvpn_heylinux.tar to your local machine.

On Windows, download and install the OpenVPN client:
Download: http://openvpn.net/index.php/download.html
Then put the certificates and rainbow.ovpn into C:/Program Files/OpenVPN/config, double-click the OpenVPN icon on the desktop and select the corresponding profile.

On MacOS, download and install Tunnelblick:
Download: https://code.google.com/p/tunnelblick/
Then extract openvpn_heylinux.tar and rename the directory to heylinux.com.tblk;
finally locate heylinux.com.tblk in Finder and double-click it.

11. Screenshots from a successful connection on MacOS:



12. A simpler alternative: https://www.digitalocean.com/community/tutorials/openvpn-access-server-centos


Hadoop Ops Notes: Abnormally Growing Datanode Last Contact Values Causing Frequent Dead Nodes

We run the CDH distribution of Hadoop in production, but since migrating from CDH3 to CDH5 the bugs just keep coming. :(

CDH5.0.x had no truly severe bugs, but state synchronization between the Namenodes was broken.
Concretely, decommissioning a node had to be done on the Active Namenode; if it was done on the Standby Namenode, the Decommissioning state was not synchronized to the Active Namenode and the node never actually decommissioned.
Even when done on the Active Namenode, the Decommissioned state was not synchronized back to the Standby Namenode.
Upgrading to CDH5.1.0 fixed this, but we did not expect the bugs in later versions to be even worse.

CDH5.1.0 had a severe bug in which snapshot operations corrupted the edits log and crashed the Namenode. After matching it to the corresponding known bug, our only option was to move on again and upgrade to CDH5.2.0.

But CDH5.2.0 introduced yet another bug: the heartbeat between the Namenode and a Datanode can be blocked by running jobs, so even though the Datanode's load is not high, its Last Contact value keeps growing.
The default heartbeat timeout is 630 seconds; once that is exceeded, the Namenode automatically moves the Datanode onto the dead nodes list.
All of our developers and ops people spent a week analyzing and debugging without solving it, and we could not find a known bug that fully matched the symptoms.
In the end, a friend at another company ran into the same problem and fixed it by upgrading to CDH5.3.0.
Having run out of ideas, we upgraded to CDH5.3.0 as well, and it did indeed fix the growing Last Contact heartbeat problem.
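For reference, the 630-second default comes from two HDFS settings: the timeout is 2 x dfs.namenode.heartbeat.recheck-interval + 10 x dfs.heartbeat.interval, which with the stock defaults works out to 2 x 300 s + 10 x 3 s = 630 s. The relevant hdfs-site.xml entries, shown here with their default values purely as a sketch (we did not tune them), are:

<property>
  <name>dfs.namenode.heartbeat.recheck-interval</name>
  <value>300000</value> <!-- milliseconds, i.e. 300 seconds -->
</property>
<property>
  <name>dfs.heartbeat.interval</name>
  <value>3</value> <!-- seconds -->
</property>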

All of which makes me sigh: CDH5 really does have a lot of bugs, some of them very serious, but we are already on this boat, so all we can do is hope for the best.

Documentation for upgrading directly from a release older than CDH5.2.0 to CDH5.3.0:
http://www.cloudera.com/content/cloudera/en/documentation/core/latest/topics/cdh_ig_earlier_cdh5_upgrade.html

Documentation for upgrading from CDH5.2.0 to CDH5.3.0:
http://www.cloudera.com/content/cloudera/en/documentation/core/latest/topics/cdh_ig_cdh5beta1_to_latest_upgrade.html

Documentation for upgrading Hive and Oozie to the CDH5.3.0 versions:
http://www.cloudera.com/content/cloudera/en/documentation/core/latest/topics/cdh_ig_hive_upgrade.html
http://www.cloudera.com/content/cloudera/en/documentation/core/latest/topics/cdh_ig_oozie_upgrade.html


SSH Tunnel in Practice

Reference:
http://blog.creke.net/722.html

Background:
We currently have several data centers, and the network speed between them varies quite a bit, so we normally pick the best one as the VPN entry point.
Sometimes, though, accessing services in another data center directly through the VPN is very slow, and we fall back on an SSH tunnel as a temporary fix.

Scenario:
Accessing idc1-server1 directly is fast, accessing idc2-server2 directly is slow, but idc1-server1 reaches idc2-server2 quickly;
so we use idc1-server1 as a jump host to connect to idc2-server2.

ssh -i /path/to/sshkey -l username -f -N -T -L 8088:idc2-server2:80 idc1-server1
Opening http://localhost:8088 in a browser is now equivalent to opening http://idc2-server2
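A quick way to confirm the tunnel works is to request the forwarded port locally:

curl -I http://localhost:8088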

Key parameters:
-f puts ssh in the background after authentication, -N skips running a remote command, and -T disables pseudo-terminal allocation, so the command does nothing but hold the tunnel open.
-L 8088:idc2-server2:80
forwards a local port to the given port on the given remote machine: ssh listens on a local socket on port 8088, and as soon as a connection arrives on that port, it is forwarded through the secure tunnel (idc1-server1), i.e. localhost:8088 -> (idc1-server1) -> idc2-server2:80

ssh -i /path/to/sshkey -l username -f -N -T -L 2022:idc2-server2:22 idc1-server1
With the second tunnel, scp can relay files through idc1-server1 and upload them to idc2-server2:
scp -i /path/to/sshkey -P 2022 upload_file_name.tgz dong@localhost:/path/to/upload/


Hadoop Ops Notes: Replacing the du Command to Reduce Datanode Disk IO

Background:
Disk IO on quite a few datanodes has been rather high lately, mainly because the number and size of jobs keep growing.
Still, anything that reduces disk IO is worth a try.

For example, we regularly see the hdfs user running "du -sk" commands:
[root@idc1-server2 ~]# ps -ef| grep "du -sk"

hdfs     17119 10336  1 00:57 ?        00:00:04 du -sk /data1/dfs/dn/current/BP-1281416642-10.100.1.2-1407274717062
hdfs     17142 10336  1 00:57 ?        00:00:03 du -sk /data5/dfs/dn/current/BP-1281416642-10.100.1.2-1407274717062
hdfs     17151 10336  1 00:57 ?        00:00:05 du -sk /data6/dfs/dn/current/BP-1281416642-10.100.1.2-1407274717062
...

As the data on a datanode keeps growing, these frequent du runs take longer and longer; even when CPU and disk IO are idle, each run takes around 5 seconds, and when the server is under heavy load it takes much longer.

So we decided to replace the original du command with a small df-based substitute.

[root@idc1-server2 ~]# mv /usr/bin/du /usr/bin/du.orig
[root@idc1-server2 ~]# vim /usr/bin/du

#!/bin/sh
# The datanode always invokes "du -sk <block-pool-dir>", so $1 is "-sk" and $2
# is the directory being measured. Instead of walking the whole directory tree,
# report the used KB of the filesystem the directory lives on.

mydf=$(df -Pk "$2" | grep -vE '^Filesystem|tmpfs|cdrom' | awk '{ print $3 }')
echo -e "$mydf\t$2"

[root@idc1-server2 ~]# chmod +x /usr/bin/du

Doesn't that make the reported numbers inaccurate, though?
It depends on the setup: Hadoop datanodes normally store their data on separate disks with dedicated partitions, so the figures reported by df should differ from a real du only by a small margin.
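As a sanity check, the original and the replacement can be run side by side on one of the block pool directories (using the path from the example above) to see how far apart the numbers actually are:

[root@idc1-server2 ~]# /usr/bin/du.orig -sk /data1/dfs/dn/current/BP-1281416642-10.100.1.2-1407274717062
[root@idc1-server2 ~]# du -sk /data1/dfs/dn/current/BP-1281416642-10.100.1.2-1407274717062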


Sharing an AWS Automation Script

Background:
Our project runs on a large number of AWS EC2 instances, and for automation we had so far been calling the AWS CLI tools from shell scripts.
Recently I wanted a "clone" feature that mimics "Launch More Like This" in the web console to create instances. Implementing it in a shell script felt awkward, so I wrote it directly in Python with the boto library, and I am sharing it here.

The script:
https://github.com/mcsrainbow/python-demos/blob/master/demos/awscli.py

Examples:
$ ./awscli.py -h

usage: awscli.py [-h] (--create | --clone | --terminate) --region REGION
                 [--instance_name INSTANCE_NAME] [--image_id IMAGE_ID]
                 [--instance_type INSTANCE_TYPE] [--key_name KEY_NAME]
                 [--security_group_ids SECURITY_GROUP_IDS]
                 [--subnet_id SUBNET_ID]
                 [--src_instance_name SRC_INSTANCE_NAME]
                 [--dest_instance_name DEST_INSTANCE_NAME]
                 [--private_ip_address PRIVATE_IP_ADDRESS]
                 [--instance_id INSTANCE_ID] [--volume_size VOLUME_SIZE]
                 [--volume_type {standard,io1,gp2}]
                 [--volume_zone VOLUME_ZONE] [--volume_iops VOLUME_IOPS]
                 [--volume_delete_on_termination]
                 [--load_balancer_name LOAD_BALANCER_NAME]
                 [--ignore_load_balancer]
                 [--quick]

examples:
  ./awscli.py --create --region us-west-1 --instance_name idc1-server2 \
              --image_id ami-30f01234 --instance_type t1.micro \
              --key_name idc1-keypair1 --security_group_ids sg-eaf01234f \
              --subnet_id subnet-6d901234
  ./awscli.py --create --region us-west-1 --instance_name idc1-server3 \
              --image_id ami-30f01234 --instance_type t1.micro \
              --key_name idc1-keypair1 --security_group_ids sg-eaf01234f \
              --subnet_id subnet-6d901234 --volume_size 10 --volume_type gp2 \
              --volume_zone us-west-1a --volume_delete_on_termination \
              --load_balancer_name idc1-elb1 --private_ip_address 172.16.2.23
  ./awscli.py --clone --region us-west-1 --src_instance_name idc1-server1 \
              --dest_instance_name idc1-server2
  ./awscli.py --clone --region us-west-1 --src_instance_name idc1-server1 \
              --dest_instance_name idc1-server3 --private_ip_address 172.16.2.23
  ./awscli.py --clone --region us-west-1 --src_instance_name idc1-server1 \
              --dest_instance_name idc1-server3 --private_ip_address 172.16.2.23 \
              --ignore_load_balancer
  ./awscli.py --terminate --region us-west-1 --instance_name idc1-server3
  ./awscli.py --terminate --region us-west-1 --instance_id i-01234abc
  ./awscli.py --terminate --region us-west-1 --instance_id i-01234abc --quick
  ...

optional arguments:
  -h, --help            show this help message and exit
  --create              create instance
  --clone               clone instance
  --terminate           terminate instance
  --region REGION
  --instance_name INSTANCE_NAME
  --image_id IMAGE_ID
  --instance_type INSTANCE_TYPE
  --key_name KEY_NAME
  --security_group_ids SECURITY_GROUP_IDS
  --subnet_id SUBNET_ID
  --src_instance_name SRC_INSTANCE_NAME
  --dest_instance_name DEST_INSTANCE_NAME
  --private_ip_address PRIVATE_IP_ADDRESS
  --instance_id INSTANCE_ID
  --volume_size VOLUME_SIZE
                        in GiB
  --volume_type {standard,io1,gp2}
  --volume_zone VOLUME_ZONE
  --volume_iops VOLUME_IOPS
  --volume_delete_on_termination
                        delete volumes on termination
  --load_balancer_name LOAD_BALANCER_NAME
  --ignore_load_balancer
                        ignore load balancer setting
  --quick               no wait on termination

$ ./awscli.py --create --region us-west-1 --instance_name idc1-server1 --image_id ami-30f01234 \
--instance_type t1.micro --key_name idc1-keypair1 --security_group_ids sg-eaf01234f \
--subnet_id subnet-6d901234 --volume_size 10 --volume_type gp2 --volume_zone us-west-1a \
--volume_delete_on_termination --load_balancer_name idc1-elb1 --private_ip_address 172.16.2.21

1. Launching instance: idc1-server1
2. Creating tag as instance name: {"Name": idc1-server1}
Instance state: pending
Instance state: running
3. Creating secondary volume for instance: idc1-server1 as gp2 10G
Volume status: available
4. Attaching volume: vol-4ba6a54c to instance: idc1-server1 as device: /dev/sdf
5. Adding instance: idc1-server1 to ELB: idc1-elb1

$ ./awscli.py --clone --region us-west-1 --src_instance_name idc1-server1 --dest_instance_name idc1-server2

1. Launching instance: idc1-server2
2. Creating tag as instance name: {"Name": idc1-server2}
Instance state: pending
Instance state: running
3. Creating secondary volume for instance: idc1-server2 as gp2 10G
Volume status: available
4. Attaching volume: vol-5b61635c to instance: idc1-server2 as device: /dev/sdf
5. Adding instance: idc1-server2 to ELB: idc1-elb1

$ ./awscli.py --terminate --region us-west-1 --instance_name idc1-server2

Terminating instance: idc1-server2 id: i-86976d62
Instance state: shutting-down
Instance state: shutting-down
Instance state: terminated


Creating a CentOS 6 AMI Without a Marketplace Code in AWS EC2

References:
https://www.caseylabs.com/remove-the-aws-marketplace-code-from-a-centos-ami/
http://docs.aws.amazon.com/AWSEC2/latest/UserGuide/storage_expand_partition.html

Background:
In AWS EC2 it is very convenient to pick the latest official CentOS 6 minimal AMI from the Marketplace and create an instance from it.
But there is a big trap hidden here: every instance created from a Marketplace AMI carries a Marketplace code.
The code prevents you from growing the root EBS volume in the usual way, i.e. by creating a snapshot of it and a new, larger volume.
Once the current root volume is detached, it can no longer be attached back to the instance, and attaching a new volume fails with the same error:

Client.OperationNotPermitted:
'vol-xxxxxxx' with Marketplace codes may not be attached as a secondary device.

The Marketplace code, as the name suggests, is presumably meant to keep paid AMIs from being cloned at will, but for some reason the $0 CentOS 6 AMI gets no special treatment.
In practice, the restriction means that a freshly created CentOS 6 instance gets only an 8G EBS root volume; even if you request a 50G EBS volume at creation time, the root partition still ends up with just 8G. That is not enough for production, so it has to be grown, and the Marketplace code makes growing it painful.
Fortunately, by following the two articles above I managed to remove the Marketplace code from the official CentOS 6 AMI, grow the root partition and create the corresponding AMIs.

Steps:
1. Remove the Marketplace code from the existing CentOS 6 AMI
1.1 Find the CentOS 6 AMI in the AWS Marketplace and launch an instance with the default 8G root EBS volume;
1.2 In the AWS EC2 web console, create another new 8G EBS volume;
1.3 Attach the new EBS volume to the instance; it is usually detected as /dev/xvdj (HVM AMIs detect it as /dev/xvdf);
1.4 SSH into the instance and clone the root EBS volume with dd (HVM AMIs detect the root EBS volume as /dev/xvda):

dd bs=65536 if=/dev/xvde of=/dev/xvdj

1.5 When the clone has finished, stop the instance;
1.6 Detach the current root EBS volume;
1.7 Detach the newly created EBS volume and re-attach it to the instance as /dev/sda (HVM AMIs require /dev/sda1);
1.8 Start the instance;
1.9 Once the instance boots normally, right-click it in the EC2 web console and choose Create Image to create a new CentOS 6 AMI without the Marketplace code; I usually name it official_centos6_x86_64_minimal_ebs8g.

2. Grow the root EBS volume to 50G and create the new AMI official_centos6_x86_64_minimal_ebs50g
2.1 Launch an instance from the AMI official_centos6_x86_64_minimal_ebs8g;
2.2 Create a snapshot of the instance's EBS volume;
2.3 Create a new 50G volume from that snapshot;
2.4 Attach the new volume to the instance as a second EBS volume; it is usually detected as /dev/xvdj (HVM AMIs detect it as /dev/xvdf);
2.5 Grow the second EBS volume on the instance as follows (HVM AMIs detect the root EBS volume as /dev/xvda):

[root@ip-172-17-4-12 ~]# parted /dev/xvdj
GNU Parted 2.1
Using /dev/xvdj
Welcome to GNU Parted! Type 'help' to view a list of commands.
(parted) unit s
(parted) print
Model: Xen Virtual Block Device (xvd)
Disk /dev/xvdj: 104857600s
Sector size (logical/physical): 512B/512B
Partition Table: msdos

Number  Start  End        Size       Type     File system  Flags
 1      2048s  16777215s  16775168s  primary  ext4         boot

(parted) rm 1
(parted) mkpart primary 2048s 100%
(parted) print
Model: Xen Virtual Block Device (xvd)
Disk /dev/xvdj: 104857600s
Sector size (logical/physical): 512B/512B
Partition Table: msdos

Number  Start  End         Size        Type     File system  Flags
 1      2048s  104857599s  104855552s  primary  ext4

(parted) set 1 boot on
(parted) print
Model: Xen Virtual Block Device (xvd)
Disk /dev/xvdj: 104857600s
Sector size (logical/physical): 512B/512B
Partition Table: msdos

Number  Start  End         Size        Type     File system  Flags
1      2048s  104857599s  104855552s  primary  ext4         boot

(parted) quit
Information: You may need to update /etc/fstab.

[root@ip-172-17-4-12 ~]# e2fsck -f /dev/xvdj1
e2fsck 1.41.12 (17-May-2010)
Superblock needs_recovery flag is clear, but journal has data.
Run journal anyway<y>? yes

/dev/xvdj1: recovering journal
Pass 1: Checking inodes, blocks, and sizes
Pass 2: Checking directory structure
Pass 3: Checking directory connectivity
Pass 4: Checking reference counts
Pass 5: Checking group summary information

/dev/xvdj1: ***** FILE SYSTEM WAS MODIFIED *****
/dev/xvdj1: 18425/524288 files (0.2% non-contiguous), 243772/2096896 blocks

[root@ip-172-17-4-12 ~]# lsblk
NAME    MAJ:MIN RM SIZE RO TYPE MOUNTPOINT
xvde    202:0    0   8G  0 disk
└─xvde1 202:1    0   8G  0 part /
xvdj    202:80   0  50G  0 disk
└─xvdj1 202:81   0  50G  0 part

[root@ip-172-17-4-12 ~]# resize2fs /dev/xvdj1
resize2fs 1.41.12 (17-May-2010)
Resizing the filesystem on /dev/xvdj1 to 13106944 (4k) blocks.
The filesystem on /dev/xvdj1 is now 13106944 blocks long.
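Before shutting the instance down, it can be worth mounting the resized partition once to confirm that the filesystem now reports roughly 50G:

[root@ip-172-17-4-12 ~]# mount /dev/xvdj1 /mnt
[root@ip-172-17-4-12 ~]# df -h /mnt
[root@ip-172-17-4-12 ~]# umount /mnt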

2.6 Stop the instance;
2.7 Detach the current root EBS volume;
2.8 Detach the grown second EBS volume and re-attach it to the instance as /dev/sda (HVM AMIs require /dev/sda1);
2.9 Start the instance;
2.10 Once the instance boots normally, right-click it in the EC2 web console and choose Create Image to create a new CentOS 6 AMI with a 50G root partition; I usually name it official_centos6_x86_64_minimal_ebs50g.


Hadoop Ops Notes: libhadoop.so Created by Snappy Causing Datanode Errors

To fix the bug mentioned in the previous post, we upgraded our production CDH5 to the then-latest CDH5.2.0, but after the upgrade the datanode on some of the servers failed to start with the following error:

2014-11-20 19:54:52,071 WARN org.apache.hadoop.hdfs.server.datanode.DataNode: Unexpected exception in block pool Block pool <registering> (Datanode Uuid unassigned) service to idc1-server1/10.100.1.100:8020
com.google.common.util.concurrent.ExecutionError: java.lang.UnsatisfiedLinkError: org.apache.hadoop.io.nativeio.NativeIO.link0(Ljava/lang/String;Ljava/lang/String;)V
	at com.google.common.util.concurrent.Futures.wrapAndThrowExceptionOrError(Futures.java:1126)
	at com.google.common.util.concurrent.Futures.get(Futures.java:1048)
	at org.apache.hadoop.hdfs.server.datanode.DataStorage.linkBlocks(DataStorage.java:870)
	at org.apache.hadoop.hdfs.server.datanode.BlockPoolSliceStorage.linkAllBlocks(BlockPoolSliceStorage.java:570)
	at org.apache.hadoop.hdfs.server.datanode.BlockPoolSliceStorage.doUpgrade(BlockPoolSliceStorage.java:379)
	at org.apache.hadoop.hdfs.server.datanode.BlockPoolSliceStorage.doTransition(BlockPoolSliceStorage.java:313)
	at org.apache.hadoop.hdfs.server.datanode.BlockPoolSliceStorage.recoverTransitionRead(BlockPoolSliceStorage.java:187)
	at org.apache.hadoop.hdfs.server.datanode.DataStorage.recoverTransitionRead(DataStorage.java:309)
	at org.apache.hadoop.hdfs.server.datanode.DataNode.initStorage(DataNode.java:1109)
	at org.apache.hadoop.hdfs.server.datanode.DataNode.initBlockPool(DataNode.java:1080)
	at org.apache.hadoop.hdfs.server.datanode.BPOfferService.verifyAndSetNamespaceInfo(BPOfferService.java:320)
	at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.connectToNNAndHandshake(BPServiceActor.java:220)
	at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.run(BPServiceActor.java:824)
	at java.lang.Thread.run(Thread.java:744)
Caused by: java.lang.UnsatisfiedLinkError: org.apache.hadoop.io.nativeio.NativeIO.link0(Ljava/lang/String;Ljava/lang/String;)V
	at org.apache.hadoop.io.nativeio.NativeIO.link0(Native Method)
	at org.apache.hadoop.io.nativeio.NativeIO.link(NativeIO.java:838)
	at org.apache.hadoop.hdfs.server.datanode.DataStorage$2.call(DataStorage.java:862)
	at org.apache.hadoop.hdfs.server.datanode.DataStorage$2.call(DataStorage.java:855)
	at java.util.concurrent.FutureTask.run(FutureTask.java:262)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
	... 1 more
2014-11-20 19:54:52,073 WARN org.apache.hadoop.hdfs.server.datanode.DataNode: Ending block pool service for: Block pool <registering> (Datanode Uuid unassigned) service to idc1-server1/10.100.1.100:8020

Searching Google turned up nothing that matched; the only remotely related hits were about missing libraries on Windows.
In our environment only some of the servers had the problem, and comparing all the Hadoop-related packages on them revealed no differences at all, which made the analysis very confusing.

Finally, we tried tracing the datanode process with strace:
yum install strace
strace -f -F -o /tmp/strace.output.txt /etc/init.d/hadoop-hdfs-datanode start
lsof | grep libhadoop.so

java 18527 hdfs mem REG 253,0 122832 270200 /usr/java/jdk1.7.0_45/jre/lib/amd64/libhadoop.so

It turned out to be loading the library /usr/java/jdk1.7.0_45/jre/lib/amd64/libhadoop.so, whereas the datanode process on the healthy servers loads /usr/lib/hadoop/lib/native/libhadoop.so.
Further checking showed that /usr/java/jdk1.7.0_45/jre/lib/amd64/libhadoop.so had been created when the Snappy package was installed; after moving it out of the way, the datanode finally started normally.
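The "moving it out of the way" step was simply renaming the stray library so that it no longer shadows the real one, along these lines:

# mv /usr/java/jdk1.7.0_45/jre/lib/amd64/libhadoop.so /usr/java/jdk1.7.0_45/jre/lib/amd64/libhadoop.so.bak
# /etc/init.d/hadoop-hdfs-datanode start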

So even though the datanode is started with -Djava.library.path=/usr/lib/hadoop/lib/native, libraries inside the JRE's own lib directory apparently take precedence when being loaded.
