
Deploying a Shadowsocks Server on CentOS 6

References:
http://www.shadowsocks.org

Background:
Compared with running a VPN, setting up a Shadowsocks service and using it through a browser proxy is considerably more convenient.
The principle is similar to an SSH tunnel: the Shadowsocks server establishes an encrypted tunnel with a dedicated Shadowsocks client, and the client listens on a local port (1080 by default); all traffic passing through that local port goes over the encrypted tunnel.

Environment:
OS: CentOS 6.4 x86_64 Minimal

1. Install the Shadowsocks server
# pip install shadowsocks

2. Configure /etc/shadowsocks.json
# vim /etc/shadowsocks.json

{
  "server": "0.0.0.0",
  "server_port": 443,
  "local_address": "127.0.0.1",
  "local_port": 1080,
  "password": "shadowsockspass",
  "timeout": 600,
  "method": "aes-256-cfb",
  "fast_open": false,
  "workers": 1
}

Notes on the configuration above:
"server": "0.0.0.0" makes the server listen on all addresses,
"server_port": 443 sets the server's listening port to 443,
"local_address": "127.0.0.1" sets the client's local listening address,
"local_port": 1080 sets the client's local listening port,
"password": "shadowsockspass" sets the shared password,
"timeout": 600 sets the connection timeout to 600 seconds,
"method": "aes-256-cfb" sets the encryption method,
"fast_open": false leaves TCP Fast Open disabled (the default),
"workers": 1 sets the number of worker processes to 1.
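
ssserver exits with a terse error when the JSON is malformed, so it can be worth validating the file before starting the service. A minimal sketch (the interpreter lookup and the demo config written to a temp file are assumptions; any JSON validator would do):

```shell
# Sanity-check a shadowsocks-style JSON config before (re)starting the service.
# The interpreter lookup is an assumption; any JSON validator works.
PY=$(command -v python3 || command -v python)

validate_json() {
  "$PY" -c 'import json,sys; json.load(open(sys.argv[1]))' "$1" 2>/dev/null
}

# Demo config in a temp file; point validate_json at /etc/shadowsocks.json instead.
cfg=$(mktemp)
cat > "$cfg" <<'EOF'
{
  "server": "0.0.0.0",
  "server_port": 443,
  "password": "shadowsockspass",
  "timeout": 600,
  "method": "aes-256-cfb"
}
EOF

validate_json "$cfg" && echo "config OK"
```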

3. Configure /etc/sysctl.conf by appending the following:
# vim /etc/sysctl.conf

# For shadowsocks
fs.file-max = 65535
net.core.rmem_max = 67108864
net.core.wmem_max = 67108864
net.ipv4.tcp_fin_timeout = 30
net.ipv4.tcp_keepalive_time = 1200
net.ipv4.tcp_max_syn_backlog = 8192
net.ipv4.tcp_max_tw_buckets = 5120
net.ipv4.tcp_mem = 25600 51200 102400
net.ipv4.tcp_rmem = 4096 87380 67108864
net.ipv4.tcp_wmem = 4096 65536 67108864
net.ipv4.tcp_mtu_probing = 1
net.ipv4.tcp_congestion_control = hybla

# sysctl -p

4. Start the Shadowsocks service
# ssserver -c /etc/shadowsocks.json -d start

# netstat -lntp | grep 443

tcp      0      0      0.0.0.0:443      0.0.0.0:*      LISTEN      11037/python

5. Download a Shadowsocks client
Windows: https://github.com/shadowsocks/shadowsocks-csharp/releases/download/2.5.6/Shadowsocks-win-2.5.6.zip
Mac OS X: https://github.com/shadowsocks/shadowsocks-iOS/releases/download/2.6.3/ShadowsocksX-2.6.3.dmg

6. Configure the client
Create a server connection and enter:
the server address, e.g. heylinux.com
port: 443
encryption method: aes-256-cfb
password: shadowsockspass

Start the client and keep it running. Leave the default Auto Proxy Mode selected and run Update PAC from GFWList once, as shown in the screenshot:
[screenshot: shadowsocks_client]

7. Configure a browser extension
Install the Proxy SwitchySharp extension: https://chrome.google.com/webstore/detail/dpplabbmogkhghncfbfdeeokoefdjegm

Configure the extension as shown in the screenshot:
[screenshot: proxy_switchsharp]

Enable the proxy profile you just configured: shadowsocks

8. Done


Deploying a PPTP VPN Server on CentOS 6

References:
https://www.digitalocean.com/community/tutorials/how-to-setup-your-own-vpn-with-pptp

Background:
Setting up a PPTP VPN server should be very easy, yet quite a few friends still came to me for help after following other articles, having taken plenty of wrong turns.
So I felt it was worth writing this up myself. My habit is to document as I operate, so the steps below can be followed one by one to completion, which is what readers like.

Environment:
OS: CentOS 6.4 x86_64 Minimal

1. Install the EPEL repository
# yum install http://dl.fedoraproject.org/pub/epel/6/i386/epel-release-6-8.noarch.rpm

2. Install the PPTP repository
# yum install http://poptop.sourceforge.net/yum/stable/rhel6/pptp-release-current.noarch.rpm

3. Install the PPTP VPN server
# yum install pptpd

4. Edit /etc/pptpd.conf
# vim /etc/pptpd.conf

###############################################################################
# $Id: pptpd.conf,v 1.11 2011/05/19 00:02:50 quozl Exp $
#
# Sample Poptop configuration file /etc/pptpd.conf
#
# Changes are effective when pptpd is restarted.
###############################################################################

# TAG: ppp
#	Path to the pppd program, default '/usr/sbin/pppd' on Linux
#
#ppp /usr/sbin/pppd

# TAG: option
#	Specifies the location of the PPP options file.
#	By default PPP looks in '/etc/ppp/options'
#
option /etc/ppp/options.pptpd

# TAG: debug
#	Turns on (more) debugging to syslog
#
debug

# TAG: stimeout
#	Specifies timeout (in seconds) on starting ctrl connection
#
stimeout 120

# TAG: noipparam
#       Suppress the passing of the client's IP address to PPP, which is
#       done by default otherwise.
#
#noipparam

# TAG: logwtmp
#	Use wtmp(5) to record client connections and disconnections.
#
#logwtmp

# TAG: vrf <vrfname>
#	Switches PPTP & GRE sockets to the specified VRF, which must exist
#	Only available if VRF support was compiled into pptpd.
#
#vrf test

# TAG: bcrelay <if>
#	Turns on broadcast relay to clients from interface <if>
#
#bcrelay eth1

# TAG: delegate
#	Delegates the allocation of client IP addresses to pppd.
#
#       Without this option, which is the default, pptpd manages the list of
#       IP addresses for clients and passes the next free address to pppd.
#       With this option, pptpd does not pass an address, and so pppd may use
#       radius or chap-secrets to allocate an address.
#
#delegate

# TAG: connections
#       Limits the number of client connections that may be accepted.
#
#       If pptpd is allocating IP addresses (e.g. delegate is not
#       used) then the number of connections is also limited by the
#       remoteip option.  The default is 100.
#connections 100

# TAG: localip
# TAG: remoteip
#	Specifies the local and remote IP address ranges.
#
#	These options are ignored if delegate option is set.
#
#       Any addresses work as long as the local machine takes care of the
#       routing.  But if you want to use MS-Windows networking, you should
#       use IP addresses out of the LAN address space and use the proxyarp
#       option in the pppd options file, or run bcrelay.
#
#	You can specify single IP addresses separated by commas or you can
#	specify ranges, or both. For example:
#
#		192.168.0.234,192.168.0.245-249,192.168.0.254
#
#	IMPORTANT RESTRICTIONS:
#
#	1. No spaces are permitted between commas or within addresses.
#
#	2. If you give more IP addresses than the value of connections,
#	   it will start at the beginning of the list and go until it
#	   gets connections IPs.  Others will be ignored.
#
#	3. No shortcuts in ranges! ie. 234-8 does not mean 234 to 238,
#	   you must type 234-238 if you mean this.
#
#	4. If you give a single localIP, that's ok - all local IPs will
#	   be set to the given one. You MUST still give at least one remote
#	   IP for each simultaneous client.
#
# (Recommended)
#localip 192.168.0.1
#remoteip 192.168.0.234-238,192.168.0.245
# or
#localip 192.168.0.234-238,192.168.0.245
#remoteip 192.168.1.234-238,192.168.1.245
localip 10.192.168.1
remoteip 10.192.168.100-200

Notes on the configuration above:
option /etc/ppp/options.pptpd points to the PPP options file,
debug enables debug logging,
stimeout 120 sets the timeout for establishing the control connection to 120 seconds,
localip 10.192.168.1 is the PPTP VPN server's local address, i.e. the gateway address clients will receive,
remoteip 10.192.168.100-200 is the address range assigned to clients.

5. Edit /etc/ppp/options.pptpd

###############################################################################
# $Id: options.pptpd,v 1.11 2005/12/29 01:21:09 quozl Exp $
#
# Sample Poptop PPP options file /etc/ppp/options.pptpd
# Options used by PPP when a connection arrives from a client.
# This file is pointed to by /etc/pptpd.conf option keyword.
# Changes are effective on the next connection.  See "man pppd".
#
# You are expected to change this file to suit your system.  As
# packaged, it requires PPP 2.4.2 and the kernel MPPE module.
###############################################################################


# Authentication

# Name of the local system for authentication purposes
# (must match the second field in /etc/ppp/chap-secrets entries)
name ec2-tokyo

# Strip the domain prefix from the username before authentication.
# (applies if you use pppd with chapms-strip-domain patch)
#chapms-strip-domain


# Encryption
# (There have been multiple versions of PPP with encryption support,
# choose with of the following sections you will use.)


# BSD licensed ppp-2.4.2 upstream with MPPE only, kernel module ppp_mppe.o
# {{{
refuse-pap
refuse-chap
refuse-mschap
# Require the peer to authenticate itself using MS-CHAPv2 [Microsoft
# Challenge Handshake Authentication Protocol, Version 2] authentication.
require-mschap-v2
# Require MPPE 128-bit encryption
# (note that MPPE requires the use of MSCHAP-V2 during authentication)
require-mppe-128
# }}}


# OpenSSL licensed ppp-2.4.1 fork with MPPE only, kernel module mppe.o
# {{{
#-chap
#-chapms
# Require the peer to authenticate itself using MS-CHAPv2 [Microsoft
# Challenge Handshake Authentication Protocol, Version 2] authentication.
#+chapms-v2
# Require MPPE encryption
# (note that MPPE requires the use of MSCHAP-V2 during authentication)
#mppe-40	# enable either 40-bit or 128-bit, not both
#mppe-128
#mppe-stateless
# }}}


# Network and Routing

# If pppd is acting as a server for Microsoft Windows clients, this
# option allows pppd to supply one or two DNS (Domain Name Server)
# addresses to the clients.  The first instance of this option
# specifies the primary DNS address; the second instance (if given)
# specifies the secondary DNS address.
#ms-dns 10.0.0.1
#ms-dns 10.0.0.2
ms-dns 172.31.0.2

# If pppd is acting as a server for Microsoft Windows or "Samba"
# clients, this option allows pppd to supply one or two WINS (Windows
# Internet Name Services) server addresses to the clients.  The first
# instance of this option specifies the primary WINS address; the
# second instance (if given) specifies the secondary WINS address.
#ms-wins 10.0.0.3
#ms-wins 10.0.0.4

# Add an entry to this system's ARP [Address Resolution Protocol]
# table with the IP address of the peer and the Ethernet address of this
# system.  This will have the effect of making the peer appear to other
# systems to be on the local ethernet.
# (you do not need this if your PPTP server is responsible for routing
# packets to the clients -- James Cameron)
proxyarp

# Normally pptpd passes the IP address to pppd, but if pptpd has been
# given the delegate option in pptpd.conf or the --delegate command line
# option, then pppd will use chap-secrets or radius to allocate the
# client IP address.  The default local IP address used at the server
# end is often the same as the address of the server.  To override this,
# specify the local IP address here.
# (you must not use this unless you have used the delegate option)
#10.8.0.100


# Logging

# Enable connection debugging facilities.
# (see your syslog configuration for where pppd sends to)
debug

# Print out all the option values which have been set.
# (often requested by mailing list to verify options)
dump


# Miscellaneous

# Create a UUCP-style lock file for the pseudo-tty to ensure exclusive
# access.
lock

# Disable BSD-Compress compression
nobsdcomp

# Disable Van Jacobson compression
# (needed on some networks with Windows 9x/ME/XP clients, see posting to
# poptop-server on 14th April 2005 by Pawel Pokrywka and followups,
# http://marc.theaimsgroup.com/?t=111343175400006&r=1&w=2 )
novj
novjccomp

# turn off logging to stderr, since this may be redirected to pptpd,
# which may trigger a loopback
nologfd

# put plugins here
# (putting them higher up may cause them to sent messages to the pty)

logfile /var/log/pptpd.log
multilink

Notes on the configuration above:
name ec2-tokyo sets the PPTP VPN server's name for authentication,
the encryption rules are:
refuse-pap
refuse-chap
refuse-mschap
require-mschap-v2
require-mppe-128
ms-dns 172.31.0.2 is the DNS address pushed to clients (I usually use the default DNS of the server hosting the PPTP VPN),
proxyarp makes VPN clients visible to other hosts on the server's LAN,
debug enables debug output,
some common settings are enabled:
dump
lock
nobsdcomp
novj
novjccomp
nologfd
logfile /var/log/pptpd.log sets the log file location,
multilink allows bundling multiple physical channels into a single logical channel.

6. Edit the account/password file /etc/ppp/chap-secrets
# vim /etc/ppp/chap-secrets

# Secrets for authentication using CHAP
# client	server	secret			IP addresses
"username"  *       "password"        *
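
Accounts can of course be appended by hand, but a small helper keeps the column layout consistent. A sketch that works on a temp file (point it at /etc/ppp/chap-secrets on a real server; the helper name is made up):

```shell
# Append a PPTP account in chap-secrets format: client, server, secret, IPs.
add_vpn_user() {
  local file=$1 user=$2 pass=$3
  printf '"%s"\t*\t"%s"\t*\n' "$user" "$pass" >> "$file"
}

# Demo against a throwaway file instead of /etc/ppp/chap-secrets:
secrets=$(mktemp)
add_vpn_user "$secrets" alice s3cret
cat "$secrets"
```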

7. Edit /etc/sysconfig/iptables-config
Change IPTABLES_MODULES="" to IPTABLES_MODULES="ip_nat_pptp" so that the module is loaded automatically when the iptables service starts.

8. Edit /etc/sysconfig/iptables (eth0 is assumed to be the interface holding the public IP address)
# vim /etc/sysconfig/iptables

*filter
:INPUT ACCEPT [0:0]
:FORWARD ACCEPT [0:0]
:OUTPUT ACCEPT [0:0]
-A INPUT -m state --state ESTABLISHED,RELATED -j ACCEPT
-A INPUT -p icmp -j ACCEPT
-A INPUT -i lo -j ACCEPT
-A INPUT -p gre -j ACCEPT
-A INPUT -m state --state NEW -m tcp -p tcp --dport 22 -j ACCEPT
-A INPUT -m state --state NEW -m tcp -p tcp --dport 1723 -j ACCEPT
-A INPUT -s 10.192.168.0/255.255.255.0 -m state --state NEW -m tcp -p tcp -j ACCEPT
-A INPUT -j REJECT --reject-with icmp-host-prohibited
-A FORWARD -j REJECT --reject-with icmp-host-prohibited
COMMIT
*nat
:PREROUTING ACCEPT [0:0]
:POSTROUTING ACCEPT [0:0]
:OUTPUT ACCEPT [0:0]
-A POSTROUTING -s 10.192.168.0/255.255.255.0 -o eth0 -j MASQUERADE
COMMIT

Notes on the iptables rules above:
accept all GRE packets;
accept TCP port 1723;
accept the entire PPTP VPN subnet 10.192.168.0/24;
NAT the PPTP VPN subnet 10.192.168.0/24 out through eth0 so clients can share the server's connectivity.

9. Enable packet forwarding: edit /etc/sysctl.conf
Change net.ipv4.ip_forward = 0 to net.ipv4.ip_forward = 1
Then run sysctl -p
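
The same change can be made non-interactively, which is handy in provisioning scripts. A sketch that edits a copy (the temp file stands in for /etc/sysctl.conf):

```shell
# Flip net.ipv4.ip_forward from 0 to 1 in a sysctl.conf-style file.
enable_ip_forward() {
  sed -i 's/^net\.ipv4\.ip_forward *= *0/net.ipv4.ip_forward = 1/' "$1"
}

# Demo against a throwaway copy; on a real server pass /etc/sysctl.conf
# and then run `sysctl -p`.
conf=$(mktemp)
echo 'net.ipv4.ip_forward = 0' > "$conf"
enable_ip_forward "$conf"
cat "$conf"
```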

10. Start the PPTP VPN server
# /etc/init.d/pptpd restart
# /etc/init.d/iptables restart

11. Enable the pptpd and iptables services at boot
# chkconfig pptpd on
# chkconfig iptables on

12. Configure a client on your local PC and connect to the PPTP VPN server

13. Done


A simple web app I built with Flask and Bootstrap

Without further ado, the code: https://github.com/mcsrainbow/devopsdemo


Backing up and restoring MySQL with innobackupex

References:
https://www.percona.com/doc/percona-xtrabackup/2.2/innobackupex/privileges.html
https://www.percona.com/doc/percona-xtrabackup/2.1/innobackupex/streaming_backups_innobackupex.html
https://www.percona.com/doc/percona-xtrabackup/2.1/howtos/recipes_ibkx_compressed.html
https://www.percona.com/doc/percona-xtrabackup/2.1/innobackupex/incremental_backups_innobackupex.html

Background:
In some technical chat groups I still see operators backing up MySQL with commands like mysqldump, so it seemed worth introducing innobackupex.
These days the vast majority of MySQL deployments use a Master-Slave architecture. Compared with mysqldump, backing up with innobackupex has these advantages:
1. It backs up the data files themselves, at the file level, which is fast and especially suitable when all data must be backed up;
2. It is a hot backup and does not disrupt existing database traffic;
3. It records the binlog and replication coordinates, which is very useful when creating or restoring a slave;
4. It supports parallel compression of the backup as it is taken, saving substantial disk space.

In our current production environment the database is about 500G uncompressed and about 90G compressed.
In Feng's environment the database already exceeds 1T. A few additions from Feng:
1. innobackupex can back up online without stopping the business, but only for the InnoDB engine; MyISAM tables are still locked;
2. the backup causes very high IO, so run it on a slave (ideally one dedicated to backups) rather than on the master;
3. innobackupex can combine incremental and full backups.

Also, after studying ITIL, Yang added: for the most recent full backup, besides keeping an off-site copy, try to also keep an uncompressed, unpacked copy on the database server itself; it saves a great deal of time during disaster recovery.

Usage examples:
Environment

Architecture: Master-Slave
Servers: idc1-master1, idc1-slave1
MySQL port: 3308
Config file: /etc/my_3308.cnf
Backup directory: /mysql-backup/3308
MySQL data directory: /opt/mysql_3308/data
Init script: /etc/init.d/mysql_3308

1. Install xtrabackup on both the master and the slave:
Note: if your MySQL is not the Percona build, or is older than 5.5, install percona-xtrabackup-20 instead and skip all compression-related steps and options below.

# yum install http://www.percona.com/downloads/percona-release/redhat/0.1-3/percona-release-0.1-3.noarch.rpm
# yum install -y percona-xtrabackup qpress
# yum install -y percona-xtrabackup-20 # for non-Percona MySQL, or MySQL older than 5.5

2. Create a backup user backup-user on both the master and the slave:

mysql> CREATE USER 'backup-user'@'localhost' IDENTIFIED BY 'backup-pass';
mysql> GRANT SUPER, RELOAD, LOCK TABLES, REPLICATION CLIENT ON *.* TO 'backup-user'@'localhost';
mysql> FLUSH PRIVILEGES;
mysql> EXIT;

Note: for non-Percona MySQL, or versions below 5.5 (e.g. 5.1), the SELECT privilege must be granted as well; otherwise backing up the mysql system database fails with insufficient privileges.

mysql> GRANT SELECT, SUPER, RELOAD, LOCK TABLES, REPLICATION CLIENT ON *.* TO 'backup-user'@'localhost';

3. Back up on the master
Plain (uncompressed)

[root@idc1-master1 ~]# innobackupex --defaults-file=/etc/my_3308.cnf --user=backup-user --password=backup-pass --use-memory=4G /mysql-backup/3308

[root@idc1-master1 ~]# ls -rt1 /mysql-backup/3308/ | tail -n 1
2015-10-26_03-00-10

Compressed and streamed

[root@idc1-master1 ~]# innobackupex --defaults-file=/etc/my_3308.cnf --user=backup-user --password=backup-pass --use-memory=4G --compress --compress-threads=8 --stream=xbstream --parallel=4 /mysql-backup/3308 > /mysql-backup/3308/$(date +%Y-%m-%d_%H-%M-%S).xbstream

[root@idc1-master1 ~]# ls -rt1 /mysql-backup/3308/ | tail -n 1
2015-10-26_03-05-05.xbstream
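
If this runs from cron every night, the backup directory grows without bound. One way to prune it is to keep only the N most recent entries; a sketch (the directory layout follows this article, the keep count is an assumption), demonstrated against a throwaway directory:

```shell
# Keep only the N most recent backups (files or directories) in a directory.
prune_backups() {
  local dir=$1 keep=$2
  # Newest first; everything past the first $keep entries gets removed.
  ls -1t "$dir" | tail -n +$((keep + 1)) | while read -r old; do
    rm -rf "${dir:?}/$old"
  done
}

# Demo with fake backup directories instead of /mysql-backup/3308:
d=$(mktemp -d)
for ts in 2015-10-24 2015-10-25 2015-10-26; do mkdir "$d/$ts"; sleep 1; done
prune_backups "$d" 2
ls "$d"
```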

4. Back up on the slave
Plain (uncompressed)

[root@idc1-slave1 ~]# innobackupex --defaults-file=/etc/my_3308.cnf --user=backup-user --password=backup-pass --use-memory=4G --slave-info --safe-slave-backup /mysql-backup/3308

[root@idc1-slave1 ~]# ls -rt1 /mysql-backup/3308/ | tail -n 1
2015-10-26_03-11-03

Compressed and streamed

[root@idc1-slave1 ~]# innobackupex --defaults-file=/etc/my_3308.cnf --user=backup-user --password=backup-pass --use-memory=4G --slave-info --safe-slave-backup --compress --compress-threads=8 --stream=xbstream --parallel=4 /mysql-backup/3308 > /mysql-backup/3308/$(date +%Y-%m-%d_%H-%M-%S).xbstream

[root@idc1-slave1 ~]# ls -rt1 /mysql-backup/3308/ | tail -n 1
2015-10-26_03-15-03.xbstream

5. Restore on the master

[root@idc1-master1 ~]# /etc/init.d/mysql_3308 stop

[root@idc1-master1 ~]# mv /opt/mysql_3308/data /opt/mysql_3308/data_broken
[root@idc1-master1 ~]# mkdir /opt/mysql_3308/data

# Plain (uncompressed)
[root@idc1-master1 ~]# innobackupex --apply-log --use-memory=4G /mysql-backup/3308/2015-10-26_03-00-10
[root@idc1-master1 ~]# innobackupex --defaults-file=/etc/my_3308.cnf --copy-back --use-memory=4G /mysql-backup/3308/2015-10-26_03-00-10 

# Compressed and streamed
[root@idc1-master1 ~]# mkdir -p /mysql-backup/3308/2015-10-26_03-05-05
[root@idc1-master1 ~]# xbstream -x < /mysql-backup/3308/2015-10-26_03-05-05.xbstream -C /mysql-backup/3308/2015-10-26_03-05-05
[root@idc1-master1 ~]# innobackupex --decompress --parallel=4 /mysql-backup/3308/2015-10-26_03-05-05
[root@idc1-master1 ~]# find /mysql-backup/3308/2015-10-26_03-05-05 -name "*.qp" -delete
[root@idc1-master1 ~]# innobackupex --apply-log --use-memory=4G /mysql-backup/3308/2015-10-26_03-05-05
[root@idc1-master1 ~]# innobackupex --defaults-file=/etc/my_3308.cnf --copy-back --use-memory=4G /mysql-backup/3308/2015-10-26_03-05-05 

[root@idc1-master1 ~]# chown -R mysql:mysql /opt/mysql_3308/data

[root@idc1-master1 ~]# /etc/init.d/mysql_3308 start

6. Restore on the slave

[root@idc1-slave1 ~]# /etc/init.d/mysql_3308 stop

[root@idc1-slave1 ~]# mv /opt/mysql_3308/data /opt/mysql_3308/data_broken
[root@idc1-slave1 ~]# mkdir /opt/mysql_3308/data

# Plain (uncompressed)
[root@idc1-slave1 ~]# innobackupex --apply-log --use-memory=4G /mysql-backup/3308/2015-10-26_03-11-03
[root@idc1-slave1 ~]# innobackupex --defaults-file=/etc/my_3308.cnf --copy-back --use-memory=4G /mysql-backup/3308/2015-10-26_03-11-03 

# Compressed and streamed
[root@idc1-slave1 ~]# mkdir -p /mysql-backup/3308/2015-10-26_03-15-03
[root@idc1-slave1 ~]# xbstream -x < /mysql-backup/3308/2015-10-26_03-15-03.xbstream -C /mysql-backup/3308/2015-10-26_03-15-03 
[root@idc1-slave1 ~]# innobackupex --decompress --parallel=4 /mysql-backup/3308/2015-10-26_03-15-03 
[root@idc1-slave1 ~]# find /mysql-backup/3308/2015-10-26_03-15-03 -name "*.qp" -delete
[root@idc1-slave1 ~]# innobackupex --apply-log --use-memory=4G /mysql-backup/3308/2015-10-26_03-15-03
[root@idc1-slave1 ~]# innobackupex --defaults-file=/etc/my_3308.cnf --copy-back --use-memory=4G /mysql-backup/3308/2015-10-26_03-15-03 

[root@idc1-slave1 ~]# chown -R mysql:mysql /opt/mysql_3308/data 
[root@idc1-slave1 ~]# /etc/init.d/mysql_3308 start 

[root@idc1-slave1 ~]# cd /opt/mysql_3308/data 

# When restoring from a master backup, check xtrabackup_binlog_pos_innodb
[root@idc1-slave1 data]# cat xtrabackup_binlog_pos_innodb
./bin-log-mysqld.000222	222333

# When restoring from a slave backup, check xtrabackup_slave_info
[root@idc1-slave1 data]# cat xtrabackup_slave_info 
CHANGE MASTER TO MASTER_LOG_FILE='bin-log-mysqld.000222', MASTER_LOG_POS=222333 

[root@idc1-slave1 data]# mysql_3308 -uroot -p 
mysql> change master to
master_host='idc1-master1',
master_port=3308,
master_user='backup-user',
master_password='backup-pass',
master_log_file='bin-log-mysqld.000222',
master_log_pos=222333;
mysql> start slave;
mysql> show slave status\G
mysql> exit;
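
Copying the coordinates by hand invites typos, so they can be extracted from xtrabackup_slave_info mechanically. A sketch against a fabricated sample file (on a real restore the file sits in the restored data directory):

```shell
# Extract MASTER_LOG_FILE and MASTER_LOG_POS from an xtrabackup_slave_info file.
slave_coords() {
  sed -n "s/.*MASTER_LOG_FILE='\([^']*\)', *MASTER_LOG_POS=\([0-9]*\).*/\1 \2/p" "$1"
}

# Demo input mirroring the file shown above:
info=$(mktemp)
echo "CHANGE MASTER TO MASTER_LOG_FILE='bin-log-mysqld.000222', MASTER_LOG_POS=222333" > "$info"
read -r logfile logpos <<< "$(slave_coords "$info")"
echo "file=$logfile pos=$logpos"
```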

7. Incremental backup and restore
An incremental backup builds on an existing full backup: for InnoDB-based tables only the changed portion is backed up, while MyISAM tables are still copied in full.

Backup host:
the slave server

Backup policy:
one full backup per day + two incremental backups per day

Steps:
7.1 Incremental backup
7.1.1 Take the full backup (compressed, not streamed):

[root@idc1-slave1 ~]# innobackupex --defaults-file=/etc/my_3308.cnf --user=backup-user --password=backup-pass --use-memory=4G --slave-info --safe-slave-backup --compress --compress-threads=8 /mysql-backup/3308

[root@idc1-slave1 ~]# ls -rt1 /mysql-backup/3308/ | tail -n 1
2015-10-26_06-48-33

[root@idc1-slave1 ~]# cat /mysql-backup/3308/2015-10-26_06-48-33/xtrabackup_checkpoints
backup_type = full-backuped
from_lsn = 0
to_lsn = 1631145
last_lsn = 1631145
compact = 0
recover_binlog_info = 0

7.1.2 Take the first incremental backup (compressed, not streamed):

[root@idc1-slave1 ~]# innobackupex --defaults-file=/etc/my_3308.cnf --user=backup-user --password=backup-pass --use-memory=4G --slave-info --safe-slave-backup --compress --compress-threads=8 --incremental /mysql-backup/3308 --incremental-basedir=/mysql-backup/3308/2015-10-26_06-48-33

[root@idc1-slave1 ~]# ls -rt1 /mysql-backup/3308/ | tail -n 1
2015-10-26_06-55-12

[root@idc1-slave1 ~]# cat /mysql-backup/3308/2015-10-26_06-55-12/xtrabackup_checkpoints
backup_type = incremental
from_lsn = 1631145
to_lsn = 1635418
last_lsn = 1635418
compact = 0
recover_binlog_info = 0

7.1.3 Take the second incremental backup (compressed, not streamed):

[root@idc1-slave1 ~]# innobackupex --defaults-file=/etc/my_3308.cnf --user=backup-user --password=backup-pass --use-memory=4G --slave-info --safe-slave-backup --compress --compress-threads=8 --incremental /mysql-backup/3308 --incremental-basedir=/mysql-backup/3308/2015-10-26_06-55-12

[root@idc1-slave1 ~]# ls -rt1 /mysql-backup/3308/ | tail -n 1
2015-10-26_06-59-49

[root@idc1-slave1 ~]# cat /mysql-backup/3308/2015-10-26_06-59-49/xtrabackup_checkpoints
backup_type = incremental
from_lsn = 1635418
to_lsn = 1639678
last_lsn = 1639678
compact = 0
recover_binlog_info = 0

7.2 Incremental restore:
7.2.1 Prepare the full backup (the --redo-only option is required)

[root@idc1-slave1 ~]# innobackupex --decompress --parallel=4 /mysql-backup/3308/2015-10-26_06-48-33
[root@idc1-slave1 ~]# find /mysql-backup/3308/2015-10-26_06-48-33 -name "*.qp" -delete
[root@idc1-slave1 ~]# innobackupex --apply-log --redo-only --use-memory=4G /mysql-backup/3308/2015-10-26_06-48-33

7.2.2 Merge the first incremental (the --redo-only option is required)

[root@idc1-slave1 ~]# innobackupex --decompress --parallel=4 /mysql-backup/3308/2015-10-26_06-55-12
[root@idc1-slave1 ~]# find /mysql-backup/3308/2015-10-26_06-55-12 -name "*.qp" -delete
[root@idc1-slave1 ~]# innobackupex --apply-log --redo-only --use-memory=4G /mysql-backup/3308/2015-10-26_06-48-33 --incremental-dir=/mysql-backup/3308/2015-10-26_06-55-12

7.2.3 Merge the second incremental (omit --redo-only when merging the last incremental)

[root@idc1-slave1 ~]# innobackupex --decompress --parallel=4 /mysql-backup/3308/2015-10-26_06-59-49
[root@idc1-slave1 ~]# find /mysql-backup/3308/2015-10-26_06-59-49 -name "*.qp" -delete
[root@idc1-slave1 ~]# innobackupex --apply-log --use-memory=4G /mysql-backup/3308/2015-10-26_06-48-33 --incremental-dir=/mysql-backup/3308/2015-10-26_06-59-49

7.2.4 Finalize the full backup (this final prepare runs without --redo-only)

[root@idc1-slave1 ~]# innobackupex --apply-log --use-memory=4G /mysql-backup/3308/2015-10-26_06-48-33

7.2.5 Restore the full backup (follow the plain restore steps above, from --copy-back onwards)
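
The from_lsn/to_lsn values form a chain: each incremental's from_lsn must equal its base's to_lsn, and that can be checked mechanically before merging. A sketch using throwaway xtrabackup_checkpoints files that mirror the LSNs above:

```shell
# Return the named field from an xtrabackup_checkpoints file ("key = value" lines).
ckpt_field() { awk -v k="$1" '$1 == k {print $3}' "$2"; }

# Verify that an incremental backup really chains off a given base.
chain_ok() {
  [ "$(ckpt_field to_lsn "$1")" = "$(ckpt_field from_lsn "$2")" ]
}

# Demo checkpoint files mirroring the LSNs from this article:
full=$(mktemp); inc1=$(mktemp)
printf 'backup_type = full-backuped\nfrom_lsn = 0\nto_lsn = 1631145\n' > "$full"
printf 'backup_type = incremental\nfrom_lsn = 1631145\nto_lsn = 1635418\n' > "$inc1"
chain_ok "$full" "$inc1" && echo "chain OK"
```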


A Linux trick: rescuing a mistakenly deleted file before its process stops

Background:
Today I was chatting in an ops group, mostly complaining about how hard data recovery is on the EXT4 file system. I had tried recovery on both EXT3 and EXT4 in the past, mainly with ext3grep and ext4undelete; EXT3 recovered successfully almost every time, but EXT4 never once did: the recovered files were always corrupt or incomplete.

Some of the more experienced members said they had recovered EXT4 successfully, and that anyone who complains about Linux file system recovery just doesn't know the fundamentals; study how Linux file systems handle blocks, inodes, and superblocks, and you will see file deletion quite differently.

I felt rather ashamed hearing that; I really had stayed at the tool level without digging into the underlying knowledge.

Then I brought up a scenario: a process keeps writing to a log file, and someone deletes that log by mistake. The group said recovering the file in that case is very easy, because it still exists on the system; the catch is that the process must keep running, since the file is gone for good once the process stops or restarts.

Most of us have seen the related symptom: while cleaning up disk space you delete some large log files, yet the space is not freed until the service is restarted or the process is killed.

So I did a quick search and found this article: http://unix.stackexchange.com/questions/101237/how-to-recover-files-i-deleted-now-by-running-rm
and then successfully simulated the scenario by writing a log with ping and deleting it.

The steps:

[dong@idc1-dong1 ~]$ ping heylinux.com &> ping.output.log &
[1] 22672

[dong@idc1-dong1 ~]$ tail -n 5 ping.output.log
64 bytes from 54.238.131.140: icmp_seq=14 ttl=47 time=176 ms
64 bytes from 54.238.131.140: icmp_seq=15 ttl=47 time=126 ms
64 bytes from 54.238.131.140: icmp_seq=16 ttl=47 time=205 ms
64 bytes from 54.238.131.140: icmp_seq=17 ttl=47 time=121 ms
64 bytes from 54.238.131.140: icmp_seq=18 ttl=47 time=121 ms

[dong@idc1-dong1 ~]$ rm -f ping.output.log
[dong@idc1-dong1 ~]$ ls ping.output.log
ls: cannot access ping.output.log: No such file or directory

[dong@idc1-dong1 ~]$ sudo lsof | grep ping.output
ping 22672 dong 1w REG 253,0 2666 2016 /home/dong/ping.output.log (deleted)
ping 22672 dong 2w REG 253,0 2666 2016 /home/dong/ping.output.log (deleted)

[dong@idc1-dong1 ~]$ sudo -i
[root@idc1-dong1 ~]# cd /proc/22672/fd

[root@idc1-dong1 fd]# ll
total 0
lrwx------ 1 root root 64 Sep  1 11:23 0 -> /dev/pts/0
l-wx------ 1 root root 64 Sep  1 11:23 1 -> /home/dong/ping.output.log (deleted)
l-wx------ 1 root root 64 Sep  1 11:23 2 -> /home/dong/ping.output.log (deleted)
lrwx------ 1 root root 64 Sep  1 11:23 3 -> socket:[26968949]

[root@idc1-dong1 fd]# tail -n 5 1
64 bytes from 54.238.131.140: icmp_seq=119 ttl=47 time=161 ms
64 bytes from 54.238.131.140: icmp_seq=120 ttl=47 time=125 ms
64 bytes from 54.238.131.140: icmp_seq=121 ttl=47 time=198 ms
64 bytes from 54.238.131.140: icmp_seq=122 ttl=47 time=151 ms
64 bytes from 54.238.131.140: icmp_seq=123 ttl=47 time=135 ms
[root@idc1-dong1 fd]# tail -n 5 2
64 bytes from 54.238.131.140: icmp_seq=121 ttl=47 time=198 ms
64 bytes from 54.238.131.140: icmp_seq=122 ttl=47 time=151 ms
64 bytes from 54.238.131.140: icmp_seq=123 ttl=47 time=135 ms
64 bytes from 54.238.131.140: icmp_seq=124 ttl=47 time=135 ms
64 bytes from 54.238.131.140: icmp_seq=125 ttl=47 time=134 ms

[root@idc1-dong1 fd]# cp 1 /root/ping.output.log.recover

[root@idc1-dong1 fd]# cd
[root@idc1-dong1 ~]# head -n 5 ping.output.log.recover
PING heylinux.com (54.238.131.140) 56(84) bytes of data.
64 bytes from 54.238.131.140: icmp_seq=1 ttl=47 time=227 ms
64 bytes from 54.238.131.140: icmp_seq=2 ttl=47 time=196 ms
64 bytes from 54.238.131.140: icmp_seq=3 ttl=47 time=157 ms
64 bytes from 54.238.131.140: icmp_seq=4 ttl=47 time=235 ms
[root@idc1-dong1 ~]# tail -n 5 ping.output.log.recover
64 bytes from 54.238.131.140: icmp_seq=146 ttl=47 time=172 ms
64 bytes from 54.238.131.140: icmp_seq=147 ttl=47 time=132 ms
64 bytes from 54.238.131.140: icmp_seq=148 ttl=47 time=212 ms
64 bytes from 54.238.131.140: icmp_seq=149 ttl=47 time=172 ms
64 bytes from 54.238.131.140: icmp_seq=150 ttl=47 time=132 ms

[root@idc1-dong1 ~]# pkill -kill -f ping
[root@idc1-dong1 ~]# cd /proc/22672/fd
-bash: cd: /proc/22672/fd: No such file or directory
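
The whole rescue can be packaged as a repeatable demo (Linux-only, since it relies on /proc; all paths are throwaway):

```shell
#!/usr/bin/env bash
# Demonstrate recovering a deleted-but-still-open file via /proc/<pid>/fd.
set -e
tmp=$(mktemp -d)

# A writer that keeps stdout open on the log file for a while.
( for i in 1 2 3; do echo "line $i"; done; sleep 30 ) > "$tmp/demo.log" &
pid=$!
sleep 1

rm -f "$tmp/demo.log"                      # gone from the directory...
cp "/proc/$pid/fd/1" "$tmp/recovered.log"  # ...but the fd still points at the data
kill "$pid" 2>/dev/null || true

cat "$tmp/recovered.log"
```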


Tuning kernel parameters for HAProxy in production

HAProxy
CPU: 8 cores
RAM: 16G
Count: 4

Servers
Count: 150
Type: HTTP/HTTPS, serving GET/POST requests, returning JSON and writing logs
Concurrent sessions sustained: 400K

System configuration
# grep -E 'maxconn|nbproc' /etc/haproxy/haproxy.cfg

maxconn     200000
nbproc           7

# cat /etc/security/limits.d/90-nproc.conf

# Default limit for number of user's processes to prevent
# accidental fork bombs.
# See rhbz #432903 for reasoning.

*          -    nproc     4096
root       -    nproc     unlimited

# cat /etc/security/limits.d/90-nofile.conf

*          -    nofile     200000

# cat /etc/sysctl.conf

# Kernel sysctl configuration file for Red Hat Linux
#
# For binary values, 0 is disabled, 1 is enabled.  See sysctl(8) and
# sysctl.conf(5) for more details.

# Controls IP packet forwarding
net.ipv4.ip_forward = 1
net.ipv4.ip_nonlocal_bind = 1

# Controls source route verification
net.ipv4.conf.default.rp_filter = 0

# Do not accept source routing
net.ipv4.conf.default.accept_source_route = 0

# Controls the System Request debugging functionality of the kernel
kernel.sysrq = 0

# Controls whether core dumps will append the PID to the core filename.
# Useful for debugging multi-threaded applications.
kernel.core_uses_pid = 1

# Controls the use of TCP syncookies
net.ipv4.tcp_syncookies = 1

# Disable netfilter on bridges.
net.bridge.bridge-nf-call-ip6tables = 0
net.bridge.bridge-nf-call-iptables = 0
net.bridge.bridge-nf-call-arptables = 0

# Controls the maximum size of a message, in bytes
kernel.msgmnb = 65536

# Controls the default maximum size of a message queue
kernel.msgmax = 65536

# Controls the maximum shared segment size, in bytes
kernel.shmmax = 68719476736

# Controls the maximum number of shared memory segments, in pages
kernel.shmall = 4294967296

# Maximize ephemeral port range
net.ipv4.ip_local_port_range = 1024 65535

# ARP related
net.ipv4.conf.all.arp_notify = 1
net.ipv4.conf.default.arp_ignore = 1
net.ipv4.conf.default.arp_announce = 2

# General gigabit tuning
net.core.somaxconn = 32768
net.core.rmem_max = 16777216
net.core.wmem_max = 16777216
net.core.rmem_default = 16777216
net.core.wmem_default = 16777216
net.ipv4.tcp_rmem = 4096 87380 16777216
net.ipv4.tcp_wmem = 4096 87380 16777216
net.ipv4.tcp_mem = 94500000 915000000 927000000

# Give the kernel more memory for tcp
# which need with many (100k+) open socket connections
net.core.netdev_max_backlog = 262144
net.ipv4.tcp_max_syn_backlog = 262144
net.ipv4.tcp_max_tw_buckets = 2000000
net.ipv4.tcp_tw_recycle = 1
net.ipv4.tcp_tw_reuse = 1
net.ipv4.tcp_timestamps = 0
net.ipv4.tcp_no_metrics_save = 1
net.ipv4.tcp_fin_timeout = 30
net.ipv4.tcp_keepalive_probes = 5
net.ipv4.tcp_keepalive_intvl = 30
net.ipv4.tcp_keepalive_time = 1800
net.ipv4.tcp_slow_start_after_idle = 0

## Protect against tcp time-wait assassination hazards
## drop RST packets for sockets in the time-wait state
net.ipv4.tcp_rfc1337 = 1

# Ensure that immediately subsequent connections use the new values
net.ipv4.route.flush = 1

# Increase system file descriptor limit
fs.file-max = 200000
kernel.pid_max = 65536

# Limit number of orphans, each orphan can eat up to 16M (max wmem) of unswappable memory
net.ipv4.tcp_max_orphans = 60000
net.ipv4.tcp_synack_retries = 3
net.ipv4.tcp_syn_retries = 3
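
When rolling these values out to a new machine, it helps to audit what the running kernel actually has. A small read-only sketch (the two keys checked are just examples; extend the list as needed):

```shell
# Compare a desired sysctl value against the running kernel (read-only).
check_sysctl() {
  local key=$1 want=$2
  local have
  have=$(sysctl -n "$key" 2>/dev/null) || { echo "$key: unreadable"; return 0; }
  if [ "$have" = "$want" ]; then
    echo "$key: OK ($have)"
  else
    echo "$key: have $have, want $want"
  fi
}

check_sysctl net.core.somaxconn 32768
check_sysctl net.ipv4.tcp_fin_timeout 30
```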


Bash Magic

References:
http://www.cnblogs.com/wangbin/archive/2011/10/11/2207179.html
http://www.programgo.com/article/84763718594/

The list of tricks:
1. Variable defaults

# when var is unset, fall back to hello
[dong@idc1-server1 ~]$ echo ${var}

[dong@idc1-server1 ~]$ echo ${var-hello}
hello
[dong@idc1-server1 ~]$ echo ${var:-hello}
hello

# when var is unset, fall back to hello and also assign hello to var
[dong@idc1-server1 ~]$ echo ${var=hello}
hello
[dong@idc1-server1 ~]$ echo ${var:=hello}
hello
[dong@idc1-server1 ~]$ echo ${var}
hello

# when var is set, expand to $var
[dong@idc1-server1 ~]$ echo ${var-bye}
hello

# when var is set, expand to bye
[dong@idc1-server1 ~]$ echo ${var+bye}
bye
[dong@idc1-server1 ~]$ echo ${var:+bye}
bye
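
The most common practical use of these expansions is giving script arguments and environment variables defaults. A small sketch (the function and variable names are made up):

```shell
# Give a positional parameter a fallback value with ${1:-...}.
greet() {
  local name=${1:-world}
  echo "hello, $name"
}

# Set LOG_LEVEL only if it is unset or empty; ':' discards the expansion result.
: "${LOG_LEVEL:=info}"

greet
greet bash
echo "log level: $LOG_LEVEL"
```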

2. Built-in variables

[dong@idc1-server1 ~]$ cat magic.sh
echo $#   # number of arguments
echo $*   # all arguments
echo $@     # all arguments
echo ${@:2} # arguments from the 2nd through the last
echo $?   # exit status of the previous command
echo $$   # PID of the current shell
/bin/true &  # run a command in the background with &
echo $!   # PID of the most recent command backgrounded with &
echo $0   # name of the current script
echo $1   # the first argument
args=$#   # assign the number of arguments to args
echo ${!#}  # the last argument
echo ${!args}  # the last argument


## Script output
[dong@idc1-server1 ~]$ ./magic.sh param1 param2 param3
3
param1 param2 param3
param1 param2 param3
param2 param3
0
46490
46491
./magic.sh
param1
param3
param3

3. String slicing

# define a string
[dong@idc1-server1 ~]$ var=http://heylinux.com/archives/3668.html

# number of characters in the string
[dong@idc1-server1 ~]$ echo ${#var}
38

# '#' truncates from the left, keeping what is to the right;
# var is the variable name, '#' is the operator, and *// deletes up to and including the first //;
# i.e. it removes http://
[dong@idc1-server1 ~]$ echo ${var#*//}
heylinux.com/archives/3668.html

# '##' truncates from the left, keeping what is to the right;
# ##*/ deletes up to and including the last (rightmost) /;
# i.e. it removes http://heylinux.com/archives/
[dong@idc1-server1 ~]$ echo ${var##*/}
3668.html

# '%' truncates from the right, keeping what is to the left;
# %/* deletes, from the right, the first / and everything after it;
[dong@idc1-server1 ~]$ echo ${var%/*}
http://heylinux.com/archives

# '%%' truncates from the right, keeping what is to the left;
# %%/* deletes, from the right, the last (leftmost) / and everything after it;
[dong@idc1-server1 ~]$ echo ${var%%/*}
http:

# substring by left offset and length;
# 0 means start at the first character, 5 is the number of characters;
[dong@idc1-server1 ~]$ echo ${var:0:5}
http:

# from a left offset to the end;
# 7 means start at the eighth character and take everything to the end;
[dong@idc1-server1 ~]$ echo ${var:7}
heylinux.com/archives/3668.html

# substring by right offset and length;
# 0-7 means start at the seventh character from the right, 3 is the number of characters;
[dong@idc1-server1 ~]$ echo ${var:0-7:3}
68.

# from a right offset to the end;
[dong@idc1-server1 ~]$ echo ${var:0-7}
68.html

# redefine the string for the replacement examples below
[dong@idc1-server1 ~]$ var=http://heylinux.com/tag1/tag2/tag3/3668.html

# replace the first tag with page
[dong@idc1-server1 ~]$ echo ${var/tag/page}
http://heylinux.com/page1/tag2/tag3/3668.html

# replace every tag with page
[dong@idc1-server1 ~]$ echo ${var//tag/page}
http://heylinux.com/page1/page2/page3/3668.html

# delete the first tag (replace it with the empty string)
[dong@idc1-server1 ~]$ echo ${var/tag}
http://heylinux.com/1/tag2/tag3/3668.html

# delete every tag (replace them with the empty string)
[dong@idc1-server1 ~]$ echo ${var//tag}
http://heylinux.com/1/2/3/3668.html

# replace the http prefix with ftp
[dong@idc1-server1 ~]$ echo ${var/#http/ftp}
ftp://heylinux.com/tag1/tag2/tag3/3668.html

# replace the html suffix with php
[dong@idc1-server1 ~]$ echo ${var/%html/php}
http://heylinux.com/tag1/tag2/tag3/3668.php
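
A practical payoff of these operators is splitting a path into its parts without spawning external tools like dirname, basename, or cut:

```shell
# Split a path into directory, filename, extension, and stem
# using only parameter expansion (the sample path is made up).
path=/opt/mysql_3308/data/backup.tar.gz

dir=${path%/*}        # drop the shortest /* suffix  -> directory part
file=${path##*/}      # drop the longest  */ prefix  -> filename
ext=${file##*.}       # the last extension only
stem=${file%%.*}      # the filename before the first dot

echo "$dir"    # /opt/mysql_3308/data
echo "$file"   # backup.tar.gz
echo "$ext"    # gz
echo "$stem"   # backup
```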

4. String tests

[[ "${string}" =~ "*" ]] # does the string contain a literal '*' (a quoted pattern on the right of =~ matches literally)
[[ "${string}" =~ "rm" ]] # does the string contain the substring 'rm'
[[ "${string}" == -* ]] # does the string start with '-' (the glob must be unquoted to act as a pattern)
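
The quoting here is subtle enough to be worth a demonstration: on the right-hand side of =~ and ==, quoted text matches literally while unquoted text acts as a regex or glob.

```shell
# Quoted vs unquoted patterns inside [[ ]] (bash).
string="rm -rf *"

[[ "$string" =~ "*" ]]  && echo "contains a literal *"   # quoted: literal match
[[ "$string" =~ "rm" ]] && echo "contains rm"            # quoted: substring match
[[ "$string" == rm* ]]  && echo "starts with rm"         # unquoted: glob pattern
[[ "$string" == -* ]]   || echo "does not start with -"
```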

5. Tests and arithmetic

[dong@idc1-server1 ~]$ cat magic.sh
#!/bin/bash
echo $(date +%Y%m%d).gz # use the result of a command as a value

echo $((4+2*(8-1*6)-4/2)) # mixed arithmetic: add, subtract, multiply, divide
echo $((7%3)) # the remainder of 7 divided by 3
echo $((${RANDOM}%100)) # a random number below 100

echo {1..10} # the range 1 to 10
echo {01..10} # the range 1 to 10, zero-padded to two digits
echo {01..10..2} # the range 1 to 10 with a step of 2
echo {a..f} # the range of letters a to f
echo {a..f..2} # the range of letters a to f with a step of 2

[ $var != "no" ] && echo "cool.1" # fragile: a syntax error when var is unset
[ $num -ne 5 ] && echo "yeah.1" # fragile: a syntax error when num is unset

[ "$var" != "no" ] && echo "cool.2" # robust: still evaluates when var is unset
[[ $var != "no" ]] && echo "cool.3" # robust: still evaluates when var is unset
[[ "$var" != "no" ]] && echo "cool.4" # more robust: still evaluates when var is unset
[[ $num -ne 5 ]] && echo "yeah.2" # robust: still evaluates when num is unset

(($num>=5)) && echo "wow.1" # fragile C-style test: a syntax error when num is unset; available operators: <, >, <=, >=, !=

for i in {1..5}; do echo $i; done # a for loop

for ((i=1; i<=5; i++)); do echo $i; done # a C-style for loop

# Arrays
array=(
array_value1
array_value2
array_value3
)
echo ${array[0]} # the first element
echo ${array[@]} # all elements
echo ${array[*]} # all elements
echo ${#array[*]} # the number of elements
echo ${#array[@]} # the number of elements

# Appending elements to an array
array+=(arrary_new1)
array+=(arrary_new2)
array+=(arrary_new3)
echo ${array[*]}

# Dictionaries (associative arrays)
declare -A dict=(
["dict_key1"]="dict_value1"
["dict_key2"]="dict_value2"
)
echo ${dict["dict_key1"]} # the value for the first key
echo ${dict[@]} # all values

# Plain assignment
a=a+5
echo $a

# Arithmetic assignment
let b=b+5
echo $b

# A more robust arithmetic assignment
let "c += 5"
echo $c

# Building a command as a string, to avoid mixing up variables and quoting
maxnum=100
logfile=/var/log/httpd/access_log
awk '{print $1}' ${logfile} |sort |uniq -c |awk '($1 > ${maxnum}){print $2}'
# In the command above, the single quotes around the second awk program prevent
# ${maxnum} from being expanded to 100, causing a syntax error.
# The right approach is to build the command first and then run it with eval:
command="awk '{print \$1}' ${logfile} |sort |uniq -c |awk '(\$1 > ${maxnum}){print \$2}'"
eval ${command}

# Use PIPESTATUS to get the exit code of each stage of a pipeline
lsof | grep access_log | grep -v httpd
return_codes=(${PIPESTATUS[*]})
echo ${return_codes[*]}

# Read multiple parameters per line in a while loop
echo -e "A B C\n 1 2 3" | while read param1 param2 param3; do
  echo ${param1}
  echo ${param2} ${param3}
done


## Script output
[dong@idc1-server1 ~]$ ./magic.sh
20150609.gz

6
1
56

1 2 3 4 5 6 7 8 9 10
01 02 03 04 05 06 07 08 09 10
01 03 05 07 09
a b c d e f
a c e

./magic.sh: line 8: [: !=: unary operator expected
./magic.sh: line 9: [: -ne: unary operator expected

cool.2
cool.3
cool.4
yeah.2

-bash: ((: >=5: syntax error: operand expected (error token is ">=5")

1
2
3
4
5

1
2
3
4
5

array_value1 array_value2 array_value3
array_value1 array_value2 array_value3
array_value1 array_value2 array_value3
3
3

array_value1 array_value2 array_value3 arrary_new1 arrary_new2 arrary_new3

dict_value1
dict_value1 dict_value2

a+5

5

5

awk: ($1 > ${maxnum}){print $2}
awk:        ^ syntax error
awk: ($1 > ${maxnum}){print $2}
awk:                 ^ syntax error

10.100.1.11

0 0 1

A
B C
1
2 3
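As an aside, the eval step above can be avoided entirely: awk's -v option passes a shell variable into the awk program directly, so no command string needs to be built. A minimal self-contained sketch (the sample log data and threshold are illustrative):

```shell
#!/bin/bash
# Pass the threshold into awk with -v instead of building a command string for eval.
maxnum=2
logfile=$(mktemp)

# A tiny stand-in access log: one client IP per line.
printf '%s\n' 10.0.0.1 10.0.0.1 10.0.0.1 10.0.0.2 > "${logfile}"

# Count requests per client IP and print the IPs above the threshold.
awk '{print $1}' "${logfile}" | sort | uniq -c \
  | awk -v maxnum="${maxnum}" '($1 > maxnum){print $2}'   # prints 10.0.0.1

rm -f "${logfile}"
```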

6. Coding style
https://google.github.io/styleguide/shell.xml


Building an Enterprise-Grade VPC Private Network on AWS [original] [illustrated]

References:
http://docs.aws.amazon.com/AmazonVPC/latest/UserGuide/VPC_Scenario2.html

Background:
Public cloud is now ubiquitous, and most startups run their early-stage business on public-cloud virtual machines.
Among public cloud vendors, AWS stands out, in features, reliability, scalability, global reach, and more.

This article walks through creating and configuring a VPC private network on AWS, with screenshots of each step in the Web UI to make the process easier to follow.

VPC stands for Virtual Private Cloud.
Many companies and projects pick a public cloud early on, and almost all of them start with the simplest setup: every virtual machine gets a public IP address and a private IP address. The private IP cannot be changed and its subnet cannot be chosen; if the server is stopped and started again, the private IP changes at random. The public IP, on the other hand, can be pinned by binding a fixed IP address to the instance.

If there are only a few servers and there are no particular requirements around security, scalability, or high availability, this architecture is good enough.

But once the server count grows, that is, once the company reaches its mid-stage, the value of a VPC becomes increasingly clear. The most significant benefits:
1. Security: with a VPC, almost all servers live in a private network, each with only a private IP, never directly exposed to the internet; only the load balancers and the VPN face the public network. In such a topology it is also easy to restrict access to all servers: everyone who needs to reach a server first logs in to the VPN server, then accesses every machine through private IPs and internal DNS.
2. High availability and scalability: LVS/Nginx/HAProxy can be deployed directly as load balancers, with Keepalived for active/standby failover.
3. Networking: different subnets with different route tables allow servers to be grouped flexibly by workload, and the VPC can be connected to an on-premises IDC network over VPN, i.e. a hybrid cloud.

Architecture diagram:
vpc_100

Configuration steps
1. In the VPC Dashboard, select Your VPCs and click Create VPC to create a VPC;
vpc_101

2. Click Actions and enable ClassicLink;
vpc_104

3. Click Actions and enable DNS hostnames;
vpc_105
vpc_106

4. The property page of the new VPC looks like this:
vpc_107

5. Select Subnets and click Create Subnet to create a Public Subnet;
vpc_102

6. Create a Private Subnet the same way;
vpc_103

7. Select Route Tables, rename the default Main route table to private_local_nat, and associate it with the Private Subnet;
vpc_108

8. Select Route Tables, click Create Route Table to create a new route table public_local_igw, and associate it with the Public Subnet;
vpc_109
vpc_110

9. The Route Tables page after these steps looks like this:
vpc_111

10. Select Internet Gateways and click Create Internet Gateway to create an internet gateway as the public egress for the Public Subnet;
vpc_112
vpc_113

11. Select DHCP Options Sets and name the default DHCP Options Set default_dhcp;
vpc_114

12. Select Route Tables, click public_local_igw, and add a route that sets the new Internet Gateway as the default public egress gateway of the Public Subnet;
vpc_115
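For reference, the Web UI steps up to this point can also be scripted with the AWS CLI. The following is only a sketch: the 172.18.0.0/16 CIDR matches the iptables rule later in this article, but the /24 subnet CIDRs are hypothetical and are not taken from the original walkthrough:

```shell
#!/bin/bash
# Sketch of the VPC/subnet/route-table setup with the AWS CLI (CIDRs assumed).
vpc_id=$(aws ec2 create-vpc --cidr-block 172.18.0.0/16 --query 'Vpc.VpcId' --output text)
aws ec2 modify-vpc-attribute --vpc-id "$vpc_id" --enable-dns-hostnames '{"Value":true}'

# One public and one private subnet (illustrative /24 blocks).
pub_id=$(aws ec2 create-subnet --vpc-id "$vpc_id" --cidr-block 172.18.4.0/24 --query 'Subnet.SubnetId' --output text)
priv_id=$(aws ec2 create-subnet --vpc-id "$vpc_id" --cidr-block 172.18.8.0/24 --query 'Subnet.SubnetId' --output text)

# Internet gateway as the public egress for the Public Subnet.
igw_id=$(aws ec2 create-internet-gateway --query 'InternetGateway.InternetGatewayId' --output text)
aws ec2 attach-internet-gateway --internet-gateway-id "$igw_id" --vpc-id "$vpc_id"

# Route table for the Public Subnet with a default route via the IGW.
rtb_id=$(aws ec2 create-route-table --vpc-id "$vpc_id" --query 'RouteTable.RouteTableId' --output text)
aws ec2 associate-route-table --route-table-id "$rtb_id" --subnet-id "$pub_id"
aws ec2 create-route --route-table-id "$rtb_id" --destination-cidr-block 0.0.0.0/0 --gateway-id "$igw_id"
```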

13. Next, so that instances inside the Private Subnet can also reach the internet, we create an instance in the Public Subnet, configure it to provide NAT via iptables, and add it to the private_local_nat route table as the default public egress gateway of the Private Subnet;
13.1 Create the Gateway Instance;
vpc_116
vpc_117
vpc_118
vpc_119
vpc_120
vpc_121

13.2 Assign a fixed public IP to the Gateway Instance;
vpc_122

13.3 Log in to the Gateway Instance and configure it as a NAT gateway via iptables;
[dong@Dong-MacBookPro sshkeys]$ chmod 400 drawbridge-tokyo-keypair.pem
[dong@Dong-MacBookPro sshkeys]$ ssh -i drawbridge-tokyo-keypair.pem root@52.68.53.85
[root@ip-172-18-4-11 ~]# setenforce 0
[root@ip-172-18-4-11 ~]# vi /etc/selinux/config

# This file controls the state of SELinux on the system.
# SELINUX= can take one of these three values:
#     enforcing - SELinux security policy is enforced.
#     permissive - SELinux prints warnings instead of enforcing.
#     disabled - No SELinux policy is loaded.
SELINUX=disabled
# SELINUXTYPE= can take one of these two values:
#     targeted - Targeted processes are protected,
#     mls - Multi Level Security protection.
SELINUXTYPE=targeted

[root@ip-172-18-4-11 ~]# vi /etc/sysctl.conf

# Kernel sysctl configuration file for Red Hat Linux
#
# For binary values, 0 is disabled, 1 is enabled.  See sysctl(8) and
# sysctl.conf(5) for more details.

# Controls IP packet forwarding
net.ipv4.ip_forward = 1

# Controls source route verification
net.ipv4.conf.default.rp_filter = 1

# Do not accept source routing
net.ipv4.conf.default.accept_source_route = 0

# Controls the System Request debugging functionality of the kernel
kernel.sysrq = 0

# Controls whether core dumps will append the PID to the core filename.
# Useful for debugging multi-threaded applications.
kernel.core_uses_pid = 1

# Controls the use of TCP syncookies
net.ipv4.tcp_syncookies = 1

# Disable netfilter on bridges.
net.bridge.bridge-nf-call-ip6tables = 0
net.bridge.bridge-nf-call-iptables = 0
net.bridge.bridge-nf-call-arptables = 0

# Controls the default maxmimum size of a mesage queue
kernel.msgmnb = 65536

# Controls the maximum size of a message, in bytes
kernel.msgmax = 65536

# Controls the maximum shared segment size, in bytes
kernel.shmmax = 68719476736

# Controls the maximum number of shared memory segments, in pages
kernel.shmall = 4294967296

[root@ip-172-18-4-11 ~]# sysctl -p

net.ipv4.ip_forward = 1
net.ipv4.conf.default.rp_filter = 1
net.ipv4.conf.default.accept_source_route = 0
kernel.sysrq = 0
kernel.core_uses_pid = 1
net.ipv4.tcp_syncookies = 1
kernel.msgmnb = 65536
kernel.msgmax = 65536
kernel.shmmax = 68719476736
kernel.shmall = 4294967296

[root@ip-172-18-4-11 ~]# vi /etc/sysconfig/iptables

# Firewall configuration written by system-config-firewall
# Manual customization of this file is not recommended.
*nat
:PREROUTING ACCEPT [0:0]
:POSTROUTING ACCEPT [0:0]
:OUTPUT ACCEPT [0:0]
-A POSTROUTING -s 172.18.0.0/16 -o eth0 -j MASQUERADE
COMMIT

[root@ip-172-18-4-11 ~]# /etc/init.d/iptables restart

iptables: Setting chains to policy ACCEPT: filter          [  OK  ]
iptables: Flushing firewall rules:                         [  OK  ]
iptables: Unloading modules:                               [  OK  ]
iptables: Applying firewall rules:                         [  OK  ]

[root@ip-172-18-4-11 ~]# chkconfig iptables on
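With the rules in place, NAT can be spot-checked from an instance in the Private Subnet. This is a suggested check, not part of the original steps; the expected public IP is the Elastic IP assigned to the Gateway Instance above:

```shell
#!/bin/bash
# Run these on an instance inside the Private Subnet.
ping -c 3 8.8.8.8                      # basic outbound connectivity through the NAT gateway
curl -s http://checkip.amazonaws.com   # should print the Gateway Instance's public IP
```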

13.4 Edit the Gateway Instance's network attributes and disable Source/Dest. Check;
vpc_123
vpc_124

13.5 Select Route Tables, click private_local_nat, and add a route that sets the new Gateway Instance as the default public egress gateway of the Private Subnet;
vpc_125

14. This completes the creation of an enterprise-grade VPC on AWS. The VPC contains two subnets: instances in the Public Subnet can bind public IPs directly and talk to instances in the Private Subnet over private IPs, while instances in the Private Subnet cannot bind public IPs, but can reach the internet through the Gateway Instance and communicate with instances in the Public Subnet.

In such a network we can deploy the load balancers in the Public Subnet, attach a dedicated Security Group to them, open only the required ports, and dispatch requests to instances in the Private Subnet. We can run a DNS server that gives every instance a private domain name; run a VPN server that hands each connected client a private IP and points its default DNS at our own server, so instances can be reached directly by their private names; and use Keepalived for VIP-based high availability of various services. Almost everything we can do in an on-premises IDC can be done here.

One thing worth mentioning: connecting the VPC to an on-premises IDC network over VPN is done through the VPC's Virtual Private Gateways and Route Tables. The configuration differs considerably from case to case, so I will write it up separately in a later post.



Hadoop Ops Notes: Upgrading CDH 5.0.0 to CDH 5.3.0

References:
Hadoop: http://www.cloudera.com/content/cloudera/en/documentation/core/v5-3-x/topics/cdh_ig_earlier_cdh5_upgrade.html?scroll=topic_8
Oozie: http://www.cloudera.com/content/cloudera/en/documentation/core/v5-3-x/topics/cdh_ig_oozie_upgrade.html
Hive: http://www.cloudera.com/content/cloudera/en/documentation/core/v5-3-x/topics/cdh_ig_hive_upgrade.html
Pig: http://www.cloudera.com/content/cloudera/en/documentation/core/v5-3-x/topics/cdh_ig_pig_upgrade.html

1. Stop Monit on all Hadoop servers (we use Monit in production to watch processes)
Log in to idc2-admin1 (our production admin host and Yum repo server)
# mkdir /root/cdh530_upgrade_from_500
# cd /root/cdh530_upgrade_from_500
# pssh -i -h idc2-hnn-rm-hive 'service monit stop'
# pssh -i -h idc2-hmr.active 'service monit stop'

2. Confirm that the local CDH 5.3.0 Yum repo server is ready
http://idc2-admin1/repo/cdh/5.3.0/
http://idc2-admin1/repo/cloudera-gplextras5.3.0/

3. Update the corresponding repo templates in Ansible (we use Ansible in production for configuration management)

{% if "idc2" in group_names %}

...

{% if "cdh5-all" in group_names %}
[heylinux.el6.cloudera-cdh5.3.0]
name= el6 yum cloudera cdh5.3.0
baseurl=http://idc2-admin1/repo/cdh/5.3.0
enabled=1
gpgcheck=0

[heylinux.el6.cloudera-gplextras5.3.0]
name= el6 yum cloudera gplextras5.3.0
baseurl=http://idc2-admin1/repo/cloudera-gplextras5.3.0
enabled=1
gpgcheck=0
{% else %}

...

{% endif %}

4. Update the repo file (/etc/yum.repos.d/heylinux.repo) on all Hadoop servers
# ansible-playbook --private-key /path/to/key_root -u root --vault-password-file=/path/to/vault_passwd.file base.yml -i hosts.idc2 --tags localrepos --limit cdh5-all

5. Upgrade HDFS
5.1. Find the current Active NameNode (our production DNS has a CNAME that is continuously checked and kept pointing at the Active NameNode)
# host active-idc2-hnn
active-idc2-hnn.heylinux.com is an alias for idc2-hnn2.heylinux.com
idc2-hnn2.heylinux.com has address 172.16.2.12

5.2. On the Active NameNode, enter safe mode and save a new fsimage, then wait for the whole process to finish.
# sudo -u hdfs hdfs dfsadmin -safemode enter
# sudo -u hdfs hdfs dfsadmin -saveNamespace

5.3 Stop all Hadoop services
Back in the working directory on idc2-admin1
# cd /root/cdh530_upgrade_from_500

First use pssh to stop the Hadoop processes on the NameNode, ResourceManager, and Hive servers in bulk (the corresponding server addresses or hostnames are listed in the files idc2-hnn-rm-hive and idc2-hmr.active)
# pssh -i -h idc2-hnn-rm-hive 'for x in `cd /etc/init.d ; ls hadoop-*` ; do sudo service $x status ; done'
# pssh -i -h idc2-hmr.active 'for x in `cd /etc/init.d ; ls hadoop-*` ; do sudo service $x status ; done'

# pssh -i -h idc2-hnn-rm-hive 'for x in `cd /etc/init.d ; ls hadoop-*` ; do sudo service $x stop ; done'
# pssh -i -h idc2-hmr.active 'for x in `cd /etc/init.d ; ls hadoop-*` ; do sudo service $x stop ; done'

# Delete any libhadoop.so that conflicts with the new version (we have Snappy installed in production; it generates its own libhadoop.so, which conflicts with the one shipped in CDH 5.3.0, and places it in the current JDK lib directory).
# pssh -i -h idc2-hnn-rm-hive 'rm -f /usr/java/jdk1.7.0_45/jre/lib/amd64/libhadoop.so'
# pssh -i -h idc2-hmr.active 'rm -f /usr/java/jdk1.7.0_45/jre/lib/amd64/libhadoop.so'

Back up the HDFS metadata on the NameNodes (our production HA pair consists of idc2-hnn1 and idc2-hnn2):
# mkdir /root/cdh530upgrade
# cd /root/cdh530upgrade
# tar -cf /root/nn_backup_data.data1.`date +%Y%m%d`.tar /data1/dfs/nn
# tar -cf /root/nn_backup_data.data2.`date +%Y%m%d`.tar /data2/dfs/nn

6. Upgrade the Hadoop packages
Log in to the Hive server idc2-hive1 and upgrade
# yum clean all; yum upgrade hadoop

Log in to the ResourceManager servers idc2-rm1 and idc2-rm2 and upgrade
# yum clean all; yum upgrade hadoop

Back on idc2-admin1, upgrade all the DataNode servers idc2-hmr*
# pssh -i -h idc2-hmr.active 'yum clean all; yum upgrade hadoop hadoop-lzo -y'

Log in to idc2-hnn1 and upgrade (the Standby NameNode, as determined by the earlier host active-idc2-hnn command)
# yum clean all; yum upgrade hadoop hadoop-lzo

Log in to idc2-hnn2 and upgrade (the Active NameNode, as determined by the earlier host active-idc2-hnn command)
# yum clean all; yum upgrade hadoop hadoop-lzo

Back on idc2-admin1, upgrade all the Hadoop clients
# pssh -i -h idc2-client 'yum clean all; yum upgrade hadoop -y'

7. Start the services
Log in to the JournalNode servers (idc2-hnn1, idc2-hnn2, and idc2-rm1 in our production) and start the service
# service hadoop-hdfs-journalnode start

Log in to every DataNode (the idc2-hmr* servers in our production) and start the service
# service hadoop-hdfs-datanode start

Log in to the Active NameNode and upgrade the HDFS metadata
# service hadoop-hdfs-namenode upgrade
# tailf /var/log/hadoop/hadoop-hdfs-namenode-`hostname -s`.heylinux.com.log

Wait until the whole process finishes, i.e. until the log contains something like:
/var/lib/hadoop-hdfs/cache/hadoop/dfs/<name> is complete.

Wait for the NameNode to leave Safe Mode, then restart the Standby NameNode.

Log in to the Standby NameNode and restart the service
# sudo -u hdfs hdfs namenode -bootstrapStandby
# service hadoop-hdfs-namenode start

Log in to every ResourceManager and start the service
# service hadoop-yarn-resourcemanager start

Log in to every NodeManager (the idc2-hmr* servers in our production) and start the service
# service hadoop-yarn-nodemanager start

Start the HistoryServer on the Active ResourceManager (idc2-rm1 in our production)
# service hadoop-mapreduce-historyserver start

This completes the Hadoop part of the upgrade. The sections below cover upgrading Hive, Oozie, and Pig.

8. Upgrade the Hive and Oozie servers (both installed on a single machine, idc2-hive1, in our production)
8.1 Upgrade the Hive server
Back up the Metastore database
# mkdir -p /root/backupfiles/hive
# cd /root/backupfiles/hive
# mysqldump -uoozie -pheylinux metastore > metastore.sql.bak.`date +%Y%m%d`

Update hive-site.xml

Confirm the following settings are present in hive-site.xml:
<property>
  <name>datanucleus.autoCreateSchema</name>
  <value>false</value>
</property>
<property>
  <name>datanucleus.fixedDatastore</name>
  <value>true</value>
</property>

Stop the Hive services
# service hive-server2 stop
# service hive-metastore stop

Upgrade the Hive packages
# yum upgrade hive hive-metastore hive-server2 hive-jdbc
# yum install hive-hbase hive-hcatalog hive-webhcat

Upgrade the Hive Metastore schema
# sudo -u oozie /usr/lib/hive/bin/schematool -dbType mysql -upgradeSchemaFrom 0.12.0

Start the Hive services
# service hive-metastore start
# service hive-server2 start

8.2 Upgrade the Oozie server
Back up the Oozie database
# mkdir -p /root/backupfiles/hive
# cd /root/backupfiles/hive
# mysqldump -uoozie -pheylinux oozie > oozie.sql.bak.`date +%Y%m%d`

Back up the Oozie configuration files
# tar cf oozie.conf.bak.`date +%Y%m%d` /etc/oozie/conf

Stop Oozie
# service oozie stop

Upgrade the Oozie packages
# yum upgrade oozie oozie-client

Carefully compare the parameters in the new configuration files against the old ones, and carry the old parameters over into the new files.

Back up the Oozie lib directory
# tar cf oozie.lib.bak.`date +%Y%m%d` /var/lib/oozie

Upgrade the Oozie database
# sudo -u oozie /usr/lib/oozie/bin/ooziedb.sh upgrade -run

Upgrade the Oozie shared library
# sudo -u oozie hadoop fs -mv /user/oozie/share /user/oozie/share.orig.`date +%Y%m%d`
# sudo oozie-setup sharelib create -fs hdfs://idc1-hnn2:8020 -locallib /usr/lib/oozie/oozie-sharelib-yarn.tar.gz

Move all the libraries from /user/oozie/share/lib/lib_<new_date_string> to /user/oozie/share/lib (<new_date_string> is the timestamp on the directory created above)
# sudo -u oozie mv /user/oozie/share/lib/lib_<new_date_string>/* /user/oozie/share/lib/

Check every file under /user/oozie/share in HDFS against the backed-up share.orig.`date +%Y%m%d` one by one; for packages whose names carry a "cdh5" version string keep only the newer copy, and copy everything else into the new lib directory.

Start the Oozie server
# service oozie start

9. Upgrade Pig
Kill all running Pig processes
# pkill -kill -f pig

Upgrade the Pig package
# yum upgrade pig

10. Once all packages have been upgraded and HDFS is working normally, run finalizeUpgrade to wrap up
Log in to the Active NameNode and run
# sudo -u hdfs hdfs dfsadmin -finalizeUpgrade
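Since finalizeUpgrade discards the pre-upgrade metadata and makes a rollback to CDH 5.0.0 impossible, it is worth sanity-checking the cluster first. The following standard commands are a suggestion, not part of the original runbook:

```shell
#!/bin/bash
# On the Active NameNode: confirm the DataNodes are live and the filesystem is healthy.
sudo -u hdfs hdfs dfsadmin -report | head -n 20   # capacity plus live/dead DataNode counts
sudo -u hdfs hdfs fsck / | tail -n 5              # should end with "Status: HEALTHY"

# Run a trivial job to confirm YARN works end to end (examples jar path as shipped by CDH 5 packages).
sudo -u hdfs hadoop jar /usr/lib/hadoop-mapreduce/hadoop-mapreduce-examples.jar pi 2 10
```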



Some Reflections on My Own Mindset

In both work and life I am often troubled by some unhealthy attitudes that leave me conflicted inside and waste a great deal of precious time, so I am writing down these reflections.

1. Accept that "imperfection" exists
The company issued me a new MacBook Pro, which I liked very much. Having no prior experience with it, I tried installing all sorts of software, and quite a few programs leave leftover junk behind after uninstalling; they cannot be cleaned up to the point where it is as if they were never installed.
Yet I wasted a lot of time trying to scrub away these traces, and whenever I ran into remnants that could not be removed, I felt so conflicted and frustrated that I was often tempted to reinstall the whole system.

Reflection:
In truth, uninstalling software the normal way and deleting the obvious leftovers already achieves the goal of cleaning up.
Agonizing over it and pouring time into it only serves an unhealthy urge to feel "clean", to feel restored to "the beginning". In reality it not only wastes precious time and brings no benefit (such as saved disk space), it also carries a very real risk of deleting important system files and breaking the whole system.
I notice the same mindset when using my phone and various everyday items; it is pointless and a waste of time.
Installing and trying new software is simply unavoidable in normal use, and keeping a computer "spotless", without a single piece of "clutter", is completely impossible.
So I need to accept that there will be some so-called "unclean" things on the machine. After clearing the obvious leftovers, maintaining the software and system environment I actually use is what matters most.
Taking a step back, the lesson is to accept that "imperfection" exists and to live well in the "present", because a state free of "imperfections" does not exist, and what has happened cannot be erased; it can only be accepted and learned from.

2. Respect other people's habits and privacy
We have a shared server at work with accounts for many colleagues, both personal and shared. Having organized my own HOME directory, my habits make me itch to rename or delete the arbitrarily named files other colleagues leave around the system, such as a.txt, 111111.sh, or asdf; sometimes I even go too far and poke into colleagues' HOME directories, and at the sight of a chaotic, unorganized pile of files and directories I feel the urge to tidy it up.
This mindset mostly just upsets me, and if I ever actually gave in and rearranged the files and directories other colleagues created, it would certainly inconvenience them and would amount to violating their privacy.
I notice the same mindset in writing code and documentation, maintaining systems, and many other situations where I collaborate with colleagues or share an environment.

Reflection:
It is just like living in a residential compound: taking good care of your own home and of the shared spaces is already enough.
Your own home you may of course arrange according to your own habits, to your own satisfaction;
the shared spaces cannot be run purely by your own preferences: the best approach is to propose improvements, then actively push through the suggestions everyone agrees on, so the shared environment actually gets better;
and other people's homes reflect their own lifestyles and habits: do not interfere at all, and do not carp about what you find displeasing.
So the better mindset is:
On shared servers, propose better conventions to the team, for naming, directory layout, and so on, and improve things proactively once everyone agrees; where no consensus is reached and things do not match my own habits, accept them, and absolutely never modify or delete anything unilaterally.
For files and directories that belong to other users but sit in shared areas and will be handed over to others, offer better suggestions; if they are adopted, help carry them out; if not, learn to accept and adapt, and absolutely never force my own ideas on others regardless of their feelings.
And for other users' private environments, such as their HOME directories, recognize that they are private, respect them, and do not interfere at all.

3. Appreciate others and learn from them
When collaborating with colleagues on a project, everyone has their own habits and areas of expertise. Where there is no shared rule book or template for everyone to follow, I tend to get hung up on things I consider "unprofessional" or "overly idiosyncratic" in my colleagues' work, such as their coding style, tools, or way of operating.

Reflection:
It is the collision of different ways of thinking that brings progress and innovation.
Over the course of my past work, much of what I know was learned from colleagues.
So the right attitude is to embrace differences. Where I privately judge a colleague "unprofessional", if it is a matter of personal habit and does not hinder our collaboration, I should learn to accept and understand it;
where something seems "overly idiosyncratic", I should try to talk it through: if we can agree, we follow the approach both sides accept, and if I can find an authoritative rule book or template that everyone can follow, I can actively recommend it to the whole team;
see more of other people's strengths and learn from them to improve myself; do not be stubborn or self-righteous, and never demand that others follow my own habits and ideas. The point of collaboration is to bring everyone together to do something well, not necessarily to get it done the way I want.
