Using BBCP to Speed Up Data Transfers Across the Internet


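Background:
Our project currently has server nodes on the US East and West coasts and in Europe, and transfer speeds across the Internet are very unstable. Until now we have mainly moved data with SCP and Rsync.
After stumbling across BBCP and testing it, we found the results very good: throughput improved noticeably and stayed fairly stable, and it also offers options for logging and for retrying after a failure.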

References:
http://www.slac.stanford.edu/~abh/bbcp/
http://pcbunn.cithep.caltech.edu/bbcp/using_bbcp.htm
https://www.olcf.ornl.gov/kb_articles/transferring-data-with-bbcp/

Installation and configuration:
[heydevops@east-server1 ~]$ sudo wget http://www.slac.stanford.edu/~abh/bbcp/bin/amd64_rhel60/bbcp -O /usr/bin/bbcp
[heydevops@east-server1 ~]$ sudo chmod +x /usr/bin/bbcp

[heydevops@west-server1 ~]$ sudo wget http://www.slac.stanford.edu/~abh/bbcp/bin/amd64_rhel60/bbcp -O /usr/bin/bbcp
[heydevops@west-server1 ~]$ sudo chmod +x /usr/bin/bbcp

[heydevops@east-server1 ~]$ which bbcp
/usr/bin/bbcp
[heydevops@east-server1 ~]$ ssh west-server1 which bbcp
/usr/bin/bbcp

[heydevops@east-server1 ~]$ cd heydevops
[heydevops@east-server1 heydevops]$ sudo dd if=/dev/zero of=/home/heydevops/heydevops/file.2g bs=1024M count=2
2+0 records in
2+0 records out
2147483648 bytes (2.1 GB) copied, 45.9129 s, 46.8 MB/s

[heydevops@east-server1 heydevops]$ ls -lh
total 2.0G
-rw-r--r-- 1 root root 2.0G Mar 4 06:40 file.2g

[heydevops@east-server1 heydevops]$ time bbcp -r -P 2 -V -w 8m -s 16 file.2g west-server1:/home/heydevops/heydevops/
bbcp: Window size reduced to 245760 bytes.
bbcp: Indexing files to be copied...
bbcp: Copying 0 files in 0 directories.
Source east-server1.heylinux.com using initial send window of 18700
Target west-server1.heylinux.com using initial recv window of 87380
bbcp: Creating /home/heydevops/heydevops/file.2g
bbcp: 140304 06:46:12  0% done; 8.5 MB/s, avg 8.5 MB/s
bbcp: 140304 06:46:14  1% done; 6.9 MB/s, avg 7.5 MB/s
...
bbcp: 140304 06:51:46  99% done; 7.7 MB/s, avg 6.1 MB/s
bbcp: 140304 06:51:48  99% done; 3.3 MB/s, avg 6.1 MB/s
Source cpu=3.643 (sys=3.552 usr=0.091).
File /home/heydevops/heydevops/file.2g created; 2147483648 bytes at 6.0 MB/s
48 buffers used with 0 reorders; peaking at 0.
Source east-server1.heylinux.com using a final send window of 433840
Target cpu=15.149 (sys=14.505 usr=0.644).
Target west-server1.heylinux.com using a final recv window of 2298624
1 file copied at effectively 6.0 MB/s

real    5m42.236s
user    0m0.104s
sys     0m3.567s

[heydevops@east-server1 heydevops]$ time scp file.2g west-server1:/home/heydevops/heydevops/
file.2g   100%   2048MB   2.1MB/s   16:06    

real    16m8.448s
user    0m43.497s
sys     0m7.548s

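Conclusion:
In the test above, transferring a 2 GB file took about 5 minutes 42 seconds with BBCP versus about 16 minutes 8 seconds with plain SCP, which cuts the transfer time by roughly 65% (about 2.8 times the throughput).
A more detailed test report is available here: http://heylinux.com/en/?p=258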
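The comparison in the conclusion follows directly from the two "real" timings reported by time. A minimal sketch of that arithmetic, assuming awk is available; the function names to_seconds and compare are made up for this illustration:

```shell
#!/bin/sh
# Convert a `time`-style value such as 5m42.236s into seconds.
to_seconds() {
    echo "$1" | awk -F'[ms]' '{ printf "%.3f", $1 * 60 + $2 }'
}

# Print the time saved (percent) and the throughput ratio of a fast
# transfer compared with a slow one; both arguments are in seconds.
compare() {
    awk -v fast="$1" -v slow="$2" 'BEGIN {
        printf "time saved: %.1f%%, speedup: %.2fx\n",
               (1 - fast / slow) * 100, slow / fast
    }'
}

bbcp_s=$(to_seconds 5m42.236s)   # "real" time of the bbcp run above
scp_s=$(to_seconds 16m8.448s)    # "real" time of the scp run above
compare "$bbcp_s" "$scp_s"       # prints: time saved: 64.7%, speedup: 2.83x
```

Both runs moved the same 2147483648-byte file, so the ratio of the elapsed times is also the ratio of the effective throughputs.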

Recent update:
We have now been running BBCP in production for a month with good results. Here are the parameters we use:
[dong.guo@heydevops ~]$ dd if=/dev/zero of=/home/dong.guo/file.16m bs=1M count=16
16+0 records in
16+0 records out
16777216 bytes (17 MB) copied, 0.0457727 s, 367 MB/s

[dong.guo@heydevops ~]$ pwd
/home/dong.guo

[dong.guo@heydevops ~]$ ls -lh file.16m
-rw-r--r-- 1 dong.guo adm 16M May  5 12:08 file.16m

[dong.guo@heydevops ~]$ bbcp -k -a /tmp/bbcp_checkpoint -r -P 2 -V -f -w 9m -s 16 -T "ssh -x -a -p 2222 -oFallBackToRsh=no -i /home/dong.guo/.ssh/id_rsa -l heydevops heylinux.com /usr/bin/bbcp" file.16m heydevops@heylinux.com:/tmp/
Warning: the RSA host key for '[heylinux.com]:2222' differs from the key for the IP address '[54.238.131.140]:2222'
Offending key for IP in /home/dong.guo/.ssh/known_hosts:517
Matching host key in /home/dong.guo/.ssh/known_hosts:528
bbcp: Sink I/O buffers (147456K) > 25% of available free memory (40988K); copy may be slow
bbcp: Window size reduced to 245760 bytes.
bbcp: Indexing files to be copied...
bbcp: Copying 0 files in 0 directories.
Source heydevops using initial send window of 19800
Target ec2-tokyo.localdomain using initial recv window of 87380
bbcp: Appending to /tmp/file.16m at offset 0
bbcp: 140505 12:11:30  28% done; 4.0 MB/s, avg 4.0 MB/s
bbcp: 140505 12:11:32  30% done; 148.0 KB/s, avg 1.6 MB/s
Source cpu=0.239 (sys=0.233 usr=0.006).
File /tmp/file.16m created; 16777216 bytes at 3.3 MB/s
288 buffers used with 33 reorders; peaking at 21.
Target cpu=0.303 (sys=0.291 usr=0.012).
Target ec2-tokyo.localdomain using a final recv window of 502864
Source heydevops using a final send window of 71280
1 file copied at effectively 1.5 MB/s
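Because the command above passes -k and -a, a copy that fails partway can simply be run again. The loop below is a minimal sketch of such a retry wrapper. It is not part of bbcp; the function name retry, the attempt count, and the delay are assumptions chosen for illustration, and it assumes bbcp exits with a nonzero status when a copy fails:

```shell
#!/bin/sh
# retry MAX DELAY COMMAND [ARGS...]
# Run COMMAND until it succeeds or MAX attempts have been made,
# sleeping DELAY seconds between attempts.
retry() {
    max=$1
    delay=$2
    shift 2
    attempt=1
    while ! "$@"; do
        if [ "$attempt" -ge "$max" ]; then
            echo "giving up after $attempt attempts: $*" >&2
            return 1
        fi
        echo "attempt $attempt failed; retrying in ${delay}s" >&2
        sleep "$delay"
        attempt=$((attempt + 1))
    done
    return 0
}
```

Prefixing the bbcp command above with "retry 5 30" would then try the transfer up to five times, 30 seconds apart.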

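Parameter details:
-k Keep files whose transfer did not complete, and allow them to be overwritten on retry
-a Keep checkpoint information (here in /tmp/bbcp_checkpoint) so that an interrupted copy can be resumed and its integrity checked
-r Recursively transfer all files under the given path
-P 2 Report transfer progress every 2 seconds
-V Print debugging information
-f Force the copy, clearing data left on the remote host by a failed transfer
-w Set the window size, which bbcp also uses to size its disk I/O buffers
   Calculated as window = bandwidth / 8 * RTT = 1000 Mbit/s / 8 * 74 ms = 125 MB/s * 0.074 s = 9.25 MB
   See: http://www.slac.stanford.edu/~abh/bbcp/#_Toc332986061
-s 16 Use 16 parallel streams
      See the official recommendation: http://www.slac.stanford.edu/~abh/bbcp/#_Streams_(-s)
-T "ssh -x -a -p 2222 -oFallBackToRsh=no -i /home/dong.guo/.ssh/id_rsa -l heydevops heylinux.com /usr/bin/bbcp"
   Specifies how to authenticate to the remote host and start bbcp there:
   -p 2222 sets the SSH port;
   -oFallBackToRsh=no reduces SSH response time;
   -i /home/dong.guo/.ssh/id_rsa sets the SSH key;
   -l heydevops sets the login user;
   heylinux.com is the remote host address;
   /usr/bin/bbcp is the path to bbcp on the remote host.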
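
The -w value in the list above comes from the bandwidth-delay product of the link. A minimal sketch of that calculation, assuming awk is available; the function name bdp_mb is made up for this illustration, and the 1000 Mbit/s and 74 ms inputs are the figures from the -w formula:

```shell
#!/bin/sh
# bdp_mb BANDWIDTH_MBIT RTT_MS
# Bandwidth-delay product in megabytes (10^6 bytes):
# Mbit/s divided by 8 gives MB/s, multiplied by the RTT in seconds.
bdp_mb() {
    awk -v bw="$1" -v rtt="$2" 'BEGIN { printf "%.2f\n", bw / 8 * rtt / 1000 }'
}

bdp_mb 1000 74   # prints 9.25; the command in this post uses -w 9m
```

Note that both transfer logs in this post report "Window size reduced to 245760 bytes", which suggests the operating system capped the requested window. On Linux that cap is likely governed by kernel socket buffer limits such as net.core.rmem_max and net.core.wmem_max, so the computed value may only take full effect after those are raised.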


  1. #1 by Chen Qi on 2014/03/25 - 09:50

    How does this software achieve its speedup? How does it improve network bandwidth utilization?

    • #2 by mcsrainbow on 2014/04/01 - 11:47

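      Partly through concurrency, and partly by taking advantage of some of Linux's own networking properties. I don't understand it very well myself; I think you need to know the lower layers to really get it.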

  2. #3 by 云飞 on 2014/04/19 - 11:08

    Is this software based on SSH authentication?
