Posts tagged check_mk

Fixing a small bug in the check_mk source code

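In production, we use icinga together with check_mk as our monitoring system.
Today, when I ran cmk -II to refresh a host's inventory, it reported the following error no matter which host I specified: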

Removing unimplemented check /
Removing unimplemented check oom_adj_for_cron
Removing unimplemented check oom_adj_for_sshd
Traceback (most recent call last):
    File "/usr/share/check_mk/modules/check_mk.py", line 5801, in <module>
        remove_autochecks_of(host, checknames)
    File "/usr/share/check_mk/modules/check_mk.py", line 2907, in remove_autochecks_of
    if splitted[3] not in check_info:
IndexError: list index out of range

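A long search online turned up nothing helpful, so I tried debugging the source at the location named in the traceback:
I edited /usr/share/check_mk/modules/check_mk.py and added 'print splitted' to print the list that triggers the out-of-range error, namely splitted.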

       for fn in glob.glob(autochecksdir + "/*.mk"):
           lines = []
           count = 0
           for line in file(fn):
               # hostname and check type can be quoted with ' or with "
               double_quoted = line.replace("'", '"').lstrip()
               if double_quoted.startswith('("'):
                   count += 1
                   splitted = double_quoted.split('"')
                   # debug: show the list right before it is indexed
                   print splitted
                   if splitted[1] != hostname or (checktypes != None and splitted[3] not in checktypes):
                       if splitted[3] not in check_info:
                           sys.stderr.write('Removing unimplemented check %s\n' % splitted[3])
                           continue
                       lines.append(line)
                   else:
                       removed += 1
           if len(lines) == 0:

Then I ran cmk -II again and got the following output:

...
("iad1-server5", job, 'oom_adj_for_sshd', None)
Removing unimplemented check oom_adj_for_sshd

("iad1-server5", kernel.util, None, kernel_util_default_levels)
Traceback (most recent call last):
    File "/usr/share/check_mk/modules/check_mk.py", line 5803, in <module>
       remove_autochecks_of(host, checknames)
    File "/usr/share/check_mk/modules/check_mk.py", line 2909, in remove_autochecks_of
    if splitted[3] not in check_info:

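As the output shows, the entry
("iad1-server5", kernel.util, None, kernel_util_default_levels)
cannot be split on single or double quotes into a list with more than 3 elements, because only the hostname is quoted. splitted[3] therefore does not exist, and the code fails with 'IndexError: list index out of range'.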
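
To make the failure concrete, here is a minimal sketch that repeats the same quote normalization and split on the two entries from the debug output above. The helper names split_autocheck_line and quoted_check_type are hypothetical and not part of check_mk. The snippet is written for a current Python interpreter rather than the Python 2 that check_mk.py runs under. The length guard at the end is only one possible way to avoid the exception, not necessarily the fix applied to check_mk itself.

```python
def split_autocheck_line(line):
    """Split an autochecks entry the way remove_autochecks_of does."""
    # Normalize single quotes to double quotes, then split on double quotes.
    double_quoted = line.replace("'", '"').lstrip()
    return double_quoted.split('"')


# Both entries are copied from the debug output in this post.
quoted_entry = '("iad1-server5", job, \'oom_adj_for_sshd\', None)\n'
unquoted_entry = '("iad1-server5", kernel.util, None, kernel_util_default_levels)\n'

quoted_parts = split_autocheck_line(quoted_entry)
unquoted_parts = split_autocheck_line(unquoted_entry)

print(len(quoted_parts), quoted_parts[3])  # 5 oom_adj_for_sshd
print(len(unquoted_parts))                 # 3, so index 3 is out of range


def quoted_check_type(parts):
    """Hypothetical guard: return the quoted check type, or None if absent."""
    return parts[3] if len(parts) > 3 else None


print(quoted_check_type(unquoted_parts))   # None
```

The second entry yields only three pieces because kernel.util is not quoted, which is exactly the case where splitted[3] raises IndexError.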