mgr/zabbix: Send max, min and avg PGs of OSDs to Zabbix
We already send the max, min and avg fill ratio of OSDs, but
knowing the highest number of PGs on any OSD is also useful.

This allows admins to create a trigger that fires when an OSD
has too many PGs.

This could happen if a lot of OSDs fail and PGs start to move,
loading one or more OSDs with many PGs.

Because PGs consume CPU and memory, admins usually want to watch
out for these situations.

Signed-off-by: Wido den Hollander <[email protected]>
wido committed Mar 26, 2018
1 parent 4198558 commit 582935f
Showing 2 changed files with 134 additions and 0 deletions.
5 changes: 5 additions & 0 deletions src/pybind/mgr/zabbix/module.py
@@ -172,6 +172,7 @@ def get_data(self):
data['num_osd_in'] = num_in

osd_fill = list()
osd_pgs = list()
osd_apply_latency_ns = list()
osd_commit_latency_ns = list()

@@ -180,13 +181,17 @@ def get_data(self):
    if osd['kb'] == 0:
        continue
    osd_fill.append((float(osd['kb_used']) / float(osd['kb'])) * 100)
    osd_pgs.append(osd['num_pgs'])
    osd_apply_latency_ns.append(osd['perf_stat']['apply_latency_ns'])
    osd_commit_latency_ns.append(osd['perf_stat']['commit_latency_ns'])

try:
    data['osd_max_fill'] = max(osd_fill)
    data['osd_min_fill'] = min(osd_fill)
    data['osd_avg_fill'] = avg(osd_fill)
    data['osd_max_pgs'] = max(osd_pgs)
    data['osd_min_pgs'] = min(osd_pgs)
    data['osd_avg_pgs'] = avg(osd_pgs)
except ValueError:
    pass
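
The aggregation in this hunk can be tried outside the manager module. The following is a minimal sketch, not the module's code: `summarize_pgs` and the sample records are hypothetical, and `avg` here is a stand-in written for this example (the module's own `avg` helper is not shown in the diff and may behave differently). Only the `kb` and `num_pgs` fields and the three `data` keys come from the change itself.

```python
def avg(values):
    """Stand-in mean helper (assumption): raises ValueError when empty, like max() and min()."""
    if not values:
        raise ValueError("avg() arg is an empty sequence")
    return sum(values) / float(len(values))


def summarize_pgs(osd_stats):
    """Return max/min/avg PG counts over OSDs that report a non-zero size."""
    # OSDs with kb == 0 are skipped, mirroring the `continue` in the diff.
    osd_pgs = [osd['num_pgs'] for osd in osd_stats if osd['kb'] != 0]
    data = {}
    try:
        data['osd_max_pgs'] = max(osd_pgs)
        data['osd_min_pgs'] = min(osd_pgs)
        data['osd_avg_pgs'] = avg(osd_pgs)
    except ValueError:
        # No usable OSDs: leave the keys out instead of sending made-up values.
        pass
    return data


# Hypothetical sample input; real records carry many more fields.
sample = [
    {'kb': 1000, 'num_pgs': 120},
    {'kb': 1000, 'num_pgs': 80},
    {'kb': 0, 'num_pgs': 0},
]
summary = summarize_pgs(sample)
```

With an empty list, `max()` raises `ValueError` before anything is stored, so none of the three keys is set, which matches the `except ValueError: pass` path in the diff.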

129 changes: 129 additions & 0 deletions src/pybind/mgr/zabbix/zabbix_template.xml
@@ -367,6 +367,135 @@
<valuemap/>
<logtimefmt/>
</item>
<item>
<name>Ceph OSD max PGs</name>
<type>2</type>
<snmp_community/>
<multiplier>0</multiplier>
<snmp_oid/>
<key>ceph.osd_max_pgs</key>
<delay>0</delay>
<history>90</history>
<trends>365</trends>
<status>0</status>
<value_type>0</value_type>
<allowed_hosts/>
<units/>
<delta>0</delta>
<snmpv3_contextname/>
<snmpv3_securityname/>
<snmpv3_securitylevel>0</snmpv3_securitylevel>
<snmpv3_authprotocol>0</snmpv3_authprotocol>
<snmpv3_authpassphrase/>
<snmpv3_privprotocol>0</snmpv3_privprotocol>
<snmpv3_privpassphrase/>
<formula>1</formula>
<delay_flex/>
<params/>
<ipmi_sensor/>
<data_type>0</data_type>
<authtype>0</authtype>
<username/>
<password/>
<publickey/>
<privatekey/>
<port/>
<description>Maximum amount of PGs on OSDs</description>
<inventory_link>0</inventory_link>
<applications>
<application>
<name>Ceph</name>
</application>
</applications>
<valuemap/>
<logtimefmt/>
</item>
<item>
<name>Ceph OSD min PGs</name>
<type>2</type>
<snmp_community/>
<multiplier>0</multiplier>
<snmp_oid/>
<key>ceph.osd_min_pgs</key>
<delay>0</delay>
<history>90</history>
<trends>365</trends>
<status>0</status>
<value_type>0</value_type>
<allowed_hosts/>
<units/>
<delta>0</delta>
<snmpv3_contextname/>
<snmpv3_securityname/>
<snmpv3_securitylevel>0</snmpv3_securitylevel>
<snmpv3_authprotocol>0</snmpv3_authprotocol>
<snmpv3_authpassphrase/>
<snmpv3_privprotocol>0</snmpv3_privprotocol>
<snmpv3_privpassphrase/>
<formula>1</formula>
<delay_flex/>
<params/>
<ipmi_sensor/>
<data_type>0</data_type>
<authtype>0</authtype>
<username/>
<password/>
<publickey/>
<privatekey/>
<port/>
<description>Minimum amount of PGs on OSDs</description>
<inventory_link>0</inventory_link>
<applications>
<application>
<name>Ceph</name>
</application>
</applications>
<valuemap/>
<logtimefmt/>
</item>
<item>
<name>Ceph OSD avg PGs</name>
<type>2</type>
<snmp_community/>
<multiplier>0</multiplier>
<snmp_oid/>
<key>ceph.osd_avg_pgs</key>
<delay>0</delay>
<history>90</history>
<trends>365</trends>
<status>0</status>
<value_type>0</value_type>
<allowed_hosts/>
<units/>
<delta>0</delta>
<snmpv3_contextname/>
<snmpv3_securityname/>
<snmpv3_securitylevel>0</snmpv3_securitylevel>
<snmpv3_authprotocol>0</snmpv3_authprotocol>
<snmpv3_authpassphrase/>
<snmpv3_privprotocol>0</snmpv3_privprotocol>
<snmpv3_privpassphrase/>
<formula>1</formula>
<delay_flex/>
<params/>
<ipmi_sensor/>
<data_type>0</data_type>
<authtype>0</authtype>
<username/>
<password/>
<publickey/>
<privatekey/>
<port/>
<description>Average amount of PGs on OSDs</description>
<inventory_link>0</inventory_link>
<applications>
<application>
<name>Ceph</name>
</application>
</applications>
<valuemap/>
<logtimefmt/>
</item>
<item>
<name>Ceph backfill full ratio</name>
<type>2</type>
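
Each new `data` key in `module.py` needs a matching item in the template, otherwise Zabbix has nowhere to store the value. The sketch below is one way to check that pairing. It assumes the item key is the `data` key with a `ceph.` prefix, which is inferred from the names in this diff (`osd_max_pgs` and `ceph.osd_max_pgs`) rather than from the sending code. `TEMPLATE_FRAGMENT`, `template_keys` and `missing_items` are hypothetical names, and the fragment is a trimmed stand-in for the real template.

```python
import xml.etree.ElementTree as ET

# Trimmed, hypothetical fragment holding only the three items added here.
TEMPLATE_FRAGMENT = """
<items>
  <item><name>Ceph OSD max PGs</name><type>2</type><key>ceph.osd_max_pgs</key></item>
  <item><name>Ceph OSD min PGs</name><type>2</type><key>ceph.osd_min_pgs</key></item>
  <item><name>Ceph OSD avg PGs</name><type>2</type><key>ceph.osd_avg_pgs</key></item>
</items>
"""


def template_keys(xml_text):
    """Collect the <key> text of every <item> element in the XML."""
    root = ET.fromstring(xml_text)
    return {item.findtext('key') for item in root.iter('item')}


def missing_items(data_keys, xml_text, prefix='ceph.'):
    """Return the data keys that have no item with key prefix + name (prefix is an assumption)."""
    keys = template_keys(xml_text)
    return sorted(k for k in data_keys if prefix + k not in keys)


# 'osd_max_fill' is absent from the trimmed fragment, so it is reported as missing.
missing = missing_items(
    ['osd_max_pgs', 'osd_min_pgs', 'osd_avg_pgs', 'osd_max_fill'],
    TEMPLATE_FRAGMENT,
)
```

Run against the full `zabbix_template.xml`, a check like this would flag any key added to `get_data` without a corresponding template item.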
