Performance Monitoring Tweaks for Zenoss
From SysAdminWiki
File System Growth Graph
- 1. Go to the device class: /Devices/Server
- 2. Go to templates and edit the Filesystem template
- 3. Add a new graph called Utilization bytes.
- 4. Add the usedBlocks datapoint to the graph, then select it to edit its settings.
- 5. Enter the following in the RPN field to convert blocks to bytes:
${here/blockSize},*
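The RPN expression multiplies the datapoint value by the file system's block size, turning a block count into bytes. A minimal Python sketch of that arithmetic follows. The helper `eval_rpn`, the block size of 4096 and the sample value of 250000 blocks are assumptions made for illustration; in Zenoss the real block size is substituted through `${here/blockSize}`.

```python
import operator


def eval_rpn(value, rpn):
    """Evaluate a tiny subset of RPN (numbers and + - * /) against a datapoint value.

    Hypothetical helper: it only mimics how the graph applies the expression
    to the raw usedBlocks value, which starts out on the stack.
    """
    ops = {
        "+": operator.add,
        "-": operator.sub,
        "*": operator.mul,
        "/": operator.truediv,
    }
    stack = [float(value)]
    for token in rpn.split(","):
        if token in ops:
            b = stack.pop()
            a = stack.pop()
            stack.append(ops[token](a, b))
        else:
            stack.append(float(token))
    return stack.pop()


block_size = 4096                      # assumed block size in bytes
rpn = "%d,*" % block_size              # what "${here/blockSize},*" expands to
used_bytes = eval_rpn(250000, rpn)     # 250000 used blocks (made-up sample)
print(used_bytes)
```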
Show Correct Windows Event Count
Windows events often repeat, and Windows reports in the message text how many times the event has occurred. Because that count changes with every occurrence, Zenoss sees each occurrence as a separate event. The following transform strips the count from the event summary so that Zenoss treats repeats as duplicates of the same event:
import re
# Remove the repeat counter so that repeated events share the same summary
evt.summary = re.sub(r'It has done this [0-9]+ time\(s\)\.', '', evt.message)
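To see what the substitution does outside of Zenoss, here is a small standalone sketch. The message text is made up for illustration, and the trailing `strip()` is an addition that is not part of the transform above.

```python
import re

# Hypothetical Windows event message, invented for this example.
message = "The service terminated unexpectedly. It has done this 3 time(s)."

pattern = r"It has done this [0-9]+ time\(s\)\."

# Strip the repeat counter; strip() removes the space left behind.
summary = re.sub(pattern, "", message).strip()

# The same event with a different counter yields the same summary,
# which is what lets the two occurrences be treated as duplicates.
later = message.replace("3 time", "17 time")
later_summary = re.sub(pattern, "", later).strip()

print(summary)
```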
Show the percentage used when one of your file systems crosses its threshold
- 1. Click Classes -> Events -> Perf -> Filesystem
- 2. Click on the breadcrumb and choose More > Transform
- 3. Add the following:
fs_id = device.prepId(evt.component)
for f in device.os.filesystems():
    # Skip every file system except the one that raised the event
    if f.id != fs_id: continue
    p = (float(f.usedBytes()) / f.totalBytes()) * 100
    evt.summary = "Filesystem threshold exceeded: current value %3.1f%%" % (p)
    break
- 4. Click Save.
Note:
You can also browse directly to the transform at this URL:
http://YOUR-ZENOSS:8080/zport/dmd/Events/Perf/Filesystem/editEventClassTransform
Once a threshold is exceeded, the event summary looks like this:
Filesystem threshold exceeded: current value 92.0%
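The transform logic can be tried outside Zenoss with a stand-in object. This is only a sketch: `FakeFilesystem`, `threshold_summary` and the sample sizes are hypothetical, and only the `id`, `usedBytes()` and `totalBytes()` names are taken from the transform above.

```python
class FakeFilesystem(object):
    """Hypothetical stand-in for a Zenoss file system component."""

    def __init__(self, fs_id, used, total):
        self.id = fs_id
        self._used = used
        self._total = total

    def usedBytes(self):
        return self._used

    def totalBytes(self):
        return self._total


def threshold_summary(filesystems, fs_id):
    """Return the rewritten summary for the matching file system, or None."""
    for f in filesystems:
        if f.id != fs_id:
            continue
        p = (float(f.usedBytes()) / f.totalBytes()) * 100
        return "Filesystem threshold exceeded: current value %3.1f%%" % p
    return None


filesystems = [FakeFilesystem("boot", 50, 200), FakeFilesystem("var", 920, 1000)]
summary = threshold_summary(filesystems, "var")
print(summary)
```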
Another Variation
I found this variation but have not tested it yet:
import re
fs_id = device.prepId(evt.component)
for f in device.os.filesystems():
    if f.id != fs_id: continue
    # Extract the percent and free from the summary
    m = re.search(r"threshold of [^:]+: current value ([\d\.]+)", evt.summary)
    if not m: continue
    usedBlocks = float(m.groups()[0])
    p = (usedBlocks / f.totalBlocks) * 100
    freeAmtGB = ((f.totalBlocks - usedBlocks) * f.blockSize) / 1073741824
    # Make a nicer summary
    evt.summary = "Disk space low: %3.1f%% used (%3.2f GB free)" % (p,freeAmtGB)
    # This is where we change to a per device threshold
    perDeviceThreshold = 95.0
    m = re.search(r"zz(\d{3})", f.id)
    perDeviceThreshold = m and float(m.groups()[0]) or 95.0
    if p >= perDeviceThreshold: evt.severity = 5
    break
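The per-device threshold is read from the file system id: three digits following "zz" override the default of 95.0. Below is a standalone sketch of that parsing and of the free-space arithmetic. The function names and sample ids are hypothetical. One difference from the transform: the `m and ... or 95.0` idiom falls back to 95.0 when the parsed value is 0.0, whereas this sketch returns the parsed value.

```python
import re


def per_device_threshold(fs_id, default=95.0):
    """Return the threshold encoded as zzNNN in the id, else the default."""
    m = re.search(r"zz(\d{3})", fs_id)
    return float(m.group(1)) if m else default


def free_gb(total_blocks, used_blocks, block_size):
    """Free space in GB (1 GB = 1073741824 bytes)."""
    return ((total_blocks - used_blocks) * block_size) / 1073741824.0


# Made-up ids: the first carries an 80 percent override, the second none.
print(per_device_threshold("data_zz080"))
print(per_device_threshold("var"))
print(free_gb(1000000, 737856, 4096))
```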
