[ale] SNMP oid for file descriptors

Discussion:

Todor Fassl via Ale

2018-08-06 17:49:51 UTC

Ultimately, what I want to do is to configure nagios to alert me when a
server is getting low on file handles. There are a couple of scripts on
the nagios web site but they look kind of hokey.

I think all I should have to do is cut and paste the right SNMP object
identifier into the nagios snmp plugin. But how do I find that OID?

_______________________________________________
Ale mailing list
***@ale.org
https://mail.ale.org/mailman/listinfo/ale
See JOBS, ANNOUNCE and SCHOOLS lists at
http://mail.ale.org/mailman/listinfo

Jerald Sheets via Ale

2018-08-06 18:21:59 UTC

Why not pull the file handle numbers with "sysctl fs.file-nr", compare them to the current open file handles from "lsof | wc -l", and do a multi-graph for the metrics? Then subtract the two and alert when you get below whatever threshold you want.

I think you should be able to knock that out in BASH or Perl.
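
A minimal sketch of that subtraction, assuming a Linux host where /proc/sys/fs/file-nr holds the same three fields that "sysctl fs.file-nr" prints (allocated handles, allocated-but-unused handles, system-wide maximum). It uses the first field as the in-use count rather than lsof output, which is an assumption about what you want to measure. The function names and thresholds are made up for illustration.

```shell
#!/bin/bash
# Sketch only: fd_free and fd_status are hypothetical helper names.

# Print how many file handles are still available system-wide.
# Reads the three fields of file-nr: allocated, unused, maximum.
fd_free() {
    local file="${1:-/proc/sys/fs/file-nr}"
    local allocated unused maximum
    read -r allocated unused maximum < "$file" || return 3
    echo $(( maximum - allocated ))
}

# Map a free count to Nagios plugin exit codes: 0 OK, 1 WARNING, 2 CRITICAL.
fd_status() {
    local free="$1" warn="$2" crit="$3"
    if (( free <= crit )); then
        echo "CRITICAL - $free file handles free"
        return 2
    elif (( free <= warn )); then
        echo "WARNING - $free file handles free"
        return 1
    fi
    echo "OK - $free file handles free"
    return 0
}

# Example call (thresholds are arbitrary):
#   fd_status "$(fd_free)" 2000 500
```

The exit codes follow the usual Nagios plugin convention, so the same two functions could back a local check or be wrapped by whatever transport ends up carrying the numbers.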

-j

Todor Fassl via Ale

2018-08-07 14:00:00 UTC

Well, I am not too worried about being able to calculate it. What I
cannot figure out is how to get to the numbers remotely.

--
Todd

Alex Carver via Ale

2018-08-07 15:55:03 UTC

You create your own OID table entry. Use the extend or pass-through
capability of snmpd with a shell script to generate the output as an SNMP
table or tree, then query it starting from the base tree:

NET-SNMP-EXTEND-MIB::nsExtensions (.1.3.6.1.4.1.8072.1.3)
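
For the file-handle question, the extend route might look roughly like the sketch below. The script path /usr/local/bin/snmp_filefd.sh and the extend name "filefd" are invented for this example, and the snmpd.conf line appears only as a comment. It assumes Linux, where /proc/sys/fs/file-nr lists allocated handles, unused handles, and the maximum.

```shell
#!/bin/bash
# Hypothetical /usr/local/bin/snmp_filefd.sh, wired into snmpd.conf with:
#   extend filefd /usr/local/bin/snmp_filefd.sh
# Prints the percentage of the system-wide file handle limit in use.

filefd_percent() {
    local file="${1:-/proc/sys/fs/file-nr}"
    local allocated unused maximum
    read -r allocated unused maximum < "$file" || return 1
    # awk does the division so a very large maximum cannot overflow
    # shell integer arithmetic; the result is truncated to an integer.
    awk -v a="$allocated" -v m="$maximum" 'BEGIN { printf "%d\n", (a * 100) / m }'
}

# Run against the live kernel counters when they are available.
if [ -r /proc/sys/fs/file-nr ]; then
    filefd_percent
fi
```

After restarting snmpd, the value should show up under NET-SNMP-EXTEND-MIB::nsExtendOutput1Line."filefd", and that OID is what would go into the -o option of the Nagios check_snmp plugin along with -w and -c thresholds.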

Matty via Ale

2018-08-07 17:52:36 UTC

Post by Todor Fassl via Ale
Well, I am not too worried about being able to calculate it. What I
cannot figure out is how to get to the numbers remotely.

An alternate approach (and given that you use nagios this may not be
suited for you) is to use prometheus along with the node_exporter. It
has two useful file descriptor metrics:

node_filefd_allocated
node_filefd_maximum

We graph both of these with Grafana and use the prometheus alert
manager to notify us of issues. Thought I would pass this on.
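
To see how the two metrics relate, here is a rough shell sketch that reads node_exporter's plain-text metrics output on standard input and prints the percentage of file descriptors in use. The function name is hypothetical, and port 9100 in the usage note is node_exporter's usual default, assumed here rather than taken from this thread.

```shell
#!/bin/bash
# Sketch only: filefd_ratio is a hypothetical helper, not part of node_exporter.
# Reads Prometheus text-format metrics on stdin and prints
# allocated / maximum as a percentage with two decimals.
filefd_ratio() {
    awk '
        $1 == "node_filefd_allocated" { allocated = $2 }
        $1 == "node_filefd_maximum"   { maximum = $2 }
        END {
            # Fail if the maximum was missing or zero.
            if (maximum > 0) printf "%.2f\n", allocated * 100 / maximum
            else exit 1
        }
    '
}
```

Something like "curl -s http://localhost:9100/metrics | filefd_ratio" would feed it live data. In a real Prometheus setup the same ratio would normally be written as an alert rule expression instead of being computed in shell.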

Thanks,
- Ryan
https://prefetch.net

Todor Fassl via Ale

2018-08-10 18:20:18 UTC

Ah, now I am starting to get it.

It occurs to me that this is a good approach in general for solving all
kinds of system monitoring problems. Like I once worked for a web
hosting service. We stored info about each account in a file within the
account. But how to get to that info remotely? We considered picking a
port number, inventing our own protocol, and writing a TCP/IP server.
Should have just added it to snmp and used snmp's own security protocols
to protect it. Before that I worked for a company that wrote drivers for
medical scanners. We *did* pick a port number and invent our own
protocol to get status info from the scanner.

The beauty of the thing, besides not re-inventing the wheel, is that
standard tools like nagios could be used to monitor the data.

I'll have to do some research.

--
Todd

Alex Carver via Ale

2018-08-10 22:21:35 UTC

I use this approach to get the temperatures of my Raspberry Pi
processors remotely.

My snmpd.conf file contains this line:
extend temp /usr/local/bin/snmp_temperature.sh

The shell script it points to is just this:

#!/bin/bash
# Insert a decimal point into the millidegree reading, e.g. 54230 -> 54.230
/bin/sed 's:\([0-9]\{2\}\)\([0-9]\{3\}\):\1.\2:' /sys/class/thermal/thermal_zone0/temp

If I walk the tree I get this:

$ snmpwalk -v 2c -c secret localhost .1.3.6.1.4.1.8072.1.3
iso.3.6.1.4.1.8072.1.3.2.1.0 = INTEGER: 1
iso.3.6.1.4.1.8072.1.3.2.2.1.2.4.116.101.109.112 = STRING: "/usr/local/bin/snmp_temperature.sh"
iso.3.6.1.4.1.8072.1.3.2.2.1.3.4.116.101.109.112 = ""
iso.3.6.1.4.1.8072.1.3.2.2.1.4.4.116.101.109.112 = ""
iso.3.6.1.4.1.8072.1.3.2.2.1.5.4.116.101.109.112 = INTEGER: 5
iso.3.6.1.4.1.8072.1.3.2.2.1.6.4.116.101.109.112 = INTEGER: 1
iso.3.6.1.4.1.8072.1.3.2.2.1.7.4.116.101.109.112 = INTEGER: 1
iso.3.6.1.4.1.8072.1.3.2.2.1.20.4.116.101.109.112 = INTEGER: 4
iso.3.6.1.4.1.8072.1.3.2.2.1.21.4.116.101.109.112 = INTEGER: 1
iso.3.6.1.4.1.8072.1.3.2.3.1.1.4.116.101.109.112 = STRING: "54.230"
iso.3.6.1.4.1.8072.1.3.2.3.1.2.4.116.101.109.112 = STRING: "54.230"
iso.3.6.1.4.1.8072.1.3.2.3.1.3.4.116.101.109.112 = INTEGER: 1
iso.3.6.1.4.1.8072.1.3.2.3.1.4.4.116.101.109.112 = INTEGER: 0
iso.3.6.1.4.1.8072.1.3.2.4.1.2.4.116.101.109.112.1 = STRING: "54.230"

The last four sub-identifiers, 116.101.109.112, are the ASCII codes for
"temp", the name given in the extend configuration. The '4' before them
is the length of that name: SNMP encodes a string table index as its
length followed by one sub-identifier per character. Everything before
the '4' is the documented part of the OID tree.
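
If it helps, the numeric suffix can be generated instead of worked out by hand. SNMP encodes a string index as its length followed by the character codes, so "temp" becomes 4.116.101.109.112. The helper names below are hypothetical; the prefix is the one-line output column seen in the walk above.

```shell
#!/bin/bash
# Sketch only: extend_index and extend_oid are hypothetical helper names.

# Encode an extend name as an SNMP string index:
# the length, then the ASCII code of each character, dot-separated.
extend_index() {
    local name="$1" out="${#1}" i
    for (( i = 0; i < ${#name}; i++ )); do
        # printf with a leading quote prints the character's numeric code.
        out+=".$(printf '%d' "'${name:i:1}")"
    done
    echo "$out"
}

# Numeric OID of the one-line output column (nsExtendOutput1Line) for a name.
extend_oid() {
    echo ".1.3.6.1.4.1.8072.1.3.2.3.1.1.$(extend_index "$1")"
}

extend_oid temp
# prints .1.3.6.1.4.1.8072.1.3.2.3.1.1.4.116.101.109.112
```

That numeric form is handy for tools that do not have the NET-SNMP-EXTEND-MIB file loaded, since they cannot resolve the quoted-name form of the OID.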

There's also a pass through method that requires a bit more effort on
the scripting side but provides a bit more flexibility.
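
As a rough idea of what the pass-through method involves: with a "pass" line in snmpd.conf, snmpd runs the script with -g and an OID for a GET (or -n for a GETNEXT) and expects three lines back: the OID, a type, and the value. The sketch below answers GET only, so it would not support snmpwalk as written. The base OID (under the Net-SNMP playpen subtree), the script path, and the function name are all assumptions chosen for illustration, and the data source is the Linux file-nr counters.

```shell
#!/bin/bash
# Hypothetical /usr/local/bin/snmp_pass_filefd.sh, wired into snmpd.conf with:
#   pass .1.3.6.1.4.1.8072.9999.9999 /usr/local/bin/snmp_pass_filefd.sh
BASE=".1.3.6.1.4.1.8072.9999.9999"

# Answer a GET: print OID, type, and value on three lines.
# Printing nothing tells snmpd there is no such object.
pass_get() {
    local oid="$1" file="${2:-/proc/sys/fs/file-nr}"
    local allocated unused maximum
    read -r allocated unused maximum < "$file" || return 0
    case "$oid" in
        # Allocated handles as an integer.
        "$BASE.1") printf '%s\ninteger\n%s\n' "$oid" "$allocated" ;;
        # The maximum can exceed 32 bits, so it is returned as a string.
        "$BASE.2") printf '%s\nstring\n%s\n' "$oid" "$maximum" ;;
    esac
}

# snmpd calls: script -g OID (GET) or script -n OID (GETNEXT, not handled here).
if [ "${1:-}" = "-g" ]; then
    pass_get "${2:-}"
fi
```

The extra effort compared with extend is that the script has to do its own OID matching, but in return it picks its own OIDs and types.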

Jerald Sheets via Ale

2018-08-11 17:58:32 UTC

There are SO many tools out there that do this for you. Open Source and commercial products, too.

Puppet/Chef will give you a system inventory from the hardware perspective (they both use things like dmidecode, and some Ruby "secret sauce" to make that happen). Instrumenting your OS that way is a cool thing to try. I saw one site implement Puppet to only do three things:

1. Get the machine inventory
2. Use mcollective to do orchestrated functions in "collectives" of machines
3. Enforce NTP/SSH config across their env.

There are tools. Don't try to reinvent the wheel.

-j
