PS-NetMon

PS-NetMon is an automation script that checks for, and alerts on, system and network outage and warning conditions. It currently supports ping, HTTP responses and WMI calls to monitor Windows systems for services, drive space, event logs, CPU, RAM and VRAM conditions. When a configured condition is met, PS-NetMon sends alerts via SMTP email and PushBullet.

What is PushBullet?

PushBullet is a great messaging service that does not rely on a mail server. There are PushBullet clients for Android, iPhone and Windows, which allow you to receive outage notifications even if your mail server is down. Simply go to www.PushBullet.com and create an account; they will provide you with an API key that will be used in your PS-NetMon configuration later. The use of PushBullet is totally optional but highly recommended. The free version of PushBullet gives you 100 messages per month, which is fine for a small or stable environment, and there is a rather inexpensive Pro version with unlimited messaging.

DOWNLOAD PS-NETMON

Being a systems administrator, I understand the necessity of proactive monitoring: seeing issues before the users do. I have used many different tools over the years, and I have watched the price of one of my long-time go-to tools rise from the $40 I paid for version 1 many years ago to the $35,000 I paid for the latest version. That kind of money is hard to justify to management even in large businesses, let alone smaller ones with more limited budgets.

This is not my first open source code, not my first big programming project, and not even my first PowerShell, but it is my first big open source PowerShell programming project, and I would love constructive input from anyone who has some.

SYSTEM REQUIREMENTS
* PowerShell 3.0+ (http://www.microsoft.com/en-us/download/details.aspx?id=34595)

* PS-Discover.PS1 requires the Active Directory module for PowerShell to be installed

http://blogs.msdn.com/b/rkramesh/archive/2012/01/17/how-to-add-active-directory-module-in-powershell-in-windows-7.aspx

https://p0w3rsh3ll.NoPress.com/2013/02/16/new-active-directory-cmdlets-available-on-windows-server-2012/

In short, from an administrative PowerShell prompt:

Add-WindowsFeature -Name RSAT-AD-PowerShell -Restart:$false

* Network access through firewalls between the monitoring PC and the monitored devices/servers. If Windows Firewall is in use, running ENABLE-SERVER.CMD from an administrative prompt on the machine to be monitored will set the appropriate firewall rules to allow access.

Getting started

The first step in monitoring a network infrastructure is auditing a network infrastructure. PS-Audit helps accomplish the task of auditing the network and creating the base configuration (CFG) files for use in PS-NetMon. CFG files can be created individually by running PS-Audit.PS1, or en masse from external sources by running PS-Discover.PS1. I suggest using PS-Audit a few times to familiarize yourself with it before doing a full discovery with PS-Discover.

PS-Audit.PS1 – When you run PS-Audit manually, you will be prompted for a server name. Once it is entered, PS-Audit will attempt to connect to that server and run a battery of checks, creating a CFG file with entries appropriate to the server audited. Upon completion, PS-Audit will prompt for the next server; an empty response exits the script.

Initial CFG files are populated with checks for the Windows services found (minus “common” services as defined in the CommonServices.cfg file), drives, system and application logs, CPU, RAM, VRAM and ping. CFG files created by PS-Audit are placed in the DISCOVER sub-directory for manual editing before being moved to PS-NetMon’s MONITOR directory for active monitoring.

CommonServices.cfg – This configuration file is used by PS-Audit to reduce the number of WMI checks on a given server and should be located in the same directory as PS-Audit. When PS-Audit detects a running service, it compares it to the list of services in CommonServices.cfg; if the service is listed there, it WILL NOT be added to the generated CFG file. This reduces the overhead of PS-NetMon on both the monitoring and monitored servers and cuts down on notifications for services not critical to server function. Review and edit the CommonServices.cfg file before using it.
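
The filtering idea described above can be sketched in a few lines. This is only an illustration, not PS-Audit's actual code: the function name Select-UncommonService is hypothetical, and it assumes CommonServices.cfg boils down to a plain list of service names (the file's real layout is not described here).

```powershell
# Hypothetical helper: keep only the services that are NOT in the common list.
function Select-UncommonService {
    param(
        [string[]]$RunningServices,   # service names detected on the server
        [string[]]$CommonServices     # names assumed to come from CommonServices.cfg
    )
    # -notcontains compares strings case-insensitively by default.
    $RunningServices | Where-Object { $CommonServices -notcontains $_ }
}

# Illustrative data only.
$running   = 'W3SVC', 'Spooler', 'MSSQL$SQLEXPRESS', 'Dhcp'
$common    = 'spooler', 'dhcp'
$toMonitor = @(Select-UncommonService -RunningServices $running -CommonServices $common)
```

With the sample data, only W3SVC and MSSQL$SQLEXPRESS would go on to become SERVICE entries.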

PS-Discover.PS1 – PS-Discover is used to find devices and servers en masse. It reads device/server names or addresses from an input source and serially launches an instance of PS-Audit for each item. PS-Discover is intended for the initial setup of an environment. Input sources for PS-Discover include 1) a text file with one device per line, 2) an Active Directory OU and 3) a Class C subnet.
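
To make the Class C subnet input source concrete, here is a minimal sketch of how the candidate addresses could be generated. It is an assumption about the general approach, not PS-Discover's own code; the function name Get-ClassCHostAddress and the sample prefix are hypothetical.

```powershell
# Hypothetical helper: list the usable host addresses of a Class C (/24) subnet.
function Get-ClassCHostAddress {
    param([string]$Prefix)   # first three octets, e.g. '192.168.1'

    # .0 is the network address and .255 the broadcast address, so skip both.
    1..254 | ForEach-Object { '{0}.{1}' -f $Prefix, $_ }
}

$addresses = @(Get-ClassCHostAddress -Prefix '192.168.1')
```

Each address in the resulting list could then be handed to PS-Audit in turn, much as the text-file input source supplies one device per line.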

PS-NetMon – PS-NetMon consists of two files: PS-NetMon.PS1 and PS-NetMonCore.PS1. PS-NetMonCore.PS1 is launched ONLY from PS-NetMon.PS1 and should not be run directly. When run, PS-NetMon.PS1 searches the MONITOR sub-directory and runs the checks in each CFG file found. The MONITOR sub-directory does not initially exist and must be created manually in the directory where the PS-NetMon scripts reside. After using PS-Audit or PS-Discover to create CFG files for your systems, examine and edit them for your specific needs before moving them from the DISCOVER directory to the MONITOR directory for active monitoring.

PS-NetMon.cfg – Configuration settings for PS-NetMon are stored here, and this file should exist in the same directory as PS-NetMon. The ‘smtp-server’, ‘smtp-from’ and ‘smtp-to’ settings are self-explanatory email settings, while logerrors and logstatus are true/false settings that determine the logging level during PS-NetMon system checks. If logerrors is set to true, errors encountered during polling are logged to the current log file in the LOG directory; if logstatus is set to true, successful actions are logged as well. Because of the large amount of activity, even in a small environment, I highly suggest setting logstatus to false unless you are troubleshooting. Optional: if you have chosen to take advantage of the PushBullet API for additional messaging, add your API key to the PushBulletAPI setting.

Sample PS-NetMon.cfg file:

==========================
smtp-server,mail.mydomain.com
smtp-from,no-reply@mydomain.com
smtp-to,admin@mydomain.com

PushBulletAPI,o.TcC1R2vJ5i311187DrF5pWrkQSK
logerrors,true
logstatus,false
==========================
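
For anyone curious how a settings file in this name,value shape can be read, here is a minimal sketch. It is not the parsing code PS-NetMon itself uses; the function name Read-NetMonSettings is hypothetical, and it assumes one setting per line with blank or comma-less lines ignored.

```powershell
# Hypothetical helper: read "name,value" lines into a hashtable.
function Read-NetMonSettings {
    param([string[]]$Lines)

    $settings = @{}
    foreach ($line in $Lines) {
        # Split on the first comma only, so a value may itself contain commas.
        $parts = @($line -split ',', 2)
        if ($parts.Count -lt 2) { continue }   # skip blank or malformed lines
        $settings[$parts[0].Trim()] = $parts[1].Trim()
    }
    return $settings
}

# Same layout as the sample above (PushBullet key left out on purpose).
$sample = @(
    'smtp-server,mail.mydomain.com'
    'smtp-from,no-reply@mydomain.com'
    'smtp-to,admin@mydomain.com'
    ''
    'logerrors,true'
    'logstatus,false'
)
$cfg       = Read-NetMonSettings -Lines $sample
$logErrors = $cfg['logerrors'] -eq 'true'
```

In real use the lines would come from something like Get-Content on PS-NetMon.cfg rather than an inline array.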

<DEVICE>.CFG Files – The CFG files are simple text files with one comma-delimited entry per line, defined as follows:

SERVER,Server Name

A single SERVER entry is allowed per CFG file. The server name is used for both connection and alert notification.
Ex: SERVER,Server1.mydomain.com

SERVICE,Service Name,Display Name

Check for running Windows services. Multiple SERVICE entries are allowed, one for each Windows service to be checked. The Service Name is the command-line name of the Windows service and must be identical to its listing inside Windows. The Display Name is a friendly, easily readable description used in alerting and can be changed to accurately reflect the service’s purpose on that server (instead of “MS-SQL” it might be more understandable as “Accounting Database”). Initial SERVICE entries created by PS-Audit will not include “common” services as defined in CommonServices.cfg. Eliminating common services is intended to reduce both the WMI overhead of polling and the number of alerts for services not critical for operation. Feel free to edit the CommonServices.cfg file to reflect your own preferences.
Ex: SERVICE,"WsusService","WSUS Service"

DRIVE,Drive Letter,Free Space Alert

Check for available free space on a system hard drive. A single DRIVE entry is created for each drive detected during the audit, with a default warning at 10% free space. Note that the drive letter must include the trailing colon, as in “C:”.
Ex: DRIVE,C:,10
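
The free-space comparison behind a DRIVE entry can be sketched as below. This is an illustration under stated assumptions, not PS-NetMon's own check: the function name Test-DriveSpaceLow is hypothetical, and it assumes the alert fires when the percentage of free space drops below the configured value.

```powershell
# Hypothetical helper: is the free space below the alert percentage?
function Test-DriveSpaceLow {
    param(
        [double]$FreeBytes,
        [double]$SizeBytes,
        [double]$AlertPercent   # e.g. 10 for "DRIVE,C:,10"
    )
    if ($SizeBytes -le 0) { return $false }   # avoid dividing by zero
    $percentFree = ($FreeBytes / $SizeBytes) * 100
    return ($percentFree -lt $AlertPercent)
}

# One possible source for the numbers (Win32_LogicalDisk exposes FreeSpace and Size):
#   $disk = Get-WmiObject Win32_LogicalDisk -ComputerName 'Server1' -Filter "DeviceID='C:'"
#   Test-DriveSpaceLow -FreeBytes $disk.FreeSpace -SizeBytes $disk.Size -AlertPercent 10

$low = Test-DriveSpaceLow -FreeBytes 5GB  -SizeBytes 100GB -AlertPercent 10   # 5% free
$ok  = Test-DriveSpaceLow -FreeBytes 50GB -SizeBytes 100GB -AlertPercent 10   # 50% free
```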

EVENT,Log,Alert Level

Check for Windows event log events. The entry gives the log name followed by the minimum alert level (Information, Warning, Error, Critical, Audit Success, Audit Failure). NOTE: A check at a lesser level also returns more critical events; for example, setting the alert level to “error” brings back both “error” and “critical” events.
Ex: EVENT,System,Critical
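
The "minimum alert level" behaviour can be pictured as a simple ranking. The sketch below is an assumption about how such a comparison might look, not the script's real logic: the function name Test-EventMeetsLevel is hypothetical, and only the four severity levels are ranked because where the two audit levels would sit is not stated here.

```powershell
# Hypothetical helper: does an event's level meet the configured minimum?
function Test-EventMeetsLevel {
    param(
        [string]$EventLevel,    # level of the event that was found
        [string]$MinimumLevel   # level configured in the EVENT entry
    )
    # Assumed ordering, least to most severe. Hashtable keys are case-insensitive.
    $rank = @{ Information = 1; Warning = 2; Error = 3; Critical = 4 }
    foreach ($name in @($EventLevel, $MinimumLevel)) {
        if (-not $rank.ContainsKey($name)) { throw "Unknown level: $name" }
    }
    return ($rank[$EventLevel] -ge $rank[$MinimumLevel])
}

Test-EventMeetsLevel -EventLevel 'Critical' -MinimumLevel 'error'   # True
Test-EventMeetsLevel -EventLevel 'Warning'  -MinimumLevel 'error'   # False
```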

PING,Count

Check for a response to a simple network ping. The entry gives the number of pings to send; if at least one ping gets a response, the check is successful.
Ex: PING,5

CPU,Alert Level

Check for Windows CPU usage. CPU usage higher than the alert level results in a notification.
Ex: CPU,95

RAM,Alert Level

Check for Windows RAM usage. RAM usage higher than the alert level results in a notification.
Ex: RAM,90

VRAM,Alert Level

Check for Windows virtual RAM usage. Virtual RAM usage higher than the alert level results in a notification.
Ex: VRAM,90

URL,URL,Response

Checks an HTTP, HTTPS or FTP URL for a response string. The response string checked for should only exist in a positive response.
Ex: URL,"http://www.mydomain.com","shopping cart"

Sample Device CFG file:

==========================
SERVER,WEBSERVER1
SERVICE,"BITS","Background Intelligent Transfer Service"
SERVICE,"ftpsvc","Microsoft FTP Service"
SERVICE,"IISADMIN","IIS Admin Service"
SERVICE,"MpsSvc","Windows Firewall"
SERVICE,"MSSQL$SQLEXPRESS","SQL Server (SQLEXPRESS)"
SERVICE,"Schedule","Task Scheduler"
SERVICE,"SQLBrowser","SQL Server Browser"
SERVICE,"SQLWriter","SQL Server VSS Writer"
SERVICE,"W3SVC","World Wide Web Publishing Service"
DRIVE,C:,10
DRIVE,E:,10
EVENT,system,critical
EVENT,application,critical
CPU,90
RAM,90
VRAM,90
PING,2
==========================
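
Because the entries are comma-delimited with optional quotes, a device CFG file can be read with PowerShell's built-in CSV handling. The sketch below is one possible approach, not necessarily how PS-NetMon reads these files; the function name Read-DeviceConfig and the column names Type, Arg1 and Arg2 are hypothetical.

```powershell
# Hypothetical helper: turn CFG lines into objects with Type, Arg1 and Arg2.
function Read-DeviceConfig {
    param([string[]]$Lines)

    $Lines |
        Where-Object { $_ -and $_.Trim() } |                  # drop blank lines
        ConvertFrom-Csv -Header 'Type', 'Arg1', 'Arg2' |      # handles quoted fields
        ForEach-Object {
            [pscustomobject]@{
                Type = $_.Type.Trim().ToUpper()
                Arg1 = $_.Arg1
                Arg2 = $_.Arg2    # stays $null for two-field entries such as CPU,90
            }
        }
}

# A shortened version of the sample file above.
$sample = @(
    'SERVER,WEBSERVER1'
    'SERVICE,"W3SVC","World Wide Web Publishing Service"'
    'DRIVE,C:,10'
    'EVENT,system,critical'
    'CPU,90'
    'PING,2'
)
$checks   = @(Read-DeviceConfig -Lines $sample)
$server   = ($checks | Where-Object { $_.Type -eq 'SERVER' }).Arg1
$services = @($checks | Where-Object { $_.Type -eq 'SERVICE' })
```

Grouping the resulting objects by Type makes it easy to confirm that a file contains exactly one SERVER entry before it is moved to the MONITOR directory.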

Once PS-NetMon has been downloaded and configured for your site, run it manually at least once to test monitoring and notification email. Once you are satisfied with the initial configuration, don’t forget to set up PS-NetMon.PS1 as a scheduled task. Personally, I run PS-NetMon every five minutes, which seems to serve me well.
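
One way to create a five-minute scheduled task is with the built-in schtasks.exe tool, run from an administrative prompt. Treat this as a sketch only: the task name and the C:\PS-NetMon install path are assumptions, and you may also want to specify the account the task runs under.

```powershell
# Assumed install path and task name; adjust both for your site.
schtasks.exe /Create /TN "PS-NetMon" /SC MINUTE /MO 5 `
    /TR "powershell.exe -NoProfile -ExecutionPolicy Bypass -File C:\PS-NetMon\PS-NetMon.PS1"
```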

NOTES:

  • When monitoring RAM, VRAM and CPU, it is important to remember that the act of monitoring uses these resources and may add enough work to push the server into an alert situation. The act of observation can cause the condition itself (it’s a quantum physics thing).
  • Active servers such as Exchange or Domain Controllers may use considerable resources with event log scans.
  • The server running PS-NetMon may read high values during the scanning process due to multiple PowerShell instances.

-fin