Operations Manager 2012: Network Monitoring

Speaker: Vishnu Nath, PM for Network Monitoring feature in OpsMgr 2012.

Discovery, monitoring, visualisation and reporting. Key takeaway: OpsMgr will help IT Operations gain visibility into the network layer of a service to reduce mean time to resolution. All the required MPs, dashboards, and reports are built in-box. There is server-to-network dependency discovery, with over 80 vendors and 2000+ devices certified. It supports SNMP v1, v2c and v3. There is support for IPv4 and IPv6 endpoints.

Supported devices:

Bridges
Firewalls
Load balancers
Switches
Routers

Discovery

Process of identifying network devices to be monitored. Designed to be simple, without the need to call in network admins.

Demo

You can run the normal discovery wizard to discover network devices. There is also a Discovery Rule that you can configure in Administration/Network Management. This can run on a regular schedule. You can pick a management or gateway server to run the rule, and you set the server resource pool for the monitoring. Note that the design guide prefers that you have a dedicated network monitoring resource pool (min 2 Mgmt servers) if doing this at scale.

There are two discovery types, reflecting the two kinds of customer MSFT has encountered. In an explicit discovery, you list the IPs of the devices to be discovered. Alternatively, you can do a recursive discovery, which crawls the network via router ARP and IP tables. That’s useful if you don’t know the network architecture.
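To make the recursive idea concrete, here is a minimal sketch of a breadth-first crawl. It is not how OpsMgr implements discovery: the `neighbours` dictionary is a hypothetical stand-in for whatever would be read from each device's ARP and IP tables, and the function name, the `max_devices` parameter, and the addresses are made up for illustration.

```python
from collections import deque


def recursive_discovery(seed, neighbours, max_devices=None):
    """Breadth-first crawl outward from a seed device.

    `neighbours` maps a device address to the addresses it knows about
    (a hypothetical stand-in for ARP/IP table lookups).
    """
    discovered = []
    seen = {seed}
    queue = deque([seed])
    while queue:
        # Stop early once the optional cap is reached.
        if max_devices is not None and len(discovered) >= max_devices:
            break
        device = queue.popleft()
        discovered.append(device)
        for peer in neighbours.get(device, []):
            if peer not in seen:
                seen.add(peer)
                queue.append(peer)
    return discovered


# Made-up topology, for illustration only.
topology = {
    "10.0.0.1": ["10.0.0.2", "10.0.0.3"],
    "10.0.0.2": ["10.0.0.1", "10.0.0.4"],
    "10.0.0.3": ["10.0.0.1"],
    "10.0.0.4": ["10.0.0.2"],
}

everything = recursive_discovery("10.0.0.1", topology)
capped = recursive_discovery("10.0.0.1", topology, max_devices=2)
```

An explicit discovery, by contrast, would simply be the list of addresses you supplied, with no crawling.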

You’ll need runas accounts for the community strings (read-only passwords to the MIBs and SNMP tables in the network devices). It does not need read-write private strings. Using a runas account secures the password/community string. You can have a number of them for complex environments.

You can import a text file of device IP addresses for an explicit discovery. You can use ICMP and/or SNMP access mode to monitor the device. ICMP gives you ping up/down probe monitoring. SNMP gives you more depth. An ISP won’t give you SNMP access. A secure environment might not allow ICMP into a DMZ. You can set the SNMP version, and the runas account for each device. During discovery, OpsMgr will try each community string you’ve entered. It will remember which one works. In some environments, devices can send trap alerts if they have failed logins and that can create a storm of alerts … SO BEWARE. You can avoid this by selecting the right runas account per device.
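The "try each community string and remember which one works" behaviour can be sketched roughly as below. Everything here is an assumption for illustration: the one-address-per-line file layout, the account names, and especially `probe`, which is a hypothetical callable standing in for a real SNMP read. The failure counter shows why guessing can cause a trap storm and why pinning the right account per device avoids it.

```python
import ipaddress


def parse_device_list(text):
    """Parse one IP address per line, skipping blank lines and '#' comments."""
    devices = []
    for line in text.splitlines():
        entry = line.strip()
        if entry and not entry.startswith("#"):
            # ip_address() raises ValueError for anything that is not valid IPv4/IPv6.
            devices.append(str(ipaddress.ip_address(entry)))
    return devices


def discover(devices, accounts, probe, remembered=None):
    """Try accounts in order per device and remember the first that works.

    Returns (remembered, failed_attempts). Each failed attempt is a failed
    login that a device might report with a trap.
    """
    remembered = dict(remembered or {})
    failed_attempts = 0
    for device in devices:
        if device in remembered:
            continue  # account already known, so no guessing against this device
        for account in accounts:
            if probe(device, account):
                remembered[device] = account
                break
            failed_attempts += 1
    return remembered, failed_attempts


# Hypothetical data: which account each device accepts.
accepted = {"10.0.0.1": "core-ro", "10.0.0.2": "edge-ro"}


def fake_probe(device, account):
    return accepted.get(device) == account


devices = parse_device_list("# core devices\n10.0.0.1\n\n10.0.0.2\n")
remembered, failures = discover(devices, ["core-ro", "edge-ro"], fake_probe)
_, failures_second_run = discover(devices, ["core-ro", "edge-ro"], fake_probe, remembered)
```

In this made-up run the first pass costs one failed login (the wrong account tried against 10.0.0.2), and the second pass costs none because the working account is already remembered.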

There are settings for retry attempts, ICMP timeout, and SNMP timeout. You can also set a cap on the maximum number of devices to discover. This is to avoid discovering more than you need to in a corporate environment.

You can limit the discovery to Name, OID, or IP range. And you can exclude devices.
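As a rough illustration of how such filters might combine, here is a small sketch. The device record layout, the function name, and the choice to AND the filters together are all assumptions of mine, not something stated in the session; the names, OIDs and addresses are example values only.

```python
import ipaddress
from fnmatch import fnmatchcase


def in_scope(device, name_pattern=None, oid_prefix=None, ip_range=None, excluded=()):
    """Return True if a device passes every filter that was supplied.

    `device` is a hypothetical record with "name", "oid" and "ip" keys.
    """
    addr = ipaddress.ip_address(device["ip"])
    # Exclusions win over everything else.
    if any(addr == ipaddress.ip_address(item) for item in excluded):
        return False
    if name_pattern is not None and not fnmatchcase(device["name"], name_pattern):
        return False
    if oid_prefix is not None:
        oid = device["oid"]
        # Match the prefix on whole OID components only.
        if oid != oid_prefix and not oid.startswith(oid_prefix + "."):
            return False
    if ip_range is not None and addr not in ipaddress.ip_network(ip_range):
        return False
    return True


# Example values only.
switch = {"name": "core-sw-01", "oid": "1.3.6.1.4.1.9.1.516", "ip": "10.1.0.5"}
router = {"name": "edge-rt-01", "oid": "1.3.6.1.4.1.2636.1.1", "ip": "192.168.5.1"}
```

For example, `in_scope(switch, ip_range="10.1.0.0/24")` passes, while adding `excluded=["10.1.0.5"]` drops the same device.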

You can also do the discovery on a regular basis using a schedule. This is not important in a static environment. Maybe do it once a week in larger or more fluid environments. You can run the discovery rule manually. When you save the rule, you have the choice to run the rule right then.

What’s Discovered

Connectivity of devices and dependencies, servers to network and network to network
VLAN membership
HSRP for Cisco
Stitching of switch ports to server NICs
Key components of devices: ports, interfaces, processor, and (I think) memory

The process:

Probing (if a device is not supported, it’s put in pending management for you to look at; if OpsMgr knows the device, it has built-in MIBs to deal with it) –> Processing –> Post Processing (which VLANs exist, which devices are connected, NIC stitching).

Discovery works only on a gateway or management server
A single discovery rule per gateway/management server
Discovery runs on a scheduled basis or on demand
Limited discoveries can be triggered by device traps, which are enabled on some devices. For example, some devices detect a NIC swap and send a trap, so OpsMgr knows that it needs to rediscover that device. Seamless and clever.

Port/Interface Monitoring

Up/down
Volumes of inbound/outbound traffic
% utilization
Discards, drops, Errors
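For reference, % utilization is commonly derived from two readings of an interface's octet counter. The sketch below uses that common formula; I don't know that OpsMgr calculates it exactly this way, and the function and parameter names are my own.

```python
def utilisation_percent(prev_octets, curr_octets, interval_s, speed_bps, counter_bits=32):
    """Percentage of link capacity used between two octet counter readings.

    The modulo copes with a counter that wrapped once between the readings.
    """
    if interval_s <= 0 or speed_bps <= 0:
        raise ValueError("interval_s and speed_bps must be positive")
    delta_octets = (curr_octets - prev_octets) % (2 ** counter_bits)
    bits_sent = delta_octets * 8
    capacity_bits = interval_s * speed_bps
    return bits_sent / capacity_bits * 100


# 37,500,000 octets in 5 minutes on a 100 Mbit/s link.
sample = utilisation_percent(0, 37_500_000, 300, 100_000_000)
```

On fast links a 32-bit counter can wrap more than once between polls, which is why 64-bit counters are preferred where the device offers them.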

Processor % utilization

Memory counters (Cisco) and free memory

Connection Health on both ends of the connection

VLAN health based on state of switches (rollup) in the VLAN

HSRP Group Health is a rollup as well
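A rollup like this is often a "worst state wins" calculation. The sketch below assumes that policy; the session only said the health is a rollup, so the three states, the policy, and treating an empty group as healthy are all my assumptions.

```python
from enum import IntEnum


class Health(IntEnum):
    HEALTHY = 0
    WARNING = 1
    CRITICAL = 2


def rollup(member_states):
    """Return the worst state among the members (healthy if there are none)."""
    return max(member_states, default=Health.HEALTHY)


# Hypothetical VLAN with three member switches.
vlan_health = rollup([Health.HEALTHY, Health.WARNING, Health.HEALTHY])
```

Here one switch in a warning state is enough to put the whole VLAN into warning.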

Network Monitoring

Supports resource pools for HA monitoring
Only certain ports are monitored by default: ports connecting two network devices together, or ports that the management server is connected to
User can override and monitor other ports if required
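The default selection rule above could be expressed roughly like this. The port record layout, the names, and the way overrides are passed in are hypothetical; this is only a sketch of the rule as described in the session.

```python
def monitored_ports(ports, network_devices, management_servers, overrides=()):
    """Return the ids of ports that would be monitored.

    A port qualifies by default if its peer is another network device or a
    management server; `overrides` lists extra port ids the user opted in.
    """
    selected = []
    for port in ports:
        peer = port["peer"]
        by_default = peer in network_devices or peer in management_servers
        if by_default or port["id"] in overrides:
            selected.append(port["id"])
    return selected


# Made-up switch with four ports.
ports = [
    {"id": "Gi0/1", "peer": "router-01"},
    {"id": "Gi0/2", "peer": "opsmgr-ms-01"},
    {"id": "Gi0/3", "peer": "file-server-07"},
    {"id": "Gi0/4", "peer": None},
]
default_set = monitored_ports(ports, {"router-01"}, {"opsmgr-ms-01"})
with_override = monitored_ports(ports, {"router-01"}, {"opsmgr-ms-01"}, overrides={"Gi0/3"})
```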

Visualisation

4 dashboards:

Network summary: This is the high level view, e.g. top 10 nodes lists
Network node: Take any device and drill down into it.
Network interface: Drill into a specific interface to see traffic activity
Vicinity: neighbours view and connection health.

Reporting

5 reports:

Memory utilisation
CPU utilisation
Port traffic volume
Port error analysis
Port packet analysis

Demo

Behind the scenes they normalise data, e.g. memory free from vendor A and memory used from vendor B, so you have one consistent view. You can run a task to enable port monitoring for (by default) un-monitored discovered ports (see above).
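The normalisation idea can be shown with a tiny sketch: whichever figure a vendor reports, convert it to one common measure. The function name and the choice of "percent used" as the common measure are my assumptions, not necessarily what OpsMgr does internally.

```python
def memory_percent_used(total, free=None, used=None):
    """Normalise a vendor's 'free' or 'used' figure to percent used."""
    if total <= 0:
        raise ValueError("total must be positive")
    if (free is None) == (used is None):
        raise ValueError("supply exactly one of free or used")
    if used is None:
        used = total - free
    return used / total * 100


# Vendor A reports free memory, vendor B reports used memory.
vendor_a = memory_percent_used(1024, free=256)
vendor_b = memory_percent_used(2048, used=1536)
```

Both made-up devices come out at 75% used, so they can sit side by side in the same dashboard or report.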

End

You can author custom management packs with your own SNMP rules. They used 2 industry-standard MIBs, and this has worked on 90-95% of the devices that they’ve encountered so far. That means there’s a good chance it will work on future devices.
