Speaker: Vishnu Nath, PM for Network Monitoring feature in OpsMgr 2012.
Discovery, monitoring, visualisation and reporting. Key takeaway: OpsMgr will help IT Operations gain visibility into the network layer of a service to reduce mean time to resolution. All the required MPs, dashboards, and reports are built in-box. Server to network dependency discovery with support for over 80 vendors and 2000+ certified devices. It supports SNMP v1, v2c and v3. There is support for IPv4 and IPv6 endpoints.
- Load balancers
Process of identifying network devices to be monitored. Designed to be simple, without the need to call in network admins.
You can run the normal discovery wizard to discover network devices. There is also a Discovery Rule that you can configure in Administration/Network Management. This can run on a regular schedule. You can pick a management or gateway server to run the rule, and you set the server resource pool for the monitoring. Note that the design guide prefers that you have a dedicated network monitoring resource pool (min 2 Mgmt servers) if doing this at scale.
There are two discovery types, reflecting the two kinds of customer MSFT has encountered. With explicit discovery, you list the IPs of the devices. Alternatively, a recursive discovery crawls the network via router ARP and IP tables. That’s useful if you don’t know the network architecture.
You’ll need runas accounts for the community strings … read-only passwords to MIBs and SNMP tables in the network devices. It does not need read-write private strings. Using a runas account secures the password/community string. You can have a number of them for complex environments.
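As a rough illustration of the recursive type, here is a minimal sketch of a breadth-first crawl from one seed device. It is not how OpsMgr implements discovery; the `NEIGHBOURS` table, the function name and the `max_devices` parameter are all made up for the example, and a real crawl would read each device's ARP and IP tables over SNMP rather than from a dictionary.

```python
from collections import deque

# Hypothetical topology: each device IP maps to the neighbour IPs that would
# be found in its ARP and IP tables. Sample data only.
NEIGHBOURS = {
    "10.0.0.1": ["10.0.0.2", "10.0.1.1"],
    "10.0.0.2": ["10.0.0.1"],
    "10.0.1.1": ["10.0.0.1", "10.0.1.2"],
    "10.0.1.2": ["10.0.1.1"],
}


def recursive_discovery(seed, neighbours, max_devices=None):
    """Breadth-first crawl from a seed device, following neighbour tables.

    Stops early once max_devices devices have been found (if a cap is set).
    """
    found = [seed]
    seen = {seed}
    queue = deque([seed])
    while queue:
        current = queue.popleft()
        for peer in neighbours.get(current, []):
            if peer in seen:
                continue
            if max_devices is not None and len(found) >= max_devices:
                return found
            seen.add(peer)
            found.append(peer)
            queue.append(peer)
    return found


everything = recursive_discovery("10.0.0.1", NEIGHBOURS)
capped = recursive_discovery("10.0.0.1", NEIGHBOURS, max_devices=2)
```

An explicit discovery, by contrast, would simply be the list of IPs you typed or imported, with no crawling at all.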
You can import a text file of device IP addresses for an explicit discovery. You can use ICMP and/or SNMP access mode to monitor the device. ICMP gives you ping up/down probe monitoring. SNMP gives you more depth. An ISP won’t give you SNMP access. A secure environment might not allow ICMP into a DMZ. You can set the SNMP version, and the runas account for each device. During discovery, OpsMgr will try each community string you’ve entered. It will remember which one works. In some environments, devices can send trap alerts if they have failed logins and that can create a storm of alerts … SO BEWARE. You can avoid this by selecting the right runas account per device.
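To make the "try each community string and remember the one that works" behaviour concrete, here is a small sketch. Everything in it is hypothetical: `fake_probe` stands in for a real SNMP request, and `ACCEPTED` is invented lab data. It also records the failed attempts, which are what can trigger the trap storm mentioned above.

```python
# Hypothetical lab data: the community string each device accepts.
ACCEPTED = {"10.0.0.1": "core-ro", "10.0.0.2": "edge-ro"}

# Every wrong guess is a failed login that the device might report via a trap.
failed_logins = []


def fake_probe(device, community):
    """Stand-in for an SNMP GET; returns True if the string is accepted."""
    ok = ACCEPTED.get(device) == community
    if not ok:
        failed_logins.append((device, community))
    return ok


def find_working_credential(device, candidates, probe, cache):
    """Return the community string that works for a device.

    A previously remembered string is reused without probing again;
    otherwise each candidate is tried in order and the first hit is cached.
    """
    if device in cache:
        return cache[device]
    for community in candidates:
        if probe(device, community):
            cache[device] = community
            return community
    return None


cache = {}
first = find_working_credential("10.0.0.2", ["core-ro", "edge-ro"], fake_probe, cache)
second = find_working_credential("10.0.0.2", ["core-ro", "edge-ro"], fake_probe, cache)
```

Passing a single candidate per device, which is the equivalent of picking the right runas account up front, means no wrong guesses and so no failed-login traps.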
There are retry attempts, ICMP timeout, SNMP timeout. You also can set a max device number discovery cap. This is to avoid discovering more than you need to in a corporate environment.
You can limit the discovery to Name, OID, or IP range. And you can exclude devices.
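A sketch of how such filters might behave, assuming they combine with AND and that exclusions always win. The session did not cover the actual matching rules, so the function, its parameters and the sample devices are illustrative only. It also assumes the range and the device address are in the same IP family.

```python
import ipaddress
from fnmatch import fnmatchcase

# Sample devices; names, OIDs and addresses are invented for the example.
DEVICES = [
    {"name": "core-sw-01", "oid": "1.3.6.1.4.1.9.1.516", "ip": "10.0.0.10"},
    {"name": "edge-fw-01", "oid": "1.3.6.1.4.1.99.1.1", "ip": "10.0.0.20"},
    {"name": "core-sw-02", "oid": "1.3.6.1.4.1.9.1.516", "ip": "10.0.0.30"},
]


def in_scope(device, name_pattern=None, oid_prefix=None, ip_range=None, excluded=()):
    """Return True if a device passes every filter that has been set."""
    ip = ipaddress.ip_address(device["ip"])
    if any(ip == ipaddress.ip_address(addr) for addr in excluded):
        return False
    if name_pattern and not fnmatchcase(device["name"], name_pattern):
        return False
    if oid_prefix:
        oid = device["oid"]
        # Match whole OID arcs, so prefix ...1.9 does not match ...1.99
        if oid != oid_prefix and not oid.startswith(oid_prefix + "."):
            return False
    if ip_range:
        first, last = (ipaddress.ip_address(addr) for addr in ip_range)
        if not first <= ip <= last:
            return False
    return True


selected = [
    d["name"]
    for d in DEVICES
    if in_scope(
        d,
        oid_prefix="1.3.6.1.4.1.9",
        ip_range=("10.0.0.1", "10.0.0.50"),
        excluded=("10.0.0.30",),
    )
]
```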
You can also do the discovery on a regular basis using a schedule. That’s not important in a static environment. Maybe do it once a week in larger or more fluid environments. You can also run the discovery rule manually. When you save the rule, you have the choice to run it right then.
- Connectivity of devices and dependencies, servers to network and network to network
- VLAN membership
- HSRP for Cisco
- Stitching of switch ports to server NICs
- Key components of devices: ports, interfaces, processor, and memory (I think)
Probing (if the device is not supported, it’s put into pending management for you to look at; if OpsMgr knows the device, it has built-in MIBs to deal with it) –> Processing –> Post Processing (which VLANs, which devices are connected, NIC stitching mapping).
- Works only on Gateway/management server
- Single rule per gateway/management server
- Discovery runs on a scheduled basis or on demand
- Limited discoveries can be triggered by device traps – enabled on some devices. Some devices detect a NIC swap, and the device traps, and OpsMgr knows that it needs to rediscover this device. Seamless and clever.
- Volumes of inbound/outbound traffic
- % utilization
- Discards, drops, Errors
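The session did not say how OpsMgr derives % utilization. The usual approach with SNMP is to sample an interface octet counter twice and compare the delta with the interface speed. Here is a minimal sketch of that arithmetic, assuming a 32-bit counter that may wrap between samples; the function and parameter names are my own.

```python
def utilisation_percent(octets_before, octets_after, seconds, speed_bps, counter_bits=32):
    """Percent utilisation of an interface between two counter samples.

    octets_before / octets_after: two readings of an octet counter
    seconds: time between the readings
    speed_bps: interface speed in bits per second
    """
    if seconds <= 0 or speed_bps <= 0:
        raise ValueError("seconds and speed_bps must be positive")
    # Modulo arithmetic handles a single counter wrap between samples.
    delta_octets = (octets_after - octets_before) % (2 ** counter_bits)
    return delta_octets * 8 * 100 / (seconds * speed_bps)


# 3,750,000 octets in 5 minutes on a 1 Mbps link
steady = utilisation_percent(0, 3_750_000, 300, 1_000_000)

# Counter wrapped: 1,000 octets before the wrap plus 500 after it
wrapped = utilisation_percent(2 ** 32 - 1000, 500, 60, 1_000_000)
```

On fast links a 32-bit counter can wrap more than once between samples, which this simple delta cannot detect; that is why 64-bit counters are preferred where the device offers them.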
Processor % utilization
Memory counters (Cisco) and free memory
Connection Health on both ends of the connection
VLAN health based on state of switches (rollup) in the VLAN
HSRP Group Health is a rollup as well
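A rollup here means the health of a group is derived from the health of its members. The sketch below assumes a worst-state policy, where the group is as unhealthy as its worst member. The session only said "rollup", so the policy, the state names and the treatment of an empty group are assumptions.

```python
# Assumed ordering of health states, from best to worst.
SEVERITY = {"healthy": 0, "warning": 1, "critical": 2}


def rollup(member_states):
    """Worst-state rollup: return the most severe state among the members.

    An empty group is reported as healthy (an arbitrary choice for the sketch).
    """
    return max(member_states, key=SEVERITY.__getitem__, default="healthy")


# Hypothetical VLAN with three member switches
vlan_health = rollup(["healthy", "warning", "healthy"])
```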
- Supports resource pools for HA monitoring
- Only certain ports monitored by default: ports connecting two network devices together or ports that the management server is connected to
- User can override and monitor other ports if required
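The default port selection can be pictured as a simple filter. This sketch assumes each discovered port records what it is connected to; the field names, device names and the `overrides` parameter are invented for illustration.

```python
def ports_to_monitor(ports, network_devices, management_servers, overrides=()):
    """Return the names of ports that would be monitored.

    A port qualifies by default if its peer is another network device or a
    management server. Names listed in overrides are monitored regardless.
    """
    selected = []
    for port in ports:
        peer = port.get("connected_to")
        is_default = peer in network_devices or peer in management_servers
        if is_default or port["name"] in overrides:
            selected.append(port["name"])
    return selected


# Hypothetical switch with four discovered ports
PORTS = [
    {"name": "Gi0/1", "connected_to": "router-01"},
    {"name": "Gi0/2", "connected_to": "opsmgr-ms-01"},
    {"name": "Gi0/3", "connected_to": "file-server-07"},
    {"name": "Gi0/4", "connected_to": None},
]

defaults = ports_to_monitor(PORTS, {"router-01"}, {"opsmgr-ms-01"})
with_override = ports_to_monitor(PORTS, {"router-01"}, {"opsmgr-ms-01"}, overrides={"Gi0/3"})
```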
- Network summary: This is the high level view, i.e. top 10 nodes list
- Network node: Take any device and drill down into it.
- Network interface: Drill into a specific interface to see traffic activity
- Vicinity: neighbours view and connection health.
- Memory utilisation
- CPU utilisation
- Port traffic volume
- Port error analysis
- Port packet analysis
Behind the scenes they normalise data, e.g. memory free from vendor A and memory used from vendor B, so you have one consistent view. You can run a task to enable port monitoring for (by default) un-monitored discovered ports (see above).
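The normalisation idea is simple arithmetic. In this sketch two hypothetical vendors report memory differently and both are converted to one "percent used" figure; the function name and the numbers are made up.

```python
def percent_used(total, used=None, free=None):
    """Normalise a memory reading to percent used.

    Pass either used or free, depending on what the device reports.
    """
    if total <= 0:
        raise ValueError("total must be positive")
    if used is None:
        if free is None:
            raise ValueError("need either used or free")
        used = total - free
    return 100.0 * used / total


# Vendor A reports free memory, vendor B reports used memory.
vendor_a = percent_used(total=1024, free=256)
vendor_b = percent_used(total=2048, used=1536)
```

Both devices end up on the same scale, so they can sit side by side in one dashboard or report.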
You can author custom management packs with your own SNMP rules. They used 2 industry standard MIBs, and that has worked on 90-95% of the devices that they’ve encountered so far. That means there’s a good chance it will work on future devices.
This blog post is the property of Aidan Finn (@joe_elway / http://www.aidanfinn.com) and may not be reused in any manner without prior consent of Aidan Finn. You may quote one paragraph from this blog post if you link to the original blog post.