Enterprise Level Information Management

From NMSWiki

NMS Information Management Architectures

Network information management is an essential component of a robust network management architecture. Tools that provide planning, design, and life cycle management for network assets should be a primary focus of any well-managed network. Management systems should provide actionable information that administrators can use to improve network management performance and to develop effective network asset control processes.

This document provides strategies and recommendations on the architecture and design necessary to support an NMS Information Management Architecture. The recommendations in this guide are a supporting element within the Enterprise NMS Architectures.

The intent of this document is to provide sufficient information to allow the reader to build and operate an intelligent information management system to be used for all network asset information gathering and dissemination.

This document will focus on the requirements necessary to build a basic NMS Information Database (Data Store) and will outline how the information should be stored and retrieved so that other NMS systems may reliably gather inventory information as well as provide relevant data back into it.


NMS Architectural Model Review

As outlined in the Enterprise NMS Architectures, this model shows the major components that make up a comprehensive NMS architecture. The model will serve as the foundation for designing a reliable flow of information and is provided here to help familiarize you with the concept of a tiered approach to network management.

The main purpose of this architecture is to help the organization understand which components comprise a basic level of integrated network management. This foundation of network management can be characterized as addressing three functional layers of the TMN (Telecommunications Management Network) model: the Element Management Layer (EML), the Network Management Layer (NML), and the Service Management Layer (SML).

Diagram: Hierarchical NMS Architecture

NMS Information Management Architecture

Building upon the architecture outlined above, the following diagram shows how the information flow fits into the tiered architectural model. In the diagram below, the Data Store (database) resides in the top layer (SML) of the NMS Architectural Model because of its capability to intelligently interact with most, if not all, other NMS systems in the architecture.

Diagram: NMS Information Architecture


NMS Information Goals and Benefits

Operational Goals

To maximize the potential benefits, the proposed architecture must be able to:

  • Centralize important inventory data so that it can be leveraged by many management tools.
  • Provide accurate and timely information.
  • Build the Data Store on an efficient and scalable database.
  • Automate inventory collection as much as possible to avoid human error.
  • Incorporate an accurate, reliable "Master Poller" (MP).
  • Utilize an "open interface" so that future elements being installed in the network can easily access and update this Data Store.
  • Provide a simple end-user interface for managing the information.
  • Improve productivity, accuracy and workflow through the use of automation and integration.

Benefits

Some of the potential benefits from adding inventory/asset management tools to a network management infrastructure include:

  • Lower support costs resulting from a more efficient help desk and technical support infrastructure.
  • Improved network availability and reliability due to decreased MTTR.
  • Improved quality and capabilities of other NMS tools.
  • Enhanced security and network availability through the timely and accurate identification of elements exposed to security vulnerabilities or hardware defects.
  • Improved network availability by proactively identifying network components approaching End of Life and End of Service Support.
  • Enhanced security by quickly identifying unauthorized network assets.

Building an NMS Information Storage System

The following sections highlight the functional requirements necessary to begin building and maintaining an enterprise-wide reliable Data Store.

Master Poller

A Master Poller (MP) is used to define a reliable base of information that all other NMS systems implicitly trust. Careful consideration should be taken when defining which information source to designate as the MP. The system selected should have the least amount of human interaction with regard to inventory discovery. The more automation involved, the more likely the information is to be reliable.

Polling Cycles

A polling frequency should be determined that will allow sufficient time to gather inventory information from all devices in the network. Determining the polling frequency depends entirely on the capability of the poller and the size of the network. For example, if it only takes 5 minutes to poll every device in a network, then consider setting the discovery for every 1 or 2 hours. However, if it takes 3 days to poll the entire network, then consider setting the poll rate on a weekly schedule.
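
The rule of thumb above can be sketched in a few lines of Python. The ladder of candidate schedules and the safety factor of 2 are assumptions, chosen only so that the sketch reproduces the two examples in the text; they are not settings of any particular poller.

```python
from datetime import timedelta

# Candidate schedules, shortest first (an assumed ladder, not a product setting).
SCHEDULES = [
    ("hourly", timedelta(hours=1)),
    ("every 2 hours", timedelta(hours=2)),
    ("daily", timedelta(days=1)),
    ("weekly", timedelta(weeks=1)),
    ("monthly", timedelta(days=30)),
]

def suggest_schedule(full_poll_duration, safety_factor=2):
    """Return the shortest schedule that leaves room for a complete poll.

    safety_factor is a hypothetical margin: the interval must be at least
    this many times longer than one full pass over the network.
    """
    needed = full_poll_duration * safety_factor
    for name, interval in SCHEDULES:
        if interval >= needed:
            return name
    return SCHEDULES[-1][0]  # fall back to the longest schedule

print(suggest_schedule(timedelta(minutes=5)))  # hourly
print(suggest_schedule(timedelta(days=3)))     # weekly
```

In practice the measured duration of a full poll should be re-checked as the network grows, since the appropriate schedule may change with it.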

Roles

Informational Roles are defined so that a trusted hierarchy of information is set up between NMS systems.

Three types of roles are defined:

Master

The Master Poller (MP) has the final say in deciding which information is correct. For example, if two NMS’s disagree on information, the MP will be deemed the correct source. For this reason, it is extremely important that there is as little human interaction with data validation as possible (the MP should be set up to poll devices directly for information rather than relying on input from users).

Parent

The Parent role has authority over all children and is the mediator between MP and child. This data source should contain the information gathered by multiple sources.

Child

The child role is assigned as a read-only type role. Any NMS assigned a child role is only allowed to pull information from higher level roles (i.e. they do not have permission to write information, only to ask for it).


Costs

To settle potential conflicts between information from multiple sources, a "cost" factor should be assigned to each network management system. The assigned cost is taken into account when determining which source is more reliable: the lower the cost, the more trusted the source of information. For this reason, the "Master Poller" should always have the lowest cost assigned among all systems.
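
A minimal sketch of cost-based conflict resolution follows. The `Report` structure, the source names and the sample values are hypothetical and serve only to illustrate the "lowest cost wins" rule described above.

```python
from dataclasses import dataclass

@dataclass
class Report:
    source: str  # NMS that supplied the value
    cost: int    # lower cost = more trusted
    value: str   # the reported attribute value

def resolve(reports):
    """Pick the report from the lowest-cost (most trusted) source."""
    if not reports:
        raise ValueError("no reports to resolve")
    # min() returns the first report in case of a tie on cost.
    return min(reports, key=lambda r: r.cost)

reports = [
    Report("child-nms", 50, "12.2(33)"),
    Report("master-poller", 10, "12.2(44)"),
    Report("data-store", 20, "12.2(33)"),
]
winner = resolve(reports)
print(winner.source, winner.value)  # master-poller 12.2(44)
```

Assigning each system a unique cost avoids ties, so that the outcome never depends on the order in which reports arrive.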

Information Sources

The following table outlines some generic sources of information as well as what role they may play in the information hierarchy.

Table: NMS Information Sources

Source | Destination | Role | Cost | Comments
{PRODUCT1} | Data Store | MP | 10 | Auto-discovery of network inventory
Data Store | All NMS’s and Users | P | 20 | Child to {PRODUCT1}, Parent to all other NMS’s
Cisco LMS | Users | C | 30 | Possibility of using ANI for supplemental polling
{PRODUCT2} | Users | C | 50 |
{PRODUCT4} | | P/C | 40 |
Cisco CNC | Users | C | 60 |
{PRODUCT3} | Users | C | 70 |
Syslog | Users | P | 25 | Will be used to verify/correlate inventory


Database Design

Planning the database is one of the most important aspects of designing an intelligent information system.

The recommendations below will help you implement a scalable, fault tolerant system that will grow with the enterprise.

Use an "Open" Database

The term "open" is defined as, "an architecture that employs open standards for key interfaces within a system". An open SQL database such as MySQL should be implemented so that data stored by multiple systems is simple to access. A web interface written in a language such as PHP or Perl can be placed in front of the database so that authorized users can easily store, retrieve and (possibly) edit this information. This interface should be the primary access point for all users and NMS systems.

Table Design

Tables and fields may be added to or removed from this database at will, but the database should initially contain at least some basic information. Multiple tables will be used to store data from different sources such as CiscoWorks, {PRODUCT1}, contact information, etc.

This section describes example tables that can be used in the database. The information stored in each table should be determined by the workflow desired for your organization.

  • The SQL data types should be discussed prior to implementation; a default of varchar(75) has been selected for now.


Table: Sample CiscoWorks LMS Table Structure

Field Name | Description | SQL Type | Comments
id | SQL ID | int | Auto number
ip | IP Address | varchar(75) |
hostname | DNS Hostname | varchar(75) |
domainname | DNS Domain Suffix | varchar(75) |
devid | LMS Device ID | varchar(75) |
displayname | Display Name | varchar(75) |
sysobjid | SNMP OID | varchar(75) |
dcrdevtype | LMS Device Type | varchar(75) |
mdftype | LMS MDF Type | varchar(75) |
v2ro | SNMP RO String | varchar(75) |
v2rw | SNMP RW String | varchar(75) |
v3uid | SNMP V3 User ID | varchar(75) |
v3pass | SNMP V3 Password | varchar(75) |
v3engid | SNMP V3 Engine ID | varchar(75) |
v3authalg | SNMP V3 Authentication Algorithm | varchar(75) |
username | Device Username | varchar(75) |
password | Device Password | varchar(75) |
enable | Device Enable Password | varchar(75) |
softrev | Device Software Rev | varchar(75) |
flashfn | Device Flash Filename | varchar(75) |
chassistype | Device Chassis Type | varchar(75) |
sysname | Device Local Hostname | varchar(75) |
nmsname | NMS Server that last updated this row | varchar(75) |
cost | NMS reliability cost | integer(10) |
entrydate | Date this entry was last updated | datetime |


Table: Sample Contacts Table Structure

Field Name | Description | SQL Type | Comments
id | SQL ID | int | Auto number
site | Site Name | varchar(75) |
location | Physical Location | varchar(75) |
devid | Device IDs this site maintains | varchar(75) | Pulled from ID field of other tables
manager | Site Manager | varchar(75) |
engineer | Site Engineer(s) | varchar(75) |
engph | Engineer Phone | varchar(75) |
mgrph | Manager Phone | varchar(75) |
hours | Site Hours | varchar(75) |
division | Division | varchar(75) |
market | Market | varchar(75) |
nmsname | NMS Server that last updated this row | varchar(75) |
cost | NMS reliability cost | integer(10) |
entrydate | Date this entry was last updated | datetime |

Many more fields may be added here, such as address, phone, email, etc.


Correlating Information from Multiple Tables

The number of tables that can be created is limited primarily by the server hardware. It is recommended to create multiple tables, each holding information for a specific function, rather than one very large table.

Creating New Tables

1. All tables should contain a minimum set of data (e.g. nmsname, cost, entrydate).

2. All tables should contain at least one field that can be used to associate them with other tables (e.g. the contacts table above contains a "devid" (device ID) field, and the LMS table contains a matching "devid" field). The values in these two fields must be identical in order to correlate information between the two tables.
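
The two rules above can be illustrated with a small, self-contained sketch. It uses Python's built-in sqlite3 module with an in-memory database purely as a stand-in for the MySQL server; the table names `lms` and `contacts`, the reduced field set and the sample rows are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # stand-in for the real database server
conn.executescript("""
CREATE TABLE lms (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    ip VARCHAR(75),
    hostname VARCHAR(75),
    devid VARCHAR(75),
    nmsname VARCHAR(75),
    cost INTEGER,
    entrydate DATETIME
);
CREATE TABLE contacts (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    site VARCHAR(75),
    devid VARCHAR(75),
    engineer VARCHAR(75),
    nmsname VARCHAR(75),
    cost INTEGER,
    entrydate DATETIME
);
""")
# Every row carries the minimum set of data: nmsname, cost, entrydate.
conn.execute(
    "INSERT INTO lms (ip, hostname, devid, nmsname, cost, entrydate) "
    "VALUES (?, ?, ?, ?, ?, datetime('now'))",
    ("192.0.2.10", "core-rtr-1", "dev-001", "lms-server", 30),
)
conn.execute(
    "INSERT INTO contacts (site, devid, engineer, nmsname, cost, entrydate) "
    "VALUES (?, ?, ?, ?, ?, datetime('now'))",
    ("HQ", "dev-001", "J. Smith", "data-store", 20),
)

def contact_for_host(conn, hostname):
    """Look up the site and engineer for a device via the shared devid."""
    return conn.execute(
        "SELECT c.site, c.engineer FROM lms AS l "
        "JOIN contacts AS c ON c.devid = l.devid WHERE l.hostname = ?",
        (hostname,),
    ).fetchone()

print(contact_for_host(conn, "core-rtr-1"))  # ('HQ', 'J. Smith')
```

The join only succeeds because both tables hold exactly the same devid value, which is why the field must be kept identical across tables.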

Database Security

An important process in evaluating database security is performing vulnerability assessments against the database. A vulnerability assessment attempts to find vulnerabilities that could be used to break into the database. Database administrators or information security administrators should run vulnerability scans on databases to discover misconfigured controls along with known vulnerabilities within the database software. The results of the scans should be used to harden the database in order to mitigate the threat of compromise by intruders.

Database Administration

A database administrator (DBA) should be assigned to be responsible for the environmental aspects of the database. In general, these include:

  • Recoverability - Creating and testing backups
  • Integrity - Verifying or helping to verify data integrity
  • Security - Defining and/or implementing access controls to the data
  • Availability - Ensuring maximum uptime
  • Performance - Ensuring maximum performance given budgetary constraints
  • Development and testing support - Helping programmers and engineers to efficiently utilize the database.


Database Redundancy

As with any mission critical application, redundancy should be built into the database. There are several levels of redundancy which can be defined based on factors such as budget and criticality.

Databases can be set up to communicate directly with each other so that when data is entered into one table, it is mirrored on the other database.

Alternatively (and much easier), you may use rsync to synchronize the databases, ideally from a dump or other consistent copy rather than from live data files, as well as to keep a rotation of offline backups.

Because this database is not a "real-time" application (that is, it does not serve thousands of simultaneous web users), the simpler rsync method may be used with reasonable assurance of dependability.

Database Performance

A method for monitoring database performance should be implemented. One simple tool available to do this is "mytop" for MySQL which allows administrators to view DB stats in real-time. Another method is to set up graphs and thresholds using tools such as Cacti or an MRTG-like tool.

Database Scalability

When defining the database structure, consideration should be given to future needs and growth, but keep in mind that this comes at a price: the declared width of a field can cost memory regardless of what data actually resides in it. In MySQL, for example, a varchar field stored on disk occupies only the characters entered plus a small length prefix, but fixed-width types such as char, and tables held in memory, reserve the full declared width. If a field is configured with a width of 75 and the most you ever enter into that field is 3 characters, then defining it with a width of 3 can therefore save memory. However, be aware that once the width is reduced to 3, an attempt to store more than 3 characters will either be rejected with an error or be truncated with a warning, depending on the server's SQL mode. This is important to note for any application developers who may work with the database.

Database Backups

A backup and restoration plan should be implemented in the event of a failure. If using the recommended RSYNC method, simply rebuilding the database from an offline copy will suffice.


NMS Inter-Communications

The goal of this section is to help determine how information should be transmitted between various NMS elements, what pre- and post-processing tools may be needed, how often the information should be exchanged, and how and when to log any errors generated by this exchange of information.

Communication between multiple Network Management Systems is extremely important for event correlation and data aggregation. Optimally, NMS’s should be able to communicate bi-directionally. This ensures the ability to provide correlated events as well as the coordination of data sources throughout the network such as inventory, access, performance data, etc.

Interfacing with various NMS’s is possible in many forms. Some software vendors may provide a direct API into their systems while other systems may require, for example, a script that runs on a pre-determined time basis to compare inventory between two systems.

The following section gives examples of some generic applications and how they might communicate with their downstream or upstream peers. Each topic is split into two subsections: Role, which describes what role the NMS should play in the architecture, and Communication, which provides a recommended method for communicating with other systems in the architecture.

{PRODUCT1}

Role

{PRODUCT1} will be used as the Master Poller. It will be set up to discover network elements on a predetermined basis and report the information it finds to a central database. A script will be written to dump this information and import it into the database at regular intervals. It is imperative that this information be as accurate as possible since it will feed information to every other NMS system throughout the enterprise.

Communication

All inventory information contained within the {PRODUCT1} Database should be exported to the Information DB (see below) on a predetermined schedule. The designated {PRODUCT1} Administrator will need to provide some method for obtaining this information whether it is from an SQL call or a simple text file dumped to disk, etc. Once this information is exported, a method can be defined to import the information into the Information Database.


Information Database

Role

This database, also referred to as the Data Store, is used to provide a centralized method for storing all information gathered by multiple systems and will act as a "Parent" to all other NMS systems. Priorities (costs) will need to be defined for each NMS system that stores information here. In the event of a conflict, the information collected from the NMS system with the lowest cost value (LCV) will take precedence.

Communication

Getting information into the database will vary based on which NMS system is being communicated with. One method would be to devise a modular Perl script so that each NMS system inserting data into the database uses the same base template and adds extra information for the parts that it needs. For example:

  • Main Perl script sets up .csv parsing
  • Main Perl script defines DB communication
  • NMS’s Perl template passes information into the main Perl script with additional information on what DB tables it should have, what fields are in the csv file, etc.
  • A script should be set up between this Data Store and the MP ({PRODUCT1}) that compares inventory in a bi-directional manner so that any information (e.g. devices) missing from {PRODUCT1} can be found and inserted into the {PRODUCT1} server for polling.
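
The modular approach described above could look roughly like the following sketch. It is written in Python rather than Perl, and it uses the built-in sqlite3 module as a stand-in for the real database. The function name `import_csv`, the template layout, the table name and the sample export are all hypothetical.

```python
import csv
import io
import sqlite3

def import_csv(conn, csv_text, template):
    """Shared base routine: parse a CSV export and insert its rows.

    template is the per-NMS part: target table, CSV field names,
    NMS name and cost. Table and field names are interpolated into the
    SQL text, so the template must come from trusted configuration.
    """
    fields = template["fields"]
    columns = fields + ["nmsname", "cost", "entrydate"]
    placeholders = ", ".join(["?"] * (len(fields) + 2) + ["datetime('now')"])
    sql = (
        f"INSERT INTO {template['table']} ({', '.join(columns)}) "
        f"VALUES ({placeholders})"
    )
    reader = csv.DictReader(io.StringIO(csv_text))
    rows = [
        tuple(row[f] for f in fields) + (template["nmsname"], template["cost"])
        for row in reader
    ]
    conn.executemany(sql, rows)
    conn.commit()
    return len(rows)

# Hypothetical template for one NMS; names and cost are illustrative only.
LMS_TEMPLATE = {
    "table": "lms",
    "fields": ["ip", "hostname", "devid"],
    "nmsname": "lms-server",
    "cost": 30,
}

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE lms (ip TEXT, hostname TEXT, devid TEXT, "
    "nmsname TEXT, cost INTEGER, entrydate TEXT)"
)
export = (
    "ip,hostname,devid\n"
    "192.0.2.10,core-rtr-1,dev-001\n"
    "192.0.2.11,core-rtr-2,dev-002\n"
)
print(import_csv(conn, export, LMS_TEMPLATE))  # 2
```

Each additional NMS would then need only its own template, while CSV parsing and database access stay in one shared routine.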

CiscoWorks LMS

Role

CiscoWorks LMS will be used for inventory validation, configuration validation and software deployment. Because LMS has its own inventory collection engine (ANI), it may be used to feed data into the Data Store and should be assigned a low cost.

Communication

The LMS DCR (Device and Credential Repository) will pull its inventory and device credentials from the Information DB. Once it has obtained a list of devices and credentials, it should then poll the devices directly and update the Data Store with any new, updated or additional information it has obtained.


{PRODUCT2}

Role

{PRODUCT2} is used primarily for configuration auditing and management. It will act as a Child node to the Data Store but may also be used to supply information back into the Data Store.

Communication

{PRODUCT2} will obtain inventory and credentials from the Data Store. A script should be run on a daily basis to compare {PRODUCT2}’s inventory against the Data Store to ensure accuracy.

{PRODUCT3}

Role

{PRODUCT3} is used to gather and report on performance metrics obtained from devices throughout the enterprise. It will act as a Child node.

Communication

{PRODUCT3} will obtain its inventory directly from {PRODUCT1} as the two are integrated.


{PRODUCT5}

Role

{PRODUCT5} is a network simulation/modelling tool used to determine possible problems in the network.

Communication

{PRODUCT5} will obtain information from CiscoWorks LMS. It will act as a Child only and will not update information back to any databases.

{PRODUCT6}

Role

{PRODUCT6} will obtain information from the Data Store and will act as a Child node.

Communication



CiscoSecure ACS

Role

ACS will obtain information from the Data Store and will act as a Child node.

Communication

Initially, ACS may be used to identify devices possibly missing from inventory. A script should be written to compare what the Data Store has to what is in the ACS database. As the architecture matures, the Data Store will be depended upon to feed device inventory into ACS.

Syslog

Role

Syslog may be used to identify devices missing from inventory. It should play an important role in verifying that the systems which are set up to discover and store information (Master Poller and Data Store) are indeed getting everything. This is valuable since messages coming into the syslog server are from network elements that are obviously available.

Communication

A script should be written to compare which devices the Data Store has to which ones are in the Syslog database. Messages from devices not in the Data Store should then be investigated to determine if the Data Store needs to be updated. Once the missing devices are added to the Data Store, the {PRODUCT1} Poller should be alerted to their presence and begin polling for them.
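
A comparison of this kind reduces to a set difference. The sketch below assumes that device names have already been extracted from the Data Store and from the syslog database into two plain lists; the function name and the sample hostnames are hypothetical.

```python
def compare_inventory(data_store_devices, syslog_sources):
    """Return (missing_from_store, silent_in_syslog) as sorted lists.

    Names are normalized to lower case so that differences in
    capitalization are not reported as missing devices.
    """
    store = {d.strip().lower() for d in data_store_devices}
    seen = {s.strip().lower() for s in syslog_sources}
    return sorted(seen - store), sorted(store - seen)

data_store = ["core-rtr-1", "core-rtr-2", "edge-sw-1"]
syslog = ["core-rtr-1", "EDGE-SW-1", "lab-sw-9"]
missing, silent = compare_inventory(data_store, syslog)
print(missing)  # ['lab-sw-9']
print(silent)   # ['core-rtr-2']
```

The first list holds candidates for investigation and possible addition to the Data Store. The second list, devices known to the Data Store that have sent no syslog messages, may point to devices that still need syslog configured.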


Operational Dependencies

This section details network management leading practices that apply to the strategies provided in the previous sections. Your environment may already be partially or fully compliant with some of the leading practices listed.

Auditing

It is extremely important to put in place a process for auditing all NMS systems monthly or more often. These audits should verify, at a minimum:

  • Device Inventory
  • SNMP Passwords
  • Reachability

Device Reachability

A device inventory system is only as good as the polling engine it depends upon. If, for some reason, any of the following functions are unavailable, devices will, according to the database, not exist. Steps must be taken to minimize the risks listed below.

  • If a device is blocked by a firewall, ACL or some other mechanism, steps must be taken to allow components of the Information Management Architecture to reach it.
  • SNMP Configuration – If a device lacks basic SNMP configuration, there is no way for the system to know it exists. It is imperative that all new hardware be configured with SNMP enabled.

Inventory Synchronization

Inventory synchronization is essential to a successful Information Management Architecture deployment. A system of checks and balances must be put in place to ensure that no devices are missing from the Master Poller. This can be done by correlating information between systems such as the Data Store and the Master Poller ({PRODUCT1}). Syslog may also be used to identify devices missing from the Master Poller, and the Master Poller's inventory may in turn be verified against syslog. Syslog is an excellent source for one simple reason: if a device is speaking to the syslog server, it obviously exists. Additionally, auditing on a regular basis will help ensure that this process remains intact.

Note: As syslog provides a valuable data source for inventory correlation, all devices in the network should be configured to send syslog messages to the syslog server.

Inventory Conflicts

A method must be set up to verify that a device appearing in inventory should, in fact, be polled by the MP (not every device discovered in the enterprise is necessarily managed by this group). In the event of a conflict, the cause should be investigated and determined so that similar conflicts can be eliminated or avoided in the future.

Inventory Reporting

The scope of this architecture assumes that any reporting requirements will be satisfied through the use of third-party applications such as Crystal Reports, which allow ad hoc reports to be run against the database.

Inventory Adds, Deletes and Changes

A process needs to be developed to provide intelligence between systems that monitors when a device is added, removed or changed in each NMS. When one NMS detects an update, it should notify the Data Store of this change. The Data Store should then first verify the change and then update the Master Poller. Once verification is completed, the Data Store will make the updated information available to each of the remaining NMS systems.
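
One simple way for an NMS to detect adds, deletes and changes is to compare two inventory snapshots. The sketch below assumes each snapshot is a dictionary keyed by device ID; the function name, field names and sample values are hypothetical, and the notification of the Data Store itself is left out.

```python
def diff_inventory(previous, current):
    """Compare two {devid: attributes} snapshots.

    Returns three sorted lists of device IDs: added, removed, changed.
    """
    added = sorted(current.keys() - previous.keys())
    removed = sorted(previous.keys() - current.keys())
    changed = sorted(
        devid
        for devid in current.keys() & previous.keys()
        if current[devid] != previous[devid]
    )
    return added, removed, changed

previous = {
    "dev-001": {"softrev": "12.2(33)"},
    "dev-002": {"softrev": "12.4(1)"},
}
current = {
    "dev-001": {"softrev": "12.2(44)"},
    "dev-003": {"softrev": "15.0(1)"},
}
added, removed, changed = diff_inventory(previous, current)
print(added)    # ['dev-003']
print(removed)  # ['dev-002']
print(changed)  # ['dev-001']
```

Each of the three lists would then be reported to the Data Store for verification before the change is passed on to the Master Poller and the remaining systems.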

Hardware Considerations

Providing redundant hardware is an important aspect of business continuity. A minimum of two servers should be used for the Data Store. These servers should be located in geographically dispersed locations and should be set up to synchronize data between each other as often as realistically possible. Local disks should also be set up with redundancy in mind by utilizing RAID functionality.

Software Considerations

A setup known as LAMPS may be used to build the necessary components of this server. LAMPS is an acronym for Linux, Apache, MySQL, PHP and SSL.

A set of recommended software is listed below:

Linux

Apache

MySQL

  • Open Source – MySQL v4.x or 5.x

PHP

  • PHP - Version 5.x

Staffing Considerations

Because of the complexity of managing a system of this type, it is important to retain the proper staff to manage the server and the software. It will be necessary to designate a full-time staff member who is skilled in the following areas:

  • DBA – Database Admin
  • PHP – PHP skills would be helpful, but not mandatory if the web interface to the database is built using another medium.
  • Perl – Perl skills will be necessary for interfacing with many of the NMS’s
  • Linux Administrator – A UNIX/Linux admin will be needed to maintain the operating system software.

Development Methodology

A standard development approach should be followed to maximize the benefit to Information Services and Support as well as its underlying customer base. A sample development methodology may include the following:

  • Management approval of the high-level Information Management Architecture, its underlying components, goals and integration requirements
  • A list of development and integration goals of the IMA system to establish development priorities, their benefits to workflow, as well as identifying the difficulty, timeline and resource requirements for each development effort
  • An approved list of initial IMA development requirements (IMA release 1.0)
  • Customer involvement, including testing, acceptance and input to future system improvements
  • Definition and documentation of intersystem communication
  • A gap analysis to determine any additional hardware or software requirements to support IMA 1.0
  • The consideration of a development environment for the initial release and subsequent release of the IMA system
  • Appropriate level of documentation of integration and development efforts
  • Support Matrix and underlying service plan including disaster planning and recovery techniques

Future Goals/Possibilities

Once the initial NMS-IMA system has been successfully released, a second development phase should be undertaken to consider adding a top layer of intelligence to systems that tie into the data store. Automations can be set up that will greatly enhance the end user experience and productivity. An example of this is the web interface that provides user access to the Data Store. When a device attribute is modified by a web user, the database can communicate that change back to all other NMS’s so that edits only need to take place in one system. Also, new devices could be added directly from this interface which would spawn a process for the poller to gather more information about the devices and return that data to the database (which in turn updates all other NMS’s).
