ITIL Service Management: June 2007

Jun 21, 2007

MS SC Service Manager Beta 1 is Here

Is Microsoft a serious new player in the Service Management field?

I had the pleasure of downloading and installing Beta 1 of Microsoft System Center Service Manager last week. I am not a fan of MS, or of any other tools vendor, so this post is not an ad. Since I have some experience in ITSM and have seen, implemented and developed some tools, I have something to say here.

Service Manager is slow and the forms are ugly. It is slow because it is in Beta 1 and running on sub-optimal code. The forms are ugly because it is in Beta 1: they are very simple, unpolished InfoPath forms.

Other than that, I look forward to this product, because it seems to me that it is based on firm theory (ITIL/MOF/DSI), relies on mature Microsoft technologies (SQL Server, SharePoint, .NET, Office) and will give a high level of integration with other System Center products, like Ops Manager and Configuration Manager.

The console resembles the Outlook-like console of Operations Manager 2007, so a short learning curve and quick adoption are expected.

Functionality is kept mostly in Solution Packs, similar to Management Packs in Ops Manager. So everything, from forms, workflows, reports, data models etc., is in a solution pack. For now, Incident Management and Change Management solution packs are available, along with a part of a future Asset Management solution pack. Reporting is performed on MS SQL Server Reporting Services. Most of the forms in all processes can be customized, fields added, data model extended.

Incident Management is planned to be ITIL compliant out of the box, so there are a lot of customizing parameters available: queues, drop-down values, locations, org. structure etc.

Change Management is also ITIL based, and seems decently engineered. As I said, forms are a bit slow, but in Beta 1 that should be expected.

Asset Management module is implemented for now as a simple inventory, a container for the connected SMS sync data.

Reporting is rudimentary for now.

In Beta 1 there are only two connectors: AD and SMS. So one can import users from AD and assets from SMS.

The Customer Portal is on SharePoint 2007 and it looks very sexy and promising. Sincerely, one of the best-looking customer portals I have seen.

Beta 2 is planned for Q3 2007. It will have improved Incident and Change Management solution packs and a functional Asset Management. Additional connectors for Ops Manager and Configuration Manager will be available. Also, a larger set of reports.

This version satisfied my expectations, and I look forward to Beta 2. It seems to me that Service Manager has the potential to be very serious competition to HP OpenView Service Desk. As I said earlier, good for us ;-)

Jun 17, 2007

CMDB - What You Need To Know

"Take care of the luxuries, and the necessities will take care of themselves." -- Dorothy Parker, an American writer and poet

Much is said about the Configuration Management Database in IT information circles and literature. More or less, everyone agrees that the CMDB is some kind of hub of all ITSM processes, but opinions differ on the scale of its importance, from "nice to have because ITIL says so" to "first thing you have to do, without it nothing makes sense". Well, I am with this second extreme bunch.

ITIL is full of details and descriptions, but without a direct prescription of how things should be done. Data model, relations, APIs? I don't blame them: technology is moving fast and the things that will be possible tomorrow are imaginable by fewer and fewer people, so this vagueness in ITIL comes with a good measure of direct guidelines and freedom of interpretation. After all, CMDB has become a kind of religion, or even better, a parallel to the Loch Ness monster story: everyone is talking about it, and very few people have witnessed it. To some extent, one can agree with the IT Skeptic that it can't be done, by the very definition of the Configuration Management goal:

"account for all the IT assets and configurations within the organization and its services"

ALL IT assets? What have you been smoking lately? There is a sniff of inconsistency here, since ITIL also talks about setting the level of CI control. So, what is it, ALL or SOME assets? Mouse, keyboard and web camera, or my precious USB mug heater, are they assets? Of course, we can take a closer look at this definition, and read it backwards. ITIL is business oriented, and makes an attempt (however large, still feeble) to put the CUSTOMER in the focus of poor, over-worked, sex-deprived IT people. Not an easy task.

ITIL focuses on introducing order in IT, as a function of serving BUSINESS. ITIL mentions ALL assets, but in this context ALL means "as much as you irresponsible people can manage, and focus on those that are important for the business services to the ladies in the Accounting dept, or whatever dept. you signed your SLA with, without real understanding of what SLA really means."

Every IT organization starting to deal with services quickly learns that it is not about Service Desk, Incident or Problem Management. It is WHAT you have, WHERE it is, WHO is responsible, HOW it works, WHEN it is going to die, and in WHICH way it supports our business. And approximately in that order. To every sensible IT guy it is obvious that CMDB can't be done, but that's something we HAVE to try to do. 30% is infinitely better than nothing, and you can live with 70% and prosper for a long time. Take care of the luxuries first: cover the assets most important for your services, and other CIs will fall into place with time and effort. A large US pharmaceutical company that implemented a new network discovery tool found out that they had some 10,000 more network assets than they knew about. Had they survived somehow before that discovery? Of course. Have they found valuable resources and knowledge about their infrastructure that will help them be more agile and make better decisions in the future? You bet.

Another thing. ITIL is IT oriented, and it covers basic best practices of IT organizations. Implementing ITIL is a first step in introducing organized structure to IT from the bottom end, the base of the pyramid. Above it, on the road to a better understanding of business, stand methodologies which are not exclusively IT oriented and which serve to connect IT to business even more, like BSC, SOX, COBIT, and various project, work and performance management methodologies. ITIL's task is to introduce the first enablers of customer satisfaction. ITIL asks IT to first form knowledge of its own business (CMDB) and then agree with the Business on the best way to support it (SLM).

Views on CMDB and CI relations

I will discuss SLM and SLA later, and the importance of Service Catalogs. For now, let me point out the evolution of structured thought: CMDB enables all 10 ITSM processes. It contains crucial information on CIs. That info is our starting point when we define basic components and IT services. Only then are we able to introduce IT to the Business and really listen to its requests. That's why the Service Catalog resides in the CMDB, or ON the CMDB, because it comes from the opposite direction. CMDB thinks bottom-up, and Service is defined top-down. So it is a crucial link from the CEO to your network switch, from IT to Business. That's why, in implementing ITIL, common sense is to start with Service Desk, Configuration and Incident Management, and Service Level Management in parallel.

A number of possible problems can occur in the life of a CMDB, aside from the obvious ones and those mentioned in the ITIL Blue Book. Integration with other ITSM tools is a very important one. I have seen some IT companies implement a stand-alone CMDB. I am not sure what the purpose of that is. Also, CMDB deals with data that naturally or historically reside in other applications, like ERP, HRM, Document Management, Asset Management... There is always a grey area or overlap, as some data wants to live in more than one place.

The main rule is: if there is a legal, financial, technical or strong cultural reason for a type of data to be authoritative in another system, then the data is periodically mirrored (hourly or daily) to the CMDB. Some data will be synchronized inbound to the CMDB (ERP people are particularly sensitive about their data) and some will be synchronized both ways, like contact data (mail, phone etc.) to and from the Human Resources Management application. Some data will eventually be synched outbound, to the Asset Management application for example.

A lot can go wrong in a CMDB implementation. But, based on all the above and some more, having a CMDB that conforms to ITIL recommendations, common sense and your business needs is one of the hardest and most rewarding projects that an IT organization can perform.
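The sync rule above can be written down as a small plan that records, per source system and data type, which way the data flows and how often. This is only a sketch: the system names, data type names and schedules in `SYNC_PLAN` are hypothetical examples made up for illustration, not taken from any particular tool.

```python
from enum import Enum


class SyncDirection(Enum):
    INBOUND = "inbound"    # other system is the master, CMDB only mirrors it
    OUTBOUND = "outbound"  # CMDB is the master, other system receives copies
    BOTH = "both"          # either side may change the data


# Hypothetical sync plan: (system, data type) -> (direction, schedule).
SYNC_PLAN = {
    ("ERP", "financial_data"): (SyncDirection.INBOUND, "daily"),
    ("HRM", "contact_data"): (SyncDirection.BOTH, "hourly"),
    ("AssetManagement", "asset_records"): (SyncDirection.OUTBOUND, "daily"),
}


def cmdb_may_write(system, data_type):
    """Return True if a change made in the CMDB may be pushed to the other system."""
    direction, _schedule = SYNC_PLAN[(system, data_type)]
    return direction in (SyncDirection.OUTBOUND, SyncDirection.BOTH)


def cmdb_may_read(system, data_type):
    """Return True if the CMDB mirrors this data from the other system."""
    direction, _schedule = SYNC_PLAN[(system, data_type)]
    return direction in (SyncDirection.INBOUND, SyncDirection.BOTH)
```

Keeping the direction explicit per data type is the point: it makes it visible that the CMDB never writes back to the ERP, while contact data can be corrected on either side.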

Jun 6, 2007

Configuration Management Basics

Mission
To identify, record and report on configuration items and their relationships that underpin IT services.

Goals

To account for all IT assets, configurations and services within organization
To provide accurate information and documentation on configurations and assets to other SM processes
To provide a sound basis for Incident, Problem, Change and Release management
To verify configuration record and correct exceptions

Configuration Management Mind Map

Definitions
Configuration Management: The process of identifying and defining Configuration Items in a system, recording and reporting the status of Configuration Items and Requests for Change, and verifying the completeness and correctness of Configuration Items.
CMDB - Configuration Management Database: database which contains details about the attributes and the history of each CI and details of the important relationships between CIs.
CI - Configuration Item: basic CMDB element. Component of an infrastructure that is under the control of Configuration Management. CIs may vary widely in complexity, size and type, from an entire system (including all hardware, software and documentation) to a single module or a minor hardware component.
DSL - Definitive Software Library: a physical library or storage repository which contains authorized master copies of software versions. May consist of one or more physical software libraries or filestores, and can include physical store to hold master copies of purchased software, disaster safe.
DHS - Definitive Hardware Store: secure storage of definitive hardware spare components and assemblies. Details of these components and their builds and contents should reside in CMDB. DHS items can be used on demand for additional systems or in the recovery from major Incidents.
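To make the CMDB and CI definitions a bit more tangible, here is a minimal in-memory sketch of CIs with attributes, a change history and relationships. The class names, the `depends_on` relation name and the sample CI ids in the usage note are assumptions chosen for illustration; ITIL itself does not prescribe a data model.

```python
from dataclasses import dataclass, field


@dataclass
class ConfigurationItem:
    ci_id: str
    ci_type: str
    attributes: dict = field(default_factory=dict)
    history: list = field(default_factory=list)  # (attribute, old value, new value)


class CMDB:
    def __init__(self):
        self.items = {}
        self.relations = []  # (source_id, relation, target_id)

    def add(self, ci):
        self.items[ci.ci_id] = ci

    def relate(self, source_id, relation, target_id):
        self.relations.append((source_id, relation, target_id))

    def update_attribute(self, ci_id, name, value):
        """Change an attribute and keep the old value in the CI history."""
        ci = self.items[ci_id]
        ci.history.append((name, ci.attributes.get(name), value))
        ci.attributes[name] = value

    def impacted_by(self, ci_id):
        """Return ids of all CIs that depend, directly or indirectly, on ci_id."""
        impacted = set()
        pending = [ci_id]
        while pending:
            current = pending.pop()
            for source, relation, target in self.relations:
                if relation == "depends_on" and target == current and source not in impacted:
                    impacted.add(source)
                    pending.append(source)
        return impacted
```

For example, if a hypothetical `payroll_service` depends on `hr_app`, and `hr_app` depends on `db_server`, then `impacted_by("db_server")` returns both `hr_app` and `payroll_service`, which is the kind of lookup that impact analysis of a failing CI relies on.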

Objectives

To provide accurate configuration information
To define and document processes and procedures
To identify, label and record CI
To control and store authorized specifications documentation and software
To report on status and history of CIs
To record changes to CI in a timely way
To audit physical items and reconcile any differences between them and the CMDB
To educate and train in control processes
To produce metrics on CI, changes and releases
To audit and report on exceptions to standards and procedures

Process

Planning
Identification
Control
Status accounting
Verification and auditing

Benefits

Accurate information on CIs and their documentation
Controlling valuable CIs
Adherence to legal obligations
Financial and expenditure planning
Making software Changes visible
Contributing to contingency planning
Supporting and improving Release Management
Improving security by controlling the versions of CIs in use
Enabling the organisation to reduce the use of unauthorised software
Allowing the organisation to perform impact analysis and schedule Changes safely, efficiently and effectively
Providing Problem Management with data on trends

Possible Problems

Wrong CI detail level
Inadequate initial analysis and design
Configuration Management implemented in isolation
Lack of commitment to maintaining accuracy

Critical Success Factors

Managing Configuration Item information
Providing capability to perform risk analysis of changes and releases

Key Performance Indicators
Managing CI Information

Number of CIs logged and tracked
Number of CIs with attribute failures
Number of changes to CI attributes
Number of additional CIs
Number of deletions of CIs
Number and frequency of exceptions in configuration audits

Providing Capability To Perform Risk Analysis Of Changes and Releases

Number of incidents caused by inaccurate configuration data
Percentage of Services tracked with CIs versus known products and services

Jun 4, 2007

Problem Management Facts

Problem Management process can be roughly defined by a definition of Problem:

Problem is the unknown cause of one or more incidents, often identified as a result of multiple similar incidents. It will become a Known Error when the Root Cause is known and a temporary Workaround or a Permanent Fix has been identified.
Problem Control scope is transforming Problems into Known Errors.
Error Control deals with resolving Known Errors via the Change Management process.
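The definition above is essentially a small state rule, which can be sketched as follows. The record fields are hypothetical names picked for this illustration; a real tool will have its own schema.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ProblemRecord:
    title: str
    root_cause: Optional[str] = None
    workaround: Optional[str] = None
    permanent_fix: Optional[str] = None

    @property
    def status(self):
        """A Problem becomes a Known Error once the root cause is known
        and a workaround or a permanent fix has been identified."""
        cause_known = self.root_cause is not None
        has_remedy = self.workaround is not None or self.permanent_fix is not None
        return "Known Error" if cause_known and has_remedy else "Problem"
```

Problem Control is the work of filling in `root_cause` and a remedy; from the moment `status` reads "Known Error", Error Control takes over and drives the resolution through Change Management.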

Goal
To minimize the adverse impact of Incidents and Problems on the business that are caused by errors within the IT Infrastructure, and to proactively prevent recurrence of related Incidents, Problems and errors.

Problem Management Mind Map

Inputs

Incident details
CMDB details
Incident Management workarounds

Activities
a. Problem control

Problem identification and recording
Problem classification
Problem investigation and diagnosis

b. Error control

Error identification and recording
Error Assessment
Recording error resolution
Error closure
Monitoring resolution progress

c. Major Incident resolution assistance
d. Proactive PM

Trend Analysis
Targeting Support Action
Providing info to organization

e. Obtaining management information
f. Completing major Problem reviews
Keep in mind: The function of Problem Control is to transform Problems into Known Errors. Error Control resolves Known Errors through Change Management.

Incident Matching

Outputs

Known errors
RFCs
Updated and Closed problems
Updates to Incidents
Management Information

Benefits

Improved IT services
Reduced number of avoidable incidents
Reduced impact on service availability
Quicker recovery
Permanent solutions
Improved organizational learning
Improved customer productivity
Learn from past experiences
Good reputation for IT department
Greater control of IT services through management information
Improved first-call fix rate

Possible Problems

Lack of management and staff commitment
Ineffective Incident Management Process
No reliable historical data for trend analysis
Limited integration – incidents, problems, errors
Limited integration with development
Insufficient time for proactive work
Bypassing the service desk
Inability to build effective knowledge base
Inaccurate assessment of business impact
Cultural difficulties

Critical Success Factors

Avoiding Repeated Incidents
Minimizing Impact Of Problems

Key Performance Indicators
a. Avoiding Repeated Incidents

Number of repeat incidents
Number of existing Problems
Number of existing Known Errors

b. Minimizing Impact Of Problems

Average time for diagnosis of Problems
Average time for resolution of Known Errors
Number of open Problems
Number of open Known Errors
Number of repeat Problems
Number of Major Incident/Problem reviews

Differences Between Incident and Problem Management
There is a potential conflict of interests between these two:
Incident Management wants to restore the service to an operative state ASAP, while the focus of Problem Management is to discover the unknown underlying cause of multiple incidents, and to resolve or prevent it, using the cream of the crop of IT people. Obviously this can often lead to some disagreements between the Incident Manager and the Problem Manager.

Implementation
Since Problem Management depends heavily on good Incident Management, the generally accepted recommendation is to implement Problem Management in parallel with or immediately after Incident Management. I strongly disagree here.

Problem Management is one of the most difficult Service Support processes to sell to the Management: it involves expensive resources with vague results, and Management is usually reluctant to buy it. If this is the case, Incident Management should be implemented and left alone for some time to gather a critical mass of information which can later be used to justify the introduction of Problem Management.

Also, it is known that some 20% of problems cause 80% of incidents, so develop Problem Management business processes with this fact in mind.

As for the IT tools for Problem Management support, have no worries: this is the easiest process to automate. Three to four forms and a very simple workflow; this process is more about the people and underlying technology, less about process automation. So probably any good ole tool will satisfy your needs, if it has a Problem Management module. Focus on automating other processes.
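Once Incident Management has gathered enough history, the 20/80 rule can be checked against your own data. The sketch below assumes each closed incident carries the id of the problem it was matched to (a hypothetical field), and picks the smallest set of most frequent problems that together account for a chosen share of incidents.

```python
from collections import Counter


def top_problems(incident_problem_ids, coverage=0.8):
    """Return the most frequent problem ids that together cover
    at least `coverage` of all incidents."""
    counts = Counter(incident_problem_ids)
    total = sum(counts.values())
    selected = []
    covered = 0
    for problem_id, incident_count in counts.most_common():
        if covered >= coverage * total:
            break
        selected.append(problem_id)
        covered += incident_count
    return selected
```

With ten incidents, six matched to problem P1, two to P2 and one each to P3 and P4, the function returns P1 and P2: half of the problems already explain 80% of the incidents, and those are the ones worth the expensive people first.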

Jun 1, 2007

Incident Priority - What Everyone Should Know

As ITIL defines it, Incident priority is primarily formed out of its Impact and Urgency. There are also additional elements, like size, scope, complexity and resources required for resolution.

So, most consultants recommend a simple matrix which will automatically calculate incident priority out of the simple value of Impact x Urgency.

The recommended granularity of Priority is 4 to 5 different values.
Usually the lower the value - the higher the Priority, thus Priority=1 is the highest one, and Priority=5 the lowest.
There are also some sophisticated methods for defining Priority, but eventually it all boils down to something like:

Picture: Standard Priority Matrix
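As a rough illustration of such a matrix, here is one commonly used 3 x 3 mapping expressed as a lookup table. The three levels per axis and the exact cell values are an assumption made for this sketch and are not necessarily those in the picture; adjust them to your own SLA definitions.

```python
# Hypothetical priority matrix: (impact, urgency) -> priority,
# where 1 is the highest priority and 5 the lowest.
PRIORITY_MATRIX = {
    ("high", "high"): 1,
    ("high", "medium"): 2,
    ("high", "low"): 3,
    ("medium", "high"): 2,
    ("medium", "medium"): 3,
    ("medium", "low"): 4,
    ("low", "high"): 3,
    ("low", "medium"): 4,
    ("low", "low"): 5,
}


def incident_priority(impact, urgency):
    """Look up the priority for the given impact and urgency levels."""
    return PRIORITY_MATRIX[(impact.lower(), urgency.lower())]
```

A lookup table is used here instead of a formula so that individual cells can be changed without touching any logic, which tends to happen as soon as the business starts reading its own SLA.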

Impact of the incident is the measure of how business critical it is. Since this is difficult to determine in the shoes of an overworked and underpaid 1st level operator, some simplifications are necessary here. So impact is usually directly proportional to the number of users affected by the incident. If an up-to-date CMDB is available, then it's easy to determine the affected users from the Business Service which suffers from a specific Configuration Item malfunction.

Urgency is the speed with which an incident needs to be resolved. Some incident management tools perform automatic calculations for Urgency based on Impact, the SLA and OLA involved etc. Nice feature if you have it. If not, a simple workaround would be to educate Service Desk operators on a regular basis and to inform them of the parameters of signed contracts. Incident Urgency for certain Services may vary in time (example: HR application during payroll calculation) and this additional complexity is easier to resolve by raising the staff awareness level than by implementing it in software tools.

A Major Incident is an incident with extreme impact on the business, or an excessive disruption of service. It will have Priority=1 and, depending on your SLA and support process definition, it usually has an additional attribute (a checkbox is fine) that says this is a Major or Hot incident. All key Support Staff members attend to the resolution of major incidents, and the effort is strongly supported by Problem Management.

Also, it should be said here that Incidents do not age gracefully. How aging is handled is defined in the escalation schemes, which are based on Service Level Agreement (SLA) targets. If an incident owner can't deal with that, his manager has to be notified in time.

Update: Incident Escalation

Just a few words on Incident escalations. Escalations are mechanisms that help us to resolve incidents on time. There are two major types:

Functional Escalation - reassigning incidents to a higher tier support group due to lack of expertise. Also, this can happen after a predefined time interval passes, in accordance with SLA.

Picture: Functional Escalation

Hierarchical Escalation - when a support employee can't deal with an incident, either due to lack of knowledge or insufficient time, his manager has to be informed in order to preserve SLA targets and customer satisfaction. In practice, this escalation type usually boils down to a simple notification to the manager.

Picture: Hierarchical Escalation
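Time-based escalation of both types can be sketched as a check against the share of the SLA resolution time already used. The two thresholds (50% for functional, 80% for hierarchical escalation) and the three support tiers are hypothetical values for illustration only; real ones come from your SLA and escalation schemes.

```python
from datetime import datetime, timedelta

# Hypothetical thresholds, expressed as a share of the SLA resolution target.
FUNCTIONAL_THRESHOLD = 0.5
HIERARCHICAL_THRESHOLD = 0.8


def escalation_actions(opened_at, now, sla_target, tier, max_tier=3):
    """Return the escalation actions due for an unresolved incident."""
    used = (now - opened_at) / sla_target  # share of the SLA time consumed
    actions = []
    if used >= FUNCTIONAL_THRESHOLD and tier < max_tier:
        actions.append(f"reassign to tier {tier + 1}")
    if used >= HIERARCHICAL_THRESHOLD:
        actions.append("notify manager")
    return actions
```

For an incident opened at 09:00 with a four-hour target and still sitting at tier 1, nothing is due at 09:30, a functional escalation to tier 2 is due at 11:00, and by 12:30 the manager has to be notified as well.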
