Best Practices in High Availability Cluster MultiProcessing

White Paper Published By: IBM

IBM HACMP supports a wide variety of configurations, and provides the cluster administrator with a great deal of flexibility. With this flexibility comes the responsibility to make wise choices. This paper discusses the choices that the cluster designer can make, and suggests the alternatives that make for the highest level of availability.



Tags : 
high availability, backup, recovery, utility computing, network management, ibm, backup and recovery

IBM
Published:  Feb 25, 2008
Type:  White Paper
Length:  24 pages


HACMP
HIGH AVAILABILITY CLUSTER
MULTIPROCESSING
BEST PRACTICES
July 2007
Alex Abderrazag
Version 02.00

Table of Contents
I. Overview
II. Designing High Availability
    Risk Analysis
III. Cluster Components
    Nodes
    Networks
    Adapters
    Applications
IV. Testing
V. Maintenance
    Upgrading the Cluster Environment
VI. Monitoring
VII. HACMP in a Virtualized World
    Maintenance of the VIOS partition - Applying Updates
VIII. Summary
IX. References
X. About the Authors
HACMP Best Practices
WHITE PAPER
Overview
The IBM High Availability Cluster Multiprocessing (HACMP™) product was first shipped in 1991 and is now in its 14th release, with over 60,000 HACMP clusters in production worldwide. It is generally recognized as a robust, mature high availability product. HACMP supports a wide variety of configurations, and provides the cluster administrator with a great deal of flexibility. With this flexibility comes the responsibility to make wise choices: there are many cluster configurations that are workable in the sense that the cluster will pass verification and come on line, but which are not optimal in terms of providing availability. This document discusses the choices that the cluster designer can make, and suggests the alternatives that make for the highest level of availability*.
Designing High Availability
"A fundamental design goal of (successful) cluster design is the elimination of single points of failure (SPOFs)."
A High Availability Solution helps ensure that the failure of any component of the solution, be it hardware, software, or system management, does not cause the application and its data to be inaccessible to the user community. This is achieved through the elimination or masking of both planned and unplanned downtime. High availability solutions should eliminate single points of failure (SPOF) through appropriate design, planning, selection of hardware, configuration of software, and carefully controlled change management discipline.
While the principle of "no single point of failure" is generally accepted, it is sometimes deliberately or inadvertently violated. It is inadvertently violated when the cluster designer does not appreciate the consequences of the failure of a specific component. It is deliberately violated when the cluster designer chooses not to put redundant hardware in the cluster. The most common instance of this is when cluster nodes are chosen that do not have enough I/O slots to support redundant adapters. This choice is often made to reduce the price of a cluster, and is generally a false economy: the resulting cluster is still more expensive than a single node, but has no better availability.
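The effect of a remaining single point of failure can be illustrated with simple availability arithmetic. The sketch below is not taken from the paper: it assumes independent failures, and the availability figures and the names `NODE` and `SPOF` are hypothetical. Components that must all be up multiply their availabilities; redundant components fail only when every copy is down.

```python
def series(*availabilities):
    """Availability of components that must ALL be up (independence assumed)."""
    result = 1.0
    for a in availabilities:
        result *= a
    return result


def parallel(*availabilities):
    """Availability of redundant components where ANY one being up suffices."""
    all_down = 1.0
    for a in availabilities:
        all_down *= (1.0 - a)
    return 1.0 - all_down


# Hypothetical figures, chosen only for illustration.
NODE = 0.99   # availability of one node
SPOF = 0.99   # availability of one non-redundant component

single_node = series(NODE, SPOF)
cluster_with_spof = series(parallel(NODE, NODE), SPOF)
cluster_redundant = series(parallel(NODE, NODE), parallel(SPOF, SPOF))

print(f"single node:           {single_node:.6f}")
print(f"two nodes, one SPOF:   {cluster_with_spof:.6f}")
print(f"two nodes, no SPOF:    {cluster_redundant:.6f}")
```

Under these assumptions the two-node cluster with one non-redundant component can never be more available than that component alone, however many nodes are added. Only making the component redundant lifts that ceiling.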
A cluster should be carefully planned so that every cluster element has a backup (some would say two of everything!). Best practice is to do this planning with either the paper or the online planning worksheets, and to save them as part of the ongoing documentation of the system. Fig 1.0 provides a list of typical SPOFs within a cluster.
"... cluster design decisions should be based on whether they contribute to availability (that is, eliminate a SPOF) or detract from availability (gratuitously complex)."
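As a minimal sketch of the "every cluster element has a backup" rule, a planning inventory can be checked mechanically for elements that exist only once. The inventory below and the function name `find_spofs` are hypothetical; they are not part of HACMP or its planning worksheets.

```python
def find_spofs(inventory, minimum=2):
    """Return the names of elements with fewer than `minimum` instances."""
    return sorted(name for name, count in inventory.items() if count < minimum)


# Hypothetical planning inventory: element name -> number of instances.
inventory = {
    "node": 2,
    "power source": 2,
    "network adapter per node": 1,
    "network": 2,
    "disk adapter per node": 2,
}

for element in find_spofs(inventory):
    print(f"single point of failure: {element}")
```

A check of this kind only counts instances. It cannot tell whether two instances are truly independent, for example two adapters attached to the same switch, so it complements the planning worksheets rather than replacing them.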
* This document applies to HACMP running under AIX®, although general best practice concepts are also applicable to HACMP running under Linux®.
Fig 1.0 Eliminating SPOFs
Risk Analysis
In practice, however, it is sometimes not feasible to eliminate every SPOF within a cluster. Examples may include the network ¹ and the site ². Risk analysis techniques should be used to determine which SPOFs simply must be dealt with and which can be tolerated. One should:
- Study the current environment. An example would be that the server room is on a properly sized UPS but there is no disk mirroring today.
- Perform requirements analysis. How much availability is required and what is the acceptable...
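One common way to separate the SPOFs that must be dealt with from those that can be tolerated is a likelihood-times-impact score. The sketch below uses that generic technique, not a method described in the paper. The 1 to 5 scales, the threshold, the sample scores, and the names `Risk` and `triage` are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Risk:
    name: str
    likelihood: int  # 1 (rare) to 5 (frequent), hypothetical scale
    impact: int      # 1 (minor) to 5 (severe), hypothetical scale

    @property
    def score(self) -> int:
        return self.likelihood * self.impact


def triage(risks, threshold=10):
    """Split risks into (must_fix, tolerate) name lists, highest score first."""
    ranked = sorted(risks, key=lambda r: r.score, reverse=True)
    must_fix = [r.name for r in ranked if r.score >= threshold]
    tolerate = [r.name for r in ranked if r.score < threshold]
    return must_fix, tolerate


# Hypothetical assessment of an environment like the one described above.
risks = [
    Risk("no disk mirroring", likelihood=3, impact=5),
    Risk("single site", likelihood=1, impact=5),
    Risk("single network", likelihood=2, impact=4),
]

must_fix, tolerate = triage(risks)
print("must be dealt with:", must_fix)
print("can be tolerated:  ", tolerate)
```

The threshold is a business decision rather than a technical one. It would normally come out of the requirements analysis, which sets how much availability is required.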
