VMware VCB - Carahsoft


Operational Recovery and Disaster
Recovery Alternatives for VMware
Infrastructures

Rob Zylowski

Services Director


Virtualization and Director IP

April 2009

© 2001-2009 GlassHouse Technologies, Inc.

This material may not be reprinted or redistributed without the express written consent of GlassHouse Technologies, Inc.


Agenda


Operational Recovery


Introduction


Technologies


Strategies


Disaster Recovery


Introduction


Major Costs of DR


LUN Replication Alternative


Backup/Dedupe Alternative


VMware Site Recovery Manager


Alternative Strategies


Managing Multi-vendor Storage


Introduction - Operational Recovery


Definition of Operational Recovery


Mine


Something very important whose importance is often overlooked


Recovery of one or more applications and associated data to correct a
failure such as a corrupt database, user error or hardware failure,
within a datacenter.


Characteristics of Operational Recovery


Few organizations do it well


Can be complex, requiring many manual steps that take significant
amounts of time and resources


Often not well tested, which leaves staff uncertain of the
expected results


Should be built into products by the application developer, but often is not



Introduction - Operational Recovery


Benefits Virtualization Provides for Operational Recovery


With VMware HA and ESX redundancy, all systems get
quick local recovery from server and network hardware failures


Servers are encapsulated into a small number of files that can be backed
up and restored more easily than with physical servers


Entire servers can be backed up to disk for quick recovery


VMware Snapshots can be used before upgrades or significant system
changes and the system can be rolled back to the point of the snapshot
easily


Recovery is simplified, as is testing of recovery, because a VM can be
restored and mounted with no real network access


Can significantly lower RTO


Provides some challenges for RPO that can be ameliorated with
technology


Requirements to achieve benefits


Nearline Storage or SAN/NAS Snapshot Space


Significant amount of storage required for any online backup technology



Technologies for Operational Recovery


VMware VCB


Excellent Architecture


Tactically immature


Does not yet scale vertically


Script based - has integration issues


Does not work as well for Linux as Windows


Valuable when used to its strengths


Large number of files


Large file systems


Use it for what it’s good at and it will get better


Data Dedupe Targets and VTLs


Integrated with backup software; for example, many vendors, such as
Data Domain, now have Symantec OST support


Provides benefits for virtual and physical systems; real-life examples seem to
show reduction ratios of up to 20 to 1


Data created by people ("people work") reduces better than natural data, for example seismic data


Data that changes infrequently will also reduce more than frequently
changing data


Significantly simplifies recovery by eliminating tape and the RTO latency
that tape switching causes
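
The reduction ratio above translates directly into how much disk an online backup target needs. A minimal sketch of that arithmetic follows; the function name, the 10 TB environment and the 30 retained backups are illustrative assumptions, not figures from the presentation, and real ratios depend on data type and change rate.

```python
def backup_storage_tb(protected_tb: float, retained_fulls: int,
                      reduction_ratio: float) -> float:
    """Estimate disk needed to keep `retained_fulls` full backups online.

    `reduction_ratio` is the dedupe ratio, e.g. 20.0 for "20 to 1".
    All inputs are hypothetical planning numbers.
    """
    if reduction_ratio <= 0:
        raise ValueError("reduction_ratio must be positive")
    # Space the backups would occupy with no deduplication at all.
    logical_tb = protected_tb * retained_fulls
    return logical_tb / reduction_ratio


if __name__ == "__main__":
    # Hypothetical environment: 10 TB protected, 30 retained full backups.
    for ratio in (1.0, 5.0, 20.0):
        needed = backup_storage_tb(10, 30, ratio)
        print(f"{ratio:>4.0f}:1 -> {needed:6.1f} TB")
```

Running the same inputs at several ratios shows why the achieved ratio, not the advertised one, should drive the sizing.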


Technologies for Operational Recovery


Traditional Backup Vendors vs. Virtual Backup Vendors


Use depends on the “Best in Breed” versus “Framework Standards” debate


Most traditional products are becoming much more mature with VMs


Some vendor solutions are becoming very feature rich, especially when
integrated with dedupe, either hardware or software based


SAN/NAS Technologies


SAN/NAS Snapshots may be used for operational recovery, but without
an integrated backup application this can be difficult to manage


A LUN normally holds many virtual machines, making recovery of a single
VM from a snapshot challenging


Best used in conjunction with an integrated backup system; for example,
many SAN vendors now have Symantec OST support


FastScale


Shrinks VMs by managing OS configuration to only what is required


Makes backup requirements much smaller for Linux OSs




Alternative Strategies


Most Often Architected to Date


Backup Agent in VM for Most Backups


Matches Physical Server recovery standards


Use of VCB for large File Systems or File Systems with Millions of files


Service offering for Point in Time Image Backups kept for a period of time


Used for Upgrades or Major Change Rollback


Can be kept longer than Snapshots which affect performance over the long run


Change in Architecture Driven by Dedupe and Image Technologies


VCB & Point in Time Image Backups as above


Use Image Backup for applications that require very short RTO, i.e. < 4 hours


Image backup technologies becoming mature


At many organizations the percentage of systems virtualized is becoming very high, allowing for economies
of scale and a change of standards


Dedupe allows for increased number of online backup days


Replication of deduped backups is efficient over the WAN and fulfills the offsite storage requirement


Integrated into DR process


Moderate-sized implementations can move fully to image backups for VMs, but this is
still a challenge for very large organizations


Future Advancements



Changes due in vSphere


vStorage Data Protection APIs enabling backup vendors to
bypass VCB


No longer require VCB proxy


Scalable High Performance solution


Based on Virtual Appliance


Better support for Windows (VSS & File Level Restores) than Linux


Preprocessing SW based Dedupe


Can Integrate with HW Dedupe


Should make recovery of VMs simple and straightforward


Should perform much better than VCB


Significant IOPS increase: 3-4 times. WOW!


10 Gb Ethernet moving into architectures as prices fall


Enhanced Network Performance coming from Cisco and HP


Near wire Speed with 1000V and Nexus Switches



Introduction - Disaster Recovery


Definition for Disaster Recovery


Mine


Something everyone plans for and few actually do


Wiki (a good one)


Planning for resumption of applications, data,
hardware, communications (such as networking) and other IT
infrastructure.


The IT part of the greater “Continuity of Operations” which includes
much more than IT


Characteristics of DR Implementations


Some organizations do it well


Usually when the cost of a failure is very high


Most don’t


It’s not just storing backups on tapes offsite


Can be difficult to afford


Seen as insurance


Introduction - Disaster Recovery


Benefits of Virtualization for DR:


Virtualization can offer significant advantages for simplifying DR from a
technology process perspective


Entire servers can be copied/replicated between sites and easily
recovered


Can provide ubiquitous DR for all tiers


Can significantly lower RTO


Provides some challenges for RPO


Requirements to achieve benefits:


Significant bandwidth for replication


Significant investment in DR site infrastructure especially SAN and
replication software from SAN or Software vendors


Major Costs of DR


Server HW


Made affordable with virtualization at 25-30 to 1 consolidation


Storage


Tier 1 (DMX, HDS, etc.) replicated - very expensive


Tier 2 replicated - still very expensive


Tier 3 SATA/FATA based - more economical


Bandwidth


Gb speeds Regional - very expensive


Gb speeds Metro - moderately expensive


OC12 (about 600 Mb/s) Regional - very expensive


OC3 (about 150 Mb/s) Regional - expensive


Software


OS - Depends


Active / Active versus copies


VMware - Depends


If everything fails over, it’s expensive


If one live DC backs up another and Dev/Test is mixed with production, then it’s not


Applications - Depends


Staff and Development - Expensive


Datacenter space, power, cooling - Expensive
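
Whether a cheaper link is enough depends on how long the daily change takes to replicate. The sketch below estimates that time; it assumes nominal line rates of 155 Mb/s for OC3 and 622 Mb/s for OC12, and the 70% usable-throughput factor and the 200 GB daily change are hypothetical planning inputs, not measurements from the presentation.

```python
# Nominal line rates in megabits per second (assumed planning values).
LINK_MBPS = {
    "OC3": 155.0,
    "OC12": 622.0,
    "1 Gb": 1000.0,
}


def replication_hours(changed_gb: float, link_mbps: float,
                      efficiency: float = 0.7) -> float:
    """Hours needed to ship `changed_gb` gigabytes over a link.

    `efficiency` is the assumed fraction of the line rate that
    replication traffic actually achieves (protocol overhead, sharing).
    """
    if not 0 < efficiency <= 1:
        raise ValueError("efficiency must be in (0, 1]")
    megabits = changed_gb * 8 * 1000  # decimal GB -> megabits
    seconds = megabits / (link_mbps * efficiency)
    return seconds / 3600


if __name__ == "__main__":
    daily_change_gb = 200  # hypothetical daily changed data
    for name, mbps in LINK_MBPS.items():
        hours = replication_hours(daily_change_gb, mbps)
        print(f"{name:>5}: {hours:5.1f} h")
```

If the result does not fit inside the replication window, the choices are a faster link, less data (for example deduplicated backups) or a looser RPO.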






Infrastructure Comparison SAN/NAS LUN Replication

[Figure: Virtualization Assisted DR without De-duplication. Primary DC: Virtual Center, ESX Cluster and SAN LUNs holding the disk files of VM1, VM2 ... VMn (VMn_PartC.vmdk, VMn_PartD.vmdk, VMn_PF.vmdk). DR Facility: Alt Virtual Center, DR Cluster and SAN LUNs holding the same files. Replication labels: Daily Replication, Hourly Replication, One Time Replication. Legend: LUN, Server, Virtual Machine.]

Almost immediate RTO


Supports tiers of RPO


Can be automated, e.g. with
VMware SRM


May or may not include
quiescing applications


Relatively Expensive


Must fail over all VMs
on a LUN


If application failover is
required, VMs must be segregated
by application, which can
impact performance
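
Because the LUN is the unit of failover, it helps to check which VMs would move together before committing to a layout. The following is a minimal sketch under assumed data: the VM and LUN names are hypothetical, and in practice the placement would come from an inventory export rather than a hand-written dictionary.

```python
def vms_failed_over_with(vm: str, vm_to_lun: dict[str, str]) -> list[str]:
    """Return every VM that shares a LUN with `vm` (including `vm`).

    With LUN-level replication these VMs all fail over together.
    """
    lun = vm_to_lun[vm]
    return sorted(name for name, placed_on in vm_to_lun.items()
                  if placed_on == lun)


def mixed_application_luns(vm_to_lun: dict[str, str],
                           vm_to_app: dict[str, str]) -> list[str]:
    """List LUNs that hold VMs from more than one application.

    Such LUNs cannot support failing over a single application.
    """
    apps_on_lun: dict[str, set[str]] = {}
    for vm, lun in vm_to_lun.items():
        apps_on_lun.setdefault(lun, set()).add(vm_to_app[vm])
    return sorted(lun for lun, apps in apps_on_lun.items() if len(apps) > 1)


if __name__ == "__main__":
    # Hypothetical placement and application ownership.
    placement = {"web1": "lun_a", "db1": "lun_a", "mail1": "lun_b"}
    owner = {"web1": "shop", "db1": "erp", "mail1": "mail"}
    print(vms_failed_over_with("web1", placement))
    print(mixed_application_luns(placement, owner))
```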



Infrastructure Comparison Backup/Dedupe

[Figure: Virtualization Assisted DR with De-duplication. Primary DC: Virtual Center, ESX Cluster, SAN LUNs holding the disk files of VM1, VM2 ... VMn (VMn_PartC.vmdk, VMn_PartD.vmdk, VMn_PF.vmdk), VCB and a dedupe device. DR Facility: Alt Virtual Center, non-clustered DR servers, a dedupe device and a recovery VCB. Replication label: Daily Replication. Legend: LUN, Server, Virtual Machine.]

RTO based on method of
recovery


Multiple VCB
Proxies


Non VCB Proxies


Media Server to
Agent in VM


RTO higher than LUN
replication in general but
much shorter than tape


Usually supports a
single tier of RPO: 1 day


Recovery is simple


Relatively Low Cost


Enhances Operational
Recovery



DR Benefits of Backup Integrated De-duplication


Lower storage requirements for online backup providing lower cost or space
for more backups


Lower bandwidth requirements for DR replication


Faster operational RTO from having online backups rather than going to tape


Simple operational recovery of entire VM based on image backups


Reference Architecture that does not require the same level of storage in DR
site


Recovery of single VMs, rather than entire LUNs as in the SAN replication
model, allows single applications to be failed over to DR without
segregating the applications by LUN, which can affect performance



VMware Site Recovery Manager


Manages the SAN Replication DR option for VMware ESX


Holds DR Recovery Plan Documentation


Automates Configuration and Setup of the DR process


Create and Test Recovery Plans


Report Results of Tests


Integration between VMware ESX, vCenter and SAN Vendors


Initiate failover when necessary, automating important changes like IP
address assignments and performance allotments


Uses LUN replication. Failing over a single VM is possible, but it breaks
replication and puts the other VMs on the LUN at risk, so SRM is
intended for entire site failover. It is possible to have a single LUN per VM or
to segregate VMs on LUNs by application, but this is hard to manage


Alternative DR Strategies


Active/Active DR


Mix Dev/Test/Prod in at least 2 DCs


Sync DR both directions


On Clusters favor Prod VMs for Performance


During an event, Dev/Test can be shut down in favor of Prod


Significantly lower cost over Active/Passive


Tiered Solution with VMware SRM


Only designated systems are included


Normally based on low RTO


LUNs designated for replication or not based on SLA


Applications can be segregated onto LUNs if application failover and
consistency are required


Must be careful of performance issues


Requires diligent monitoring for hot spots


May require adding LUNs for applications


Lower Tiered systems can be restored from normal backups or
dedupe/VTL
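
The tiered approach above amounts to a simple rule: systems with a low RTO go onto replicated LUNs under SRM, and everything else is restored from backups. A minimal sketch of that rule follows; the 4-hour cutoff, the application names and the method labels are hypothetical placeholders that an organization would replace with its own SLA tiers.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class App:
    name: str
    rto_hours: float  # recovery time objective agreed in the SLA


def dr_method(app: App, replication_cutoff_hours: float = 4.0) -> str:
    """Pick a DR method from the application's RTO.

    The cutoff is an assumed policy value, not a vendor limit.
    """
    if app.rto_hours < replication_cutoff_hours:
        return "lun_replication"  # designated for SRM-managed failover
    return "backup_restore"       # restored from normal backups or dedupe/VTL


if __name__ == "__main__":
    # Hypothetical application inventory.
    inventory = [App("erp", 1.0), App("intranet", 24.0), App("archive", 72.0)]
    for app in inventory:
        print(f"{app.name:>9}: {dr_method(app)}")
```

Keeping the rule explicit makes it easy to see how many systems land on replicated storage, which is where most of the cost sits.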






Alternative DR Strategies


Solution based on Backup to Dedupe NAS or VTL


Can Support backup to image for low RTO and normal agent based file
backups to same devices for longer RTO


Can be integrated with VCB


Can integrate with existing backup software and strategy


Has a longer RTO due to restore time


Can use software-based replication products for a small
number of VMs


Good if there is no SAN in the target site


Good option for smaller remote offices with VM Infrastructure


Multi-vendor Storage Resource Management Discussion


Jeff Phipps from Zot thought this would be interesting to discuss and I
agreed


SRM (Storage Resource Management here, not VMware Site Recovery Manager) originally caused considerable excitement


Multi-vendor SRM is certainly something all large organizations could use


Industry standard monitoring and alerting frameworks/applications based
on SMI-S have been very disappointing, with only sketchy support from the
vendor community


These tools do not provide the power and performance required to manage a
multi-vendor storage environment well


At GlassHouse we have switched to using vendor management products
integrated into our monitoring platform via SNMP and some email


We are still trying various tools for reporting, but the vendor tools work best


The prevalent strategy we see here is to limit the number of platforms within
your organization to ease the management burden associated with different
platforms