Jul 22, 2013

Horizon Mirage is part of the Horizon Suite from VMware and it is generating a lot of buzz.  I’m not going to go into the benefits of why; you can read the link I’ve provided for that.  However, one of the most amazing things about Mirage is that it uses a technology called source-based deduplication to back up all of the desktop endpoints.  Let’s talk about that technology, how it works, and when it works best.

Source-based deduplication works by having a server in the datacenter with a lot of capacity attached to it.  We’ll refer to this server as the “repository.”  Now for the endpoints (which, in the case of Mirage, are Windows-based desktops/laptops).  The client will begin by taking backups of the endpoints (Mirage calls them snapshots) and copying them to the repository.  It’s this process and how it works that is so amazing.  You would immediately think that when I take a backup of an endpoint that is 10GB on disk, the system will send 10GB over the network.  For the FIRST machine that you back up, it typically does.  It sends practically the whole image of the endpoint to the server.  It’s when you go to back up the second endpoint that the magic starts to happen.  Once the first endpoint has been “ingested,” for any additional endpoints added, the repository will use the data it has already seen to compose all future backups.  I know this can be somewhat confusing; you can look at this article for some comparisons of different deduplication technologies.  For our example, let’s go a little deeper into exactly what happens during this process.

We will begin with the first Windows desktop, which is 10GB on disk total, and back it up.  The repository will “ingest” the files from the endpoint.  When it does this, it runs a hashing algorithm against each file to give it a hash code.  Once it has done that for every file, it also breaks each file into “blocks” or “chunks” and runs a hashing algorithm against those chunks.  After all this, it stores the backup down on disk in the repository.  Now, for the next (and every subsequent) client we want to back up or capture: the client will ask the server for its hash table of files.  This is a small amount of data sent from the server to the client, because the hash table is a list of all of the hash codes for all of the files in the repository, not the actual data in the files.  The client then takes this data and analyzes each file on the second endpoint’s file system.  It develops a list of files the repository has never seen before (and tells the repository which files on this endpoint the repository has seen before).  Typically, we see about 90-95% common files between images.  This is where it starts to get even more crazy efficient.  So the client has figured out which files the server already has in the repository and has sent the server a list of those files on Endpoint #2 that the server has seen before.  Now the client looks at the files that the server has not seen before.  Let’s suppose there are 100 files on that list.  The client will separate those files into blocks at the client (this is why it’s called source-based: the majority of the processing and checking for deduplicated data happens at the endpoint, not the server).  So the client has separated the 100 files into blocks and runs the same hashing algorithm on the blocks.  Now the client compares the blocks to the blocks the server has in the repository and develops a list of blocks that the server has not seen before.
Let’s say the client finds 10 blocks that the server has never seen before.  It tells the server to mark down all of the blocks that are on this endpoint as being part of this endpoint’s backup.  Note: to this point in the process, the client has not sent any of the actual backup data to the server yet.  The last step is to take the blocks of files that are unique to this endpoint, compress them, and send them to the server for storage, thus completing the backup: inventorying all of the common data and sending only the unique data.
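The client-side flow described above can be sketched in a few lines of Python.  This is a simplification under stated assumptions: I’m using SHA-1 and fixed 64KB chunks purely for illustration (Mirage’s actual hashing and chunking schemes are not something I’m documenting here), and the repository’s hash tables are just in-memory sets.

```python
import hashlib
import zlib

CHUNK_SIZE = 64 * 1024  # illustrative fixed-size chunks, not Mirage's real scheme

def sha1(data: bytes) -> str:
    return hashlib.sha1(data).hexdigest()

def chunks(data: bytes):
    for i in range(0, len(data), CHUNK_SIZE):
        yield data[i:i + CHUNK_SIZE]

def backup(files: dict, repo_file_hashes: set, repo_block_hashes: set):
    """Dedupe at the client; return (manifest, payload).

    manifest records how to rebuild every file (mostly references to data the
    repository already has); payload is the only data actually sent over the wire.
    """
    manifest, payload = {}, []
    for name, data in files.items():
        fh = sha1(data)
        if fh in repo_file_hashes:
            # Whole file already known: send only a reference, no data
            manifest[name] = ("file-ref", fh)
            continue
        block_refs = []
        for block in chunks(data):  # new file: dedupe at the block level
            bh = sha1(block)
            if bh not in repo_block_hashes:
                payload.append(zlib.compress(block))  # unique data, compressed
                repo_block_hashes.add(bh)
            block_refs.append(bh)
        repo_file_hashes.add(fh)
        manifest[name] = ("blocks", block_refs)
    return manifest, payload
```

The key property: for a second endpoint whose files mostly match the first, nearly every file resolves to a `file-ref`, and only the handful of never-before-seen blocks end up in the compressed payload.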

Whew!  What does all this look like in reality?  Let’s take a look at this log entry from a Proof-of-concept we are running for a customer right now:

[Screenshot: Mirage upload log, 2013-07-21]

This is the initial upload from a client to the Mirage repository.  This endpoint is running a Windows 7 base image.  It is about 7,634 MB on disk (listed by the total change size).  Since this is the first time this endpoint has been backed up, all of the data on the endpoint is listed in the total change size.  On all subsequent backups, this capacity will be the size of the files that have changed since the last backup.  The next statistic is the killer number: Data Transferred is 29MB!  Mirage took a full backup of this system’s 7,634 MB and only sent 29MB (the unique data) over the network to the repository!

Here’s how it got there: Mirage inventoried 36,436 files on the endpoint that had changed since the last backup (all the files on the endpoint had “changed,” since there was no previous backup of this endpoint).  Mirage ran the hash on all of those files and found that there were 2,875 files it had not seen before in the repository (the Unique Files number).  These 2,875 files totaled 221MB (the Size after file dedupe number).  Then Mirage pulled those files apart and looked for the blocks of those 2,875 files that it had not seen before.  Once Mirage found those unique blocks, it whittled the 221MB of unique files down to 95MB of unique blocks (the Size after Block Dedupe number).  Mirage then takes the 95MB of unique blocks (which is the real uniqueness of this endpoint) and compresses it.  Every single step in processing to this point has happened at the client.  The last step is to send the unique data to the Mirage server (repository).  The data sent is 29MB of actual data for a full backup (the Size after compression number)!  This whole process took 5 minutes and 11 seconds on the client.  This first backup of the endpoint takes longer because the hashing has to happen on all of the changed files (36,436 files for this backup).  However, all subsequent backups from this machine will only look at the files that have changed since the last backup, because we already have a copy of the files that have not changed.
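The funnel in that log checks out with a little arithmetic — each stage keeps only what the previous stage could not deduplicate:

```python
# Numbers straight from the log entry above
total_changed_mb = 7634       # full Windows 7 image, first backup
after_file_dedupe_mb = 221    # the 2,875 files the repository had never seen
after_block_dedupe_mb = 95    # unique blocks within those files
transferred_mb = 29           # unique blocks after compression

file_dedupe_savings = 1 - after_file_dedupe_mb / total_changed_mb       # ~97%
block_dedupe_savings = 1 - after_block_dedupe_mb / after_file_dedupe_mb # ~57%
compression_ratio = after_block_dedupe_mb / transferred_mb              # ~3.3:1
overall_savings = 1 - transferred_mb / total_changed_mb                 # ~99.6%
```

In other words, over 99.6% of this image never crossed the wire.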

Where source-based dedupe works and where it does not

Source-based dedupe works best when we have tons of endpoints with very similar OSes, apps, and data (this is why it’s perfect for desktops and laptops).  Where source-based dedupe has its challenges is when the files are big and really unique.  Audio and video files are like this.  Unless the files are copies, no two video files are alike, at all.  Not all is lost if your users perform video or audio editing or just work with a lot of these files.  There are ways to accommodate that as well.  We would typically recommend using folder redirection or persona management to move those files to a network drive, where we would back them up with the typical methods and offload them from the endpoints.  We can also exclude certain file types from being backed up at all by Mirage.

[Screenshot: Mirage upload policy settings, 2013-07-21]

As shown above, Mirage includes an upload policy that allows you to set rules on file types you do not want to protect from the endpoints.  Some standard ones are included already, such as media files (however, as you see in the rule exceptions, media files in the c:\windows directory will still be backed up).
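Conceptually, the policy is evaluated as exclusion rules with carve-out exceptions.  A minimal sketch of that logic, assuming made-up patterns (these are illustrative, not Mirage’s actual default rule set):

```python
import fnmatch

# Hypothetical exclusion rules: skip these file types during upload...
EXCLUDE_RULES = ["*.mp3", "*.avi", "*.mov"]
# ...unless the file also matches a rule exception (then it IS uploaded anyway)
RULE_EXCEPTIONS = ["c:\\windows\\*"]

def should_upload(path: str) -> bool:
    """An exception wins over an exclusion; anything unmatched is uploaded."""
    p = path.lower()
    excluded = any(fnmatch.fnmatch(p, pat) for pat in EXCLUDE_RULES)
    excepted = any(fnmatch.fnmatch(p, pat) for pat in RULE_EXCEPTIONS)
    return not excluded or excepted
```

So a user’s MP3 collection stays out of the repository, while media files that ship as part of Windows (and therefore dedupe almost perfectly across endpoints) are still protected.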

Mirage is definitely the way to go for any mobile or branch-office endpoints where bandwidth limits and connectivity reliability make VDI a less-than-optimal choice for the management and recoverability of these endpoints.  I don’t recommend products that don’t work as advertised.  Once the light bulb kicks on and customers understand this technology, its real value shines through.  Make no mistake, Mirage is not a mirage; it’s a reality, and a really good one at that.

Dec 09, 2011

vSphere Replication and Site Recovery Manager make it very easy to replicate your VMs to your DR site (ahem, once they are set up).  Some customers have asked me if there is any way to throttle the bandwidth used for replication.  The good news is that there is a way in VMware software, but it cannot be found in SRM.  Unfortunately, it can only be found in the Enterprise Plus edition of vSphere 5.  It’s Network I/O Control in the Distributed vSwitch (DvS) in v5.  I’m not going to go into a deep dive on Network I/O Control, but I will recommend that you read the Network I/O Control best practices doc here.

To enable Network I/O Control, we need to have a DvS in place.  If we select the distributed switch and then select the Resource Allocation tab on the right, this gives us the “Properties” option on the far right.  Selecting the Properties option lets you enable Network I/O Control on the DvS.  Once it is enabled, you can see all of the system network resource pools.  There is one at the bottom of the list labeled “vSphere Replication (VR) Traffic.”  Selecting it and then clicking the “Edit Settings” link just below it opens up the settings window.

From here, you can edit the adapter shares.  Shares balance the bandwidth so that network flows can use what’s available from a given dvUplink; the share values apply per dvUplink.
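Shares only translate into actual bandwidth when an uplink is saturated: each pool then gets capacity in proportion to its shares among the pools actively sending traffic.  A sketch of that math, with hypothetical share values (not vSphere defaults):

```python
def share_allocation(uplink_mbps: float, shares: dict, active: set) -> dict:
    """Split a saturated dvUplink's bandwidth among active pools by shares."""
    total = sum(s for pool, s in shares.items() if pool in active)
    return {pool: uplink_mbps * shares[pool] / total
            for pool in shares if pool in active}

# Hypothetical pools on a 1 Gbps dvUplink; "vr" is vSphere Replication traffic
shares = {"vm": 100, "vmotion": 50, "vr": 50}
alloc = share_allocation(1000, shares, active={"vm", "vmotion", "vr"})
```

Under full contention here, VR traffic would get 50/200 of the uplink, or 250 Mbps; when VR is the only active pool, shares impose no cap and it can use the whole uplink.  That is the key difference from a hard host limit, which applies whether or not the link is busy.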

Alternatively, you can uncheck the Unlimited checkbox and set a host limit.  Keep in mind that this is megabits per second, not megabytes.  This is also the limit for the combined set of dvUplinks on a given host.

Lastly, a QoS priority tag can be used.  The traffic will have an 802.1p tag applied to it.  The IEEE does not mandate how switches must treat the priority tag applied to the packets, but switches should treat higher tags with higher priority.  The choices are None and 1-7.

While these are not the granular controls we might wish for, say per-VM bandwidth controls or per-site replication limits, these settings and options are a start.  Hopefully, a future vSphere Replication v2 will give us more granular controls for bandwidth throttling, but until then, these are what we can use.  Happy computing.


Jun 29, 2010

In the past, I have reviewed all of the technical papers on the VMware site.  I’ve decided to change direction a little, and I now only plan on reviewing papers that apply to the everyday VM admin.  I’m also going to throw in my own ranking on each article (*****, 1 to 5 stars).  You will also notice a “vKeeper” reference on some of the papers.  This award is for the papers that I keep a local copy of on my computer for reference when I need them.  They are the docs that all admins should read through and use as a reference as needed.  I have also added a section to my admin bookmark page just for the vKeeper docs.

PCoIP Display Protocol: Information and Scenario-Based Network Sizing Guide – (12 pages) A good paper with very good insight into the PCoIP protocol used in VMware View.  It gives some good suggestions and the required bandwidths needed to satisfy end users in their desktop experience.  A must-have for View deployments.  (****, 4 of 5 stars)

Application Presentation to VMware View Desktops with Citrix XenApp – (3 pages) This is a whitepaper to show how to deploy applications in VMware View desktops from XenApp.  While I can see this being useful for View admins who use XenApp, the description and instructions are very minimal.  Probably something better suited for a KB article. (**, 2 of 5 stars)

Timekeeping in VMware Virtual Machines – (26 pages) This is a very important topic for all VM admins to know.  Time is relevant to everything in a VM; whether you are trying to authenticate to Active Directory or troubleshooting using event logs, accurate time is very important.  This paper goes into some really great detail on how VMware maintains accurate time in VMs.  If you are a VMware admin, this should be a standard read.  (*****, 5 of 5 stars, vKeeper)

SAN System Design and Deployment Guide – (244 pages of storage goodness)  I have a storage background, so I especially enjoyed this one.  If you are running ESX on SAN shared storage (you should be on some type of shared storage), then this is a must-read.  This whitepaper is also very helpful if you are studying for the VCP or one of the new VCAP exams.  This is another paper I keep local, and definitely one all VM admins with a SAN should review.  (*****, 5 of 5 stars, vKeeper)

Best Practices for Running vSphere on NFS Storage – (14 pages) On the heels of the SAN design and deployment guide, this paper describes the best practices for running NFS on vSphere.  I like the fact that this article references outdated best practices that have changed and explains why they have changed.  This is a HUGE help to admins who google a topic only to find conflicting information.  My only regret with this paper is that I would like to see more detail on the advanced options and how they affect the performance of NFS.  Still an important doc for VM admins using NFS storage; it should be reviewed by all of them to make sure they are current in their deployment of NFS best practices.  (****, 4 of 5 stars)

Location Awareness in VMware View 4 – (8 pages) Good information for View admins on how to find out where their clients are connecting from.  This is a common request from hospitals, which want printers to “follow the user” as they float from terminal to terminal.  There are some advanced topics in this article, and some Active Directory knowledge is definitely required, especially when using loopback mode in group policy processing.  Good info, and hopefully View will include some native GUI-based features in the future to assist with this.  (***, 3 of 5 stars)

VMware vSphere 4.0 Security Hardening Guide – (70 pages) This is an outstanding reference for any VM admin.  Security affects everyone’s environment, from the 3-man shop to the largest infrastructure.  Setting the precedent of a solid, secure environment from the ground up will provide you with an infrastructure that is solid as a rock.  I recommend reviewing this paper often and keeping it handy.  (*****, 5 of 5 stars, vKeeper)

VMware vStorage Virtual Machine File System – Technical Overview and Best Practices – (13 pages) This is an entry-level paper on some of the very basics of VMFS and how it relates to RDMs.  It should be a good introduction to VMFS for new VM admins.  With “Best Practices” in the title, I had hoped there would be more technical references (advanced options for VMFS and how tweaking them affects storage performance, for instance).  I was also disappointed to see the LUN size question answered vaguely, suggesting you refer to the storage vendor to size your LUNs appropriately.  I prefer Duncan’s approach to LUN sizing, and it’s what I recommend to all of my customers.  (***, 3 of 5 stars)

Look for the vPaper Report again next quarter (hopefully with some new releases in between). Until then, happy reading!

Jul 08, 2009

Some good ones came out last week.  Let’s take a look:

May 29, 2009

A couple of new technical papers got posted this week.  Some good reading for the IT staffers working hard this summer.

Microsoft Exchange Server 2007 Performance on VMware vSphere™ 4 <- Great reading for seeing how Exchange performs on ESX4.

Smart Card and Certificate Authentication in VMware View <- If you need to use smart cards with VMware View, this is a must-read.

Repurposing a PC to a Thin Desktop Using VMware View <- A very common question from customers who want to extend the life of their PCs a little longer.  Good reading with a few ideas on how to do so.

Network Segmentation in Virtualized Environments <- Some good ideas if you need to separate and firewall off sections of your infrastructure.