Virtualization has become a ubiquitous technology. Improvements in hypervisors have reduced complexity, improved server utilization and increased agility. But there is a potential downside: transforming a physical server into many virtual machines (VMs) adds another software layer.
While simplifying the admin user experience, virtualization raises the overall complexity of the IT environment. In the event of data loss, it becomes harder for admins to know which physical system a VM is running on, or which storage a particular machine uses. And with fewer people maintaining and monitoring a growing number of virtual machines, the chance of data loss is greater than ever.
To help prevent data loss, modern systems often replicate data across multiple physical drives (HDD or SSD) consolidated into a single logical unit. This data protection can be a hardware- or software-based solution. RAID combines multiple drives through mirroring, striping and parity to improve redundancy, increase data reliability and boost I/O (input/output) performance.
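The redundancy idea can be illustrated with parity, as used by RAID 5. The sketch below is a simplified, byte-level illustration and not how a real RAID controller operates (real arrays work on fixed-size blocks with rotating parity); it only shows why losing one drive need not lose data.

```python
# Illustrative RAID 5-style parity: XOR the data stripes to get parity,
# and rebuild any single lost stripe by XOR-ing the survivors with it.

def xor_parity(stripes):
    """Compute the parity bytes for a list of equal-length stripes."""
    parity = bytearray(len(stripes[0]))
    for stripe in stripes:
        for i, b in enumerate(stripe):
            parity[i] ^= b
    return bytes(parity)

def rebuild_missing(surviving_stripes, parity):
    """Reconstruct one lost stripe from the survivors plus the parity."""
    return xor_parity(surviving_stripes + [parity])

data = [b"AAAA", b"BBBB", b"CCCC"]      # stripes on three data drives
p = xor_parity(data)                    # stored on the parity drive
# Simulate losing the second drive and rebuilding it:
recovered = rebuild_missing([data[0], data[2]], p)
assert recovered == b"BBBB"
```

The same XOR property means that if two drives fail at once, or the parity itself is corrupted, the missing stripe can no longer be reconstructed, which is one reason RAID failures still lead to data loss.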
Unfortunately, data loss is not uncommon with RAID storage. Deduplication and compression add to RAID's complexity, and an additional virtualization layer on top increases the likelihood of a fault. If a RAID configuration becomes corrupted, the interconnectivity of multiple systems can cause significant data loss and downtime.
Reformatting and re-installing are further causes of data loss in virtualized environments. Corruption can also stem from buggy updates, poorly planned implementations, integration issues or database corruption.
Thin provisioning is another source of risk. Instead of allocating all the space a VM will ever need and placing the file system structures at fixed physical offsets, thin provisioning allocates only the space immediately required and adds blocks to the virtual disk as it grows. The result can be a more complex and fragmented virtual environment. If the metadata pointers to the data are missing or damaged, it is challenging to locate the various fragments and rebuild the virtual disk.
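A minimal sketch of the mechanism, under simplified assumptions (a toy in-memory disk, not any hypervisor's actual format): physical blocks are allocated only on first write, and a block map records where each virtual block landed. That map is exactly the metadata whose loss makes recovery hard.

```python
# Toy thin-provisioned virtual disk. The block_map is the metadata:
# lose it, and the physical blocks survive but can no longer be tied
# back to their virtual offsets.

class ThinDisk:
    def __init__(self, block_size=4096):
        self.block_size = block_size
        self.block_map = {}    # virtual block number -> physical block index
        self.physical = []     # backing store, grows only on demand

    def write(self, virtual_block, data):
        if virtual_block not in self.block_map:
            # Allocate a physical block only when it is first written.
            self.block_map[virtual_block] = len(self.physical)
            self.physical.append(bytearray(self.block_size))
        phys = self.block_map[virtual_block]
        self.physical[phys][:len(data)] = data

    def read(self, virtual_block):
        phys = self.block_map.get(virtual_block)
        if phys is None:
            return bytes(self.block_size)   # never-written blocks read as zeros
        return bytes(self.physical[phys])

disk = ThinDisk()
disk.write(1000, b"guest data")   # a high virtual offset...
assert len(disk.physical) == 1    # ...still costs just one physical block
assert disk.read(1000).startswith(b"guest data")
```

Note that successive writes to scattered virtual offsets land in allocation order, not offset order, which is why thin-provisioned disks fragment and why a damaged block map leaves the fragments stranded.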
Virtual file system metadata corruption
Yet another source of data loss is metadata corruption. Metadata matters even more in virtualized environments because of the number of layers and VMs involved. A small problem with VMFS metadata can have serious repercussions for data availability.
A surprisingly large number of failures are due to virtual disks deleted by mistake, or VMs being overwritten or having their space reassigned. There is also snapshot chain corruption: one snapshot in a series is corrupted, deleted or otherwise unavailable. This can foul up backups and make data difficult to recover.
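The fragility of snapshot chains can be sketched under a common simplifying assumption: each snapshot stores only the blocks changed since its parent, so reading current state means walking the chain back to the base disk. A single missing link breaks every snapshot that depends on it.

```python
# Toy snapshot chain, ordered newest-first. Each snapshot is a dict of
# only the blocks it changed; unchanged blocks defer to the parent.

def read_block(chain, block):
    """Resolve a block by walking from the newest snapshot to the base."""
    for snapshot in chain:
        if snapshot is None:
            # A parent in the chain is missing or unreadable.
            raise IOError("snapshot chain broken: parent link missing")
        if block in snapshot:
            return snapshot[block]
    return b"\x00"   # never written anywhere: reads as zeros

base  = {0: b"base",  1: b"base"}
snap1 = {1: b"snap1"}            # only block 1 changed in snapshot 1
snap2 = {0: b"snap2"}            # only block 0 changed in snapshot 2
chain = [snap2, snap1, base]

assert read_block(chain, 0) == b"snap2"
assert read_block(chain, 1) == b"snap1"
```

If `snap1` were lost, block 1 could no longer be resolved for `snap2` either, even though `snap2` itself is intact; that is the failure mode that corrupts backups built on snapshots.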
What can enterprises do when they experience data loss from a virtualization environment? There is no back or undo button. A deleted VM is gone. Fortunately, data recovery is often possible through global data recovery service providers.
The first point of entry is the storage level. In some cases data can be recovered directly from the physical drives by taking an image of them and reading whatever raw data remains on disk.
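One common technique at this level is signature scanning ("file carving") over the raw image: searching for known file headers when no file system metadata survives. The sketch below assumes an image has already been taken (e.g. with a tool such as dd); the magic bytes are real JPEG/PNG/ZIP signatures, but everything else is deliberately simplified.

```python
# Scan a raw disk image for known file signatures ("carving").

SIGNATURES = {
    b"\xff\xd8\xff": "jpeg",          # JPEG start-of-image marker
    b"\x89PNG\r\n\x1a\n": "png",      # PNG file header
    b"PK\x03\x04": "zip",             # ZIP local file header
}

def carve_offsets(image: bytes):
    """Return sorted (offset, type) pairs for every signature found."""
    hits = []
    for magic, kind in SIGNATURES.items():
        start = 0
        while (pos := image.find(magic, start)) != -1:
            hits.append((pos, kind))
            start = pos + 1
    return sorted(hits)

# Fake image: 512 bytes of slack, a PNG header, more slack, a ZIP header.
raw = b"\x00" * 512 + b"\x89PNG\r\n\x1a\n" + b"\x00" * 100 + b"PK\x03\x04"
assert carve_offsets(raw) == [(512, "png"), (620, "zip")]
```

Real tools go further, bounding each hit by footers or length fields, but the principle is the same: raw bytes can yield files even when every directory structure above them is gone.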
The next option is to attempt recovery from the logical volumes (LUNs) or the RAID layer. If the RAID controller configuration is available, it can be used to track down the many slices of data spread across the array's member drives.
The next level up is the host file system: VMFS in VMware, and NTFS or ReFS in Hyper-V. In many cases data isn't directly available at the storage level, but with the right tools, recovery experts can trace data from the basic storage blocks, map it to the host level and recompile it.
If that process doesn’t provide an adequate recovery, additional tools can be employed to extend further into the guest file system level. By investigating the virtual file system, data recovery specialists can sometimes find data that would otherwise be lost. Finally, it is possible to reach into the guest file level and access data lurking in application files such as SQL, Exchange, SharePoint, Oracle, Office files, ZIP files and more.
What it takes is an understanding of each level and a sense of what might be available where. Those well-versed in storage architectures can track down data that seemed lost, finding pieces of it at one level and the rest at another.
Virtualization may save time and hide complexity from the user's view, but it comes with a unique set of challenges. Whether through volume corruption, ransomware, corrupted virtual backups, hardware failures or accidentally deleted files, data loss is a reality for anyone managing virtual systems. Whilst backup is necessary to safeguard enterprise data, it is far from foolproof and shouldn't be relied upon alone.