Paul Bunn, CTO, UltraBac Software
"A system to allow efficient image backup and restore (full or selective) by using an object-storage system as a backing store."
Background of Invention
Current prior art, such as UltraBac's Image Backup and UltraBac Warp, allows an "image" (block-level) backup to be taken of a user's computer and stored on a filesystem, usually network-resident or locally attached (internal or external disk drives). When stored this way, the backup data normally resides in monolithic files many GB or even TB in size. These are typically broken down into "data" files, which hold the compressed/encrypted data, and smaller "index" files, which describe the layout of the data contained in the "data" files: they allow the restore process to skip regions of the drive marked as not-in-use, and they describe the ranges of data and how those ranges map from uncompressed to compressed sizes. The "index" data may instead be interspersed within the "data" files themselves, eliminating the need for separate files, but keeping the index separate allows the restore process to be more efficient.
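As a rough illustration of the index described above, the sketch below shows what one index record and a range lookup might look like. The field and function names are hypothetical, not UltraBac's actual on-disk format.

```python
from dataclasses import dataclass

@dataclass
class IndexEntry:
    """Hypothetical index record mapping a volume range to its compressed data."""
    volume_offset: int      # byte offset of the range on the original volume
    uncompressed_len: int   # length of the range before compression
    data_offset: int        # offset of the compressed bytes in the "data" file
    compressed_len: int     # length after compression/encryption

def find_entry(index, volume_offset):
    """Return the entry covering a given volume offset, or None if the
    range was skipped (e.g. marked as not-in-use)."""
    for e in index:
        if e.volume_offset <= volume_offset < e.volume_offset + e.uncompressed_len:
            return e
    return None
```

A restore process consulting such an index can skip not-in-use regions entirely, since they simply have no covering entry.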
Object-storage systems now exist, such as the OpenStack Swift architecture, developed by Rackspace and released as open source. Although there are other competing object-storage technologies (such as those from Amazon.com, Inc.) that this invention could use, for purposes of illustration this document is based on the Swift architecture; the invention could easily be applied to object-storage systems other than Swift. Systems such as Swift are used by cloud-storage providers to allow end users (their customers) to store large amounts of personal data "in the cloud." Swift and similar technologies allow a massively scalable, distributed object-storage infrastructure to be built up quickly and cost-effectively. Because Swift is distributed, it provides, when properly configured, data redundancy, so that objects stored in Swift are not exposed to any single point of failure.
While Swift and other object-storage systems are extremely effective at managing millions, or even billions, of objects, they are very inefficient at handling objects as large as those an image (block-based) backup would need to store. Swift does provide "large object support," which allows "segmentation": very large files (>5 GB) are supported by breaking the large object into segments and describing those segments with a "manifest file."
Some technologies exist to present the Swift object-storage system as a filesystem, which would allow existing backup technologies to read and write backup data. This is a poor solution, as the creation of files greater than 5 GB may cause the technology to fail. Even when it works, a problem remains: if the internet connection or cloud service provider falters, a partially transferred backup (which may already have taken days or weeks to transfer) may have to restart from the beginning.
Summary of Invention
To solve the large-file problem, this invention describes a technique to write directly to Swift (or another cloud technology) and to manage the segmentation itself, assisting Swift in efficiently storing the image backup data. A "mount" of the backed-up data can be achieved by implementing a user-mode service to process requests from the filter driver that makes the virtual mount device visible to the operating system.
The backup process, as part of its normal operation, will write many GB (or TB) of data to the Swift storage. It writes to the storage directly via libcurl (it could instead issue HTTP commands directly over TCP, but libcurl facilitates this process). It will set a maximum segment size (which is configurable); for the sake of argument, assume it is set to 100 MB for this description. Each time the backup program has written 100 MB of data to Swift, it will append to a local file that will eventually become the manifest representing all of the objects in Swift that together make up the one image backup. If the internet link fails, or the internet/cloud service provider becomes too congested, libcurl/TCP may time out and the transfer of the current object will fail. This does not mean starting the backup over: only the 100 MB segment in progress at the time of the failure is restarted. The backup will wait for the link to re-establish and then continue from the beginning of the current segment. The backup will also write its traditional index to a local file; this file describes the offset, length, and compressed length of each block of data written to the main "data" file. The index will now additionally contain a segment identifier, so that for each defined range it knows which object holds the data.
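The segmented upload with per-segment retry described above can be sketched as follows. The `read_chunk` and `put_object` callables are hypothetical stand-ins for the real data source and the libcurl/Swift transport; the naming scheme is illustrative only.

```python
import time

def upload_backup(read_chunk, put_object, container, backup_name,
                  segment_size=100 * 1024 * 1024, retry_wait=30):
    """Stream backup data to object storage in fixed-size segments.

    read_chunk(offset, length) -> bytes of backup data (empty at EOF);
    put_object(container, name, data) uploads one object and raises
    IOError on failure (e.g. a libcurl/TCP timeout).
    Returns the ordered list of segment names for the eventual manifest.
    """
    segments = []
    offset = 0
    while True:
        data = read_chunk(offset, segment_size)
        if not data:
            break
        name = "%s-seg-%08d" % (backup_name, len(segments))
        # On failure only this segment restarts, never the whole backup.
        while True:
            try:
                put_object(container, name, data)
                break
            except IOError:
                time.sleep(retry_wait)  # wait for the link to re-establish
        segments.append(name)
        offset += len(data)
    return segments
```

Because each segment is uploaded as an independent object, a days-long transfer loses at most one segment's worth of progress on a link failure.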
Only once the backup data has been completely written will the backup process transfer to Swift the objects representing the manifest and the index. The manifest will also contain a reference identifier naming the object that holds the index for this backup.
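The finalization step, uploading the index first and then a manifest that names both the index object and every data segment, might look like the sketch below. The object names and manifest fields are illustrative assumptions, not a defined format.

```python
import json

def finalize_backup(put_object, container, backup_name, segment_names,
                    index_bytes):
    """Upload the index, then a manifest listing every data segment and
    the reference identifier of the index object. Uploading the manifest
    last means a backup is only "visible" once it is complete."""
    index_name = backup_name + "-index"
    put_object(container, index_name, index_bytes)
    manifest = {
        "backup": backup_name,
        "index_object": index_name,   # reference identifier for the index
        "segments": segment_names,    # ordered list of data objects
    }
    put_object(container, backup_name + "-manifest",
               json.dumps(manifest).encode("utf-8"))
    return manifest
```

Writing the manifest only after all data objects exist mirrors Swift's own large-object pattern, where the manifest is what ties the segments together.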
The current art in UltraBac and UltraBac Warp uses the UltraBac Filter Driver ("UBFD") to mount a virtual drive representing the data contained in the backup. To do this for Swift, UBFD will communicate with a user-mode service (this functionality could be added to the existing "WarpService" process, or a dedicated service could be created). The user-mode service will receive read or write requests from UBFD for specific ranges of data from the backup. By reading the index and manifest objects (or maintaining in-memory tables built from them), the service will determine which object segment holds the required data. It will use libcurl/TCP to request the data from Swift, decompress/decrypt it, and pass it back to UBFD, which will then satisfy the operating system's request for the data (in the form of an IRP). In this way, a virtual drive of the user's backed-up data can be made available for any requested point in time. A user can also restore the entire drive (in the event of total data loss of the original drive) by using the existing prior art in UltraBac Warp to restore the whole volume; this can be done with a boot CD or USB key when there is no longer any bootable operating system.
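The in-memory table the user-mode service might keep, routing each UBFD read request to the object segment holding that range, can be sketched as below. For simplicity this ignores compression (real offsets would go through the compressed-length mapping in the index); all names are hypothetical.

```python
import bisect

class MountIndex:
    """Lookup table mapping volume byte ranges to object segments, so a
    read request for any offset can be routed to the right object."""

    def __init__(self, entries):
        # entries: list of (volume_offset, length, segment_name, seg_offset)
        self.entries = sorted(entries)
        self.starts = [e[0] for e in self.entries]

    def locate(self, volume_offset):
        """Return (segment_name, offset_within_segment) for a byte, or
        None if the range is absent from the backup (e.g. not-in-use)."""
        i = bisect.bisect_right(self.starts, volume_offset) - 1
        if i < 0:
            return None
        start, length, seg, seg_off = self.entries[i]
        if volume_offset >= start + length:
            return None
        return seg, seg_off + (volume_offset - start)
```

With the table sorted by volume offset, each lookup is a binary search, which keeps per-IRP overhead small even for indexes describing terabytes of data.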
CIP — Continuous Image Protection
Existing prior art in UltraBac Warp allows the user to continually back up changes made to protected drives to the backup destination. When the backup destination is a locally attached or network-attached drive, it is reasonable to back up these changes immediately as they occur. When targeting a Swift server, it may make more sense to aggregate changes over a period (say, every 15 minutes). In this case, the backup application will keep track of all changes made to protected drives since the last successful transfer to Swift and then generate a new object representing all of the changes made since that point. These changes may well exceed 100 MB (or whatever segment size is defined), so the backup process will use a process similar to the one described above for backup: once the object size exceeds the defined limit, the current object will be closed and validated, and a new object will be created for transfer. For each object transferred in this way, a second "index" object will be created (easily identified by a suffix such as "-i" in the object name, or by maintaining a master list of the names of all objects in Swift related to the CIP data).
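The periodic flush of accumulated changes into size-limited CIP objects, each paired with a "-i" index object, might be sketched like this. The naming scheme and index encoding are illustrative assumptions.

```python
def flush_changes(changes, put_object, container, epoch, segment_size):
    """Aggregate dirty ranges tracked since the last transfer into one or
    more CIP objects, closing each object once it reaches the segment
    limit. `changes` is a list of (volume_offset, data) pairs."""
    body, index, size, count = [], [], 0, 0

    def close():
        nonlocal body, index, size, count
        name = "cip-%s-%04d" % (epoch, count)
        put_object(container, name, b"".join(body))
        # companion index object, identified by the "-i" suffix
        put_object(container, name + "-i", repr(index).encode())
        body, index, size = [], [], 0
        count += 1

    for off, data in changes:
        index.append((off, len(data), size))  # range -> offset inside object
        body.append(data)
        size += len(data)
        if size >= segment_size:
            close()
    if body:
        close()
```

Each flushed object is self-describing via its companion index, so a restore can replay any 15-minute window without parsing unrelated objects.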
The user will likely want to define how long multiple copies of data are maintained in Swift. To this end, similar to what is done in UltraBac Warp today, the user will define a schedule for consolidation (also known as “pruning”). The user may set a schedule that states:
- Keep all 15-minute "incrementals" for a minimum of 24 hours, then consolidate into one "daily" incremental.
- Keep all "daily" incrementals for a minimum of 60 days, then consolidate into one "weekly" incremental.
- Keep all "weekly" incrementals for 20 weeks, then consolidate into the "master."
The consolidation of the incrementals is fairly straightforward to those skilled in the art. For example, all of the 15-minute incrementals older than 24 hours are identified; their index data is read to determine the most recent instance of every range covered by the data during that period. Only the most recent data available is then written out to a new data/index pair for each new "daily." Once the consolidation is complete, the master list of CIP object names is replaced with a new file representing the consolidated state.
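The "keep only the most recent instance of each range" step above can be sketched as follows. This simplified version assumes writes are aligned to fixed block boundaries; real ranges may partially overlap and be compressed, so the actual merge would split ranges rather than key on exact offsets.

```python
def consolidate(incrementals):
    """Merge a list of incrementals (oldest first) into one consolidated
    incremental, keeping only the newest data for each range. Each
    incremental is a list of (volume_offset, data) writes."""
    latest = {}
    for inc in incrementals:
        for off, data in inc:
            latest[off] = data      # later incrementals overwrite earlier
    return sorted(latest.items())   # new data/index content, in offset order
```

The same routine applies at every tier of the schedule: 15-minute objects into a daily, dailies into a weekly, and weeklies into the master.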
Once data will no longer be kept in the CIP data (for example, the weekly incrementals referenced above once they are 20 weeks old), these files will be read, merged into the "master" data/index files, and then deleted. Because of the architecture described above, it is possible to update parts of the master image backup by generating just the new 100 MB segments that replace existing segments. New data can also be inserted into the master image by generating new segments and then inserting references to them. Reducing the size of the individual segments, to say 10 MB, increases the efficiency of consolidation because it reduces the amount of "incidental" rewriting each update incurs (for example, if 1 MB of data changes in one segment, 99 MB must be read and rewritten to fill the rest of a 100 MB segment with the original, unaffected data), but it places additional load on Swift by making it store ten times as many objects. These conflicting efficiencies must be balanced when choosing the segment size.
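The trade-off described above can be made concrete with a small calculation. The function below is purely illustrative: it computes, for a given segment size, how many objects Swift must store and how many unaffected bytes must be rewritten when a fixed amount of data changes inside one segment.

```python
def segment_tradeoff(volume_size, segment_size, changed_bytes_per_segment):
    """Return (object_count, incidental_rewrite_per_update) for the
    segment-size balance described above: smaller segments cut the
    rewrite cost per update but multiply the number of objects stored."""
    object_count = -(-volume_size // segment_size)  # ceiling division
    rewrite_cost = segment_size - changed_bytes_per_segment
    return object_count, rewrite_cost
```

Using the document's numbers, a 1 MB change in a 100 MB segment forces 99 MB of incidental rewriting, while 10 MB segments cut that to 9 MB at the price of ten times as many objects.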