Peter Baer Galvin
Over the past few months, I’ve been covering new and useful Solaris 8 features in the Solaris Companion. This month continues the trend by looking at the new fssnap command, which provides snapshot copies of the default UFS file system, much as commercial file systems do. But is it a winner like UFS logging and IP multipathing? This month I test fssnap, point out some useful new reference services, and share feedback on a previous column.
Snapshots, in Theory
On that note, my column this month explores another hidden Solaris feature — UFS snapshots. A snapshot can be thought of as a cousin of RAID 1. Rather than performing a block-by-block copy of a disk, and then performing all writes to both copies, a snapshot takes a shortcut. The snapshot starts from an original disk (in this case, actually a UFS partition) and instead of copying all of the original blocks, it creates a copy of the metadata structures. In essence, it has pointers to all the data blocks. Thus, a snap is very fast to create.
The snapshot is placed within a file system or (theoretically) on a raw device. The snapshot target is called the “backing store”. Changes to the snapped file system are then handled specially. For every block (metadata or normal data) that is to be written to the snapped file system, a copy of the original contents is created and placed on the snapshot and then the write is allowed to occur to the original file system. In this manner, the original source file system is kept up to date and its snapshot copy has the contents that the file system had when the snapshot occurred.
Why is this useful? In theory, there are many uses for it. Certainly other products that include this snapshot feature (Network Appliance, the Veritas File System) allow some great functionality.
Because snapshots are fast and low overhead, they can be used extensively without great concern for system performance or disk use (although those aspects must also be considered).
How does Sun’s current implementation compare to the others that are available? I tested UFS snapshots on a Solaris 8 7/01 release Ultra 10, upgraded with the latest kernel jumbo patch and file system patches. There is only one command for UFS snapshots, which certainly keeps testing simple. fssnap performs UFS snapshots, provides information about them, and manages and deletes them.
The basic command to create a snapshot is:
# fssnap -o backing-store=/snap /
/dev/fssnap/0
Here, backing-store is the file system on which to put the snapshot, and the last argument, /, is the file system to snap. The command returns a device name, which serves as an access point to the snapshot file system. Of course, you can create multiple snapshots, one per file system:
# fssnap -o backing-store=/snap /opt
/dev/fssnap/1
The snapshot operation on a quiet file system took a few seconds. The busier the file system, the longer the operation.
A snapshot can reside on any file system type, even NFS, which allows you to snap to a remote server. Of course, the snap is only useful when accessed from the original snapped server, where the rest of the data blocks reside.
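For example, here is a quick sketch of snapping to an NFS-mounted backing store (the server name and paths are hypothetical, and the device number assumes two snapshots already exist on the system):
# mount -F nfs backuphost:/export/snaps /remote-snap
# fssnap -o backing-store=/remote-snap /export/home
/dev/fssnap/2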
Unfortunately, my testing revealed that an unmounted device cannot currently be used as the backing store, contrary to the documentation. There is a bug on sunsolve.sun.com against this problem, so hopefully it will get solved with a patch or a future Solaris release. There are several other errors in the documentation on http://docs.sun.com, including incorrect arguments to several commands. The examples in this column use the correct commands and arguments.
Now we can check the status of a snapshot:
# fssnap -i /
Snapshot number               : 0
Block Device                  : /dev/fssnap/0
Raw Device                    : /dev/rfssnap/0
Mount point                   : /
Device state                  : idle
Backing store path            : /snap/snapshot0
Backing store size            : 2624 KB
Maximum backing store size    : Unlimited
Snapshot create time          : Wed Oct 31 10:20:18 2001
Copy-on-write granularity     : 32 KB
Note that there are several options on snapshot creation, including limiting the maximum amount of disk space that the snap can take on its backing store.
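For example, to cap the backing store for the /opt snapshot at 500 MB, you can add the maxsize suboption (a sketch; this suboption is documented in the AnswerBook2 supplement later in this column):
# fssnap -o backing-store=/snap,maxsize=500m /opt
/dev/fssnap/1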
From the system point of view, the snapshot looks a bit strange. The disk use, at least at the initial snap, is minimal as would be expected:
# df -k
Filesystem            kbytes    used   avail capacity  Mounted on
/dev/dsk/c0t0d0s0    4030518 1411914 2578299    36%    /
/proc                      0       0       0     0%    /proc
fd                         0       0       0     0%    /dev/fd
mnttab                     0       0       0     0%    /etc/mnttab
swap                  653232      16  653216     1%    /var/run
swap                  653904     688  653216     1%    /tmp
/dev/dsk/c0t1d0s7    5372014  262299 5055995     5%    /opt
/dev/dsk/c0t0d0s7    4211158 2463312 1705735    60%    /export/home
/dev/dsk/c0t1d0s0    1349190    3313 1291910     1%    /snap
However, an ls shows seemingly very large files:
# ls -l /snap
total 6624
drwx------   2 root     root        8192 Oct 31 10:19 lost+found
-rw-------   1 root     other  4178771968 Oct 31 10:30 snapshot0
-rw-------   1 root     other  5500942336 Oct 31 10:24 snapshot1
These files are “holey”. Logically, they are the same size as the snapped file system. As changes are made to the original, the actual size of the snapshot grows as it holds the original versions of each block. However, almost all of the blocks are empty at the start, and so are left as “holes” in the file. The disk use is thus only the metadata and blocks that have changed.
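A quick way to see the difference between the apparent and actual size is to compare ls and du on the backing-store file. The figures below are illustrative, reusing the numbers from the listings above (du -k reports the blocks actually allocated, which should roughly match the "Backing store size" that fssnap -i reported):
# ls -l /snap/snapshot0
-rw-------   1 root     other  4178771968 Oct 31 10:30 /snap/snapshot0
# du -k /snap/snapshot0
2624    /snap/snapshot0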
The performance impact of a snapshot is that any write to a snapped file system first has the original block copied to the snapshot, so writes cost roughly twice as much as on a non-snapped file system. This is similar to the overhead of RAID-1. Typically, RAID-1 writes are done synchronously to both mirror devices; that is, the writes must make it to both disks before the write operation is considered to be complete. This extra overhead makes writes more expensive. It is not clear whether fssnap's copy-on-write operations are done synchronously or asynchronously, although it is likely the former.
What can be done once a snapshot is created? Certainly a backup can be made of the snapshot, solving the previously ever-present “how to back up a live file system consistently” problem. In fact, fssnap has built-in options to make it trivial to use in conjunction with ufsdump:
# ufsdump 0ufN /dev/rmt/0 `fssnap -F ufs -o raw,bs=/snap,unlink \
/dev/rdsk/c0t0d0s0`
This command will snapshot the root partition, ufsdump it to tape, and then unlink the snapshot so the snapshot file is removed when the command finishes (or at least the file should be removed). In testing, the unlink option does indeed unlink the snapshot file, but the fssnap -d command is required to terminate the use of the snapshot and actually free up the disk space. Thus, this would be the full command:
# ufsdump 0ufN /dev/rmt/0 `fssnap -F ufs -o raw,bs=/snap,unlink \
/dev/rdsk/c0t0d0s0`
# fssnap -d /
fssnap gets interesting when the snapshot itself is mounted for access, as in:
# mount -o ro /dev/fssnap/0 /mnt
Now we can create a file in /, and see that it does not appear in /mnt:
# touch /foo
# ls -l /foo
-rw-r--r--   1 root     other          0 Nov  5 12:25 /foo
# ls -l /mnt/foo
/mnt/foo: No such file or directory
Unfortunately, there does not appear to be any method to promote the snapshot to replace the current active file system. For example, if a systems administrator were about to attempt something complicated, such as a system upgrade, she could perform a snapshot first. If she did not like the result, she could restore the system to the snapshot version. (Of course, the “live upgrade” feature that is just now rolling out as part of Solaris provides similar functionality.)
The backing store can be deleted manually after it’s finished. fssnap -d “deletes” the snap, but that is probably the wrong terminology. Rather, it stops the use of the snapshot, more like “detaching” it from the source file system. To actually remove the snapshot, the snapshot file must also be deleted via rm.
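For example, to completely remove the snapshot of / created earlier (a brief sketch; the "Deleted snapshot 0." message follows the output format shown in the Sun documentation later in this column):
# fssnap -d /
Deleted snapshot 0.
# rm /snap/snapshot0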
Alternately, the unlink option can be specified when the snap is created. This prevents a file system directory entry from being made for the file. In essence, it is then like an open, deleted file. Once the file is closed, the inode and its data are automatically removed. Unlinked files are not visible in the file system via ls and similar commands, making them harder to manage than normal “linked” files.
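As a sketch, creating the snapshot with the unlink suboption added looks like this; note that no snapshot0 file appears in the backing-store directory afterward (the device number assumes this is the only snapshot on the system):
# fssnap -o backing-store=/snap,unlink /
/dev/fssnap/0
# ls /snap
lost+found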
Apparently only one active snapshot of a file system can exist. This limits the utility of UFS snapshots to be a kind of safety net for users or systems administrators. For instance, a snapshot could be made once a night, but only one day’s worth of old data would then be available.
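One possible implementation of that nightly safety net is a root crontab entry that tears down and re-creates the snapshot each night. This is only a sketch, not something from the article: it assumes the backing store lives in /snap, uses the unlink suboption so the old backing-store file is freed automatically, and accepts that the very first run has no snapshot to delete:
0 1 * * * /usr/sbin/fssnap -d /export/home; /usr/sbin/fssnap -o backing-store=/snap,unlink /export/home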
Another limitation comes from the Sun documentation. If the backing-store file runs out of space, the snapshot might delete itself, which could cause backups and access to the snapshot to fail. Errors are logged to /var/adm/messages, which should be checked for possible snapshot errors.
Summary
On the whole, fssnap is a welcome addition to the UFS functionality. Sun is obviously paying attention to its file systems, and adding features to make UFS more competitive with the commercial offerings. There are a couple of limitations in the current implementation that make it useful mostly for creating consistent ufsdump backups. I hope the functionality will increase over time.
Useful References
Book publishers are finally starting to take advantage of the Internet to allow their books to be more widely used. Two examples are Books24x7 (http://www.books24X7.com) and O’Reilly’s Safari (http://safari.oreilly.com/). Books24x7 provides unlimited access to hundreds of books online, with a sophisticated search engine, live links from within books to Internet resources, and fully scanned diagrams. Of course, it has a price to match those high-end features.
The Safari project, which includes O’Reilly, Addison Wesley, New Riders, Prentice Hall, and other presses, provides access to many books, but in a more limited fashion. It has a point system in which you pay for a certain number of points per month, and that gives you access to that many points worth of books in the month. Every month, you can change which books you access with those points.
Both services are worth looking into if you enjoy ready access to reference and “how-to” materials.
Letters
Thanks for the informative note from Boyd Fletcher:
Good article (Reliable Network with Solaris, November 2001: http://www.samag.com/documents/s=1441/sam0111i/0111i.htm). You are correct — AP and IP Multipathing are mutually exclusive. AP has been replaced as of Solaris 8 07/01 with MPXIO (IO multipathing) for hard disks and IP multipathing for network cards. Both are easier to configure, faster, and more reliable than AP. On the Serengeti machines AP is not available, so you have to use MPXIO and IPMP. You might want to mention in a follow-up article that IPMP can wreak havoc on your VLANs and may not even work if you are running with port lockdowns based on MAC addresses.
Also, SAN StorEdge 3.0 at http://www.sun.com/storage/san/ is required for MPXIO. AP will still work with existing hardware, just not on the SunFire line.
Boyd
AnswerBook2 Solaris 8 System Administration Supplement
The fssnap command is new in the Solaris 8 1/01 release. The following information supplements information on backing up file systems that is in "Backing Up and Restoring File Systems (Overview)" in the System Administration Guide, Volume 1.
For the most current man pages, use the man command. The Solaris 8 Update release man pages include new feature information that is not in the Solaris 8 Reference Manual Collection.
The Solaris 8 1/01 release includes the new fssnap command for backing up file systems while the file system is mounted.
You can use the fssnap command to create a read-only snapshot of a file system. A snapshot is a temporary image of a file system, intended for backup operations.
When the fssnap command is run, it creates a virtual device and a backing-store file. You can back up the virtual device, which looks and acts like a real device, with any of the existing Solaris backup commands. The backing-store file is a bitmapped file that contains copies of pre-snapshot data that has been modified since the snapshot was taken.
UFS snapshots enables you to keep the file system mounted and the system in multiuser mode during backups. Previously, you were advised to bring the system to single-user mode to keep the file system inactive when you used the ufsdump command to perform backups. You can also use additional Solaris backup commands like tar and cpio to back up a UFS snapshot for more reliable backups.
The fssnap command gives administrators of non-enterprise-level systems the power of enterprise-level tools like Sun StorEdge(TM) Instant Image without the large storage demands.
UFS snapshots is similar to the Instant Image product. Instant Image allocates space equal to the size of the entire file system that is being captured. However, the backing-store file that was created by UFS snapshots occupies only as much disk space as needed, and you can place a maximum size on the backing-store file.
This table describes specific differences between UFS snapshots and Instant Image.
UFS Snapshots | Instant Image
---|---
Size of the backing-store file depends on how much data has changed since the snapshot was taken | Size of the backing-store file equals the size of the entire file system being copied
Does not persist across system reboots | Persists across system reboots
Works on UFS file systems, including root (/) and /usr | Cannot be used with the root (/) or /usr file systems
Part of the Solaris 8 1/01 release | Part of the Enterprise Services Package
Although UFS snapshots can make copies of large file systems, Instant Image is better suited for enterprise-level systems. UFS snapshots is better suited for smaller systems.
When the file-system snapshot is first created, users of the file system might notice a slight pause. The length of the pause increases with the size of the file system to be captured. While the file-system snapshot is active, users of the file system might notice a slight performance impact when the file system is written to, but they will see no impact when the file system is read.
When you use the fssnap command to create a file-system snapshot, watch how much disk space the backing-store file consumes. The backing-store file initially uses no space, but it can grow quickly, especially on heavily used systems. Make sure the backing-store file has enough room to grow, or limit its size with the -o maxsize=n[k,m,g] option, where n[k,m,g] is the maximum size of the backing-store file.
If the backing-store file runs out of space, the snapshot might delete itself, which causes the backup to fail. Check the /var/adm/messages file for possible snapshot errors.
To create a snapshot, first check that the file system that will hold the backing-store file has enough space, make sure a backing-store file of the same name does not already exist, and then run fssnap:
# df -k
# ls /file-system/backing-store-file
# fssnap -F ufs -o bs=/file-system/backing-store-file /file-system
The following example creates a snapshot of the /usr file system. The backing-store file is /scratch/usr.back.file, and the virtual device is /dev/fssnap/1.
# fssnap -F ufs -o bs=/scratch/usr.back.file /usr
/dev/fssnap/1
The following example limits the backing-store file to 500 Mbytes.
# fssnap -F ufs -o maxsize=500m,bs=/scratch/usr.back.file /export/home
/dev/fssnap/1
You can display the current snapshots on the system by using the fssnap -i option. If you specify a file system, you see detailed information about that snapshot. If you don't specify a file system, you see information about all of the current file-system snapshots and their corresponding virtual devices.
# fssnap -i
   0    /
   1    /usr
To display detailed information about a specific snapshot, use the following:
# fssnap -i /usr
Snapshot number               : 1
Block Device                  : /dev/fssnap/1
Raw Device                    : /dev/rfssnap/1
Mount point                   : /usr
Device state                  : idle
Backing store path            : /scratch/usr.back.file
Backing store size            : 480 KB
Maximum backing store size    : Unlimited
Snapshot create time          : Tue Aug 08 09:57:07 2000
Copy-on-write granularity     : 32 KB
When you create a UFS snapshot, you can specify that the backing-store file is unlinked, which means the backing-store file is removed after the snapshot is deleted. If you don't specify the -o unlink option when you create a UFS snapshot, you will have to delete it manually.
The backing-store file occupies disk space until the snapshot is deleted, whether you use the -o unlink option to remove the backing-store file or you remove it manually.
You can delete a snapshot either by rebooting the system or by using the fssnap -d command and specifying the path of the file system that contains the file-system snapshot.
First display the current snapshots, then delete the one for the file system in question, and, if the unlink option was not used, remove the backing-store file manually:
# fssnap -i
# fssnap -d /file-system
Deleted snapshot 1.
# rm /file-system/backing-store-file
The following example deletes a snapshot and assumes that the unlink option was not used.
# fssnap -i
   0    /
   1    /usr
# fssnap -d /usr
Deleted snapshot 1.
# rm /scratch/usr.back.file
The virtual device that contains the file-system snapshot acts as a standard read-only device. This means you can back up the virtual device as if you were backing up a file-system device.
If you are using the ufsdump command to back up a UFS snapshot, you can specify the snapshot name during the backup. See the following section for more information.
If you are using the tar command to back up the snapshot, mount the snapshot before backing it up, like this:
# mkdir /backups/home.bkup
# mount -F ufs -o ro /dev/fssnap/1 /backups/home.bkup
# cd /backups/home.bkup
# tar cvf /dev/rmt/0 .
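The supplement also mentions cpio as an option. Here is a hedged sketch of the equivalent cpio backup, assuming the snapshot is still mounted on /backups/home.bkup as above:
# cd /backups/home.bkup
# find . -print | cpio -ocB > /dev/rmt/0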
For more information on how to back up a file system, see "Backing Up Files and File Systems (Tasks)" in the System Administration Guide, Volume 1.
To back up a snapshot with ufsdump, first display the snapshot information to identify its device name:
# fssnap -i /file-system
For example:
# fssnap -i /usr
Snapshot number               : 1
Block Device                  : /dev/fssnap/1
Raw Device                    : /dev/rfssnap/1
Mount point                   : /usr
Device state                  : idle
Backing store path            : /scratch/usr.back.file
Backing store size            : 480 KB
Maximum backing store size    : Unlimited
Snapshot create time          : Tue Aug 08 09:57:07 2000
Copy-on-write granularity     : 32 KB
Then run ufsdump, giving the snapshot device as the file system to dump:
# ufsdump 0ucf /dev/rmt/0 /snapshot-name
For example:
# ufsdump 0ucf /dev/rmt/0 /dev/rfssnap/1
To verify that the snapshot is on the tape:
# ufsrestore ta /dev/rmt/0
If you want to create a file-system snapshot incrementally, which means only the files that have been modified since the last snapshot are backed up, use the ufsdump command with the new N option. This option specifies the file-system device name to be inserted into the /etc/dumpdates file for tracking incremental dumps.
The following ufsdump command specifies an embedded fssnap command to create an incremental dump of a file system.
# ufsdump 1ufN /dev/rmt/0 /dev/rdsk/c0t1d0s0 `fssnap -F ufs -o raw,bs=/export/scratch,unlink /dev/rdsk/c0t1d0s0`
The -o raw option is used in the example to display the name of the raw device instead of the block device. By using this option, you make it easier to embed the fssnap command in commands that require the raw device instead, such as the ufsdump command.
Again, verify that the backup is on the tape:
# ufsrestore ta /dev/rmt/0
The backup created from the virtual device is essentially a backup of what the original file system looked like when the snapshot was taken. When you restore from the backup, restore it as if you had backed up the original file system directly, for example with the ufsrestore command. For more information on restoring file systems, see "Restoring Files and File Systems (Tasks)" in the System Administration Guide, Volume 1.
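As a brief sketch of that restore path (the disk device here is hypothetical; it assumes a level 0 ufsdump of the snapshot is on /dev/rmt/0 and that you are restoring into a newly created and mounted file system):
# mount /dev/dsk/c0t3d0s7 /mnt
# cd /mnt
# ufsrestore rf /dev/rmt/0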