Yesterday,
All those backups seemed a waste of pay.
Now my database has gone away.
Oh I believe in yesterday.
(unknown source; well, originally Paul McCartney :-)
If you have never had a hardware crash that wiped out your hard drive, you can't fully appreciate the importance of backups. But if you do not have an adequate backup, it may be because Linux became way too complex and it is unclear what the optimal path is, for example, for bare metal recovery of the OS. As a result people simply never start working on it :-). Relying on the corporate backup infrastructure can save you from being fired, but it is generally too optimistic a policy. That path can fail you, especially if the corporation uses complex tools like Data Protector, which can stop making backups on a whim, without any user or sysadmin intervention.
After a disaster the key question is: did you have a good backup? That means that an hour spent now may save you several days of frantic efforts to recover the data later. And not only the user data, for which an enterprise backup may (or may not) exist (again, Data Protector sometimes stops making backups for no apparent reason, so even with enterprise software there is no guarantee), but also a recoverable OS backup (rescue image).
I think that the attitude of a sysadmin toward backups reflects the level of professionalism of a Linux/Unix system administrator better than many other metrics, and as such it is a good indicator of the potential of a candidate during job interviews.
And "good" means recent (in most cases preferably from today or, in the worst case, from yesterday) and readable. In this sense, keeping a local OS backup on a USB stick such as a SanDisk Cruzer Fit (available in sizes up to 128 GB) makes perfect sense, as the amount of data in a "pure" OS backup is usually minuscule (around 8 GB for most versions of Linux). Add some critical system directories and user data and you might end up with 32 GB of compressed tar archives. You can use a tool that utilizes hardlinks to de-duplicate previous versions; one good example of such a tool is rsnapshot.
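As an illustration, here is a minimal sketch of such a hardlink-deduplicated backup to a USB stick using plain rsync with --link-dest (rsnapshot automates the same idea). The mount point /mnt/usb and the directory list are assumptions, not recommendations:

#!/bin/bash
# Sketch: keep dated, hardlink-deduplicated copies of the OS configuration on a USB stick.
DEST=/mnt/usb/os-backup                              # assumed mount point of the USB stick
TODAY=$(date +%F)
PREV=$(ls -1d "$DEST"/20* 2>/dev/null | tail -1)     # newest previous snapshot, if any
mkdir -p "$DEST/$TODAY"
if [ -n "$PREV" ] && [ "$PREV" != "$DEST/$TODAY" ]; then
    rsync -aH --delete --link-dest="$PREV" /etc /root /boot "$DEST/$TODAY/"
else
    rsync -aH --delete /etc /root /boot "$DEST/$TODAY/"
fi

Unchanged files in today's snapshot are hardlinked to yesterday's copies, so each extra day costs almost no additional space.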
Often disasters happen as a chain of unpredictable events, for example a power outage. In this case a server that worked fine before can refuse to boot, and some component might need to be replaced. If the hardware has been damaged, the parts should be replaced before attempting to restore the OS and data, no matter what level of pressure is applied to you. Being suicidal is not the best strategy in such situations.
In Linux you usually can recover each filesystem separately. Often you can find some unused server or VM on which you can restore at least part of the data and this way relieve the pressure from users. If you need to repartition the drive, the easiest way to do it in RHEL is to use the Kickstart file (saved by the installer as /root/anaconda-ks.cfg), which contains the partitioning instructions you used in the initial OS install. Preserving the Kickstart file is one of the functions of any decent baseliner, so if you have a baseline you are covered.
After that you can use cpio, tar, or another backup tool to restore partition by partition. Simple tools are preferable because they are the most flexible and the most reliable. Discovering some unanticipated nuance of your complex backup software during recovery is a situation you would not wish even on an enemy.
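For example, a partition-by-partition restore from simple tar archives might look like the sketch below; the device names and the archive location on a USB drive are assumptions:

# Restore the root and /var filesystems from tar archives kept on a USB drive.
mount /dev/sda2 /mnt/sysimage                     # freshly created root filesystem
tar -xpf /mnt/usb/root.tar -C /mnt/sysimage
mkdir -p /mnt/sysimage/var
mount /dev/sda3 /mnt/sysimage/var                 # freshly created /var filesystem
tar -xpf /mnt/usb/var.tar -C /mnt/sysimage/var
touch /mnt/sysimage/.autorelabel                  # RHEL: force an SELinux relabel on first boot
# With cpio archives the equivalent for the root filesystem would be:
#   cd /mnt/sysimage && gunzip -c /mnt/usb/root.cpio.gz | cpio -idmv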
Linux has pretty flexible and powerful remounting capabilities, and chroot can often help as well. To remount the root partition from read-only to read-write you can use the command mount -o remount,rw /mnt.
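A typical rescue-mode sequence built on these capabilities looks roughly like this (device names are assumptions):

mount /dev/sda2 /mnt                 # mount the damaged root filesystem
mount -o remount,rw /mnt             # or flip it from read-only to read-write
mount --bind /dev  /mnt/dev
mount --bind /proc /mnt/proc
mount --bind /sys  /mnt/sys
chroot /mnt /bin/bash                # work inside the installed system, e.g. reinstall the bootloader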
But, as people say, each disaster also represents an opportunity: first of all to improve and enhance your backup strategy. It is also a perfect time to apply firmware fixes, especially if you are too far behind.
The key lesson in such a situation is that whatever caused the crash or data loss this time may strike again. And due to Murphy's law it will happen at the most inopportune time, when your backups are most out of date.
One thing that distinguishes a professional Unix sysadmin from an amateur is the attitude toward backups and the level of knowledge of backup technologies. A professional sysadmin knows all too well that often the difference between a major SNAFU and a nuisance is the availability of an up-to-date backup of the data. Here is a pretty telling poem (from an unknown source; well, originally by Paul McCartney :-) on the subject:
Yesterday,
All those backups seemed a waste of pay.
Now my database has gone away.
Oh I believe in yesterday.
You can read more about this important issue in the section called "Missing backup" of Sysadmin Horror Stories.
Bare metal recovery is the process of rebuilding a computer after a catastrophic failure. Because such situations are rare and happen once in several years, from one incident to the next you lose all the qualification obtained during the previous incident. Typically you start again at level zero. That level can be raised somewhat if you have a good post-mortem of the previous incident, written up as a Web page.
Generally Unix and Linux suffer from neglect here: there are very few good bare metal backup solutions.
The most important factor is your level of knowledge of the tool, not the tool itself. So the right strategy is to use the tool that you know best; most often that means tar. That is why Relax-and-Recover is such an attractive backup solution despite being very brittle: you will have the tarball, and even if the recovery disk does not boot you will still be able to recover all or most of the information and at least get the OS bootable again.
You can also create your own "poor man's backup solution" using Kickstart: preserve the Kickstart file from the initial install together with tarballs of /etc and other critical data. In this case the restore process boils down to reinstalling a minimal OS from the Kickstart file and then laying the saved tarballs on top of it, roughly as sketched below.
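A minimal sketch of such a restore, assuming the Kickstart file was saved as ks.cfg on the rescue USB stick and the tarballs were written to the same stick:

# 1. Boot the server from install media that points at the saved Kickstart file,
#    e.g. at the RHEL boot prompt (the label and file name are assumptions):
#      linux inst.ks=hd:LABEL=RHEL-USB:/ks.cfg
# 2. After the minimal install completes, put the saved data back on top of it:
mount /dev/sdb1 /mnt/usb
tar -xpzf /mnt/usb/etc.tgz  -C /
tar -xpzf /mnt/usb/home.tgz -C /
# 3. Reinstall any extra packages recorded in the baseline (e.g. from saved rpm -qa output), then reboot.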
NOTE: RAM-based distributions like SystemRescueCD or Knoppix are not very convenient to use with RHEL, as they do not understand the security attributes (SELinux contexts) used in RHEL. A USB drive created from your Kickstart file and the standard RHEL ISO is the only reliable rescue disk for this flavor of Linux.
In a bare metal recovery the target computer starts with no operating system on it at all. The normal restoration process is therefore: boot from rescue or install media, recreate the partitions and filesystems, restore the OS from the backup, and finally restore the data.
First you need to back up your system with a classic backup tool such as dd, tar, or cpio. Often cpio is used for the root partition (it is able to restore device files and links); for other partitions tar can be used.
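A minimal sketch of creating such archives (the destination /mnt/usb is an assumption):

# Root filesystem with cpio (find -xdev keeps it on one filesystem):
cd / && find . -xdev -print0 | cpio -0 -o -H newc | gzip > /mnt/usb/root.cpio.gz
# Other partitions with tar:
tar --one-file-system -cpzf /mnt/usb/var.tgz  /var
tar --one-file-system -cpzf /mnt/usb/home.tgz /home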
Users of Red Hat Package Manager (RPM)-based Linux distributions should also save RPM metadata as part of their normal backups or baseline. Something like:
rpm -Va > /etc/rpmVa.txt
in your backup script will give you a basis for comparison after a bare metal restoration.
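After the restore, the saved file gives you something to diff against; for example:

rpm -Va > /tmp/rpmVa-after-restore.txt
diff /etc/rpmVa.txt /tmp/rpmVa-after-restore.txt | less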
To get to this point, you need to have working replacement hardware, bootable rescue or install media, and readable backup media at hand.
The last stage is a total restoration from tape or other media. After you have done that last stage, you should be able to boot to a fully restored and operational system.
There is a proverb that defeated armies learn really well. This is also true for a sysadmin after a SNAFU connected with the loss of data.
The size of the backup depends on how well your data are organized. Often excessive size is caused by "bloat": users are especially bad in this respect, and many of them are real pack rats who collect ISOs and other easily downloadable files in their home directories. After you have restored your system, it is a good time to delete unnecessary files and users, de-install useless applications, daemons, and RPMs, etc. In preparation for a complete backup of the system, it is a good idea to empty the temporary folders and remove any unwanted ISO and installation files, especially if you store them in a system directory like /var.
Unmount any external drives and remove any optical media such as CDs or DVDs that you do not want to include in the backup. This will reduce the number of exclusions you need to type later in the process.
Go through the contents of the user directories in /home and delete unwanted files in the subdirectories; people often download files and then forget about them.
If you normally boot directly to X, that could cause problems. To be safe, change your boot run level temporarily. In /etc/inittab, find the line that looks like this:
id:5:initdefault:
and change it to this:
id:3:initdefault:
When you're multi-booting or installing a new operating system onto a used system, sometimes the MBR (Master Boot Record) gets all messed up, so you need to wipe it out and start over. You can do this with the dd command. Be sure to use your own drive name for the of= value:
# dd if=/dev/zero of=/dev/hda bs=446 count=1
That preserves the partition table. If you also want to zero out the partition table, do this:
# dd if=/dev/zero of=/dev/hda bs=512 count=1
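Before zeroing anything it is prudent to save a copy of the existing MBR first, so the operation can be undone; a minimal example:

dd if=/dev/hda of=/root/mbr-backup.bin bs=512 count=1     # saves the boot code plus partition table
# and, if needed later:
# dd if=/root/mbr-backup.bin of=/dev/hda bs=512 count=1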
You can use the logrotate utility to rotate simple backups and baselines. For example, the stanza below rotates files with the .tgz extension located in the /var/data/etcbackups directory. The parameters in this stanza override the global defaults in the /etc/logrotate.conf file: the rotated files won't be compressed again, they will be kept for 30 days, empty files will not be rotated, and new files will be created with permissions 0600, owned by root.
/var/data/etcbackups/*.tgz {
    daily
    rotate 30
    nocompress
    missingok
    notifempty
    create 0600 root root
}

This way you can simply put a tar command in cron (see the sketch below) and logrotate will ensure preservation of the last 30 copies of the /etc directory, so you can avoid writing a script that takes care of all those actions.
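The cron side can be as small as the following sketch (the script name and path are assumptions; it writes the archive that the logrotate stanza above rotates):

#!/bin/bash
# /etc/cron.daily/etc-backup (assumed name): archive /etc for logrotate to rotate and keep.
tar -cpzf /var/data/etcbackups/etc.tgz /etc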
Backups are usually run in one of three general forms: full, differential, or incremental.
Typically, a full backup is coupled with a series of either differential backups or incremental backups, but not both. For example, a full backup could be run once per week with six daily differential backups on the remaining days. Using this scheme, a restoration is possible from the full backup media and the most recent differential backup media. Using incremental backups in the same scenario, the full backup media and all of the incremental backup media would be required to restore the system. The choice between the two is mainly a trade-off between media consumption and backup time on one hand (differential backups grow larger and take longer as the week progresses, particularly on heavily used systems) and restore complexity on the other (an incremental restore needs every incremental media set made since the full backup).
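With GNU tar the incremental variant of this scheme can be implemented directly via --listed-incremental; a minimal sketch (paths are assumptions):

mkdir -p /var/lib/backup /backup
# Sunday: full (level 0) backup; the .snar snapshot file records what was dumped
tar --listed-incremental=/var/lib/backup/home.snar -cpzf /backup/home-full.tgz /home
# Other days: incremental backups against the same snapshot file
tar --listed-incremental=/var/lib/backup/home.snar -cpzf /backup/home-$(date +%a).tgz /home
# For differential backups instead, copy the level-0 .snar aside and reuse that copy before each run.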
For large organizations that require retention of historical data, a backup scheme longer than a week is created. Incremental or differential backup media are retained for a few weeks, after which the tapes are reformatted and reused. Full backup media are retained for an extended period, in some cases permanently. At the very least, one full backup from each month should be retained for a year or at least six months.
A backup scheme such as this is called a media rotation scheme , because media are continually written, retained for a defined period, and then reused. The media themselves are said to belong to a media pool, which defines the monthly full, the weekly full, and differential or incremental media assignments, as well as when media can be reused. When media with full backups are removed from the pool for long-term storage, new media join the pool, keeping the size of the pool constant. Media may also be removed from the pool if your organization chooses to limit the number of uses media are allowed, assuming that reliability goes down as the number of passes through a tape mechanism increases.
The architecture of the backup scheme depends on the usage pattern of the computers and the type of data being backed up. It is clear that backing up, every week, a set of vendor ISOs that are stored on the server but never change is a waste of space and time (but that is how large organizations typically operate).
On systems in which many people frequently update mission-critical data, a more conservative backup scheme is essential with a larger percentage of full backups. For casual-use systems, such as desktop PCs, only a weekly full backup usually is needed (but you will be surprised how much critical data in an organization is stored outside of servers on user desktops).
To be effective, backup media must be tested to ensure successful restoration of files. That means that any backup scheme must also include a systematic, recurrent procedure of backup verification, in which recently written backup media are tested for successful restore operations. This is especially important if failures are almost non-existent and backups have not been used for restoration for over a month; this creates complacency for which an organization can pay dearly.
Such a test requires a "restore" server and involves restoration of a selected group of files on a periodic basis. For example, for OS partitions, restoring /boot and the /etc directory gives pretty good information about the quality of the backup. Some organizations practice restoration of a set of randomly selected files. In any case, a periodic test of a full restore is also necessary, but it can be performed more rarely; the testing scheme can mirror the backup scheme.
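A small sketch of such a recurring restore test (the archive location and the spot-checked files are assumptions):

#!/bin/bash
# Restore the newest /etc archive into a scratch directory and spot-check a few files.
set -e
latest=$(ls -1t /var/data/etcbackups/etc.tgz* | head -1)
rm -rf /tmp/restore-test && mkdir -p /tmp/restore-test
tar -xpzf "$latest" -C /tmp/restore-test
for f in etc/fstab etc/passwd etc/ssh/sshd_config; do
    cmp -s "/tmp/restore-test/$f" "/$f" || echo "DIFFERS or missing: $f"
done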
Verification is as critical as the backup process itself. Discovering that your backup data are unusable in the middle of a catastrophic event is too high a price to pay for avoiding the small load associated with regular backup verification.
Without solid proof that your backups are reliable and can be used in case of data loss, you do not actually have a backup. You are just playing a game of chance, a backup lottery so to speak.
Two important newer capabilities are the availability of SSD drives and of internal SD-card-based storage on servers, such as Dell vFlash.
Dell vFlash media is a standard Class 10 SD card which inserts into the iDRAC Enterprise daughter card, either on the back of the server or in a slot on the front panel (for Dell 12th-generation rack servers such as the R620, and for tower servers). Blades must be removed from their chassis to access this port. You can use Dell-provided SD cards or your own, as long as you have the iDRAC Enterprise license.
The SD card can be up to 16 GB and is accessible from the server front panel. It emulates USB flash storage to the operating system (OS), but unlike a regular USB drive its content can be updated remotely through the iDRAC GUI over the DRAC network interface.
It is very convenient for backups of configuration files such as those in the /etc directory. For small servers, storing a full backup of the OS partitions is also feasible.
A tremendous advantage of this arrangement is that the data do not travel over the backup network segment.
In some cases the server is loaded 24x7 and there is no time slot in which to perform the backup. In this case one can use a so-called slow backup, in which the backup process is artificially throttled so that it does not kill I/O bandwidth on the server. SSD drives, although more expensive, are better for this type of server, as their read speed is much higher than their write speed. In the case of NFS-mounted storage, it can be mounted read-only on a separate server and the backup can be performed from this "backup server", creating a separate network path that does not interfere with the network streams on the production servers.
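rsync's --bwlimit option is one easy way to implement such a throttled backup; for example (the host and paths are assumptions):

# Limit the transfer to roughly 20 MB/s so the backup does not saturate server I/O or the network
rsync -aH --bwlimit=20000 /var/lib/pgsql/backups/ backupserver:/backup/db-dumps/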
With USB3 connections and an SSD drive your "shadow IT" backup can be really fast ;-)
Generally, network backup is trickier than local backup, as you need to carefully plan a separate, switched "backup segment". Since a typical server has at least four NICs, it makes sense to dedicate one interface to a physically separate backup segment. As network activity at night is usually the lowest, backups can be scheduled on a sliding schedule that takes the network topology into account, starting at, say, 9 PM.
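In practice this means addressing the backup target by its address on the backup segment and staggering start times in cron; a sketch (the addresses and paths are assumptions, and passwordless ssh keys are assumed to be in place):

# crontab entries on a production server; 10.10.10.50 is the backup server's address on the backup VLAN
0 21 * * *  rsync -aH --delete /etc/  10.10.10.50:/backup/$(hostname -s)/etc/
30 22 * * * rsync -aH --delete /home/ 10.10.10.50:/backup/$(hostname -s)/home/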
Problems arise when a backup is so big that it takes all night and spills into the morning hours. In this case it can badly affect your network.
Here source-based de-duplication has tremendous value, as it minimizes network data flows and makes spilling of the backup into the morning hours less frequent.
The quality of the backup architecture in an IT organization usually correlates directly with the quality of the IT organization in general. Lack of thought in organizing backups, and the backing up of large amounts of unnecessary data, are sure signs of a stagnant IT organization that has lost touch with reality and exists just for the benefit of barely competent middle management.
Aug 28, 2017 | superuser.com
Torsten Bronger, asked Dec 29 '12: Trying to copy files with rsync, it complains:

rsync: send_files failed to open "VirtualBox/Machines/Lubuntu/Lubuntu.vdi" \
    (in media): Permission denied (13)

That file is not copied. Indeed the file permissions of that file are very restrictive on the server side:

-rw------- 1 1000 1000 3133181952 Nov 1 2011 Lubuntu.vdi

I call rsync with

sudo rsync -av --fake-super root@sheldon::media /mnt/media

The rsync daemon runs as root on the server. root can copy that file (of course). rsyncd has "fake super = yes" set in /etc/rsyncd.conf.

What can I do so that the file is copied without changing the permissions of the file on the server?
If you use RSync as daemon on destination, please post grep rsync /var/log/daemon
to improve your question – F. Hauri Dec 29 '12 at 13:23
As you appear to have root access to both servers, have you tried --force? Alternatively you could bypass the rsync daemon and try a direct sync, e.g.
rsync -optg --rsh=/usr/bin/ssh --rsync-path=/usr/bin/rsync --verbose --recursive --delete-after --force root@sheldon::media /mnt/media
– answered by arober11, Dec 29 '12 (edited Jan 2 '13)
Using ssh means encryption, which makes things slower. --force does only affect directories, if I read the man page correctly. – Torsten Bronger Jan 1 '13 at 23:08
Unless your using ancient kit, the CPU overhead of encrypting / decrypting the traffic shouldn't be noticeable, but you will loose 10-20% of your bandwidth, through the encapsulation process. Then again 80% of a working link is better than 100% of a non working one :) – arober11 Jan 2 '13 at 10:52
do have an "ancient kit". ;-) (Slow ARM CPU on a NAS.) But I now mount the NAS with NFS and use rsync (with "sudo") locally. This solves the problem (and is even faster). However, I still think that my original problem must be solvable using the rsync protocol (remote, no ssh). – Torsten Bronger Jan 4 '13 at 7:55
Aug 28, 2017 | unix.stackexchange.com
nixnotwin , asked Sep 21 '12 at 5:11
On my Ubuntu server there are about 150 shell accounts. All usernames begin with the prefix u12.. I have root access and I am trying to copy a directory named "somefiles" to all the home directories. After copying the directory the user and group ownership of the directory should be changed to the user's. Username, group and home-dir name are the same. How can this be done?

Gilles , answered Sep 21 '12 at 23:44:
Do the copying as the target user; this will automatically give the target files the right ownership. Make sure that the original files are world-readable (or at least readable by all the target users). Run chmod afterwards if you don't want the copied files to be world-readable.

getent passwd | awk -F : '$1 ~ /^u12/ {print $1}' | while IFS= read -r user; do
    su "$user" -c 'cp -Rp /original/location/somefiles ~/'
done
Aug 28, 2017 | stackoverflow.com
jeffery_the_wind , asked Mar 6 '12 at 15:36:
I am using rsync to replicate a web folder structure from a local server to a remote server. Both servers are ubuntu linux. I use the following command, and it works well:

rsync -az /var/www/ [email protected]:/var/www/

The usernames for the local system and the remote system are different. From what I have read it may not be possible to preserve all file and folder owners and groups. That is OK, but I would like to preserve owners and groups just for the www-data user, which does exist on both servers.
Is this possible? If so, how would I go about doing that?
Thanks!
** EDIT **
There is some mention of rsync being able to preserve ownership and groups on remote file syncs here: http://lists.samba.org/archive/rsync/2005-August/013203.html
** EDIT 2 **
I ended up getting the desired effect thanks to many of the helpful comments and answers here. Assuming the IP of the source machine is 10.1.1.2 and the IP of the destination machine is 10.1.1.1, I can use this line from the destination machine:

sudo rsync -az [email protected]:/var/www/ /var/www/

This preserves the ownership and groups of the files that have a common user name, like www-data. Note that using
rsync
without sudo
does not preserve these permissions.

ghoti , answered Mar 6 '12 at 19:01:
You can also sudo the rsync on the target host by using the--rsync-path
option:# rsync -av --rsync-path="sudo rsync" /path/to/files user@targethost:/pathThis lets you authenticate as
user
on targethost, but still get privileged write permission throughsudo
. You'll have to modify your sudoers file on the target host to avoid sudo's request for your password.man sudoers
or runsudo visudo
for instructions and samples.You mention that you'd like to retain the ownership of files owned by www-data, but not other files. If this is really true, then you may be out of luck unless you implement
chown
or a second run ofrsync
to update permissions. There is no way to tell rsync to preserve ownership for just one user .That said, you should read about rsync's
--files-from
option.rsync -av /path/to/files user@targethost:/path find /path/to/files -user www-data -print | \ rsync -av --files-from=- --rsync-path="sudo rsync" /path/to/files user@targethost:/pathI haven't tested this, so I'm not sure exactly how piping find's output into
--files-from=-
will work. You'll undoubtedly need to experiment.xato , answered Mar 6 '12 at 15:39
As far as I know, you cannotchown
files to somebody else than you, if you are not root. So you would have torsync
using thewww-data
account, as all files will be created with the specified user as owner. So you need tochown
the files afterwards.user2485267 , answered Jun 14 '13 at 8:22
I had a similar problem and cheated the rsync command:

rsync -avz --delete [email protected]:/home//domains/site/public_html/ /home/domains2/public_html && chown -R wwwusr:wwwgrp /home/domains2/public_html/
the && runs the chown against the folder when the rsync completes successfully (1x '&' would run the chown regardless of the rsync completion status)
Graham , answered Mar 6 '12 at 15:51
The root users for the local system and the remote system are different.
What does this mean? The root user is uid 0. How are they different?
Any user with read permission to the directories you want to copy can determine what usernames own what files. Only root can change the ownership of files being written .
You're currently running the command on the source machine, which restricts your writes to the permissions associated with [email protected]. Instead, you can try to run the command as root on the target machine. Your read access on the source machine isn't an issue.
So on the target machine (10.1.1.1), assuming the source is 10.1.1.2:
# rsync -az [email protected]:/var/www/ /var/www/

Make sure your groups match on both machines.
Also, set up access to [email protected] using a DSA or RSA key, so that you can avoid having passwords floating around. For example, as root on your target machine, run:
# ssh-keygen -d

Then take the contents of the file
/root/.ssh/id_dsa.pub
and add it to~user/.ssh/authorized_keys
on the source machine. You canssh [email protected]
as root from the target machine to see if it works. If you get a password prompt, check your error log to see why the key isn't working.ghoti , answered Mar 6 '12 at 18:54
Well, you could skip the challenges of rsync altogether, and just do this through a tar tunnel.sudo tar zcf - /path/to/files | \ ssh user@remotehost "cd /some/path; sudo tar zxf -"You'll need to set up your SSH keys as Graham described.
Note that this handles full directory copies, not incremental updates like rsync.
The idea here is that:
- you tar up your directory,
- instead of creating a tar file, you send the tar output to stdout,
- that stdout is piped through an SSH command to a receiving tar on the other host,
- but that receiving tar is run by sudo, so it has privileged write access to set usernames.
Aug 28, 2017 | superuser.com
I'm trying to use rsync to copy a set of files from one system to another. I'm running the command as a normal user (not root). On the remote system, the files are owned by apache and when copied they are obviously owned by the local account (fred). My problem is that every time I run the rsync command, all files are re-synched even though they haven't changed. I think the issue is that rsync sees the file owners are different and my local user doesn't have the ability to change ownership to apache, but I'm not including the
-a
or-o
options so I thought this would not be checked. If I run the command as root, the files come over owned by apache and do not come a second time if I run the command again. However I can't run this as root for other reasons. Here is the command:

/usr/bin/rsync --recursive --rsh=/usr/bin/ssh --rsync-path=/usr/bin/rsync --verbose [email protected]:/src/dir/ /local/dir
– asked by Fred Snertz, May 2 '11
Why can't you run rsync as root? On the remote system, does fred have read access to the apache-owned files? – chrishiestand May 3 '11 at 0:32
Ah, I left out the fact that there are ssh keys set up so that local fred can become remote root, so yes fred/root can read them. I know this is a bit convoluted but its real. – Fred Snertz May 3 '11 at 14:50
Always be careful when root can ssh into the machine. But if you have password and challenge response authentication disabled it's not as bad. – chrishiestand May 3 '11 at 17:32
up vote down vote Here's the answer to your problem: -c, --checksum This changes the way rsync checks if the files have been changed and are in need of a transfer. Without this option, rsync uses a "quick check" that (by default) checks if each file's size and time of last modification match between the sender and receiver. This option changes this to compare a 128-bit checksum for each file that has a matching size. Generating the checksums means that both sides will expend a lot of disk I/O reading all the data in the files in the transfer (and this is prior to any reading that will be done to transfer changed files), so this can slow things down significantly. The sending side generates its checksums while it is doing the file-system scan that builds the list of the available files. The receiver generates its checksums when it is scanning for changed files, and will checksum any file that has the same size as the corresponding sender's file: files with either a changed size or a changed checksum are selected for transfer. Note that rsync always verifies that each transferred file was correctly reconstructed on the receiving side by checking a whole-file checksum that is generated as the file is transferred, but that automatic after-the-transfer verification has nothing to do with this option's before-the-transfer "Does this file need to be updated?" check. For protocol 30 and beyond (first supported in 3.0.0), the checksum used is MD5. For older protocols, the checksum used is MD4.So run:
/usr/bin/rsync -c --recursive --rsh=/usr/bin/ssh --rsync-path=/usr/bin/rsync --verbose [email protected]:/src/dir/ /local/dirNote there may be a time+disk churn tradeoff by using this option. Personally, I'd probably just sync the file's mtimes too:
/usr/bin/rsync -t --recursive --rsh=/usr/bin/ssh --rsync-path=/usr/bin/rsync --verbose [email protected]:/src/dir/ /local/dir
– answered by chrishiestand, May 3 '11
Awesome. Thank you. Looks like the second option is going to work for me and I found the first very interesting. – Fred Snertz May 3 '11 at 18:40
psst, hit the green checkbox to give my answer credit ;-) Thx. – chrishiestand May 12 '11 at 1:56
Aug 28, 2017 | unix.stackexchange.com
Eugene Yarmash , asked Apr 24 '13 at 16:35
I have a bash script which uses rsync to backup files in Archlinux. I noticed that rsync failed to copy a file from /sys, while cp worked just fine:

# rsync /sys/class/net/enp3s1/address /tmp
rsync: read errors mapping "/sys/class/net/enp3s1/address": No data available (61)
rsync: read errors mapping "/sys/class/net/enp3s1/address": No data available (61)
ERROR: address failed verification -- update discarded.
rsync error: some files/attrs were not transferred (see previous errors) (code 23) at main.c(1052) [sender=3.0.9]

# cp /sys/class/net/enp3s1/address /tmp   ## this works

I wonder why does rsync
fail, and is it possible to copy the file with it?

mattdm , answered Apr 24 '13 at 18:20:
Rsync has code which specifically checks if a file is truncated during read and gives this error: ENODATA. I don't know why the files in /sys have this behavior, but since they're not real files, I guess it's not too surprising. There doesn't seem to be a way to tell rsync to skip this particular check.

I think you're probably better off not rsyncing /sys and using specific scripts to cherry-pick out the particular information you want (like the network card address).
First off/sys
is a pseudo file system . If you look at/proc/filesystems
you will find a list of registered file systems where quite a few hasnodev
in front. This indicates they are pseudo filesystems. This means they exist on a running kernel as a RAM-based filesystem. Further, they do not require a block device.

$ cat /proc/filesystems
nodev   sysfs
nodev   rootfs
nodev   bdev
...
udev
.In
/etc/mtab
you typically find the mount by:sysfs /sys sysfs rw,noexec,nosuid,nodev 0 0For a nice paper on the subject read Patric Mochel's – The sysfs Filesystem .
stat of /sys filesIf you go into a directory under
/sys
and do als -l
you will notice that all files has one size. Typically 4096 bytes. This is reported bysysfs
:/sys/devices/pci0000:00/0000:00:19.0/net/eth2$ ls -l
-r--r--r-- 1 root root 4096 Apr 24 20:09 addr_assign_type
-r--r--r-- 1 root root 4096 Apr 24 20:09 address
-r--r--r-- 1 root root 4096 Apr 24 20:09 addr_len
...
rsync vs. cpstat
on a file and notice another distinct feature; it occupies 0 blocks. Also inode of root (stat /sys) is 1./stat/fs
typically has inode 2. etc.The easiest explanation for rsync failure of synchronizing pseudo files is perhaps by example.
Say we have a file named
address
that is 18 bytes. Anls
orstat
of the file reports 4096 bytes.
rsync
- Opens file descriptor, fd.
- Uses fstat(fd) to get information such as size.
- Set out to read size bytes, i.e. 4096. That would be line 253 of the code linked by @mattdm .
read_size == 4096
- Ask; read: 4096 bytes.
- A short string is read i.e. 18 bytes.
nread == 18
read_size = read_size - nread (4096 - 18 = 4078)
- Ask; read: 4078 bytes
- 0 bytes read (as first read consumed all bytes in file).
nread == 0
, line 255- Unable to read
4096
bytes. Zero out buffer.- Set error
ENODATA
.- Return.
- Report error.
- Retry. (Above loop).
- Fail.
- Report error.
- FINE.
During this process it actually reads the entire file. But with no size available it cannot validate the result – thus failure is only option.
cp
- Opens file descriptor, fd.
- Uses fstat(fd) to get information such as st_size (also uses lstat and stat).
- Check if file is likely to be sparse. That is the file has holes etc.
copy.c:1010
/* Use a heuristic to determine whether SRC_NAME contains any sparse
 * blocks. If the file has fewer blocks than would normally be
 * needed for a file of its size, then at least one of the blocks in
 * the file is a hole. */
sparse_src = is_probably_sparse (&src_open_sb);
stat
reports file to have zero blocks it is categorized as sparse.- Tries to read file by extent-copy (a more efficient way to copy normal sparse files), and fails.
- Copy by sparse-copy.
- Starts out with max read size of MAXINT.
Typically18446744073709551615
bytes on a 32 bit system.- Ask; read 4096 bytes. (Buffer size allocated in memory from stat information.)
- A short string is read i.e. 18 bytes.
- Check if a hole is needed, nope.
- Write buffer to target.
- Subtract 18 from max read size.
- Ask; read 4096 bytes.
- 0 bytes as all got consumed in first read.
- Return success.
- All OK. Update flags for file.
- FINE.
Might be related, but extended attribute calls will fail on sysfs:

[root@hypervisor eth0]# lsattr address
lsattr: Inappropriate ioctl for device While reading flags on address
[root@hypervisor eth0]#
Looking at my strace it looks like rsync tries to pull in extended attributes by default:
22964 <... getxattr resumed> , 0x7fff42845110, 132) = -1 ENODATA (No data available)
I tried finding a flag to give rsync to see if skipping extended attributes resolves the issue but wasn't able to find anything (
--xattrs
turns them on at the destination).
Aug 28, 2017 | ubuntuforums.org
View Full Version : [ubuntu] Rsync doesn't copy everyting
Scormen May 31st, 2009, 10:09 AM Hi all,I'm having some trouble with rsync. I'm trying to sync my local /etc directory to a remote server, but this won't work.
The problem is that it seems he doesn't copy all the files.
The local /etc dir contains 15MB of data, after a rsync, the remote backup contains only 4.6MB of data.Rsync is running by root. I'm using this command:
rsync --rsync-path="sudo rsync" -e "ssh -i /root/.ssh/backup" -avz --delete --delete-excluded -h --stats /etc [email protected]:/home/kris/backup/laptopkris
I hope someone can help.
Thanks!Kris
Scormen May 31st, 2009, 11:05 AM I found that if I do a local sync, everything goes fine.
But if I do a remote sync, it copies only 4.6MB.Any idea?
LoneWolfJack May 31st, 2009, 05:14 PM never used rsync on a remote machine, but "sudo rsync" looks wrong. you probably can't call sudo like that so the ssh connection needs to have the proper privileges for executing rsync.just an educated guess, though.
Scormen May 31st, 2009, 05:24 PM Thanks for your answer.In /etc/sudoers I have added next line, so "sudo rsync" will work.
kris ALL=NOPASSWD: /usr/bin/rsync
I also tried without --rsync-path="sudo rsync", but without success.
I have also tried on the server to pull the files from the laptop, but that doesn't work either.
LoneWolfJack May 31st, 2009, 05:30 PM in the rsync help file it says that --rsync-path is for the path to rsync on the remote machine, so my guess is that you can't use sudo there as it will be interpreted as a path.so you will have to do --rsync-path="/path/to/rsync" and make sure the ssh login has root privileges if you need them to access the files you want to sync.
--rsync-path="sudo rsync" probably fails because
a) sudo is interpreted as a path
b) the space isn't escaped
c) sudo probably won't allow itself to be called remotelyagain, this is not more than an educated guess.
Scormen May 31st, 2009, 05:45 PM I understand what you mean, so I tried also:

rsync -Cavuhzb --rsync-path="/usr/bin/rsync" -e "ssh -i /root/.ssh/backup" /etc [email protected]:/home/kris/backup/laptopkris
Then I get this error:
sending incremental file list
rsync: recv_generator: failed to stat "/home/kris/backup/laptopkris/etc/chatscripts/pap": Permission denied (13)
rsync: recv_generator: failed to stat "/home/kris/backup/laptopkris/etc/chatscripts/provider": Permission denied (13)
rsync: symlink "/home/kris/backup/laptopkris/etc/cups/ssl/server.crt" -> "/etc/ssl/certs/ssl-cert-snakeoil.pem" failed: Permission denied (13)
rsync: symlink "/home/kris/backup/laptopkris/etc/cups/ssl/server.key" -> "/etc/ssl/private/ssl-cert-snakeoil.key" failed: Permission denied (13)
rsync: recv_generator: failed to stat "/home/kris/backup/laptopkris/etc/ppp/peers/provider": Permission denied (13)
rsync: recv_generator: failed to stat "/home/kris/backup/laptopkris/etc/ssl/private/ssl-cert-snakeoil.key": Permission denied (13)sent 86.85K bytes received 306 bytes 174.31K bytes/sec
total size is 8.71M speedup is 99.97
rsync error: some files/attrs were not transferred (see previous errors) (code 23) at main.c(1058) [sender=3.0.5]And the same command with "root" instead of "kris".
Then, I get no errors, but I still don't have all the files synced.
Scormen June 1st, 2009, 09:00 AM Sorry for this bump.
I'm still having the same problem.Any idea?
Thanks.
binary10 June 1st, 2009, 10:36 AM I understand what you mean, so I tried also:rsync -Cavuhzb --rsync-path="/usr/bin/rsync" -e "ssh -i /root/.ssh/backup" /etc [email protected]:/home/kris/backup/laptopkris
Then I get this error:
And the same command with "root" instead of "kris".
Then, I get no errors, but I still don't have all the files synced.Maybe there's a nicer way but you could place /usr/bin/rsync into a private protected area and set the owner to root place the sticky bit on it and change your rsync-path argument such like:
# on the remote side, aka [email protected]
mkdir priv-area
# protect it from normal users running a priv version of rsync
chmod 700 priv-area
cd priv-area
cp -p /usr/local/bin/rsync ./rsync-priv
sudo chown 0:0 ./rsync-priv
sudo chmod +s ./rsync-priv
ls -ltra # rsync-priv should now be 'bold-red' in bashLooking at your flags, you've specified a cvs ignore factor, ignore files that are updated on the target, and you're specifying a backup of removed files.
rsync -Cavuhzb --rsync-path="/home/kris/priv-area/rsync-priv" -e "ssh -i /root/.ssh/backup" /etc [email protected]:/home/kris/backup/laptopkris
From those qualifiers you're not going to be getting everything sync'd. It's doing what you're telling it to do.
If you really wanted to perform a like for like backup.. (not keeping stuff that's been changed/deleted from the source. I'd go for something like the following.
rsync --archive --delete --hard-links --one-file-system --acls --xattrs --dry-run -i --rsync-path="/home/kris/priv-area/rsync-priv" --rsh="ssh -i /root/.ssh/backup" /etc/ [email protected]:/home/kris/backup/laptopkris/etc/
Remove the --dry-run and -i when you're happy with the output, and it should do what you want. A word of warning, I get a bit nervous when not seeing trailing (/) on directories as it could lead to all sorts of funnies if you end up using rsync on softlinks.
Scormen June 1st, 2009, 12:19 PM Thanks for your help, binary10.I've tried what you have said, but still, I only receive 4.6MB on the remote server.
Thanks for the warning, I'll not that!Did someone already tried to rsync their own /etc to a remote system? Just to know if this strange thing only happens to me...
Thanks.
binary10 June 1st, 2009, 01:22 PM Thanks for your help, binary10.I've tried what you have said, but still, I only receive 4.6MB on the remote server.
Thanks for the warning, I'll not that!Did someone already tried to rsync their own /etc to a remote system? Just to know if this strange thing only happens to me...
Thanks.
Ok so I've gone back and looked at your original post, how are you calculating 15MB of data under etc - via a du -hsx /etc/ ??
I do daily drive to drive backup copies via rsync and drive to network copies.. and have used them recently for restoring.
Sure my du -hsx /etc/ reports 17MB of data of which 10MB gets transferred via an rsync. My backup drives still operate.
rsync 3.0.6 has some fixes to do with ACLs and special devices rsyncing between solaris. but I think 3.0.5 is still ok with ubuntu to ubuntu systems.
Here is my test doing exactly what you you're probably trying to do. I even check the remote end..
binary10@jsecx25:~/bin-priv$ ./rsync --archive --delete --hard-links --one-file-system --stats --acls --xattrs --human-readable --rsync-path="~/bin/rsync-priv-os-specific" --rsh="ssh" /etc/ [email protected]:/home/kris/backup/laptopkris/etc/
Number of files: 3121
Number of files transferred: 1812
Total file size: 10.04M bytes
Total transferred file size: 10.00M bytes
Literal data: 10.00M bytes
Matched data: 0 bytes
File list size: 109.26K
File list generation time: 0.002 seconds
File list transfer time: 0.000 seconds
Total bytes sent: 10.20M
Total bytes received: 38.70Ksent 10.20M bytes received 38.70K bytes 4.09M bytes/sec
total size is 10.04M speedup is 0.98binary10@jsecx25:~/bin-priv$ sudo du -hsx /etc/
17M /etc/
binary10@jsecx25:~/bin-priv$And then on the remote system I do the du -hsx
binary10@lenovo-n200:/home/kris/backup/laptopkris/etc$ cd ..
binary10@lenovo-n200:/home/kris/backup/laptopkris$ sudo du -hsx etc
17M etc
binary10@lenovo-n200:/home/kris/backup/laptopkris$
Scormen June 1st, 2009, 01:35 PM ow are you calculating 15MB of data under etc - via a du -hsx /etc/ ??
Indeed, on my laptop I see:

root@laptopkris:/home/kris# du -sh /etc/
15M /etc/If I do the same thing after a fresh sync to the server, I see:
root@server:/home/kris# du -sh /home/kris/backup/laptopkris/etc/
4.6M /home/kris/backup/laptopkris/etc/On both sides, I have installed Ubuntu 9.04, with version 3.0.5 of rsync.
So strange...
binary10 June 1st, 2009, 01:45 PM It does seem a bit odd. I'd start doing a few diffs from the outputs of:

find etc/ -printf "%f %s %p %Y\n" | sort
And see what type of files are missing.
- edit - Added the %Y file type.
Scormen June 1st, 2009, 01:58 PM Hmm, it's going stranger.
Now I see that I have all my files on the server, but they don't have their full size (bytes).I have uploaded the files, so you can look into them.
Laptop: http://www.linuxontdekt.be/files/laptop.files
Server: http://www.linuxontdekt.be/files/server.files
binary10 June 1st, 2009, 02:16 PM If you look at the files that are different aka the ssl's they are links to local files else where aka linked to /usr and not within /etc/aka they are different on your laptop and the server
Scormen June 1st, 2009, 02:25 PM I understand that soft links are just copied, and not the "full file".But, you have run the same command to test, a few posts ago.
How is it possible that you can see the full 15MB?
binary10 June 1st, 2009, 02:34 PM I was starting to think that this was a bug with du.The de-referencing is a bit topsy.
If you rsync copy the remote backup back to a new location back onto the laptop and do the du command. I wonder if you'll end up with 15MB again.
Scormen June 1st, 2009, 03:20 PM Good tip.On the server side, the backup of the /etc was still 4.6MB.
I have rsynced it back to the laptop, to a new directory.If I go on the laptop to that new directory and do a du, it says 15MB.
binary10 June 1st, 2009, 03:34 PM Good tip.On the server side, the backup of the /etc was still 4.6MB.
I have rsynced it back to the laptop, to a new directory.If I go on the laptop to that new directory and do a du, it says 15MB.
I think you've now confirmed that RSYNC DOES copy everything.. just tht du confusing what you had expected by counting the end link sizes.
It might also think about what you're copying, maybe you need more than just /etc of course it depends on what you are trying to do with the backup :)
enjoy.
Scormen June 1st, 2009, 03:37 PM Yeah, it seems to work well.
So, the "problem" where just the soft links, that couldn't be counted on the server side?
binary10 June 1st, 2009, 04:23 PM Yeah, it seems to work well.
So, the "problem" where just the soft links, that couldn't be counted on the server side?The links were copied as links as per the design of the --archive in rsync.
The contents of the pointing links were different between your two systems. These being that that reside outside of /etc/ in /usr And so DU reporting them differently.
Scormen June 1st, 2009, 05:36 PM Okay, I got it.
Many thanks for the support, binarty10!
Scormen June 1st, 2009, 05:59 PM Just to know, is it possible to copy the data from these links as real, hard data?
Thanks.
binary10 June 2nd, 2009, 09:54 AM Just to know, is it possible to copy the data from these links as real, hard data?
Thanks.Yep absolutely
You should then look at other possibilities of:
-L, --copy-links transform symlink into referent file/dir
--copy-unsafe-links only "unsafe" symlinks are transformed
--safe-links ignore symlinks that point outside the source tree
-k, --copy-dirlinks transform symlink to a dir into referent dir
-K, --keep-dirlinks treat symlinked dir on receiver as dirbut then you'll have to start questioning why you are backing them up like that especially stuff under /etc/. If you ever wanted to restore it you'd be restoring full files and not symlinks the restore result could be a nightmare as well as create future issues (upgrades etc) let alone your backup will be significantly larger, could be 150MB instead of 4MB.
Scormen June 2nd, 2009, 10:04 AM Okay, now I'm sure what its doing :)
Is it also possible to show on a system the "real disk usage" of e.g. that /etc directory? So, without the links, that we get a output of 4.6MB.Thank you very much for your help!
binary10 June 2nd, 2009, 10:22 AM What does the following respond with?

sudo du --apparent-size -hsx /etc
If you want the real answer then your result from a dry-run rsync will only be enough for you.
sudo rsync --dry-run --stats -h --archive /etc/ /tmp/etc/
Partclone is a program similar to the well-known backup utility "Partition Image" a.k.a partimage. Partclone provides utilities to save and restore used blocks on a partition and is designed for higher compatibility of the file system by using existing libraries, e.g. e2fslibs is used to read and write the ext2 partition.
Partclone now supports ext2, ext3, ext4, hfs+, reiserfs, reiser4, btrfs, vmfs3, vmfs5, xfs, jfs, ufs, ntfs, fat(12/16/32), exfat, f2fs, nilfs. The available programs are:
partclone.btrfs, partclone.ext2, partclone.ext3, partclone.ext4, partclone.fat32, partclone.fat12, partclone.fat16, partclone.ntfs, partclone.exfat, partclone.hfsp, partclone.jfs, partclone.reiserfs, partclone.reiser4, partclone.ufs (with SU+J), partclone.vmfs (v3 and v5), partclone.xfs, partclone.f2fs, partclone.nilfs2

Our Team Information
Project admins: Thomas_Tsai
Co-developer: Jazz Yao-Tsung Wang, Steven Shiau, Ceasar Sun
Operating system: GNU/Linux
License: GNU General Public License (GPL)
Other related projects developed by us: Diskless Remote Boot in Linux (a.k.a DRBL), Clonezilla, Tuxboot, DRBL-winroll, Tux2live, Cloudboot.
DRBL (Diskless Remote Boot in Linux) is a free software, open source solution for managing the deployment of the GNU/Linux operating system across many clients. Imagine the time required to install GNU/Linux on 40, 30, or even 10 client machines individually! DRBL allows for the configuration of all of your client computers by installing just one server machine.

DRBL provides a diskless or systemless environment for client machines. It works on Debian, Ubuntu, Red Hat, Fedora, CentOS and SuSE. DRBL uses distributed hardware resources and makes it possible for clients to fully access local hardware. It also includes Clonezilla, a partitioning and disk cloning utility similar to True Image® or Norton Ghost®.
The features of DRBL:
- Peacefully coexists with other OS
- Simply install DRBL on a single server and all your clients are taken care of
- Save on hardware, budget, and maintenance fees
Peacefully coexists with other OS
DRBL uses PXE/etherboot, NFS, and NIS to provide services to client machines so that it is not necessary to install GNU/Linux on the client hard drives individually. Once the server is ready to be a DRBL server, the client machines can boot via PXE/etherboot (diskless). "DRBL" does NOT touch the client hard drives, therefore, other operating systems (e.g. MS Windows) installed on the client machines will be unaffected. This could be useful in, for example, during a phased deployment of GNU/Linux where users still want to have the option of booting to Windows and running some applications only available on MS windows. DRBL allows great flexibility in the deployment of GNU/Linux.
Simply install DRBL on a single server and all your clients are taken care of
Using a standard PC, you can transform a group of client PCs into a working GNU/Linux network in a few steps:
- Download the DRBL package
- Run the scripts
In about 30 minutes, all the client machines will be ready to run GNU/Linux and all associated packages. No more deploying client machines one by one. Just use DRBL!
Save on hardware, budget, and maintenance fees
Hard drives are optional for a DRBL client. Actually, the hard drive is just another moving part that creates more noise and is susceptible to failure. If a hard drive is present, the client can be configured to use it as swap space while GNU/Linux is installed and configured on the centralized boot server.
A lot of time can be saved by configuring the client settings at the boot server when using the DRBL centralized boot environment. This gives the system administrator more control over what software configurations are running on each client.
user323419: Yes, this is completely possible. First and foremost, you will need at least 2 USB ports available, or 1 USB port and 1 CD-Drive.

You start by booting into a Live-CD version of Ubuntu with your hard-drive where it is and the target device plugged into USB. Mount your internal drive and target USB to any paths you like.
Open up a terminal and enter the following commands:
tar cpf - --xattrs -C /path/to/internal . | tar xpf - -C /path/to/target/usb
You can also look into doing this through a live installation and a utility called CloneZilla, but I am unsure of exactly how to use CloneZilla. The above method is what I used to copy my 128GB hard-drive's installation of Ubuntu to a 64GB flash drive.
2) Clone again the internal or external drive in its entirety to another drive:
Use the "Clonezilla" utility, mentioned in the very last paragraph of my original answer, to clone the original internal drive to another external drive to make two such external bootable drives to keep track of. v>
Feb 20, 2017 | opensource.com
Another interesting option, and my personal favorite because it increases the power and flexibility of rsync immensely, is the --link-dest option. The --link-dest option allows a series of daily backups that take up very little additional space for each day and also take very little time to create.Specify the previous day's target directory with this option and a new directory for today. rsync then creates today's new directory and a hard link for each file in yesterday's directory is created in today's directory. So we now have a bunch of hard links to yesterday's files in today's directory. No new files have been created or duplicated. Just a bunch of hard links have been created. Wikipedia has a very good description of hard links . After creating the target directory for today with this set of hard links to yesterday's target directory, rsync performs its sync as usual, but when a change is detected in a file, the target hard link is replaced by a copy of the file from yesterday and the changes to the file are then copied from the source to the target.
So now our command looks like the following.
rsync -aH --delete --link-dest=yesterdaystargetdir sourcedir todaystargetdir
There are also times when it is desirable to exclude certain directories or files from being synchronized. For this, there is the --exclude option. Use this option and the pattern for the files or directories you want to exclude. You might want to exclude browser cache files so your new command will look like this.
rsync -aH --delete --exclude Cache --link-dest=yesterdaystargetdir sourcedir todaystargetdir
Note that each file pattern you want to exclude must have a separate exclude option.
rsync can sync files with remote hosts as either the source or the target. For the next example, let's assume that the source directory is on a remote computer with the hostname remote1 and the target directory is on the local host. Even though SSH is the default communications protocol used when transferring data to or from a remote host, I always add the ssh option. The command now looks like this.
rsync -aH -e ssh --delete --exclude Cache --link-dest=yesterdaystargetdir remote1:sourcedir todaystargetdir
This is the final form of my rsync backup command.
rsync has a very large number of options that you can use to customize the synchronization process. For the most part, the relatively simple commands that I have described here are perfect for making backups for my personal needs. Be sure to read the extensive man page for rsync to learn about more of its capabilities as well as the options discussed here.
Feb 12, 2017 | www.mikerubel.org
page last modified 2004.01.04
Updates: As of rsync-2.5.6, the --link-dest option is now standard! That can be used instead of the separate cp -al and rsync stages, and it eliminates the ownerships/permissions bug. I now recommend using it. Also, I'm proud to report this article is mentioned in Linux Server Hacks, a new (and very good, in my opinion) O'Reilly book compiled by Rob Flickenger.
Contents
- Abstract
- Motivation
- Using rsync to make a backup
  - Basics
  - Using the --delete flag
  - Be lazy: use cron
- Incremental backups with rsync
  - Review of hard links
  - Using cp -al
  - Putting it all together
  - I'm used to dump or tar! This seems backward!
- Isolating the backup from the rest of the system
  - The easy (bad) way
  - Keep it on a separate partition
  - Keep that partition on a separate disk
  - Keep that disk on a separate machine
- Making the backup as read-only as possible
  - Bad: mount/unmount
  - Better: mount read-only most of the time
  - Tempting but it doesn't seem to work: the 2.4 kernel's mount --bind
  - My solution: using NFS on localhost
- Extensions: hourly, daily, and weekly snapshots
  - Keep an extra script for each level
  - Run it all with cron
- Known bugs and problems
  - Maintaining Permissions and Owners in the snapshots
  - mv updates timestamp bug
  - Windows-related problems
- Appendix: my actual configuration
  - Listing one: make_snapshot.sh
  - Listing two: daily_snapshot_rotate.sh
  - Sample output of ls -l /snapshot/home
- Contributed codes
- References
- Frequently Asked Questions
This document describes a method for generating automatic rotating "snapshot"-style backups on a Unix-based system, with specific examples drawn from the author's GNU/Linux experience. Snapshot backups are a feature of some high-end industrial file servers; they create the illusion of multiple, full backups per day without the space or processing overhead. All of the snapshots are read-only, and are accessible directly by users as special system directories. It is often possible to store several hours, days, and even weeks' worth of snapshots with slightly more than 2x storage. This method, while not as space-efficient as some of the proprietary technologies (which, using special copy-on-write filesystems, can operate on slightly more than 1x storage), makes use of only standard file utilities and the common rsync program, which is installed by default on most Linux distributions. Properly configured, the method can also protect against hard disk failure, root compromises, or even back up a network of heterogeneous desktops automatically.
Motivation
Note: what follows is the original sgvlug DEVSIG announcement.
Ever accidentally delete or overwrite a file you were working on? Ever lose data due to hard-disk failure? Or maybe you export shares to your windows-using friends--who proceed to get outlook viruses that twiddle a digit or two in all of their .xls files. Wouldn't it be nice if there were a /snapshot directory that you could go back to, which had complete images of the file system at semi-hourly intervals all day, then daily snapshots back a few days, and maybe a weekly snapshot too? What if every user could just go into that magical directory and copy deleted or overwritten files back into "reality", from the snapshot of choice, without any help from you? And what if that /snapshot
directory were read-only, like a CD-ROM, so that nothing could touch it (except maybe root, but even then not directly)?Best of all, what if you could make all of that happen automatically, using only one extra, slightly-larger, hard disk ? (Or one extra partition, which would protect against all of the above except disk failure).
In my lab, we have a proprietary NetApp file server which provides that sort of functionality to the end-users. It provides a lot of other things too, but it cost as much as a luxury SUV. It's quite appropriate for our heavy-use research lab, but it would be overkill for a home or small-office environment. But that doesn't mean small-time users have to do without!
I'll show you how I configured automatic, rotating snapshots on my $80 used Linux desktop machine (which is also a file, web, and mail server) using only a couple of one-page scripts and a few standard Linux utilities that you probably already have.
I'll also propose a related strategy which employs one (or two, for the wisely paranoid) extra low-end machines for a complete, responsible, automated backup strategy that eliminates tapes and manual labor and makes restoring files as easy as "cp".
Using rsync to make a backup
The basics
The rsync utility is a very well-known piece of GPL'd software, written originally by Andrew Tridgell and Paul Mackerras. If you have a common Linux or UNIX variant, then you probably already have it installed; if not, you can download the source code from rsync.samba.org. Rsync's specialty is efficiently synchronizing file trees across a network, but it works fine on a single machine too.
Suppose you have a directory called source, and you want to back it up into the directory destination. To accomplish that, you'd use:
rsync -a source/ destination/
(Note: I usually also add the -v (verbose) flag too so that rsync tells me what it's doing). This command is equivalent to:
cp -a source/. destination/
except that it's much more efficient if there are only a few differences.
Just to whet your appetite, here's a way to do the same thing as in the example above, but with destination on a remote machine, over a secure shell:
rsync -a -e ssh source/ [email protected]:/path/to/destination/
Trailing Slashes Do Matter... Sometimes
This isn't really an article about rsync, but I would like to take a momentary detour to clarify one potentially confusing detail about its use. You may be accustomed to commands that don't care about trailing slashes. For example, if a and b are two directories, then cp -a a b is equivalent to cp -a a/ b/. However, rsync does care about the trailing slash, but only on the source argument. For example, let a and b be two directories, with the file foo initially inside directory a. Then this command:
rsync -a a b
produces b/a/foo, whereas this command:
rsync -a a/ b
produces b/foo. The presence or absence of a trailing slash on the destination argument (b, in this case) has no effect.
Using the --delete flag
If a file was originally in both source/ and destination/ (from an earlier rsync, for example), and you delete it from source/, you probably want it to be deleted from destination/ on the next rsync. However, the default behavior is to leave the copy at destination/ in place. Assuming you want rsync to delete any file from destination/ that is not in source/, you'll need to use the --delete flag:
rsync -a --delete source/ destination/
Be lazy: use cron
One of the toughest obstacles to a good backup strategy is human nature; if there's any work involved, there's a good chance backups won't happen. (Witness, for example, how rarely my roommate's home PC was backed up before I created this system). Fortunately, there's a way to harness human laziness: make cron do the work.
To run the rsync-with-backup command from the previous section every morning at 4:20 AM, for example, edit the root cron table (as root):
crontab -e
Then add the following line:
20 4 * * * rsync -a --delete source/ destination/
Finally, save the file and exit. The backup will happen every morning at precisely 4:20 AM, and root will receive the output by email. Don't copy that example verbatim, though; you should use full path names (such as /usr/bin/rsync and /home/source/) to remove any ambiguity.
Incremental backups with rsync
Since making a full copy of a large filesystem can be a time-consuming and expensive process, it is common to make full backups only once a week or once a month, and store only changes on the other days. These are called "incremental" backups, and are supported by the venerable old dump and tar utilities, along with many others.
However, you don't have to use tape as your backup medium; it is both possible and vastly more efficient to perform incremental backups with rsync.
The most common way to do this is by using the rsync -b --backup-dir= combination. I have seen examples of that usage here, but I won't discuss it further, because there is a better way. If you're not familiar with hard links, though, you should first start with the following review.
Review of hard links
We usually think of a file's name as being the file itself, but really the name is a hard link. A given file can have more than one hard link to itself--for example, a directory has at least two hard links: the directory name and . (for when you're inside it). It also has one hard link from each of its sub-directories (the .. file inside each one). If you have the stat utility installed on your machine, you can find out how many hard links a file has (along with a bunch of other information) with the command:
stat filename
Hard links aren't just for directories--you can create more than one link to a regular file too. For example, if you have the file a, you can make a link called b:
ln a b
Now, a and b are two names for the same file, as you can verify by seeing that they reside at the same inode (the inode number will be different on your machine):
ls -i a
  232177 a
ls -i b
  232177 b
So ln a b is roughly equivalent to cp a b, but there are several important differences:
- The contents of the file are only stored once, so you don't use twice the space.
- If you change a, you're changing b, and vice-versa.
- If you change the permissions or ownership of a, you're changing those of b as well, and vice-versa.
- If you overwrite a by copying a third file on top of it, you will also overwrite b, unless you tell cp to unlink before overwriting. You do this by running cp with the --remove-destination flag. Notice that rsync always unlinks before overwriting! (Note, added 2002.Apr.10: the previous statement applies to changes in the file contents only, not permissions or ownership.)
But this raises an interesting question. What happens if you rm one of the links? The answer is that rm is a bit of a misnomer; it doesn't really remove a file, it just removes that one link to it. A file's contents aren't truly removed until the number of links to it reaches zero. In a moment, we're going to make use of that fact, but first, here's a word about cp.
Using cp -al
In the previous section, it was mentioned that hard-linking a file is similar to copying it. It should come as no surprise, then, that the standard GNU coreutils cp command comes with a -l flag that causes it to create (hard) links instead of copies (it doesn't hard-link directories, though, which is good; you might want to think about why that is). Another handy switch for the cp command is -a (archive), which causes it to recurse through directories and preserve file owners, timestamps, and access permissions.
Together, the combination cp -al makes what appears to be a full copy of a directory tree, but is really just an illusion that takes almost no space. If we restrict operations on the copy to adding or removing (unlinking) files--i.e., never changing one in place--then the illusion of a full copy is complete. To the end-user, the only differences are that the illusion-copy takes almost no disk space and almost no time to generate.
2002.05.15: Portability tip: If you don't have GNU cp installed (if you're using a different flavor of *nix, for example), you can use find and cpio instead. Simply replace cp -al a b with cd a && find . -print | cpio -dpl ../b. Thanks to Brage Førland for that tip.
Putting it all together
We can combine rsync and cp -al to create what appear to be multiple full backups of a filesystem without taking multiple disks' worth of space. Here's how, in a nutshell:
rm -rf backup.3
mv backup.2 backup.3
mv backup.1 backup.2
cp -al backup.0 backup.1
rsync -a --delete source_directory/ backup.0/
If the above commands are run once every day, then backup.0, backup.1, backup.2, and backup.3 will appear to each be a full backup of source_directory/ as it appeared today, yesterday, two days ago, and three days ago, respectively--complete, except that permissions and ownerships in old snapshots will get their most recent values (thanks to J.W. Schultz for pointing this out). In reality, the extra storage will be equal to the current size of source_directory/ plus the total size of the changes over the last three days--exactly the same space that a full plus daily incremental backup with dump or tar would have taken.
Update (2003.04.23): As of rsync-2.5.6, the --link-dest flag is now standard. Instead of the separate cp -al and rsync lines above, you may now write:
mv backup.0 backup.1
rsync -a --delete --link-dest=../backup.1 source_directory/ backup.0/
This method is preferred, since it preserves original permissions and ownerships in the backup. However, be sure to test it--as of this writing some users are still having trouble getting --link-dest to work properly. Make sure you use version 2.5.7 or later.
Update (2003.05.02): John Pelan writes in to suggest recycling the oldest snapshot instead of recursively removing and then re-creating it. This should make the process go faster, especially if your file tree is very large:
mv backup.3 backup.tmp
mv backup.2 backup.3
mv backup.1 backup.2
mv backup.0 backup.1
mv backup.tmp backup.0
cp -al backup.1/. backup.0
rsync -a --delete source_directory/ backup.0/
2003.06.02: OOPS! Rsync's link-dest option does not play well with J. Pelan's suggestion--the approach I previously had written above will result in unnecessarily large storage, because old files in backup.0 will get replaced and not linked. Please only use Dr. Pelan's directory recycling if you use the separate cp -al step; if you plan to use --link-dest, start with backup.0 empty and pristine. Apologies to anyone I've misled on this issue. Thanks to Kevin Everets for pointing out the discrepancy to me, and to J.W. Schultz for clarifying --link-dest's behavior. Also note that I haven't fully tested the approach written above; if you have, please let me know. Until then, caveat emptor!
I'm used to dump or tar! This seems backward!
The dump and tar utilities were originally designed to write to tape media, which can only access files in a certain order. If you're used to their style of incremental backup, rsync might seem backward. I hope that the following example will help make the differences clearer.
Suppose that on a particular system, backups were done on Monday night, Tuesday night, and Wednesday night, and now it's Thursday.
With dump or tar, the Monday backup is the big ("full") one. It contains everything in the filesystem being backed up. The Tuesday and Wednesday "incremental" backups would be much smaller, since they would contain only changes since the previous day. At some point (presumably next Monday), the administrator would plan to make another full dump.
With rsync, in contrast, the Wednesday backup is the big one. Indeed, the "full" backup is always the most recent one. The Tuesday directory would contain data only for those files that changed between Tuesday and Wednesday; the Monday directory would contain data for only those files that changed between Monday and Tuesday.
A little reasoning should convince you that the rsync way is much better for network-based backups, since it's only necessary to do a full backup once, instead of once per week. Thereafter, only the changes need to be copied. Unfortunately, you can't rsync to a tape, and that's probably why the dump and tar incremental backup models are still so popular. But in your author's opinion, these should never be used for network-based backups now that rsync is available.
Isolating the backup from the rest of the system
If you take the simple route and keep your backups in another directory on the same filesystem, then there's a very good chance that whatever damaged your data will also damage your backups. In this section, we identify a few simple ways to decrease your risk by keeping the backup data separate.
The easy (bad) way
In the previous section, we treated /destination/ as if it were just another directory on the same filesystem. Let's call that the easy (bad) approach. It works, but it has several serious limitations:
- If your filesystem becomes corrupted, your backups will be corrupted too.
- If you suffer a hardware failure, such as a hard disk crash, it might be very difficult to reconstruct the backups.
- Since backups preserve permissions, your users--and any programs or viruses that they run--will be able to delete files from the backup. That is bad. Backups should be read-only.
- If you run out of free space, the backup process (which runs as root) might crash the system and make it difficult to recover.
- The easy (bad) approach offers no protection if the root account is compromised.
Fortunately, there are several easy ways to make your backup more robust.
Keep it on a separate partition
If your backup directory is on a separate partition, then any corruption in the main filesystem will not normally affect the backup. If the backup process runs out of disk space, it will fail, but it won't take the rest of the system down too. More importantly, keeping your backups on a separate partition means you can keep them mounted read-only; we'll discuss that in more detail in the next chapter.
Keep that partition on a separate disk
If your backup partition is on a separate hard disk, then you're also protected from hardware failure. That's very important, since hard disks always fail eventually, and often take your data with them. An entire industry has formed to service the needs of those whose broken hard disks contained important data that was not properly backed up.
Important : Notice, however, that in the event of hardware failure you'll still lose any changes made since the last backup. For home or small office users, where backups are made daily or even hourly as described in this document, that's probably fine, but in situations where any data loss at all would be a serious problem (such as where financial transactions are concerned), a RAID system might be more appropriate.
RAID is well-supported under Linux, and the methods described in this document can also be used to create rotating snapshots of a RAID system.
Keep that disk on a separate machine
If you have a spare machine, even a very low-end one, you can turn it into a dedicated backup server. Make it standalone, and keep it in a physically separate place--another room or even another building. Disable every single remote service on the backup server, and connect it only to a dedicated network interface on the source machine.
On the source machine, export the directories that you want to back up via read-only NFS to the dedicated interface. The backup server can mount the exported network directories and run the snapshot routines discussed in this article as if they were local. If you opt for this approach, you'll only be remotely vulnerable if:
- a remote root hole is discovered in read-only NFS, and
- the source machine has already been compromised.
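A rough sketch of that read-only export (the addresses, interface, and paths here are hypothetical; adjust them to your own network): on the source machine, /etc/exports might contain

/home   192.168.10.2(ro,no_root_squash)   # 192.168.10.2 = backup server on the dedicated link

and the backup server would mount it read-only before running its snapshot scripts:

mount -o ro 192.168.10.1:/home /mnt/source/home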
I'd consider this "pretty good" protection, but if you're (wisely) paranoid, or your job is on the line, build two backup servers. Then you can make sure that at least one of them is always offline.
If you're using a remote backup server and can't get a dedicated line to it (especially if the information has to cross somewhere insecure, like the public internet), you should probably skip the NFS approach and use rsync -e ssh instead.
It has been pointed out to me that rsync operates far more efficiently in server mode than it does over NFS, so if the connection between your source and backup server becomes a bottleneck, you should consider configuring the backup machine as an rsync server instead of using NFS. On the downside, this approach is slightly less transparent to users than NFS--snapshots would not appear to be mounted as a system directory, unless NFS is used in that direction, which is certainly another option (I haven't tried it yet though). Thanks to Martin Pool, a lead developer of rsync, for making me aware of this issue.
Here's another example of the utility of this approach--one that I use. If you have a bunch of Windows desktops in a lab or office, an easy way to keep them all backed up is to share the relevant files, read-only, and mount them all from a dedicated backup server using SAMBA. The backup job can treat the SAMBA-mounted shares just like regular local directories.
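For those weighing the rsync-server route mentioned above, a minimal read-only daemon module might be sketched like this (the module name, path, and allowed address are hypothetical, and a real setup should also consider authentication and logging):

# /etc/rsyncd.conf on the source machine (sketch)
[home-backup]
    path = /home
    read only = yes
    uid = root
    gid = root
    hosts allow = 192.168.10.2

The backup server would then pull with something like rsync -a source-host::home-backup/ /root/snapshot/home/hourly.0/ instead of reading from an NFS mount.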
Making the backup as read-only as possible
In the previous section, we discussed ways to keep your backup data physically separate from the data they're backing up. In this section, we discuss the other side of that coin--preventing user processes from modifying backups once they're made.
We want to avoid leaving the snapshot backup directory mounted read-write in a public place. Unfortunately, keeping it mounted read-only the whole time won't work either--the backup process itself needs write access. The ideal situation would be for the backups to be mounted read-only in a public place, but at the same time, read-write in a private directory accessible only by root, such as /root/snapshot.
There are a number of possible approaches to the challenge presented by mounting the backups read-only. After some amount of thought, I found a solution which allows root to write the backups to the directory but only gives the users read permissions. I'll first explain the other ideas I had and why they were less satisfactory.
It's tempting to keep your backup partition mounted read-only as /snapshot most of the time, but unmount that and remount it read-write as /root/snapshot during the brief periods while snapshots are being made. Don't give in to temptation!
Bad: mount/umount
A filesystem cannot be unmounted if it's busy--that is, if some process is using it. The offending process need not be owned by root to block an unmount request. So if you plan to umount the read-only copy of the backup and mount it read-write somewhere else, don't--any user can accidentally (or deliberately) prevent the backup from happening. Besides, even if blocking unmounts were not an issue, this approach would introduce brief intervals during which the backups would seem to vanish, which could be confusing to users.
Better: mount read-only most of the time
A better but still-not-quite-satisfactory choice is to remount the directory read-write in place:
mount -o remount,rw /snapshot
[ run backup process ]
mount -o remount,ro /snapshot
Now any process that happens to be in /snapshot when the backups start will not prevent them from happening. Unfortunately, this approach introduces a new problem--there is a brief window of vulnerability, while the backups are being made, during which a user process could write to the backup directory. Moreover, if any process opens a backup file for writing during that window, it will prevent the backup from being remounted read-only, and the backups will stay vulnerable indefinitely.
Tempting but it doesn't seem to work: the 2.4 kernel's mount --bind
Starting with the 2.4-series Linux kernels, it has been possible to mount a filesystem simultaneously in two different places. "Aha!" you might think, as I did. "Then surely we can mount the backups read-only in /snapshot, and read-write in /root/snapshot at the same time!"
Alas, no. Say your backups are on the partition /dev/hdb1. If you run the following commands,
mount /dev/hdb1 /root/snapshot
mount --bind -o ro /root/snapshot /snapshot
then (at least as of the 2.4.9 Linux kernel--updated, still present in the 2.4.20 kernel), mount will report /dev/hdb1 as being mounted read-write in /root/snapshot and read-only in /snapshot, just as you requested. Don't let the system mislead you!
It seems that, at least on my system, read-write vs. read-only is a property of the filesystem, not the mount point. So every time you change the mount status, it will affect the status at every point the filesystem is mounted, even though neither /etc/mtab nor /proc/mounts will indicate the change.
In the example above, the second mount call will cause both of the mounts to become read-only, and the backup process will be unable to run. Scratch this one.
Update: I have it on fairly good authority that this behavior is considered a bug in the Linux kernel, which will be fixed as soon as someone gets around to it. If you are a kernel maintainer and know more about this issue, or are willing to fix it, I'd love to hear from you!
My solution: using NFS on localhost
This is a bit more complicated, but until Linux supports mount --bind with different access permissions in different places, it seems like the best choice. Mount the partition where backups are stored somewhere accessible only by root, such as /root/snapshot. Then export it, read-only, via NFS, but only to the same machine. That's as simple as adding the following line to /etc/exports:
/root/snapshot 127.0.0.1(secure,ro,no_root_squash)
then start nfs and portmap from /etc/rc.d/init.d/. Finally mount the exported directory, read-only, as /snapshot:
mount -o ro 127.0.0.1:/root/snapshot /snapshot
And verify that it all worked:
mount
...
/dev/hdb1 on /root/snapshot type ext3 (rw)
127.0.0.1:/root/snapshot on /snapshot type nfs (ro,addr=127.0.0.1)
At this point, we'll have the desired effect: only root will be able to write to the backup (by accessing it through /root/snapshot). Other users will see only the read-only /snapshot directory. For a little extra protection, you could keep it mounted read-only in /root/snapshot most of the time, and only remount it read-write while backups are happening.
Damian Menscher pointed out this CERT advisory which specifically recommends against NFS exporting to localhost, though since I'm not clear on why it's a problem, I'm not sure whether exporting the backups read-only as we do here is also a problem. If you understand the rationale behind this advisory and can shed light on it, would you please contact me? Thanks!
Extensions: hourly, daily, and weekly snapshots
With a little bit of tweaking, we can make multiple-level rotating snapshots. On my system, for example, I keep the last four "hourly" snapshots (which are taken every four hours) as well as the last three "daily" snapshots (which are taken at midnight every day). You might also want to keep weekly or even monthly snapshots too, depending upon your needs and your available space.
Keep an extra script for each level
This is probably the easiest way to do it. I keep one script that runs every four hours to make and rotate hourly snapshots, and another script that runs once a day to rotate the daily snapshots. There is no need to use rsync for the higher-level snapshots; just cp -al from the appropriate hourly one.
Run it all with cron
To make the automatic snapshots happen, I have added the following lines to root's crontab file:
0 */4 * * * /usr/local/bin/make_snapshot.sh
0 13 * * * /usr/local/bin/daily_snapshot_rotate.sh
They cause make_snapshot.sh to be run every four hours on the hour and daily_snapshot_rotate.sh to be run every day at 13:00 (that is, 1:00 PM). I have included those scripts in the appendix.
If you tire of receiving an email from the cron process every four hours with the details of what was backed up, you can tell it to send the output of make_snapshot.sh to /dev/null, like so:
0 */4 * * * /usr/local/bin/make_snapshot.sh >/dev/null 2>&1
Understand, though, that this will prevent you from seeing errors if make_snapshot.sh cannot run for some reason, so be careful with it. Creating a third script to check for any unusual behavior in the snapshot periodically seems like a good idea, but I haven't implemented it yet. Alternatively, it might make sense to log the output of each run, by piping it through tee, for example. mRgOBLIN wrote in to suggest a better (and obvious, in retrospect!) approach, which is to send stdout to /dev/null but keep stderr, like so:
0 */4 * * * /usr/local/bin/make_snapshot.sh >/dev/null
Presto! Now you only get mail when there's an error. :)
Appendix: my actual configuration
I know that listing my actual backup configuration here is a security risk; please be kind and don't use this information to crack my site. However, I'm not a security expert, so if you see any vulnerabilities in my setup, I'd greatly appreciate your help in fixing them. Thanks!
I actually use two scripts, one for every-four-hours (hourly) snapshots, and one for every-day (daily) snapshots. I am only including the parts of the scripts that relate to backing up /home, since those are the relevant ones here.
I use the NFS-to-localhost trick of exporting /root/snapshot read-only as /snapshot, as discussed above.
The system has been running without a hitch for months.
Listing one: make_snapshot.sh
#!/bin/bash
# ----------------------------------------------------------------------
# mikes handy rotating-filesystem-snapshot utility
# ----------------------------------------------------------------------
# this needs to be a lot more general, but the basic idea is it makes
# rotating backup-snapshots of /home whenever called
# ----------------------------------------------------------------------

unset PATH  # suggestion from H. Milz: avoid accidental use of $PATH

# ------------- system commands used by this script --------------------
ID=/usr/bin/id;
ECHO=/bin/echo;
MOUNT=/bin/mount;
RM=/bin/rm;
MV=/bin/mv;
CP=/bin/cp;
TOUCH=/bin/touch;
RSYNC=/usr/bin/rsync;

# ------------- file locations -----------------------------------------
MOUNT_DEVICE=/dev/hdb1;
SNAPSHOT_RW=/root/snapshot;
EXCLUDES=/usr/local/etc/backup_exclude;

# ------------- the script itself --------------------------------------

# make sure we're running as root
if (( `$ID -u` != 0 )); then { $ECHO "Sorry, must be root. Exiting..."; exit; } fi

# attempt to remount the RW mount point as RW; else abort
$MOUNT -o remount,rw $MOUNT_DEVICE $SNAPSHOT_RW ;
if (( $? )); then
{
    $ECHO "snapshot: could not remount $SNAPSHOT_RW readwrite";
    exit;
}
fi;

# rotating snapshots of /home (fixme: this should be more general)

# step 1: delete the oldest snapshot, if it exists:
if [ -d $SNAPSHOT_RW/home/hourly.3 ] ; then \
$RM -rf $SNAPSHOT_RW/home/hourly.3 ; \
fi ;

# step 2: shift the middle snapshots(s) back by one, if they exist
if [ -d $SNAPSHOT_RW/home/hourly.2 ] ; then \
$MV $SNAPSHOT_RW/home/hourly.2 $SNAPSHOT_RW/home/hourly.3 ; \
fi;
if [ -d $SNAPSHOT_RW/home/hourly.1 ] ; then \
$MV $SNAPSHOT_RW/home/hourly.1 $SNAPSHOT_RW/home/hourly.2 ; \
fi;

# step 3: make a hard-link-only (except for dirs) copy of the latest snapshot,
# if that exists
if [ -d $SNAPSHOT_RW/home/hourly.0 ] ; then \
$CP -al $SNAPSHOT_RW/home/hourly.0 $SNAPSHOT_RW/home/hourly.1 ; \
fi;

# step 4: rsync from the system into the latest snapshot (notice that
# rsync behaves like cp --remove-destination by default, so the destination
# is unlinked first.  If it were not so, this would copy over the other
# snapshot(s) too!
$RSYNC \
    -va --delete --delete-excluded \
    --exclude-from="$EXCLUDES" \
    /home/ $SNAPSHOT_RW/home/hourly.0 ;

# step 5: update the mtime of hourly.0 to reflect the snapshot time
$TOUCH $SNAPSHOT_RW/home/hourly.0 ;

# and thats it for home.

# now remount the RW snapshot mountpoint as readonly
$MOUNT -o remount,ro $MOUNT_DEVICE $SNAPSHOT_RW ;
if (( $? )); then
{
    $ECHO "snapshot: could not remount $SNAPSHOT_RW readonly";
    exit;
} fi;

As you might have noticed above, I have added an excludes list to the rsync call. This is just to prevent the system from backing up garbage like web browser caches, which change frequently (so they'd take up space in every snapshot) but would be no loss if they were accidentally destroyed.
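For reference, such an exclude file is simply a list of rsync patterns, one per line; the entries below are purely illustrative (they are not the author's actual list):

# /usr/local/etc/backup_exclude (illustrative example)
/*/.mozilla/*/Cache/
/*/.cache/
/*/tmp/
*.o
core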
Listing two: daily_snapshot_rotate.sh
#!/bin/bash
# ----------------------------------------------------------------------
# mikes handy rotating-filesystem-snapshot utility: daily snapshots
# ----------------------------------------------------------------------
# intended to be run daily as a cron job when hourly.3 contains the
# midnight (or whenever you want) snapshot; say, 13:00 for 4-hour snapshots.
# ----------------------------------------------------------------------

unset PATH

# ------------- system commands used by this script --------------------
ID=/usr/bin/id;
ECHO=/bin/echo;
MOUNT=/bin/mount;
RM=/bin/rm;
MV=/bin/mv;
CP=/bin/cp;

# ------------- file locations -----------------------------------------
MOUNT_DEVICE=/dev/hdb1;
SNAPSHOT_RW=/root/snapshot;

# ------------- the script itself --------------------------------------

# make sure we're running as root
if (( `$ID -u` != 0 )); then { $ECHO "Sorry, must be root. Exiting..."; exit; } fi

# attempt to remount the RW mount point as RW; else abort
$MOUNT -o remount,rw $MOUNT_DEVICE $SNAPSHOT_RW ;
if (( $? )); then
{
    $ECHO "snapshot: could not remount $SNAPSHOT_RW readwrite";
    exit;
}
fi;

# step 1: delete the oldest snapshot, if it exists:
if [ -d $SNAPSHOT_RW/home/daily.2 ] ; then \
$RM -rf $SNAPSHOT_RW/home/daily.2 ; \
fi ;

# step 2: shift the middle snapshots(s) back by one, if they exist
if [ -d $SNAPSHOT_RW/home/daily.1 ] ; then \
$MV $SNAPSHOT_RW/home/daily.1 $SNAPSHOT_RW/home/daily.2 ; \
fi;
if [ -d $SNAPSHOT_RW/home/daily.0 ] ; then \
$MV $SNAPSHOT_RW/home/daily.0 $SNAPSHOT_RW/home/daily.1; \
fi;

# step 3: make a hard-link-only (except for dirs) copy of
# hourly.3, assuming that exists, into daily.0
if [ -d $SNAPSHOT_RW/home/hourly.3 ] ; then \
$CP -al $SNAPSHOT_RW/home/hourly.3 $SNAPSHOT_RW/home/daily.0 ; \
fi;

# note: do *not* update the mtime of daily.0; it will reflect
# when hourly.3 was made, which should be correct.

# now remount the RW snapshot mountpoint as readonly
$MOUNT -o remount,ro $MOUNT_DEVICE $SNAPSHOT_RW ;
if (( $? )); then
{
    $ECHO "snapshot: could not remount $SNAPSHOT_RW readonly";
    exit;
} fi;
Sample output of ls -l /snapshot/home
total 28
drwxr-xr-x 12 root root 4096 Mar 28 00:00 daily.0
drwxr-xr-x 12 root root 4096 Mar 27 00:00 daily.1
drwxr-xr-x 12 root root 4096 Mar 26 00:00 daily.2
drwxr-xr-x 12 root root 4096 Mar 28 16:00 hourly.0
drwxr-xr-x 12 root root 4096 Mar 28 12:00 hourly.1
drwxr-xr-x 12 root root 4096 Mar 28 08:00 hourly.2
drwxr-xr-x 12 root root 4096 Mar 28 04:00 hourly.3
Notice that the contents of each of the subdirectories of /snapshot/home/ is a complete image of /home at the time the snapshot was made. Despite the w in the directory access permissions, no one--not even root--can write to this directory; it's mounted read-only.
Known bugs and problems
Maintaining Permissions and Owners in the snapshots
The snapshot system above does not properly maintain old ownerships/permissions; if a file's ownership or permissions are changed in place, then the new ownership/permissions will apply to older snapshots as well. This is because rsync does not unlink files prior to changing them if the only changes are ownership/permission. Thanks to J.W. Schultz for pointing this out. Using his new --link-dest option, it is now trivial to work around this problem. See the discussion in the Putting it all together section of Incremental backups with rsync, above.
mv updates timestamp bug
Apparently, a bug in some Linux kernels between 2.4.4 and 2.4.9 causes mv to update timestamps; this may result in inaccurate timestamps on the snapshot directories. Thanks to Claude Felizardo for pointing this problem out. He was able to work around the problem by replacing mv
with the following script:
MV=my_mv;
...
function my_mv() {
   REF=/tmp/makesnapshot-mymv-$$;
   touch -r $1 $REF;
   /bin/mv $1 $2;
   touch -r $REF $2;
   /bin/rm $REF;
}
Windows-related problems
I have recently received a few reports of what appear to be interaction issues between Windows and rsync.
One report came from a user who mounts a windows share via Samba, much as I do, and had files mysteriously being deleted from the backup even when they weren't deleted from the source. Tim Burt also used this technique, and was seeing files copied even when they hadn't changed. He determined that the problem was modification time precision; adding --modify-window=10 caused rsync to behave correctly in both cases. If you are rsync'ing from a SAMBA share, you must add --modify-window=10 or you may get inconsistent results. Update: --modify-window=1 should be sufficient. Yet another update: the problem appears to still be there. Please let me know if you use this method and files which should not be deleted are deleted.
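In practice the flag simply gets added to whatever rsync line reads from the mounted share; a hypothetical example (the paths are invented) would be:

rsync -a --delete --modify-window=10 /mnt/winshare/ /root/snapshot/winbox/hourly.0/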
Also, for those who use rsync directly on cygwin, there are some known problems, apparently related to cygwin signal handling. Scott Evans reports that rsync sometimes hangs on large directories. Jim Kleckner informed me of an rsync patch, discussed here and here , which seems to work around this problem. I have several reports of this working, and two reports of it not working (the hangs continue). However, one of the users who reported a negative outcome, Greg Boyington, was able to get it working using Craig Barrett's suggested sleep() approach, which is documented here .
Memory use in rsync scales linearly with the number of files being sync'd. This is a problem when syncing large file trees, especially when the server involved does not have a lot of RAM. If this limitation is more of an issue to you than network speed (for example, if you copy over a LAN), you may wish to use mirrordir instead. I haven't tried it personally, but it looks promising. Thanks to Vladimir Vuksan for this tip!
Contributed codes
Several people have been kind enough to send improved backup scripts. There are a number of good ideas here, and I hope they'll save you time when you're ready to design your own backup plan. Disclaimer: I have not necessarily tested these; make sure you check the source code and test them thoroughly before use!
- Art Mulder's original shell script
- Art Mulder's improved snapback Perl script, and a sample snapback.conf configuration file
- Henry Laxen's perl script
- J. P. Stewart's shell script
- Sean Herdejurgen's shell_script
- Peter Schneider-Kamp's shell script
- Rob Bos' versatile, GPL'd shell script . Update! 2002.12.13: check out his new package that makes for easier configuration and fixes a couple of bugs.
- Leland Elie's very nice GPL'd Python script, roller.py (2004.04.13: note link seems to be down). Does locking for safety, and has an /etc/roller.conf control file which can pull from multiple machines automatically and independently.
- William Stearns' rsync backup for the extremely security-conscious. I haven't played with this yet, but it looks promising!
- Geordy Kitchen's shell script , adapted from Rob Bos' above
- Tim Evans' python rbackup functions and the calling script . You'll have to rename them before using.
- Elio Pizzottelli's improved version of make_snapshot.sh , released under the GPL
- John Bowman's rlbackup utility, which (in his words) provides a simple secure mechanism for generating and recovering linked backups over the network, with historical pruning. This one makes use of the --link-dest patch, and keeps a geometric progression of snapshots instead of doing hourly/daily/weekly.
- Ben Gardiner's much-improved boglin script
- Darrel O'Pry contributes a script modified to handle mysql databases. Thanks, Darrel! He also contributes a restore script which works with Geordy Kitchen's backup script.
- Craig Jones contributes a modified and enhanced version of make_snapshot.sh.
- Here is a very schnazzy perl script from Bart Vetters with built-in POD documentation
- Stuart Sheldon has contributed mirror.dist , a substantial improvement to the original shell script.
- Aaron Freed contributed two scripts from his KludgeKollection page, snapback and snaptrol .
References
- Rsync main site
- rdiff-backup , Ben Escoto's remote incremental backup utility
- The GNU coreutils package (which includes the part formerly known as fileutils, thanks to Nathan Rosenquist for pointing that out to me).
- dirvish , a similar but slightly more sophisticated tool from J.W. Schultz.
- rsback , a backup front-end for rsync, by Hans-Juergen Beie.
- ssync , a simple sync utility which can be used instead of rsync in certain cases. Thanks to Patrick Finerty Jr. for the link.
- bobs , the Browseable Online Backup System, with a snazzy web interface; I look forward to trying it! Thanks to Rene Rask.
- LVM , the Logical Volume Manager for Linux. In the context of LVM, snapshot means one image of the filesystem, frozen in time. Might be used in conjunction with some of the methods described on this page.
- glastree , a very nice snapshot-style backup utility from Jeremy Wohl
- mirrordir , a less memory-intensive (but more network-intensive) way to do the copying.
- A filesystem-level backup utility, rumored to be similar to Glastree and very complete and usable: storebackup . Thanks to Arthur Korn for the link!
- Gary Burd has posted a page which discusses how to use this sort of technique to back up laptops. He includes a very nice python script with it.
- Jason Rust implemented something like this in a php script called RIBS. You can find it here . Thanks Jason!
- Robie Basak pointed out to me that debian's fakeroot utility can help protect a backup server even if one of the machines it's backing up is compromised and an exploitable hole is discovered in rsync (this is a bit of a long shot, but in the backup business you really do have to be paranoid). He sent me this script along with this note explaining it.
- Michael Mayer wrote a handy and similar tutorial which is rather nicer than this one--has screenshots and everything! You can find it here .
- The rsnapshot project by Nathan Rosenquist which provides several extensions and features beyond the basic script here, and is really organized--it seems to be at a level which makes it more of a real package than a do-it-yourself hack like this page is. Check it out!
- Abe Loveless has written a howto for applying the rsync/hardlink backup strategy on the e-smith distribution .
- Mike Heins wrote Snapback2, a highly improved adaptation of Art Mulder's original script, which includes (among other features) an apache-style configuration file, multiple redundant backup destinations, and safety features.
- Poul Petersen's Wombat backup system, written in Perl, supports threading for multiple simultaneous backups.
Frequently Asked Questions
- Q: What happens if a file is modified while the backup is taking place?
- A: In rsync, transfers are done to a temporary file, which is cut over atomically, so the transfer either happens in its entirety or not at all. Basically, rsync does "the right thing," so you won't end up with partially-backed-up files. Thanks to Filippo Carletti for pointing this out. If you absolutely need a snapshot from a single instant in time, consider using Sistina's LVM (see reference above).
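For illustration only, a point-in-time copy with LVM could be taken roughly like this before running the usual rsync (the volume group, names, sizes, and mount points are hypothetical, and this is not part of the scripts in this article):

lvcreate --size 1G --snapshot --name home_snap /dev/vg0/home   # freeze a point-in-time view of the origin volume
mount -o ro /dev/vg0/home_snap /mnt/home_snap                  # back up from the frozen copy instead of the live data
rsync -a --delete /mnt/home_snap/ /root/snapshot/home/hourly.0/
umount /mnt/home_snap
lvremove -f /dev/vg0/home_snap                                 # discard the snapshot when done

The snapshot volume only needs enough space to hold the blocks that change while it exists.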
- Q: I really need the original permissions and ownerships in the snapshots, and not the latest ones. How can I accomplish that?
- A: J.W. Schultz has created a --link-dest patch for rsync which takes care of the hard-linking part of this trick (instead of cp -al). It can preserve permissions and ownerships. As of rsync-2.5.6, it is now standard. See the discussion above.
- Q: I am backing up a cluster of machines (clients) to a backup server (server). What's the best way to pull data from each machine in the cluster?
- A: Run sshd on each machine in the cluster. Create a passwordless key pair on the server, and give the public key to each of the client machines, restricted to the rsync command only (with PermitRootLogin set to forced-commands-only in the sshd_config file).
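One hypothetical way to wire that up on a client (the wrapper script name and key material are invented; adapt to your own setup):

# /etc/ssh/sshd_config on the client being backed up
PermitRootLogin forced-commands-only

# /root/.ssh/authorized_keys on the client: the backup server's key, restricted
command="/usr/local/bin/validate-rsync.sh",no-port-forwarding,no-pty,no-X11-forwarding ssh-rsa AAAA...key... backup-server

Here /usr/local/bin/validate-rsync.sh would be a small script that inspects $SSH_ORIGINAL_COMMAND and executes it only if it begins with "rsync --server", rejecting anything else.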
- Q: I am backing up many different machines with user accounts not necessarily shared by the backup server. How should I handle this?
- A: Be sure to use the --numeric-ids option to rsync so that ownership is not confused on the restore. Thanks to Jon Jensen for this tip!
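A hypothetical invocation (the host and paths are invented):

rsync -a --delete --numeric-ids -e ssh root@client1:/home/ /root/snapshot/client1/home/hourly.0/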
- Q: Can I see a nontrivial example involving rsync include and exclude rules?
- A: Martijn Kruissen sent in an email which includes a nice example; I've posted part of it here .
Feb 04, 2017 | www.cyberciti.biz
Build directory trees in a single command
You can create a whole directory tree in a single command by passing the -p option to mkdir:
mkdir -p /jail/{dev,bin,sbin,etc,usr,lib,lib64}
ls -l /jail
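The braces also nest, so a deeper skeleton can still be built in one shot; a made-up example:

mkdir -p /jail/{etc/{ssh,cron.d},usr/{bin,lib64},var/{log,run}}
ls -R /jail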
March 26, 2016 OSTechNix
Data is the backbone of a company, so performing backups at regular intervals is one of the vital roles of a system administrator. Here are my five favourite backup tools. I won't say these are the best, but these are the backup tools I consider first when it comes to data backup.
Let me explain some of my preferred backup tools.
1. BACULA
BACULA is a powerful backup tool. It is easy to use and efficient at recovering lost data and damaged files, both on the local system and remotely. It has a rich user interface (UI) and works on different platforms, including Windows and Mac OS X.
Concerning BACULA features, I can list the following:
- SD-SD replication.
- Enterprise binaries available for Univention.
- Restore performance improved for hard data files.
- Periodic status on running jobs in Director status report.
BACULA has the following components.
- Director – This is the application that supervises the complete bacula.
- Console – The interface used to communicate with the BACULA Director.
- File – Used to back up the files.
- Storage – This component performs read and write operations to the storage space.
- Catalog – This application is responsible for the database log details.
- Monitor – This application allows the admin to keep an eye on the various BACULA tools.
2. FWBACKUPS
FWBACKUPS is the easiest of all Linux backup tools. It has a rich user interface and is also a cross-platform tool.
One of the notable features of FWBACKUPS is remote backup: we can back up data from various systems remotely.
Some FWBACKUPS features are listed below:
- Simple interface – Backing up and restoring documents is simple for the user.
- Cross-platform – It supports different platforms such as Windows and Mac OS X; data backed up on one system can be restored on another.
- Remote backup – All types of files can be handled remotely.
- Scheduled backups – Run a backup once or periodically.
- Speed – Backups are faster because only the changes are copied.
- Organized and clean – It keeps backups organized, removes expired ones, and lists the backups you can restore from by date.
3. RSYNC
RSYNC is a widely used backup tool on Linux. It is a command-line tool that can copy data both locally and remotely, and it is well suited for automated backups driven by scripts.
Some of the notable features are listed below:
- It can update whole directory trees and filesystems.
- It can use ssh, rsh or direct sockets as the transport.
- It supports anonymous rsync, which is ideal for mirroring.
- Bandwidth limits and file-size limits can be set.
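To give a feel for how those features combine in practice, a single nightly pull over ssh with a bandwidth cap might look like this (the hostname and paths are invented):

rsync -az --delete --bwlimit=5000 -e ssh backup@fileserver:/srv/data/ /backup/fileserver/data/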
4. URBACKUP
URBACKUP is a client/server backup system that works well in both Windows and Linux environments. File and image backups are made while the system is running, without interrupting running processes.
Here are some of the features of this tool:
- A whole partition can be saved as a single directory.
- Image and file backups are made while the system is running.
- Fast file and image transmission.
- Clients can change settings such as backup frequency; next to no configuration is needed.
- The URBACKUP web interface clearly shows client status, running backups, and backup issues.
5. BACKUP PC
BACKUP PC is a high-performance, enterprise-grade backup tool. It is highly configurable and easy to install, use and maintain.
It reduces the cost of disks and RAID systems. BACKUP PC is written in Perl and extracts backup data using the Samba service.
It is robust, reliable, well documented and freely available as open source on Sourceforge .
Features:
- No client-side software is needed. The standard SMB protocol is used to extract backup data.
- A powerful web interface provides access to log files, configuration and current status, and allows users to initiate and cancel backups and to browse and restore files from backups.
- It supports mobile environments where laptops are only intermittently connected to the network and have dynamic IP addresses.
- Users receive email reminders if their PC has not recently been backed up.
- Open source and freely available under GPL.
These are the backup tools I use most. What's your favourite? Let us know in the comment section below.
Thanks for stopping by.
Cheers!
Nov 05, 2016 | freecode.com
Relax-and-Recover (Rear) is a bare metal disaster recovery and system migration solution, similar to AIX mksysb or HP-UX ignite. It is composed of a modular framework and ready-to-go workflows for many common situations to produce a bootable image and restore from backup using this image. It can restore to different hardware, and can therefore be used as a migration tool as well. It supports various boot media (including tape, USB, or eSATA storage, ISO, PXE, etc.), a variety of network protocols (including SFTP, FTP, HTTP, NFS, and CIFS), as well as a multitude of backup strategies (including IBM TSM, HP DataProtector, Symantec NetBackup, Bacula, and rsync). It was designed to be easy to set up, requires no maintenance, and is there to assist when disaster strikes. Recovering from disaster is made very straight-forward by a 2-step recovery process so that it can be executed by operational teams when required. When used interactively (e.g. when used for migrating systems), menus help make decisions to restore to a new (hardware) environment.
- 1.14 (01 Oct 2012): Integrated duply/duplicity support. systemd support has been added. Various small fixes and improvements to tape support, Xen, PPC, Gentoo, Fedora, multi-arch, storage layout configuration, and serial console integration.
- 1.13.0 (21 Jun 2012): This release adds support for multipathing, adds several improvements to distribution backward compatibility, improves ext4 support, makes various bugfixes, migrates HWADDR after recovery, and includes better systemd support.
- 1.12.0 (01 Dec 2011): Multi-system and multi-copy support on USB storage devices. Basic rsync backup support. More extensive exclude options. The new layout code is enabled by default. Support for Arch Linux. Improved multipath support. Experimental btrfs support.
- 1.11.0 (21 Nov 2011): Standardization of the command line. The default is quiet output; use the option -v for the old behavior. Boot images now have a comprehensive boot menu. Support for IPv6 addresses. Restoring NBU backup from a point in time is supported. Support for Fedora 15 (systemd) and RHEL6/SL6. Improved handling of HP SmartArray. Support for ext4 on RHEL5/SL5. Support for Xen paravirtualization. Integration with the local GRUB menu. Boot images can now be centralized through network transfers. Support for udev on RHEL4. Many small improvements and performance enhancements.
- 1.6 (11 Dec 2007): This release supports many recent distributions, including "upstart" (Ubuntu 7.10). It has more IA-64 support (RHEL5 only at the moment), better error reporting and catching, Debian packages (mkdeb), and improved TSM support.
www.howdididothat.info
21 August 2014
Start a backup on the CentOS machine
Add the following lines to /etc/rear/local.conf:
OUTPUT=ISO
BACKUP=NETFS
BACKUP_TYPE=incremental
BACKUP_PROG=tar
FULLBACKUPDAY="Mon"
BACKUP_URL="nfs://NFSSERVER/path/to/nfs/export/servername"
BACKUP_PROG_COMPRESS_OPTIONS="--gzip"
BACKUP_PROG_COMPRESS_SUFFIX=".gz"
BACKUP_PROG_EXCLUDE=( '/tmp/*' '/dev/shm/*' )
BACKUP_OPTIONS="nfsvers=3,nolock"OUTPUT=ISO
BACKUP=NETFS
BACKUP_TYPE=incremental
BACKUP_PROG=tar
FULLBACKUPDAY="Mon"
BACKUP_URL="nfs://NFSSERVER/path/to/nfs/export/servername"
BACKUP_PROG_COMPRESS_OPTIONS="--gzip"
BACKUP_PROG_COMPRESS_SUFFIX=".gz"
BACKUP_PROG_EXCLUDE=( '/tmp/*' '/dev/shm/*' )
BACKUP_OPTIONS="nfsvers=3,nolock"
Now make a backup:
[root@centos7 ~]# rear mkbackup -v
Relax-and-Recover 1.16.1 / Git
Using log file: /var/log/rear/rear-centos7.log
mkdir: created directory '/var/lib/rear/output'
Creating disk layout
Creating root filesystem layout
TIP: To login as root via ssh you need to set up /root/.ssh/authorized_keys or SSH_ROOT_PASSWORD in your configuration file
Copying files and directories
Copying binaries and libraries
Copying kernel modules
Creating initramfs
Making ISO image
Wrote ISO image: /var/lib/rear/output/rear-centos7.iso (90M)
Copying resulting files to nfs location
Encrypting disabled
Creating tar archive '/tmp/rear.QnDt1Ehk25Vqurp/outputfs/centos7/2014-08-21-1548-F.tar.gz'
Archived 406 MiB [avg 3753 KiB/sec] OK
Archived 406 MiB in 112 seconds [avg 3720 KiB/sec]
Now look on your NFS server
You'll see all the files you'll need to perform the disaster recovery.
total 499M
drwxr-x--- 2 root root 4.0K Aug 21 23:51 .
drwxr-xr-x 3 root root 4.0K Aug 21 23:48 ..
-rw------- 1 root root 407M Aug 21 23:51 2014-08-21-1548-F.tar.gz
-rw------- 1 root root 2.2M Aug 21 23:51 backup.log
-rw------- 1 root root  202 Aug 21 23:49 README
-rw------- 1 root root  90M Aug 21 23:49 rear-centos7.iso
-rw------- 1 root root 161K Aug 21 23:49 rear.log
-rw------- 1 root root    0 Aug 21 23:51 selinux.autorelabel
-rw------- 1 root root  277 Aug 21 23:49 VERSION
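To run this on a schedule rather than by hand, a cron entry along these lines could be used (the timing is hypothetical and should match the FULLBACKUPDAY convention set in local.conf; adjust the path if rear is installed elsewhere):

# /etc/cron.d/rear (sketch): run the backup every night at 01:30
30 1 * * * root /usr/sbin/rear mkbackup >> /var/log/rear/rear-cron.log 2>&1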
dheeraj says:
31 August 2016 at 02:26
Is it possible to give a list of directories or mount points to exclude from the backup when running mkbackup, for example by giving a file with a list of all directories that need to be excluded?
masterdam79 says:
26 September 2016 at 21:50
Have a look at https://github.com/rear/rear/issues/216
Should be possible if you ask me.
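For what it's worth, such exclusions normally go into the same /etc/rear/local.conf shown above; a hypothetical sketch (the patterns are invented; see the linked issue and the ReaR documentation for the exact semantics):

# append extra patterns to the tar exclude list
BACKUP_PROG_EXCLUDE=( "${BACKUP_PROG_EXCLUDE[@]}" '/var/cache/*' '/home/*/Downloads/*' )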
Obnam is an easy, secure backup program. Backups can be stored on local hard disks, or online via the SSH SFTP protocol. The backup server, if used, does not require any special software, on top of SSH.
Some features that may interest you:
- Snapshot backups. Every generation looks like a complete snapshot, so you don't need to care about full versus incremental backups, or rotate real or virtual tapes.
- Data de-duplication, across files, and backup generations. If the backup repository already contains a particular chunk of data, it will be re-used, even if it was in another file in an older backup generation. This way, you don't need to worry about moving around large files, or modifying them. (However, the current implementation has some limitations: see dedup).
- Encrypted backups, using GnuPG.
See the tutorial, and below for links to the manual, which has examples of how to use Obnam.
Obnam can do push or pull backups, depending on what you need. You can run Obnam on the client, and push backups to the server, or on the server, and pull from the client over SFTP. However, access to live data over SFTP is currently somewhat limited and fragile, so it is not recommended.
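A minimal command-line sketch, assuming a local repository directory (the path is invented; check the Obnam manual for the exact option names and generation specifiers):

obnam backup --repository /srv/backups/repo $HOME        # each run creates a new generation
obnam generations --repository /srv/backups/repo         # list the available snapshot generations
obnam restore --repository /srv/backups/repo --to /tmp/restore-test --generation latest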
Documentation
- The full manual (currently a work in progress): available as a web page and PDF.
- README (updated at release time)
- NEWS (updated at release time)
- obnam manual page
- FAQ
- Development stuff
Links
By Phil on Wednesday, July 17 2013, 16:41 - Computing
I recently discovered bittorrent-sync (btsync) and I'm in love! This post details how I've implemented this fantastic tool to build a resilient, distributed and FREE backup solution for my servers.
December 4, 2012. It's not Acronis True Image that is the problem; it's mostly the users ;-)
This review is inspired by the discussion initiated by user Dave in comments to the negative review of Acronis True Image 2013 by Tech Guy.
I would like to thank Dave for his contribution to this discussion. He managed to convert a rant into something that is interesting and educational to read. Again, thanks a lot Dave. Your efforts are greatly appreciated.
To counter negative reviews of this actually decent and pretty valuable product, I would like to state that I am a long-time user of Acronis. I used the product regularly (daily) from late 2009 to 2012 (I switched from Norton Ghost because Symantec destroyed that product). I used Acronis 10 until recently, and now I have switched to Acronis 13 as I got a Windows 8 PC (I never used the versions in between). In all those years (and it took me probably a year to learn the features as well as the strong and weak points of the program, including the different reliability of restores from the boot disk versus from Windows) it failed me only once, and in that case I was probably as much the culprit as the program. An Acronis image is monolithic, and failure of one part makes the whole image unusable. So it is suitable only for small images, say below 60 GB, where buying additional disk drives for a 1:1 copy would be expensive and impractical. I think this is a side effect of compression (the format is proprietary). In any case this is a serious weakness of the product and should be taken into account by any user. IMHO the image should consist of logical blocks so that if one block fails the others can still be restored. And, as in real filesystems, key directory data should be duplicated in several places in the backup. Currently the image is an "all or nothing" proposition, which means that it is unsuitable for valuable data without verification (or several verifications) and "dry run" restorations. This "feature" also implies that you should keep several "generations" of the image for your safety, not a single ("the last") image.
To those who have been in the field for a long time, it is clear that backing up a huge amount of data is a serious, pretty difficult and expensive business. Decent equipment for backups of large amounts of data is very expensive; look at the corporate market for backup devices. So users can benefit from investment in better hardware. IMHO an enclosure with two mirrored 7200 RPM drives and an eSATA/USB 3.0 interface is a must if your backup size in Acronis is between 20 and 60GB (less than that can be backed up via USB 2.0). With a 1TB drive you can store several dozen generations of your backup without running out of space. You also need to religiously verify your backups from a second computer, including running "test restores" to another USB 3.0 or eSATA drive, to ensure that your data are safe without loading your main computer.
If the image is bigger than 60GB I would not use imaging at all. It is safer to buy several identical hard drives (three or more if the data are valuable) and copy your data partition 1:1 onto them using the Acronis clone-partition feature. This way you will not face the problem of a corrupt image. If the drive is the same model as in your laptop or desktop, it can also serve as a replacement if your main drive fails (actually, for heavily used laptops I recommend replacing the drive every three years even if it has not failed).
Another advantage is that in this case, after the first copy, you can just resync the backup partition with the primary partition, which is much faster and safer than pushing hundreds of gigabytes over the eSATA or USB 3.0 channel.
Please note that recovering, say, 30GB of data from an 80GB drive that has physically failed and has no usable backup can easily run $2-3K. There is no free lunch in this business. From this point of view, a $150 eSATA/USB 3.0 enclosure with two mirrored 1TB 7200 RPM (or better, 10K RPM) drives for your backups is a bargain.
This "there is no free lunch" situation has another side: any sloppy or semi-competent approach to backup will inevitably be punished. This is a typical Greek tragedy situation when the hero who considers himself invincible is destroyed by exactly same forces that led to his success in the first place. In this case this is his ability to accumulate huge amount of data.
And typically a user projects their own flaws onto the program: it should do this, it should do that. That's terribly naive. A program exists in a marketplace, and it is the customers' demands which shape it. Currently all those programs are compared on features, so it is the customers who are destroying this and some other great software products with unreasonable demands for features that they are actually unable to use, due to the excessive complexity of the product, and which are generally incompatible with the basic architecture of an imaging program. This is an imaging program, not a file backup program, but it tries to be both.
Resulting complexity interferes with the basic and the most valuable functionality: the ability to create disk images and restore them on a partitions of different sizes.
But still, while far from perfect, the program has acceptable reliability, and the fact that a free version is supplied with both Seagate and WD disks says something about its quality.
My impression is that in probably 80% of backup failures the key reason for the failure is the user's approach to backup. So it is the user, not the program, that is the culprit. Only 20% are somehow related to problems with the backup program you use.
As for reliability, all Acronis users should once and forever understand that you can't have all those wonderful, complex features and have the same reliability as Unix dd.
This is a very complex program which can do amazing things such as "Try and forget". And its key feature is not what Tech Guy wants it to be. The key feature that sells Acronis is the ability to restore images onto partitions of a different size. It is a very reliable, excellent implementation. It is able to restore NTFS partitions onto partitions of a different size even if the user is "dumb as a doorknob" and did nothing to run chkdsk, clean NTFS, defragment files, delete junk (at least at the level of CCleaner), remove duplicates and do other sensible things before the backup to shrink the size of the dataset and increase the chances of restoring the data. After all, these are Microsoft NTFS partitions -- a very complex modern filesystem with a lot of amazing capabilities and undocumented features. Moreover, most users do not even have a separate data partition and use just drive C for everything, which is a big no-no if you have a lot of data. Windows 7 actually has the ability to shrink the C drive and create such a data partition out of the box. This way you can back up your data separately from your OS.
Actually, heroes who back up 300GB of data to a single 1TB USB 2.0 drive using a compressed Acronis image and then never verify the integrity of those images until it is too late represent an interesting subtype of Windows users. In a way, they get what they deserve. It is a side effect of the technology revolution, which creates an illusion that the restrictions of the physical world no longer exist. They never lived in the world of unreliable media and failing hard drives that professional sysadmins live in, and thus are unable (and unwilling) to understand the dangers and tradeoffs inherent in creating a compressed image of a huge drive. Especially if the drive is starting to have problems, or the OS is infected with malware and creating a backup is/was the last effort to save the data. So the first failure, when valuable data vanish, comes as a huge shock. It is important to draw the proper lesson from such cases and not blame the shoes when the problem is with the dancer.
Nov 04, 2016 | github.com
Relax-and-Recover is written in Bash (at least bash version 3 is needed), a language that can be used in many styles. We want to make it easier for everybody to understand the Relax-and-Recover code and subsequently to contribute fixes and enhancements.
Here is a collection of coding hints that should help to get a more consistent code base.
Don't be afraid to contribute to Relax-and-Recover even if your contribution does not fully match all these coding hints. Currently large parts of the Relax-and-Recover code are not yet in compliance with these coding hints. This is an ongoing step-by-step process. Nevertheless, try to understand the idea behind these coding hints so that you know how to break them properly (i.e. "learn the rules so you know how to break them properly").
The overall idea behind these coding hints is:
Make yourself understood to enable others to fix and enhance your code properly as needed.
From this overall idea the following coding hints are derived.
For the fun of it, here is an extreme example of what coding style should be avoided:
#!/bin/bash
for i in `seq 1 2 $((2*$1-1))`;do echo $((j+=i));done
Try to find out what that code is about - it does a useful thing.
Code must be easy to read

- Variables and functions must have names that explain what they do, even if it makes them longer. Avoid too short names; in particular do not use one-letter names (like a variable named i - just try to 'grep' for it over the whole code to find code that is related to i). In general names should consist of two parts, a generic part plus a specific part, to make them meaningful. For example dev is basically meaningless because there are so many different kinds of device-like thingies. Use names like boot_dev or even better boot_partition versus bootloader_install_device to make it unambiguous what that thingy actually is about. Use different names for different things so that others can 'grep' over the whole code and get a correct overview of what actually belongs to a particular name.
- Introduce intermediate variables with meaningful names to tell what is going on.
For example instead of running commands with obfuscated arguments like
rm -f $( ls ... | sed ... | grep ... | awk ... )
which looks scary (what the heck gets deleted here?), better use

foo_dirs="..."
foo_files=$( ls $foo_dirs | sed ... | grep ... )
obsolete_foo_files=$( echo $foo_files | awk ... )
rm -f $obsolete_foo_files

that tells the intent behind it (regardless whether or not that code is the best way to do it - but now others can easily improve it).
- Use functions to structure longer programs into code blocks that can be understood independently.
- Don't use || and && one-liners; write proper if-then-else-fi blocks. Exceptions are simple do-or-die statements like COMMAND || Error "meaningful error message" and only if it aids readability compared to a full if-then-else clause.
- Use $( COMMAND ) instead of backticks `COMMAND`.
- Use spaces when possible to aid readability, like

output=( $( COMMAND1 OPTION1 | COMMAND2 OPTION2 ) )

instead of

output=($(COMMAND1 OPTION1|COMMAND2 OPTION2))
Code should be easy to understand

Do not only tell what the code does (i.e. the implementation details) but also explain the intent behind it (i.e. why) to make the code maintainable.
- Provide meaningful comments that tell what the computer should do and also explain why it should do it, so that others understand the intent behind the code and can properly fix issues or adapt and enhance it as needed.
- If there is a GitHub issue or another URL available for a particular piece of code provide a comment with the GitHub issue or any other URL that tells about the reasoning behind current implementation details.
Here is the initial example again, written so that one can understand what it is about:
#!/bin/bash
# output the first N square numbers
# by summing up the first N odd numbers 1 3 ... 2*N-1
# where each nth partial sum is the nth square number
# see https://en.wikipedia.org/wiki/Square_number#Properties
# this way it is a little bit faster for big N compared to
# calculating each square number on its own via multiplication
N=$1
if ! [[ $N =~ ^[0-9]+$ ]] ; then
    echo "Input must be non-negative integer." 1>&2
    exit 1
fi
square_number=0
for odd_number in $( seq 1 2 $(( 2 * N - 1 )) ) ; do
    (( square_number += odd_number )) && echo $square_number
done
Now the intent behind it is clear and others can easily decide whether that code is really the best way to do it, and easily improve it if needed.
Try to care about possible errors

By default bash proceeds with the next command when something failed. Do not let your code blindly proceed in case of errors, because that could make it hard to find the root cause of a failure when it errors out somewhere later at an unrelated place with a weird error message, which could lead to false fixes that cure only a particular symptom but not the root cause.

- In case of errors it is better to abort than to blindly proceed.
- At least test mandatory conditions before proceeding. If a mandatory condition is not fulfilled, abort with Error "meaningful error message" (see 'Relax-and-Recover functions' below).
- Preferably in new scripts use set -ue to die from unset variables and unhandled errors, and use set -o pipefail to better notice failures in a pipeline. When leaving the script, restore the Relax-and-Recover default bash flags and options with apply_bash_flags_and_options_commands "$DEFAULT_BASH_FLAGS_AND_OPTIONS_COMMANDS" (see usr/sbin/rear).
- TODO: Use set -eu and set -o pipefail also in existing scripts, see "make rear working with 'set -ue -o pipefail'".

Maintain Backward Compatibility

Implement adaptions and enhancements in a backward compatible way so that your changes do not cause regressions for others.
- One and the same Relax-and-Recover code must work on various different systems: on older systems as well as on the newest systems, and on various different Linux distributions.
- Preferably use simple generic functionality that works on any Linux system. Better very simple code than oversophisticated (possibly fragile) constructs. In particular avoid special bash version 4 features (Relax-and-Recover code should also work with bash version 3).
- When there are incompatible differences on different systems, a distinction of cases with separated code is needed, because it is more important that the Relax-and-Recover code works everywhere than having generic code that sometimes fails.

Dirty hacks welcome
When there are special issues on particular systems it is more important that the Relax-and-Recover code works than having nice looking clean code that sometimes fails. In such special cases any dirty hacks that intend to make it work everywhere are welcome. But for dirty hacks the above listed coding hints become mandatory rules:
- Provide explanatory comments that tell what a dirty hack does, together with a GitHub issue or any other URL that tells about the reasoning behind the dirty hack, to enable others to properly adapt or clean it up at any time later, when the reason for it has changed or gone away.
- Try as well as you can to foresee possible errors or failures of a dirty hack and error out with meaningful error messages if things go wrong, to enable others to understand the reason behind a failure.
- Implement the dirty hack in a way so that it does not cause regressions for others.
For example a dirty hack like the following is perfectly acceptable:
# FIXME: Dirty hack to make it work
# on "FUBAR Linux version 666"
# where COMMAND sometimes inexplicably fails
# but always works after at most 3 attempts
# see http://example.org/issue12345
# Retries should have no bad effect on other systems
# where the first run of COMMAND works.
COMMAND || COMMAND || COMMAND || Error "COMMAND failed."
Character Encoding

Use only traditional (7-bit) ASCII characters. In particular do not use UTF-8 encoded multi-byte characters.

- Non-ASCII characters in scripts may cause arbitrary unexpected failures on systems that do not support locales other than POSIX/C. During "rear recover" only the POSIX/C locale works (the ReaR rescue/recovery system has no support for non-ASCII locales) and /usr/sbin/rear sets the C locale, so non-ASCII characters are invalid in scripts. Have in mind that basically all files in ReaR are scripts. E.g. /usr/share/rear/conf/default.conf and /etc/rear/local.conf are also sourced (and executed) as scripts.
- English documentation texts do not need non-ASCII characters. Using non-ASCII characters in documentation makes it needlessly hard to display the documentation correctly for any user on any system. When non-ASCII characters are used but the user does not have exactly the right matching locale set, arbitrary nonsense can happen, cf. https://en.opensuse.org/SDB:Plain_Text_versus_Locale

Text Layout

- Indentation with 4 blanks, not tabs.
- Block level statements on the same line:

if CONDITION ; then
Variables

- Curly braces only where really needed: $FOO instead of ${FOO}, but ${FOO:-default_foo}.
- All variables that are used in more than a single script must be all-caps: $FOO instead of $foo or $Foo.
- Variables that are used only locally should be lowercased and should be marked with local, like:

local foo="default_value"

Functions
- Use the function keyword to define a function.
- Function names are lower case, words separated by underscores (_).

Relax-and-Recover functions

Use the available Relax-and-Recover functions when possible instead of re-implementing basic functionality again and again. The Relax-and-Recover functions are implemented in various lib/*-functions.sh files.
- is_true and is_false: See lib/global-functions.sh how to use them. For example instead of using

if [[ ! "$FOO" =~ ^[yY1] ]] ; then

use

if ! is_true "$FOO" ; then

test, [, [[, ((
- Use [[ where it is required (e.g. for pattern matching or complex conditionals) and [ or test everywhere else.
- (( is the preferred way for numeric comparison; variables don't need to be prefixed with $ there.

Paired parenthesis
- Use paired parenthesis for case patterns as in

case WORD in (PATTERN) COMMANDS ;; esac
so that editor commands (like '%' in 'vi') that check for matching opening and closing parenthesis work everywhere in the code.
About: dobackup.pl is a flexible Perl script to handle unattended incremental backups of multiple servers. It handles multiple media sets with automatic media preparation and rotation, configurable 'what-to-backup', global per-host exclusion patterns, and user settable 'don't-back-this-up' metafiles. Its design goal is zero-maintenance, nothing to do except change the media when told.
Changes: Fixed the options that were broken during the recent Getopt::Long switchover, as well as base-2 numbers and SI (KiB, MiB, etc.) unit reporting. Some previously unimplemented options have had code added, and a usable man page has been included, along with revised documentation.
July 15, 2010 | Channel Register
Posted in Enterprise, 10:04 GMT
You can't rely on disk drive manufacturers to tell you much at all about new higher-capacity drives coming our way, but you can get hints from their OEMs. ProStor CEO Frank Herbist recently gave out a loud one.

ProStor makes removable RDX disk drives, 2.5-inch drives inside a rugged case, and the InfiniVault RDX autoloader and library appliance with up to 100 slots. The company's products are a substitute for tape, and it offers a 750GB RDX currently, with a 1TB one coming by the end of September.

Herbist told SearchStorage that he'd have a 1.5TB RDX by the end of the year. No hard disk drive vendor has publicly announced a 1.5TB 2.5-inch product. With stories of 3TB 3.5-inch drives being evaluated by OEMs, and 3.5-inch drives having twice the capacity of 2.5-inch ones, a 1.5TB 2.5-inch drive in this timescale makes sense.

A 1.5TB RDX would equal the capacity of an LTO-5 tape, and RDX products would be a viable replacement in capacity terms for all DAT and LTO installations that use single drives, autoloaders and smaller libraries.
It wouldn't be viable for 100 slot-plus tape library installations unless much bigger InfiniVaults are planned. They probably are.
A 1.5TB 2.5-inch drive would be good news for notebook users and external storage products. It would also surely find a role in enterprise storage arrays. At this stage all we know is that one hard disk drive OEM customer has said it's coming. It's hearsay.
Here's a great HOWTO for making incremental backups with rsync over ssh. I'm using this method on my home network, as well as for some clients. It works great and has never failed me. This same HOWTO appears in O'Reilly's "Linux Hacks".
March 8, 2010 | Enterprise Networking Planet
We've all seen countless articles, blog and forum posts explaining how to back up a server with rsync and other tools. While I've cringed when people talked about using non-scalable methods, there actually is a place for quick and dirty backup mechanisms. Small companies running just a few virtual machines in the cloud, or even enterprises with test instances, may wish for a quick and effective backup.
Here is a great way to back up a one-off server, including its MySQL database. To work best with hosted virtual machines, it is important not to store backup data on the virtual machine itself. The script below compresses all data and ships it across the wire to a backup server in real time, implementing bandwidth throttles to avoid pummeling the remote server. This will work on any Linux server, especially a recent Debian install.
Considerations
In Unix-land, we often worry about how various archival tools will handle different types of files. Will sparse files be preserved, or will the archive tool copy all the zeroes? Will permissions (and extended ACLs) be preserved? Will hardlinks result in two copies? All good questions, and all handled fairly well with both rsync and tar using the right options, as we will see in a moment. Next is the issue of incremental backups. The great thing about centrally-managed backup software is that it generally handles incremental backups quite well. Scripting something yourself requires you do this manually, but not to worry, I've got a few tricks to show you.
Finally, we need to decide which backup methods to use. You can take a whole disk image if your hosting provider allows it, but that makes restoring files annoying and it also results in many copies of the same data.
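If you do take the whole-image route, a minimal sketch of the idea (the device name, host and paths below are hypothetical, and the disk should be idle, e.g. booted from a rescue environment) might be:

# stream a compressed raw image of the whole disk to the backup host
dd if=/dev/sda bs=1M | gzip -c | ssh user@backuphost "cat > /backups/sda-$(date +%Y_%m_%d).img.gz"
# and the corresponding restore, overwriting the whole disk
ssh user@backuphost "cat /backups/sda-2010_03_07.img.gz" | gunzip -c | dd of=/dev/sda bs=1M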
Using rsync for backup has problems. If we don't use --delete, which tells rsync to delete files in that archive that have been deleted on the server, then we get an ever-growing archive of all files that have ever been created. Even if they have been deleted, the backup will always have them. If we use --delete, then it may be impossible to restore accidentally deleted files the next day. Bummer. Some people work around this by starting a new backup and then deleting the old after a week or so, but that's annoying to manage.
Ideally, we'd have both the simplicity and convenience of an rsync'ed file system at our fingertips, along with nightly snapshots. I prefer to rsync critical file systems with --delete nightly, which usually happens very fast, and also tar up the file system for archiving.
Doing It
First, there are some strange tricks I'd like to show you with regards to shipping these backups off-site. I'm not going to provide a copy and paste script, because your paths will be different and it won't work, but I will use a script I wrote yesterday to explain every hurdle I had to overcome. This script runs on the backup server, and backs up critical file systems with rsync and a nightly tar, as well as a MySQL database. It also implements bandwidth throttling on all commands that ship data.
First, it is important to set some variables to avoid typos and writing confusing, redundant commands below.
#!/bin/bash
The backup user and hostname. I've configured my backup server to accept connections to my account from the root SSH key on the remote server, because this backup script will have to run as root.
BACKUP_HOST="[email protected]"
For rsync commands, use these options. I am enabling archive mode, compression, and hardlink preservation, as well as capping the bandwidth used at around 20Mb/s.
RSYNC_CMD="/usr/bin/rsync -azH --delete --bwlimit=2400"
This command is used within rsync's -e option, which is the only way to tell rsync to connect to a remote server on another port, which is required for my situation.
REMOTE_CMD="/usr/bin/ssh -p 2022"
When running tar backups, use the following options: compress, and don't use absolute paths.
TAR_CMD="/bin/tar czfP"
When I'm sending tar files over ssh, use this command to wrap the ssh command in 'trickle' to cap the bandwidth, and also connect to my special ssh port:
TAR_SSH="/usr/bin/trickle -s -u 2400 ssh -p2022"
Where backups will be stored on the remote server:
DESTDIR="/remote/path/to/backup/storage"
Echo the date and time, so that if we're logging this script output, we have a sense of order:
/bin/date
For rsync backups, the following is all that is required. The first line prints what it's about to do, for logging purposes. This will create a /etc/ directory in the specified remote backup directory, which gets synced up.
echo "running /etc backup, destination: $BACKUP_HOST"
$RSYNC_CMD -e "${REMOTE_CMD}" /etc ${BACKUP_HOST}:$DESTDIR
You can run the same commands to backup /home, /var, and /root. These are the most critical file systems, as everything else should be managed by the operating system. It may also be wise to spit out a package list and write it to a remote file in case you need to rebuild your virtual machine from scratch.
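As a sketch of that idea, reusing the $TAR_SSH, $BACKUP_HOST and $DESTDIR variables defined above (the dpkg command assumes a Debian-style system, and the file name is arbitrary):

# save the list of installed packages next to the other backups
echo "saving package list, destination: $BACKUP_HOST"
/usr/bin/dpkg --get-selections | $TAR_SSH $BACKUP_HOST \
    "> ${DESTDIR}package-list-$(date +%Y_%m_%d).txt"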
However, /var/ takes some careful consideration. I did not want to back up the MySQL directory with these file archive methods, since I was going to take a database dump anyway. Here is how to exclude it, assuming it lives in /var/lib/mysql. Notice rsync requires a relative path for --exclude:
Note: lines ending with a \ are continued on the next line; it's all one line in reality.
echo "running /var backup, destination: $BACKUP_HOST"
$RSYNC_CMD -e "${REMOTE_CMD}" --exclude="lib/mysql" /var \
${BACKUP_HOST}:$DESTDIR
Now, to get those nightly snapshots of the critical directories with tar.
First check to see if any archives older than 7 days need to be deleted:
echo "deleting old tar FS backups"
/usr/bin/ssh $BACKUP_HOST -p2022 <<HERE
find /path/to/tarfiles/ -name '*.tar.gz' -and -mtime +7 \
| xargs rm -f
HERE

A heredoc probably wasn't necessary, but if you want to add more stringent checking or other commands, it's nice to simply add another line in. That 'find' command will return all files in the tar backup directory ending in .tar.gz and older than 7 days, feeding them to rm. Now we can start the real tar backup.
This next command inserts our tar command with arguments, and then provides two arguments: '-' instructing tar to send the output to stdout, and '/etc' for the directory to archive. It then pipes it to ssh, which accepts a final argument that is the command to run on the remote server. The remote server command does this: stdin is redirected to our backup directory, plus "/tars" and a file name that indicates the date. The resulting file will be called: etc-2010_03_07.tar.gz.
echo "tar /etc backup starting"
$TAR_CMD - /etc | $TAR_SSH $BACKUP_HOST \
"> ${DESTDIR}tars/etc-$(date +%Y_%m_%d).tar.gz"To ignore the potentially huge MySQL directory, which is pointless to backup when MySQL is running anyway, use these tar arguments for your /var backup:
$TAR_CMD - /var --exclude "/var/lib/mysql" | $TAR_SSH ...
For our database backups, we first check to see if any need deleting, the same way as before:
echo "deleting old tar DB backups"
/usr/bin/ssh $BACKUP_HOST -p2022 <<HERE
find /path/to/db_backups -name '*.sql.gz' -and -mtime +7 \
| xargs rm -f
HERE

Then take a dump, gzip it on the fly, and write it to the remote backup location:
echo "running full DB backup"
/usr/bin/mysqldump --user=root --password='foooo' \
--all-databases | /bin/gzip | $TAR_SSH $BACKUP_HOST \
"> ${DESTDIR}db_backups/$(date +%Y_%m_%d).sql.gz"
You'll want to run this from cron, of course after you've added any other file systems or special items you need backed up.
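For example, a root crontab entry along these lines (the script path and schedule are, of course, hypothetical) runs the whole thing nightly and keeps a log:

# run the backup script every night at 02:30 and append its output to a log
30 2 * * * /usr/local/bin/offsite-backup.sh >> /var/log/offsite-backup.log 2>&1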
BackerUpper is a tool similar to Apple's TimeMachine. It is intended to create snapshot-backups of selected directories or even your full hard drive. From the BackerUpper project page: "Backerupper is a simple program for backing up selected directories over a local network. Its main intended purpose is backing up a user's personal data." This article shows how to install and use BackerUpper on Ubuntu 9.04 (Jaunty Jackalope).
July 21, 2009
Whether you're in the IT industry or you're a computer power user, you need to have a backup tool at the ready. With this tool, you will need scheduled backups, one-time backups, local backups, remote backups, and many other features.
Plenty of proprietary solutions are out there. Some of them are minimal and cost effective, while others are feature-rich and costly. The open source community is no stranger to the world of backups. Here are 10 excellent backup solutions for the Linux operating system. In fact, some of these are actually cross platform and will back up Linux, Windows, and/or Mac.
Note: This article is also available as a PDF download.
- Mondorescue Mondorescue is one of those tools you have around for disaster recovery because one of its strengths is backing up an entire installation. Another strength of Mondorescue is that it can back up to nearly any medium: CD, DVD, tape, NFS, hard disk, etc. And Mondo supports LVM 1/2, RAID, ext2, ext3, ext4, JFS, XFS, ReiserFS, and VFAT. If your file system isn't listed, there is a call on the Mondo Web site to email the developers for a file system request and they will make it work. Mondo is used by large companies, such as Lockheed-Martin, so you know it's reliable
- fwbackups This is, by far, the easiest of all the Linux backup solutions. It is cross platform, has a user-friendly interface, and can do single backups or recurring scheduled backups. The fwbackups tool allows you to do backups either locally or remotely in tar, tar.gz, tar.bZ, or rsync format. You can back up an entire computer or a single file. Unlike many backup utilities, fwbackups is easy to install because it will most likely be found in your distribution's repository. Both backing up and restoring are incredibly easy (even scheduling a remote, recurring scheduled backup). You can also do incremental or differential backups to speed the process.
- Bacula Bacula is a powerful Linux backup solution, and it's one of the few Linux open source backup solutions that's truly enterprise ready. But with this enterprise readiness comes a level of complexity you might not find in any other solution. Unlike many other solutions, Bacula contains a number of components:
- Director - This is the application that supervises all of Bacula.
- Console - This is how you communicate with the Bacula Director.
- File - This is the application that's installed on the machine to be backed up.
- Storage - This application performs the reading and writing to your storage space.
- Catalog - This application is responsible for the databases used.
- Monitor - This application allows the administrator to keep track of the status of the various Bacula tools.
Bacula is not the easiest backup solution to configure and use. It is, however, one of the most powerful. So if you are looking for power and aren't concerned about putting in the time to get up to speed with the configuration, Bacula is your solution.
- Rsync Rsync is one of the most widely used Linux backup solutions. With rsync, you can do flexible incremental backups, either locally or remotely. Rsync can update whole directory trees and file systems; preserve links, ownerships, permissions, and privileges; use rsh, ssh, or direct sockets for connection; and support anonymous connections. Rsync is a command-line tool, although front ends are available (such as Grsync<http://freshmeat.net/projects/grsync/>). But the front ends defeat the flexibility of having a simple command-line backup tool. One of the biggest pluses of using a command-line tool is that you can create simple scripts to use, in conjunction with cron, to create automated backups. For this, rsync is perfect.
- Simple Backup Solution Simple Backup Solution is primarily targeted at desktop backup. It can back up files and directories and allows regular expressions to be used for exclusion purposes. Because Simple Backup Solution uses compressed archives, it is not the best solution for backing up large amounts of pre-compressed data (such as multimedia files). One of the beauties of Simple Backup Solution is that it includes predefined backup solutions that can be used to back up directories, such as /var/, /etc/, /usr/local. SBS is not limited to predefined backups. You can do custom backups, manual backups, and scheduled backups. The user interface is user friendly. One of the downfalls of SBS is that it does not include a restore solution like fwbackups does.
- Amanda Amanda allows an administrator to set up a single backup server and back up multiple hosts to it. It's robust, reliable, and flexible. Amanda uses native Linux dump and/or tar to facilitate the backup process. One nice feature is that Amanda can use Samba to back up Windows clients to the same Amanda server. It's important to note that with Amanda, there are separate applications for server and client. For the server, only Amanda is needed. For the client, the Amanda-client application must be installed.
- Arkeia Arkeia is one of the big boys in the backup industry. If you are looking for enterprise-level backup-restore solutions (and even replication server solutions) and you don't mind paying a premium, Arkeia is your tool. If you're wondering about price, the Arkeia starter pack is $1,300.00 USD - which should indicate the seriousness of this solution. Although Arkeia says it has small to midsize solutions, I think Arkeia is best suited for large business to enterprise-level needs.
- Back In Time Back In Time allows you to take snapshots of predefined directories and can do so on a schedule. This tool has an outstanding interface and integrates well with GNOME and KDE. Back In Time does a great job of creating dated snapshots that will serve as backups. However, it doesn't use any compression for the backups, nor does it include an automated restore tool. This is a desktop-only tool.
- Box Backup Box Backup is unique in that not only is it fully automated but it can use encryption to secure your backups. Box Backup uses both a client daemon and server daemon, as well as a restore utility. Box Backup uses SSL certificates to authenticate clients, so connections are secure. Although Box Backup is a command-line solution, it is simple to configure and deploy. Data directories are configured, the daemon scans those directories, and if new data is found, it is uploaded to the server. There are three components to install: bbstored (backup server daemon), bbackupd (client daemon), and bbackupquery (backup query and restore tool). Box Backup is available for Linux, OpenBSD, Windows (Native only), NetBSD, FreeBSD, Darwin (OS X), and Solaris.
January 1, 2008 | http://www2.backup-manager.org/
This is a backup program, designed to help you make daily archives of your file system.
Written in bash and perl, it can make tar, tar.gz, tar.bz2, and zip archives and can be run in a parallel mode with different configuration files. Other archives are possible: MySQL or SVN dumps, incremental backups…
Archives are kept for a given number of days and the upload system can use FTP, SSH or RSYNC to transfer the generated archives to a list of remote hosts.
The program is designed to be as easy to use as possible and is popular with desktop users and sysadmins. The whole backup process is defined in one fully-documented configuration file which needs no more than 5 minutes to tune for your needs. It just works.
rdiff-backup backs up one directory to another. The target directory ends up a copy of the source directory, but extra reverse diffs are stored in a special directory so you can still recover files lost some time ago. The idea is to combine the best features of a mirror and an incremental backup. rdiff-backup can also operate in a bandwidth-efficient manner over a pipe, like rsync. Thus you can use rdiff-backup and ssh to securely back up to a remote location, and only the differences will be transmitted. It can also handle symlinks, device files, permissions, ownership, etc., so it can be used on the entire file system.
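For illustration only (host and paths are hypothetical), a remote rdiff-backup run and a later point-in-time restore look roughly like this:

# back up /home to the remote repository; only differences travel over ssh
rdiff-backup /home user@backuphost::/backups/home
# restore a single file as it was 3 days ago
rdiff-backup --restore-as-of 3D user@backuphost::/backups/home/alice/report.txt /tmp/report.txt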
rsync user
by Richard Harris - Aug 28th 2005 15:08:19

I am the author of Walker, a Python script for uploading sites via ftp and scp. Walker is very good for maintaining sites of moderate size, for use over slow connections, for users with limited resources, and for users who need customized control over the upload.
For some time I maintained two and three sites using Walker. Now I am maintaining over ten sites and their related project files. I use rsync exclusively, called from Python and Ruby scripts which handle mirrored and standardized directory structures across the sites; in other words, the sites and dev dirs all follow the same pattern. In this way, I am able to easily maintain HTML, data, and cgi-bin files and to back up and restore the web sites and the project development files.
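A single push of one such mirrored site boils down to something like this (the directory layout and host are hypothetical):

# mirror the local working copy of a site to the web host, deleting removed files
rsync -avz --delete --exclude='.git/' ./mysite/ user@webhost:/var/www/mysite/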
DirSync is a directory synchronizer that takes a source and destination directory as arguments and recursively ensures that the two directories are identical. It can be used to create incremental copies of large chunks of data. For example, if your file server's contents are in the directory /data, you can make a copy in a directory called /backup with the command "dirsync /data /backup." The first time you run it, all data will be copied. On subsequent runs, only the changed files are copied.
Posted by admin on October 4th, 2008
rsnapshot is a filesystem backup utility based on rsync. Using rsnapshot, it is possible to take snapshots of your filesystems at different points in time. Using hard links, rsnapshot creates the illusion of multiple full backups, while only taking up the space of one full backup plus differences. When coupled with ssh, it is possible to take snapshots of remote filesystems as well.

rsnapshot is written in Perl, and depends on rsync. OpenSSH, GNU cp, GNU du, and the BSD logger program are also recommended, but not required. rsnapshot is written with the lowest common denominator in mind. It only requires at minimum Perl 5.004 and rsync. As a result of this, it works on pretty much any UNIX-like system you care to throw at it.
rsnapshot can run almost out of the box with very few configuration changes, although advanced configurations can be done with a little more effort.
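As an illustration only (paths and retention counts are arbitrary; note that rsnapshot requires tabs, not spaces, between fields, and older versions spell retain as interval), a minimal /etc/rsnapshot.conf plus cron schedule could look like:

snapshot_root   /backup/snapshots/
retain  daily   7
retain  weekly  4
backup  /etc/   localhost/
backup  /home/  localhost/
backup  root@server:/var/www/   server/

# in root's crontab:
30 3 * * *      /usr/bin/rsnapshot daily
30 4 * * 1      /usr/bin/rsnapshot weekly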
About: GNU ddrescue is a data recovery tool. It copies data from one file or block device (hard disc, cdrom, etc) to another, trying hard to rescue data in case of read errors. GNU ddrescue does not truncate the output file if not asked to. So, every time you run it on the same output file, it tries to fill in the gaps. The basic operation of GNU ddrescue is fully automatic. That is, you don't have to wait for an error, stop the program, read the log, run it in reverse mode, etc. If you use the logfile feature of GNU ddrescue, the data is rescued very efficiently (only the needed blocks are read). Also you can interrupt the rescue at any time and resume it later at the same point.
Changes: The new option "--domain-logfile" has been added. This release is also available in lzip format. To download the lzip version, just replace ".bz2" with ".lz" in the tar.bz2 package name.
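A typical two-pass rescue using that logfile feature might look like this (the device and destination paths are hypothetical; the image must of course go to a different, healthy disk):

# first pass: copy everything that reads cleanly, recording progress in the logfile
ddrescue -n /dev/sda /mnt/usb/sda.img /mnt/usb/sda.log
# second pass: retry the bad areas with direct access and up to 3 retries
ddrescue -d -r3 /dev/sda /mnt/usb/sda.img /mnt/usb/sda.log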
2006-09-20
I hate making backups by hand. It costs a lot of time and usually I have far better things to do. Long ago (in the Windows 98 era) I made backups to CD only before I needed to reinstall the OS, which was about once every 18 months, and backed up my code projects maybe twice as often. A lot has changed since those dark times though. My single PC expanded into a network with multiple desktops and a server, I installed a mix of Debian and Ubuntu and ditched Windows, and I have a nice broadband link - just as my friends do. Finally a lazy git like me can set up a decent backup system that takes care of itself, leaving me time to do the "better" things (such as writing about it :-)
There are already quite a few tutorials on the internet explaining various ways to backup your Linux system using built-in commands and a script of some sorts, but I could not find one that suited me so I decided to write another one - one that takes care of backing up my entire network.
ns4 is a configuration management tool which allows the automated backup of just about anything, but it was designed for routers and switches. If you are able to log into it through a CLI, you can back it up. Commands are defined within a configuration file, and when they are executed, the output is sent to a series of FTP servers for archiving. As well as archiving configurations, it allows scripts to be run on nodes; this allows configurations to be applied en masse and allows conditional logic so different bits of scripts are run on different nodes.
HowTo: Install SystemRescueCD on a Dedicated Hard Disk Partition.
Introduction: Downloading and burning SystemRescueCD provides a bootable Gentoo-based distro on a CD. The installed applications focus on restoring disabled Linux/windows distros on the hard drives, or retrieving data if things go terribly wrong. You can operate direct from the booted CD. It's great. But SystemRescueCD also contains PartImage, the powerful free Linux alternative to Norton Ghost. So it's a too-easy tool for backing up single or multiple partitions, or whole drives.
OR
HowTo: No-effort Backup Solution for Partitions and Hard Drives

Here I recount how to install SystemRescueCD onto a dedicated partition. I include a script for creating backup images of your hard drive partitions. Once you go through this tutorial as a practical exercise, you'll have the knowledge and confidence to customise all manner of backup solutions very easily.
This tutorial is for the middle ground reader, too hard for new Linux users and too simple for Gurus. It's drawn from material in the On Line Manual at the System Rescue CD Site.
Summary of the steps for installing SystemRescueCD on a dedicated hard disk partition:
- Prepare a separate SystemRescue partition
- Download the SystemRescueCD ISO file
- Extract bootable image files from the ISO to the boot partition
- Edit Suse's GRUB configuration to facilitate booting the SystemRescue partition
- Prepare and place your scripts, if any
- Boot with Suse's loader --> select item SystemRescueCd
Step 1: Prepare a separate SystemRescue partition: I leave it to you to make the partition. You need about 160Mb plus any extra storage you might need. I use 400Mb. Note that this partition becomes the root of the cdrom after SysRescueCD boots from it, so its filesystem becomes read-only. This means you will select writeable workspaces on other partitions.
Suppose for illustration that you have prepared partition hda13 for the installation. Now make a directory to mount hda13 into openSUSE, e.g. /SysRescCD. You can mount hda13 with this command:
mount /dev/hda13 /SysRescCD

BUT I use the more convenient permanent mount created by placing this line into /etc/fstab:
/dev/hda13 /SysRescCD ext3 defaults 1 2

You can do that with Yast --> System --> Partitioner OR more simply by issuing this command in a console: kdesu kwrite /etc/fstab and then typing the line in.
Step 2: Download the SystemRescueCD ISO file: You can download the CD ISO for the SystemRescueCD by following this project download link. The ISO filename looks like this: systemrescuecd-x86-x.y.z.iso. Place it anywhere on your hard drives, at e.g. /path_to/systemrescuecd-x86-x.y.z.iso
Step 3: Extract bootable image files from the ISO and place them in boot partition: You can mount the ISO file for viewing the files on the CD. First create a folder to mount the ISO in, e.g. /iso. Then mount the ISO with this command in a root terminal:
mount -o loop -t iso9660 /path_to/systemrescuecd-x86-0.3.6.iso /iso

You'll find these three files of special interest on these paths inside the mount folder:
- /iso/sysrcd.dat
- /iso/isolinux/rescuecd
- /iso/isolinux/rescuecd.igz
Create the folder sysrcd in the root of hda13, using the mount point /SysRescCD to place it at /SysRescCD/sysrcd. Then copy the three files into folder sysrcd. The name sysrcd is immutable.
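Assuming the mount points used above (/iso for the ISO image and /SysRescCD for hda13), the copy boils down to:

# create the immutable sysrcd folder and copy the three files into it
mkdir /SysRescCD/sysrcd
cp /iso/sysrcd.dat /iso/isolinux/rescuecd /iso/isolinux/rescuecd.igz /SysRescCD/sysrcd/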
The partition hda13 is now configured with the bootable Gentoo distro and all that remains is to point Suse's bootloader at it.
Step 4: Edit Suse's GRUB configuration to facilitate booting the SystemRescue partition: You can open the Grub configuration file in a text editor with commands like this one for Kwrite:
kdesu kwrite /boot/grub/menu.lst

Edit/add these lines at the bottom of the file, one blank line below the last entry:
title SystemRescueCd
root (hd0,12)
kernel /sysrcd/rescuecd root=/dev/ram0 init=/linuxrc looptype=squashfs loop=/sysrcd/sysrcd.dat splash=silent nosound subdir=sysrcd cdroot=/dev/hda13 setkmap=us vga=0x31a
initrd /sysrcd/rescuecd.igz
boot

Remember to adapt my (hd0,12), which is for my hda13, to your situation. Also, note that the sequence beginning "kernel" and ending "0x31a" is all one line. I've included three parameters at the end: cdroot=/dev/hda13 setkmap=us vga=0x31a. These set the distro up to have hda13 at the root of the cd (on /mnt/cdrom), to use the US keyboard, and to use a vga screen mode that suits me. If you wanted to boot into the Window Manager Desktop Environment to access GUI tools, you would use this line instead:
kernel /sysrcd/rescuecd root=/dev/ram0 init=/linuxrc looptype=squashfs loop=/sysrcd/sysrcd.dat splash=silent nosound subdir=sysrcd cdroot=/dev/hda13 setkmap=us vga=0 dostartx

A list of boot options can be seen on this link at the SystemRescueCD site.
Step 5: Prepare and place your scripts, if any: Predefined sites [like the floppy disk, the root of the CDROM, the root of the installation partition] are searched straight after booting for scripts, which, if found, are executed. You can have one or many scripts. See the SystemRescueCD site for full details. I'll deal with only one location here: the root of the installation partition. It's really simple. Just create a script called autorun and lodge it in the root of the installation partition, hda13. It will run just after the system boots to a console.
Step 6: Boot with Suse's loader: Reboot and select item SystemRescueCd. The root of partition hda13 automounts at location /mnt/cdrom in the booted-up virtual filesystem. All files and scripts placed on the partition are thus available at /mnt/cdrom.
Backup Script: I constantly change the filesystems on my primary hard drive and it's hard to prevent damage to them. So I back them up regularly. This takes a long time. I use a script called autorun in the root partition of hda13 and simply boot to SystemRescueCD on hda13 and walk away to let the job proceed. Here's my scenario and script. You could easily modify the script for your scenario.
Scenario: I have a Suse root partition at hda5 and a /home partition at hda6. These have to be backed up when they're not being used, i.e. from within another operating system. The Gentoo installation on the SystemRescueCD partition contains a script, "autorun", which employs "partimage", the Linux free version of "Ghost". It is perfect for the task. I have prepared a folder called "partimage" on partition hdb2 on IDE2 drive. The script mounts hdb2 into Gentoo/SystemRescueCD's filesystem, generates a date-coded folder on hdb2 and copies image files across from the Suse root and home partitions.
Script
#!/bin/sh
# mount the target directory
mkdir /mnt/hdb2
mount /dev/hdb2 /mnt/hdb2
# assign today's date to xx, example 071130 on 30Nov2007
xx=`date +%y%m%d`
# make a directory in the folder "partimage" on hdb2 and name it for the date
mkdir /mnt/hdb2/partimage/$xx
cd /mnt/hdb2/partimage/$xx
# write start time to a logfile
zz=$xx'logfile'
echo 'start at: '`date`>$zz
# make an image of suse102_root options: -z1=gzip -d=no description save=save_image -b=batch(not gui) -f3=quit when finished
partimage -z1 -d save -b -f3 /dev/hda5 /mnt/hdb2/partimage/$xx/hda5.partimg.gz
# make an image of /home options: -z1=gzip -d=no description save=save_image -b=batch(not gui) -f3=quit when finished
partimage -z1 -d save -b -f3 /dev/hda6 /mnt/hdb2/partimage/$xx/hda6.partimg.gz
# write contents of file autorun to a file in the target directory
cat /mnt/cdrom/autorun >>script.used
# write end time to the logfile
echo 'end at: '`date`>>$zz
# write the contents of the backup directory into the logfile
ls -l>>$zz
reboot
These are the things to customise: Change hdb2 to match your target storage partition. Change hda5 to match your root partition. Change hda6 to match your home partition. Everything else should match your system.
This is the key line:
partimage -z1 -d save -b -f3 /dev/hda5 /mnt/hdb2/partimage/$xx/hda5.partimg.gz

If you have six partitions, duplicate this line six times, replacing hda5 with your correct partition designations and hdb2 with your target storage/backup partition.
That's all there is folks, enjoy.
FauBackup

Written in C and as such maintainable only by C programmers. This program uses a filesystem on a hard drive for incremental and full backups. All backups can easily be accessed by standard filesystem tools (ls, find, grep, cp, ...).

Re: Using Perl to help backup Linux server
It's written in perl, and available in Debian for sure. Faubackup is really slick in that 1) when it does a new backup, it creates links for unchanged files, saving mucho disk space and 2) it comes pre-configured (on Debian at least) to keep something like 2 yearly images, 12 monthlies, 4 weeklies, and 14 dailies. It's designed to go disk-->disk, but you can then take those backup dirs and put them on media of your choice. Very slick and small little utility -- one of my favorites.
...
CD and DVD Archiving: Quick Reference Guide for Care and Handling (NIST): http://www.itl.nist.gov/div895/carefordisc/disccare.html
The Last but not Least Technology is dominated by two types of people: those who understand what they do not manage and those who manage what they do not understand ~Archibald Putt. Ph.D
Copyright © 1996-2021 by Softpanorama Society. www.softpanorama.org was initially created as a service to the (now defunct) UN Sustainable Development Networking Programme (SDNP) without any remuneration. This document is an industrial compilation designed and created exclusively for educational use and is distributed under the Softpanorama Content License. Original materials copyright belong to respective owners. Quotes are made for educational purposes only in compliance with the fair use doctrine.
FAIR USE NOTICE This site contains copyrighted material the use of which has not always been specifically authorized by the copyright owner. We are making such material available to advance understanding of computer science, IT technology, economic, scientific, and social issues. We believe this constitutes a 'fair use' of any such copyrighted material as provided by section 107 of the US Copyright Law according to which such material can be distributed without profit exclusively for research and educational purposes.
This is a Spartan WHYFF (We Help You For Free) site written by people for whom English is not a native language. Grammar and spelling errors should be expected. The site contains some broken links as it develops like a living tree...
Disclaimer:
The statements, views and opinions presented on this web page are those of the author (or referenced source) and are not endorsed by, nor do they necessarily reflect, the opinions of the Softpanorama society. We do not warrant the correctness of the information provided or its fitness for any purpose. The site uses AdSense so you need to be aware of Google's privacy policy. If you do not want to be tracked by Google please disable Javascript for this site. This site is perfectly usable without Javascript.
Last modified: March, 29, 2020