Softpanorama

May the source be with you, but remember the KISS principle ;-)
(slightly skeptical) Educational society promoting "Back to basics" movement against IT overcomplexity and  bastardization of classic Unix

Mirroring and Synchronization Tools


Mirroring is a standard technique used to synchronize website trees from staging to production servers and to back up key areas of the filesystem. It can be scheduled via cron and/or invoked via a CGI script. Here is how Wikipedia defines the term:

In computing, a mirror is an exact copy of a data set. On the Internet, a mirror site is an exact copy of another Internet site. Mirror sites are most commonly used to provide multiple sources of the same information, and are of particular value as a way of providing reliable access to large downloads. Mirroring is a type of file synchronization.

A live mirror is automatically updated as soon as the original is changed.

Mirroring can be done via several protocols:

For pure Windows users, Pavuk is a good choice due to its user-friendly interface.

NEWS CONTENTS

Old News ;-)

[Jun 23, 2009] LFTP - sophisticated file transfer program

LFTP is scriptable and is generally very well designed. It supports a wide variety of protocols besides FTP and supports mirroring. It is available on most Linux distributions and Cygwin, and can be installed from precompiled packages on Solaris and other Unixes.
LFTP is a sophisticated file transfer program with a command-line interface. It supports the FTP, HTTP, FISH, SFTP, HTTPS and FTPS protocols. The GNU Readline library is used for input.

Every operation in lftp is reliable, that is any non-fatal error is handled and the operation is retried automatically. So if downloading breaks, it will be restarted from the point automatically. Even if ftp server does not support REST command, lftp will try to retrieve the file from the very beginning until the file is transferred completely. This is useful for dynamic-ip machines which change their IP addresses quite often, and for sites with very bad internet connectivity.
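The retry-and-resume behaviour described above can be sketched in Python. This is only an illustration under stated assumptions, not lftp's actual implementation: `fetch_from` is a hypothetical callable standing in for a transfer that can start at a byte offset (as the FTP REST command allows), `make_flaky_source` is a simulated server invented for the demo, and `max_attempts` is an assumed limit.

```python
from typing import Callable, Iterable


def download_with_resume(
    fetch_from: Callable[[int], Iterable[bytes]],
    total_size: int,
    max_attempts: int = 10,
) -> bytes:
    """Retry a transfer until total_size bytes arrive, resuming at the last offset.

    fetch_from(offset) is a hypothetical stand-in for a network transfer that
    yields chunks starting at byte `offset` and may raise ConnectionError.
    """
    received = bytearray()
    attempts = 0
    while len(received) < total_size:
        attempts += 1
        if attempts > max_attempts:
            raise ConnectionError("giving up after %d attempts" % max_attempts)
        try:
            for chunk in fetch_from(len(received)):
                received.extend(chunk)
        except ConnectionError:
            # Non-fatal error: keep what has arrived and retry from that offset.
            continue
    return bytes(received)


def make_flaky_source(data: bytes, fail_after: int, failures: int):
    """Simulated server (demo assumption): drops the link `failures` times."""
    state = {"left": failures}

    def fetch_from(offset: int):
        sent = 0
        for i in range(offset, len(data), 4):
            if state["left"] > 0 and sent >= fail_after:
                state["left"] -= 1
                raise ConnectionError("simulated hangup")
            chunk = data[i:i + 4]
            sent += len(chunk)
            yield chunk

    return fetch_from
```

If the server could not honour an offset, the same loop would work with `received` cleared before each attempt, which corresponds to retrieving the file from the very beginning.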

If you exit lftp when some jobs are not finished yet, lftp will move itself to nohup mode in background. The same happens when you have a real modem hangup or when you close an xterm.

lftp has shell-like command syntax allowing you to launch several commands in parallel in background (&). It is also possible to group commands within () and execute them in background. All background jobs are executed in the same single process. You can bring a foreground job to background with ^Z (c-z) and back with command `wait' (or `fg' which is alias to `wait'). To list running jobs, use command `jobs'. Some commands allow redirecting their output (cat, ls, ...) to file or via pipe to external command. Commands can be executed conditionally based on termination status of previous command (&&, ||).

Examples:

	lftp> cat file | gzip > file.gz
	lftp> get file &
	lftp> (cd /path && get file) &
The first command retrieves file from ftp server and passes its contents to gzip which in turn stores compressed data to file.gz. Other commands show how to start commands or command groups in background.

lftp has builtin mirror which can download or update a whole directory tree. There is also reverse mirror (mirror -R) which uploads or updates a directory tree on server.
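The idea behind mirroring a directory tree can be sketched for two local directories. This is a minimal illustration and not how lftp implements `mirror`: the function name `mirror_tree`, the "size differs or source is newer" rule, and the `delete` flag are assumptions made for the example.

```python
import os
import shutil


def mirror_tree(src: str, dst: str, delete: bool = False) -> list:
    """Make dst a copy of src, copying only files that are missing or changed.

    A file counts as changed when its size differs or the source is newer.
    With delete=True, files under dst that no longer exist in src are removed.
    Returns the sorted relative paths that were copied.
    """
    copied = []
    for root, _dirs, files in os.walk(src):
        rel = os.path.relpath(root, src)
        target_dir = os.path.normpath(os.path.join(dst, rel))
        os.makedirs(target_dir, exist_ok=True)
        for name in files:
            s = os.path.join(root, name)
            d = os.path.join(target_dir, name)
            s_stat = os.stat(s)
            if os.path.exists(d):
                d_stat = os.stat(d)
                if (s_stat.st_size == d_stat.st_size
                        and int(s_stat.st_mtime) <= int(d_stat.st_mtime)):
                    continue  # already up to date
            shutil.copy2(s, d)  # copy2 also copies the modification time
            copied.append(os.path.normpath(os.path.join(rel, name)))
    if delete:
        for root, _dirs, files in os.walk(dst):
            rel = os.path.relpath(root, dst)
            for name in files:
                if not os.path.exists(os.path.join(src, rel, name)):
                    os.remove(os.path.join(root, name))
    return sorted(copied)
```

In this sketch a reverse mirror is simply the same call with the two arguments swapped; over FTP the walk and the copy would go through the protocol instead of the local filesystem.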

There is command `at' to launch a job at specified time in current context, command `queue' to queue commands for sequential execution for current server, and much more.

LFTP supports IPv6 for both FTP and HTTP protocols. For FTP protocol it uses method described in RFC2428.

Other low level stuff supported: ftp proxy, http proxy, ftp over http, opie/skey, fxp transfers, socks.

LFTP supports secure versions of the protocols FTP and HTTP: FTPS (explicit and implicit) and HTTPS. LFTP needs to be linked with an SSL library to support them. GNU TLS and OpenSSL are both supported as SSL backends.

If lftp was compiled with OpenSSL library, then it includes software developed by the OpenSSL Project for use in the OpenSSL Toolkit. (http://www.openssl.org/)

See FEATURES for more detailed list of features.

See man page lftp(1) for more details.

[Mar 25, 2008] GNU Wget 1.11.1 by Micah J. Cowan

About: GNU Wget is a utility for non-interactive download of files from the Web. It supports HTTP and FTP protocols, as well as retrieval through HTTP proxies. It can follow HTML links, download many pages, and convert the links for local viewing. It can also mirror FTP hierarchies or only those files that have changed. Wget has been designed for robustness over slow network connections; if a download fails due to a network problem, it will keep retrying until the whole file has been retrieved.

Changes: Interrupted downloads no longer result in renaming of the file. The progress bar now displays correctly in non-English locales (and a related assertion failure was fixed). Wget no longer issues a GET request over HTTP for files it should know it's not going to download.

[Jan 9, 2006] Pavuk - Home

Pavuk 0.9.34 Released through Sourceforge.

Alexander Bednyakov - Mass Downloader

Mass Downloader is another member of the Web Downloader family of programs. Concept and initial development were carried out by Oleg Chernavin. Alexander Bednyakov is now responsible for Mass Downloader's development.

MetaProducts Mass Downloader is a Windows 9x/NT/2000 program that allows you to download files from the Web and FTP sites at the maximum available speed. Multiple downloading channels technology significantly decreases the time necessary to download files.
Mass Downloader also allows you to browse Zip archives before loading them and to choose only the desired files to download. It has an excellent Internet Explorer-like user interface.

freshmeat.net Project details for auto_ftp.pl

auto_ftp.pl is an FTP client daemon that watches shared folders and transfers anything put into a folder to the remote FTP site defined for that folder. It also features recursive transfers of all subdirectories. It will automatically transfer files in ASCII or binary mode.
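The folder-watching part of such a daemon can be approximated by polling. The sketch below is not taken from auto_ftp.pl (which is written in Perl); `find_new_files` is a hypothetical helper that reports files not seen on earlier scans, including those in subdirectories.

```python
import os


def find_new_files(folder: str, seen: set) -> list:
    """Return files under folder (recursively) not reported before; update seen."""
    fresh = []
    for root, _dirs, files in os.walk(folder):
        for name in files:
            path = os.path.join(root, name)
            if path not in seen:
                seen.add(path)
                fresh.append(path)
    return sorted(fresh)
```

A daemon would call this in a loop with a sleep between scans and hand each returned path to its upload routine.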

freshmeat.net Project details for webcrawl

webcrawl is a program which downloads entire web sites, following links in HTML documents.
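Following links in HTML documents comes down to extracting URLs from each page and resolving them against the page address. The sketch below uses only the Python standard library and is not webcrawl's code; the class `LinkCollector`, the function `same_site_links`, and the choice to look only at `a href` and `img src` are assumptions made for illustration.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse


class LinkCollector(HTMLParser):
    """Collect absolute URLs from <a href> and <img src> attributes."""

    def __init__(self, base_url: str):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            wanted = (tag == "a" and name == "href") or (tag == "img" and name == "src")
            if wanted and value:
                # Resolve relative references against the page address.
                self.links.append(urljoin(self.base_url, value))


def same_site_links(html: str, base_url: str) -> list:
    """Return de-duplicated links that stay on the host of base_url."""
    collector = LinkCollector(base_url)
    collector.feed(html)
    host = urlparse(base_url).netloc
    result = []
    for link in collector.links:
        if urlparse(link).netloc == host and link not in result:
            result.append(link)
    return result
```

A crawler would fetch each returned URL, save it, and repeat the extraction on every HTML page it has not visited yet.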

cURL and libcurl work on Windows, Linux, UNIX, Amiga, and OS/2.

w3mir - all purpose HTTP-copying and mirroring tool -- Perl based

The author home page w3mir homepage

Linux.DaveCentral: ECLipt Mirroring Tool

Free download manager Net Vampire -- seems unable to download a whole site; you need to create a list of files manually. After you do this, NV is a very powerful tool for retrieval. It can be scheduled, etc.

WebCopier v2.1a -- download entire WEB sites (Windows 98)

WebCopier is a first-rate, speedy offline Web browser that makes it a cinch to download entire or partial Web sites then view them offline with its integrated browser. This handsome program is very easy to use: Create a new project, select from a range of download options, and click the Start Download button. WebCopier downloads multiple files simultaneously and creates an easily navigable tree of the Web site. You can control the amount of information downloaded by overall size, number of pages, page size, and time. You can also limit the download to a single host or let it roam to other domains. WebCopier includes file filters and supports JavaScript, Java Classes, and Macromedia Flash files. It's impressively thorough in tracing URLs, parsing JavaScript to locate links often missed by other offline browsers. This fine program is free but includes Web-based advertising banners. It also lets you audition the Premium options for 14 days, an advanced edition that can be purchased for $24.95.
Reviewed on Feb 22 2001.

Folder Synchronization Examples

File Dog - Automated Internet File Transfers $39

Now, YOU can stop uploading and downloading files! File Dog is the fully automated file transfer program that does it for you.


Recommended Links

Sites


Random Findings


Articles

PC World Online

The rsync algorithm

This report presents an algorithm for updating a file on one machine to be identical to a file on another machine. We assume that the two machines are connected by a low-bandwidth high-latency bi-directional communications link. The algorithm identifies parts of the source file which are identical to some part of the destination file, and only sends those parts which cannot be matched in this way. Effectively, the algorithm computes a set of differences without having both files on the same machine. The algorithm works best when the files are similar, but will also function correctly and reasonably efficiently when the files are quite different.
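The core trick that makes this matching cheap is a weak checksum that can be "rolled" one byte forward in constant time, backed by a stronger hash to confirm candidate matches. The sketch below is a simplified illustration, not the rsync source: the function names are invented, SHA-256 is swapped in as the confirming hash, and every matching offset is reported rather than skipping ahead after a match.

```python
import hashlib

M = 1 << 16  # modulus for both halves of the weak checksum


def weak_checksum(block: bytes):
    """Return (a, b): a is the byte sum, b the position-weighted byte sum."""
    n = len(block)
    a = sum(block) % M
    b = sum((n - i) * x for i, x in enumerate(block)) % M
    return a, b


def roll(a: int, b: int, out_byte: int, in_byte: int, block_len: int):
    """Slide the window one byte to the right in O(1)."""
    a_new = (a - out_byte + in_byte) % M
    b_new = (b - block_len * out_byte + a_new) % M
    return a_new, b_new


def find_matches(old: bytes, new: bytes, block_len: int) -> list:
    """Return (offset_in_new, block_index_in_old) for blocks of old found in new."""
    table = {}
    for start in range(0, len(old) - block_len + 1, block_len):
        block = old[start:start + block_len]
        a, b = weak_checksum(block)
        strong = hashlib.sha256(block).digest()  # stand-in strong hash (assumption)
        table.setdefault(a + (b << 16), []).append((start // block_len, strong))
    matches = []
    if len(new) < block_len:
        return matches
    a, b = weak_checksum(new[:block_len])
    pos = 0
    while True:
        for index, strong in table.get(a + (b << 16), ()):
            # The weak checksum only nominates candidates; the strong hash confirms.
            if hashlib.sha256(new[pos:pos + block_len]).digest() == strong:
                matches.append((pos, index))
                break
        if pos + block_len >= len(new):
            break
        a, b = roll(a, b, new[pos], new[pos + block_len], block_len)
        pos += 1
    return matches
```

Only the parts of `new` not covered by a match would need to be sent as literal data; matched regions can be described by a block index alone.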

WebGrab 1.0 has been released. WebGrab is a web site mirroring utility. It can copy complete web sites including pictures, pages, software and whatever else you want to copy. WebGrab was designed to be used from batch. WebGrab can be downloaded from http://slo-lijn.slo.nl/~gerhard/software

mirror - an FTP mirroring program in Perl

The mirror package, written in Perl. It's a must, and AFAIK, the one used by most FTP site administrators around the world. It is easy to install and does a really great job (trouble-free). The latest version of mirror is available from:

- src.doc.ic.ac.uk [146.169.2.1], directory: computing/archiving/mirror (shortcut packages/mirror)
- ftp.th-darmstadt.de [130.83.55.75], directory: pub/networking/mirror
- ftp.sun.ac.za [146.232.213.2], directory: pub/packages/mirror
- archive.orst.edu [128.193.4.2], directory: pub/mirrors/src.doc.ic.ac.uk/computing/archiving/mirror (shortcut pub/packages/mirror)

See also Mirror.

w3mir version 1.05 (Perl)

w3mir homepage (see also [fm]w3mir)

w3mir is an all-purpose HTTP copying and mirroring tool. The main focus of w3mir is to create and maintain a browsable copy of one, or several, remote WWW site(s). Used to the max, w3mir can retrieve the contents of several related sites and leave the mirror browsable via a local web server, or from a filesystem, such as directly from a CD-ROM.

w3mir's goal is to be able to make useful mirrors of any reasonable WWW site. It specifically preserves link integrity within the mirrored documents as well as the integrity of links outside the mirror, following redirects as needed, if you want it to. w3mir has a powerful "multi scope" mechanism enabling the user to make mirrors of several related sites and have links between them refer to the mirrored documents rather than the original site. w3mir has several features directed at getting mirrors ready for CD-ROM burning and at handling some not too often seen problems when mirroring. w3mir supports HTML4, and has partial support for CSS, Java, ActiveX and Adobe Acrobat (PDF) files. It also works on Win32 machines.


Language: Perl Platform: Unix

Download Complete Source Code, 0.020M bytes

Click file name to view online:
Artistic, 6243 bytes
Changes, 872 bytes
example.cfg, 3968 bytes
htmlop.pm, 20491 bytes
INSTALL, 2536 bytes
INSTALL.w32, 1590 bytes
Makefile.PL, 2322 bytes
MANIFEST, 175 bytes
multiscope.cfg, 1633 bytes
README, 5821 bytes
w3http.pm, 24716 bytes
w3mfix.PL, 28874 bytes
w3mir-HOWTO.html, 33575 bytes
w3mir.PL, 102450 bytes
w3pdfuri.pm, 1383 bytes

Wget

GNU Web and FTP mirroring program (Rick Niles).

Harvest version 1.5


Harvest is an integrated set of tools to gather, extract, organize, search, cache, and replicate relevant information across the Internet. With modest effort users can tailor Harvest to digest information in many different formats from many different machines, and offer custom search services on the web.

Language: C/C++ Platform: Unix

Blackwidow

BlackWidow. Offline browser, download web site, site ripper, FTP, site mapping tool, site mirroring tool and a site scanner, off-line browser.

BlackWidow is a web site scanner, a site mapping tool, a site ripper, a site mirroring tool, and an offline browser program. Use it to scan a site and create a complete profile of the site's structure, files, E-mail addresses, external links and even link errors. Then use it to download the site, with its structure and files intact, to use as a site mirror or to be converted by BlackWidow into a locally linked site for offline browsing and long-term reference. Or use it to scan for and download any selection of files: from 'JPG' to 'CGI' to 'HTM' to MIME types, from small to large files, in part of a site or in a group of sites. These pre-scan filtering options can save you countless on-line hours of searching and sorting. BlackWidow will scan HTTP sites, SSL sites (HTTPS) and FTP sites.

You can download BlackWidow for free-trial or purchase it for only $39.95.

RxMirror

Native mirroring software for OS/2

RxMirror 1.2 is the mirroring software for OS/2. It is REXX-based and takes advantage of HPFS (long filenames) and the RxFTP interface. This program can serve two purposes:
- maintaining a mirror site;
- copying a remote directory tree via ftp.

Requirements:
- OS/2 2.x, 3.x with TCP/IP protocol stack (TCP/IP 2.0, IAK, Warp CONNECT);
- REXX installed;
- HPFS-formatted disk.

New in this version:
- closer to conventional mirror packages: rxmirror reproduces timestamps and lengths exactly, no more stupid "bigger file is better";
- total size of files in the directory is reported before downloading;
- option "ignoretimestamp" is now in effect;
- wildcards can be used in skiplists;
- "*" can be used as password (you'll be prompted for an actual password);
- Russian version (can be obtained from ftp.sai.msu.su; see below).

Where to get it:
- Mar 07 it was uploaded into incoming directories of ftp-os2.cdrom.com, hobbes.nmsu.edu and ftp.leo.org. Filename is "rxmir12.zip".
- home of RxMirror:
  ftp: ftp.sai.msu.su:/pub/os2/network/tcpip/ftp/rxmir12.zip
  ftp: ftp.sai.msu.su:/pub/os2/network/tcpip/ftp/rxmir12r.zip
  WWW: http://crydee.sai.msu.su/public/software/rxmirror/rxmir12.zip
  WWW: http://crydee.sai.msu.su/public/software/rxmirror/rxmir12r.zip

Sergey Ayukov
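The two decisions mentioned in the feature list, whether a file needs to be fetched and whether it is excluded by a skiplist, can be sketched as follows. This is not RxMirror's REXX code: the function names, the `(size, mtime)` tuple layout, and the exact meaning given to the "ignoretimestamp" option here are assumptions made for illustration.

```python
from fnmatch import fnmatchcase


def needs_download(remote, local, ignore_timestamp: bool = False) -> bool:
    """Decide whether to fetch a file.

    remote and local are (size, mtime) tuples; local is None when the file
    does not exist locally. Any size difference triggers a download, so a
    smaller remote file is fetched too (no "bigger file is better" rule).
    """
    if local is None:
        return True
    if remote[0] != local[0]:
        return True
    return (not ignore_timestamp) and remote[1] != local[1]


def is_skipped(name: str, skiplist) -> bool:
    """Return True when name matches any shell-style wildcard in skiplist."""
    return any(fnmatchcase(name, pattern) for pattern in skiplist)
```

For example, with a skiplist of `["*.bak", "tmp*"]` a file named `core.bak` would be left out of the mirror while `readme.txt` would be considered for download.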

Programmers Site Updater

XFiles




Copyright © 1996-2021 by Softpanorama Society. www.softpanorama.org was initially created as a service to the (now defunct) UN Sustainable Development Networking Programme (SDNP) without any remuneration. This document is an industrial compilation designed and created exclusively for educational use and is distributed under the Softpanorama Content License. Original materials copyright belong to respective owners. Quotes are made for educational purposes only in compliance with the fair use doctrine.

Last modified: March 12, 2019