Recovery of lost files using DD


Deleted files in ext2, ext3, UFS and other filesystems are usually still present on the disk after deletion, because their data blocks are not immediately reclaimed. FAT32 partitions also sometimes suffer from corrupted data structures and lose files (some get recovered into a FOUND.000 or similar folder, but often the most valuable ones are not, demonstrating Murphy's law in action :-(.

That means they can often be extracted from a dd image using grep and similar tools, which is frequently simpler than relying on tools that scan the disk directly. If the disk is close to going south you can still use DD Rescue to create an image first. There is great power in the concept "everything is a file", and this is one non-trivial demonstration of that power. Of course, scanning the image with grep is just the simplest approach; special tools called file carvers can investigate the disk image to locate known headers and footers.

The most common case is recovery of a text file. In that case we can create a dump of the text strings on the disk and search it with grep:

dd if=/dev/sdb2 | strings > recover.txt

and then search it with grep, using the -A and -B options to capture enough context around the match:

`-A num'
`--after-context=num'
Print num lines of trailing context after matching lines.

`-B num'
`--before-context=num'
Print num lines of leading context before matching lines.

The context should be large enough to capture the whole file, even at the cost of some false positives. If you know that the file contains a particular Unix command, such commands usually make good search patterns:

grep -A 1000 -B 50 "cpush sourcefile" recover.txt | tee cpush_page.txt
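
If you prefer to work on the raw image rather than the strings dump, grep can also report the byte offset of a match, and dd can then cut out a window around it. A rough sketch (the pattern, the offset and the sizes are placeholders, not values from the text above):

grep -abo "cpush sourcefile" sdb2.img        # -a treats the binary image as text, -b prints the byte offset of each match
dd if=sdb2.img of=carved.bin bs=1 skip=123400000 count=1000000   # extract ~1MB starting a little before the reported offset

The carved region can then be cleaned up in an editor or with sed.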

The first requirement for this operation is to remount the affected partition read-only -- any additional writes to the partition can be fatal to your data. All recovery operations should be performed on an image, which can be mounted via the loopback interface on another computer.
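
A minimal sequence along these lines might look as follows (a sketch only; device names and paths are placeholders for your own setup):

mount -o remount,ro /dev/sdb2                          # stop any further writes to the damaged partition
dd if=/dev/sdb2 of=/srv/recovery/sdb2.img              # or use DD Rescue if the disk is failing
mount -o loop,ro /srv/recovery/sdb2.img /mnt/image     # inspect the image on another machine (works only if the filesystem is still mountable)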

Often a simple grep is sufficient to find the file. After that it can be extracted with a regular editor or with sed. You can also get fancier and use special programs designed for exactly this role, called file carvers. A good description of file carvers is available at Recovering Deleted Files - Linux Magazine Online. Here is a relevant quote:

IT experts and investigators have many reasons for reconstructing deleted files. Whether an intruder has deleted a log to conceal an attack or a user has destroyed a digital photo collection with an accidental rm -rf, you might someday face the need to recover deleted data. In the past, recovery experts could easily retrieve a lost file because an earlier generation of filesystems simply deleted the directory entry. The meta information that described the physical location of the data on the disk was preserved, and tools like The Coroner's Toolkit (TCT [1]) and The Sleuth Kit (TSK [2]) could uncover the information necessary for restoring the file.

Today, many filesystems delete the full set of meta information, leaving only the data blocks. Putting these pieces together correctly is called file carving – forensic experts carve the raw data off the disk and reconstruct the files from it. The more fragmented the filesystem, the harder this task becomes.

Many open source tools automate the carving process: The list is headed by Foremost [3] and its derivative Scalpel [4], but other tools include PhotoRec [5] and FTimes [6]. PhotoRec does not support generic carving for any file type, and FTimes is so hard to use it is not worthwhile for most users.

Foremost and Scalpel are not interested in the underlying filesystem. They simply expect the data blocks of the files to reside sequentially in the image under investigation. The tools will find images in dd dumps, RAM dumps, or swap files. Carving will help to identify and reconstruct files on corrupt filesystems, in slack space, or even after installation of a new operating system, as long as the required data blocks still exist.

Of course, none of these tools can perform miracles, and they are not designed to retrieve data from physically damaged hard disks. Also, the carving process cannot access data blocks that have been overwritten.

Because carving tools do not rely on the filesystem, they need other sources of information to discover where a file starts and ends. Fortunately, many file types have known structures. The header and footer are often all that is needed to identify the file type and location. The Linux file command also uses header and footer information to identify file types.

File carvers investigate the whole hard disk, or disk image, to locate known headers and footers. They then carve out the blocks between the header and footer and store the data as a new file. Some file types do not possess unique footers; carvers will then at least guess where the file ends based on where the next header starts. Of course, any amount of unidentified data could reside between the end of the file and the next header.

To avoid collecting unnecessary junk data, carving programs allow users to set maximum file sizes. Unfortunately, headers and footers are often short, which leads to numerous false positives.

Image formats are an exception. For example, each JPEG file starts with a byte sequence of 0xFFD8, typically followed by 0xFFE00010. File carvers are thus very good at identifying JPEG images. However, if some blocks have been overwritten, or if the file is fragmented, the tools will restore only a part of the file at best.

  1. The Coroner's Toolkit: http://www.porcupine.org/forensics/tct.html
  2. The Sleuth Kit: http://www.sleuthkit.org
  3. Foremost: http://foremost.sf.net
  4. Scalpel: http://www.digitalforensicssolutions.com/Scalpel/
  5. PhotoRec: http://www.cgsecurity.org/wiki/PhotoRec
  6. FTimes: http://ftimes.sourceforge.net/FTimes/
  7. Foremost on the Forensics Wiki: http://www.forensicswiki.org/wiki/Foremost
  8. OCFA, The carve path zero-storage library and filesystem: http://ocfa.sourceforge.net/libcarvpath/
  9. DFRWS carving challenge: http://www.dfrws.org/2006/challenge/
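
For completeness, a typical carving run with Foremost or Scalpel looks roughly like this (a sketch; the image name and output directories are placeholders, and Scalpel first needs the relevant file types uncommented in its scalpel.conf):

foremost -t jpg,doc,pdf -i sdb2.img -o /srv/recovery/foremost-out
scalpel -c /etc/scalpel/scalpel.conf -o /srv/recovery/scalpel-out sdb2.img

Both tools write the carved files plus an audit log into the output directory, which must be empty when the tool starts.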


Old News ;-)

[Jan 05, 2012] Scalpel: A Frugal, High Performance File Carver

digitalforensicssolutions

Scalpel is a fast file carver that reads a database of header and footer definitions and extracts matching files or data fragments from a set of image files or raw device files. Scalpel is filesystem-independent and will carve files from FATx, NTFS, ext2/3, HFS+, or raw partitions. It is useful for both digital forensics investigation and file recovery.

Notes on Platforms

Linux

The preferred platform for using Scalpel is Linux.

Windows

Scalpel will also compile under Windows (32 or 64-bit) using mingw. If you'd like to try Scalpel on Windows without the bother of compiling it yourself, an executable and appropriate libraries are included in the distribution--just untar and go. Note that under Windows, the pthreads DLL must be present in the same directory as the Scalpel executable. Carving physical and logical devices directly under Windows (e.g., using \\.\physicaldrive0 as a target) is not supported in the current release.

Mac OS X

As of v1.53, Scalpel is supported on Mac OS X.

All platforms

As of v1.54, Scalpel supports carving files larger than 4GB on all platforms.

As of v1.60, Scalpel supports preview carving and other new carving modes. See the distribution for details.

As of v2.0, Scalpel supports regular expressions for headers and footers, minimum carve sizes, multithreading and asynchronous I/O, and beta-level support for GPU-accelerated file carving.

[Jun 08, 2011] Thomas Rude - DD and Computer Forensics

In the most basic sense, the DD command is used for copying in the UNIX environment. For simplicity, we will consider 'copy' to mean 'to duplicate exactly.' The DD command is used in the Forensics Arena to perform a physical backup of the evidence. DD can be thought of as a tool in the sense that using it is a means of building an evidence file. There are other tools which can be used when making a physical backup, such as EnCase and SafeBack. However, the intent of this paper is to give some insight on what DD is and how to use it.

What is special about the DD copy command is that it has special flags available to it that make it suitable for copying block-oriented devices, such as tapes. DD is capable of addressing these block devices sequentially. We will discuss this later. But, for now, it is good to note that this is why DD can be a powerful tool when acquiring and copying tapes for cases.

I do not want to describe each and every flag option available to DD ('man DD' can show you them). I do, however, want to detail some key flags that are very useful when copying evidence. Before we can get into these, it is imperative to understand the basic syntax of the DD command:

dd if=<source> of=<destination>
where:
if = infile, or evidence you are copying (a hard disk, tape, etc.)
source = source of evidence
of = outfile, or copy of evidence
destination = where you want to put the copy

For example, if our acquired evidence is /dev/hda, the following would produce an exact copy with the name of 'case10img1':
dd if=/dev/hda of=/dev/case10img1

Now that we see the basic use of DD we can look at the options which make it very suitable for copying in the UNIX environment.
As mentioned earlier, DD is very useful when copying and/or restoring block-oriented devices, such as tapes. (NOTE: DD is an excellent tool to use when copying hard disks as well. I am stressing the usage with regards to tapes because it has proved quite useful in reducing the amount of time required to copy tapes of large sizes.) There are a few options available when copying tapes (or any device). Of the options available, I have found some more useful than others. These are shown below:
ibs = input block size
obs = output block size
count = number of blocks to copy
skip = number of blocks to skip at start of input
seek = number of blocks to skip at start of output
conv = conversion

Let's say we have a 2GB hard disk seized as evidence. We will use DD to make a complete physical backup of the hard disk:
dd if=/dev/hda of=/dev/case5img1

Now let's say we have an unknown tape to examine. If we are unsure of the block size used on the tape, we could use the ibs/obs flags to find the correct size. Finding the correct size speeds up the copying process - sometimes dramatically!
dd if=/dev/st0 ibs=128 of=/dev/case10img1 obs=1 count=1
The above usage will attempt to take 1 block with size of 128 from 'st0' and create 'case10img1' output with a block size of 1. The 'count' flag is used so that only 1 block is read. We do this because we want to limit DD to just the 1 block. If we did not set a count size DD would continue on and a whole lot of time would be wasted! What this example attempts to show is that by setting the input block size to 128 we can effectively find what the real block size is (unless, of course, it is 128!). With 512 as the standard block size, assuming 128 is virtually a failproof way to find the real block size. The output of the above command would most likely be an 'error' message (which was our intent) with the real block size revealed (say 1024, for example).

Another example of DD usage is the following. Let's say we have an image which we need to chop up into smaller pieces. Perhaps our backup media is limited to 4 1GB discs and the evidence is 4GB in size. We could use DD with the flags below to create 4 images of the evidence, each 1GB in size.
dd if=/dev/st0 count=1000000 of=/dev/case10img1
dd if=/dev/st0 count=1000000 skip=1000000 of=/dev/case10img2
dd if=/dev/st0 count=1000000 skip=2000000 of=/dev/case10img3
dd if=/dev/st0 count=1000000 skip=3000000 of=/dev/case10img4
Now we have taken the 4GB evidence tape and chopped it into 4 separate images, each 1GB in size. Let's look at this example more closely. Notice that the first command takes 1GB (count=1000000) and names the copy 'case10img1.' The second command skips the first 1GB (skip=1000000) and then copies the next 1GB (count=1000000), naming this image 'case10img2.' We can now see exactly what the 'count' and 'skip' flags do.

As you can see, DD is a very resourceful tool to use when performing physical backups of evidence. It is especially useful when working with large hard disks and/or tapes. The examples above were created to show you different ways you can get DD to work for you. As you become more familiar with it, you will find that you can do more than what I've shown above. You may even find out that DD is also quite useful when restoring evidence! I recommend that you create some evidence disks and tapes and play with DD. Read the man page on it and try the different flags. The learning curve is not steep, and the cost (free) can't be beat!

[Mar 10, 2010] mount_dd

freshmeat.net

mount_dd is a small tool for mounting a dd image with a GUI. You can mount it in read-write or read-only mode.

[Aug 30, 2003] forensics 2003-08 Using dd.exe to make forensic images of NTFS

Hi everyone,

I have tried time and time again to make images of my NTFS drives via the
dd command in windows. I use the FIRE cd forensic shell on the windows box and:

dd.exe if=\\.\f: |nc.exe <forensic machine IP> <port>

On my linux box I run:

nc -l -p <port> |dd of=/home/user/ntfs.dd

That all works fine and it makes and transfers the file but then I try to add the file in autopsy and it tells me its not an NTFS image and consequently doesn't add it.

I tried conv=noerrors and I tried just dumping the file on the linux box without dd on the of= side. I tried different NTFS partitions of different sizes as well. My linux box has the NTFS support kernel mod and everything else about autopsy works fine. Just these NTFS images. I have no probs using dd with linux partitions at all. I'd like to find a solution to this because commerical ware like Encase is outrageously expensive and dd is free making it perfect for my situation.

Thanks,
Sakaba
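
Two sanity checks worth trying in a situation like this (a suggestion, not from the original post): compare a checksum of the transferred image against one taken on the source side, and see whether the kernel NTFS driver can mount the image read-only via loopback before handing it to autopsy:

md5sum /home/user/ntfs.dd                                 # compare against a hash computed at the source
mount -t ntfs -o loop,ro /home/user/ntfs.dd /mnt/ntfs     # if this mounts, the image at least has a sane NTFS structure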

Wonders of 'dd' and 'netcat' Cloning OS harddrives

Anytime we think about installing an OS on more than one system, 'cloning' comes to mind, because we are too lazy :-). Well, that is one of the important characteristics of a Systems Administrator: laziness forces him or her to automate. In this document we will try to exploit the power of the low-level data transfer command popularly known as 'dd' together with netcat. These programs are available for all major UNIX, Linux and Windows platforms and are fairly popular among Forensics Analysis professionals.

Problem Description:

You have more than one machine with almost the same hardware, i.e. the same hard drive, SCSI card, graphics etc. You set up one machine from top to bottom and now it is time to replicate the OS setup on the other machines. Commercial software such as Ghost does a good job of cloning Windows-based machines, and many of these packages now support the Linux ext2 filesystem as well. dd, although very crude, gives you enough flexibility to manipulate cloning as you wish. We have demonstrated cloning of hard drives on machines loaded with Linux, Win2K, Solaris and HPUX using dd. This document is not a single-click solution, so you may have to juggle things a bit at first. Once you get the hang of this process it becomes a powerful way to create your own disk-cloning schemes and save a lot of time and hassle.

Basic concept:

The 'dd' command can copy any data bit by bit from one location to another. So a simple command
dd if=<src> of=<dst>
where <src> and <dst> can be a file, a filesystem partition or a whole hard drive: anything that can be read or written in binary form, dd can handle. dd, however, is not a network program. To give dd networking capability we use another nice command, 'netcat'. netcat can connect to any TCP/UDP server and is a very good diagnostic tool as well. A typical netcat can run in both client and server mode, such as:

server% nc -l -p 30000 ==> (Listen for port 30000 on <server> )

client% nc <server> 30000 ==> (Connect to <server> at port 30000, ready to communicate)

This document will explain cloning under Linux, but the concept is very similar for all other operating systems for which 'dd' and 'netcat' binaries are available.

Operating System Cloning (Using STANDALONE machine):

Let us assume we have two drives, (sda) and (sdb), attached to the system (example: a Linux box, but it can be any other OS). (sda) is the drive with the Master OS (let's call it the Master OS drive) and (sdb) is the drive (slave drive) onto which we have to clone the data from (sda).

Let's assume we have to clone harddrive (sda), which has the partition table shown below. It has one NTFS partition loaded with WinNT/Win2K and the rest are Linux partitions (swap, Linux and RAID partitions). Assume the second (slave) harddrive (sdb) is also attached to the same system.

Device Boot Start End Blocks Id System
/dev/sda1 1 9 72261 83 HPFS/NTFS
/dev/sda2 10 75 530145 82 Linux swap
/dev/sda3 76 467 3148740 fd Linux raid autodetect
/dev/sda4 468 2200 13920322+ 83 Linux

A simple way to clone this drive (/dev/sda) to another drive attached to the same system (/dev/sdb) is to use the dd command.

dd if=/dev/sda of=/dev/sdb

This command will copy each bit from sda (the Master drive) to sdb (the Slave drive), including the MBR (Master Boot Record). Thus after cloning the new drive (sdb) is ready for deployment. This will also copy information like filesystem IDs etc.

Since drive sizes keep getting bigger these days and may run up to 100+ GB, this whole dd process may take a long time, and obviously there is no point in cloning the Linux swap area or empty partitions which do not yet contain any useful data. Hence in this situation it is best to clone only the relevant partitions. For this you need to partition the second drive beforehand; one way to do that with sfdisk is sketched after the partition listing below.

Note: Both drives must be partitioned exactly the same. If you have different brands of harddrives, make sure each partition on the second drive is equal to or greater than the corresponding partition on the first drive. Also make sure the filesystem IDs match on the second drive.

Device Boot Start End Blocks Id System
/dev/sdb1 1 9 72261 83 HPFS/NTFS
/dev/sdb2 10 75 530145 82 Linux swap
/dev/sdb3 76 467 3148740 fd Linux raid autodetect
/dev/sdb4 468 2200 13920322+ 83 Linux
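
One convenient way to replicate the partition table from the master drive onto the slave (a sketch, assuming both are PC-style disks, the target is at least as large as the source, and sfdisk is available) is:

sfdisk -d /dev/sda | sfdisk /dev/sdb     # dump sda's partition table and write it to sdb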

Now the partition-by-partition cloning process will look like this:

Note: for the boot record we are again using the whole drives sda and sdb as the input and output arguments of dd. (The process of making Solaris or HPUX drives bootable may differ, but those systems also let you set up a boot record, just like the PC's MBR.)

dd if=/dev/sda of=/dev/sdb bs=446 count=1

dd if=/dev/sda1 of=/dev/sdb1 ==> Clone NTFS partition
dd if=/dev/sda3 of=/dev/sdb3 ==> Clone RAID-1 partition having ext2 FS or some other.

Operating System Cloning over network:

One major bottleneck in the above process is that we have to physically open the boxes, connect the harddrive to the Master box and then run the clone process. This is easier with desktops, where you have the liberty to connect external drives (IDE, SCSI bus). But a laptop can generally house only one IDE drive, and there is no easy way to open it and connect a second drive for cloning. Thus the above process becomes much more useful if cloning can be done over the network. Several possible combinations are presented here. The idea is that we have the Master Linux box up and running on the network, and we boot the slave box (whose harddrive is to be cloned) from some alternate media such as a boot CD, with the root filesystem on the CD itself, *NOT* on the harddrive, so that we are free to write to the slave harddrive.

Master Box -----------network----------- Slave box (booted from alternate media, *NOT* using the slave drive)

One of the following three methods can be used to boot the slave box from alternative media.

Method [1] Making your own root filesystem on ext2 CDROM. (Not Scalable )

One can make a small Linux distribution (less than 650MB) which fits on a CDROM. Burn this CDROM with an ext2 filesystem (not ISO9660) and then use a Linux boot floppy to boot from, using the CDROM's ext2 filesystem as the / (root) filesystem (read-only) instead of a root filesystem on the harddrive. This process is doable but has issues: you need to include all the necessary drivers for network, SCSI etc., and making your custom ext2 read-only filesystem on CD and booting from it is quite a trial-and-error exercise. If you are interested in making such CDs or bootable CDs, see the reference section for links. I once did this to clone HP Omnibook 6000 laptops loaded with Linux+Win2K together and it worked pretty well, but it is not a scalable solution.

Method [2] Using popular Linux distribution and floppy combination.

Along similar lines, the boot CDROM of a Linux distribution such as RedHat/SuSE will let you boot into some kind of rescue system at OS install time. In the case of RedHat, boot from the RedHat OS CD and at the initial install prompt type 'linux rescue'; this will let you use the CDROM as the root filesystem and will provide a shell prompt. Linux distributions use this facility to repair a problematic Linux install, but we will use it just to get a shell prompt. The great thing about this is that most Linux distributions come with lots of popular SCSI and network drivers, so you don't have to worry about cooking your own custom bootable CD.

Many common utilities, including the 'dd' command, are usually available in rescue mode. However, you also need the netcat command (a static binary, not dynamically linked). You can download the netcat distribution and recompile it as a static binary (use the -static flag). When I compiled it, it was small enough to fit on one floppy, so you can copy it onto a floppy. (I formatted the floppy with ext2, mounted it on a Linux system and copied the netcat binary there.)

mkfs /dev/fd0
mount /dev/fd0 /mnt/floppy
cp nc /mnt/floppy
umount /mnt/floppy

So with 'linux rescue' mode and the netcat binary on a floppy you can use dd and netcat to clone your system over the network, as we will see below.

Method [3] Modifying popular Linux distribution CDs and recreating your personal bootable ISO image:

If for some reason netcat won't fit on one floppy, or you need more utilities/binaries, then you can modify the Linux distribution CD (SuSE/RedHat). This is a little hack but it works.

NOTE: ISO images are read-only filesystems. Even if you have an ISO image (say, one obtained with the dd command)
dd if=/dev/cdrom of=redhat-boot-cd.iso
and you try to mount this ISO file using a loopback device with the read/write option (-o rw) (you need loopback device support (CONFIG_BLK_DEV_LOOP=y) compiled into the kernel to do that)
mount -o loop -o rw ./redhat-boot-cd.iso /mnt/cdrom
it still won't allow you to write to or modify the ISO filesystem.

I haven't found any good solution for editing an ISO image directly. One such tool is WinISO (http://www.winiso.com); it is a shareware package so you have to pay for it, but you can use it to add more files to your ISO image and burn the new image back to a CD. If you know a better solution let me know also :-)

The following steps are useful for adding additional files to a RedHat bootable ISO image and burning a new CD with additional files of your choice:

mkisofs -r -b images/boot.img -c boot.catalog -o /tmp/redhat-bootcd.iso ./

Now the Real drill:

Whatever method you choose to boot the slave machine (RedHat bootable CD + floppy, or a custom bootable RedHat CD), the ultimate aim is to obtain a shell, dd and the netcat binary after 'linux rescue'. After you get the shell you can access files stored on the boot CD by changing directory to /mnt/sources/mystuff.
Hopefully your ethernet card has been detected by now (most Linux distributions allow OS installation over the network); if not, you have to load the drivers for your ethernet card. The distribution documentation usually tells you how, and sometimes an extra drivers floppy is provided. In the case of RedHat these floppy images are generally stored under the images/ directory, and you can copy an image to a floppy using a command like
dd if=<floppy-image> of=/dev/fd0

On Slave machine :
Run the netcat command first on the slave Linux box (the one to be cloned, booted from the Linux boot CDROM with 'linux rescue'; see also shell script case [1] in the automation section below). Once the ethernet card has been detected (use ifconfig -a to check), assign an IP address to this interface on the slave machine and define the loopback interface as well. (You may choose a different IP address for eth0.) You may also need to create an /etc/hosts file before you can assign the IP address. Use the following commands to create your new /etc/hosts (these are actually created in the RAM filesystem, RAMFS).

rm /etc/hosts
echo "127.0.0.1 localhost" > /etc/hosts
echo "192.168.0.254 fakehost" >> /etc/hosts

ifconfig lo 127.0.0.1 up
ifconfig eth0 192.168.0.254 up

Assuming Master Linux box (from where you want to clone) is up and running with IP 192.168.0.1.

slave% nc -l -p 9000 | dd of=/dev/sda (Replace /dev/sda with actual drive on your slave machine)

This will listen on port 9000, and whatever arrives at port 9000 will be handed over to the dd command, which simply writes it bit by bit to the slave harddrive (sda). Here I am assuming dd and netcat (nc) are available either from the floppy (/mnt/floppy/nc) or from /mnt/sources/mystuff/nc. In the floppy case you need to mount the floppy first using the command:
mount /dev/fd0 /mnt/floppy

On Master machine:
Now log in on the Master Linux box and run the following command. (It is advisable that the Master Linux box be in a calm state, i.e. no major jobs running on the machine.) The command below reads the master disk bit by bit and feeds the bit stream to netcat, which is connected to the netcat listening on port 9000 on the <slave> box.

master% dd if=/dev/sda | nc 192.168.0.254 9000

That's it. You may have to wait a long time depending on the network speed and the size of your harddrive. Typically a 36GB drive may take about 50 minutes over a 100Mbps link. Again, rather than cloning the complete drive we can clone only the relevant partitions and the MBR, which makes cloning much faster, as we saw in the previous section; a sketch of this is shown below.
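
As a sketch, the partition-by-partition variant over the network looks just like the single-drive case, one netcat session per partition (IP address, port numbers and partition names are the same examples as above, and the slave drive must already carry the matching partition table):

slave%  nc -l -p 9000 | dd of=/dev/sda1
master% dd if=/dev/sda1 | nc 192.168.0.254 9000

slave%  nc -l -p 9001 | dd of=/dev/sda3
master% dd if=/dev/sda3 | nc 192.168.0.254 9001

The 446-byte boot-code copy shown earlier can be sent the same way, before or after the partitions.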

Automating process and Case studies:

One of the primary reasons for using the dd and netcat way of cloning an OS instead of commercial software such as Ghost is that we have the liberty to automate the process as we like. The following scripts may help in automating the cloning process.

Case [1]: Script for Slave machine (netcat and dd cloning) on the fly.

=================================================
cloneme.sh :: Shell script for slave machine.
=================================================

#!/bin/sh
############### Edit variables below ######################
FLOPPY_PATH=/mnt/floppy
MYSTUFF_PATH=/mnt/sources/mystuff

# Uncomment only One of the options below.
#### OPTION ==> 1 if using floppy ################
#NC=$FLOPPY_PATH/nc
#### OPTION ==> 2 if using mystuff/ on CD #########
NC=$MYSTUFF_PATH/nc

LPORT=9000
DEST=/dev/sda
SRC=$DEST
############# No need to edit after this in general ###########

if [ $# -eq 1 ]
then
IPADDR=$1
echo "###############################################################"
echo " If there are no errors here. You need to run following"
echo " command on Master Box."
echo ""
echo "dd if=$SRC | nc $IPADDR $LPORT"
echo "###############################################################"

echo ""
echo "##>> Preparing /etc/hosts ##"
rm /etc/hosts
echo "127.0.0.1 localhost" > /etc/hosts
echo "$IPADDR fakehost" >> /etc/hosts

echo "#===================================================================="
echo "NOTE:: If you need to create routes"
echo " #route add -net <DEST_NET> netmask 255.255.255.0 gw $IPADDR metric 0"
echo "#===================================================================="

echo "##>> Preparing interfaces lo and eth0 ##"
ifconfig lo 127.0.0.1 up
ifconfig eth0 $IPADDR up

echo ""
echo ">>> Now start listening(at $LPORT) for traffic from Master :-)"
echo "$NC -l -p $LPORT | dd of=$DEST"
$NC -l -p $LPORT | dd of=$DEST

echo ""
echo "%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%"
echo " Cloning Process completed..... :-) Reboot Now"
echo "%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%"

else
echo "Usage:: cloneme <IP_ADDR_OF_THIS_MACHINE>"
fi

Case [2] Saving Disk Images (Export Image for later use):

Although you can clone a running machine over the network at any time, it is sometimes desirable to store the base installation as a reference image and clone from this pristine image later. With the help of dd you can image disks as well. But let's discuss some issues first.
Most 32-bit operating systems (Linux for IA32, Windows etc.) have a limitation on the maximum file size; in practice the limit is about 2GB. 64-bit OSes (Solaris 8, HPUX 11.0, Linux for IA64, etc.) do not have this limitation. So if you use dd to copy a harddrive into a single image file you can image at most a 2GB harddrive, which is pretty useless these days. Fortunately dd can image in chunks, and you can specify start and end blocks, skip blocks etc. So the idea here is to image your big harddrive into chunks of approximately 2GB files over the network. (I did notice that RedHat 7.1 with Linux 2.4.x kernels allows file sizes even bigger than 4GB on an ext2 FS.)
Also if you want to store images in compressed format (to save space) it is desirable to have each image file size not too large.
The following perl script (export-image.pl) can be used to image a local Linux harddrive /dev/hda to a remote machine over NFS using dd. If you are not running NFS you can implement the same thing using dd and netcat, although for now that is a manual process. If somebody knows a better way to run netcat and transfer multiple files automatically between two machines, please let me know and I will cook up an automation script here.

This perl script actually runs the dd command roughly as described below, imaging your big harddrive into chunks of 1950 MB named (1, 2, 3, 4, .....) over NFS to the remote machine.
($NFS is NFS destination directory on another server having plenty of space)

For 1st Image:
dd if=/dev/hda of=$NFS/1 bs=1024k count=1950 skip=0
For 2nd image: (Skipping the part of harddrive used for 1st image.)
dd if=/dev/hda of=$NFS/2 bs=1024k count=1950 skip=1950
For 3rd image: (Skipping the part of Harddrive used for 1st+2nd image)
dd if=/dev/hda of=$NFS/3 bs=1024k count=1950 skip=3900
and so on.

In case you want to use netcat, you can simply pipe the above dd commands manually to netcat and listen with netcat and dd on the remote machine, just like we used netcat and dd to clone harddisks above. For example, imaging the harddrive on machineA and saving the image on machineB:

For 1st image:
machineB% nc -l -p 9000 | dd of=1
machineA(master)% dd if=/dev/hda bs=1024k count=1950 skip=0 | nc machineB 9000
For 2nd image:
machineB% nc -l -p 9001 | dd of=2
machineA(master)% dd if=/dev/hda bs=1024k count=1950 skip=1950 | nc machineB 9001
For 3rd image:
machineB% nc -l -p 9002 | dd of=3
machineA(master)% dd if=/dev/hda bs=1024k count=1950 skip=3900 | nc machineB 9002
and so on.

Once you have the images (1, 2, 3, 4 ....) stored on the network, you can boot your slave Linux box from the bootable CD and pull these images onto the slave box as described in case [3].

========================================================
export-image.pl :: Perl script to image big harddrive using dd and NFS.
========================================================
#!/usr/bin/perl
#####################################################
#This script will run dd command (in serial) and dump
#1950 blocks (1.9GB) file for each.
#Run script as perl export-image.pl
#####################################################

################ Edit variables below #########################
#device is raw device name for harddrive to be cloned (imaged).
$device="/dev/hda";
#mount NFS file system with large space available which can hold images.
$nfs_path="/nfs/remote/home/tmp";
#Image name (read from user) (Make sure you have $nfs_path/$image directory)
#on remote machine.
$image="ob6000";
############################################################
$dd="/bin/dd";
#For compressing image
$bzip2="/usr/bin/bzip2";
$suffix=".bz2";
############## No need to edit after this #########################

$bs="1024k";
$block_count=1950;
$image_dir="$nfs_path/$image";
$compress=$bzip2;

$proceed=0;

if(!(-d $image_dir) )
{ die "\nOops!! Image Directory $image_dir must exist with chmod 777 permission\n"; }

system("clear");
print <<MSG1;
###########################################################
NOTE:: COMPRESSION TAKES TOO MUCH TIME (many hours) OVER NFS.
It is better to compress manually later on the server itself.
###########################################################
\n\n Do you want to compress images using $compress [y/n] (Default n) = \t
MSG1

$compress_flag=<STDIN>; chomp($compress_flag); # strip the newline so the y/Y comparison below works
if(($compress_flag eq "y") or ($compress_flag eq "Y"))
{ $compress_flag=1; }
else
{ $compress_flag=0; }

print "\n\n";
print "***************************************************\n";
print " Local Device = $device [SOURCE] \n";
print " Image Dir = $image_dir [TARGET] \n";
print "***************************************************\n\n\n";
print "Dude! I hope you understand what are you doing by pressing [y/Y] here :-) \n";
print " Press [y/Y] if you want to continue .. ";
$con=<STDIN>; chomp($con);

if(($con eq "y") or ($con eq "Y"))
{
$i=0;
$image_size=1; #Some fake value greater than zero.

print "\n\nDisk Imaging starts...\n";
system("date");
while($image_size > 0)
{
$image_name="$image_dir/$i";
print "##############################################\n";
print "Creating Image $image_name\n";
print "##############################################\n";
$skip=$i*$block_count;
print "$dd if=$device of=$image_name bs=$bs count=$block_count skip=$skip \n";
system("$dd if=$device of=$image_name bs=$bs count=$block_count skip=$skip");
if($compress_flag)
{
print "Compressing Image: $bzip2 $image_name => $image_name$suffix\n";
system("$bzip2 $image_name");
$image_name .= "$suffix";
}
++$i;
$image_size=(stat($image_name))[7];
system("date");
}
}
else
{
print "Bye Bye ...\n";
}

Case [3] Importing Disk Images (1, 2, 3, 4 ...) created in Case [2] using netcat, dd and cat

This part is a little tricky in the sense that we want all the images (1, 2, 3, 4, ...) to be imported on the slave machine and use dd to write them serially onto the slave drive. A very simple set of commands can be used, as below.

On the Slave machine (booted through linux rescue), run the following netcat command to capture the incoming data stream.

machineC(slave)% nc -l -p 9000 | dd of=/dev/hda

On machineB (where images 1, 2, 3, 4 .... are stored), run the following cat and netcat command. Make sure you cat the images in the same sequence in which they were created in case [2]. The cat command simply joins these images and feeds the data stream to netcat; the slave machine picks it up and copies it bit by bit onto the slave harddrive.

machineB% cat 1 2 3 4 .... | nc machineC 9000

Case [4] Importing Disk images created in Case [2] over NFS:
Most likely the 'linux rescue' system won't have NFS support, which means that when you boot the slave box this way you cannot access resources over NFS. But if you cook your own CD with NFS support and perl, the following perl script can be used to fetch the images stored earlier on machineB over NFS. This script effectively does:
($NFS is NFS source directory on another server machineB where you have images 1, 2, 3, 4, ... stored earlier)

For image 1:
dd if=$NFS/1 of=/dev/hda bs=1024k conv=notrunc seek=0
For image 2:
dd if=$NFS/2 of=/dev/hda bs=1024k conv=notrunc seek=1950
For image 3:
dd if=$NFS/3 of=/dev/hda bs=1024k conv=notrunc seek=3900

In any case, if you have perl and NFS client support on the slave Linux box, you can use the perl script below.

import-image.pl
#!/usr/bin/perl
#####################################################
#This script will run dd command (in serial) and dump
#and import image.
#####################################################

##############################################################################
#device is target raw device name for harddrive to be cloned.
$device="/dev/hda";
#mount NFS file system with large space available which can hold images.
$nfs_path="/mnt/images";
#Image name (read from user)
$image="ob6000";
###############################################################################
$dd="/bin/dd";
#$bzcat="/usr/bin/bzcat";
#$suffix=".bz2";

$bs="1024k";
$block_count=1950;
###############################################################################
$image_dir="$nfs_path/$image";

$proceed=0;

if(!(-d $image_dir) )
{ die "\nOops!! No Image Directory $image_dir\n"; }

system("clear");
print "***************************************************\n";
print " Local Device = $device [TARGET]\n";
print " Image Dir = $image_dir [SOURCE]\n";
print "***************************************************\n\n\n";
print "Dude! I hope you understand what are you doing by pressing [y/Y] here :-) \n";
print " Press [y/Y] if you want to continue .. ";
$con=<STDIN>; chomp($con);
print " Once Again!!! Press [y/Y] if you want to continue .. ";
$con=<STDIN>; chomp($con);

system("date");
if(($con eq "y") or ($con eq "Y"))
{
print "\n\nDisk Imaging import starts...\n";

$i=0;
$image_name="$image_dir/$i";
while(-f $image_name )
{
print "##############################################\n";
print "Importing Image $image_name\n";
print "##############################################\n";
$seek=$i*$block_count;
print "##############################################\n";
$seek=$i*$block_count;
print "$dd if=$image_name of=$device bs=$bs conv=notrunc seek=$seek \n";
#system("$bzcat $image_name | $dd of=$device bs=$bs conv=notrunc seek=$seek");
system("$dd if=$image_name of=$device bs=$bs conv=notrunc seek=$seek");
++$i;
$image_name="$image_dir/$i";
system("date");
}
}
else
{
print "Bye Bye ...\n";
}

Other Operating Systems Tips:

You can do pretty much the same in other operating systems. This section quickly lists a few tips that may be useful.

Windows:

dd if=\\.\PhysicalDrive0 of=<target>

Solaris:

Others: (Make disk bootable)

Conclusion:

A few possible uses of netcat and dd have been shown in this document. The methods presented here are very simple and easy to use, but they have their pros and cons. This technique is very good for on-the-fly OS cloning. When we image a whole drive we need the equivalent amount of space on the other machine, which may not be very practical; you can compress the images, which saves a lot of space. I have noticed that a dd image can be compressed by 30-80%, depending on the real data on the drive, using a program like gzip/compress. This cloning and imaging method can be very effective in forensic analysis, where you sometimes need an exact snapshot of a harddrive including the swap partitions. You can always break your images into small pieces (and perhaps compress them), transfer them over the network somewhere else, and reproduce the data. As mentioned above, one of the great advantages here is that you can customize your own cloning scheme.
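
For example, compressing on the fly is just one more pipe stage (a sketch; device and file names are placeholders, and bzip2 can replace gzip at the cost of speed):

dd if=/dev/hda bs=1024k | gzip -c > /nfs/images/hda.img.gz      # save a compressed image
gunzip -c /nfs/images/hda.img.gz | dd of=/dev/hda bs=1024k      # restore it later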

References:

  1. GNU utilities for Win32. http://unxutils.sourceforge.net/
  2. netcat for Windows. http://www.l0pht.com/~weld/netcat
  3. First Attempt at Creating a Bootable Live Filesystem on a CDROM http://www.linuxgazette.com/issue54/nielsen.html
  4. Good Site for Windows utilities such as newsid.exe: http://www.sysinternals.com
  5. Modifying ISO image http://www.winiso.com
  6. Solaris Bootable CD creation: http://www.lka.ch/projects/solcdburn/solcdburn.html
  7. Sun Blueprint: http://www.sun.com/software/solutions/blueprints/0301/BuildBoot.pdf
  8. Linux on Floppy: http://www.toms.net/rb/
  9. Static binaries for Linux.

Take Command dd -- http://www.linuxjournal.com/article.php?sid=1320

The 'dd' command is one of the original Unix utilities and should be in everyone's tool box. It can strip headers, extract parts of binary files and write into the middle of floppy disks; it is used by the Linux kernel Makefiles to make boot images. It can be used to copy and convert magnetic tape formats, convert between ASCII and EBCDIC, swap bytes, and force to upper and lowercase. For blocked I/O, the dd command has no competition in the standard tool set.

One could write a custom utility to do specific I/O or formatting but, as dd is already available almost everywhere, it makes sense to use it. Like most well-behaved commands, dd reads from its standard input and writes to its standard output, unless a command line specification has been given. This allows dd to be used in pipes, and remotely with the rsh remote shell command.

Unlike most commands, dd uses a keyword=value format for its parameters. This was reputedly modeled after IBM System/360 JCL, which had an elaborate DD 'Dataset Definition' specification for I/O devices. A complete listing of all keywords is available from GNU dd with

$ dd --help

Some people believe dd means ``Destroy Disk'' or ``Delete Data'' because if it is misused, a partition or output file can be trashed very quickly. Since dd is the tool used to write disk headers, boot records, and similar system data areas, misuse of dd has probably trashed many hard disks and file systems. In essence, dd copies and optionally converts data. It uses an input buffer, conversion buffer if conversion is specified, and an output buffer. Reads are issued to the input file or device for the size of the input buffer, optional conversions are applied, and writes are issued for the size of the output buffer. This allows I/O requests to be tailored to the requirements of a task. Output to standard error reports the number of full and short blocks read and written.

Example 1

A typical task for dd is copying a floppy disk. As the common geometry of a 3.5" floppy is 18 sectors per track, two heads and 80 cylinders, an optimized dd command to read a floppy is:

Example 1-a : Copying from a 3.5" floppy

dd bs=2x80x18b if=/dev/fd0 of=/tmp/floppy.image
1+0 records in
1+0 records out

The 18b specifies 18 sectors of 512 bytes, the 2x multiplies the sector size by the number of heads, and the 80x is for the cylinders--a total of 1474560 bytes. This issues a single 1474560-byte read request to /dev/fd0 and a single 1474560-byte write request to /tmp/floppy.image, whereas a corresponding cp command

cp /dev/fd0 /tmp/floppy.image

issues 360 reads and writes of 4096 bytes. While this may seem insignificant on a 1.44MB file, when larger amounts of data are involved, reducing the number of system calls and improving performance can be significant.

This example also shows the factor capability in the GNU dd number specification. This has been around since before the Programmers Work Bench and, while not documented in the GNU dd man page, is present in the source and works just fine, thank you.

To finish copying a floppy, the original needs to be ejected, a new diskette inserted, and another dd command issued to write to the diskette:

Example 1-b : Copying to a 3.5" floppy
dd bs=2x80x18b < /tmp/floppy.image > /dev/fd0
1+0 records in
1+0 records out

Here is shown the stdin/stdout usage, in which respect dd is like most other utilities.

Example 2

The original need for dd came with the 1/2" tapes used to exchange data with other systems and boot and install Unix on the PDP/11. Those days are gone, but the 9-track format lives. To access the venerable 9-track, 1/2" tape, dd is superior. With modern SCSI tape devices, blocking and unblocking are no longer a necessity, as the hardware reads and writes 512-byte data blocks.

However, the 9-track 1/2" tape format allows for variable length blocking and can be impossible to read with the cp command. The dd command allows for the exact specification of input and output block sizes, and can even read variable length block sizes, by specifying an input buffer size larger than any of the blocks on the tape. Short blocks are read, and dd happily copies those to the output file without complaint, simply reporting on the number of complete and short blocks encountered.

Then there are the EBCDIC datasets transferred from such systems as MVS, which are almost always 80-character blank-padded Hollerith Card Images! No problem for dd, which will convert these to newline-terminated variable record length ASCII. Making the format is just as easy and dd again is the right tool for the job.

Example 2 : Converting EBCDIC 80-character fixed-length record to ASCII variable-length newline-terminated record
dd bs=10240 cbs=80 conv=ascii,unblock if=/dev/st0 of=ascii.out
40+0 records in
38+1 records out

The fixed record length is specified by the cbs=80 parameter, and the input and output block sizes are set with bs=10240. The EBCDIC-to-ASCII conversion and fixed-to-variable record length conversion are enabled with the conv=ascii,unblock parameter.

Notice the output record count is smaller than the input record count. This is due to the padding spaces eliminated from the output file and replaced with newline characters.

Example 3

Sometimes data arrives from sources in unusual formats. For example, every time I read a tape made on an SGI machine, the bytes are swapped. The dd command takes this in stride, swapping the bytes as required. The ability to use dd in a pipe with rsh means that the tape device on any *nix system is accessible, given the proper rlogin setup.

Example 3 : Byte Swapping with Remote Access of Magnetic Tape
rsh sgi.with.tape dd bs=256b if=/dev/rmt0 conv=swab | tar xvf -

The dd runs on the SGI and swaps the bytes before writing to the tar command running on the local host.

Example 4

Murphy's Law was postulated long before digital computers, but it seems it was specifically targeted for them. When you need to read a floppy or tape, it is the only copy in the universe and you have a deadline past due, that is when you will have a bad spot on the magnetic media, and your data will be unreadable. To the rescue comes dd, which can read all the good data around the bad spot and continue after the error is encountered. Sometimes this is all that is needed to recover the important data.

Example 4 : Error Handling
dd bs=265b conv=noerror if=/dev/st0 of=/tmp/bad.tape.image

Example 5

The Linux kernel Makefiles use dd to build the boot image. In the Alpha Makefile /usr/src/linux/arch/alpha/boot/Makefile, the srmboot target issues the command:

Example 5 : Kernel Image Makefile
dd if=bootimage of=$(BOOTDEV) bs=512 seek=1 skip=1

This skips the first 512 bytes of the input bootimage file (skip=1) and writes starting at the second sector of the $(BOOTDEV) device (seek=1). A typical use of dd is to skip executable headers and begin writing in the middle of a device, skipping volume and partition data. As this can cause your disk to lose file system data, please test and use these applications with care.
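
A side note on Example 1 above: if you want to verify the system-call difference between dd and cp yourself, strace can count the reads and writes each command issues (a sketch; the device and image path are the ones used in Example 1):

strace -c -e trace=read,write dd bs=2x80x18b if=/dev/fd0 of=/tmp/floppy.image
strace -c -e trace=read,write cp /dev/fd0 /tmp/floppy.image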

Recovering Corrupted or Lost Files from a FAT Partition with Linux

Written by Chad Anderson, edited by Eric Stallsworth, updated 5/20/2011

... ... ...

Scripts

#!/bin/sh
# Archive every file named (one per line) in the file 'list' into my.tar
tar -cvf my.tar $(for i in `cat list`
   do
       echo $i
   done)
exit
This script backs up / to tape using the dump command, logging the date and all messages to a log file.
#!/bin/sh
#
# Append a timestamp and the dump output to backup.log
DATE=`date`
echo "$DATE" >> backup.log
filenumber=`/usr/bin/mt stat|/usr/bin/grep "File Number"|/usr/bin/awk '{print $3}'`
echo "Backing up / to tape location: $filenumber" >> backup.log
/sbin/dump -0ua -f /dev/nrsa0 / >> backup.log 2>&1
if [ $? -eq 0 ];then
   echo "/ backup successful" >> $HOME/log/backup.log
fi


Copy files (even a complete filesystem) from a remote to the local system.
Note: You must be able to rlogin into the remote machine without a password. To do this, add the name of your local machine with your user name to the .rhosts file in your home directory on the remote machine.

    #!/bin/sh
    #
    # Copies files from a remote system to the local current directory
    #
    name=`basename $0`
    if [ $# -ne 2 ];then
        echo "Usage: $name <remote-system> <dir-to-copy>"
        exit
    fi
    system=$1
    dir_to_cp=$2
    rsh $system "cd $dir_to_cp; find . -print|cpio -ocB"|dd ibs=5k obs=5k|cpio -iducmvB

Recommended Links


Learn how to use the free and open source operating system, GNU/Linux, to analyze, recover, and repair your computer. Using Linux's free utilities you can fix lost partitions, resurrect deleted files, and repair seemingly unrepairable hard disks.
  1. Recovering Corrupted or Lost Files from a FAT Partition with Linux
  2. Working With Recovery Images in Linux
  3. How To Make a Linux Boot Disk for Windows Registry Repair
  4. Deleting and Recovering Files In Linux


