Reverse Engineering
Reverse engineering is a very broad term. At the high end it includes design recovery, and at the low end
decompilation and disassembly. But the essence
of all these different activities is understanding a particular program when something is
missing (design documentation, source code, etc.). It might actually be useful to distinguish "reverse engineering in the small"
from "reverse engineering in the large", just as we distinguish "programming in the small" from "programming in the large". Design recovery
and program renovation are generally connected with research on program understanding, while lower-level reverse engineering activities
(decompilation and disassembly) are more connected with compiler design and machine architecture
issues.
In the United States, once you own a copy of a program, you can back it up, compile it, run it, and even modify it as
necessary, without permission from the copyright holder. See
17 USC 117. According to the CONTU Final Report, which is generally interpreted by the courts as legislative history, "the right
to add features to the program that were not present at the time of rightful acquisition" falls within the owner's rights of modification
under section 117.
What does all this mean? Once you've legally downloaded or bought a program, you can run it, you can modify it, and
you can distribute your patches for other people to use. If you think you need a license from the copyright holder, you've been bamboozled
by Microsoft. As long as you're not distributing the software, you have nothing to worry about unless you are trying to defeat some
protection mechanism in the original software. Please browse my Copyright issues
for more information.
IMHO we need to support the public interest against current abuse. If you own a website, please consider joining the
opposition to copyright extensions,
No Cense, or another similar opposition group.
UCITA didn't even make the Federal Trade Commission happy.
The other threat to reverse engineering comes from pseudoscientists, for example all those "object-oriented no matter
what" zealots who do not understand the difference between programming-in-the-large and programming-in-the-small ;-). There
are a lot of pseudoscientific papers and even books on the problem of modernizing existing software systems, and a small army of successful
snake oil sellers makes good money selling more or less useless (and sometimes outright harmful) methodologies. There was a big splash
of such activity around the Y2K problem, but now it's over. Actually, all these Y2K efforts had one positive side effect: along with the
snake oil, some useful research was conducted and some useful tools were polished and/or developed thanks to the huge influx of money into the
area at the time (see for example Open Directory
- Computers Software Year 2000 Products).
Actually, reverse engineering is more about understanding the product than about producing a clone that benefits from
the original code. Documentation for many software products is notoriously bad, source code is unavailable or too expensive to obtain,
and if you need to develop a product that interfaces with such a software package, reverse engineering is your only option. On this page
I mainly try to help the latter category of developers.
Please note that I am no longer involved in the research in this area and many links below can be dead or outdated.
Dr. Nikolai Bezroukov
Step 1: Allocate physical or virtual systems for the analysis lab
A common approach to examining malicious software involves infecting a system with the malware specimen and then using the appropriate
monitoring tools to observe how it behaves. This requires a laboratory system you can infect without affecting your production environment.
The most popular and flexible way to set up such a lab system involves virtualization software, which allows you to use a single
physical computer for hosting multiple virtual systems, each running a potentially different operating system. Free virtualization
software options include:
Running multiple virtual systems simultaneously on a single physical computer is useful for analyzing malware that seeks to interact
with other systems, perhaps for leaking data, obtaining instructions from the attacker, or upgrading itself. Virtualization makes
it easy to set up and use such systems without procuring numerous physical boxes.
Another useful feature of many virtualization tools is the ability to take instantaneous snapshots of the laboratory system. This
way, you can record the state of the system before you infect it, and revert to the pristine environment with a click of a button
at the end of your analysis.
If using virtualization software, install as much RAM into the physical system as you can, as the availability of memory is arguably
the most important performance factor for virtualization tools. In addition, having a large hard drive will allow you to host many
virtual machines, whose virtual file systems typically are stored as files on the physical system's hard drive.
Because malware may detect that it's running in a virtualized environment, some analysts prefer to rely on physical, rather than
virtual, machines for implementing laboratory systems. Your old and unused PCs or servers can make excellent systems for your malware-analysis
lab, which usually doesn't need high-performing CPUs or highly redundant hardware components.
To allow malware to reach its full potential in the lab, laboratory systems typically are networked with each other. This helps
you observe the malicious program's network interactions. If using physical systems, you can connect them with each other using an
inexpensive hub or a switch.
Step 2: Isolate laboratory systems from the production environment
You must take precautions to isolate the malware-analysis lab from the production network, to mitigate the risk that a malicious
program will escape. You can separate the laboratory network from production using a firewall. Better yet, don't connect laboratory
and production networks at all, to avoid firewall configuration issues that might allow malware to bypass filtering restrictions.
If your laboratory network is strongly isolated, you can use removable media to bring tools and malware into the lab. Consider
using write-once media, such as DVDs, to prevent malicious software from escaping the lab's confines by writing itself to a writable
removable disk. A more convenient option is a USB key that includes a physical write-protect switch.
Some malware-analysis scenarios benefit from the lab being connected to the internet. Avoid using the production network for such
connectivity. If possible, provision a separate, and usually inexpensive, internet connection, perhaps by dedicating a DSL or Cable
Modem line to this purpose. Avoid keeping the lab connected to the internet all the time to minimize the chance of malware in your
lab attacking someone else's system on the internet.
If virtualizing your lab, be sure to keep up with security patches released by the virtualization-software vendor. Such software
may have vulnerabilities that could allow malware to escape from the virtual system you infected and onto the physical host. Furthermore,
don't use the physical machine that's hosting your virtualized lab for any other purpose.
Step 3: Install behavioral analysis tools
Before you're ready to infect your laboratory system with the malware specimen, you need to install and activate the appropriate
monitoring tools. Free utilities that will let you observe how Windows malware interacts with its environment include:
- File system and registry monitoring: Process Monitor, paired with ProcDOT, offers a powerful way to observe how local processes read, write,
or delete registry entries and files. These tools can help you understand how malware attempts to embed into the system upon infection.
- Process monitoring: Process Explorer and Process Hacker replace the built-in Windows Task Manager, helping you
observe malicious processes, including local network ports they may attempt to open.
- Network monitoring: Wireshark is a popular network sniffer, which can observe
laboratory network traffic for malicious communication attempts, such as DNS resolution requests, bot traffic, or downloads.
- Change detection: Regshot is a lightweight tool for comparing
the system's state before and after the infection, to highlight the key changes malware made to the file system and the registry.
Behavioral monitoring tools can give you a sense for the key capabilities of malicious software. For further details about its
characteristics, you may need to roll up your sleeves and perform some code analysis.
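The before-and-after comparison that change-detection tools perform can be illustrated with a minimal Python sketch. This is not how Regshot itself is implemented; it only covers files (not the registry), and the function names `snapshot` and `diff_snapshots` are made up for this illustration.

```python
import hashlib
import os


def snapshot(root):
    """Map each file path under root (relative to root) to the SHA-256 of its contents."""
    state = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            with open(full, "rb") as handle:
                digest = hashlib.sha256(handle.read()).hexdigest()
            state[os.path.relpath(full, root)] = digest
    return state


def diff_snapshots(before, after):
    """Return sorted lists of added, removed, and modified paths."""
    added = sorted(set(after) - set(before))
    removed = sorted(set(before) - set(after))
    modified = sorted(
        path for path in before.keys() & after.keys() if before[path] != after[path]
    )
    return added, removed, modified
```

In a lab you would call `snapshot` on a directory of interest before running the specimen, call it again afterwards, and pass both results to `diff_snapshots`.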
Step 4: Install code-analysis tools
Examining the code that comprises the specimen helps uncover characteristics that may be difficult to obtain through behavioral
analysis. In the case of a malicious executable, you rarely will have the luxury of access to the source code from which it was created.
Fortunately, the following free tools can help you reverse compiled Windows executables:
- Disassembler and debugger: OllyDbg and IDA Pro Freeware can parse compiled Windows
executables and, acting as disassemblers, display their code as assembly instructions. These tools also have debugging capabilities,
which allow you to execute the most interesting parts of the malicious program slowly and under highly controlled conditions,
so you can better understand the purpose of the code.
- Memory dumper: Scylla and OllyDumpEx help obtain protected code located in the lab system's memory
and dump it to a file. This technique is particularly useful when analyzing packed executables, which are difficult to disassemble
because they encode or encrypt their instructions, extracting them into RAM only during run-time.
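Because packed code is encoded or encrypted, its bytes tend to look close to random. One common heuristic, not mentioned in the text above and offered here only as an illustrative sketch, is to measure the Shannon entropy of a file or section. The threshold of 7.0 bits per byte is an assumption chosen for the example, not a standard value.

```python
import math
from collections import Counter


def shannon_entropy(data: bytes) -> float:
    """Return the Shannon entropy of data in bits per byte (0.0 to 8.0)."""
    if not data:
        return 0.0
    total = len(data)
    counts = Counter(data)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def looks_packed(data: bytes, threshold: float = 7.0) -> bool:
    """Flag data whose entropy meets the (assumed) threshold as possibly packed."""
    return shannon_entropy(data) >= threshold
```

High entropy is only a hint: compressed resources inside an ordinary executable will also score high, so treat the result as a reason to look closer rather than as a verdict.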
Step 5: Utilize online analysis tools
To round off your malware-analysis toolkit, add to it some freely available online tools that may assist with the reverse engineering
process. One category of such tools performs automated behavioral analysis of the executables you supply. These applications look
similar at first glance, but use different technologies on the back end. Consider submitting your malware specimen to several of
these sites; depending on the specimen, some sites will be more effective than others. Such tools include:
You can see a longer list of free automated malware analysis services
that can examine compiled Windows executables.
Another set of potentially useful online tools provides details about websites that are suspected of hosting malicious code. Some
of these tools examine the sites you specify in real time; others provide historical information. Consider submitting a suspicious
URL to several of these sites, because each may offer a slightly different perspective on the website in question:
You can see a longer list of free on-line tools for looking up a potentially
malicious website.
Next Steps
With your initial toolkit assembled, start experimenting in the lab with malware you come across on the web, in your e-mail box,
on your systems, and so on. There are several "cheat sheets" that can help you in this process, including:
Begin analysis with the tools and approaches most familiar to you. Then, as you become more familiar with the inner workings of
the malware specimen, venture out of your comfort zone to try other tools and techniques. The tools I've listed within each step
serve similar purposes. Since they're all free, you should feel free to try them all. You'll find that one tool will work
better than another, depending on the situation. And with time, patience, and practice, you will learn to turn malware inside out.
For additional tips and resources, see my article How to Get Started With Malware Analysis
and check out the Reverse-Engineering Malware course I teach at SANS Institute.
Authors of this course created the following cheat sheets to summarize some of the concepts and tools useful for
malware analysis:
You can also get a sense for malware analysis approaches explored in this course by looking at the following resources:
wiredmikey writes "Security startup CrowdStrike has launched
CrowdRE, a free platform that allows security researchers and
analysts to collaborate on malware reverse engineering. CrowdRE is adapting the collaborative model common in the developer world
to make it possible to reverse engineer malicious code more quickly and efficiently. Collaborative reverse engineering can take two approaches: either all the
analysts work at the same time and share all the information instantly, or they work in a distributed manner, where different people
work on different sections and share the results. This means multiple people can work on different parts simultaneously and the results
can be combined to gain a full picture of the malware. Google is planning to add CrowdRE integration to BinNavi, a graph-based reverse
engineering tool for malware analysis, and the plan is to integrate with other similar tools. Linux and Mac OS support is expected
soon, as well."
January 13, 2012 | Slashdot
Sparrowvsrevolution writes "At the Shmoocon security conference later this month, Danny Quist plans to demo a new three-dimensional
version of a tool he's created called Visualization of Executables for Reversing and Analysis, or VERA, that
maps viruses' and worms' code into intuitively visible models.
Quist, who teaches government and corporate students the art of reverse engineering at Los Alamos National Labs, says he hopes
VERA will make the process of taking apart and understanding malware's functionality far easier.
VERA observes malware running in a virtual sandbox and identifies the basic blocks of commands it executes. Then those chunks
of instructions are color-coded by their function and linked by the order of the malware's operations, like a giant, 3D flow chart.
Quist provides a sample video showing a model of a section of the Koobface worm."
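The idea of linking basic blocks by the order in which they execute can be sketched in a few lines of Python. This is not VERA's code; it assumes you already have a recorded trace of basic-block start addresses, and `build_flow_graph` is a name invented for this example.

```python
from collections import Counter


def build_flow_graph(trace):
    """Count how often each block runs and how often each transition occurs.

    trace is a list of basic-block start addresses in execution order.
    """
    nodes = Counter(trace)
    edges = Counter(zip(trace, trace[1:]))
    return nodes, edges
```

The node and edge counts are the raw material for a flow-chart style picture: frequently taken edges reveal loops, and blocks that run only once often belong to setup or unpacking code.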
An interesting static byte code analyzer for Java
Structure101 for Java parses your byte code and creates an implementation model of
all the dependencies mapped up through the compositional hierarchy. It does this at a rate of mega-SLOCs per minute. You can
browse the model and view dependency diagrams at any level - method, class, package or jar. (More...)
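To make the idea of dependencies "mapped up through the compositional hierarchy" concrete, here is a small Python sketch that rolls class-level dependencies up to package level. It is an illustration only and says nothing about how Structure101 works internally; the input format (a dictionary of dotted class names) and the function names are assumptions.

```python
from collections import defaultdict


def package_of(class_name):
    """Return the package part of a dotted class name ('' for the default package)."""
    return class_name.rpartition(".")[0]


def roll_up(class_deps):
    """Aggregate class-to-class dependencies into package-to-package edge counts."""
    package_deps = defaultdict(int)
    for source, targets in class_deps.items():
        for target in targets:
            src_pkg, dst_pkg = package_of(source), package_of(target)
            if src_pkg != dst_pkg:  # dependencies inside one package are not edges
                package_deps[(src_pkg, dst_pkg)] += 1
    return dict(package_deps)
```

The same roll-up can be repeated at the jar level by mapping packages to the archive that contains them.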
We consider structure to be important through the life of an application - not just something that gets fixed in an expensive 'Big
Bang'. At the same time, we realize that many of our customers only begin looking at structure when they get the feeling it is out
of control... Structure101™, currently available for Java only, is designed for live,
evolving, imperfect, real projects, where ongoing development must continue. We have focused on making sense of large, difficult
code-bases. Structure101 lets you keep a lid on the structural complexity so that it doesn't get any worse, and
enables you to gradually streamline the structure while still working to hard delivery schedules.
We have been doing structure since 1999. The core engine of Structure101, the Higraph, is on its 3rd incarnation,
lightning fast and massively scalable. It is our passion to continually find new ways to understand and control structure - to make
structure simple.
It is very common for packages and classes to outgrow themselves. Big fat packages or classes tend to be difficult
to work with because they lack the structure that helps to guide human understanding. Structure101 helps by
letting you view even very large dependency graphs of the package or class contents. To help further, Structure101
can perform an Auto-partition on the graph, to reveal the hidden, inherent structure. As well as helping
you understand what you've got, seeing the inherent structure may help you to decide how to add structure by creating sub-packages
or classes.
computer_intelligence_assembler_disassembler
This page is about how the Post-It Fix-Up principle works out in practical
program code in Forth. For the impatient: jump to the downloads.
Actual assemblers
Applying the Post-It Fix-Up principle to an
8086 assembler led to the discovery of problems
that had to be solved. It turns out that some types of fixups are better considered not relative to the start of the instruction,
but relative to the end. Otherwise there would be different fixups for e.g. byte/cell indication (B| X|), dependent on the length
of the opcode. This is still visible in the fig-forth
version of the opcodes, such as B| W| besides B1| and W1| . So a new class of fixup, the "fixups from behind"
or reverse fixups, was added. It turned out that other fixups are not needed for the Intel processors, up to the Pentium. Other processors
require fixups with built-in data. These so-called data fixups are needed for the 6809 and the DEC Alpha.
A program was added that generates a PostScript file
with the first-byte opcodes for the 8080 as well as the
8086 and the
80386, a so-called quick reference card. Comparing
that to Intel's documentation led to the discovery of one more bug. I had to redesign the opcodes, so other people could have trouble
using this beast without such a reference card and the `SHOW: MOV|SG,' that lists all forms allowed for the move segment instruction.
A book
- Into the House of Logic
- Should Reverse Engineering Be Illegal?
- Reverse Engineering Tools and Concepts
- Approaches to Reverse Engineering
- Methods of the Reverser
- Writing Interactive Disassembler (IDA) Plugins
- Decompiling and Disassembling Software
- Decompilation in Practice: Reversing helpctr.exe
- Automatic, Bulk Auditing for Vulnerabilities
- Writing Your Own Cracking Tools
- Building a Basic Code Coverage Tool
- Conclusion
Most previously published case studies in architecture recovery have been performed on statically linked software
systems. Due to the increase in use of middleware technologies, such as CORBA, and OOP concepts, such as polymorphism, there is an
opportunity and a need to analyze architectures of these dynamically linked systems. This paper presents the results of software
architecture extraction of the Nautilus file manager, which employs CORBA in its implementation. A combination of existing static
analysis and use-case modeling architecture recovery techniques was used with the expectation of complex but complete architecture
extraction of a system such as Nautilus. We have found that this combined approach named Dynamo-1 presented in this paper provided
successful focused architecture recovery and guidance for the future work in complete architecture recovery of dynamically linked
applications.
Keywords: Nautilus, GNOME, program comprehension
One of the best legal papers on the subject. Highly recommended.
Contains a beta version of DisC - Decompiler for TurboC
and a small intro to the problem of decompilation using Intel assembler fragments of small C programs as an example.
Compare to Decompilation of Binary Programs - dcc. See
Decompilation and Decompilers Page
An article on PlanetIT.com
discusses a court ruling that establishes the reverse-engineering of hardware and software as legal, under the "fair use" umbrella.
What ramifications does this have in the industry? Can I reverse-engineer MS Word and write a word processor that can read and save
.DOC files?
Well, that clears that up, then.
(Score:3, Insightful)
by Anoriymous Coward on Sunday February 04, @03:14PM EST (#18)
I'm confused. Possibly so is the author of this article. He seems to imply
that UCITA is a pending piece of federal legislation, rather than state legislation. As it is, UCITA appears to be dead and
buried in most states (hooray!).
He draws a line between the Reimerdes and Connectix cases by quoting that Reimerdes "didn't have a right to the DVD". Did he
steal it? More confusion.
Anyway, it seems the 9th Circuit gets overturned all the time, so I wouldn't get too hopeful about this being a positive sign.
Re:Well, that clears that up, then.
(Score:1)
by edwardames (edwardames at hotmail dot com) on Sunday February 04, @08:57PM EST (#166)
That assertion by the journalist also took me aback for a second. I have
no doubt the software industry would likely try to get legislation through Congress to "correct" a court ruling such as this
one, but that's just my suspicion. UCITA, though it would impact cases like the one in the story, certainly has nothing to
do with the U.S. Congress. UCITA's going through the legislatures, even if it is going slowly.
Despite your opinion of the current status of UCITA, I think that it is far from dead. Take
a look at this map to see where UCITA lobbying activities
are underway. Check out anti-UCITA ucita.com and pro-UCITA
ucitaonline.com. It's still an issue that has to be followed or it'll
take us all by surprise one day, by becoming the law of the land.
Ed
Reverse Engineering file formats
(Score:5, Insightful)
by Pedrito on Sunday February 04, @03:28PM EST (#31)
I reverse engineered quite a few MS file formats (see my out-of-print book
Undocumented Windows File Formats) and never had any hassles from MS regarding the reverse engineering.
In fact, MS tried to hire me to provide them with the specs for one of their file formats. Apparently the author of the code
never documented the file format. MS had released specs for it, but they were completely wrong.
After being told by several friends that MS was notorious for delaying payment with contractors, I asked for half the money
up-front. They refused and I never did the work.
But I digress. I reverse engineered a number of file formats that were "proprietary" Microsoft files. If they're going to go
after anyone for it, surely they would have gone after me since I was publishing them left and right in magazines and my book.
I've figured ever since then that MS must have known that the whole thing about reverse engineering in their licenses must
be unenforceable.
You can also look at all the work Andrew Schulman and Matt Pietrek did reverse engineering Windows code and the PE file format
and neither of them ever got hassled either, as far as I know.
Pete Davis
-- "Suppose you were an idiot. And suppose you were a member of congress. But I repeat myself." - Mark Twain
DeCSS Reverse Engineering? No proof
(Score:3, Informative)
by joneshenry on Sunday February 04, @03:44PM EST (#47)
I urge everyone who thinks that DeCSS was reverse engineering to actually
read materials such as the
transcript of Johansen's testimony. There is simply no evidence that DeCSS was the product of legitimate reverse engineering.
Not just once but twice anonymous information was contributed to crack the problem in a form that does not resemble what one
would get from treating the system as a black box. Johansen testified: "Yes, I believe the CSS authentication had been posted
anonymously in Assembler language on the Internet, and Derek Fawcus had picked that up and rewritten it in C language and posted
it on his website." Note the word "Assembler". Johansen also testified that he was given further information from a complete
stranger on IRC. On the Livid-dev mailing list on Saturday, October 02, 1999 Eric Smith had posted: "The specific issue WRT
the CSS code is that the x86 code was apparently simply ripped out of a working commerical implementation (which was presumably
copyrighted)" to which Derek Fawcus had replied "Well I guess it might have been, but I don't _know_ that." (Fawcus went on
to explain how he had "worked to understand the algorithm underlying the x86 code.") Why the developers didn't run away as
fast as they could once there were questions is something I cannot understand. Didn't anyone learn from previous examples such
as Compaq's reverse engineering of the IBM PC BIOS? Compaq set up their reverse engineering effort so that at every stage they
could prove the source of information using engineers whom they could assert did not have prior exposure to IBM IP.
Drivers (Score:2, Insightful)
by tzoompy on Sunday February 04, @09:19PM EST (#171)
A lot of the *nix drivers are done by reverse engineering. I know of a couple
of Linmodem driver projects that started with a copy of the binary of the corresponding Winmodem. Reverse engineering applied
for the purpose of getting hardware specs out of the driver is OK with most of the driver companies. The Win/Lin modem manufacturers
care mostly about the SP processing algorithms rather than the DSP specs. The problem with reverse engineering for these modems
is that together with the hardware specs, there is sufficient information about what SP algorithm they are using that a sufficiently
knowledgeable person can reverse engineer everything out of their driver.
[Dec 18, 2000] Clipper and FoxPro decompilation
This little 16bit DOS program generates a ".afs" of any ".8bf" file compiled by
the Filter Factory of Adobe Photoshop (PC version only).
[Sept 15, 2000] Java decompilers
The wonderful ferroconcrete world we live in has more lawyers than rats. There are patents underlying the most
obvious software designs (yes, a simple lawsuit showing prior art will defeat three quarters of them, but I for one won't spend my
life savings on them, and companies with pockets that are deep enough prefer not to invalidate competitors' patents for fear of getting
blasted themselves).
Patent issues aside, there's the legal debate about licenses. If we (the Open Source developers) cannot put our
legal squabbles aside (my license is more free than yours -- no, mine is), how would anyone expect big business to put theirs
aside? Besides ego, they've got shareholders to take into account.
I've been mighty impressed with IBM's venture into the Open Source arena. I think they've taken the boldest steps
of all. It's not just half-baked Java stuff (with tremendous investments behind them) or stuff without direct revenue potential (like
jfs, which they couldn't sell as long as competitors think their mouse trap is better). If you search for "IBM Visual Data Explorer"
on www.ibm.com, you'll get a price list with a rather hefty price tag (and if you dig deeper, you'll find an impressive array of
Fortune 500 companies and research institutes that paid those prices and got their money's worth). If you look at
opendx.org, you'll see the same software, free. The stuff is awesome!
Whatever their motivation, I rate IBM highly for its commitment to Open Source. It's a rather stunning move, given
their revenue streams and the fact that they spearheaded the move from free to paid-for software eons ago.
"Our tools help developers understand, document, and maintain impossibly large or complex amounts
of source code."
Reverse engineering is a powerful tool that allows innovators to improve upon someone else's design
or make compatible products.
Or ... reverse engineering is a way for freeloaders to gain unauthorized access to proprietary products
and infringe upon someone's intellectual property rights.
Either way you look at it, keep in mind that some forms of software reverse engineering are restricted
by legal rules of contract and copyright law.
Latest battle
Last month Mattel (MAT) filed a lawsuit against two software programmers,
alleging both copyright and software license violations.
Mattel and its subsidiary, Microsystems Software Inc., filed documents in Massachusetts federal
court claiming that software programmers from Sweden and Canada reverse-engineered its CyberPatrol Internet filtering software. According
to Mattel, the programmers then created a utility known as "cphack.exe" or "CP4break.zip" that allows people to see a list of those
sites considered off-limits by CyberPatrol.
Mattel said that by reverse engineering CyberPatrol, "developing source code and binaries to bypass"
CyberPatrol's protections, and then posting the utilities on the Internet, the programmers had violated Mattel's copyrights and the
terms of the CyberPatrol license.
Last Friday, the court hearing this case agreed that Mattel's claims have merit, and issued a temporary
restraining order prohibiting the distribution of cphack.exe and CP4break.zip.
SecurityFocus.com: The Fine Print in UCITA (Mar 17, 2000)
Cyber Patrol sues codebreakers (the AP story is *wrong*) (Mar 16, 2000)
Wired: Furor Over Virginia E-Biz Law [UCITA] (Mar 16, 2000)
SJ Mercury/AP: Software filter firm sues hackers (Mar 16, 2000)
SJ Mercury: Greed undermines benefits of digital technology (Mar 05, 2000)
ZDNet: Hollywood's war on open source (Feb 28, 2000)
Freshmeat: The Dangers of UCITA (Feb 24, 2000)
osOpinion: Cronus Overthrown: a perspective on CSS and SDMI (Feb 07, 2000)
UPDATED: Richard Stallman -- Why We Must Fight UCITA (Feb 06, 2000)
Arne Flones -- The Digital Millenium Copyright Act: A Corporate Bully Bludgeon (Jan 25, 2000)
Copyright Office: Exemption to Prohibition on Circumvention of Copyright Protection Systems... (Jan 21, 2000)
Linux Journal: Copyright Strikes Back (Nov 23, 1999)
objdump-beautifier is a Perl script to make objdump output more useful. It traces function calls and
jumps, locates string constants, removes leading zeroes, and corrects objdump's annoying habit of making negative numbers positive.
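Two of the clean-ups described here, dropping leading zeroes and showing negative numbers with their sign, can be sketched in Python as follows. This is not taken from objdump-beautifier (which is a Perl script); the function names and the default 32-bit width are assumptions made for the example.

```python
def to_signed(value: int, bits: int = 32) -> int:
    """Interpret an unsigned value as a two's-complement signed integer."""
    mask = (1 << bits) - 1
    value &= mask
    # If the top bit is set, the value represents a negative number.
    return value - (1 << bits) if value & (1 << (bits - 1)) else value


def tidy_hex(text: str, bits: int = 32) -> str:
    """Render a hex literal without leading zeroes and with its sign restored."""
    signed = to_signed(int(text, 16), bits)
    return f"-0x{-signed:x}" if signed < 0 else f"0x{signed:x}"
```

With these helpers an offset that a disassembler prints as `0xfffffff8` would be shown as `-0x8`, which is usually easier to read next to a frame or stack pointer.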
But when it comes to reverse engineering, Nebergall says a prohibition in a shrink-wrap license would probably
be binding under UCITA. In the absence of UCITA, according to software engineer and lawyer Cem Kaner, "No court has ever upheld a
ban on reverse engineering for mass-market software."
Like other provisions in UCITA, a prohibition could be overturned by law. But current law is not too strong. The
flagship copyright law of the decade, the 1998 Digital
Millennium Copyright Act, permits reverse engineering "for achieving interoperability." Sounds good; if Corel wants to create
a word processing product that accepts .DOC files, that's all for the benefit of interoperability, isn't it?
But an aggrieved company could plausibly claim that reverse-engineered products constitute competition, not just
interoperability, so a prohibition on reverse engineering might stand. And as Kaner
points out, reverse engineering has many legitimate purposes
that might be squelched by a shrink-wrap license.
Reverse engineering raises many of the same questions as the
user interface or "look-and-feel" copyright suits ten
years ago. Both issues raise the questions of what is the true intellectual property in software, and how important it is to promote
new innovations or competition in comparison to protecting earlier innovations. Where your sympathies fall will determine whether
you think UCITA is fair.
Design Recovery involves the examination of legacy code in order to reconstruct design decisions taken by the original
implementors. Artifacts in both the source code and in executable images are examined and analyzed. Software tools are necessary
because such systems are often very large and the goal is the understanding of significant global and diffuse features of that code.
In partnership with the IBM Centre for Advanced Studies,
the University of Toronto, the University of Victoria, and McGill
University, the group is developing novel tools and techniques that will leverage the human learning process when applied to the
understanding of legacy source.
A prototype tool (ART for Analysis of Redundancy in Text) based on exact matching of text has successfully identified
useful structural information in a 40 MB source tree and has demonstrated a potential to scale up to 500 MB. Work is underway on
integrating the tools of the research partners to produce a system with a shared repository, visualization tools, and access to a
major commercial tool.
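ART itself is not reproduced here. As a rough sketch of what exact text matching can reveal in a source tree, the following assumes a simplified scheme (not ART's actual algorithm): every window of consecutive non-blank lines is compared exactly after whitespace stripping, and windows that occur more than once are reported. The function name `find_repeated_blocks` and the `window` parameter are hypothetical.

```python
from collections import defaultdict


def find_repeated_blocks(files, window=3):
    """Map each block of `window` consecutive non-blank lines to where it occurs.

    `files` is a dict of {file name: source text}. Lines are compared exactly
    after stripping surrounding whitespace; blank lines are skipped.
    """
    seen = defaultdict(list)
    for name, text in files.items():
        # Keep (line number, stripped text) pairs for non-blank lines only.
        lines = [(i + 1, ln.strip())
                 for i, ln in enumerate(text.splitlines()) if ln.strip()]
        for start in range(len(lines) - window + 1):
            block = tuple(ln for _, ln in lines[start:start + window])
            seen[block].append((name, lines[start][0]))
    # Only blocks that occur more than once are redundancy candidates.
    return {block: places for block, places in seen.items() if len(places) > 1}


# Illustrative input: two tiny files that share a three-line loop.
demo_files = {
    "a.c": "int x;\nfor (i = 0; i < n; i++)\n    sum += a[i];\nreturn sum;\n",
    "b.c": "for (i = 0; i < n; i++)\n    sum += a[i];\nreturn sum;\n",
}
for block, places in find_repeated_blocks(demo_files).items():
    print(places, block)
```

Hashing whole windows keeps the work roughly linear in the number of lines, which is why exact matching is a plausible first pass on large trees before anything smarter is tried.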
Andy's Binary Folding Editor (BE) is primarily designed for structured browsing, although it also provides minimal editing
facilities. This program is designed to take in a set of binary files, and with the aid of an initialisation file, decode and display
the definitions (structures or unions) within them. BE is particularly suited to displaying non-variable length definitions within
the files. This makes examination of known file types easy, and allows rapid and reliable navigation of memory dumps. BE is often
used as the data navigation half of a debugger.
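To illustrate the idea of decoding fixed-length definitions out of a binary file, here is a minimal sketch using Python's standard `struct` module. It does not use BE's initialisation file format; the record layout (`HEADER_FORMAT`) and the field names are invented for the example.

```python
import struct
from collections import namedtuple

# Hypothetical fixed-length record layout, little-endian, no padding:
#   4-byte magic, 2-byte version, 2-byte flags, 4-byte payload length
HEADER_FORMAT = "<4sHHI"
Header = namedtuple("Header", "magic version flags length")


def decode_header(data, offset=0):
    """Decode one header record from `data` starting at `offset`."""
    fields = struct.unpack_from(HEADER_FORMAT, data, offset)
    return Header(*fields)


def dump(record):
    """Render a decoded record as 'name = value' lines, one per field."""
    return "\n".join(f"{name} = {value!r}"
                     for name, value in record._asdict().items())


# Build a sample record in memory instead of reading a real file.
sample = struct.pack(HEADER_FORMAT, b"DEMO", 1, 0x8000, 512)
print(dump(decode_header(sample)))
```

Because the record has a fixed size, the same function can walk an array of records in a memory dump by stepping `offset` in multiples of `struct.calcsize(HEADER_FORMAT)`.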
[July 25, 1999] cgvg
Tools for convenient grepping through code. cgvg is a pair of Perl scripts ("cg" and "vg") which act as wrappers
for find and grep. The main idea is to act as a temporary replacement for cscope until there is a good GPLed cscope available. "cg"
does the grep through code, storing the info in a text file in the user's home directory, and "vg" lets the user open an editor at
the line in the file where a particular match was found. Some features include color highlighting, human-readable output, resizing
to screen width, support for many editors, etc.
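The two-step workflow (search once, then jump to a numbered match) can be sketched in a few lines of Python. This is not the cgvg code: the functions below only borrow the names `cg` and `vg`, keep the matches in memory rather than in a file in the home directory, and return the editor command instead of launching it. The `suffixes` and `editor` parameters are assumptions.

```python
import os
import re


def cg(pattern, root, suffixes=(".c", ".h")):
    """Search files under `root`; return (file, line number, text) matches."""
    regex = re.compile(pattern)
    matches = []
    for dirpath, _dirs, filenames in os.walk(root):
        for filename in sorted(filenames):
            if not filename.endswith(suffixes):
                continue
            path = os.path.join(dirpath, filename)
            with open(path, errors="replace") as handle:
                for lineno, line in enumerate(handle, start=1):
                    if regex.search(line):
                        matches.append((path, lineno, line.rstrip("\n")))
    return matches


def vg(matches, index, editor="vi"):
    """Build the editor command that opens match number `index` (1-based)."""
    path, lineno, _text = matches[index - 1]
    return [editor, f"+{lineno}", path]
```

A caller would print the numbered matches from `cg`, let the user pick one, and pass the list returned by `vg` to `subprocess.run`.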
[May 08, 1999] Converts a program's source code to syntax-highlighted HTML. Stable: 0.6.2; devel: none; license: Freeware.
[July 17, 1999] legdoc is a Perl script to document C source file trees.
It uses tags often found in legacy C code to generate documentation for that code. It can convert either a list of named files or
an entire directory. The documentation is stored in index.html.
Softpanorama Recommended
New:
[Dec 30, 2001] The Law & Economics of Reverse Engineering by Prof. Pamela Samuelson: one of the best legal papers on the subject.
Highly recommended. See also Professor Samuelson
[Dec 28, 2001] Decompilation page and link to a decompiler
by Satish Kumar. Contains a beta version of DisC - Decompiler for
TurboC and a small intro to the problem of decompilation using Intel assembler fragments of small C programs as an example.
See Decompilation and Decompilers Page
Internal
External:
- The Law & Economics of Reverse Engineering by Prof. Pamela Samuelson: one of the best legal papers on the subject. Highly
recommended. [Dec 30, 2001]
- Techniques for Software Renovation
by Michael B. Siff
Software renovation is the process of introducing new features---including polymorphism,
objects, and encapsulation---into existing software systems while preserving the original functionality of the system. The goal of
software renovation is to improve the efficiency of development, maintenance, and comprehension. The research described in this thesis
focuses on three software-renovation techniques:
- Generalization: The identification and subsequent transformation of program components
that operate on a particular type of input into polymorphic program components that operate on a wide array of inputs.
- Modularization: The clustering of associated data types and functions with the intent
of encapsulating the types and function into distinct classes or modules.
- Physical subtyping: The identification of relationships among data types based on the
representation of the types in memory with the intent of generating inheritance hierarchies.
The techniques described in the thesis are aimed particularly at the problem of transforming legacy
C programs into C++ programs that make use of C++'s advanced features---most notably classes, templates, inheritance, and virtual
functions. Some aspects of this work apply specifically to the C-to-C++ problem; however, most aspects apply to almost any language.
(Click here to access
the paper.)
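As a small illustration of the third technique, the sketch below assumes one simplified reading of physical subtyping: a struct is treated as a physical subtype of another when the second struct's field layout is a prefix of the first's. That rule is an assumption made for the example, not the thesis's full definition, and padding and alignment are ignored. The struct descriptions `point` and `color_point` are hypothetical.

```python
def layout(fields):
    """Turn [(name, type name, size in bytes)] into [(offset, type name, size)]."""
    offset, result = 0, []
    for _name, type_name, size in fields:
        result.append((offset, type_name, size))
        offset += size
    return result


def is_physical_subtype(sub, sup):
    """True if the layout of `sup` is a prefix of the layout of `sub`."""
    sub_layout, sup_layout = layout(sub), layout(sup)
    return sub_layout[:len(sup_layout)] == sup_layout


# Hypothetical C structs described as (field name, type name, size in bytes).
point = [("x", "int", 4), ("y", "int", 4)]
color_point = [("x", "int", 4), ("y", "int", 4), ("color", "int", 4)]

print(is_physical_subtype(color_point, point))
print(is_physical_subtype(point, color_point))
```

A pair that passes the check is a candidate for an inheritance relationship in the C++ version, with the shorter struct as the base class.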
- Reports of the theme
Interactive Software Development and Renovation -- several reports. Some titles look very interesting.
- The Problem of Reverse Engineering by Cem Kaner. Published
in Software QA Magazine, 1998
Many people misunderstand the nature of reverse engineering. Those misunderstandings shape corporate policies and
legislation. As I've spoken to working software developers (especially and including testers) about Article 2B, I've been surprised
to discover that we are as likely to misunderstand the breadth and importance of reverse engineering as the lawyers. (Article 2B
is a 273-page proposed revision to the Uniform Commercial Code that will govern all software-related contracts. For more information,
see my website, www.badsoftware.com, or my new book, Bad Software, which has a
detailed appendix on 2B).
Originally, I presented material on reverse engineering at a conference on Article 2B for lawyers at UC Berkeley.
That led to additional talks and to a paper for lawyers (Kaner, 1998). This paper is not about the legal issues of reverse engineering.
And, though I mention Article 2B (as the key current effort to ban reverse engineering), this paper is not about Article 2B. The
reverse engineering debate will go on even if we kill 2B. Rather, I have four objectives with this paper.
- First, to make you aware of a debate whose resolution will affect your work (you probably do reverse engineering
quite often, and you might not like it if that gets banned).
- Second, to suggest some ways that you can articulate your concerns. (Your company might be one of the ones
pushing for a ban on reverse engineering. Maybe you should explain what this would cost them if they succeed.)
- Third, to appeal to you for examples. I wrote this paper out of my own personal experiences. They're good examples,
but there are better ones. A longer collection of good examples might carry a lot of influence.
- And finally, to solicit criticism from you. I'm going to make these arguments again and again and again and
again. If there are holes or unfairnesses, I'd like to know about them. For that reason, even though I have adapted this paper
from the one that I wrote for lawyers (cutting out legal arguments), I've left my descriptions of engineering issues largely intact.
- [June 25, 1999] Collberg's Publications -- good
- Recent Publications of Yih-Farn Robin Chen
- A. Buchsbaum, Y. Chen, H. Huang, E. Koutsofios, J. Mocenigo, A. Rogers, M. Jankowsky, S. Mancoridis,
"Enterprise Navigator: A System for Visualizing and Analyzing Software Infrastructures", to appear in IEEE Software.
- Y. Chen, F. Douglis, H. Huang, K. Vo, "TopBlend:
An Efficient Implementation of HtmlDiff in Java", to appear in the WebNet2000 Conference, San Antonio, Texas, November
2000. Also available as AT&T Labs - Research Technical Report TR00.5.2.
- H. Rao, Y. Chen, M. Chen, "A Proxy-Based Web Archiving
Service", Middleware Symposium, Portland, Oregon, July 2000.
- J. Korn, Y. Chen, E. Koutsofios, "Chava: Reverse
Engineering and Tracking of Java Applets", Proceedings of the Sixth Working Conference on Reverse Engineering,
pp. 314-325, Atlanta, October 1999.
- S. Mancoridis, B.S. Mitchell, Y. Chen, E. Gansner,
"Bunch: A Clustering Tool for the Incremental Maintenance
of the Structure of Software Systems", Proceedings of the 1999 International Conference on Software Maintenance,
Oxford, England, August 1999.
- P. Devanbu, Y-F. Chen, E. Gansner, H. Muller, J. Martin,
"Chime: Customizable Hyperlink Insertion and Maintenance
Engine for Software Engineering Environments", Proceedings of the 21st International Conference on Software Engineering,
pp. 473-482, Los Angeles, May 1999.
- UQCS Cristina Cifuentes' Publications
- Todd Proebsting
- Reverse Engineering the LEGO RCX
- Todd Proebsting: Report on Toba
See also Softpanorama Copyright Links
Etc
DIGITAL Technical Journal - Differential Testing
for Software -- a technique useful in reverse engineering. When your reimplementation is "almost" complete, it can be tested against
the reference implementation by feeding randomly modified test cases to both: differences in behavior can lead to interesting insights. IMHO especially
useful when reimplementing the Microsoft Office suite...
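A minimal sketch of the idea, under stated assumptions: `reference_sort` stands in for the original program and `reimplemented_sort` for the reimplementation (both are hypothetical stand-ins chosen so the example is self-contained), and `mutate` is one simple way to randomly modify a seed test case. Any input on which the two disagree is collected for inspection.

```python
import random


def reference_sort(items):
    """Stand-in for the trusted original implementation."""
    return sorted(items)


def reimplemented_sort(items):
    """Stand-in for the reimplementation under test (insertion sort)."""
    result = []
    for item in items:
        pos = len(result)
        while pos > 0 and result[pos - 1] > item:
            pos -= 1
        result.insert(pos, item)
    return result


def mutate(case, rng):
    """Randomly modify a test case: change, drop, or duplicate one element."""
    case = list(case)
    if not case:
        return [rng.randint(-9, 9)]
    i = rng.randrange(len(case))
    action = rng.choice(["change", "drop", "duplicate"])
    if action == "change":
        case[i] = rng.randint(-9, 9)
    elif action == "drop":
        del case[i]
    else:
        case.insert(i, case[i])
    return case


def differential_test(seed_cases, rounds=200, seed=0):
    """Feed mutated cases to both implementations; return inputs where they differ."""
    rng = random.Random(seed)
    differences = []
    for _ in range(rounds):
        case = mutate(rng.choice(seed_cases), rng)
        if reference_sort(case) != reimplemented_sort(case):
            differences.append(case)
    return differences


print(differential_test([[3, 1, 2], [5, 5, 0], []]))
```

With a real target, the two stand-ins would be replaced by calls that run each program on the same input file and compare their outputs; the fixed `seed` keeps any discrepancy reproducible.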
Application Note -- Software Test Tools Considered Harmful
A failure to synchronize during playback causes the test recording to abort and typically signals that the application has changed
in a way that the test has detected. The problem is that in some cases, when the application hasn't changed, the failed test would
imply that it had incorrectly. Too many such false-negative results would tend to lead to the view that the test suite is unreliable.
Testing Strategies and Methods
-- The Reengineering Forum
is an industry association to encourage combined industry/research review of the state of the art and the state of the practice in
reengineering of software, systems, and business processes. It is a meeting place for key people in the reengineering and reverse
engineering fields: developers, researchers, and leading-edge users.
WCRE: The Working Conference on Reverse Engineering (WCRE) is the premier research conference on the theory
and practice of recovering information from existing software and systems. WCRE explores innovative methods of extracting the many kinds
of information that can be recovered from software, software engineering documents, and systems artifacts, and examines innovative
ways of using this information in system renovation and program understanding. WCRE proceedings are available from IEEE Computer Society
Press (phone +1-714-821-8380 or +1-800-CS-BOOKS). See
CS Press Catalog site.
- May 1993 - Baltimore, Maryland (with the Intl Conf on Software Engineering)
- Jul 1995 - Toronto, Ontario, Canada (with the Intl Workshop on CASE).
WCRE-95 Information
- Nov 1996 - Monterey, California, USA (with the Intl Conf on Software Maintenance).
WCRE '96- Abstracts
- Oct 1997 - Amsterdam, The Netherlands
4th WCRE Information Page
- Oct 1998 - Honolulu, Hawaii, USA (with the Automated Software Engineering Conf)
WCRE '98 - Working Conf on Reverse Engineering
- Oct 1999 - Atlanta, Georgia, USA
6th Working Conference on Reverse Engineering (WCRE '99)
- WCRE '99 - Call For Papers
- Nov 2000 - Brisbane, Queensland, Australia
- Oct 2001 - Stuttgart, Germany
- Neil Aggarwal
- Cornelia Boldyreff
- Giampiero Caprino
- Cristina Cifuentes
- Christian Collberg
- Arie van Deursen
- Jean-Marie Favre
- Bookmarks for Jean-Marie Favre
- J.M. Favre, "A Flexible Approach to Visualize Large Software Products", ICSE Workshop on Software Visualization,
Toronto, Canada, May 2001
- J.M. Favre, F. Duclos, J. Estublier, R. Sanlaville, J.J. Auffret, "Reverse Engineering a Large Component-based
Software Product", European Conf. on Software Maintenance and Reengineering (CSMR'2001), pp. 95-104,
Lisboa, Portugal, March 2001
- J.M. Favre, "Understanding-In-The-Large", International Workshop on Program Comprehension (IWPC'97),
Dearborn (Michigan), May 1997
- Douglas Low
- Todd Proebsting
- Clark Thomborson
- Paul M. Týma
- H. P. Van Vliet (RIP)
Copyright © 1996-2021 by Softpanorama Society. www.softpanorama.org
was initially created as a service to the (now defunct) UN Sustainable Development Networking Programme (SDNP)
without any remuneration. This document is an industrial compilation designed and created exclusively
for educational use and is distributed under the Softpanorama Content License.
Original materials copyright belong
to respective owners. Quotes are made for educational purposes only
in compliance with the fair use doctrine.
FAIR USE NOTICE This site contains
copyrighted material the use of which has not always been specifically
authorized by the copyright owner. We are making such material available
to advance understanding of computer science, IT technology, economic, scientific, and social
issues. We believe this constitutes a 'fair use' of any such
copyrighted material as provided by section 107 of the US Copyright Law according to which
such material can be distributed without profit exclusively for research and educational purposes.
This is a Spartan WHYFF (We Help You For Free)
site written by people for whom English is not a native language. Grammar and spelling errors should
be expected. The site contains some broken links as it develops like a living tree...
Disclaimer:
The statements, views and opinions presented on this web page are those of the author (or
referenced source) and are
not endorsed by, nor do they necessarily reflect, the opinions of the Softpanorama society. We do not warrant the correctness
of the information provided or its fitness for any purpose. The site uses AdSense so you need to be aware of Google privacy policy. If you do not want to be
tracked by Google please disable Javascript for this site. This site is perfectly usable without
Javascript.
Last modified: March 12, 2019