Copyright: Dr. Nikolai Bezroukov 1994-2013. Unpublished notes. Version 0.80.October, 2013
Chapter 5: Macro Viruses
Contents
How to check if the document is in RTF format. Fighting the fake RTF problem
How to convert the document to Rich text format, WordPad as an antivirus tool
What to do if Save As does not work
What to do if saved in RTF format document is not seen in File/Open window
RTF and protection of documents with password
How to distinguish corrupted Ms Word documents from infected
What to do if you cannot open damaged document
Insert damaged document as a file into new document
Open the File by Linking to It
Using Paste Special in MS Word 6.0
How to open damaged Ms Word document in WordPad or Microsoft Write
RTF (Rich Text Format) can be regarded as a proprietary text-based format, somewhat similar to TeX: it encodes text formatting properties, document structure, and document properties using the printable ASCII character set. In this respect it resembles HTML but, like TeX, provides more formatting capabilities (see for example this fragment of RTF specification). It was developed by Microsoft and is used by many other vendors (Lotus, for example).
RTF format does not contain macros and as such is immune to macro viruses. Moreover, if you need to translate an MS Word document to HTML, you will be better off converting it to RTF first and then using the file-insertion feature of an HTML editor (for example, the Inserting Files feature of FrontPage 98). HotMetal Pro also provides this functionality. This usually gives a more reasonable translation than direct conversion in MS Word 97 or MS Word 2000 via the Save to HTML item in the File menu. In MS Word 97 the latter is really unimpressive: for example, headers are not translated to their <h> counterparts. In Office 2000 the output is pretty strange and very verbose, with an immense number of SPAN elements, each about 100 characters long. Actually it looks more like a translation to XML.
Because MS Word has no decent built-in tools that would list macros and check the integrity of the document body, it is recommended to use Rich Text Format (.RTF) instead of native MS Word format in e-mail attachments whenever possible. If a company adopts such a policy, it becomes easy to check attachments and block the offending ones on the e-mail gateway. That does not mean that documents in native MS Word format cannot be sent by e-mail: they just need to be packed with WinZip or a similar archiver. This is a good thing, because the ease of sending such attachments is the root of all evil.
For text documents the size of the file in RTF format will be approximately the same as in native MS Word format (it will be much smaller if Fast Save mode was used for saving the document in native MS Word format). At the same time, MS Word 97 documents that contain pictures (especially large ones) should not be converted to RTF format, as this will substantially increase the size of the document.
Small documents can also be included directly in the text of an e-mail message using the Copy (Ctrl-C) and Paste (Ctrl-V) operations available in Windows. The MS Mail document format is essentially RTF, so it is immune to macro viruses. Netscape Mail uses HTML, which is also immune to macro viruses (although you can lose the formatting of the document in the current version of Netscape Mail).
RTF format has several important side benefits:
Important: There are 2 different things: file extension .RTF and text format RTF (Rich Text Format).
In all five major MS operating systems (DOS, Windows 3.xx, Windows CE, Windows 9x and Windows NT) the file extension and the real file format are quite independent. You can consider this a bug or a feature, but this is how it is in the Microsoft world: a file can have any extension you want. A document with the .RTF extension is immune to macro viruses if and only if the file is really in RTF format. If the file is just a renamed document in native MS Word format, it can be infected like any other MS Word document (the CAP.A virus fools the user into believing that the document is being saved in RTF format, but saves it in native MS Word format in a file with the extension .RTF). So the extension by itself does NOT guarantee the absence of macro viruses. Correct conversion to RTF format in a clean MS Word environment does guarantee the absence of macro viruses in the document.
The reverse is also true: one can save a document in RTF format with the extension .DOC. In MS Word this has an additional benefit: you do not need to change the default file type in the Open dialog to see all documents present in the current directory (in fact, changing it is difficult in MS Word versions before Word 97). Moreover, if a user tries to save a document in MS Word 97 (without Service Pack 1 applied) in the MS Word 6.0/95 compatible format, the document will be saved as an RTF document with the extension .DOC.
RTF file has the following general syntax:
'{' <header> <document>'}'
As one can see, it should start with "{" and end with "}", which provides an easy method of checking the format of the file.
Inside an RTF file there are commands called control words. A control word starts with the backslash "\", should contain only lowercase alphabetic characters (a-z), and cannot be longer than 32 characters:
\controlwords<Delimiter>
Note that a backslash should precede each control word; for example, \rtf1\ansi are the two control words that usually start an RTF file. The delimiter can be a space; a digit or a hyphen, which indicates that a numeric parameter follows (the subsequent digit sequence is then delimited by a space or by any character other than a letter or a digit); or any character other than a letter or a digit. Control symbols consist of a backslash followed by a single nonalphabetic character. For example, \~ represents a nonbreaking space. They do not need any delimiters.
A group consists of text and control words or control symbols enclosed in curly braces "{" and "}". Each group specifies the text affected by the group and the different attributes of that text.
Formatting specified within a group affects only the text within that group. Generally, text within a group inherits the formatting of the text in the preceding group, with the exception of footnote, annotation, header, and footer groups.
The control properties of certain control words have only two states. When such a control word has no parameter or has a nonzero parameter, it is assumed that the control word turns on the property. When such a control word has a parameter of 0, it is assumed that the control word turns off the property. For example, \b turns on bold, whereas \b0 turns off bold.
The control words, control symbols, and braces constitute control information. All other characters in the file are plain text.
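The syntax described above can be illustrated with a small tokenizer. This is only a rough sketch in Python, not a full RTF reader: the names TOKEN and tokenize are made up for this illustration, and special cases of the real specification (hex escapes, binary data, uppercase control words) are ignored.

```python
import re

# One alternative per kind of token described above (a simplification of RTF).
TOKEN = re.compile(
    r"\\(?P<word>[a-z]{1,32})(?P<param>-?\d+)? ?"  # control word, optional number, optional space delimiter
    r"|\\(?P<symbol>[^a-z])"                       # control symbol: backslash plus one non-letter
    r"|(?P<brace>[{}])"                            # start or end of a group
    r"|(?P<text>[^\\{}]+)"                         # everything else is plain text
)

def tokenize(rtf):
    """Split an RTF string into (kind, value) pairs."""
    tokens = []
    for match in TOKEN.finditer(rtf):
        if match.group("word"):
            param = match.group("param")
            number = int(param) if param is not None else None
            tokens.append(("word", (match.group("word"), number)))
        elif match.group("symbol"):
            tokens.append(("symbol", match.group("symbol")))
        elif match.group("brace"):
            tokens.append(("brace", match.group("brace")))
        else:
            tokens.append(("text", match.group("text")))
    return tokens

sample = r"{\rtf1\ansi\b bold\b0 plain\~text}"
for kind, value in tokenize(sample):
    print(kind, value)
```

In the sample, \b arrives with no parameter (bold on) and \b0 with the parameter 0 (bold off), while the space after each of them is consumed as the delimiter and does not become part of the text.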
As we can see from the discussion above, any RTF document should start with the sequence "{\". These two symbols represent the typemark of an RTF document in the same way as MZ (or ZM) represents the typemark of executables in EXE format. To test whether a document is in RTF format, open it in any plain-vanilla text editor (or viewer, if you use an Orthodox File Manager), for example in Notepad. The following 4 steps do the trick:
If the document is really large (Notepad will NOT open large documents), one can use Write instead of Notepad. Type the following command at the DOS prompt:
WRITE filename
It will prompt you about conversion; answer "NO". The check is the same as before. "Real" RTF looks like an ANSI text document and will start with:
{\rtf1\ansi\... ... ...
Native MS Word format is binary; it will start with some bizarre symbols and will contain a lot of blanks (unprintable characters):
ÐÏࡱá ; þÿ þÿÿÿ
Note: This test should always be performed if the document is sent as an attachment to a large group of people.
It's easy to write a Perl script that will check all documents with the extension .RTF for the standard beginning.
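Such a check can be sketched as follows; Python is used here instead of Perl, and the function names classify and check_directory are invented for this example. The sketch looks for the stricter prefix {\rtf shown above rather than just "{\", and it also recognizes the eight-byte signature D0 CF 11 E0 A1 B1 1A E1 of OLE2 compound files, which is what the "bizarre symbols" at the start of a native MS Word 6.0/95/97 document are. Treat it as an illustration, not a replacement for a virus scanner.

```python
from pathlib import Path

RTF_MAGIC = b"{\\rtf"
# Signature of OLE2 compound files, the container of native Word 6.0/95/97 documents.
OLE_MAGIC = bytes.fromhex("D0CF11E0A1B11AE1")

def classify(data):
    """Classify the first bytes of a file as 'rtf', 'word-binary' or 'unknown'."""
    if data.startswith(RTF_MAGIC):
        return "rtf"
    if data.startswith(OLE_MAGIC):
        return "word-binary"
    return "unknown"

def check_directory(directory):
    """Return (file name, verdict) for every file with the extension .RTF."""
    results = []
    for path in sorted(Path(directory).iterdir()):
        if path.is_file() and path.suffix.lower() == ".rtf":
            with path.open("rb") as handle:
                # Eight bytes are enough for both signatures.
                results.append((path.name, classify(handle.read(8))))
    return results
```

Calling check_directory(".") lists the .RTF files of the current directory; any file whose verdict is not "rtf" is a fake RTF and deserves a closer look.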
To save a document from MS Word in RTF format, open the File menu, select the Save As option, and click the arrow in the right corner of the Save as type box. You will see the list of formats that MS Word "understands". Scroll down a little and select Rich Text Format (.RTF), then click the Save button. Please make sure that the Tools|Macro menu produces a list of macros. If it does not, you may be infected with CAP.A or a similar virus and should disinfect the document using F-macro or another available tool.
Actually the best tool for conversion from native MS Word format to RTF is WordPad. This utility can be called the best anti-macro-virus tool, and it is more reliable than any AV program. WordPad does not interpret any macros and as such is a much more suitable tool for conversion of infected documents than MS Word.
ATTENTION: If you do not select Rich Text Format (.RTF) in the Save File As Type option but just change the extension to .RTF, you will save the document in regular MS Word format (DOC format) with the extension .RTF. The document will remain infected, and the fact that .RTF files are infected can produce a lot of confusion in troubleshooting the problem. This is a common mistake during conversion.
You will be able to open an attachment in .RTF format as a regular MS Word document, but you will need to specify this format in the Files of type option of the File/Open menu.
The most common mistake is to rename the file by typing the extension .RTF in the Save As dialog instead of converting it. If the user forgets to change the type to Rich Text Format (.RTF) in the Save File As Type menu and just changes the extension, the document is saved in regular MS Word format (DOC format) under the extension .RTF and remains infected. Some viruses, including CAP.A and NPad, produce the same effect.
In most cases, if a document was infected with a macro virus, the standard template normal.dot will be infected too. The best way to disinfect the normal.dot template is to restore it from backup, which means that you should have a backup of this file.
If it does not contain any useful customizations, it can simply be deleted; MS Word will recreate it automatically. This is disinfection of last resort ;-). It is also possible to delete macros from the normal.dot template, but this requires some experience in working with templates and will not be covered in this document.
If the document is infected or is marked as a "template", you will not be able to save it directly as a .DOC or .RTF file via the Save As option of the File menu. In this case open a new document, switch to the previous window via the Window menu, select the whole document via Edit/Select All, copy it with Ctrl-C (or Edit/Copy), switch back to the new document via the Window menu, and paste the content with Ctrl-V (or Edit/Paste). Now you will be able to save the document as Rich Text Format.
By default, in MS Word 6.0 and MS Word 7.0 (Word 95) the File/Open window shows only documents with the extension .DOC. So one needs to change the Files of type menu to All files or Rich Text Format to see documents with the extension .RTF.
The second possible reason is that the directory in which the document was saved is different from the current directory.
The RTF format is not a panacea. It is good for text documents, but documents with pictures substantially increase in size after conversion. In such cases it can be used as a temporary format just for disinfection; after disinfection the document can be converted back to native format.
At the same time, there is no anti-virus tool that is 100% proof against every possible form of macro virus. That simple fact underlines the importance of RTF in a corporate environment: a change of format is the only reliable alternative to complex "on the fly" macro virus protection software like Macro Virus Track, which can create more problems than it solves. Of course RTF is not the only format immune to macro viruses. There are other possibilities. Basically there are 2 alternatives to RTF format:
As for the first alternative, the most attractive is to use HTML instead of RTF. HTML is definitely the future. But currently HTML is limited, and not all MS Word documents can be converted to HTML without loss of formatting. HTML can contain JavaScript macros, but currently there are no macro viruses written in JavaScript.
The second is to use Adobe Acrobat format. I am not sure that it is completely immune to macro viruses, but MS Word viruses will not survive the conversion. That’s for sure. This format is not supported by current versions of MS Word, so one needs third-party utilities to perform the conversion.
Yet another alternative is to use a different word processor (for example, Word Perfect) if it is installed on the particular PC. Theoretically, for home users Corel Office with Word Perfect is a much more cost-effective solution than Microsoft Office. For heavy Lotus Notes users Lotus SmartSuite could also be an attractive choice. The XML-compliant, Office 97 compatible SmartSuite Millennium Edition is a $149 suite that competes in functionality with MS Office and at the same time has a much better price (see Softpanorama bulletin vol.10, No.3 for details).
SmartSuite Millennium Edition consists of five important applications: WordPro, 1-2-3, Approach, Organizer and Freelance. Speech recognition technology is supported for 1-2-3 and WordPro. It also includes data sharing capabilities with eSuite Workplace, a Java mini-suite from Lotus.
The most interesting is FastSite, a new Web publishing application that allows end users to take a set of files created in Word, a spreadsheet, or other desktop applications, convert them all to HTML, and upload them to a Web page.
A new capability in 1-2-3 is the ability to access and present Web data in a spreadsheet. For example, a user can take stock quotes, put them in the spreadsheet, and link them to a Yahoo business site.
SmartSuite Millennium Edition is compatible with Microsoft Office 97 applications and is XML, as well as HTML, compliant. It also features connection capabilities for enterprise applications developed by SAP and PeopleSoft.
Upgrades from previous versions or from competitive products are $149. There is a chance that SmartSuite will be available for Linux too.
But MS Word is the dominant word processor now, and there are important advantages in using dominant software, even if it costs more.
MS Word password protection is breakable and is not recommended for highly sensitive information. It also works only for native MS Word format, so a user who wants this feature needs to use native MS Word format.
But it is very easy to protect an RTF document with PGP and/or PKZIP. Both provide a pretty decent level of protection (PGP in particular provides military-strength protection).
PKZIP is the simplest to use but does not provide strong protection. The main advantage of using PKZIP is that it cuts transmission time via modem in half or more, so it can be recommended to field personnel who communicate via modems. I still use an older version of pkzip (2.04g), but the current version of the command-line PKZIP utility is 2.5x. It is available for DOS and all flavors of Windows (3.11, 95, NT). You can download it from almost any file archive on the Internet, including SimTel and its mirror sites and www.cdrom.com and its mirror sites. It is also available directly from the PKZIP® Command Line for Windows 95/NT site.
To protect a document using PKZIP, one needs to run the following command (the option -s may be different in pkzip 2.5x):
pkzip -s <name_of_ZIP_file> <names_of_the_document(s)_in_RTF_format>
For example, to create the file web_abusers.zip from the MS Word document web_abusers.doc, one needs to execute the command line:
pkzip -s web_abusers.zip web_abusers.doc
PKZIP also has the advantage that you can protect as many documents as you wish in one archive. For example, to store all documents from the current directory in the zip file confidential.zip, one needs to execute the command line:
pkzip -s confidential.zip *.*
RTF preserves static embedded objects like pictures, tables, etc. Dynamic embedded objects like Excel spreadsheets are not preserved; if they are needed, native MS Word format should be used.
Damaged documents can cause MS Word to exhibit unusual behavior. Such behavior occurs because the program attempts to make decisions based on incorrect information. Damaged documents often exhibit behavior like infinite repagination, incorrect document layout and formatting, unreadable characters on the screen, error messages during processing, and system hangs or crashes when you load or view the file. In that sense the behavior is close to that of infected documents.
NOTE: A document is often corrupted when it was saved in Word for Windows with the Allow Fast Saves check box selected. In this case the text is stored in noncontiguous blocks. The Fast Save feature keeps track of the changes that you make by appending them to the end of your document and remembering where these changes go. On slow computers this method is faster and takes less memory than saving the entire document, but it can lead to additional problems. It is strongly recommended to turn off this feature of MS Word via Tools/Options/Save.
To rule out macro virus infection, use the following steps:
First try to save the file in RTF format; this format preserves the formatting of your Microsoft Word for Windows document but deletes all macros. After you save the file in RTF format, re-open the document in Word for Windows and convert it back to native DOC format. If this method succeeds, the file corruption was removed during conversion. If the problem persists, rename NORMAL.DOT and any templates associated with the document or stored in the STARTUP directory (NORMAL.DOT could be infected), and then repeat the conversion to RTF.
If conversion to RTF fails but you can view at least the beginning of the document, you can try to copy that part of the corrupted document to a new document. Then save the rest of the file in Text Only With Line Breaks format and recover the rest of the file from the text document by reapplying formatting:
Other things to try:
There are several techniques you can use to try to open a document that will not open. Which method you use depends on the nature and severity of the damage to your document and on the behavior exhibited. Any MS Word document can be recovered with some (sometimes total) loss of formatting and macros: you will usually be able to recover most of the text of the document, but formatting will usually be lost. You have several options:
MS Word 97 has a special text recovery tool, the "Recover Text From Any File" converter (see below). So it is recommended to use MS Word 97 for recovery of documents in Word 6.0 format too.
Sometimes you can open a document successfully in draft mode when it will not open in other views. Once you open the file, you may be able to recover or repair it. To do this, open the View menu and click Normal. On the Tools menu, click Options, select the View tab, and select the Draft Font option.
You can also use the following macro to turn off screen updating, open your damaged document, switch to draft mode, and then reactivate screen updating:
Sub Main
    ScreenUpdating 0                            ' turn screen updating off
    FileOpen .Name = "<path>\Filename.doc"      ' path and name for damaged document
    ToolsOptionsView .DraftFont = 1             ' switch to draft font
    ScreenUpdating                              ' turn screen updating back on
End Sub
Using this macro may enable you to open documents that you cannot otherwise open due to damage that affects printer setup, page layout, or screen updates in Word. For example, if a general protection (GP) fault occurs in Word before the document is opened, you may be able to avoid it using the above macro.
The second method is to insert the damaged document as a file into a new document. The final paragraph mark in a Word document contains information about the document. If the document is damaged, you may be able to retrieve its text if you can omit this final paragraph mark. To access a document but leave its final paragraph mark behind, use the following steps:
1. Create a new blank document.
2. On the Insert menu, click File.
3. In the Insert File dialog box, locate and select the damaged document, and click OK.
4. Strip out the File Header Information in MS DOS mode
You may need to reapply some section formatting to the last section of the document.
This method works for MS Word 7.0 (Word 95). It does not work for MS Word 6.0. When you create a link, part of the header information of the corrupted document is not read. So this method allows you to open the file if that part of the header or the final paragraph mark lies in the damaged area of the document.
Use the following steps:
When you cannot open a damaged document in MS Word (usually because of corruption in the file header), you can strip out the file header and open the file in WordPad or Write. Formatting will be lost, but you will be able to recover the text.
To recover text from the corrupted document, you can strip the header information in MS-DOS mode:
copy con+FILENAME.DOC NEWNAME.DOC
where "FILENAME" is the name of the damaged file, and "NEWNAME" is the name of the new file. (This causes the word "CON" to appear and the insertion point to blink on a blank line.) Then press the SPACEBAR twelve times. Press F6, and then press ENTER.
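For readers without a DOS prompt at hand, the same concatenation (twelve spaces followed by the contents of the damaged file) can be sketched in Python. This is only an illustration under stated assumptions: the function name prepend_spaces and the file names in the usage note are hypothetical, and the script copies the file byte for byte, which may differ from the way the COPY command treats end-of-file characters.

```python
from pathlib import Path

def prepend_spaces(source, target, count=12):
    """Write `count` spaces followed by every byte of `source` into `target`.

    Returns the size of the new file in bytes. The original file is not modified.
    """
    data = Path(source).read_bytes()
    Path(target).write_bytes(b" " * count + data)
    return len(data) + count
```

For example, prepend_spaces("FILENAME.DOC", "NEWNAME.DOC") produces a new file that can then be opened in WordPad or Write as described above.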
Word 97 has a special Recover Text From Any File converter. It allows you to extract the text from any file; the file does not have to be a Word file. The converter does have its limitations. Document formatting will be lost, along with anything that is not text: graphics, fields, drawing objects, and so on will not be converted. However, headers, footers, footnotes, endnotes, and field text will be retained as simple text. To use the converter on a Word file:
1. On the File menu, click Open, and select the document.
2. In the Files of Type box, select Recover Text from Any File, and click Open.
To use the converter on any non-Word file:
MS Word 97 can automatically recover a damaged file. A message will be displayed when the corrupted document is opened. If you click No, Word will not attempt to recover the document. If you click Yes, Word will attempt to recover the document. If Word is successful in recovering the file, save the document as a normal Word document.
If "Recover Text from Any File" does not appear in the Convert File dialog box, then the Recover Text From Any File converter is not installed. You will need to re-run Setup to install this converter.
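When the converter is not available, the general idea of pulling text out of an arbitrary file can be approximated with a few lines of Python. This is a crude sketch and an assumption on my part, not a description of how the Word converter works internally: the function name extract_text is invented, only runs of printable ASCII characters are collected, and text stored in any other encoding will be missed.

```python
import re

def extract_text(data, min_len=4):
    """Return runs of at least `min_len` printable ASCII characters found in `data`."""
    pattern = re.compile(rb"[\x20-\x7e]{%d,}" % min_len)
    return [run.decode("ascii") for run in pattern.findall(data)]
```

A typical use would be extract_text(open("damaged.doc", "rb").read()), where damaged.doc is a placeholder name; the result is a list of text fragments with all formatting lost, much like the output of the converter described above.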
Rich Text Format (RTF) Specification and Sample RTF Reader Program
Tip 118 Converting a Word for Windows Document to RTF Format
Last modified: March 12, 2019