Q&A: Archiving Web Pages for Future Reference


“I am in the process of converting to a less paper-driven office. We sometimes need to verify information on Web pages, and we have typically copied it and placed it in the hard file. I cannot seem to save a page to a file on the computer and view it after the fact. Any suggestions? I have tried copy and paste and ‘send to’.”


“I’ve been saving Web pages as files or emailing them to myself for 10 years or so, ever since I first got Internet access. If neither procedure works, you must have a configuration problem. With some Web sites that use frames, a ‘save as’ or ‘email to’ command may capture only a blank frame; in that case, I’d use the copy and paste method described below, which is pretty well guaranteed to work.”

“When you use the ‘save as’ command, Internet Explorer gives you the option of saving the whole page, just the HTML, or just plain text. If you choose the whole page, you end up with an HTML file plus a folder of supporting files, such as the images that appeared on the page. If the information includes pictures and other non-text items, you may need that; otherwise, I just choose the HTML-only option. Plain text loses the formatting, which can make the article harder to understand, and really doesn’t reduce the file size by much. Other browsers like Firefox offer the same choices, although they may have slightly different names.”
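
For readers comfortable with a little scripting, the difference between the HTML-only and plain-text options can be illustrated with a rough Python sketch. It is only an approximation of what a browser's "text only" save might do, not a description of how Internet Explorer or Firefox actually work; the names TextExtractor and html_to_text are made up for this example.

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collects visible text and ignores script and style blocks."""

    def __init__(self):
        super().__init__()
        self.parts = []
        self._skip_depth = 0  # greater than 0 while inside <script> or <style>

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Keep only non-empty text that is not inside a skipped block.
        if self._skip_depth == 0 and data.strip():
            self.parts.append(data.strip())


def html_to_text(html):
    """Return the text of an HTML string, one fragment per line."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return "\n".join(parser.parts)
```

Running html_to_text on a saved page shows what is lost: headings, bold text, and links all collapse into plain lines, which is why the HTML-only option is usually the better compromise.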

“The method I use most of the time now is to email the information to myself (and to colleagues or clients at the same time), and then file my copy of the email in an appropriate place. There are two ways to do that. The first is the command in the File menu called ‘send page’ or ‘send page by email’. Insert your own address if it’s not there by default (as it would be if you send yourself blind copies of all your messages), and choose a subject line that will make it easy to find the message when you need it. The second way is to highlight the information on the Web page, press Ctrl+C to copy it, and paste it into a blank message. That has the advantage of avoiding all the ads, unrelated links, etc. that may be on the page. With either method, you can make a note to yourself at the beginning of the email about why you’re saving the information, and if you’re using the copy and paste method, you can combine information from several pages in a single message.”
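
The layout of such a message (a note at the top, then excerpts from one or more pages) can be sketched in Python with the standard email library. This is an illustration only: build_archive_email is a hypothetical helper, the addresses and URLs in any call are placeholders, and the function builds the message without sending it.

```python
from email.message import EmailMessage


def build_archive_email(address, subject, note, excerpts):
    """Build (but do not send) a message addressed to yourself.

    excerpts is a list of (url, text) pairs copied from Web pages.
    """
    msg = EmailMessage()
    msg["From"] = address
    msg["To"] = address
    msg["Subject"] = subject  # pick something easy to search for later

    # The note explaining why the material was saved goes first.
    sections = [f"NOTE: {note}"]
    for url, text in excerpts:
        # Record where each excerpt came from alongside its content.
        sections.append(f"Source: {url}\n{text}")

    msg.set_content("\n\n".join(sections))
    return msg
```

Keeping the source address next to each excerpt means the message still shows where the text came from even after the original page has moved or disappeared.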

“One final tip: the little ‘print’ link that appears on most Web stories is invaluable. Its main function is to generate a much cleaner version of the article with most of the ads removed. A long article is often broken into several smaller pieces on the main page; you’ve seen items where you have to keep clicking ‘next’ to get through them. If you’re working from the normal page, you’d have to email or save each part of the article separately. But if you use the ‘print’ option, you’ll almost always get the entire article at once, including any pictures or other graphics that are actually part of the story.”

“There are other ways to archive information: printing the pages to a PDF driver, emailing just the link rather than the whole story, or using a service like Digg or del.icio.us. If you need to use the material as evidence, the PDF version might be preferable. I avoid anything that relies on sending just the link to the page rather than its contents, because links stop working when the article is taken off the Web site or moved to the site’s archives where a password is required. If you’ve saved the article as a file or emailed its content to yourself, you’ll have it no matter what the Web site decides to do with it. For that reason, I don’t use the ‘email’ icon that often appears alongside the ‘print’ link on a page. While it may sound like just what’s needed, it usually sends only the link to the page.”
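
The point about keeping the contents rather than the link can also be sketched in Python. The example below assumes the page's HTML is already in hand as a string, and writes it to a dated file together with a small companion text file recording the address and retrieval time. The function name archive_page, the file naming scheme, and the companion file are all choices made for this illustration, not part of the answer above.

```python
from datetime import datetime, timezone
from pathlib import Path


def archive_page(html, url, folder, retrieved=None):
    """Save page contents plus a companion file noting the source.

    retrieved is assumed to be a UTC datetime; it defaults to now.
    Returns the paths of the saved page and the companion file.
    """
    retrieved = retrieved or datetime.now(timezone.utc)
    folder = Path(folder)
    folder.mkdir(parents=True, exist_ok=True)

    # A timestamp in the file name keeps repeated saves from colliding.
    stamp = retrieved.strftime("%Y%m%dT%H%M%SZ")
    page_path = folder / f"page-{stamp}.html"
    meta_path = folder / f"page-{stamp}.txt"

    # The page itself is stored unchanged.
    page_path.write_text(html, encoding="utf-8")
    # The companion file records where and when the copy was made.
    meta_path.write_text(
        f"Source: {url}\nRetrieved: {retrieved.isoformat()}\n",
        encoding="utf-8",
    )
    return page_path, meta_path
```

Because the copy lives in your own folder, it remains readable even if the original address later stops working or moves behind a password.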

James Sayre
Community Legal Assistance Society
Suite 300, 1140 West Pender Street
Vancouver, B.C. V6E 4G1
_____________________________
Source: The Technolawyer Community: Answers to Questions, April 3, 2008
http://www.technolawyer.com
