Web Archiving Whitepaper

Web Archiving: The Next Phase in the Evolution of Archiving An Osterman Research White Paper Published November 2010 SPONSORED BY Osterman Research, Inc. • P.O. Box 1058 • Black Diamond, Washington 98010-1058 Tel: +1 253 630 5839 Fax: +1 253 458 0934 [email protected] www.ostermanresearch.com Twitter: @mosterman

©2010 Osterman Research, Inc. 1

Executive Summary

OVERVIEW

The web has become the primary communication and commerce channel for businesses and government agencies. Digital media (web sites and other web-based content) has all but replaced print media as the primary mode of communication with customers, constituents, prospects, investors and others. The web is also becoming the primary channel for transacting business, managing commerce for everything from online purchases to tax payments.

However, businesses and governments do not yet understand that they are liable for everything they publish online. Organizations that do not archive web content run the risk of not preserving a record of their claims, offers and other content posted on their web sites. Retaining this content has become both a legal and regulatory requirement, and so the question is not if web content should be retained, but only how much and for how long it should be preserved.

Web archiving has been going on for quite some time, but enterprise-class solutions have only recently become available. New, state-of-the-art technology is now available to manage web archiving and it has the power and flexibility to meet existing and emerging web archiving requirements. As a result, any organization that uses the web to communicate or manage commerce should consider developing a web archiving policy and deploying the appropriate technology to support that policy.

KEY TAKEAWAYS

The fundamental message of this white paper is:

• Web archiving is, without question, a best practice for virtually any organization. Organizations that do not archive web content are placing themselves at unnecessary risk from both a legal and regulatory viewpoint, and they are denying themselves the use of capabilities that can provide a distinct competitive advantage.

• Web archiving is fundamentally identical to what many organizations have already implemented in the context of email archiving, file archiving and long-term retention of other types of important business content. In essence, web archiving is merely a superset of traditional types of archiving that are already well established in business and government.

• Many current web archiving technologies are not designed with enterprise-class capabilities that will retain web content of evidentiary value.

• Organizations should consider developing a web archiving policy, particularly as more content migrates to the web and web-based applications.


ABOUT THIS WHITE PAPER

This white paper discusses the importance and benefits of web archiving and various use cases for it. It also briefly discusses the sponsor of this white paper and their relevant offerings in the space.

Why the Web Represents the Next Phase of Archiving

WHAT IS WEB ARCHIVING?

Web archiving is what its name implies: the capture and archival storage of web-based content. This can include individual web pages, entire web sites, content from web 2.0 applications like social networking sites, and other web-based content that is important to capture and retain, normally for long periods.

The concept of web archiving is not new. For example, the Wayback Machine – a web archiving service maintained by the non-profit organization Internet Archive based in San Francisco, California – has been archiving web content since 1996 [i]. However, the Wayback Machine has several limitations for use in a business context:

• Web content is captured only intermittently, not on a regular basis. This can prevent the capture of a large proportion of web content, particularly for sites that update content frequently. Further, changes to a web page or web site may not be captured if the change occurs between content “snapshots”, the frequency of which is determined by Internet Archive.

• There is no guarantee that all web content will be captured.

• Web content is not necessarily captured in a way that will satisfy evidentiary rules during legal or regulatory proceedings.

As a result, while the Wayback Machine is a good first step toward archiving web content, more sophisticated – and enterprise-class – web archiving is becoming a necessity for a growing number of applications, as discussed below.

WHAT DRIVES THE NEED FOR WEB ARCHIVING?

Many of the drivers for web archiving are fundamentally the same as those for email and other electronic content archiving:

• Web content can be required for e-discovery and other litigation support requirements in much the same way that emails, word processing files, PDF files and other content are required.

• Similarly, web content can be required to demonstrate an organization’s compliance (or lack thereof) with regulatory requirements in the context of advertising, forward-looking statements, claims of suitability and other content that must – or must not – be posted to web sites.


• Many organizations have a requirement, often driven by a need to reduce risk or maintain adequate records, to preserve web site content as part of their overall records retention and records management strategy.

• Unlike more traditional forms of archiving, web archiving can actually be used as a competitive and/or investigative tool to understand content posted on competitors’ web sites.

WEB ARCHIVING vs. SERVER BACKUPS

There are some significant differences between web backups and web archives:

• Although both a backup and an archive of a web site can reproduce content at a later date for forensic, e-discovery or data mining purposes, a web archive will do so more quickly, more affordably and more easily.

• Because of the ubiquity of database-driven web sites, a backup must retain archives of all of the files, as well as all of the databases that control the web site.

• Searching through backups of a web site is much more difficult and more time-consuming than searching through an archive.

WEB ARCHIVING: THE NEXT STEP

Web archiving can rightly be considered the next logical extension of an organization’s traditional archiving of email, files and other electronic content. While email and other types of electronic content archiving tend to focus on internal content – emails sent to and from employees and business, word processing files and presentations created for internal uses, and so forth – web archiving tends to focus much more on publicly available content. Because the web – including static web sites, web applications, social networking content, etc. – is primarily public-facing in nature, web archiving focuses primarily on content that the public has already seen or has had the opportunity to see.

As a result, web archiving is focused to a greater degree than traditional electronic content archiving on issues like brand protection; reputation management; policy enforcement; protection of content based on when it is created, posted and taken down; business continuity and corporate memory.

Archiving Is Already an Established Best Practice

THE WEB IS GROWING RAPIDLY

The amount of content on the web has ballooned exponentially in recent years. For example, as of December 2009, there were 234 million web sites, 47 million of which were added just in 2009 [ii], an average of nearly 129,000 web sites added every day. Further, even as far back as 2008 there were well in excess of one trillion unique URLs on the web and the number continues to grow at a rapid pace.

Growth of the web is being driven by a number of factors, including the ubiquity of web access, the ease and low cost with which content can be published and updated, and greater cultural acceptance of the web as a medium of information-sharing and commerce. For these reasons, both business and government are increasingly reliant on the web as their primary means of communications and process management.

Consequently, the market for web archiving – as well as archiving of email, files, SharePoint content and other information – is growing at a healthy pace. Web archiving, currently a small segment of the total content archiving market, is poised to become an enormous area of growth, driven by the issues discussed in this white paper.

GROWTH IN THE MARKET IS DRIVEN BY A VARIETY OF FACTORS

For just about any company, government agency or educational institution, there are four primary drivers for archiving their electronic content. However, the importance of these drivers will vary by an organization’s size, the industry(ies) in which it participates, the advice of its internal and external legal counsel or compliance officers, and the locales in which it operates:

• Driver #1: Litigation

Electronic content stores, including web sites, contain a growing proportion of business records that must be preserved for long periods of time. Further, this content is frequently requested during discovery proceedings because of the Federal Rules of Civil Procedure (FRCP) and state versions of the FRCP. As a result, it is critical that all relevant electronic content be made available for e-discovery purposes.

Further, when a hold on data is required, it is imperative that an organization immediately be able to begin preserving all relevant data. For example, if a dispute arises because of a claim made on a page of a company’s web site, that content must be preserved for as long as a court, regulator or other authorized entity may deem necessary. An enterprise-class web archiving system allows organizations to immediately place a hold on data when requested by a court or on the advice of legal counsel.

If an organization is not able to adequately place a hold on data when it is obligated to do so, it can suffer a variety of serious consequences, ranging from embarrassment to major legal sanctions or heavy fines. Litigants that fail to preserve electronic content properly are subject to a wide variety of consequences, including brand damage, additional costs for third parties to review or search for data, court sanctions, directed verdicts or instructions to a jury that it can view a defendant’s failure to produce data as evidence of culpability.

In addition to the e-discovery and legal hold benefits, an enterprise-class web archiving system allows an organization to perform either formal or informal early case assessment activities. For example, if a customer makes a claim against a company based on a statement made on the company’s web site, senior managers can search the archive for information that will help them determine the potential liability they face.

If this assessment of the potential lawsuit results in a determination that the company was indeed wrong in making the claim, they can instruct legal counsel to pursue a quick legal settlement. If, on the other hand, the assessment results in the discovery of information that supports the company’s position, that information can be used to convince the customer to drop the case or it can help win the case if it goes to trial. In either case, an archiving system can help the organization to understand its position early on, either avoiding unnecessary legal fees or an adverse judgment, or reducing its costs by proving the sufficiency of its case.

• Driver #2: Regulatory Compliance

For just about every organization, there are a large and growing number of regulatory obligations to preserve electronic content. Some of the more important requirements are:

o Sarbanes-Oxley Act of 2002

The Sarbanes-Oxley Act of 2002 requires all public companies and their auditors to retain such relevant records as audit workpapers, memoranda, correspondence and electronic records for a period of seven years. Further, Section 403 of Sarbanes-Oxley amended Section 16 of the Securities Exchange Act of 1934 to include a requirement for public companies to post certain types of content on their web sites. Under Sarbanes-Oxley, company officers are obliged to report internal controls and procedures for financial reporting and auditors are required to test the internal control structures. Businesses have to ensure that information is preserved – whether paper or electronic – that would be relevant to the company’s financial reporting.

o Health Insurance Portability and Accountability Act of 1996 (HIPAA)

All organizations operating in the healthcare field need to comply with HIPAA to ensure the safety of Protected Health Information. Organizations are required to protect the data from unauthorized users, as well as to retain for six years a broad range of documentation regarding their compliance.

As part of the American Recovery and Reinvestment Act of 2009 (ARRA), the provisions of HIPAA have been significantly expanded. A key component of ARRA is the Health Information Technology for Economic and Clinical Health Act (HITECH). Now, business partners of entities already covered by HIPAA, such as pharmacies, healthcare providers and others, are required to comply with HIPAA provisions. This includes attorneys, accounting firms, external billing companies and others that do business with covered entities. While these business associates were accountable to the covered entities with which they did business under the old HIPAA, these associates are now liable for governmental penalties under the new law.

HIPAA violations have been expanded dramatically. For example, if a covered entity or one of its business associates loses 500 or more patient records, it must notify HHS and a “prominent media outlet” to let them know what has occurred. Section 13402 of HITECH requires that if a “covered entity has insufficient or out-of-date contact information for 10 or more individuals, the covered entity must provide substitute individual notice by either posting the notice on the home page of its web site or by providing the notice in major print or broadcast media where the affected individuals likely reside.” Fines for HIPAA violations can now reach as high as $1.5 million per calendar year.

o Securities and Exchange Commission Rules

Members of national securities exchanges, brokers and dealers are obliged to preserve all records for a minimum of six years, the first two years in an easily accessible place (SEC Rule 17a-4). The affected records are broad and encompass originals of communications generated and received by individuals within financial institutions, including inter-office memoranda and internal audit working papers. Also included are automated messages sent to all customers, which could include email blasts. The records may be "immediately produced or reproduced on 'micrographic media' [microfilm, microfiche or similar] or by means of 'electronic storage media'". As noted above, the Securities Exchange Act of 1934 has been amended to specifically include the requirement to post certain types of content on the web.

o Financial Industry Regulatory Authority (FINRA)

FINRA is a non-governmental regulator formed in 2007 by the merger of various functions of the New York Stock Exchange and the National Association of Securities Dealers. FINRA manages a wide variety of rules that are imposed upon the more than 5,000 brokerage firms and nearly 675,000 registered representatives it oversees. FINRA requires that various types of communications with the public must be filed prior to their use, including content that often would be posted on web sites [iii]. This includes CMO advertisements, sales literature and investment analysis tools.

Recent FINRA Disciplinary Actions Related to Web Content

• An individual posted false and misleading information on a Google Finance bulletin board relating to securities recommendations. The posting contained predictions and projections of future prices for the securities that were recommended, but the posting was made without approval. FINRA fined the individual $10,000 and suspended him from associating with any FINRA member for six months.

• A company made false and misleading statements on its web site related to low cost commission rates and direct access to traders. The company was censured and fined $20,000.

• An affiliate of a company participated in and won CD auctions without disclosing it was an auction participant. Further, the advertising materials used contained misleading, unwarranted and exaggerated statements, and published misleading market clearing yields on its web site. The company was found to have violated Rule 2210 and fined $225,000.


o Model Requirements for the Management of Electronic Records (MoReq)

MoReq is a specification, originally developed in 2001, that defines the functional requirements for the manner in which electronic records are managed in an Electronic Records Management System. MoReq has been used widely in Europe and has been updated with MoReq2.

o Other requirements

A small sampling of the many other requirements for data retention includes FINRA 3010, the Investment Advisers Act of 1940 (hedge funds), the Gramm-Leach-Bliley Act, IDA 29.7, FDA 21 CFR Part 11, OCC Advisory, the Financial Modernization Act 1999, Medicare Conditions of Participation, the Fair Labor Standards Act, the Americans with Disabilities Act, the Toxic Substances Control Act, the UK Companies Act, the UK Company Law Reform Bill - Electronic Communications, the UK Combined Code on Corporate Governance 2003, the UK Human Rights Act, Basel II, and the Markets in Financial Instruments Directive.

• Driver #3: Knowledge Management and Data Mining

There is an enormous amount of useful content that is posted to a company’s own web site or other sites. This includes identifying and extracting information about companies’ products, their public financial information, their participation in trade shows and a wealth of other types of content. Applications for this information include competitive analysis, determination of compliance with various statutes, performing analytics to determine at what time of year certain events take place, and so on.

• Driver #4: Maintain Corporate Memory

Web archiving can be very useful for maintaining a corporate record of what has been posted to a web site, how long this content was maintained or when it was replaced. For example, a company may want a record of its web site for historical purposes, or it may need an archive in order to re-use some of its content at a later date. Maintaining an accurate archive of web content can significantly reduce the costs associated with recreating this content.

The Consequences of Not Archiving Web Content

The vast majority of organizations do not adequately archive their web content and they face a number of risks from not doing so:

• Increased risk in legal disputes

An inability to produce past content from web sites – as with any electronic content – carries with it increased risk during legal actions. This includes an inability to produce time-stamped copies of web pages that will be admissible in court, an inability to respond to e-discovery requests when specific web content is required, and an inability to place legal holds on data so that existing web content is not overwritten when a legal dispute has been initiated or is anticipated.


• Risk of non-compliance with regulatory obligations

Many heavily regulated organizations, such as broker-dealers, have specific obligations to make (or not make) statements or claims on their web site. For example, FINRA Rule 2210 requires broker-dealers to archive their institutional communications, retail communications and correspondence. Because advertising and other public-facing communications often appear on regulated entities’ web sites, it is critical that web content is archived.

• Loss of context for notices, marketing messages, etc.

An organization that is not able to archive its web content cannot easily provide the context for its various web-based marketing messages and other communications. The use of this otherwise lost historical data can help a company keep track of past marketing campaigns, offers, policy statements, notifications to the public and a wide range of other content.

• An inability to prove when statements were made or retracted

Similarly, not archiving web content makes it very difficult to prove exactly when content was posted or removed from a web site or web page. For example, if a press release is embargoed until a certain date and time, a web archiving system can demonstrate exactly when the content was posted, and conversely can prove that the content was not posted before the embargo had been lifted. Another example is that of warning letters issued by the US Food and Drug Administration. These letters warn pharmaceutical manufacturers and other regulated companies about misleading statements, missing information and other claims. One of the many examples of such letters is an October 18, 2010 letter [iv] to a pharmaceutical company, in which it was advised that two of its web pages discussing a magnetic resonance imaging contrast media it produces “omits important information about the approved indication for [the product], and both webpages misleadingly suggest unapproved new uses for the drugs.” Maintaining a web archive is critical to ensuring that an accurate record of content can be preserved and demonstrated when required.

• Loss of digital heritage/corporate memory

When web content is not archived, a significant proportion of an organization’s digital heritage – or corporate memory – simply disappears. Preservation of this content is important on a number of levels – legal, regulatory, productivity, etc. – but also because it represents something of the corporate history of the firm in the form of announcements to the public and other content that constitutes an organization’s digital record.

• An inability to gauge the effectiveness of web campaigns

Some organizations use their web site extensively to present marketing campaigns, post notifications of sales or special offers, and announce promotions of various types. If an organization cannot accurately archive its web content, it is at a disadvantage when attempting to correlate customer activity like sales calls or web inquiries with the specific timing of announcements and other web content.


• Productivity and monetary loss from recreating unarchived content

If web content is not archived and must be recreated, there can be significant time and money lost by those who created the original content, those who must code the content anew, etc. A web archive can, therefore, make various types of employees more efficient and save the organization money by allowing web content to be easily discovered and reused.
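The embargo example given earlier in this section can be illustrated with a small sketch. It assumes a hypothetical log of time-stamped captures (the `captures` list, the phrase, and the embargo time are invented for illustration and do not come from any particular archiving product) and shows how such a log could bound when a statement appeared and check whether any capture taken before the embargo contained it.

```python
from datetime import datetime, timezone

UTC = timezone.utc

# Hypothetical capture log: (capture time in UTC, text found on the page).
captures = [
    (datetime(2010, 11, 1, 8, 0, tzinfo=UTC), "Welcome to our newsroom."),
    (datetime(2010, 11, 1, 12, 0, tzinfo=UTC), "Welcome to our newsroom."),
    (datetime(2010, 11, 1, 16, 0, tzinfo=UTC), "Welcome. Acme announces Q3 results."),
    (datetime(2010, 11, 2, 8, 0, tzinfo=UTC), "Welcome. Acme announces Q3 results."),
    (datetime(2010, 11, 3, 8, 0, tzinfo=UTC), "Welcome to our newsroom."),
]


def visibility_window(captures, phrase):
    """Return (first_seen, last_seen) capture times for phrase, or None."""
    seen = sorted(when for when, text in captures if phrase in text)
    if not seen:
        return None
    return seen[0], seen[-1]


def posted_before(captures, phrase, embargo):
    """True if any capture taken before the embargo time contains phrase."""
    return any(phrase in text for when, text in captures if when < embargo)


embargo = datetime(2010, 11, 1, 14, 0, tzinfo=UTC)
window = visibility_window(captures, "Q3 results")
early = posted_before(captures, "Q3 results", embargo)
```

In this sketch the captures can only bound the posting time: the phrase first appears in the 16:00 capture, so it was posted at some point after the 12:00 capture. The more frequent the captures, the tighter that bound, which is the limitation of infrequent snapshots noted earlier in this paper.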

There Are Many Use Cases for Web Archiving

There is a large and diverse set of use cases for web archiving, some examples of which are discussed below:

• Facilitating regulatory compliance

There is a wide range of applications for web archiving in the context of regulatory compliance. For example, state consumer protection agencies, the Federal Trade Commission, various watchdog groups and similar organizations worldwide have an interest in monitoring the claims, advertising, marketing messages and other content posted by companies on their web sites. Archiving web content from these organizations is crucial to monitoring their compliance with various regulations and statutes. One example of the myriad compliance obligations that exist is the aforementioned FINRA Rule 2210, a set of compliance obligations imposed on broker-dealers and certain others in the financial services industry to advertise their services accurately. Similarly, government agencies have obligations with regard to state sunshine and freedom-of-information laws to provide content to citizens upon demand. Archiving of web content posted on government-operated web sites is key to helping government agencies fulfill their obligations under these requirements.

• Checking web content for copyright violations

Web archiving can be extremely useful in capturing content from various sources on the web and then searching that content for potential violations of copyright. For example, a major US-based men’s magazine uses the Wayback Machine roughly every month to search for content on the web that might be using its trademarked logo or other content, particularly its published images.

As noted above, while the Wayback Machine offers some utility for this type of application, an enterprise-class web archiving capability can provide timelier and more complete information, not to mention the ability to accurately determine when content was posted and deleted from web pages. This can be particularly important in cases where a violator takes down content after receiving notice of a legal action by a copyright holder – an inability to prove exactly when the content was taken down can undermine a legal case.

An important case in this regard was Innervision Web Solutions’ use of the domain name “DellComputersSuck.com”. Because Dell contended that Innervision had used the domain name to redirect visitors to the Innervision web site for commercial gain, and because they were able to prove this based on archived web content, Dell was able to have this domain transferred to its ownership because Innervision was found to have registered the domain in bad faith.

• Proving the bona fides of expert witnesses

The Federal Rules of Civil Procedure, Rule 26 requires that expert witnesses whose testimony is introduced during legal proceedings offer “the witnesses’ qualifications, including a list of all publications authored in the previous 10 years.” Because a growing proportion of many such experts’ publications are electronic in nature, such as blog posts or other web-based content, it is increasingly important for this content to be available to all parties during a legal proceeding. From the perspective of the litigating party that has not hired an expert witness, it is particularly important to be able to access web archives of all of the content offered by that witness. For example, if a litigant can access content older than 10 years, or if they can uncover an obscure blog post that might be contrary to the testimony offered in court, this may prove to be extremely valuable.

• Demonstrating the veracity of electronic content

In Vinhnee v. American Express, the defendant owed American Express in excess of $40,000 and the company sued to recover. Although American Express presented records of the defendant’s monthly statements, the company could not demonstrate the authenticity of these records and so lost the case, even after an appeal. In another case, Janssen-Ortho Inc. v. Novopharm Limited, an affidavit was presented that contained the link to a home page, but it did not include a copy of the page contents. The Federal Court in Canada that heard the case did not accept this affidavit, finding it to represent insufficient evidence. In both cases, a web archiving capability that could demonstrate the veracity of the information presented, along with verifiable time and date stamping, would likely have enabled the losing party to win its case.

• Performing marketing analysis

A web archiving capability can be very useful when researching various types of marketing messages as part of a promotional campaign, even when this research is about a competitor. For example, a hotel chain may wish to archive the web content of its three leading competitors to determine when specific messages were posted to the web and when they were taken down. This information can then be correlated with sales data, marketing reports and other information to determine which messages were most or least effective.

• Conducting research

A web archiving capability can be extraordinarily useful in a wide range of research applications, such as a journalist exploring the positions of a political candidate prior to conducting an interview, a customer researching exactly when a company’s stated policy was first posted to its web site or when it was withdrawn, a human resources staffer investigating the statements made on a blog post or Facebook wall by a prospective employee, or an investigator determining when and where information about a trade secret was first posted to the web, to name but a few of the tens of thousands of potential use cases for web archiving focused on research.
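The "verifiable time and date stamping" mentioned under "Demonstrating the veracity of electronic content" is not specified in this paper. One common building block, shown here purely as an assumption and not as the method of any vendor discussed, is to store a cryptographic digest of the captured bytes next to the capture time, so that a later copy can be checked against the original capture. The function names, URL and page content below are hypothetical.

```python
import hashlib
from datetime import datetime, timezone


def make_record(url, content, captured_at=None):
    """Build a capture record holding a SHA-256 digest of the page bytes."""
    captured_at = captured_at or datetime.now(timezone.utc)
    return {
        "url": url,
        "captured_at": captured_at.isoformat(),
        "sha256": hashlib.sha256(content).hexdigest(),
    }


def matches(record, content):
    """True if content hashes to the digest stored in the record."""
    return hashlib.sha256(content).hexdigest() == record["sha256"]


# Hypothetical captured page and capture time.
page = b"<html><body>Rates from 1.9% APR</body></html>"
record = make_record(
    "https://example.com/rates",
    page,
    datetime(2010, 11, 1, 9, 30, tzinfo=timezone.utc),
)

unchanged = matches(record, page)
altered = matches(record, page.replace(b"1.9%", b"2.9%"))
```

A digest of this kind only shows that a copy is identical to what was captured. It does not by itself prove when the capture happened, so a real system would also need a trustworthy source for the timestamp.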

THE BOTTOM LINE

While there are a variety of applications for web archiving technology, the bottom line is that web content must be preserved for the same reasons that email and other electronic content must be archived. This was summarized in a landmark court decision [v] in which the presiding judge wrote, “This Court sees no reason to treat websites differently than other electronic files.”

Key Issues in Selecting a Web Archiving Vendor

There are a number of features, functions and capabilities that decision makers should consider as they evaluate web archiving solutions. Among these are the following:

BREADTH OF WEB CONTENT ARCHIVING
A web archiving solution should be able to archive a wide variety of content, from individual web pages to entire web sites. This should also include social media pages, RSS feeds, blogs and any other content that might be required for e-discovery, research or other uses.

SUPPORT FOR A WIDE RANGE OF TECHNOLOGIES
A wide and growing variety of technologies are used on the web, including Adobe Flash, AJAX, JavaScript, PHP, various image formats (JPG, PNG, GIF, etc.), video content and other formats. Any web archiving technology must be able to archive content built with all of these technologies. Further, it must accommodate new technologies as they become available.

FLEXIBILITY OF ARCHIVING
A web archiving platform must also provide flexibility in the timing of archiving. Unlike email or file archiving, which is driven by the creation of discrete emails or files, web archiving is based on specific timing requirements. For example, a web archive should be able to capture all necessary web content at regular intervals or on a one-off basis, and either automatically or manually. In short, a web archiving platform must be able to archive web content whenever it is required.

ANALYSIS AND REPORTING TOOLS
Web archiving capabilities should also provide robust analysis and reporting tools so that content can be analyzed for purposes of e-discovery, litigation support, regulatory compliance, marketing analysis or other purposes, or so that high-level results can be reported to senior managers. For example, senior counsel may want to analyze an entire web site’s contents over a particular date range for a set of keywords as part of an early case assessment exercise. Or a marketing manager may want to search a competitor’s blog over the past year for instances of business partners being mentioned. Analysis tools will ideally support the creation of charts to aid in the analysis of trends, such as comparisons of web content over time.
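The date-range keyword analysis described above can be illustrated with a minimal sketch. It assumes, purely for illustration, that each archived capture is available as plain text with a URL and a capture date; the `Snapshot` structure, the `keyword_hits` function and the example.com pages are hypothetical and do not represent any vendor's product or API.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Snapshot:
    """One archived capture of a page (hypothetical structure)."""
    url: str
    captured: date
    text: str


def keyword_hits(snapshots, keywords, start, end):
    """Count case-insensitive keyword occurrences per snapshot captured in [start, end]."""
    results = []
    for snap in snapshots:
        # Skip captures outside the requested date range.
        if not (start <= snap.captured <= end):
            continue
        lowered = snap.text.lower()
        counts = {kw: lowered.count(kw.lower()) for kw in keywords}
        # Keep only snapshots that mention at least one keyword.
        if any(counts.values()):
            results.append((snap.url, snap.captured, counts))
    # Oldest first, so changes in wording over time are easy to follow.
    results.sort(key=lambda r: r[1])
    return results


# Made-up sample archive for illustration only.
archive = [
    Snapshot("https://example.com/offers", date(2010, 3, 1),
             "Guaranteed returns on every plan."),
    Snapshot("https://example.com/offers", date(2010, 6, 1),
             "Returns may vary. No guarantee is implied."),
    Snapshot("https://example.com/about", date(2009, 12, 1),
             "Guaranteed satisfaction."),
]

hits = keyword_hits(archive, ["guaranteed", "returns"],
                    date(2010, 1, 1), date(2010, 12, 31))
for url, captured, counts in hits:
    print(captured.isoformat(), url, counts)
```

A real archive would search rendered page captures through an index rather than scanning text in memory, but the shape of the query (keywords plus a date range, with results ordered by capture date) is the same.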


INTEGRATION WITH EXISTING SYSTEMS
Web archiving capabilities should integrate with other systems in place in the organization, including analysis tools and existing archiving systems for email and other content. The ability to integrate with these systems will make searching and analyzing web content easier and more efficient, and will allow organizations to respond more quickly to time-sensitive requests. Further, integration with existing systems will allow data to be analyzed without users having to learn a new tool or interface.

DELIVERY MODELS
A web archiving platform should support a flexible delivery model. While many organizations prefer an on-premises solution that can be managed completely behind the corporate firewall, a growing number of organizations are opting for cloud-based solutions that are completely managed by a third-party service provider.

FISMA-COMPLIANCE FOR FEDERAL GOVERNMENT CUSTOMERS
The Federal Information Security Management Act of 2002 (FISMA) requires US federal agencies to create, implement and document an information security program to support their information management goals. A key goal of FISMA is the archiving of information assets, including web sites. Consequently, a best practice focused on FISMA compliance will include regular capture of all relevant web site content, including secure, long-term storage of this content.

ABILITY TO PERFORM FULL-TEXT/CONTENT SEARCHING
Another important feature of any web archiving solution is the ability to search for content using full-text/content searching capabilities. This is particularly important when searching for specific keywords or phrases during an e-discovery or similar exercise, in much the same way that this type of search is critical for any other type of archived content, such as email or files.

USE OF ORGANIZATIONAL TOOLS
Organizational tools are also a very useful feature for a web archiving system because they allow reviewers to organize content for subsequent searches. For example, the ability to organize content into folders, tag specific sections or pages for later review, or add notes to pages or sections is very helpful for paralegals who are scouring archived web content for later and more thorough review by senior counsel.

ABILITY TO COLLABORATE USING ARCHIVED CONTENT
Finally, it is important that any web archiving capability allow users to collaborate based on this archived content. Just as with email or other types of content archiving, teams of individuals will normally work on large cases involving archived web content, and their ability to collaborate is essential.


Conclusion: Consider Web Archiving

Because the web continues to grow in importance for both business and government as a medium for communication and commerce, archiving of web content should become an essential element of any organization’s risk mitigation and compliance strategy. As a result, organizations should seriously consider developing a web archiving policy and deploying technology that can support this policy.

About the Sponsor of This White Paper

ABOUT REED TECHNOLOGY
Reed Technology & Information Services (RTIS) offers the Reed Tech Web Archiving Service for corporate enterprises, government, and professional services companies. Reed Tech has been providing clients with information capture, conversion, management, distribution and transformation services for almost 50 years. Reed Tech’s clients include large government agencies like the U.S. Patent & Trademark Office, a wide range of pharmaceutical and other life sciences companies, and law firms of all sizes. Reed Tech is a wholly-owned subsidiary of Reed Elsevier, an $8 billion global provider of professional information and online workflow solutions in the Science, Medical, Legal, and Risk and Business sectors. With almost 1,000 full-time employees, Reed Tech reports in through LexisNexis, a leading global provider of content-enabled workflow solutions to professionals in law firms, corporations, government, law enforcement, tax, accounting, academic institutions and risk and compliance assessment.

ABOUT ITERASI
Iterasi Inc. creates enterprise-class web archiving technology applications specifically for regulatory compliance, litigation protection, and e-discovery. Pete Grillo, CEO, founded the company in 2007.


© 2010 Osterman Research, Inc. All rights reserved. No part of this document may be reproduced in any form by any means, nor may it be distributed without the permission of Osterman Research, Inc., nor may it be resold or distributed by any entity other than Osterman Research, Inc., without prior written authorization of Osterman Research, Inc.

Osterman Research, Inc. does not provide legal advice. Nothing in this document constitutes legal advice, nor shall this document or any software product or other offering referenced herein serve as a substitute for the reader’s compliance with any laws (including but not limited to any act, statute, regulation, rule, directive, administrative order, executive order, etc. (collectively, “Laws”)) referenced in this document. If necessary, the reader should consult with competent legal counsel regarding any Laws referenced herein. Osterman Research, Inc. makes no representation or warranty regarding the completeness or accuracy of the information contained in this document.

THIS DOCUMENT IS PROVIDED “AS IS” WITHOUT WARRANTY OF ANY KIND. ALL EXPRESS OR IMPLIED REPRESENTATIONS, CONDITIONS AND WARRANTIES, INCLUDING ANY IMPLIED WARRANTY OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE, ARE DISCLAIMED, EXCEPT TO THE EXTENT THAT SUCH DISCLAIMERS ARE DETERMINED TO BE ILLEGAL.

[i] http://www.archive.org/about/faqs.php#The_Wayback_Machine
[ii] http://royal.pingdom.com/2010/01/22/internet-2009-in-numbers/
[iii] Filing Communications for FINRA Review Webcast
[iv] http://www.fda.gov/ICECI/EnforcementActions/WarningLetters/ucm230796.htm
[v] Arteria Prop. Pty Ltd. v. Universal Funding V.T.O., Inc., 2008 WL 4513696 (D.N.J. Oct. 1, 2008)