www.illinois.gov


Electronic Records Management

Technical Considerations When Storing Public Records in Digital Format

On July 6, 2000, Illinois Governor George H. Ryan signed an amended Local Records Act (50 ILCS 205). This bill allows local governments to reproduce public records in either microfilm or digitized electronic formats.  The new law stipulates that if the local government keeps a public record in an electronic format, the method must be a “trustworthy manner so that the records, and the information contained in the records, are accessible and usable for subsequent reference at all times while the information must be retained.” This practice is only allowable if the electronic records are reproduced on a “durable medium that accurately and legibly reproduces the original record in all details,” and “does not permit additions, deletions, or changes to the original document images.” Each agency is also under the obligation to file a Records Disposal Certificate with the appropriate Local Records Commission before any original record may be disposed of and before the reproduced digital record is disposed of. 

When indicating his support for the bill, Governor Ryan noted that there are no universal standards for the creation and storage of electronic documents and called on local governments to “be cautious in the way in which they maintain public records and protect the public interest.”  He urged local governments to use the months between the signing of the bill and its effective date of January 1, 2001, to carefully develop strategy and methodologies for implementing digitized electronic document creation and storage.  

Unfortunately, you cannot simply buy a scanner and a CD burner and hope to effectively convert paper records into reliable electronic records.  The process of converting records to electronic format is complex and requires careful planning and vendor selection in order to be effective.

The challenge with digital records storage is to ensure that records remain in usable form regardless of changes in technology or obsolescence of particular file formats and storage media.  Another significant challenge is selecting an appropriate storage medium (CD-ROM, DVD, etc.) that will be sufficiently dependable for the entire life of a record.  At present, there is confusion and uncertainty over the long-term dependability and viability of certain media types.

Contrary to popular belief, magnetic and optical media do not last forever. They may be relied upon for only a few years before the records stored on a particular medium must be transferred to avoid the file corruption that physical degradation of the media inevitably brings.  Guidelines for digital media are always changing, and the quality of media varies widely with the quality of manufacture.  As an example, a CD-ROM may have an expected usable life ranging from only a couple of years to more than 20 years, depending on how well it was manufactured and how well it is stored.

The same is true of other magnetic and optical media types. Quality of storage and manufacture are very significant in determining the overall usable life of the media.  Obviously, care in choosing a reliable media source is essential to ensure data reliability.
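One way to notice media degradation before records are lost is to record a checksum for each file when it is written and to re-check it on a schedule. The following is only a minimal sketch in Python, not a prescribed method: the function names and the sample file name are hypothetical, and SHA-256 is an assumed choice of checksum.

```python
import hashlib
import os
import tempfile


def sha256_of(path, chunk_size=65536):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(paths):
    """Record a digest for each file at the time it is written to media."""
    return {path: sha256_of(path) for path in paths}


def find_damaged(manifest):
    """Return the files that are missing or whose contents no longer match."""
    damaged = []
    for path, expected in manifest.items():
        if not os.path.exists(path) or sha256_of(path) != expected:
            damaged.append(path)
    return damaged


if __name__ == "__main__":
    # A temporary file stands in for a stored record (hypothetical name).
    with tempfile.TemporaryDirectory() as folder:
        record = os.path.join(folder, "record-0001.tif")
        with open(record, "wb") as handle:
            handle.write(b"scanned page image bytes")
        manifest = build_manifest([record])
        print(find_damaged(manifest))  # nothing has changed yet
        with open(record, "ab") as handle:
            handle.write(b"!")  # simulate corruption of the stored file
        print(find_damaged(manifest))  # the record is now reported
```

A file that fails the check should be restored from another copy, and a disc or tape that begins producing failures is a signal to transfer everything on it to fresh media.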

Standards and File Formats

While there are no formal standards in place to guide an agency considering implementation of an electronic records storage system, there are a number of accepted industry and even proprietary standards that can increase the likelihood of long-term data accessibility. Industry standards that have gained common acceptance are GIF, TIFF and JPEG for images, ASCII and RTF for text documents, and HTML and XML for documents that are to be displayed on the Internet.  Although GIF, a graphics format owned by CompuServe, is widely accepted, past attempts to levy royalty fees on creators of GIF files may make it a more problematic choice than TIFF or JPEG.

It may also be desirable to store text documents both as searchable text and as an image.  Documents could be stored just as image files, but one of the primary advantages of digital document storage is the ability to search for particular documents based on contents. This can only be done through a full-text search which requires the conversion of the image to text.
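To illustrate why the text version matters, here is a minimal sketch of a full-text index in Python. It assumes the text of each document has already been extracted from its image; the document identifiers and sample text are invented for illustration, and a production RMS would rely on its own search engine rather than code like this.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lower-case the text and split it into words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(documents):
    """Map each word to the set of document IDs whose text contains it."""
    index = defaultdict(set)
    for doc_id, text in documents.items():
        for word in tokenize(text):
            index[word].add(doc_id)
    return index


def search(index, query):
    """Return the IDs of documents that contain every word in the query."""
    words = tokenize(query)
    if not words:
        return set()
    result = set(index.get(words[0], set()))
    for word in words[1:]:
        result &= index.get(word, set())
    return result


if __name__ == "__main__":
    # Hypothetical extracted text, keyed by a hypothetical document ID.
    documents = {
        "1998-0412/001": "Zoning variance request for 12 Elm Street",
        "1998-0412/002": "Notice of hearing on the zoning variance",
        "2000-0077/001": "Annual budget summary",
    }
    index = build_index(documents)
    print(sorted(search(index, "zoning variance")))
    print(sorted(search(index, "budget")))
```

The image file remains the record of what the page looked like; the extracted text exists only so that the page can be found.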

The Adobe Portable Document Format (PDF) is a proprietary standard that has become a de facto standard for publishing fully formatted documents on the Web.  In order to read these documents, however, a user must have a copy of the Adobe Acrobat Reader, available free from Adobe, installed on their computer.  The creator of the document must have a copy of the full Adobe Acrobat installed on their own computer.  Adobe charges for the full version of Adobe Acrobat.

Adobe PDF format has the advantage of being able to display the exact format of the original document with all graphics and text formatting intact, but it has some disadvantages.  The most significant of these is the inability to revise the document without the full version of the software, though this may be considered an advantage under certain circumstances.

It generally makes more sense to store word processing documents as searchable text rather than PDF, but the question is whether or not to store the document in the native word processing format, or store it in a more universally accessible format like ASCII or RTF.  RTF has the advantage of retaining most formatting and spacing; ASCII will not retain formatting. Regardless of the format chosen for storage, the document should be ultimately reducible to ASCII.

Compatibility and File Conversion

Those who have had experience with various word processors over the years know that compatibility between versions is not always guaranteed. While vendors try to maintain backward compatibility, forward compatibility is almost impossible to maintain.  As an example of this, old versions of Microsoft Word cannot access files created in newer versions of Word.  The same is true of older and newer versions of WordPerfect.

Conversion of documents between word processors like Word and WordPerfect is frequently problematic.  Many users who have attempted migration of documents between the two packages discovered that conversion to a format such as RTF or ASCII that is compatible with both word processors is more dependable than direct conversion, particularly when a document must be converted back and forth several times.  RTF can be used when transferring documents between most versions of Microsoft Word and WordPerfect or any other current word processing package.  RTF is also the default file format for certain programs like Microsoft’s Outlook e-mail software.

Browser-based Formats

Hypertext Markup Language (HTML) can also be used as a medium of exchange between word processing programs and other software programs that display text.  Most word processing programs today automatically detect HTML files and display them appropriately.

One emerging development is the storage and exchange of documents in eXtensible Markup Language (XML) format.  Microsoft seems to be moving in the direction of making XML its choice as a common medium of exchange between its various programs, as well as between Microsoft software programs and the programs of other software companies.

XML shows much promise as a standard for transferring data from one system to another.  As an example of how XML can work, imagine an everyday business letter, which includes an inside address, a salutation, a date, a body and a closing.  In XML, each of these elements would be enclosed by “tags” identifying each of the elements.  When transferred to a different system, the receiving system would always correctly identify each element and display it appropriately.
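The business letter described above might be tagged as in the following Python sketch, which uses the standard xml.etree.ElementTree module. The tag names (letter, date, inside_address and so on) and the sample letter text are invented for illustration; they are not taken from any published schema.

```python
import xml.etree.ElementTree as ET


def build_letter(date, inside_address, salutation, body, closing):
    """Build a letter element with one child element per part of the letter."""
    letter = ET.Element("letter")
    ET.SubElement(letter, "date").text = date
    ET.SubElement(letter, "inside_address").text = inside_address
    ET.SubElement(letter, "salutation").text = salutation
    ET.SubElement(letter, "body").text = body
    ET.SubElement(letter, "closing").text = closing
    return letter


def read_letter(xml_text):
    """Parse the XML and return a dict mapping each tag name to its text."""
    root = ET.fromstring(xml_text)
    return {child.tag: child.text for child in root}


if __name__ == "__main__":
    # The "sending" system writes the letter out as tagged text.
    letter = build_letter(
        date="January 2, 2001",
        inside_address="County Clerk, 100 Main Street, Springfield, IL",
        salutation="Dear Ms. Smith:",
        body="Enclosed is the Records Disposal Certificate you requested.",
        closing="Sincerely,",
    )
    xml_text = ET.tostring(letter, encoding="unicode")
    print(xml_text)

    # The "receiving" system identifies each part by its tag, not its position.
    parts = read_letter(xml_text)
    print(parts["salutation"])
```

Because each part is labeled, the receiving system does not have to guess which line is the date and which is the salutation, which is what makes the format attractive for exchange between unlike systems.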

Risks Associated with Native File Formats

Another factor in deciding whether to store a document in the native word processor format (the native format is the proprietary default format used by the creator of the word processor) is the long-term viability of a particular manufacturer.  Few market sectors are as volatile as information technology, and the continued availability of a particular software package cannot be taken for granted.

In recent years, we have seen a number of software companies quickly lose market share to competitors and find it unprofitable to continue supporting software.  As a result, several previously common word processor packages, such as WordStar, PFS Write, and Volkswriter, have become unavailable. This indicates that long term storage of documents in native formats may be risky.

Records Management Systems and Vendor Selection

No matter which media or file format is chosen for storage of electronic documents, images and other digital objects, a system is needed that can organize and reliably retrieve the objects.  Such a system is called various things by different groups and vendors, but for the purpose of this article it will be called a Records Management System (RMS).

Choosing the right RMS—and a reliable vendor to provide support for the system—may be the most important decision you make concerning electronic records retention.  The RMS must dependably catalog and index all of the documents and images, and quickly retrieve those objects. This may seem to be a trivial matter, but think about all of the files and documents your agency deals with every day. Documents need to be accurately classified and organized into files, and files must be grouped into larger subject areas.  As with paper files, if you electronically misfile the document, you will have difficulty ever finding it again.

To retrieve individual documents, each must be uniquely identified and indexed in a database.  The identifier could be a system-generated sequential number, but a better choice would be an existing number like a case number.  Remember, though, that any number of individual documents must be “filed” under this one case number, so you may also need individual document numbers in addition to the case number.  The combination of the case number and the document number could make up the unique key assigned to each document.  In addition to retrieving a document through the key, you may also wish to search for documents based on key words or on individuals associated with the file or document.

In order to perform these searches, all of the identifiers must be stored in the database.  Once the database itself and its search indexes are constructed, the RMS can quickly retrieve the digitized files.
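The keying scheme just described can be sketched with Python's built-in sqlite3 module. This is an illustration under stated assumptions rather than the design of any particular RMS: the table layout, column names, case numbers and storage locations are all hypothetical.

```python
import sqlite3


def open_catalog(path=":memory:"):
    """Create the catalog table, keyed on case number plus document number."""
    connection = sqlite3.connect(path)
    connection.execute(
        """
        CREATE TABLE IF NOT EXISTS documents (
            case_number      TEXT    NOT NULL,
            document_number  INTEGER NOT NULL,
            title            TEXT,
            person           TEXT,
            storage_location TEXT,
            PRIMARY KEY (case_number, document_number)
        )
        """
    )
    # A secondary index so that searches by individual are also fast.
    connection.execute(
        "CREATE INDEX IF NOT EXISTS idx_person ON documents (person)"
    )
    return connection


def add_document(connection, case_number, document_number, title, person, location):
    """File one document; a duplicate key raises sqlite3.IntegrityError."""
    connection.execute(
        "INSERT INTO documents VALUES (?, ?, ?, ?, ?)",
        (case_number, document_number, title, person, location),
    )


def find_by_key(connection, case_number, document_number):
    """Return the storage location for one document, or None if not filed."""
    row = connection.execute(
        "SELECT storage_location FROM documents "
        "WHERE case_number = ? AND document_number = ?",
        (case_number, document_number),
    ).fetchone()
    return row[0] if row else None


def find_by_person(connection, person):
    """Return (case_number, document_number) pairs associated with a person."""
    return connection.execute(
        "SELECT case_number, document_number FROM documents "
        "WHERE person = ? ORDER BY case_number, document_number",
        (person,),
    ).fetchall()


if __name__ == "__main__":
    catalog = open_catalog()
    add_document(catalog, "2000-CR-0142", 1, "Complaint", "J. Doe", "disc-017/0001.tif")
    add_document(catalog, "2000-CR-0142", 2, "Arrest report", "J. Doe", "disc-017/0002.tif")
    print(find_by_key(catalog, "2000-CR-0142", 2))
    print(find_by_person(catalog, "J. Doe"))
```

Making the pair of numbers the primary key means the database itself refuses to file two documents under the same key, which guards against one kind of electronic misfiling.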

Records do not have to be kept on-line at all times, but can be stored off-line in such a way that the RMS can locate the records and then restore them to the on-line system so that they can be retrieved.  Such a process could take just minutes, but depending on your system, it could take up to several hours to load and read tapes or other media containing archived records.  Obviously, a mix of on-line and off-line storage would be suitable for different needs.  On-line storage is best for frequently accessed records, and off-line is generally adequate for rarely accessed records.

It would be advantageous to integrate the RMS with an existing case management system so that your staff will only have to use one system to access case information.  However, integration of two systems can be tricky, especially when integrating legacy systems with newer relational database driven systems. Having an experienced vendor is key to successful integration.

Choosing the Right Vendor

The importance of selecting a highly qualified vendor when planning and implementing an RMS cannot be over-emphasized.  The complexity and expense of such systems make vendor selection perhaps the most important single decision you will make in the process of acquiring electronic document capabilities.

There are no hard and fast rules for selecting a vendor but there are a few guidelines:

  • Make sure the vendor has a track record of successful projects of the type you wish to accomplish.
  • Verify that the vendor can actually provide you with the people who have worked on the projects they provide as references.
  • Confirm that the vendor has the financial, human and other resources available to finish your project.

You should prepare for vendor evaluation and selection by learning as much as you can about electronic records management before starting the process.  Most agencies must issue a Request for Proposals (RFP) in order to procure vendor services, so it would be prudent to begin this learning process before creating the RFP.

Steve Prisoc is Associate Director of the Illinois Criminal Justice Information Authority and can be reached at 312-793-8550 or by e-mail at sprisoc@icjia.state.il.us

Glossary

ASCII (American Standard Code for Information Interchange) - The ASCII character set is the basic set of characters that can be displayed on a PC screen.  When text documents are converted from their native word processor file formats to ASCII, they lose all special formatting such as bold, italics and underlining.

CD-ROM (Compact Disc-Read Only Memory) - A CD-based storage medium.

ERMS (Electronic Records Management System) - A system for managing electronic records.

GIF (Graphics Interchange Format) - A graphics format primarily used in creating small images for the Web.

HTML (Hypertext Markup Language) - A tag-based language used to display text on the Web.

JPEG (Joint Photographic Experts Group) - A graphics file format that reduces large image files into smaller files that can be more easily stored and transferred.  JPEG is primarily used for display of photographs on the Web and also for digital photography.

Migration of Digital Data - Converting documents from old file formats to more current formats in order to limit the problems associated with continuing to store documents in obsolete formats.

PDF (Portable Document Format) - A proprietary format that has become a de facto standard for documents displayed on the Web.

Refreshing Digital Data - Periodically transferring files from an older physical storage medium to a newer medium to avoid physical decay or obsolescence.

RMS (Records Management System) - A system for indexing and organizing documents and other records.

RTF (Rich Text Format) - This text file format preserves special formatting like italics, underlining and spacing.

TIFF (Tagged Image File Format) - A raster-based (bitmapped) graphics file format which maintains high resolution.

Copyright © 2008 State of Illinois