Formats and Operating Systems

Electronic Records come in a wide variety of file formats and are used in a variety of operating systems. This page provides a quick, introductory-level orientation to formats and operating systems. Several additional pages linked from this page give more information about specific format types. That additional information is based on Archives staff expertise and the format information available at www.nationalarchives.gov.uk/pronom/. If any of the format information provided is incorrect, please let the Archives staff know and we will review the problem for possible revision.

Formats

A file format is the way that the information in a computer file is organized so that a computer program can read the information and provide the computer user with the intended “look and feel.” The human-readable sign of a file format is the file extension, which comes after the file name. It begins with a period (.) that is followed by one or more characters, most commonly three or four. Examples of file extensions are .jpg or .previous. The file extension usually tells the computer what program to use to open a file. When the same file extension is used by more than one computer program, or the file extension is missing, the wrong computer program may open the file, and the results can be misleading.
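
The link between a file extension and a program can be pictured with a short Python sketch. It is only an illustration: the table of programs below is a made-up example, not how any particular operating system or SCERA stores this information.

```python
from pathlib import PurePath

# Hypothetical table for illustration only; real computers keep a much
# larger table and let the user change it.
EXAMPLE_PROGRAMS = {
    ".jpg": "image viewer",
    ".mp3": "media player",
    ".txt": "text editor",
}

def file_extension(filename: str) -> str:
    """Return the extension in lowercase, including the period, or '' if none."""
    return PurePath(filename).suffix.lower()

def suggested_program(filename: str) -> str:
    """Look up the example program for a file name, or report 'unknown'."""
    return EXAMPLE_PROGRAMS.get(file_extension(filename), "unknown")

print(suggested_program("sign_photo.JPG"))   # extension is found in the table
print(suggested_program("meeting_notes"))    # no extension, so no match
```

A file name with no extension, or with an extension that is not in the table, falls through to "unknown", which is the situation in which a computer may guess wrong.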

For an example of what happens when a file is opened with the wrong computer program, try opening a .jpg with Microsoft Word or Notepad.
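
The effect can also be sketched in Python. The example below assumes a tiny, hand-made sample of the bytes that begin a typical .jpg file (it is not a complete picture) and shows roughly what a text program does with them: it treats every byte as a character. It also shows the alternative of checking the first bytes of the file instead of trusting the extension, which is the general idea behind signature-based format identification. The function names are invented for this illustration.

```python
# The first three bytes of a JPEG file are always FF D8 FF.
JPEG_SIGNATURE = b"\xff\xd8\xff"

def looks_like_jpeg(data: bytes) -> bool:
    """Check the leading bytes instead of relying on the file extension."""
    return data.startswith(JPEG_SIGNATURE)

def as_text_program_would_show(data: bytes) -> str:
    """Turn every byte into one character, as a simple text program might."""
    # latin-1 is used here because it maps each byte to exactly one character.
    return data.decode("latin-1")

# Hand-made sample: the start of a JPEG (JFIF) file, not a whole image.
sample = JPEG_SIGNATURE + b"\xe0\x00\x10JFIF\x00"

print(looks_like_jpeg(sample))
print(repr(as_text_program_would_show(sample)))
```

Apart from the letters "JFIF", the characters produced are meaningless symbols, which is why a picture opened in a text program looks like gibberish.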

To make it easier to identify what kind of content a record contains, SCERA assigns a “type” to the Electronic Records it receives from state agencies for preservation. The type assigned is based on the whole set of records received by SCERA, not on the individual files. These format categories give hints about what kind of computer programs will open the files and which part of the record is most important to save. The categories are:

  • text
  • image
  • audio
  • video
  • mixed
  • multiple

Text means that the file format is supposed to show on a computer screen in readable characters, such as the content of this webpage.

Image means that the file format is mostly about visual elements that are not to be read, even if part of the picture is text. For example, a picture of a yield road sign off of Highway 1 that illustrates where a sign should be located does contain text, but that is not what is important about the file.

Audio means that the file is meant to be heard using a player program. An example is an .mp3 file played using a media player.

Video means that the file is a moving image, with or without audio, that is meant to be seen using a video player program. While individual images make up a video, it is not the individual pictures but the whole played at once that is important.

Mixed means that the individual file contains more than one of the above categories and that each of them is important to the file. For example, a webpage that has video to be played and text related to the video is mixed because both parts are considered equally important. PLEASE NOTE that when “mixed” is used, the Archives will not specify which categories are included in the files.

Multiple means that the set of files provided to the Archives contains more than one format category. Each file may belong to only a single category, but the set as a whole contains multiple kinds.
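
The difference between a type that describes one file and a type that describes the whole set can be sketched in Python. This is an illustrative reading of the definitions above, not SCERA's actual procedure: it assumes each file has already been given one of the five single-file categories, and that any set whose files differ in category is labeled "multiple".

```python
from typing import Iterable

# The categories a single file can have, taken from the definitions above.
FILE_CATEGORIES = {"text", "image", "audio", "video", "mixed"}

def set_type(file_categories: Iterable[str]) -> str:
    """Return the type for a whole set of records, given each file's category."""
    distinct = set(file_categories)
    if not distinct:
        raise ValueError("the set contains no files")
    unknown = distinct - FILE_CATEGORIES
    if unknown:
        raise ValueError(f"unknown categories: {sorted(unknown)}")
    if len(distinct) == 1:
        # Every file has the same category, so the set gets that category.
        return distinct.pop()
    # Assumption: any combination of different categories is "multiple".
    return "multiple"

print(set_type(["text", "text", "text"]))
print(set_type(["image", "audio", "text"]))
```

Under this reading, a set of webpages that each combine video and text is "mixed", while a set made up of separate text files and separate image files is "multiple".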

Operating Systems and Operating Environments

The operating system is the whole environment within a computer. It is the basic program that makes the computer function. In some cases a computer program in one operating system is similar enough to a program in another operating system that the same file can be used in both. But sometimes a file and its program can only be used within one specific operating system. The following are the main current operating systems that make most computers run.

  • Windows
  • Macintosh OS X (usually called Mac OS X or OS X)
  • Linux
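
As a small illustration, a program can ask which of these operating systems it is running on. The Python sketch below uses the standard platform module; the friendly names and the helper function are chosen for this example only.

```python
import platform

# platform.system() reports "Darwin" on a Mac; the other two names match.
FRIENDLY_NAMES = {
    "Windows": "Windows",
    "Darwin": "Mac OS X",
    "Linux": "Linux",
}

def friendly_os_name(system_name: str) -> str:
    """Translate the name reported by Python into one of the names listed above."""
    return FRIENDLY_NAMES.get(system_name, "other (" + system_name + ")")

# Prints the operating system of the computer running this script.
print(friendly_os_name(platform.system()))
```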

The operating environment (in terms defined by SCERA) is similar to the operating system, but includes both the operating system and the computer programs running within the operating system. This is generic information that tells a user how a record was created, and it is specific at the agency level only. It will include information such as what kind of office productivity software was used (e.g. MS Office Suite 2010) or what special software was being used. It is important to know which software systems were used, because changes in software can affect how a record was created and what it is “supposed” to look like. If an individual person at an agency had a special program on their computer that was unique to them but has no bearing on how a record was created, this information is not documented.

Operating System Specifics

Due to the nature of software development, as computer programs have evolved, older operating systems may not be able to handle newer software or the hardware necessary to run the newer software.

Windows

Windows is an operating system created by the Microsoft Corporation for corporate and home use. The operating system has gone through many versions over the years, with each version bringing changes in functionality and, in some cases, in the means of navigation. Major versions include:

  • Windows 3.1
  • Windows 95
  • Windows 98
  • Windows ME
  • Windows XP
  • Windows Vista
  • Windows 7
  • Windows 8
  • Windows 10

Current versions of Windows can be found through Microsoft at (as of the date of this writing) http://www.microsoft.com. The phone number (current as of 2015) is 877-696-7786. Current versions of Windows may also be available at electronics retailers.

Mac OS X

Mac OS X is the operating system specifically designed to be used on Macintosh (Mac) computer hardware. Macintosh is the brand name of a line of computer hardware made by Apple Inc. Other hardware made by Apple includes the currently (2015) popular iPod and iPhone. Mac OS X can be installed on non-Mac hardware but this is very uncommon. It is designed to be simple and straightforward to use and was originally based on an older operating system called Unix. Current versions of the operating system can be found at retail outlets (Apple Store), or through the Apple company at http://www.apple.com. Major OS X versions, historically nicknamed after big cats and more recently after places in California, include:

  • Cheetah
  • Puma
  • Jaguar
  • Panther
  • Tiger
  • Leopard
  • Snow Leopard
  • Lion
  • Mountain Lion
  • Mavericks
  • Yosemite

Linux

Linux is an open-source operating system, meaning that the computer code is available for a public user to look at and alter. It was originally created by Linus Torvalds to work similarly to the Unix operating system (which was no longer free) without using the proprietary Unix code. Because it is open-source and can be changed by anyone who has the knowledge to do so, there are a great many versions of Linux in use; probably the most popular version at the time of this writing is the Ubuntu platform (available at http://www.ubuntu.com). Linux has been adapted to mobile technology in the form of the Android operating system. When possible, SCERA will specify which “flavor” of Linux is being used by an agency.

Because Linux is open-source and the open-source community is very devoted to the ideal of free software, most programs that run on Linux are free to obtain and to use.