Paradigm Shifts in the Conservation of Electronic Art

 

Based on a presentation by Howard Besser

http://besser.tsoa.nyu.edu/howard/

 

How long we keep things

·        Companies keep information for days, years, or decades

 

·        Individuals keep things for years or a lifetime

 

·        Archives, Libraries, and museums keep things for hundreds of years

 

Cultural Institutions have a much greater responsibility for preservation!

 

 

But who is preserving today’s “born-digital” works?

 

·        In the past, we knew about history by finding written documents:

–        Changes between different drafts of a scientific or literary paper

–        Letters and correspondence between a scientist (or literary figure) and colleagues (that both helps contextualize the work, and lets us see changes in thought processes or discovery)

·        But today, these documents are not on paper! 

·        They are in the form of:

–        Email correspondence

–        Word processing files that do not show changes between drafts/versions

·        Who will take responsibility to save these works for future study?

 

 

Paradigm Shifts Needed

 

 

Old vs. New

·        Physical preservation

–        Old: atmospheric control

–        New: ongoing mgmt

·        What to save?

–        Old: artifact

–        New: idea + ancillary material & documentation

·        Cataloging

–        Old: individual work in hand

–        New: FRBR (Functional Requirements for Bibliographic Records)

·        Later access

–        Old: artifact

–        New: restaging, ancillary material, & documentation

Paradigm Shifts in the Conservation of Electronic Art

·        How are new works even more problematic than older forms of moving image material?

·        Issues with Digital Preservation

·        Issues with New Works

·        Technical & Conceptual Approaches to solutions

·        Efforts to watch (projects, standards)

 

 

Conventional Works

Manuscripts, books, paintings, sculpture      

We have a good sense of what the original object is

o       Objective is to make object itself endure (temperature/humidity control, chemicals/pigments/fibers/adhesives)

 

o       Goal is to keep object as close as possible to original state (though occasionally controversy arises over whether to let aging show)

 

 

Electronic Media

Video, audio, digital, new media

o       Often difficult to determine what the original object is

o       Difficult to make the original object endure (magnetic particle deterioration, warping, etc.)

o       Even if we could make the original object endure, we wouldn’t have the infrastructure to view it in the future

o       Need to develop a paradigm shift from preserving the original object to preserving info content

o       Need to pay more attention to maintaining authenticity and replicating user experience

 

 

Electronic Art in general is not like canvas paintings

 

May include

o       Moving image materials

o       Multimedia

o       Interactive programs (including hypertext novels & games)  

o       Computer generated art

o       Most electronic art works share some common characteristics with other “strange” works like Performance Art, Conceptual Art, Site-specific installations, Experiential Art

 

 

The Short Life of Digital Info:

Digital Longevity Problems:

  • Disappearing Information
  • The Viewing Problem
  • The Scrambling Problem
  • The Inter-relation Problem
  • The Custodial Problem
  • The Translation Problem

 

 

The Viewing Problem

  • Digital Info requires a whole infrastructure to view it
  • Each piece of that infrastructure is changing at an incredibly rapid rate
  • How can we ever hope to deal with all the permutations and combinations

 

 

The Scrambling Problem

Dangers from:

  • Compression to ease storage & delivery
  • Container Architecture to enhance digital commerce

 

 

The Inter-relation Problem

o       Info is increasingly inter-related to other info

 

o       How do we make our own Info persist when it points to and integrates with Info owned by others?

 

o       What is the boundary of a set of information (or even of a digital object)?

 

 

The Custodial Problem

In the past, much of survival was due to redundancy

 

o       How do we decide what to save?

o       Who should save it?

o       How should they save it?

 

 

The Custodial Problem:

How to save information?

o       Methods for later access

o       Refreshing

o       Migration

o       Emulation

 

o       Issues of authenticity and evidence

 

 

The Translation Problem

Content translated into new delivery devices changes meaning

 

  • A photo vs. a painting
  • If Info is produced originally in digital form in one encoded format, will it be the same when translated into another format?
  • Behaviors

 

Thinking of the Future

o       Screens will be different resolutions and different aspect ratios

o       CRTs won’t exist

o       A decade or two from now, today’s user interfaces will look the way arrow-key navigation looks today

 

o       Today’s streaming media are small windows, slow speeds

o       As bandwidth increases, viewers will expect higher quality streams

o       Creators may need to consider how they’ll be able to deliver higher-bandwidth streams

–         Delivery Derivatives vs. Masters encoded w/standards

–         May also want to re-edit the piece to take advantage of changes in technology, viewer expectations, society

 

 

Responding to serious Longevity Problems

 

o       Previous formats required little ongoing intervention (remote storage facilities, Iron Mtn)

o       Digital formats require intense ongoing management

 

o       Need for:

    • Preservation Repositories
    • Preservation Metadata

 

 

Issues with new works

§         What is the work?

§         Complexity of rich media

 

§         Difficulty of making the work last

 

 

LeWitt: Wall Drawing 340

Time Lapse Video Installation

More Info about Installation

 

LeWitt Install Directions

Virginia Museum of Fine Arts Wall Drawing #541

 

 

LeWitt: What do we save?

o       The installation?

o       Documentation of the Installation?

o       The directions for the Installation?

o       What is the goal of our documentation and preservation?

 

 

e.g. Hole in Space (see http://1904.cc/timeline/tiki-index.php?page=Hole+in+Space )

  • Live two-way satellite connection using video screens to project life-size images.
  • In a sidewalk-facing window at New York City's Lincoln Center and in a display window at the Broadway department store in Century City (Los Angeles).
  • Both screens could accommodate the images of about 15 people.
  • Over 3 nights

 

 

 

Complexity of Rich Media

o       Works often have artistic nature (including video games)

o       Enormous number of elements can, at times, be very important to preserve (pacing, original artifact, elements used to construct the artifact)

o       Too complex to save every one of these aspects for every type of material

o       Importance of saving documentation

 

 

Special Characteristics of Electronic Works

o       What really is the Work?

o       Disappearing software

o       Enormous number of elements can, at times, be very important to preserve (randomness, interactivity, pacing, color, format, original artifact, elements used to construct the artifact)

o       Pieces and Boundaries

o       Recontextualization (Postmodernism)--which rendition to save?

o       Dynamic & Lack of Fixity (evolving works)

o       Interactivity

o       Historical context

o       Difficulty of authentication over time

 

 

Documentation & Preservation:

 

What are we trying to do?

 

o       Show the work the way people saw and interacted with it when it was first created (may be impossible; in the past, the artifact and how one interacted with it didn’t change much, so preservation and documentation were relatively straightforward)

o       Show documentation of the work and people interacting with it when it was first created

 

o       Reinstall/Recreate/Reenact the work

 

 

What can be done specifically for Electronic Art?

o       Works themselves may no longer even exist; in many cases, what we can save amounts to forensic evidence

o       Enormous number of elements can, at times, be very important to preserve (pacing, original artifact, elements used to construct the artifact)

o       Too complex to save every one of these aspects for every type of material

o       Importance of saving pieces, representations, and documentation

o       Involve the artists to capture their intentions

o       Importance of Standards

o       Familiarize ourselves with recent conservation developments (Who Knows?, TechArcheology, Tate, IMAP)      

 

 

Technical & Conceptual Approaches to Solutions

  • Save the Hardware & Software
  • Emulate
  • Migrate
  • FRBR
  • Artist Intentions

 

 

Save the Hardware

  • A huge undertaking
  • Computer Museum
  • Broderbund

 

Old Video Formats

 

 

Old Digital Formats

 

 

Possible endless need for reformatting implies

o       Possible loss with each generation

o       Requires managed environment

 

 

Approaches to Solutions

o       Save the Hardware & Software

o       Emulate

o       Migrate

 

 

Conceptual Approaches to Digital Preservation

 

o       Refreshing always necessary due to volatility of physical strata

–         Impact on evidential value

o       Migration -- advantages & disadvantages

o       Emulation -- advantages & disadvantages

o       And will need a long-term managed environment

 

 

Migration (e.g.)

Wordstar to Word 1 to Word 3, …

 

-Tables and complex features often get corrupted

-Need to repeat every 4-5 years (maybe forever)

+We know how to do this ourselves

+If there’s a problem, we can catch it soon

 

 

Emulation

o       Keep the Wordstar file format, but write emulators to make it work in newer environments

 

o       +A better chance of carrying over complexity

o       +Many more features can survive

o       Problems may not be caught until it’s too late

o       Specialists and a whole infrastructure of emulators required

o       Serious problems (reverse engineering?)

 

 

Managed Environment

o       More than temperature & humidity control

o       Periodic monitoring of the works

o       Periodic monitoring of the technical environment for viewing the works (software, systems, hardware)

 

o       Trusted repositories

 

 

Incorporate parts of Functional Requirements for Bibliographic Records (FRBR)

•  work

•  expression

•  manifestation

•  item
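
As a rough illustration of how the four FRBR levels nest, the sketch below models them as Python dataclasses. The class names, field names, and the example video piece are hypothetical and chosen only for illustration; FRBR is a conceptual model and does not prescribe any code.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Item:
    """A single exemplar, e.g. one tape or one copy of a file."""
    location: str


@dataclass
class Manifestation:
    """A physical or digital embodiment of an expression."""
    carrier: str
    items: List[Item] = field(default_factory=list)


@dataclass
class Expression:
    """A particular realization of the work (a version, an edit, a restaging)."""
    description: str
    manifestations: List[Manifestation] = field(default_factory=list)


@dataclass
class Work:
    """The abstract intellectual or artistic creation."""
    title: str
    expressions: List[Expression] = field(default_factory=list)

    def all_items(self) -> List[Item]:
        """Walk work -> expression -> manifestation -> item."""
        return [
            item
            for expression in self.expressions
            for manifestation in expression.manifestations
            for item in manifestation.items
        ]


# Hypothetical example: one work, re-edited once, held on two carriers.
work = Work(
    title="Untitled video piece",
    expressions=[
        Expression(
            description="original edit",
            manifestations=[
                Manifestation("analog tape", [Item("vault shelf 12")]),
                Manifestation(
                    "digital master file",
                    [Item("repository copy A"), Item("off-site copy B")],
                ),
            ],
        ),
        Expression(description="artist's later re-edit"),
    ],
)
```

Cataloging at the work level, rather than only the item in hand, lets a later restaging be recorded as a new expression of the same work instead of an unrelated object.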

 

 

Standards for encoding creators’ intentions

(group efforts with Cultural Heritage community)

o       Matters in Media Art--New Art Trust, MoMA, SFMOMA, Tate

o       DOCAM (Documentation and Conservation of the Media Arts Heritage)

o       INCCA (International Network for the Conservation of Contemporary Art)

o       Past

·        Seeing Double Exhibition, & Symposium

·        Variable Media Initiative

·        Artists Interviews Project, Netherlands Institute for Cultural Heritage 1998-1999, Modern Art: Who Cares

·        TechArcheology: A Symposium on Installation Preservation (SFMOMA)

 

 

A few questions the conservation community should address

 

·        Special issues raised by non-library institutions

·        Special issues raised by images and rich media

·        What is the work (or salient points we need to preserve)?

·        Bring the arts communities (artist intent, BAVC) together with the preservation repository communities and the preservation metadata communities

 

·        Specifically get Cultural Heritage communities involved with the selected OCLC/RLG recommendations

·        Get cultural heritage groups started on working to make sure that structure standards incorporate our works

·        What organizations will take responsibility to save today’s digital “ephemeral” materials (online ‘zines, arts discussion groups, etc.)?

 

 

Standards, Metadata, & Best Practices to follow

 

•   Risk Management

•   Best Practices for Reformatting

•   Preservation Repositories & Metadata

•   Other Metadata & Standards

 

 

Risk Management

·        We can’t say definitively that we can make every digital work persist

·        What we CAN say is that the more a digital work conforms to standards and best practices, the greater the likelihood that we can assure persistence

 

·        Our preservation repositories can even accept deposits of non-conforming works, but the less they conform, the less likely that they’ll be salvageable

 

·        Persistence is most likely for works that share standards, metadata, and best practices

 

 

Reformatting Best Practices (e.g. still images)

·        Think about users (and potential users), uses, and type of material/collection

 

·        Scan at the highest quality that does not exceed what the likely users, uses, and material call for

·        Do not let today’s delivery limitations influence your scanning file sizes; understand the difference between digital masters and derivative files used for delivery

 

·        Many documents which appear to be bitonal actually are better represented with greyscale scans

·        Include color bar and ruler in the scan

 

·        Use objective measurements to determine scanner settings (do NOT attempt to make the image good on your particular monitor or use image processing to color correct)

 

·        Don’t use lossy compression

 

·        Store in a common (standardized) file format

 

·        Capture as much metadata as is reasonably possible (including metadata about the scanning process itself)

 

 

Preservation Repositories: Open Archival Info System Model

An Open Archival Information System (OAIS) is an archive, consisting of an organization of people and systems, that has accepted the responsibility to preserve information and make it available for a Designated Community.

·        The entities within the OAIS are based on the concept of an information package: a conceptualization of the structure of information as it moves into, through, and out of the digital archives.

·        An information package consists of the digital information object that is the focus of preservation, along with metadata necessary to support its long-term preservation and access, bound into a single logical package.

The OAIS recognizes three primary types of information packages:

1.      The Submission Information Package (SIP) is the version of the information package that is transferred from the Producer to the digital archives when information is ‘ingested’.

2.      The Archival Information Package (AIP) is the version of the information package that is stored and preserved by the digital archives. The AIP consists of the information that is the focus of preservation, accompanied by a complete set of metadata sufficient to support the digital archives’ preservation and access services.

3.      The Dissemination Information Package (DIP) is the version of the information package delivered to the Consumer in response to an access request.
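
The flow between the three package types can be sketched in a few lines of Python. This is only an illustration under assumptions: the class name, the metadata keys, and the `ingest` and `disseminate` functions are hypothetical and are not part of the OAIS reference model, which defines concepts rather than an API.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class InformationPackage:
    """Content plus the metadata bound to it in one logical package."""
    kind: str                  # "SIP", "AIP", or "DIP"
    content: bytes
    metadata: Dict[str, str]


def ingest(sip: InformationPackage, archive_id: str) -> InformationPackage:
    """Turn a submitted package into an archival one by adding
    metadata the archive itself is responsible for."""
    if sip.kind != "SIP":
        raise ValueError("only a SIP can be ingested")
    metadata = dict(sip.metadata)
    metadata["archive_id"] = archive_id            # hypothetical key
    metadata["size_bytes"] = str(len(sip.content))  # hypothetical key
    return InformationPackage("AIP", sip.content, metadata)


def disseminate(aip: InformationPackage, wanted_keys: List[str]) -> InformationPackage:
    """Build the package delivered to a Consumer: same content,
    but only the metadata relevant to the access request."""
    if aip.kind != "AIP":
        raise ValueError("only an AIP can be disseminated")
    metadata = {k: v for k, v in aip.metadata.items() if k in wanted_keys}
    return InformationPackage("DIP", aip.content, metadata)


# Hypothetical Producer submission, followed by ingest and one access request.
sip = InformationPackage(
    "SIP",
    b"example bitstream",
    {"title": "Example work", "producer": "Artist studio"},
)
aip = ingest(sip, archive_id="aip-0001")
dip = disseminate(aip, wanted_keys=["title"])
```

The point of the sketch is the separation of roles: the Producer hands over a SIP, the archive enriches and keeps the AIP, and the Consumer only ever sees a DIP derived from it.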

 

 

OCLC/RLG (OCLC Online Computer Library Center)

A nonprofit, membership, computer library service and research organization dedicated to the public purposes of furthering access to the world’s information and reducing information costs

Digital Repository Attributes

 

·        Administrative responsibility

·        Organizational viability

·        Financial sustainability

·        Technological suitability

·        System security

·        Procedural accountability

 

 

OCLC/RLG

Selected Recommendations

 

·        Policies, Certification processes, Risk management, Persistent ID, Migration/Emulation experiments

 

·        Stakeholders meet to decide how to describe what is in a dig repository

 

·        Examine special properties of particular classes of digital objects

·        Technical standards for exchange and interoperability btwn repositories

· Develop projects and case studies

· Copyright issues

 

 

Preservation Repositories: too difficult for small institutions

·        Too complex for small institutions to manage

·        Will be done through partnering (small museum with University) or through consortia (museum association, state-wide organization, …)

·        Archive or museum will direct what is needed, but digital repository will carry out the actual work (as defined in SIP/DIP/AIP)                            

 

 

OCLC/RLG Efforts: PREMIS Data Model

PREMIS Data Dictionary for Preservation Metadata was the first comprehensive specification for preservation metadata produced from an international, cross-domain consensus-building process.

–        Entities: Digital Object, Intellectual Entity, Event, Agent, & Rights

–        Relationships are statements of association between instances of entities

–        Semantic Units are the properties of an entity, and have values

–        Digital Object = a discrete unit of information

•         File = named and ordered sequence of bytes known by an operating system

•         Bitstream = a set of bits embedded within a file

•         Representation = the set of files needed for a "complete and reasonable" rendering of an Intellectual Entity

–        Intellectual Entity = a coherent set of content that can be viewed as a single unit

–        Event = an action involving at least one Object or Agent known to the repository

•         Documents actions that modify Digital Objects, records validity checks, etc.

•         Objects can be associated with any number of events

–        Agent = persons, organisations, or programs associated with preservation events

•         Not the main focus of the data dictionary

–        Rights Statements = assertions of rights pertaining to Objects or Agents

•         WG concentrates on rights and permissions associated with preservation activities

–        Relationships:

–        Relationships between Objects:

•         Structural relationships, e.g. how files combine to make up an Intellectual Entity

•         Derivation relationships, e.g. resulting from format transformations or replications

•         Dependency relationships, e.g. when Objects depend on others, e.g. fonts, DTDs, etc.

•         1:1 principle

 

 

PREMIS Data Dictionary Example

Fixity: Property that a Digital Object has not been changed between two points in time.
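
One common way to check fixity in practice is to record a cryptographic digest of the bitstream and recompute it later. The sketch below uses SHA-256 from Python's standard `hashlib` module; the function names and the choice of SHA-256 are assumptions made for illustration and are not taken from the PREMIS Data Dictionary.

```python
import hashlib


def digest(data: bytes) -> str:
    """Return the SHA-256 digest of a bitstream as a hex string."""
    return hashlib.sha256(data).hexdigest()


def fixity_ok(data: bytes, recorded_digest: str) -> bool:
    """True if the bitstream still matches the digest recorded earlier."""
    return digest(data) == recorded_digest


# At ingest: record the digest alongside the object.
original = b"frame data of a digital video master"
recorded = digest(original)

# At a later audit: recompute and compare.
unchanged = fixity_ok(original, recorded)

# A single altered byte is enough to fail the check.
damaged = b"frame data of a digital video mastex"
changed = fixity_ok(damaged, recorded)
```

A failed check shows only that something changed between the two points in time, not what changed or why, which is one reason a repository would also log each check as an Event.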

 

 

Other Standards/Metadata Areas

·        Synchronicity between media/streams

·        Performance Archive & Retrieval Working Group

·        Performing Arts Data Service (PADS)

·        Persistent IDs

·        Website management

·        Technical Imaging Metadata

·        Structural & Administrative Metadata

·        Complexity of formats (storage & compression)

·        Crosswalking Metadata

o       A crosswalk is a table that shows equivalent elements (or "fields") in more than one database. It maps the elements in one metadata scheme to the equivalent elements in another scheme.
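
A crosswalk of this kind can be sketched as a simple lookup table. Everything named below is hypothetical: the two schemes ("A" and "B"), their element names, and the sample record are invented for illustration and do not correspond to any published metadata standard.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical element names for two imaginary schemes, "A" and "B".
# A value of None marks an element with no equivalent in scheme B.
CROSSWALK_A_TO_B: Dict[str, Optional[str]] = {
    "title": "work_name",
    "creator": "artist",
    "date_created": "year",
    "installation_notes": None,
}


def crosswalk(
    record: Dict[str, str],
    mapping: Dict[str, Optional[str]],
) -> Tuple[Dict[str, str], List[str]]:
    """Translate a record from scheme A to scheme B.

    Returns the translated record and the list of elements that could
    not be carried over, so the loss is visible rather than silent.
    """
    translated: Dict[str, str] = {}
    dropped: List[str] = []
    for element, value in record.items():
        target = mapping.get(element)
        if target is None:
            dropped.append(element)
        else:
            translated[target] = value
    return translated, dropped


record_a = {
    "title": "Example work",
    "creator": "Example Artist",
    "installation_notes": "two screens facing each other",
}
record_b, lost = crosswalk(record_a, CROSSWALK_A_TO_B)
```

Returning the dropped elements alongside the result reflects a practical concern with crosswalks: schemes rarely match one-to-one, and whatever has no equivalent (here the installation notes) is exactly the kind of ancillary documentation that matters for electronic art.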

 

 

Persistent IDs--the Problem

o       Need to separate work ID from work location

 

o       Becomes a business process issue when one organization maintains the resource and another organization references it (i.e., licensed from vendors or managed by separate administrative structures)

 

Approach for today:

PURLs (Persistent Uniform Resource Locators)

o       Web addresses that act as permanent identifiers in the face of a dynamic and changing Web infrastructure.

o       Instead of resolving directly to Web resources, PURLs provide a level of indirection that allows the underlying Web addresses of resources to change over time without negatively affecting systems that depend on them.

o       This capability provides continuity of references to network resources that may migrate from machine to machine for business, social or technical reasons.
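
The level of indirection described above can be sketched as a small lookup table that sits between the identifier and the current address. This is a toy model under assumptions, not the PURL service or the Handle System protocol: the `Resolver` class, its methods, and all identifiers and addresses are made up for illustration.

```python
from typing import Dict


class Resolver:
    """Maps a persistent identifier to the resource's current location."""

    def __init__(self) -> None:
        self._table: Dict[str, str] = {}

    def register(self, persistent_id: str, location: str) -> None:
        """Assign an identifier to a resource at its first location."""
        self._table[persistent_id] = location

    def move(self, persistent_id: str, new_location: str) -> None:
        """Record that the resource moved; the identifier stays the same."""
        if persistent_id not in self._table:
            raise KeyError(persistent_id)
        self._table[persistent_id] = new_location

    def resolve(self, persistent_id: str) -> str:
        """Return the current location, as the target of a redirect would be."""
        return self._table[persistent_id]


resolver = Resolver()

# Made-up identifier and addresses (example.org is reserved for examples).
resolver.register("id:/art/work-42", "http://old-server.example.org/works/42.html")
before = resolver.resolve("id:/art/work-42")

# The hosting institution reorganizes its site; citations keep the same ID.
resolver.move("id:/art/work-42", "http://new-server.example.org/collection/42")
after = resolver.resolve("id:/art/work-42")
```

Only the resolver's table changes when the work moves, so every catalog record or citation that stores the identifier keeps working. The cost is that someone has to keep the table up to date, which is the business process issue noted earlier.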

 

Handles

o       The Handle System is a general purpose distributed information system that provides efficient, extensible, and secure HDL identifier and resolution services for use on networks such as the Internet.

o       It includes an open set of protocols, a namespace, and a reference implementation of the protocols.

o       The protocols enable a distributed computer system to store identifiers, known as handles, of arbitrary resources and resolve those handles into the information necessary to locate, access, contact, authenticate, or otherwise make use of the resources.

o       This information can be changed as needed to reflect the current state of the identified resource without changing its identifier, thus allowing the name of the item to persist over changes of location and other related state information.

o       The original version of the Handle System technology was developed with support from the Defense Advanced Research Projects Agency (DARPA).

 

HTTP redirects

 

 

Website Management

More issues with referencing IDs

•  References for mirror sites

•  References for back-up sites when main site is down or bottle-necked

•  References for off-site copies and archival copies

 

 

Ideal digital moving image file format

•  Non-proprietary file format

•  supports 10-bit/pixel

•  no compression or lossless compression using non-proprietary CODEC

•  supports multiple frame rates/frame sizes

•  supports time code data in file

•  supports audio (multichannel) and video in a single file

 

 

Limitations of present file formats

 

o       MPEG seems to be the only non-proprietary format

o       AVI and Quicktime with extensions incorporate most features, but are proprietary

o       Not enough companies produce encoders for  Motion JPEG 2000 for us to feel comfortable about its long-term sustainability

o       Many quality questions

•  Quality of playback?

•  Theater experience?

 

What about newer formats & developments?

 

o       Moving images on DVDs becoming interactive; need for more extensive source materials

 

o       Video installation works

o       Net-based works incorporating moving images

o       Images & rich media; new media and multi-media works

•                   Inter-relationships between parts

•                   For Contemporary Art: What is the Work?

 

 

Which should be reformatted to digital today?

 

•  Born digital--need to be kept in digital form

•  Video – probably; at least soon

•  Film – not very soon

•  A guessing game; we need more R&D, as well as education

 

 

Other Digital Preservation Activities/Projects

o       Library of Congress National Digital Information Infrastructure & Preservation Program (NDIIPP)

o       The InterPARES Project:  (International Research on Permanent Authentic Records in Electronic Systems)

o       Electronic Literature Organization

o       Virtualization

o       ERPANET - Electronic Resource Preservation and Access Network

o       Emulation

o       Open Emulation Project: Nintendo

o       Stella - Atari 2600 Emulator

o       MESS: emulates a large variety of different systems

o       CAMiLEON Project

o       nedlib

 

 

Example: Assessing an old Website

o       Producer’s Intention

o       Physical Characteristics

• Structure

• Risks

• Documentation

• Recommendations

 

Website Preservation Tools

Archive-It: a subscription service from the Internet Archive, allows institutions to build and preserve collections of born digital content.

 

o       Build, manage, & search own web archive through user-friendly web application

o       No need for technical expertise or file-hosting

o       Subscription service of the Internet Archive, designed for archives, museums, libraries, educational institutions, state organizations, individual researchers

 

 

 

Conclusions for preserving all types of digital works:

Digital Repository Traditions & Services require

  • Sustainability
  • Interoperability
  • Access

And all of these require Standards and Metadata

 

From the technological point of view

 

Standards offer the best hope of overcoming Impediments

o       Easier to maintain a single set of standards over long periods of time

o       Puts all institutions in the same large boat: all will face obsolescence and migration problems periodically throughout the future

 

For artistic and other challenging works:

How Best to save these works?

 

o       Use Standards wherever possible

o       Be aggressive about asset mgmt -- saving component parts and ancillary materials

o       Both creator and Archive should develop an institution-wide plan for saving electronic works

–        Refreshing and either migration or emulation – Standard encoding schemes

–        What is the work? And prioritize what needs to be saved – Save ancillary materials and records

 

What can we do specific to electronic media?

 

·        Works themselves may no longer even exist; in many cases, what we can save amounts to forensic evidence

·        Enormous number of elements can, at times, be very important to preserve (pacing, original artifact, elements used to construct the artifact)

·        Too complex to save every one of these aspects for every type of material

·        Importance of saving pieces, representations, and  documentation

·        Involve creators & curators to capture intentions

·        Importance of Standards

·        Familiarize ourselves with recent conservation developments (Guggenheim’s Variable Media, Who Knows?, TechArcheology, Tate, IMAP)

 

 

Towards a methodology for software preservation

Esther Conway, Brian Matthews, Arif Shaon, Juan Bicarregui,

Catherine Jones, Jim Woodcock (Univ of York)

 

Software Preservation

What is software preservation?

– Storing a copy of a software product

– Enabling its retrieval in the future

– Enabling its reconstruction in the future

– Enabling its execution in the future

Not what most software developers and maintainers do.

 

Why Preserve Software?

• Museums and archives:

–        Either supporting Hardware

• E.g. Bletchley Park, Science Museum,

–        Or in its own right

• Chilton Computing, Multics History Project

• Preserving the work

–        E.g. research work in Computing Science

–        Reproducible

• Preserving the Data

–        Preserving the software is necessary to preserve other data

–        Keep the data live and reusable

• Handling Legacy

–        Specialised code from the past which still needs to be used

–        Usually seen as a problem!

 

What to Preserve

Collect the source!

–        A cultural artifact

–        a form of literature (R. Gabriel)

–        beautiful programs are works of art (D. Knuth)

–        A view into the mind of the designer

–        intentions, assumptions, abstractions, mistakes, humor

–        little of this gets captured in any written form

–        This is the embryonic first 50 years of millennia of software development

–        The transition from cave painting to impressionism

–        A voluminous source repository can be analyzed to teach us about the evolution of software engineering

•         architectural evolution

•         data structure design

•         use of algorithms

•         optimization (and premature optimization)

•         locality of function

•         information hiding

•         coding style and idioms

•         defensive programming styles

•         software redundancy

•         failures and bugs

•         module decomposition

•         joint authorship

•         programming language use

Collect the binaries!

–        For use on restored, reconstructed or simulated old computers

Collect the documentation!

–        manuals, notes, papers, email

Collect the stories

–        interviews, reminiscences, websites

 

Preservation Approaches

• Adequacy: How do we know we have captured enough?

–        Depends crucially on Preservation Approach

• Technical Preservation. (techno-centric)

–        Maintain the original software (binary), within the original operating environment.

–        Sometimes maintain the hardware as well

• Emulation (data-centric).

–        Re-creating the original operating environment by programming future platforms and operating systems to emulate the original environment,

–        so that software can be preserved in binary and run "as is".

–        E.g. British Library

• Migration (process-centric).

–        Transferring digital information to new platforms before the earlier one becomes obsolete.

–        Updating the software code to apply to a new software environment.

–        Reconfiguration and recompilation – “Porting”

–        An extreme version of migration may involve rewriting the original code from the specification.

• Different preservation approaches require different significant properties

–        Use a notion of Performance to assess adequacy

–        Test case suites as tests of adequacy

 

Conceptual Framework

Three aspects to the framework:

• A Performance Model for software

–        Determine what it means to preserve s/w

–        Retrieve – Reconstruct – Replay

–        Adequacy of performance of s/w

• Model for describing s/w artifacts

–        As complex digital objects.

–        Versions and variants

• Properties for preservation

–        For retrieve, reconstruct, replay

 

Preservation Approach and Software Process

 

 

Performance Model for Software

 

 

·        Testing data performance to judge adequacy of the software performance.

·        Important to maintain software test suite to assess preservation of significant properties of the software.

 

Adequacy.

A software package (or any digital object) can be said to perform adequately relative to a particular set of features (“significant properties”) if, in a particular performance (that is, after it has been subjected to a reconstruction and replay process), it preserves those significant properties to an acceptable tolerance.
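
The definition above can be turned into a small test harness. The sketch below is one possible reading, under assumptions: the significant property is numerical output, tolerance is an absolute error bound, and `performs_adequately`, `migrated_sqrt`, and the recorded test cases are all hypothetical stand-ins rather than anything from the cited methodology.

```python
import math
from typing import Callable, List, Tuple

# Each test case pairs an input with the output recorded from the original software.
TestCase = Tuple[float, float]


def performs_adequately(
    software: Callable[[float], float],
    test_cases: List[TestCase],
    tolerance: float,
) -> bool:
    """True if every output stays within the acceptable tolerance
    of the result recorded from the original performance."""
    return all(
        math.isclose(software(x), expected, abs_tol=tolerance, rel_tol=0.0)
        for x, expected in test_cases
    )


def migrated_sqrt(x: float) -> float:
    """Stand-in for a reconstructed routine: a few Newton iterations."""
    guess = x if x > 1 else 1.0
    for _ in range(20):
        guess = 0.5 * (guess + x / guess)
    return guess


def broken_sqrt(x: float) -> float:
    """Stand-in for a faulty port that only happens to work for x = 4."""
    return x / 2.0


# Outputs recorded while the original software could still be run.
recorded_cases = [(4.0, 2.0), (9.0, 3.0), (2.0, 1.41421356)]

adequate = performs_adequately(migrated_sqrt, recorded_cases, tolerance=1e-6)
inadequate = performs_adequately(broken_sqrt, recorded_cases, tolerance=1e-6)
```

The faulty port passes the first case and fails the second, which is why a test suite preserved with the software needs more than one case.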

 

 

Software Category: “Adequacy” Factor(s)

 

Scientific Data Processing Software

 

The adequacy of the behavior of this type of software may be measured by:

– Running the software to process some pre-specified test input data

– Comparing the output of the test run with the corresponding pre-specified test result;

– Checking if the output exceeds the acceptable level of error tolerance for the software. For example, the NAG Software Library publishes test cases.

 

Games

 

The adequacy of the behavior of a game may be measured by:

– Comparing its User Interface UI with the screen capture of its original UI.

– Comparing its performance against some pre-defined use cases. For example, the completion time of a particular level can be compared against the average completion time for that level in the original game.

For example, when playing an emulated version of the 1990s DOS-based computer game Prince of Persia, some operations do not always work on the emulator and the original appearance of the game is somewhat lost, but it is still possible to play the complete game.

 

Programming Language Compilers

 

A compiler may be said to have been preserved adequately, if:

– it covers all features of the programming language that it supports, e.g. concurrency (i.e. threads), polymorphism, etc.

– an application compiled with it (from source code written in a language the compiler supports) yields the expected behaviour.

For example, some programming languages (e.g. Fortran, C, C++) have ISO standards which describe the correct behavior of software written in these languages. These standards also provide test programs that may be used to assess the adequacy of a compiler for rendering all features of the programming language that it supports.

 

Word Processor

 

The adequacy of a word processor may be measured based on its ability to:

– render existing supported word documents with an acceptable level of error tolerance. For example, a word processor may be regarded as adequate as long as it clearly displays the contents (e.g. text, diagram, etc.) of a word document, even if some of the features of the document content, such as font color and size, may have been rendered incorrectly or even lost completely.

– enable editing (e.g. add/change/remove text, change font) and saving existing word documents

– enable creation and saving of new word documents

 

For example, OpenOffice Writer is adequate for viewing and editing word documents originally created using Microsoft Word, with some level of error tolerance (e.g. images do not always appear as originally intended but are viewable nevertheless).

 

 

A Framework for Software

·        Provide a general model of software digital objects

·        Relate each concept in the model with a set of significant properties

 

·        For a different preservation approach, we need different significant properties to achieve a desired level of performance.

 

• Product

–        The whole software object under consideration

–        Could be single library module, or very large system (e.g. Linux)

–        Comes under one “authority” (legal control)

–        Defines “gross functionality”

• Version

–        Releases of the system

–        Characterized by changes in detailed functionality

• Variant

–        Versions for a particular platform

–        Characterized by operating system and environment

• Instance

–        A particular instance of a particular variant at a particular location

–        Ownership

–        An individual license

–        Fixed to particular MAC or IP address, URLs etc.
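
The four levels can be pictured as records that each point to the level above. The sketch below is a hedged illustration only: the class fields and all example values (product name, authority, platform, location, license) are hypothetical and not drawn from the framework's own schema.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Product:
    name: str
    authority: str      # who holds legal control


@dataclass(frozen=True)
class Version:
    product: Product
    release: str        # distinguished by changes in detailed functionality


@dataclass(frozen=True)
class Variant:
    version: Version
    platform: str       # operating system and environment


@dataclass(frozen=True)
class Instance:
    variant: Variant
    location: str       # where this particular copy is installed
    license_id: str

    def describe(self) -> str:
        """Spell out the full chain from product down to this copy."""
        v = self.variant
        return (
            f"{v.version.product.name} {v.version.release} "
            f"for {v.platform} at {self.location}"
        )


# Hypothetical values throughout.
product = Product(name="ExampleEditor", authority="Example Software Ltd")
version = Version(product, release="2.1")
variant = Variant(version, platform="Linux x86-64")
instance = Instance(variant, location="lab-machine-07", license_id="LIC-0001")
summary = instance.describe()
```

Keeping the levels separate makes it possible to attach different significant properties to each one, for example gross functionality to the product and environment details to the variant.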

 

Preservation Properties of Software

What attributes do we need to take into account?

– Functionality

• what it does and what data it depends on

– Environment

• platform, operating system, programming language

• versions

– Dependencies

• Compilation dependency graph

• Standard libraries

• Other software products

• Specialized hardware

– Software is a Composite digital object

• Collection of modules

• Specifications, Configuration scripts, test suites, documentation

– Architecture

• Client/server, storage system, input / output

– User interaction

• Command line, User Interface

• User model

 

·        Software is highly complex with a lot of factors which need to be considered

·        We need a framework to organize and express software.

 

Relationship to the OAIS model

  • Open Archival Information System (OAIS) – ISO standard for the preservation of digital objects.
  • Software preservation properties are related to concepts in OAIS
  • The OAIS-defined Descriptive Information, Representation Information (RI), and Preservation Description Information (PDI) (ISO 2002) can be used to retrieve (discover and access), reconstruct (compile source code), and replay (verify authenticity and run) a software object, respectively

 

 

Stakeholder Analysis

Software creator:  

–        Has detailed knowledge of the software

–        Can provide reconstruction and replay properties, to make it easier to maintain software in short and long term.

Software procurer:  

–        Funds the software creator.

Software user

Repository manager:

–        Collects and curates the institution’s software

 

 

Preservation analysis methodology and software

        

 

 

What Next?

 

References

An Annotated Bibliography: Approaches to Software Preservation