CS835 - Data and Document Representation & Processing

Lecture 1 - Convergence : Data, Documents, Delivery

History of Printing

1.     China: The Technological Roots

2.     Gutenberg and the Historical Moment in Western Europe

3.     Print and Modern Thought

4.     Advances in Print Technology


Desktop Publishing (DTP)

Definition: Preparation of typeset or near-typeset documents on desktop (personal) computers. All text composition, page makeup, manipulation of digitized graphics, and integration of text and graphics are performed on the desktop computer.

Three activities of DTP

  1. Pure text preparation
  2. Creation and manipulation of graphic images, where text plays only a minor role
  3. Complex page makeup, in which text and graphic elements are united in a harmonious way within the confines of a single page


Components of a DTP system

Key stages in the process of DTP

  1. Need for publication: Conduct appropriate analysis to determine need for publication
  2. Purpose and audience: Consider the audience, content, style, language, purpose.
  3. Create text: Word-processed, scanned, or typed directly into the program. Proofread the text to check the content.
  4. Create graphics: Graphics are created with appropriate software, a scanner, a tablet, or a digitiser.
  5. Design format: Determine grid, columns, headers and footers, page numbers, text style, design final layout.
  6. Load files and lay out publication: Text and graphics are combined, formatted, scaled and positioned.
  7. Print: Choose a suitable high-resolution printer, e.g. a laser printer or imagesetter.


1973 - Alto - Xerox PARC

1981 - Model 8010 (Star) - Xerox PARC


1983 - Canon develops the 'engine' used in low cost laser printers

1983 - Lisa - Apple


1984 - Hewlett-Packard produces the HP LaserJet

1984 - Macintosh - Apple

1984 - Adobe introduce PostScript page description language (PDL)
1985 - Aldus develops PageMaker for Mac
1985 - Adobe builds PostScript hardware/software interface to Apple LaserWriter (cost $5000)
1986 - Microsoft release Windows 1.1


Mark-up Languages

Definition: A notation for identifying the components of a document to enable each component to be appropriately formatted, displayed, or used.
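
To illustrate the definition, the sketch below labels each component of a small document and then formats the same components in two different ways. The component names (title, para, note) and both formatting rules are hypothetical, chosen only for illustration; they are not taken from any particular mark-up language.

```python
# Each component carries a label saying what it is, not how it should look.
# The labels "title", "para" and "note" are hypothetical.
document = [
    ("title", "Desktop Publishing"),
    ("para", "Text and graphics are combined on one page."),
    ("note", "Proofread before printing."),
]


def render_plain(doc):
    """Format each component for a plain-text display."""
    rules = {
        "title": lambda text: text.upper(),
        "para": lambda text: text,
        "note": lambda text: "Note: " + text,
    }
    return "\n".join(rules[kind](text) for kind, text in doc)


def render_tagged(doc):
    """Write the same components out as generic angle-bracket mark-up."""
    return "\n".join(f"<{kind}>{text}</{kind}>" for kind, text in doc)


print(render_plain(document))
print(render_tagged(document))
```

Because the labels describe what each component is, a new output format only needs a new set of rules; the document itself does not change.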

1967 - William Tunnicliffe - paper titled The Separation of Information Content of Documents from their Format - separates content from formatting

1969 - Charles Goldfarb - work at IBM expanded on the GenCode ideas to develop the Generalized Markup Language (GML); by 1980, 90% of IBM documents were formatted in GML

1973 - Joe Ossanna - troff - text formatter for the Unix operating system (PDP-11)

1977 - Donald Knuth - TeX - begun in 1977 and evolved through the early '80s; gives detailed control over text layout and font descriptions to typeset mathematical books at professional quality.

1980 - Brian Reid - Scribe: a document specification language and its compiler

1986 - Standard Generalized Markup Language (SGML) extended GML and was accepted as an ISO standard

1991 - Tim Berners-Lee and Robert Cailliau - HyperText Markup Language (HTML) - some SGML syntax, without the meta-language

1998 - XML - Extensible Markup Language
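
Unlike HTML, which has a fixed set of tags, XML lets a document use its own element names. The sketch below reads a small XML fragment with Python's standard xml.etree.ElementTree module. The element and attribute names (lecture, number, title, topic) are hypothetical and were invented for this example.

```python
import xml.etree.ElementTree as ET

# A made-up XML fragment; none of these element names are predefined by XML.
source = """<lecture number="1">
  <title>Convergence: Data, Documents, Delivery</title>
  <topic>History of Printing</topic>
  <topic>Desktop Publishing</topic>
  <topic>Mark-up Languages</topic>
</lecture>"""

root = ET.fromstring(source)                     # parse the text into a tree
title = root.findtext("title")                   # text of the first <title> child
topics = [t.text for t in root.findall("topic")] # text of every <topic> child

print(root.get("number"), title)
print(topics)
```

The program never deals with fonts or layout: it asks for components by name, which is the separation of content from format that the earlier entries in this timeline introduced.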

Other Languages
Adobe PostScript
Adobe PDF
· Optimized PostScript
· PDF document attributes:

Groupware and Computer-Supported Cooperative Work (CSCW)

The term Computer-Supported Cooperative Work was coined in 1984 by Irene Greif and Paul Cashman

A Paradigm Shift for Computing

Transformation from human-machine to human-human interaction

Results from several convergent phenomena:

 Widespread groupware:

CSCW Taxonomy

Groupware is classified by time (synchronous or asynchronous communications) and by place (one meeting site or multiple meeting sites).

Synchronous Communications, One Meeting Site: Face-to-Face Interactions

  • Public Computer Displays
  • Electronic Meeting Rooms
  • Group Decision Support Systems

Synchronous Communications, Multiple Meeting Sites: Remote Interactions

  • Shared View Desktop Conference Systems
  • Desktop Conferencing with Collaborative Editors
  • Video Conferencing
  • Media Spaces

Asynchronous Communications, One Meeting Site: Ongoing Tasks

  • Team Rooms
  • Group Displays
  • Project Management

Asynchronous Communications, Multiple Meeting Sites: Communication and Coordination

  • Vanilla email
  • Async conferencing bulletin boards
  • Structured messaging systems
  • Workflow management
  • Version Control
  • Meeting Schedulers
  • Cooperative hypertext, organizational memory
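
The taxonomy can be treated as a two-by-two lookup table. The sketch below assumes the usual time/place reading of the grid, in which synchronous work at one site is face-to-face interaction, synchronous work across sites is remote interaction, asynchronous work at one site covers ongoing tasks, and asynchronous work across sites covers communication and coordination. The class and function names are hypothetical.

```python
from enum import Enum


class Time(Enum):
    SYNCHRONOUS = "synchronous"
    ASYNCHRONOUS = "asynchronous"


class Place(Enum):
    ONE_SITE = "one meeting site"
    MULTIPLE_SITES = "multiple meeting sites"


# One entry per cell of the grid (assumed time/place reading, see above).
TAXONOMY = {
    (Time.SYNCHRONOUS, Place.ONE_SITE): "Face-to-Face Interactions",
    (Time.SYNCHRONOUS, Place.MULTIPLE_SITES): "Remote Interactions",
    (Time.ASYNCHRONOUS, Place.ONE_SITE): "Ongoing Tasks",
    (Time.ASYNCHRONOUS, Place.MULTIPLE_SITES): "Communication and Coordination",
}


def classify(time, place):
    """Return the taxonomy category for a given time/place combination."""
    return TAXONOMY[(time, place)]


# Video conferencing: people meet at the same time from different sites.
print(classify(Time.SYNCHRONOUS, Place.MULTIPLE_SITES))
# Email: people at different sites read and reply at different times.
print(classify(Time.ASYNCHRONOUS, Place.MULTIPLE_SITES))
```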

Asynchronous Groupware

Structured Messages, Agents and Workflow

Cooperative Hypertext and Organizational Memory

        1. Retention of knowledge
        2. Support for global collaboration and global discussion
        3. Enhanced communication

Synchronous Groupware

  1. Collaboration transparency - single-user software made available to a group

  2. Collaboration awareness - software rewritten for group use
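
The difference between the two approaches can be sketched in a few lines. This is a toy model under stated assumptions, not a description of any real groupware product: all class names are hypothetical, and "typing" stands in for every kind of user input.

```python
class SingleUserEditor:
    """An ordinary editor that knows nothing about groups."""

    def __init__(self):
        self.text = ""

    def type(self, chars):
        self.text += chars
        return self.text


class SharedView:
    """Collaboration transparency: one unmodified editor, many users.

    Input from every user is funnelled into the single application,
    and its output is copied to each user's display.
    """

    def __init__(self, app, users):
        self.app = app
        self.displays = {user: "" for user in users}

    def type(self, user, chars):
        if user not in self.displays:
            raise KeyError(user)
        screen = self.app.type(chars)  # the app cannot tell who typed
        for name in self.displays:
            self.displays[name] = screen
        return screen


class GroupEditor:
    """Collaboration awareness: the editor itself tracks who did what."""

    def __init__(self, users):
        self.text = ""
        self.contributions = {user: 0 for user in users}

    def type(self, user, chars):
        self.text += chars
        self.contributions[user] += len(chars)
        return self.text


shared = SharedView(SingleUserEditor(), ["ann", "bob"])
shared.type("ann", "Hello ")
shared.type("bob", "world")
print(shared.displays)

aware = GroupEditor(["ann", "bob"])
aware.type("ann", "Hello ")
aware.type("bob", "world")
print(aware.contributions)
```

In the transparent case the wrapper does all the sharing and the application stays unchanged, so it cannot offer group features such as per-user attribution. The aware editor can, at the cost of being rewritten.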