Document Formats for the Web: PDF, DWF, and CMS
Have you noticed how many documents you encounter online end in "PDF"?
The increasing prevalence of this file format is a testimony to the success
of Adobe’s "portable document format." Indeed, its
ubiquity and the free Adobe Acrobat Reader for viewing these files have led some
Web projects to standardize around Acrobat, the product suite that produces
PDF files.
Files in PDF format used to be the bane of accessibility professionals and
a disaster for those dependent on screen readers to translate Web content into
interpretable formats (audio or large-type visual representation). PDFs hung
a "disabled not allowed" sign on information that others could view.
So why has the popularity of PDF files grown so much?
PDFs represent independence from the idiosyncrasies of software applications
and printers. In effect, PDFs claim cross-platform compatibility, application
independence, and content integrity for documents distributed via the Web. And with
the recent successes of Adobe’s e-Paper division, the producer of Acrobat,
it seems that the newest release, Acrobat 6.0, addresses some of the accessibility criticism.
Making PDFs More Useful
As with most things involving structure and text that relate to the Web, XML
(eXtensible Markup Language) comes to the rescue. The newest release of Acrobat’s
PDF file format incorporates some features from the land of XML.
One of the primary reasons that PDFs became popular in the first place was
their fidelity to the printed page: they present on the screen what was formerly
available only on paper. You can take a journal article or any other
printed document and make a high-quality digital representation of it, including
the ability to reprint it exactly like the original. Acrobat software is easily
called by other applications, whether as a browser plug-in to view a file or
as an alternative print driver to create a PDF document from another application,
e.g., Microsoft Word. So far so good.
PDFs are much more readable than a scanned TIFF image, and they provide value-added
display control. Until relatively recently, however, they were just a bag o’ bits
with a simple presentation interface.
XML provides a mechanism for adding structure to the parts of a document.
In particular, with XML, business logic can be embedded in the document so that
it can be acted on in a workflow process. XML forms and their associated logic
can use PDF as a familiar interaction interface. Perhaps most interesting is
the ability to add digital signatures, metadata, and schemas to the PDF. This
is particularly important to enable the PDF document to interact with search
tools and cataloging systems for query and archiving—definitely a step
in the right direction. It’s no surprise that Adobe has decided on this
strategy to make the PDF a major player in complex document management environments,
leveraging its strength as an interface to present digital images as faithful
representations of the printed page.
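
To make the idea of searchable metadata concrete, here is a minimal sketch in Python. It does not reproduce Adobe's actual metadata format. It assumes a hypothetical record that borrows element names from the Dublin Core vocabulary, and the function names and sample values are invented for illustration. It only shows how a cataloging tool could read structured fields instead of guessing from page text.

```python
import xml.etree.ElementTree as ET

# Dublin Core element namespace; using it here is an assumption for this sketch.
DC_NS = "http://purl.org/dc/elements/1.1/"
ET.register_namespace("dc", DC_NS)


def build_metadata(title, creator, subjects):
    """Build a small XML metadata record (hypothetical layout) as a string."""
    root = ET.Element("metadata")
    ET.SubElement(root, f"{{{DC_NS}}}title").text = title
    ET.SubElement(root, f"{{{DC_NS}}}creator").text = creator
    for subject in subjects:
        ET.SubElement(root, f"{{{DC_NS}}}subject").text = subject
    return ET.tostring(root, encoding="unicode")


def find_subjects(xml_text):
    """Return the subject keywords a search or cataloging tool could index."""
    root = ET.fromstring(xml_text)
    return [element.text for element in root.findall(f"{{{DC_NS}}}subject")]


record = build_metadata("Sample Article", "A. Author", ["accessibility", "PDF"])
print(find_subjects(record))
```

Because the keywords live in named elements, a catalog can query them directly without parsing the visual layout of the page.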
Alternatives or Complements to PDFs?
While Adobe exceeded Wall Street expectations in its recent quarterly report,
other file formats are being put forward to address PDF’s perceived weaknesses,
and some advocates urge restricting the use of the PDF format altogether.
Autodesk, maker of AutoCAD design software, has an open file standard called
Design Web Format (DWF) that is claimed to create, display, and print multi-sheet
computer-aided design drawings faster, at higher resolution, and in smaller
file sizes than can be done with PDFs. It’s currently limited to design
documents, but if you need to work with them, you’re probably already
working with AutoCAD files. Others can view them with the Autodesk Express Viewer
or the Autodesk Volo View application.
Then again, why exactly are you putting these files up on your Web site or
course management system? Jakob Nielsen of useit.com (Usable Information Technology)
argues that PDFs are really good for one thing: printing. If your intention
is anything but that, he says, you should consider using something else. His
suggestions hinge on presenting the best user experience possible. That
is, give users enough information about the document to justify downloading
and opening up a helper application to proceed further. In addition, Nielsen
urges you to prevent search engine spiders from indexing the words in PDFs, because
a search result that links to one dumps the user at the PDF’s front page, nowhere
near the indexed word.
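
One common way to keep spiders away from PDFs is a robots.txt rule, sketched below in Python. The sketch assumes a hypothetical site layout in which every PDF lives under a /pdf/ directory, and www.example.edu is a placeholder host. Python's standard urllib.robotparser stands in for a well-behaved crawler. Note that robots.txt is a voluntary convention, so it discourages indexing but does not enforce it.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: assumes all PDFs are stored under /pdf/.
ROBOTS_TXT = """\
User-agent: *
Disallow: /pdf/
"""


def is_crawlable(url, robots_txt=ROBOTS_TXT, agent="*"):
    """Return True if a crawler that honors robots.txt may fetch the URL."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


# The HTML summary page stays indexable; the PDF behind it does not.
print(is_crawlable("http://www.example.edu/articles/summary.html"))
print(is_crawlable("http://www.example.edu/pdf/report.pdf"))
```

Pairing each PDF with an indexable HTML summary page also serves Nielsen's other suggestion: the summary can tell users what the document contains before they commit to the download.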
CMS and The Digital Paper Chase
One of the surprises faculty encounter when switching to using a course management system (CMS) in teaching
is the extent to which we rely on paper for course materials. The life cycle
of material selected for teaching a course is more complex than we tend to acknowledge.
Some of the critical elements include:
· Conceptualizing what is required;
· Finding it;
· Digitizing it into some acceptable file format (if not originally electronic);
· Placing it on the CMS;
· Deciding what to do with it when the course ends.
These are steps we go through in preparing for our teaching in general. Now
that we’re using CMSes to extend the interaction with course material
online, the full life cycle of electronic documents becomes an issue for faculty,
for IT departments running CMSes for their institutions, and for the CMS vendors.
Is your CMS a ‘roach motel’ for your course materials? You need
a personal, as well as an institutional, strategy for CMS electronic document management.
We’ll take this up in future columns.
Phil Long, Ph.D., is senior strategist for the Academic
Computing Enterprise at MIT. He is also a senior associate for the TLT Group
of the AAHE.
Briefs
Office services king Kinko’s will offer course-packs, compilations
of course materials used to supplement traditional textbooks, to the higher
education market.
The National Science Foundation will award $9 million to U.C. Irvine and
$3.5 million to U.C. San Diego to develop information sharing tools and
organizational strategies for first responders and emergency service providers.
Student design teams from the University of Minnesota, U.C. San Diego,
and Cornell won first, second, and third place in a chip design contest
sponsored by the Semiconductor Research Corp.
A federal jury awarded CollegeNET $1.2 million in damages for infringement
of two patents covering its Universal Forms Engine.
For these and other news stories, or to subscribe to our eNewsletters,
Syllabus News Update and Syllabus IT Trends, visit www.syllabus.com.