HTML or PDF? Why Are We Asking This Question?

Speaking of his desire to publish more documents on his university's Web site in HTML instead of PDF, a university Web master who shall remain anonymous recently posted the following on the UWEBD discussion list: "There are a lot of things I'd love to do with the school's Web site if it weren't for my job getting in the way." There is a lot of wisdom in that sarcastic statement.

Whether to publish documents online in PDF or in HTML is now a great conversation starter at parties with lots of geeks present. Passion fills the air, and tempers flare. But I think it's only a temporary issue, dependent on our current stage of Web evolution; a question that will be going away in the next decade. Formerly an HTML fanatic, I am at the moment a big fan of publishing in PDF. I guess if I were running for president, the current administration would be able to accuse me of waffling.

When my employer-organization's journal, Planning for Higher Education, moved to publishing online as well as in print, I was a tyrant about ensuring that we publish it in HTML, not in PDF format. Why? Well, primarily because at the time - late 2000, at the very end of the last millennium - it seemed as though there were many users who found it difficult to keep a fairly current version of Acrobat Reader on their computers. It was commonplace for us to send out a PDF file to a committee or other group of volunteer leaders and hear back from one or two that they couldn't open the document, and could we please fax it to them.

At the same time, in well-funded organizations like the giant pharmaceutical company that my spouse works for, every machine could be, and was, made to have Acrobat Reader, and she was churning out a regular, weekly internal newsletter in PDF - which was even formatted to a height and width per page that suited viewing on screen. (I thought that was pretty clever, and I speculated at the time that print publications would modify their height and width to match monitor specs, but it seems not to have caught on, as some clever things never do - although much of what we see in print nowadays has remarkably converged with Web documents' look and feel. I also remember when Ford Motor Company eventually gave up trying to drop the second "e" in the word "employee" in all of its communications. Apparently, it figured that over a year and in millions, maybe billions of printed or typed pages, it could be saving some money, somewhere.)

But we published our journal in HTML, which led to some fairly awkward contortions. For example, I wanted to reduce the likelihood of people printing out a complete article instead of ordering a reprint (or subscribing) and I wanted to increase the likelihood of citations from the Web matching those in print, so we publish on each HTML page only the same text as was published on each printed page, with appropriate pagination. Somehow, this looks fairly strange in a browser where it doesn't on the printed page. (See this page, for example.)

But, at the time, we thought it was worth it. In fact, I was passionate about it. How I feel about it right now, in early 2004, reminds me of the final lines of Meat Loaf's classic tune "Paradise by the Dashboard Light." I'm sitting here kind of "waiting for the end of time, to hurry up and come along" so we can make our decision to start publishing PDF instead of HTML!

In a time of increasing demand on our staff time and other resources, I really resent the time it takes to manually move text from Quark into the database which publishes the online version of our journal, and the various manipulations needed to make that work. As that happens, and as my staff spends time every three months that it really seems it doesn't need to spend, I think to myself, "We could just print from Quark to PDF and be done with it!" As one member of the UWEBD list put it, "How do you justify the 2-3 hours it'll take you to HTML-ify articles when to the normal person it's a couple of button pushes with Acrobat and you're on your merry way?"

I am certain that if we were making the move today we would just publish non-printable PDFs linked to an HTML index page. PDF has come a long way. Accessibility issues aside, as important as they are, you can now provide active form fields and validation in a PDF document, and the newest photocopiers can output to PDF nearly as easily as they do to paper with sprinkles of baked-on carbon dust.

Of course, newer trends like the use of XML and serious content management systems will render all of this moot, as this one of many information technology convergences moves us to a future where end users can select their content and their format. And Bill Gates has recently expounded on how he believes that reading will move from paper to the screen in the next ten years. We'll see if Microsoft can force the kind of cultural shift that Ford could not with that extra "e."

But at the moment, it's not "there" for smaller organizations, including colleges and university departments. And the move to getting "there" seems, at this end, to be a long and arduous path that requires lots of money to be spent, and lots of staff time spent analyzing and changing staff procedures. My staff has, for example, the savvy and the knowledge to move to XML and better content management, but we also have strongly competing demands on the time of the staff who would have to drive it. So, for at least the short- to medium-term, I am going to be looking far more favorably at publishing PDF online versions of also-printed-on-paper publications than I am going to be looking at HTML.

As another UWEBDer put it, "Bingo. The issue is time and resources. A PDF is only seconds away. If you don't have a CMS, or you're not a full-time Webbie, converting print to HTML is a chore. Who likes to do their chores?" He goes on to point out that the very pragmatic things for people like me to be considering now include:

- Use PDFs when people expect or want to see the original, printed product;
- Use PDFs if the only other option you can do is no Web version at all;
- Realize that for some, PDFs are slow and cumbersome and can create crashing problems; and
- When you do take the time to do HTML, publicize the heck out of it and take credit for it!

How about you? I'd like to hear your opinions on this and would be happy to share them, in an unbiased and fair kind of way, in a future opinion piece here. Don't hesitate to e-mail me at [email protected].
