What's Happening with the Web
Greg Marks [GM]
Terry Calhoun [TC]
Howard Strauss [HS]
Oct. 22, 1998
GM: Welcome to the CREN TechTalk series. You are here because it's time to discuss the core technologies in your future. This is Greg Marks, your CREN host for today. Today we have Howard Strauss as our guest expert to get another update from Howard on "What's Happening with the Web."
Our co-host for today is Terry Calhoun, who is the Internet editor for the Society for College and University Planning. Terry is an expert in electronic media communication, and we are pleased to welcome him here today. Welcome, Terry.
TC: Thanks, Greg, and I'm really glad to be here. As we were talking before we went live, I just remembered that it was a session that I attended that Howard gave at, I think, CAUSE '93 that led me to say to myself, "Hey! I can do this stuff too," and got me into this whole field, so it should be a pleasure talking with him.
GM: Great. As everybody listening in now understands, our featured speaker today is Howard Strauss. Howard is the Manager of Advanced Applications at Princeton University, and is as close as anyone on the globe to the pulse of what's happening now with the World Wide Web and how it relates to Howard education -- not Howard education, higher education!
HS: Totally different topic!
GM: Yes, that's right. You'll soon get a dose of Howard education! Usually, Howard is the technology anchor for these TechTalks, but today we're pleased to have him in the role of expert and we can grill him.
Welcome, Howard.
HS: Thank you, Greg, and thank you, Terry.
I thought I'd open by telling you that lots of people who talked to me about the Web lately have been asking me about what the next killer application is going to be, since everybody knows something's going to replace the Web and be the next killer application. And the answer I give them might be of interest to listeners -- that the next killer application is going to be the Web!
The Web continues to re-invent itself, continues to change, and I think as long as it's able to do that -- as long as it's able to look quite different than it looked a year ago or even a few months ago -- that it'll continue to be the killer application that it is. Today we're going to get a chance to talk about some of the ways it's changing.
GM: Great. I think there's great wisdom in what you're saying.
Before we launch into the TechTalk, I'd like to remind everyone that you can ask questions directly of Howard during this Webcast at the address expert@cren.net. If we don't get to your questions during the session, watch for an expanded Q&A sequence on the event Website afterwards. And let me also remind you that you can pick up on archive sessions, both of this event and earlier events, on the CREN Website at www.cren.net. Howard, I think from our prep for this session, we have enough news about what's happening to really deal with many sessions on many topics, so let's dig in today and let's go after I guess in particular publishing without HTML. Let's start with that topic. Go ahead.
HS: Sure. One of the reasons that I actually put this on the list of topics that we might discuss is that, although people when they think about the Web or new stuff on the Web, they're always thinking about all the great new features and complicated things and stuff going on. One of the things that's happened on the Web is that just a little while ago, it was enough for people to be able to browse the Web. And I think that a lot of universities were struggling to make sure that all their faculty and staff could browse the Web.
But that has really changed. Today, people expect that all ideas and data and knowledge and everything is going to be on the Web, so it means if you cannot get your ideas, your data and your stuff on the Web -- if you can't publish your stuff on the Web, for all intents and purposes, it appears that your stuff doesn't exist.
So it's really important that everybody be able to publish on the Web. Unfortunately, it's hard to do (or people think it's hard to do), but there are many ways today that people can publish on the Web without knowing HTML or without knowing much of anything beyond how to use the word processor or spreadsheet or whatever it is they use every day.
TC: What do you think is the easiest to teach a non-techie?
HS: I think the easiest thing to teach a non-techie -- or to teach anybody, even a techie. There's lots of techies out there who don't know one particular product or another -- but the easiest thing to teach them is something they already have on their desktop, that they already use. And for an awful lot of people, that's Microsoft Word.
So if somebody today knows Microsoft Word, if they really know it, they can -- without learning much more than how to save the file as HTML (which is on the menu). You probably never noticed it, but if you pull down the FILE menu and look, you'll see there's a "SAVE AS HTML" thing there. If you take any old Word document and just save it as HTML, that's it. You have a Web document. Get it on the Web!
GM: Are there any features that, when you've done that, you need to be careful about because they don't work?
HS: Yeah, that was certainly a gross oversimplification of what one has to do. Although not that much of a simplification in that there's a lot of things that Word can do in its standard form that are not available on the Web. For example, the simplest, most obvious thing is that you can easily put together 64 different fonts in, say, 64 different sizes. You could create something that looks like a ransom note in Word.
GM: Exciting!
HS: Right. And a lot of people familiar with the Web realize that you can't have 64 different sizes of fonts on the Web. You're lucky to get about seven of them. So if you built something in Word that does not convert to HTML, what happens when you do a SAVE AS HTML, Word either ignores the thing entirely or it finds something close.
In the case of different font sizes, it looks at the font size and says something like, "Well, this font's really big, so on the Web, we'll make it big." -- though not necessarily 72 points, which is perhaps what you asked for. And if you have some point size that's very tiny, it'll be converted to something that's tiny on the Web, though not necessarily the right point size.
You have to get some sense of what you can do and what you can't do, but most simple documents convert without any further ado. That's it. They just convert very nicely.
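The "find something close" behavior described above can be sketched in a few lines of Python. This is only an illustration, not Word's actual conversion logic: the point values assigned to the seven HTML font sizes are an assumption based on common browser defaults, and `nearest_html_size` is a hypothetical helper name.

```python
# Assumed point sizes for HTML font sizes 1 through 7.
# These are typical browser defaults, not values given in the talk.
HTML_SIZE_POINTS = {1: 8, 2: 10, 3: 12, 4: 14, 5: 18, 6: 24, 7: 36}


def nearest_html_size(points: float) -> int:
    """Return the HTML font size (1-7) whose assumed point size is closest."""
    return min(HTML_SIZE_POINTS, key=lambda size: abs(HTML_SIZE_POINTS[size] - points))


# A 72-point heading can only become "the biggest size available".
biggest = nearest_html_size(72)
# A 6-point footnote can only become "the smallest size available".
smallest = nearest_html_size(6)
```

Under these assumed values, a 72-point heading and a 40-point heading land on the same HTML size, which is why a document with many distinct sizes loses detail when it is converted.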
GM: Suppose I've used the graphic tools or used tools to create a table or some kind of copy of a form or something like that. What happens then?
HS: Okay. Now we get into a more interesting area, the whole area that you mentioned, Greg. You actually asked about six questions.
GM: Lucky you!
HS: Right. But when it comes to something like Word Art, which if you haven't used on the Web, you really should -- not on the Web, in Word, excuse me. If you haven't used Word Art, it's wonderful stuff. You can produce really nice effects with fonts and things.
TC: My kids love that.
HS: And there's AutoShapes, which lets you draw flow charts and organization charts and all that kind of stuff. If you just blindly go in and draw those things, and they look great on your Word page, and save them as HTML, they're ignored because Word does not know what you really plan to do with them. Why it doesn't know what you plan to do with them is beyond me, but it seems not to know.
But it's only a couple very simple steps to turn those things into GIF files and have them -- I know this sounds complicated for people who are faculty members whose expertise, say, is in biology, not in building Webpages. But all one needs to do is just insert a little box into the document and put an object in there which is called a Word Picture that's really just one step on the menu. Go over to the INSERT menu, insert an object called a Word Picture and then if you put inside that little frame called a Word Picture, if you put AutoShapes or Word Art or draw anything you can imagine in there that you can draw with Word, then everything gets converted correctly.
In fact, it's such a nice feature that lots of times, if I want to build a GIF file -- I'm not even interested in building a Word file or whatever -- I want to build a GIF file, I'll have some art that I'm going to sketch, I'll just go into Word, open up a little Word Picture, sketch the thing in there, save it as a GIF image, and then I will use it in other Websites that in fact have nothing whatever to do with Word. So even if you're an experienced Web designer, you might find that there's tools in Word. (Experienced Web designers are cringing at the thought.)
GM: Yes.
HS: But there's tools in Word that even experienced Web designers can use to really turn out some interesting stuff.
GM: I'd love to know how many of the folks listening in have just opened up Word on their desktop and are trying this out.
HS: Okay, I should say that if you have opened Word on your desktop -- and I urge you to do it. Go off to your desktops immediately -- if you do do it, one of the things you might discover is that if you're building a Web form or you're building anything on the Web, of course, you want a pretty background. And you know -- I say "know" in italics, if one can speak in italics -- you know that you can't produce pretty backgrounds in Word.
But all those people out there who have Word open, just click on the thing that says FORMAT, the little item that says FORMAT. Go down to BACKGROUND. Probably something you never realized was there! And look at all the pretty colors you can set as the background. This has nothing whatever to do with the Web. This is just with Word you can always have pretty-colored backgrounds. You've been able to have that for a long time, including little marble things in the background and all kinds of textures and that kind of stuff. It's out there. Go try it!
And of course, if you then save it as HTML, it'll save it with the pretty background.
TC: Oh, it will?
HS: Yeah, it will.
TC: That's great.
HS: I've really taught this kind of stuff to many people and showed them all the complicated stuff that you can do, and everybody comes back to the fact that, "Wow, I didn't know you could do backgrounds! That's kind of amazing!"
GM: How about such pedestrian things as Excel spreadsheets or PowerPoint presentations or Access databases? What about those?
HS: With them to do simple things, it's just as simple.
So for example, if you have a PowerPoint presentation, there's a Wizard that will convert the thing to HTML for you. You just do a SAVE AS HTML. The Wizard asks you a few questions. The questions are of the order of, "Would you like the little navigation buttons to be on the bottom of the slide or the right edge of the slide?" That kind of thing -- doesn't ask you questions you can't answer.
The questions have to do with formatting and things like that. Answer those questions. When you get to the end, what's very nice is, you've probably answered six or eight questions, and you can save the answers to those questions as a profile. Which means the next time you go through the thing, your next slide presentation will be converted identically and you don't even have to answer the questions again.
So it does a very, very nice job of doing that. It even lets you include a little link so that people can not only see the thing as HTML, but they could download the original presentation over the Web, if you want them to. That's very, very nice.
TC: That's a nice feature. Are there programs that people might use regularly that have this hidden ability to produce Webpages that they might not know about?
HS: Well, almost every program today has the ability to do that, not just Microsoft products. We're talking about Microsoft products because, number one, I use them a great deal. And also, obviously, lots of other people use them.
But in WordPerfect, for example, you can save things on the Web. In all the Corel products, you can save things as HTML. Because as I said in the beginning, if you don't have something on the Web, for all intents and purposes, it doesn't exist. I think all the software manufacturers realize that, and so everybody is giving you the ability to convert things to HTML and put them on the Web.
I'd like to say one word of caution about some of these things. For example, Excel, a spreadsheet program which also has a SAVE AS HTML thing. You can take an Excel spreadsheet, just do a SAVE AS HTML, and there it is -- a very pretty thing that appears on the Web. But a limitation of Excel is, if you go back and change the Excel spreadsheet, your thing on the Web does not change. If that's what you're expecting, if you're expecting the thing to be dynamic, it doesn't happen.
TC: Next year!
HS: That's not unreasonable, because with Access, you can actually save a database table in one of two ways. You can save it as a static table, which means that when somebody makes a change to the table, it's not reflected on the Webpage. But with Access, you can save it as a dynamic table, so whenever the database is updated and you look at it on the Web, this HTML file that you thought was just HTML and was just going to stay there in fact has little hooks directly back into your Access database. And when your Access database changes, this thing changes, too.
It's a little more complicated to do, but it's only a little more complicated in that the Wizard asks you a few extra questions.
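The difference between a static and a dynamic table can be sketched as follows. This is a minimal illustration in Python with an in-memory SQLite database standing in for Access; it is not how the Access Wizard works internally, and the `courses` table and its contents are made up for the example.

```python
import sqlite3
from html import escape


def table_html(conn: sqlite3.Connection) -> str:
    """Render the hypothetical 'courses' table as an HTML table."""
    rows = conn.execute("SELECT code, title FROM courses ORDER BY code").fetchall()
    cells = "".join(
        f"<tr><td>{escape(code)}</td><td>{escape(title)}</td></tr>"
        for code, title in rows
    )
    return f"<table>{cells}</table>"


conn = sqlite3.connect(":memory:")  # stand-in for the real database
conn.execute("CREATE TABLE courses (code TEXT, title TEXT)")
conn.execute("INSERT INTO courses VALUES ('BIO101', 'Intro Biology')")

# Static export: the HTML is produced once and never consults the database again.
static_page = table_html(conn)

# The database changes after the export.
conn.execute("INSERT INTO courses VALUES ('LAW200', 'Contracts')")

# Dynamic page: the HTML is regenerated from the database each time it is viewed.
dynamic_page = table_html(conn)
```

Here `static_page` still shows one course while `dynamic_page` shows two, which is the behavior described for Excel exports and static Access tables on the one hand and dynamic Access tables on the other.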
GM: Given all this stuff that you can readily do, are there still reasons to try to dig in and get some familiarity with HTML?
HS: Yeah, there's actually lots of reasons to learn some HTML, as it turns out. First of all, whenever you do any of these kinds of things, very often what happens is it doesn't do things exactly the way you want. For most faculty, staff, students, whatever -- for most uses, it doesn't matter. You say, "Well, the spacing is a little off or this little thing is not the exact font I wanted or something like that. Or things aren't quite aligned the way I want." But from time to time, you want these things to work exactly.
And also, there's some things -- for example, if you'd like to add a JavaScript to something. So there's lots and lots of cases where you want to do something that just can't be done with some of these easy things. Remember, for most of what your faculty and staff want to do, Word is probably enough, or Office 97 is probably enough.
Another reason why you might want to learn HTML is that if you wander out among the community that uses the Web, you're going to discover some people seem to have this Web intuition, and I think what a lot of Web intuition is is people understand HTML. Intuition is not intuition, it's understanding what's underlying the thing.
Also, for people who want to search the Web effectively. You wouldn't think that searching the Web has anything to do with understanding HTML, but in fact, it does. A lot of the advanced features in search engines have to do with looking at particular tags, particular HTML tags. So it's real helpful, for example, if you're going to use the Advanced Search features of Alta Vista to know what an anchor tag is and to know what an image tag is and to know what a title tag is and that kind of stuff.
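To make the connection between tags and searching concrete, here is a rough sketch of what an indexer might pull out of a page so that a query can be restricted to titles, links, or images. It uses Python's standard `html.parser` module; the class name `TagIndexer` and the sample page are invented for illustration, and no claim is made about how any real search engine is built.

```python
from html.parser import HTMLParser


class TagIndexer(HTMLParser):
    """Collect the title text, link targets, and image sources of a page."""

    def __init__(self) -> None:
        super().__init__()
        self.title = ""
        self.anchors: list[str] = []
        self.images: list[str] = []
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "title":
            self._in_title = True
        elif tag == "a" and attrs.get("href"):
            self.anchors.append(attrs["href"])  # anchor tag: where a link points
        elif tag == "img" and attrs.get("src"):
            self.images.append(attrs["src"])  # image tag: which file is shown

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.title += data  # title tag: the text in the window title bar


page = (
    "<html><head><title>Biology 101</title></head>"
    '<body><a href="syllabus.html">Syllabus</a>'
    '<img src="cell.gif" alt="A cell"></body></html>'
)
indexer = TagIndexer()
indexer.feed(page)
```

Someone who knows that the title, the links, and the images are stored in different tags can see why a search limited to one of them returns different results from a search over the whole page.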
If you're going to ever code JavaScript, you've got to know HTML because JavaScript is an HTML scripting language. VB Script is an HTML scripting language. You can't very well do an HTML scripting language without knowing HTML.
I'd like to give one more reason to learn HTML, if I haven't given enough here yet. And that is, someday (and we may talk about it in a bit) you're going to have to deal with XML -- the EXtensible Markup Language. And XML looks a great deal like HTML, and in fact, it produces HTML. So you're never going to understand -- in fact, it's hopeless, forget it, you'll never understand XML unless you understand something about HTML. So there's lots of reasons to go off and learn it.
GM: Let's kick up the discussion. Let's kick it up a notch here, to use a food paradigm. We have a question from Dave Hannam, and he says, "How long do you think it will be before Netscape and Microsoft come together on standards like DHTML and JavaScripting?"
TC: Tough question.
HS: Right. It's one of these things where I know I give talks, in fact, on the future of the Web and things, and I've always felt that I really can't predict right now what I'm going to have for dinner tonight. So it's a little dangerous for me to make predictions about the Web.
But I'll make one anyway because not being able to do something has never stopped me from doing it anyway. And I think when it's going to happen is when Netscape gets pushed out of the browser business, which I think is on the way of happening. As those of you who've read the recent news have seen, Internet Explorer now has more than half of the browser market and the share is growing and growing and growing. I think as long as there's Netscape and Microsoft in there, they both have what they believe to be competitive advantages to having proprietary stuff out there. I don't see them ever stopping it. As long as they're both there, I think things are going to be different. And this is a mess. It means that every time you write HTML, you're going to discover that your HTML is not identical for IE (Internet Explorer) as it is for Netscape.
Now, another possibility is that XML will take over before these folks push one of the other out of the market, and XML is a standard. It's not like HTML. I guess HTML's a standard, too. The interesting thing about the HTML standard is it's completely ignored by everybody who builds a browser. I think the nature of XML is such that you can't ignore the standard.
GM: One of the questions that I have about these statistics is, it's one thing to count desktops; it's something else to figure out what active use patterns are. And I don't remember where, but I have seen some numbers that indicate that Websites see more actual use from Netscape than they do from the Internet Explorer. But then there's also the question, for us in particular, is what's the ratio in higher education? Those are always fun things to try to figure out where you can get statistics that help you understand any of that.
TC: Yeah, it is.
HS: Even a place like Princeton University, where Netscape is the standard -- I mean, that's the supported browser here. I guess I'll get some folks from Princeton telling me I don't really know what's going on, but I think it's the supported browser, anyway. But almost everybody I know is either using Internet Explorer or using both Internet Explorer and Netscape, as I do.
GM: Let's go on to the Web publishing environment and ask about how you view the service from a university for Web users in terms of putting up their own Websites and so on.
HS: Sure. The point about a good Web publishing environment is that since we've sort of decided (and if you haven't decided, you really have to decide it real soon) that everybody has to be able to publish on the Web. You can't do it for them. They've got to be able to do it themselves. And you could tell them all to run off and just learn Word and be able to publish and things like that, but if you want people to really publish on the Web (and I think we really do want them to), you have to create an environment that's conducive to letting them do that.
And to do that, I think you have to sit back and say, "What do your users really expect? What do they want out of the central organization?" I can and will give you a few points as to what I think your users expect. I know that many people just go off and say, "We'll build whatever is comfortable for us to build. Why talk to our users? What do they know?"
My guess is first of all that users really want to be able to get their ideas and their data on the Web, and they want to be able to do that really easily. You've got to think about how you're going to help them along, but in addition to that, I think that users want to control their own data and Webpage. Somehow, you've got to give them a central site. I believe you have to give them a central site because you don't want them to do too much work, but on the other hand, they have to control their own data. They have to own it. They have to have their own data available all the time. They have to be able to update it whenever they want.
TC: They don't want to do something and get permission from you and take three days to have it live.
HS: Absolutely. That's the worst possible thing in the world. Also, you don't want that. Every time your user changes a Webpage, you don't want to have to be in the loop. So you'd really like them to do it on their own.
They'd also expect that their data's going to stay private if it's private, and their transactions are going to be secure. They expect the systems are going to be up all the time, they expect everything's going to be backed up. You know, all the kind of things that you would expect from some service organization. And the trick, I think, is to make it so that users think they own the data, to give them this great illusion that the data is theirs and they can update it all the time, while you actually provide a bunch of services for them.
The way we do that at Princeton, and I think it's a way that lots of folks could do it, is we have a shared file system. And at Princeton, it happens to be a great deal Novell or UNIX, but it doesn't matter what it is. And what we do is we give users space on the shared file system. To somebody sitting on a PC, looking at a shared file system, it looks like another disk. It looks like their V disk or their X disk or whatever disk they want to do it as.
And what they do is -- let's say they're in a Word document and they save it as HTML, where do they save it? They save it to their V disk. But their V disk really is on the shared file system, and this shared file system is attached to some central Web server. And there it appears on the Web. So whenever I save anything on my V disk (or whatever I've called the disk), the thing just appears, right on the Web. As far as I'm concerned, it looks like a local disk for me.
As far as the system's concerned, you step back, oh, no! It's hardly a local disk. It's a shared disk that belongs, really, to the Web server. So I really get the best of both worlds. I change the thing a million times a day, if I care to. It looks like it's my stuff. And yet it's sitting on a UPS, on an Uninterruptible Power Supply. It's backed up. There's a really good Web server there. As a user, I don't have to worry about security, Web server features and all that kind of stuff.
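The arrangement described here amounts to a fixed mapping from a location on the shared file system to a Web address. A minimal sketch of that mapping in Python follows. The directory layout, the `example.edu` host, and the `~user` URL convention are all assumptions made for the example; the talk does not say how Princeton's server is actually configured.

```python
from pathlib import PurePosixPath
from urllib.parse import quote

# Hypothetical values: the talk names neither the server paths nor the URLs.
SHARED_ROOT = PurePosixPath("/shared/webusers")
BASE_URL = "http://www.example.edu/~"


def url_for(saved_file: str) -> str:
    """Map a file saved under a user's shared folder to its public Web address."""
    # relative_to raises ValueError for anything outside the shared root.
    relative = PurePosixPath(saved_file).relative_to(SHARED_ROOT)
    user, *rest = relative.parts
    return BASE_URL + quote(user) + "/" + "/".join(quote(part) for part in rest)


address = url_for("/shared/webusers/hstrauss/talks/web update.html")
```

Because the mapping is fixed, saving a file to the mapped drive is the whole publishing step: no upload and no approval sit between the user and the Web server.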
GM: How do you maintain security, though, in terms of what you may let people do as CGI scripts or things of that sort? Do you do it for them?
HS: The whole issue of CGI scripts is one of the reasons that a lot of people run their own Web servers -- is because who wants to let them run a CGI script on the main server?
CGI scripts, for those folks who are unaware of the danger of these things -- if the CGI script is poorly written or maliciously written, what it can do, it can take any server and bring it down or destroy data or do horrible things out there. So in general, you just won't let people put CGI scripts on a central server. You could sit there and say, "We'll check out the CGI script," but then you're going to be in a business you don't want to be in, which is checking other people's code. Who can ever figure out? I mean, you can hardly figure out what your own code does. Try and figure out what somebody else's code does!
So a solution that we have is, first of all, we go off and we write a lot of CGI scripts that are common to lots of people. So we just sat down and said, "Okay, what are the most common kind of CGI scripts that people want to write?" And we write them.
TC: Like e-mail forms?
HS: What?
TC: Like maybe e-mail forms.
HS: Right. Somebody sits there and they have a survey and what they want to do is collect the data and send it somewhere. They want to e-mail it somewhere. So we thought of the three or four most common kind of things that people want to do. We write it for them.
TC: Wow!
HS: It's better than writing your own CGI script. This thing exists. Lots of people who could never write a CGI script if their lives depended on it can suddenly use one because we have one. So those things are safe. We can do that.
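A shared "mail this form somewhere" script of the kind described could look roughly like the sketch below. It is written in Python for illustration and is not the script Princeton wrote: the function name, the sender address, and the sample submission are hypothetical, and the message is only built here, not sent.

```python
from email.message import EmailMessage
from urllib.parse import parse_qs


def form_to_email(form_body: str, recipient: str) -> EmailMessage:
    """Turn a URL-encoded form submission into an e-mail message (not sent here)."""
    fields = parse_qs(form_body)
    lines = [f"{name}: {', '.join(values)}" for name, values in sorted(fields.items())]

    message = EmailMessage()
    message["To"] = recipient
    message["From"] = "webforms@example.edu"  # hypothetical sender address
    message["Subject"] = "Web form submission"
    message.set_content("\n".join(lines))
    return message


# What a browser might post from a three-question survey form.
msg = form_to_email(
    "name=Pat+Lee&rating=5&comments=Great+talk", "survey-owner@example.edu"
)
```

One reason a central script like this is safer than user-written ones is that the form's author only chooses field names and a recipient; none of the author's own code runs on the server.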
No matter how many CGI scripts we write or ones we can dream up, obviously some user somewhere -- actually, lots of users lots of places -- are going to come up with CGI scripts that we haven't thought of and are really only useful for one or two people and therefore wouldn't be cost effective for us to write.
So what we do, we have built a CGI script server. We take another server that's not the main server. We tell people, "Gee, if you're people we know about -- if you're faculty or staff who we believe we trust to some degree, we'll let you put your CGI's on the CGI server." Worst case, your CGI's will hurt other CGI's. On the CGI server, everybody's aware of the risk out there and it means you don't need your own server. You're still on a server that's on a UPS, backed up, whose security we worry about and things like that.
And so far, our experience has been, I believe, perfect. I don't know of a case where one of these CGI scripts has brought the server down or has really done any real damage. But again, if it did, it doesn't bring the central system down. What it does is it hurts these other CGI scripts. We would attempt to turn it off and bring it back up again.
So we do lots and lots of things to make it so that people have no need to have their own servers. Of course, if they wanted them, we'd let them, but who really wants to be in the server business? I mean, we barely want to be in the server business! I can't imagine why somebody whose profession was biology or accounting or business or law would want to be in the Web server business.
GM: Let's move on to another of the things that a faculty member or students might want to put on their Websites, and that's forms for collecting data or doing surveys or just filling out forms to get information. Can Word be used to do that?
HS: Actually, it can. And one of the things that we did not mention when we were talking about Word is actually that Word has two modes, and if you come in and you just go over to the FILE menu on Word and you click on the item that says NEW, what you'll probably do is just click on the little OK button without even looking at what's up there.
And what's up there is something that says "Open a General Document." And if you open a general document and say OK, you get this kind of Word document. When you save it, you save it as a .doc document. But if you look more carefully the next time you do that, way over to the right on the top (this little area that you've never, ever looked at probably), it says "Open a Webpage." And if you click on that thing and you open a blank Webpage, what will happen is you will see that the menus on Word change to some degree.
Now before, we mentioned that when you build a Word document, one of the bad things was that Word could create things that could not be converted to HTML. If you open the blank Web document, you get a different set of menus. The set of menus is such that you can't build anything that Word can't convert to HTML. The things that don't work are gone. They're missing from the menu. So you can't even set the font size. All you can do is make the font bigger or smaller. You get the seven sizes of type that HTML supports. And in addition, if you go over to the INSERT button, what you'll see under INSERT, is you'll see something that says FORM FIELD. And if you go under FORM FIELD, you'll see all the check boxes and option buttons drop down. And all the stuff that's inside forms.
So yeah, you can build a Web form.
GM: Once again, I can imagine people on this Webcast clicking on Word, opening it up. "Gee, I didn't know that was there!"
HS: They didn't know that stuff was there. And that's usually the reaction I get. When I show folks this stuff, first they say, "You must have a different version of Word than I have," because they've never seen the stuff, but it's there, and it's been there. I mean, this stuff is hiding on your desktop. It's been hiding all the time.
Actually, I think that people are too busy changing the backgrounds in their Word document now, and you've really lost everybody on the Webcast here. I should really have never mentioned that. Are you going to try that later yourself, Greg?
GM: Absolutely!
HS: Okay. So I'm going to assume that every time I get a Word attachment from anybody from now on, it's going to have a pink marble background.
TC: I just imagine what e-mail attaching can do to the background.
HS: Oh, well.
GM: Oh, well.
HS: I've never tried that. A topic for another time, probably, when we talk about e-mail and attachments.
GM: How about using HTML to create forms?
HS: It turns out that when you use Word to create a form, it actually creates an HTML form, and there's problems with HTML forms. (We'll get to the point in a few minutes where we talk about some solutions to that.) But whether you build the thing yourself by coding HTML (which turns out to be a little messy, building forms is kind of ticklish in HTML) or whether you build the thing with Word, the fact is that you can't get an HTML form to look exactly like the form that you probably want it to.
That is, if you take, say, a medical form or something like that, or some interoffice form that you use for purchasing or for travel or whatever, the HTML forms, unfortunately, look like HTML. They don't look like forms, and you only do get seven different font sizes and you can't get the real tiny type, and you can't get them to align exactly the way you want.
TC: And I know that's important at many institutions because the budgetary and financial people, among others, want what they get to receive permission for something to look like a real form.
HS: Yeah, absolutely. And also, it turns out that a lot of people just process forms, and some of the forms -- the real ones and these HTML things -- are the wrong size and the wrong shape and everything like that. So there's lots of reasons to make the thing look the way it ought to look.
GM: But you're building the suspense towards there being yet another answer. What's the even better answer?
HS: Well, the even better answer, and it's been around for awhile except you couldn't fill them in, is to use PDF. PDF, which is Adobe's Portable Document Format, takes anything, whether it's a form or a document or whatever, and it says, "I will convert it so that anybody who looks at it on the Web sees it in the exact font, the exact size, shape, format, everything looks perfect." So if you were to take a form and somehow get it into PDF format, it looks perfect.
Now, that's nice, except that, as I said, in the past you couldn't fill this thing on. So you have a perfectly-looking form that you can't do anything with except print. You could print it, and then if you had a typewriter (if there are any typewriters still around, and I'm not sure that there are) you could take this form, put it in your typewriter and fill it in, which seems silly.
You have the thing on the Web. It'd be really nice to just fill it in. Well, there's a new product from Adobe called Adobe Exchange, and Adobe Exchange has a feature in it that lets you build forms that you can build in. And the very nice thing about this is the most technically-challenged person out there can probably do this. The whole thing can be done, just if you can draw a box with any kind of drawing program. If you can click your mouse at one spot and then drag it off to the other corner so you can create a little rectangle on the screen, and you can pull down a couple pull-down menus, then you can do this whole thing and play with it more.
What happens is, once you get the form into PDF format -- and that's really relatively easy to do -- you can even build something in Word and convert it to PDF format, or you can scan it in and convert it to PDF format. So you get forms from a variety of ways, or if you have it in PostScript, which means you'd have it in PageMaker or anything that can generate PostScript, you can get into PDF. So it's relatively easy to get a form in PDF.
Once you do, then using this little form creator that's part of Adobe Exchange (which is available for both PC and the Mac), you just draw little rectangles on the screen on the thing, and once you've drawn these little rectangles, you tell it what kind of fields you want. You do this with a pull-down menu. You say, "This is a text field, this is a button, a check box, a list box," all this kind of stuff.
And what's neater is it goes far beyond that. It lets you do all kinds of formatting, editing, restrictions. You can say this field is required. If you say it's required, it will not let you get past this. It will keep fussing with you, trying to make you fill this thing in. If you say it's a Social Security number, which it knows about, for example, it makes sure that people type in nine digits. It will put the two dashes in if they leave them out. It'll convert dates from one format to another. It lets you add fields together, subtract them, do computations on them, all this kind of stuff.
And it does this very nicely using JavaScript, which you never see. The reason it uses JavaScript, which you can't see, is that if you wanted to do something it can't do -- and it's a stretch to imagine what it is you might want to do -- but if you did, you can actually code your own JavaScript, insert it into this thing, since it really is JavaScript based.
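The field checks described above (required fields, nine-digit Social Security numbers with dashes added, date conversion, and arithmetic across fields) can be sketched in a few lines. This is only an illustration of the logic, written in Python rather than the JavaScript the product uses; the function names and the date formats are assumptions made for the example and are not part of the Adobe product.

```python
import re
from datetime import datetime


def require(value: str, field_name: str) -> str:
    """Reject an empty field, the way a 'required' setting would."""
    value = value.strip()
    if not value:
        raise ValueError(f"{field_name} is required")
    return value


def normalize_ssn(raw: str) -> str:
    """Accept nine digits with or without dashes; return NNN-NN-NNNN."""
    digits = re.sub(r"[\s-]", "", raw)
    if not re.fullmatch(r"\d{9}", digits):
        raise ValueError("a Social Security number needs exactly nine digits")
    return f"{digits[:3]}-{digits[3:5]}-{digits[5:]}"


def convert_date(raw: str, source_format: str = "%m/%d/%Y",
                 target_format: str = "%Y-%m-%d") -> str:
    """Convert a date from one format to another (formats are assumed)."""
    return datetime.strptime(raw, source_format).strftime(target_format)


def total(*amounts: str) -> float:
    """Add several numeric fields together."""
    return sum(float(amount) for amount in amounts)
```

For example, `normalize_ssn("123456789")` fills in the two dashes, while a value with eight digits raises an error that the form could show to the person filling it in.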
TC: What about behind the scenes? Is it very difficult technically for you to tell it what you're hooking it into -- a CGI script or a database at the other end? When you fill the form out, how does it know where to tell the data to go?
HS: Yeah, here's a case where we get back to the good Web publishing environment. If there's a standard CGI script -- let's say that somebody has written a standard CGI script that says, "Gather up all the data fields in an HTML form." Not one of these funny PDF forms, but an HTML form. "And do something with it." It turns out those scripts work identically. This thing assumes that you probably have CGI scripts written for HTML forms, and it interfaces with those things correctly, exactly.
So we have a CGI script that gathers up all the fields in a form and mails them to somebody. We take that thing, unmodified, and in here, it's kind of cute. You just click on a little button and it says, "Where do you want me to send this?" You just give it the URL of the CGI script. That's probably the most technical thing you have to do here. And it puts it in the right place. It builds the right tag, it does all this stuff, it makes the button a SUBMIT button. And you click on the button on this form and it does it. So it's at least no more difficult than doing it with anything else because this has standard interfaces to it.
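A script of the kind described above, one that gathers every field in a submitted form and mails it to somebody, might look roughly like the following. This is a minimal sketch, not the script used at Princeton: it assumes the form data arrives as a standard URL-encoded body (the same shape an HTML form produces), and the function name, subject line, and recipient address are invented for the example. It only builds the message; actually sending it is left out.

```python
from email.message import EmailMessage
from urllib.parse import parse_qs


def form_to_email(body: str, recipient: str) -> EmailMessage:
    """Turn a URL-encoded form submission into a plain-text e-mail."""
    # parse_qs decodes "+" and "%xx" escapes and groups repeated field names.
    fields = parse_qs(body, keep_blank_values=True)
    lines = [f"{name}: {', '.join(values)}" for name, values in fields.items()]

    message = EmailMessage()
    message["To"] = recipient
    message["Subject"] = "Form submission"  # hypothetical subject line
    message.set_content("\n".join(lines))
    return message


if __name__ == "__main__":
    submitted = "name=Ada+Lovelace&dept=Mathematics"
    print(form_to_email(submitted, "dean@example.edu"))
```

Because the script only looks at field names and values, it does not need to know whether the submission came from an HTML page or from a PDF form, which is the point being made here.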
GM: You're beginning to make me feel like life is good.
HS: Yeah. Well, certainly this is not the be-all and end-all. But it is something that -- for example, at Princeton University right now, we have an Office Administrator who's taking all the Dean of the Faculty forms which she had built as HTML forms with a huge amount of effort, and I thought nothing on earth would convince her to take those things and get rid of them. But she's now replacing them with this technique because her forms didn't look good. There weren't any editing facilities. It was very clunky and cumbersome and, I guess, it was fine last year. It was last year's technology. This year's technology, the forms look correct, and she's doing this herself. This is no big deal. In fact, she's told me a couple of times (in fact, Chris, if you're listening, you know you've said that, gee, this is stuff that you really could give to some lower-level clerk than yourself).
So it really is very simple to do. Once you've done one, if anything, you're going to say this is kind of boring to do. But it's really nice to be able to do wonderful, technical stuff that you think might be a little boring, rather than I need three programmers and a psychiatrist to help me do this thing.
GM: We have about a little over ten minutes to go here in the Webcast, and I want to make sure that anybody that's got questions realizes that they can send us e-mail. Send it to expert@cren.net and we'll either try to respond on the air or get back, put answers on the Website afterwards. So send in your e-mails if you've got questions. Howard, let's go on to the beauties of XML. Tell us about that.
HS: Okay, first you ought to be aware of what XML stands for. When you look at something new, the first thing is to learn some of the jargon. If you're looking for a word that starts with X, it's not out there. XML stands for EXtensible Markup Language. So the X is the second letter. I guess EML didn't sound good or something. I'm not sure why they did that.
But I'd like to talk about the status of XML before I really tell you what XML is, just so you'll understand what this is. A few people here and there are using XML. If you plan to go out tomorrow and start writing XML, you're going to have to do it in some very specialized ways and places. There are things out there. There's something, for example, out there called XPublish which lets you write something in XML and it converts it to HTML for you, like that.
But the reason you'd learn about XML today is not because you're going to be able to use it today, but because you're going to be able to use it, say, six months or a year from now. It's something that's likely to begin to replace HTML. It has in some really selected small areas right now. But it's definitely the direction the Web is going.
GM: So this is information that for an information technology professional is key in terms of understanding how they ought to be doing their own planning. For an end user, it may not be quite as relevant.
HS: That's right, but it's going to be, a year from now, that's going to all be different. In fact, I wish I could hear the Webcast we're going to do a year from now on XML. It's going to be quite different, I'm convinced.
TC: I wish I could hear it, too!
HS: Me, too. I'd like to hear it.
GM: Come on, you guys, keep moving!
TC: We'd all be rich!
HS: We'd all be rich, right!
GM: Let me tie XML back to this --
HS: I don't think you're going to get back to this topic here, Greg. We're off on something else here!
GM: Tie back XML to the database topic and also look ahead a little bit to the searching topic, which we're going to do next.
Basically, as I understand it, XML really sets me up to be much more effective in understanding what's in the information resource and organizing in a way that I can do much more efficient searches, including searches of things that come in on databases -- forms and so on.
HS: Yeah, in fact, if you wanted to think very simply about what the difference between HTML and XML is, HTML really describes what a document looks like. It's the formatting language for the document. Whereas, XML tells you what a document means. It tells you nothing about the formatting and so when you go out to the Web, I guess it's very nice to know what a document looks like because, in fact, you have to see it. But if you want to search on it or you want some small part of the document, it's very difficult, really, to search on a document.
In fact, all the search engines, what they do by and large is they do full text searches. They take every word, including words like "a, an, the, or, to," all these little words. They take every word, they index every single word including the position of the word -- and great! You can search on the words, but you can't search on the meaning of the document.
That means that, for example, if you were trying to search on a document that had a bunch of cars for sale, or if you wanted to find out about cars for sale -- you're interested in old Corvettes or something like that, and you went and you searched HTML for old Corvettes, the best you could do was to search on words like Old or 1955 or Corvette. And what you're going to do is you're going to kick up stuff independent of the meaning. If there were XML documents out there, then the meaning or the abstraction of the document would be out there and you could very effectively find the old Corvettes that cost whatever you want because all the meaning would be out there.
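To make the Corvette example concrete, here is a small sketch of what searching on meaning could look like once the listings are in XML. The element names (`car`, `model`, `year`, `price`) and the sample data are invented for the illustration; no particular XML vocabulary for car listings is implied.

```python
import xml.etree.ElementTree as ET

# Hypothetical listings; the tags say what each value means.
LISTINGS = """
<listings>
  <car><make>Chevrolet</make><model>Corvette</model><year>1955</year><price>18000</price></car>
  <car><make>Chevrolet</make><model>Corvette</model><year>1997</year><price>32000</price></car>
  <car><make>Ford</make><model>Thunderbird</model><year>1955</year><price>15000</price></car>
</listings>
"""


def find_cars(xml_text: str, model: str, max_year: int, max_price: int):
    """Return (year, model, price) for cars matching on meaning, not on words."""
    root = ET.fromstring(xml_text)
    matches = []
    for car in root.iter("car"):
        if (car.findtext("model") == model
                and int(car.findtext("year")) <= max_year
                and int(car.findtext("price")) <= max_price):
            matches.append((car.findtext("year"),
                            car.findtext("model"),
                            car.findtext("price")))
    return matches


if __name__ == "__main__":
    print(find_cars(LISTINGS, "Corvette", max_year=1960, max_price=20000))
```

A plain word search for "1955" would also turn up the Thunderbird, and a search for "Corvette" would turn up the 1997 car. The tagged version can ask for the model, the year, and the price separately.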
A real problem with XML is because it has no formatting information, you could say these things are not going to look too pretty on the Web. They're not. You don't look at an XML document directly on the Web. What you do is you look at an HTML document on the Web, so you need some way of taking XML and then taking the part of the XML document which is very easy to search and subset. You want to be able to take that part and you need some way to format it. And the way that's done is using some kind of style sheet.
So when you write HTML, the HTML is taken, the browser interprets it, and voila, it's on the Web. If you get XML, you want to take XML which is the abstraction of the document, pass it through a style sheet, which results in a rendition which, if it's in HTML, then appears on the Web. You've got two things to write. You've got to write the XML and you've got to write the style sheet.
The nice thing about the style sheet is you can write lots of them. So you can have one abstraction of the document written in XML, lots of style sheets, and as a result, lots of different renditions of the document. One suitable, perhaps, for publishing or printing, one suitable for the Web, one suitable for saving in the database, and so forth.
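The idea of one XML document and many renditions can be sketched as follows. In this illustration plain Python functions stand in for style sheets, which is a simplification and not how a real style sheet language works; the `memo` document, its tags, and the rendition names are all made up for the example.

```python
import xml.etree.ElementTree as ET
from html import escape

# One abstraction of the document, with no formatting information in it.
DOCUMENT = ("<memo><title>Budget Request</title>"
            "<body>Please approve the attached form.</body></memo>")


def render_html(root):
    """Rendition for the Web."""
    return (f"<h1>{escape(root.findtext('title'))}</h1>"
            f"<p>{escape(root.findtext('body'))}</p>")


def render_text(root):
    """Rendition for printing as plain text."""
    title = root.findtext("title")
    return f"{title.upper()}\n{'=' * len(title)}\n{root.findtext('body')}"


def render_record(root):
    """Rendition for saving in a database: one value per field."""
    return {child.tag: child.text for child in root}


# Each entry plays the role of one style sheet.
STYLE_SHEETS = {"web": render_html, "print": render_text, "database": render_record}


def render(xml_text: str, style: str):
    """Pass the same XML through the chosen 'style sheet'."""
    return STYLE_SHEETS[style](ET.fromstring(xml_text))


if __name__ == "__main__":
    for style in STYLE_SHEETS:
        print(style, "->", render(DOCUMENT, style))
```

Adding a fourth rendition means writing one more function and leaving the XML alone, which mirrors the point about writing the document once and the style sheets as often as needed.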
TC: XML sounds like something that obviously isn't here yet, but that higher education institutions especially are going to have to use. What would your chief advice be to someone in the IT department in such an institution now in terms of what to do with XML? To just watch for it, keep an eye on it?
HS: Well, first of all, there are some little things out there, like the one I mentioned, XPublish, for example, which if folks want to look at, it's at a place called -- it's a URL which we can provide later, but it's at interaction.in-progress.com. If you go out there, you can see something about XPublish, which is a Macintosh program that lets you write things in XML today -- lets you build style sheets, and then turns these things into HTML.
So there are some things happening out there. Both Internet Explorer 5.0 and Netscape 5.0 say that they're going to support to one degree or another XML. And those things should be out in the spring. So if I were an Information Technology professional, which I guess I am, I would at least dash off and grab some book on XML or at least look on the Website that we have records on at the CREN site. I'd begin to learn something about this because a year from now or pretty soon from now, you're going to be doing this. This is going to become your business.
TC: Let me point out to listeners that there's a book and one of Howard's own slide shows on the Website -- they're linked on the session page if you want to go find out more information.
GM: We've got less than five minutes to go. Let's go on to Web searching and particularly some of the techniques that we can look at to do more powerful and effective searching.
HS: The first advice I have for somebody who wants to search effectively on the Web is to pick a search engine. Just find one. I prefer AltaVista, but there's lots of other good search engines out there. But what you really ought to do is pick one and learn all about it and then use that. If you learn all about it, you'll at least have some powerful search techniques.
The tendency of most people is just to go off and just try this one and try that one or even use some of these metasearch engines which take a search and send the search out to every search engine under the sun. A problem with that is that these metasearch engines only support the kinds of searches that are common to all the search engines, and the kind of searches that are common to all the search engines tend to be the simplest possible kind of searches. So you never get to do really powerful searches.
When I do a search, if I get more than a dozen or a few dozen hits on the thing, I consider the search to be not successful. I bet that's not your experience. I bet your experience is that you get 17,000,000,000 things back and you're pleased that there's 17,000,000,000. Right? Nobody's pleased that there are 17,000,000,000 things -- you can't do anything with them.
TC: Is that Greg's experience or mine when you say you?
GM: I know how to go through 10,000 things really fast.
HS: Yeah, I don't know, I'm never happy getting 17,000,000,000 things back. When I talked to the folks at AltaVista, they tell me that most of the searches that they get are one or two words. I've talked to lots of folks who tell me that, gee, they used to use one-word searches, but now they use two-word searches and that makes things better!
They ought to be aware that when they use two words, those two words don't get searched for together. In other words, if you go out and just search for "peanut," you'll get a lot, a lot of hits. And if you search for "peanut (space) butter," you'll get even more hits because you'll get every document with the word "peanut" or the word "butter" in the thing.
GM: So put it in quotes, please.
HS: Right, so put it in quotes. In fact, I have some numbers if I can get them quickly here. The very first thing you want to learn to do is, if you're searching for words that you want next to each other, just put quotes around them.
The example I have here is of some country and western lyrics. If you want to search for some very common country and western lyrics, namely the lyrics "She Done Me Wrong," and you just wrote "She (space) done (space) me (space) wrong," you get about 17,000,000,000 hits if you use AltaVista. This is great? You want to go through 17,000,000,000 things to look at this? If you put "She done me wrong" in quotes, then you only get 85 hits. Well, 85 hits, you can actually look at those things, or you can look at a bunch of them.
So that's the very first thing one should do. But in general, you should just learn what the facilities of the search engine that you're using are. AltaVista has very powerful features, but there are other search engines that also have very powerful features.
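The difference between typing the words with spaces and putting them in quotes can be shown on a tiny collection of documents. This sketch assumes, as described above, that an unquoted query matches any document containing any one of the words; it is not AltaVista's actual implementation, and the sample documents and function names are invented.

```python
import re

DOCUMENTS = [
    "She done me wrong and left town",
    "She said the bill was wrong",
    "What have you done to me",
    "Peanut butter sandwiches",
]


def words(text: str):
    """Lower-case the text and split it into words."""
    return re.findall(r"[a-z0-9']+", text.lower())


def any_word_search(query: str, documents):
    """Unquoted query: a document matches if it contains ANY of the words."""
    terms = set(words(query))
    return [doc for doc in documents if terms & set(words(doc))]


def phrase_search(phrase: str, documents):
    """Quoted query: the words must appear next to each other, in order."""
    target = words(phrase)
    size = len(target)
    hits = []
    for doc in documents:
        tokens = words(doc)
        if any(tokens[i:i + size] == target
               for i in range(len(tokens) - size + 1)):
            hits.append(doc)
    return hits


if __name__ == "__main__":
    print(len(any_word_search("she done me wrong", DOCUMENTS)), "hits without quotes")
    print(len(phrase_search("she done me wrong", DOCUMENTS)), "hits with quotes")
```

On this sample, the unquoted query matches three of the four documents and the quoted phrase matches only one, the same effect as the drop to 85 hits in the example, just on a much smaller scale.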
GM: I think we're -- [END OF SIDE A]
HS: Just that the Web keeps changing every day, and if you're going to deal with the Web, it's really a full-time business. You never know what's coming around the corner here, but I think for people who have to do this in the information technology business, I hope you realize it's a very interesting and exciting thing to do. And that the new stuff is, I think, what makes our job so interesting.
GM: Well, thank you, Howard, and thank you to all of the Web participants out there who have listened in on this TechTalk. Again, please send any follow-up questions to expert@cren.net and look in at the Website for responses.
Do be sure to mark your calendars for the next TechTalk event, that's November 5, two weeks from today at the same time, 4:00. That session features Greg Jackson from the University of Chicago on the topic of "Authenticating Users: What are the Issues?"
If you would like to receive announcement messages for these sessions, send e-mail to cren@cren.net or sign up on the CREN Website, www.cren.net.
Thanks to all who have helped make this Webcast possible today: the board of CREN; our guest expert, Howard Strauss; co-host Terry Calhoun; Paul Bennett at University of Michigan Web Services for the encoding; and to all of you for being there. You were there because it's time. Bye, Howard.
HS: Bye, Greg, bye, Terry.
TC: Adios.
GM: Bye-bye, everyone.