
XML in Higher Education

SMIL: Multimedia Rides the XML Wave

SMIL (pronounced "smile") is an acronym for Synchronized Multimedia Integration Language, an XML-based dialect for describing the layout and synchronization of multimedia applications.

For educators, SMIL opens the door to sophisticated multimedia development. With minimal effort, SMIL makes it possible for authors to:

  • Add audio commentary to images and text
  • Animate slide presentations that dynamically change as different elements become the focus of attention
  • Add on-screen controls that allow users to stop and start a presentation
  • Create courseware that integrates audio, video, animation, and text

As illustrated in the figure below, individual multimedia components can be stored on a user's PC or delivered from a Web server. SMIL presentations may play in a browser with a SMIL plug-in or in a standalone player, such as RealOne or QuickTime, that resides on a consumer device and runs independently of any browser. Because SMIL documents are text files, they can be customized on a server manually with a text editor, with a script written in a language such as AppleScript or Perl, or with XML transformation tools such as XSLT. What's exciting for the aspiring multimedia author is that anything that can generate text can create a SMIL document.

A New Way of Authoring
SMIL's ability to dynamically assemble multimedia represents a departure from the standard "top-down" approach to authoring adopted by many multimedia presentation packages. Because SMIL describes how individual multimedia "chunks" are assembled, SMIL opens the door to collaborative project development, encouraging an exploratory approach to content creation. With SMIL, authors can update presentation content without the overhead of expensive and elaborate post-production processing. If an error is found in any multimedia component, only that one component needs to be corrected, rather than the entire presentation.

The SMIL Standard
As a W3C standard, SMIL follows a standards process in which competing ideas are hammered out in Working Groups and, after agreement, submitted to the W3C for approval. Proposed standards, when approved by the W3C, become Recommendations, the highest approval stamp a specification can receive. SMIL 1.0 was first standardized as a Recommendation in June 1998. After use and feedback by the community, SMIL 2.0 was proposed, adding new capabilities for timing control, layout animation, and transition effects. In August 2001, SMIL 2.0 was officially established as a W3C Recommendation, allowing companies to safely begin development of compatible products.

SMIL in a Nutshell
To understand the simplicity and power of SMIL, let's look at the structure of a SMIL document, which in many ways resembles an HTML document:
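A minimal sketch of that structure, assuming the SMIL 2.0 language namespace and using hypothetical region names and dimensions, might look like this:

```xml
<smil xmlns="http://www.w3.org/2001/SMIL20/Language">
  <head>
    <!-- Layout: declare the screen regions that media will be shown in.
         The region ids and sizes here are illustrative assumptions. -->
    <layout>
      <root-layout width="640" height="480"/>
      <region id="image"   left="0" top="0"   width="640" height="400"/>
      <region id="caption" left="0" top="400" width="640" height="80"/>
    </layout>
  </head>
  <body>
    <!-- Timing: sequences, parallel groups, and media elements go here. -->
  </body>
</smil>
```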

Like HTML, a SMIL document has a head and a body, each serving a specific function. The head is where the layout is specified through the declaration of screen regions for the display of different media. The body element is where timing is specified. There are two aspects to timing: things that happen in sequence and things that happen in parallel. Sequences are surrounded by the <seq> and </seq> tags. Media elements defined within a sequence are presented one after the other, with each element starting after the previous one ends. For example, the following simple sequence plays three audio tracks one after the other:
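A sketch of such a sequence is shown here as a fragment that would sit inside the body element. The audio file names are hypothetical:

```xml
<seq>
  <!-- Each track starts only when the previous one has finished. -->
  <audio src="track1.mp3"/>
  <audio src="track2.mp3"/>
  <audio src="track3.mp3"/>
</seq>
```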

For media elements with no inherent duration, such as images or text, explicit durations may be assigned, as in:
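A sketch of two images with explicit durations follows, again as a body fragment. The image names are hypothetical, and the "image" region is assumed to be declared in the head:

```xml
<seq>
  <!-- dur gives an otherwise static image a fixed display time. -->
  <img src="slide1.jpg" region="image" dur="5s"/>
  <img src="slide2.jpg" region="image" dur="7s"/>
</seq>
```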

Here, the first image ends after being displayed for five seconds, after which the second image appears and is displayed for seven seconds.

Authoring Tools

In addition to the various SMIL players there are also a number of SMIL development environments that help an author assemble a final multimedia product. These include:

Parallel Multimedia
More complex multimedia displays are made possible through the use of the parallel or <par> tag. Elements enclosed in a <par> tag are started at the same time and can run to completion or be terminated after a specific time interval. For example, to play an MP3 audio file while simultaneously displaying a JPEG image and some text, simply place the three media references in a <par> element. In the following example, the image and the text are displayed for 30 seconds, while the audio element ends when the MP3 finishes playing.
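A sketch of a parallel group of this kind is given below. The file names are hypothetical, and the "image" and "caption" regions are assumed to be declared in the head:

```xml
<par>
  <!-- All three children start at the same moment. -->
  <audio src="commentary.mp3"/>                      <!-- ends when the MP3 ends -->
  <img  src="diagram.jpg" region="image"   dur="30s"/>
  <text src="notes.txt"   region="caption" dur="30s"/>
</par>
```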

Complex Multimedia
Complex multimedia presentations can be built up by combining parallel elements within sequences. Each parallel group is treated as a single element in its enclosing sequence. All elements of a parallel group begin together. When the last element in the parallel group ends, the sequence continues. For example:
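One possible shape for such a presentation is sketched below, with a parallel group nested in a sequence. The file names, durations, and region names are illustrative assumptions, not the article's original listing:

```xml
<seq>
  <!-- 1. An opening clip plays on its own. -->
  <video src="intro.mpg" region="image"/>

  <!-- 2. Narration and slides start together; the group ends when
          the last of its children ends. -->
  <par>
    <audio src="narration.mp3"/>
    <seq>
      <img src="slide1.jpg" region="image" dur="20s"/>
      <img src="slide2.jpg" region="image" dur="20s"/>
    </seq>
  </par>

  <!-- 3. Credits follow once the whole parallel group has finished. -->
  <text src="credits.txt" region="caption" dur="10s"/>
</seq>
```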

In the above example, the first element of the sequence plays on its own. The narration and the slides start together as soon as it ends. When both the narration and the slides have ended, the credits are displayed.

Dynamic Content Control
What sets SMIL apart from other multimedia presentation schemes is SMIL's support for dynamic content. Appropriate SMIL media can be selected during a presentation based on user preferences or hardware capabilities. For example, a presentation can be easily adapted for international audiences with the use of SMIL's switch tag, which can trigger different audio tracks based on the setting of a system language variable:
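A sketch of language-based selection follows, assuming the SMIL 2.0 systemLanguage test attribute and hypothetical file names. The last child carries no test attribute, so it serves as a fallback when no language matches:

```xml
<switch>
  <!-- The player evaluates the children in order and plays the first match. -->
  <audio src="welcome-fr.mp3" systemLanguage="fr"/>
  <audio src="welcome-de.mp3" systemLanguage="de"/>
  <audio src="welcome-en.mp3"/>  <!-- default when neither language applies -->
</switch>
```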

In the above example, a SMIL player will select the first item in the list that matches the user's system language attribute. Similarly, SMIL can select an item based on connection speed. As the following example shows, an appropriate audio file will be selected based on the bandwidth capability of the target system:
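A sketch of bandwidth-based selection follows, assuming the SMIL 2.0 systemBitrate test attribute, which is expressed in bits per second. The file names and thresholds are hypothetical. Listing the highest bitrate first matters, because the player stops at the first child whose test succeeds:

```xml
<switch>
  <!-- Chosen when the available bandwidth is at least the stated bitrate. -->
  <audio src="lecture-high.mp3" systemBitrate="128000"/>
  <audio src="lecture-mid.mp3"  systemBitrate="56000"/>
  <audio src="lecture-low.mp3"/>  <!-- fallback for slower connections -->
</switch>
```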

SMIL in Practice
When considering SMIL for multimedia applications, it's important to keep in mind that, as with all specifications, the proof is in the implementation. Currently, Internet Explorer 5.5 and 6.0 support many of the SMIL 2.0 features. SMIL is also supported by RealNetworks' RealPlayer, Apple's QuickTime player, Oratrix's GRiNS for SMIL 2.0, and InterObject's SMIL Player.

However, as with any specification implementation, there are compatibility differences, similar to the problem of rendering HTML code in browsers. While this makes it difficult for an author to always be able to predict what a SMIL presentation will do across all implementations, there is considerable momentum behind SMIL and differences are minimal. Sticking to the basic features of the SMIL specification is always a safe bet.

The Future of SMIL
The recent finalization of the SMIL 2.0 specification, coupled with significant industry support, has made SMIL an attractive option for educators. As authors gain experience using SMIL, expect new ideas to emerge that leverage SMIL's capacity for delivering dynamic content based on the assembly of individual multimedia components. Currently, SMIL is stewarded by the SYMM Working Group, a mix of experts from a wide range of industries including CD-ROM manufacturers, Interactive TV, mobile communications, and audio/video streaming—all interested in bringing synchronized multimedia to the Web. A recent initiative includes bringing SMIL content into the hands of the mobile user via PDAs and even cell phones. For example, the SMIL 2.0 Recommendation includes a simplified version of SMIL targeted for mobile devices. Known as the SMIL 2.0 basic profile, it includes features that map to the limitations of hand-held devices. This is an exciting development for educators, because it opens the door to the reuse of those same multimedia chunks used in the creation of desktop multimedia presentations.


While SMIL is often compared to HTML, there are significant differences between the two tag-based languages. As illustrated in the figure below, SMIL is based on XML while HTML is based on SGML (Standard Generalized Markup Language), an XML precursor with a long-standing history in the document community. Although SGML has been widely used by organizations seeking to structure their documents and documentation (e.g. the General Motors Parts Catalog), its pre-Web complexity has been the main obstacle to widespread use and acceptance by the Web community. That's where XML comes in.

The XML proposal, begun in 1995 and approved by the W3C in 1998, represented an effort to simplify SGML, the ISO standard for defining data vocabularies. Technically, XML is a subset of SGML designed to simplify the exchange of structured documents over the Internet through the definition of tags that add semantic meaning to text. While XML's rules are simple, much of XML's strength derives from what it does not address.

There are three key design elements that by omission contribute to XML's success:

  1. No display is assumed. XML makes no assumptions about how it will be rendered in a browser or other display device.
  2. There is no built-in data typing. Ancillary XML technologies such as DTDs and XML Schema provide support for defining the structure and data types associated with an XML document.
  3. No transport is assumed. The XML specification makes no assumption about how XML is to be transported across the Internet. This has opened the door to creative ideas about delivering XML over HTTP, FTP or SMTP.

Interestingly, these omissions are what have allowed XML to flourish. Since XML's approval as a W3C Recommendation, hundreds of XML vocabularies have been used to standardize information exchanges across a wide range of industries. Microsoft Corp. has even rebuilt its entire software infrastructure around XML in the form of its .NET initiative.

Insight into the breadth and scope of XML's reach can be obtained by visiting, and

