This is the home page for Structured Text and XML in Medicine (BIOINF 2991, 1 or 3 credits), a directed study in text markup taught from 2002 to 2005 as part of the University of Pittsburgh Center for Biomedical Informatics medical informatics training program. The course contents are maintained here for reference purposes. The course description, goals and grading policies are outlined below. Resources for the course, including the schedule and assignments, are listed in the sidebar to the right.

Course description

This course is an introduction to text markup, especially XML markup, and applications of markup in medical informatics. Students will learn to create and work with well-formed and valid XML documents and to develop new XML vocabularies using document type definitions (DTDs) and XML Schema. The course will review the use of XML in both data-centric and document-centric applications, including communications and native XML databases, and will survey the current status of XML development in areas related to biomedical science. Students may elect to take the course for one or three credit hours. There is one 60-minute lecture per week. The three-hour version of the course includes additional online tutorials, assignments and a more extensive project.

Instructor: James H. Harrison, Jr., M.D., Ph.D.
Days/Times: Fridays, 9:00 to 10:00 am
Location: The Center for Biomedical Informatics Conference Room (inside FT 8084)
Prerequisites: One course in, or equivalent experience with, text markup such as HTML
Recitations: None
Expected class size: 4-12 students

The course is usually offered in the spring term.  Special permission from the instructor is required to register for this course.

Goals, expectations and grading

The goals of this course are to introduce students to the key concepts of text markup and the family of standards associated with the Extensible Markup Language (XML), including Document Type Definitions (DTDs), XML Schema, the Extensible Stylesheet Language (XSL), XSL Transformations (XSLT) and document navigation using XPath expressions. Students completing this course will be able to:

  • Write valid XML documents based on existing DTDs or schemas using text editors and specialized XML editors
  • Create new XML DTDs and schemas
  • Display and transform XML documents using CSS and XSLT
  • Access specific sections of XML documents using XPath and XPointer
  • Understand the use of XML in modeling information in document- or data-centric settings
  • Understand and use the special characteristics of native XML databases
  • Describe the current status of existing XML development related to biomedical science, including HL7 v.3 messaging, the Clinical Document Architecture and BioML
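
As a rough illustration of the XPath-style access named in the list above, the following sketch selects parts of a small document with Python's standard-library ElementTree module, which supports only a limited subset of XPath. The report document, its element names and the select_text helper are hypothetical and are not taken from the course materials.

```python
import xml.etree.ElementTree as ET

# Hypothetical document-centric example; the vocabulary is invented for illustration.
REPORT_XML = """<report>
  <section type="history">
    <para>No known drug allergies.</para>
  </section>
  <section type="findings">
    <para>Blood pressure within normal limits.</para>
    <para>Heart rate regular.</para>
  </section>
</report>"""


def select_text(xml_text, path):
    """Return the text of every element matched by an ElementTree path expression."""
    root = ET.fromstring(xml_text)
    return [element.text for element in root.findall(path)]


# Paragraphs inside the section whose type attribute is "findings".
findings = select_text(REPORT_XML, "./section[@type='findings']/para")

# Every para element anywhere below the root.
all_paragraphs = select_text(REPORT_XML, ".//para")
```

A full XPath 1.0 processor accepts many expressions that ElementTree rejects, so this shows only the general idea of addressing sections of a document by path.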

In addition, students will be equipped to discuss markup-related topics with web and software developers or vendors. This is a directed study that may be taken for 1 or 3 hours. Students taking the course for 3 hours will complete two assignments, several online tutorials and a more extensive class project.

Students' responsibilities

All students should attend the weekly lecture at 10 am on Fridays, participate actively in the class discussions and complete the assigned readings (see ClassSchedule). Students should also participate actively in the online class discussion using individual session pages, including posting questions and responding to others' questions, and they should make sure they understand each week's material thoroughly. Assignments should be turned in on schedule (see ClassSchedule). Class projects should be carried out during the second half of the term (see ClassSchedule) and presented during the last two weeks of class. Students taking the class for three hours of credit should also complete the assigned online tutorial each week and send any required work to Dr. Harrison.

Assignments (required for 3 hr credit)

Students taking the course for three credit hours will complete two limited assignments in addition to the larger project described below. The first assignment is to create a DTD appropriate for a standard document type of the student's choosing and write an XML document that validates against the DTD. The second assignment is to create a well-formed XML document of any type and write a cascading style sheet allowing the document to display within the Firefox browser. Due dates for the assignments are noted on the ClassSchedule.
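
To give a sense of the scale of the first assignment, here is a minimal sketch of a DTD (written as an internal subset) together with a document that follows it. The prescription vocabulary and all sample values are hypothetical. The Python check uses the standard-library ElementTree parser, which is non-validating: it confirms only that the document is well formed and does not test it against the DTD. Checking validity requires a validating parser, such as the one built into an XML editor.

```python
import xml.etree.ElementTree as ET

# Hypothetical prescription vocabulary: the DTD appears as an internal subset,
# followed by a document instance written to conform to it.
PRESCRIPTION_XML = """<!DOCTYPE prescription [
  <!ELEMENT prescription (patient, drug, dose, instructions)>
  <!ATTLIST prescription date CDATA #REQUIRED>
  <!ELEMENT patient (#PCDATA)>
  <!ELEMENT drug (#PCDATA)>
  <!ELEMENT dose (#PCDATA)>
  <!ATTLIST dose unit CDATA #REQUIRED>
  <!ELEMENT instructions (#PCDATA)>
]>
<prescription date="2005-01-14">
  <patient>Jane Doe</patient>
  <drug>Amoxicillin</drug>
  <dose unit="mg">500</dose>
  <instructions>Take one capsule three times daily.</instructions>
</prescription>
"""


def is_well_formed(xml_text):
    """Return True if the text parses as XML; this does NOT validate against the DTD."""
    try:
        ET.fromstring(xml_text)
    except ET.ParseError:
        return False
    return True


well_formed = is_well_formed(PRESCRIPTION_XML)

# Mismatched tags make a document ill formed regardless of any DTD.
broken = is_well_formed("<prescription><drug>Amoxicillin</prescription>")
```

The distinction matters for the assignment: a document can be well formed and still fail validation, for example if the dose element were missing its required unit attribute.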


All students should select a standard document of some type as a basis for their project. Medical documents are good candidates, but any type of document can be used. A document that is relatively structured, such as a clinical consultant's report or a prescription, is best. Non-medical documents such as a purchase receipt, pay stub or short-form tax return could also be used. For students taking the course for one hour, the project should consist of a correctly formatted DTD specifying tags and attributes appropriate for the document, an example document marked up with the tags, and a cascading style sheet supporting the display of the document in a web browser. Students taking the course for three hours should create an XML Schema instead of the DTD and an XSLT stylesheet instead of the cascading style sheet. The stylesheet should transform the example document into XHTML for display. Projects should be presented and demonstrated at the assigned times during the last two class periods. Presentations should include an overview, rationale and demonstration of project components.
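
The project itself calls for an XSLT stylesheet. The sketch below is not XSLT, because Python's standard library has no XSLT processor; it only illustrates the kind of source-to-XHTML mapping such a stylesheet would express, using ElementTree. The receipt vocabulary, the sample values and the receipt_to_xhtml function are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical source document: a short purchase receipt.
RECEIPT_XML = """<receipt store="Campus Bookstore">
  <item price="39.95">Textbook</item>
  <item price="4.50">Notebook</item>
</receipt>"""

XHTML_NAMESPACE = "http://www.w3.org/1999/xhtml"


def receipt_to_xhtml(xml_text):
    """Build an XHTML page from the receipt: store name as heading, one list item per purchase."""
    source = ET.fromstring(xml_text)
    store = source.get("store")

    html = ET.Element("html", {"xmlns": XHTML_NAMESPACE})
    head = ET.SubElement(html, "head")
    ET.SubElement(head, "title").text = store

    body = ET.SubElement(html, "body")
    ET.SubElement(body, "h1").text = store
    item_list = ET.SubElement(body, "ul")
    for item in source.findall("item"):
        ET.SubElement(item_list, "li").text = f"{item.text}: {item.get('price')}"

    return ET.tostring(html, encoding="unicode")


xhtml = receipt_to_xhtml(RECEIPT_XML)
```

In an XSLT stylesheet the same mapping would typically be written as templates matching the receipt and item elements, which keeps the transformation declarative and separate from any program code.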


Sixty percent of the grade is based on the quality of the class project, 30% is based on the two preliminary assignments, and 10% is based on class and online discussion participation. With respect to the latter, all students should plan to post questions and comments regularly to the online discussion; the number and quality of these postings will affect the grade.


Note the minor changes in the project assignment (see the bottom of this page).

Class information

Required resources
• Textbook: XML in a Nutshell, 3rd Ed.
• XML editor: Morphon
• Web browser: Firefox

Instructor contacts
Jim Harrison, 647-5529
Jan Walker (administrative assistant), 647-5380

Dr. Harrison's office hours
• Mon, 1 - 3 pm, Canc Pav #310
• Thu, 1 - 3 pm, Canc Pav #310
• Fri, 8 - 11 am, CBMI, Forbes Tower

Previous courses

Structured text and XML in medicine, 2004

