Silverchair's Blog

Thursday, December 13, 2007

Interview with B. Tommie Usdin, President, Mulberry Technologies

Reprinted from Silverchair's newsletter, Context Matters.


  • Interview: B. Tommie Usdin, President, Mulberry Technologies

    Tommie Usdin is the founder and president of Mulberry Technologies, an XML and SGML consultancy in Rockville, MD, that writes XML Vocabularies (DTDs and schemas), teaches about XML and related technologies, and helps with hardware/software selection and workflow/process reorganizations. In addition, Usdin is the founder and organizer of many XML conferences, including the GCA’s SGML conferences from 1991 through 1997, the Extreme Markup Conferences through 2007, and the new “Balisage: The Markup Conference.” She was kind enough to take time out from her busy schedule to grant Silverchair this interview:

SC: How did you get into SGML, and later XML?

BTU: Before SGML, I was working in a group that was making full-text searchable databases. This was back in the days when to make a book into a searchable database you waited until the book was printed, then sent two copies to be re-keyed and re-proofed, and then loaded a database. Once you had tested the database, you submitted a query and, less than a day later, you got a list of the documents that met your search criteria. And the users thought this was wonderful: real information from our documents, in under a week! So we were right there when SGML came along, with its promise of re-purposable tagged content. I led teams building SGML applications for 10 years.

I followed XML by list-serv during its development and it really did seem to be the good-parts version of SGML. Mulberry added it to our toolbox as quickly as we could, teaching a number of XML-for-SGMLers classes.

XML came to dominate my consulting practice as soon as we had XSLT. XSLT made good on the promises we made about how easy things would be with SGML. In the SGML world it was possible to use the same source in multiple ways, and it was reasonable to assume that your content could be tool-independent, but working with SGML documents was certainly not easy. With the addition of XSLT—a powerful, popular, easy-to-learn language for transforming documents from one structure to another and from one tag set to another—it became EASY to create content that would outlast tools and work for multiple, possibly unknown, applications.

(There are people who call me a Luddite because I didn’t drop SGML as soon as the first XML tools were available. But I don’t believe in dropping an old technology until and unless it no longer meets the users’ needs or there is a clear business advantage to changing to a newer one.)

SC: Tell us what Mulberry Technologies does.

BTU: Mulberry Technologies, Inc. is a consultancy specializing in XML (and SGML) text-based applications. That means that we help organizations who are working with marked-up prose: books, journals, reference works, technical documentation, teaching materials, legislative documents, medical and drug information, and a wide variety of other text applications.

Mulberry services include:

  • Consulting, including helping organizations figure out:

    • where XML will help their workflow

    • who needs to learn what in order to add XML to their processes

    • what vocabulary/tag set(s) would best meet their needs, and whether a public tag set fits (either as published or with customization)

    • what tools they should buy and/or build



  • Training, including:

    • conceptual training for managers, focusing on the business issues relating to XML

    • hands-on XML, XSLT, Schematron, and XSL-FO for production people and programmers

    • training on specific tag sets and applications



  • Vocabulary development:

    • creation and documentation of tag sets/vocabularies
    • customization of public vocabularies to meet organization/project needs
    • expression of models in the language(s) appropriate to the use, user, and selected tools, including DTD, XSD, RELAX NG, and Schematron



SC: Do you remember the first DTD you wrote? How would you improve on it with what you know today?

BTU: I was on the team that wrote the AAP DTDs, the first published DTDs that were developed in the early 1980s while SGML was a moving target. (I didn’t write those DTDs, and don’t want to steal Joan Knordel and Sperling Martin’s thunder; I was a minor supporting player.) The AAP DTDs were widely adopted (and adapted) by publishers worldwide for modeling books and journal articles. These DTDs later became, and are still in use as, ISO 12083.

I have been enormously fortunate in that I don’t have to speculate on how I would improve on that effort with what I know today; I am on the team that developed and now maintains the journal and book tag sets for the National Library of Medicine. These XML tag sets, developed 20 years after the AAP DTDs, are in many ways aimed at the same uses and users: publishers, archives, and libraries. The differences are significant, and based on the benefit of 20 years of tag set experience:


  • XML, not SGML
  • more transparent naming
  • more flexible models
  • variations optimized for different purposes: archiving, publishing, and authoring
  • easy customization
  • modeling that is compatible with current practice in commercial publishing.


SC: You wear many hats (President of Mulberry, conference organizer, frequent speaker, industry thought leader). What is your favorite role, and why?

BTU: Conference organizing gives me an opportunity to meet and work with a lot of people I wouldn’t be likely to encounter otherwise. While working on a conference, people are positive, thoughtful, and cooperative; conferences bring out the best in people. Competitors work together to present coherent panels; academics and vendors listen to each other with respect.

Even more fun than the conference itself is the development and preparation process, including: talking with would-be speakers about their topics, matching papers to peer reviewers (who provide critical reviews for the organizers and helpful advice for the author), and cooperating to select a program that is excellent, varied, and balanced.

This year I am working with a group of people I really respect and admire to develop a new conference called “Balisage: The Markup Conference.” Balisage will focus on both practical and theoretical aspects of XML and other markup technologies.


SC: Where do you think we are right now in making progress toward the Semantic Web?

BTU: Well, if I knew what “the Semantic Web” was, I could probably give a sensible answer to that. The problem is that I have heard way too many definitions, and all they seem to have in common is that they include “something really cool.”

There are a few “islands” of rich semantically marked-up content on the web, in which users can do highly precise, targeted searches and retrieve small amounts of information that meet their needs, and follow that information to other sources. And I think it is likely that there will be more of these “islands” in the future. However, creating these (and their links) takes a lot of planning, a lot of work, a lot of money, and the knowledge and cooperation of a lot of people; in other words, it doesn’t just happen. And “the semantic web” will not just happen; people who need rich retrieval will create the content to meet their needs if and when there is sufficient need.

SC: What are the advantages of semantics in data?

BTU: Especially in the publishing world, moving from proprietary systems to XML-based systems is justified because the publisher wants to create “value-added information products.” Usually, they mean that they want to enable users to find exactly the content needed at a particular moment, quickly and easily, and perhaps also to make the user aware of other related information that may be useful to them then. Maybe they can interact with the data and make it their own. Semantically tagged data provides the “hooks” necessary to create such information products. By knowing, for example, what words or phrases are drug names, or product names, or place names, it becomes possible to index to this information, to traverse it, and to link to it and from it imaginatively.
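
As a rough illustration of the “hooks” described above, here is a minimal Python sketch that builds an index of tagged names from a small XML fragment. The element names (`article`, `sec`, `drug`, `place`), the sample text, and the `build_index` helper are all assumptions made up for this example; they are not taken from any particular tag set or from Mulberry’s work.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# Hypothetical sample content; the tag names are invented for illustration.
SAMPLE = """<article>
  <sec id="s1"><p>Patients received <drug>aspirin</drug> in <place>Boston</place>.</p></sec>
  <sec id="s2"><p><drug>Aspirin</drug> was compared with <drug>ibuprofen</drug>.</p></sec>
</article>"""


def build_index(xml_text, tags=("drug", "place")):
    """Map each tagged term to the ids of the sections that mention it."""
    root = ET.fromstring(xml_text)
    index = defaultdict(lambda: defaultdict(set))
    for sec in root.iter("sec"):
        for tag in tags:
            for el in sec.iter(tag):
                # Normalize case so "Aspirin" and "aspirin" share one entry.
                term = (el.text or "").strip().lower()
                if term:
                    index[tag][term].add(sec.get("id"))
    return index


index = build_index(SAMPLE)
for tag, terms in index.items():
    for term, sections in sorted(terms.items()):
        print(tag, term, sorted(sections))
```

Because the names are marked up explicitly, the index needs no guessing about which words are drugs or places; the same markup could equally drive links or targeted search.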

SC: Generally speaking, how are publishers and information providers doing with smart implementations of XML and related technologies? What kind of good things are you seeing? What are you NOT seeing?

BTU: I see significant variations in the degree to which publishers and information providers are implementing XML and related technologies. I hesitate to characterize any publisher’s technology decisions as “smart” or “not smart”; technology decisions are business decisions and must be made in concert with a complete business plan. Some publishers are moving XML into the very beginnings of their workflow and are planning new products based on the library of rich semantically tagged content they are building up in their repositories. Others have made a well-informed business decision to stick with traditional publishing processes for most, or even all, of their material; when they want to develop an electronic product using some of that content, they convert that content, and only that content, into XML optimized for that application. I think well-informed publishers can make a variety of good technology decisions.

Now that you ask, I realize that in the last few years I have seen fewer and fewer technology decisions made on the basis of an article in an in-flight magazine or a recommendation from a tool vendor; publishers seem to be less susceptible to trends and to be making better-informed business decisions than they were a few years ago.

SC: If a publisher or information provider was just setting up to deliver a new product or content set, what kinds of things should they think about?

BTU: The same things publishers have always thought about: Who will use this product? What will they do with it? How will it differ from other products on the market? What will the users expect from such a product? And only after they are comfortable with the answers to those questions, they should think about how to create and maintain the content, which will include both human and technological aspects.

In other words, publishers need to be publishers first, and work with technology second, and in support of the publishing activity. There is a tendency, especially among technology tool vendors, to encourage enthusiasm for the technology. I think this is harmful; no technology will make a business successful if it isn’t used to support a sensible business model.

SC: What trends or technologies are you seeing that have you excited about the future, either short-term or long-term?

BTU: I see less and less hype and more and more calm use of sensible technologies. I see people using XML, not because it is the NEXT BIG THING but because it is a technology that allows them to do things they need to do with their content. It gets them out of just print-on-paper and lets them move their content to the next level(s).

I also see fewer and fewer people getting religious about tag sets (DTDs or schemas) and more and more talking about converting content from one tag set to another as needed.

In other words, I suppose the community of markup users is maturing. And I like it.
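
For readers who have not seen a tag set conversion, here is a minimal Python sketch of the idea: renaming elements from one vocabulary to another using a lookup table. The tag names in `TAG_MAP`, the sample document, and the `convert` helper are hypothetical. Real conversions are usually written in XSLT, as discussed earlier in the interview, and typically involve restructuring as well as renaming; this sketch covers only the simplest one-to-one case.

```python
import xml.etree.ElementTree as ET

# Hypothetical mapping from a source tag set to a target tag set.
TAG_MAP = {"section": "sec", "heading": "title", "para": "p"}


def convert(xml_text, tag_map=TAG_MAP):
    """Rename every element that has an entry in tag_map; leave the rest alone."""
    root = ET.fromstring(xml_text)
    for el in root.iter():
        el.tag = tag_map.get(el.tag, el.tag)
    return ET.tostring(root, encoding="unicode")


SOURCE = "<section><heading>Intro</heading><para>Hello <b>world</b>.</para></section>"
converted = convert(SOURCE)
print(converted)
# prints: <sec><title>Intro</title><p>Hello <b>world</b>.</p></sec>
```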

SC: What is your favorite “gadget?”

BTU: The “time remaining” clock. It doesn’t tell you what time it is; it doesn’t display any numbers at all; it is an analog device that shows, graphically, how much time remains for whatever task you have set for yourself. We use it at conferences, where it tells speakers that they have lots more time, or a little more time, or that they had better wrap it up – all with an ever-shrinking wedge of red. It is also useful at workshops and in other group activities, where it tells people that time is slipping away and they need to get to work.
