Member Interview: Thane Kerner, Silverchair

(republished with permission from STM News, July 2011, www.stm-assoc.org)

Founded in 1993, Silverchair delivers advanced semantic technologies, publishing platforms and e-learning solutions to scientific, technical and medical publishers and other organisations. We asked Thane Kerner, Co-founder and CEO, to tell us about the background to the organisation, and why we should all be thinking ‘semantically’ when it comes to publishing.

Q1. You were a medical journal publisher before founding Silverchair. Did your experience in the medical information arena provide the catalyst for the creation of Silverchair?

It was mostly instinctive. I had been originally pulled into STM from consumer publishing by a society editor who wanted to enliven his journal with a more market-oriented sensibility. And I was lucky enough to guess that the publishing technologies we were experimenting with circa 1990 were going to accelerate change in professional publishing, which had been a relatively staid industry. So two trends underpinned our strategy—if you can call those inchoate notions a “strategy”—the shift in publishing from being product-focused to being market- and customer-focused, and the potential for technological change to substantially change our products and services and our business models. And, I was fascinated by medical information in particular because in many ways it is the most mission-critical, high-value domain of all.

Q2. Silverchair has a large and expanding taxonomy, focussed on the biomedical industry. How important are taxonomies and how can they be reliably developed?

Taxonomies are key infrastructure of the next generation of networked information systems. Just as XML created a media-neutral format to express syntactic structure, taxonomies are essential to creating a neutral way to express semantic infrastructure. XML tells systems what a data object is; semantic metadata tells systems what an object is about. The central purpose of the semantic web is disambiguation, a way to make content descriptors simultaneously comprehensive and exclusive. For example, if I seek to retrieve a content set about “myocardial infarction,” I want the system to include content about “MI” and “heart attack” (and, by the way, “myocardial inFRACtion”), but to exclude content about Michigan. From there, we can develop ontologies, which in simple terms use taxons (root terms in a taxonomy) to create triples, a kind of grammatical statement. We can say, for instance, that “Drug_Name” “Treats” “Infection_Name.” These triples are essentially tiny applications; they make abstract (programmatic) connections between information sets. So they provide the capability to infuse content sets into work flows in real time, they allow systems to develop intelligent real-time user analytics and responses (super-personalization), and they have myriad other potentialities that we are just discovering. The current frontier, business-wise, is to deliver the precisely correct unit of information to a user at precisely the correct moment, which potentially improves efficiency and quality by orders of magnitude. Silverchair does indeed have what we believe is the most robust biomedical taxonomy ecosystem available. It’s important to distinguish between a classic taxonomy, which is a retrospective classification system that reduces a concept to a single declared nomenclature, and a taxonomical ecosystem delivered to content applications in real time. The former is a helpful starting place, but only that.
The distance between a taxonomy and a semantically optimized information platform is long and complex. Silverchair has spent the past 12 years building out that complete infrastructure and deploying it from end to end, learning first how to do it properly, and then how to do it at scale. Establishing this ecosystem is, to understate it, non-trivial. And while our roots were grown in the fertile soils of biomedical information, we are rapidly building out across STM domains by abstracting the principles, methods, and best practices we’ve learned and created. Currently Silverchair is addressing the physical sciences, engineering, computing, and other core STM domains with our semantic ecosystem approach. One other important dimension of taxonomies in general and the trajectory of these technologies in our industry: the world is moving beyond enterprise taxonomies and into domain taxonomies. Any content organization working to develop and deploy semantically optimized information products and services must make a crucial decision in this respect. I am frequently confronted with companies and societies that want to develop an enterprise taxonomy to help them better organize and deliver their content. However, if the normalization process is only enterprise-wide, then it is fatally limited. Information services increasingly offer value based more on enhanced discovery, navigation, usability, and productivity than they do on the underlying content artifacts. Publishers should not make the mistake of a half measure by moving from one silo (the book, journal, etc.) into another slightly larger silo (the organization). STM information consumers never single-source their content needs organizationally. Ergo, the ability to connect content assets inter-organizationally will drive business models, and the semantic architecture required to drive that is consequently domain-wide rather than enterprise-delimited.
Finally, I’d add that taxonomic and ontologic infrastructure is best delivered in a service architecture. Taxonomies and ontologies are living, breathing, growing, mutating datasets. One doesn’t just create a taxonomy and then turn it on and let it run. It must be continually modified and supplemented, and best practices for updating these datasets require their own sophisticated real-time connections to content origination workflows and user interactions. So economically, it’s quite challenging for any single organization alone to support the specialized skill sets and technologies that create and advance the ecosystem needed to realize the full gamut of business benefits made possible by semantic technologies.
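As an editorial illustration of the disambiguation and triple ideas described in this answer, here is a minimal Python sketch. It is not Silverchair's implementation: the concept identifiers, the cue-word lists, and the helper names (`tag_concepts`, `search`, `objects_of`) are all assumptions invented for the example, and the triple reuses the placeholder terms from the interview. The sketch treats “MI” as an ambiguous surface form that is only accepted when context words support one reading, so a search for “myocardial infarction” picks up “MI”, “heart attack” and the “inFRACtion” misspelling while leaving out Michigan.

```python
import re

# Hypothetical mini-taxonomy. "variants" are surface forms that map to one
# concept; "cues" are context words used only to resolve shared variants.
CONCEPTS = {
    "myocardial_infarction": {
        "variants": ["myocardial infarction", "myocardial infraction",
                     "heart attack", "mi"],
        "cues": {"cardiac", "coronary", "patient", "troponin"},
    },
    "michigan": {
        "variants": ["michigan", "mi"],
        "cues": {"state", "detroit", "lansing"},
    },
}


def tokenize(text):
    """Lowercase the text and keep alphabetic tokens only."""
    return re.findall(r"[a-z]+", text.lower())


def tag_concepts(text):
    """Return the set of concept IDs the text is judged to be about."""
    tokens = tokenize(text)
    padded = " " + " ".join(tokens) + " "  # allows whole-phrase matching
    context = set(tokens)
    tagged = set()
    for concept_id, concept in CONCEPTS.items():
        for variant in concept["variants"]:
            if f" {variant} " not in padded:
                continue
            # A variant listed under several concepts (here "mi") is ambiguous.
            shared = sum(variant in c["variants"] for c in CONCEPTS.values()) > 1
            # Accept unambiguous variants outright; ambiguous ones need a cue.
            if not shared or concept["cues"] & context:
                tagged.add(concept_id)
    return tagged


def search(query, documents):
    """Retrieve documents that share at least one concept with the query."""
    wanted = tag_concepts(query)
    return [doc for doc in documents if wanted & tag_concepts(doc)]


# A triple store in its simplest form: (subject, predicate, object) tuples.
# The terms are the placeholders used in the interview, not real data.
TRIPLES = [("Drug_Name", "Treats", "Infection_Name")]


def objects_of(subject, predicate):
    """Follow a predicate from a subject to its objects."""
    return [o for s, p, o in TRIPLES if s == subject and p == predicate]


DOCUMENTS = [
    "Patient presented with acute MI and elevated troponin.",
    "Heart attack risk factors in adults.",
    "Outcomes after myocardial inFRACtion.",
    "The conference will be held in Ann Arbor, MI, in the state of Michigan.",
]

if __name__ == "__main__":
    for doc in search("myocardial infarction", DOCUMENTS):
        print(doc)
    print(objects_of("Drug_Name", "Treats"))
```

Cue-word matching is a deliberately crude stand-in for real disambiguation; the point of the sketch is only that retrieval runs on concepts rather than strings, and that a triple lets a program follow a relationship from one information set to another.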

Q3. Silverchair has recently announced SCM6, the new ‘natively semantic product delivery platform’. What are the key advantages of a ‘natively semantic’ platform as opposed to a ‘traditional’ platform?

We’ve been down a winding path on this front. From 1998 through 2007, we created products on previous versions of SCM (also natively semantic; “native semantics” are not new to SCM6) from end to end. A great deal of meticulous (and non-scalable) effort went into creating very high-end, bespoke products and services for publishers. We learned how to do it well before we learned how to do it big. We then achieved some tremendous technical breakthroughs in automated enrichment, and decided we would productize our semantic infrastructure, offering taxonomy, enrichment, and other semantic application services to publishers on other platforms. It made a lot of sense based on the theoretical rhetoric of the industry, but at a practical level the theory had not been operationalized. It turned out these other content delivery platforms were not equipped to use the semantically enriched data in meaningful ways. In part, these platforms were not designed to write functionality based on that kind of metadata, but it’s also a reality that the conceptual/ideation skills had not been cultivated. As we realized that neither the technologies nor the expertise were in place on other platforms, we decided we needed to deliver a high-scale semantic product development platform, one that could support the complete content portfolio with best-of-breed interfaces and functionality, and could use the semantic infrastructure to speed up feature and product deployment. A platform constructed on a semantic foundation does a lot more than just enable better products. Operationally, it enables an optimized product development process. If you think about classic N-tier systems architecture (data-application-interface), the heaviest investment (time/effort/money) is in the application rules and then the interface. Data was just considered a by-product of traditional production workflow. Semantic architecture inverts that.
We make a much larger investment in the data layer, which in turn reduces the investment required in the application and interface layers. Now, why is that important in our industry? Because the days of the giant build are over. The original giant build was the printing plant: turn it on and run it for 25 years producing exactly the same product output. When the industry initially moved online, the orientation echoed that: we built big platforms that took 24-48 months to plan, develop, and deploy, and then expected these platforms in all their high-scale (but generic) glory to run the same functionality more or less indefinitely. Anyone who is now trying to do effective web/mobile product development on that timescale is doomed. How can a service that was designed for the web three or four years ago and is just being launched today hope to be compelling? Moreover, where is growth going to come from in the next decade? Institutional budgets are exhausted. The first great wave of web productization in STM has passed its zenith. To grow revenue (organically) with the current platforms and models, we’ll have to take budget away from our competitors. That’s expensive and time-consuming, and doesn’t make the pie any larger; it just makes margins smaller. So the next big wave of web productization will need to be different. We think the basic themes are [a] service layers that emphasize integration and utility over traditional content IP, [b] new forms of dynamic content mobility that will reach far more deeply into professional and educational workflows, and [c] the revival of individual customers. So a natively semantic approach is actually appropriate not merely for the contemporary technical milieu, but for the fundamental strategic environment in which information providers all operate. Rapid service and feature iteration is essential.
Low-investment experimentation, in an ecosystem where we can instantly absorb and interpret user response and modify, is the product development process that will work in this frenetic atmosphere.

Q4. Do you anticipate that semantic publishing will revolutionise scholarly and professional publishing?

Understand this: the train has left the station. Semantic technologies are already driving product development and marketing all over the web. Quite honestly, professional publishing is typically something of a laggard in adopting market-changing technology infrastructure. Companies and technologies most identified with web leadership—Amazon, Facebook, eBay, Google—have been deploying these technologies for a long time. As has the ultimate canary-in-the-coal-mine web industry: pornography. At our peril we think of our own space as immune to these complex but essential evolutions; we rationalize that perhaps we are too small, or too contextually particular, or that there exists some other dispensation that will allow us to avoid the effort. Developing a semantic footing seems expensive, requires complicated new skills, and, like any infrastructure project, has an opaque ROI. Nonetheless, this wave is inexorable. It’s hard to imagine any major player in the content/knowledge space thriving without developing semantic architectures.

Q5. Who within a publishing organisation should be thinking about semantic technologies and what advice would you give to them?

First, this begins as a business imperative, not as a technology or production initiative. It is a technology that actuates a new set of strategies. Second, a semantic orientation should be foundational, rather than a bolt-on tactic. Centering semantic technologies in post-production data management has wasted more time and money than most other rabbit holes combined. Third, a semantic ecosystem is very difficult to pilot. It’s axiomatic that the value created by semantics increases exponentially with volume of content and desired services or features. Fourth, as I mentioned previously, taxonomies should be approached at a domain (rather than enterprise) dimension. Finding or establishing de facto domain leadership, either as the biggest player or (more likely) as a consortium of leading organizations in a domain will lay the track for a multiplicity of second-order opportunities and models. Given its importance and its cross-functional implications, senior leadership in strategy, technology, marketing, and editorial must all develop a working understanding of semantic infrastructure and must create organizational momentum for change. It really isn’t easy—it’s just imperative.

Q6. As a member of STM, what are the key benefits STM membership offers for Silverchair?

We certainly value the opportunities we get to exchange perspectives with our colleagues, partners and competitors. Because our business is focused on enabling and supporting the aforementioned shifts in strategic product development, it’s vital that we participate in the idea flow that is generated in the STM context. I firmly believe that the models of the future will originate in creative “co-opetition,” as the axes of value fluctuate and morph. We aspire to contribute commensurately with what we educe, and to keep building intellectually valuable relationships with our counterparts across the industry.