Ask the Right Questions: A Technologist's Advice on MCPs

There is significant energy in the scholarly publishing industry right now around Model Context Protocol (MCP). Publishers are exploring it, societies are being pitched on it, and a growing number of technology vendors are presenting it as a natural extension of content strategy. Most of that conversation has focused on opportunity. Not enough of it has focused on mechanics.

That gap matters because MCP is not just a new distribution channel. It is a layer of infrastructure with direct bearing on content authority, access rights, competitive positioning, and trust. Before the industry commits to building, buying, or partnering around it, there are questions worth sitting with.

If you are still getting your footing on MCP fundamentals, our colleague Sam Green's post Demystifying MCP is a good place to start before continuing here. The short version: MCP is an open standard that allows AI assistants to reach outside their training data and query live, verified sources in real time. For scholarly publishing, that means peer-reviewed research surfaced at the moment a researcher needs it, grounded in authoritative content rather than whatever the model already knows.

But the technology itself is only part of the story. The harder questions are about coverage, governance, and control.

Whose Corpus Are You Actually Connecting?

When a researcher opens an AI assistant and asks a question, the quality of the answer depends entirely on what the AI can access. And right now, the scholarly publishing MCP landscape is fragmented in ways that matter.

The first model is a single-source MCP: one society or specialized publisher connecting its own content. This approach has real strengths. A focused, authoritative corpus in a defined domain can deliver depth and credibility that broader coverage cannot match. The flagship journal of a leading scientific society, or the complete back catalog of a specialized publisher, is not just content; it is the canonical record of its field. A well-constructed MCP built around that kind of corpus has genuine value that is distinct from scale.

The second model is a commercial or aggregator MCP, where a large, corporate organization connects its own catalog and potentially layers in licensed content from other sources. This approach offers breadth, but breadth has a ceiling. Most major publishers have only begun connecting portions of their catalogs, and even a fully connected aggregator MCP represents a curated slice of the literature shaped by existing commercial relationships. A researcher asking a question at the intersection of molecular biology and climate science is unlikely to get a complete picture from a single aggregator, regardless of how large it is. There is also a structural question worth asking: when content selection and result ranking are controlled by a party with a direct commercial interest in certain content's visibility, researchers and their institutions should understand the relationship between what surfaces and who decided it would.

The third model is a subject-domain MCP that aggregates data from multiple societies and publishers to cover the field comprehensively. This is conceptually the most powerful option for the researcher, since a single query surfaces the best available evidence from across the relevant literature, regardless of who published it. But it raises its own questions about who governs the aggregation, how content is weighted, and whose business model shapes what appears at the top of the results.

None of these models is inherently superior. The right answer depends on the use case, the audience, and, critically, who controls the underlying logic for selecting and returning content. From what I’ve seen, that last question is the one not happening loudly enough in the current conversation.

The Financial Reality

The library community is a natural long-term home for scholarly MCP integrations. Libraries have the institutional relationships, the existing subscription infrastructure, and the mandate to connect researchers with the best available information. But it is worth noting where the early energy is actually coming from: the corporate sector, and pharmaceutical companies in particular, are asking the most pointed questions right now. What is available, when will it be ready, and what will it cost? In industries where the ability to rapidly synthesize research literature translates directly to competitive advantage, companies are not waiting for the market to mature. They are actively evaluating.

That dynamic matters for how the business model gets built. The way MCP access gets priced and packaged in its early commercial deployments will shape what libraries can expect when broader institutional adoption follows, and the way it gets settled will have lasting consequences.

The risk is that MCP access becomes what a digital banner ad was initially to advertising in the late 1990s: a throw-in with the purchase of a print ad. Publishers and technology vendors that are anxious to demonstrate adoption may offer MCP connectivity as a complimentary feature bundled with an existing subscription. That feels like good news for libraries, but it sets a precedent that is hard to walk back. If MCP access is a zero-price feature rather than a value-priced capability, the investment required to build and maintain it well will not be sustainable. The content that gets surfaced through underfunded MCP integrations will not be current, will not be properly authenticated, and will not honor the complex entitlement structures that decades of library subscription management have built.

The more durable model treats MCP connectivity as a distinct service with its own value proposition. Libraries that adopt it early will shape what that value proposition looks like. That influence is worth something, and the institutions that engage seriously now, rather than waiting for a market to form, will be better positioned to negotiate terms that serve their communities.

The Mechanics That Shape What Researchers Find

The technical mechanics of how an MCP operates within an LLM workflow are not widely understood, and the gaps in that understanding have direct consequences to publishers, societies, and the researchers they serve.

When a user’s AI assistant has an MCP turned on, the LLM does not automatically consult it for every query. The LLM decides, based on the MCP’s structure and description, when to invoke the tool. That decision is heavily shaped by how the MCP is written. The instructions, the tool descriptions, and the framing of what the MCP does and when it should be used all influence whether the LLM reaches for that tool. Anyone who lived through the early days of search engine optimization will recognize the pattern: the same optimization techniques used to influence when search engines surfaced particular content can be applied to MCP descriptions to influence when and how often the AI invokes a particular tool. That optimization can be designed to serve the researcher. It can also be designed to serve the operator.

Once the MCP is invoked, the query is passed to the operator’s infrastructure. The operator sees the query. The search executes according to the operator’s defined logic, which may be semantic, keyword-based, vector-based, or some combination. The operator’s system returns a set of results.

And here is the critical point: those results are entirely within the operator’s control before they reach the LLM. The MCP can reorder results. It can filter content in or out. It can prioritize titles, authors, and publication types. What the LLM requests and what the operator returns are not necessarily the same. The LLM treats the results as the best available answer to the query. The user sees whatever the AI surfaces from those results. Neither the LLM nor the user has visibility into what was filtered, reweighted, or withheld by the MCP.

This has direct implications for how publishers and societies should evaluate MCP reporting. There is a meaningful difference between reporting on what the LLM discovered on behalf of the researcher and reporting on what the MCP returned to the LLM to surface. The first tells you what your content was allowed to do. The second tells you what the LLM considered genuinely relevant in context. The gap between those two numbers is a signal worth understanding, and operators control which number you see. If your reporting only reflects MCP returns, you may be looking at an inflated or deflated picture of how your content is performing in AI-assisted research workflows, without visibility into what the model set aside.

The entity operating the MCP has substantial influence over what researchers find, and do not find, when they use AI-assisted research tools. Understanding that clearly is part of making good decisions about how to participate.

The Trust Layer: Incentives, Infrastructure, and Content Integrity

The MCP space in scholarly publishing currently includes in-house publisher development, independent technology companies, and startups moving quickly to establish a position. Each carries a different set of capabilities and incentives, and understanding those differences is a question of content integrity, not just vendor preference.

The subscription and entitlement infrastructure of scholarly publishing is among the most complex in any industry. Decades of institutional licensing, consortium agreements, geographic restrictions, open access overlays, embargo windows, and individual user authentication have produced an intricate system. It is also an important one. The financial model of the entire publishing ecosystem depends on that system functioning correctly.

Building authentication correctly in scholarly publishing is slow, careful work. A technology partner that has not encountered the edge cases of a society publisher running split models across three different access types on a single journal will face a steep learning curve at implementation. Those costs tend to fall on the publisher or society that trusted the partner to get it right. Due diligence on implementation experience matters as much as due diligence on features.

Commercial publishers operating their own MCPs face a structural tension worth naming: their incentive is to surface their own catalogs effectively, which does not always align with surfacing the most relevant content for a researcher regardless of who published it. That is not a claim about bad faith; it is an observation about incentive structure. When a society’s content is aggregated into a commercial publisher’s MCP alongside the publisher’s own journals, the terms of that arrangement determine whose content gets prioritized when a query could reasonably surface either. Those terms deserve the same scrutiny that any distribution agreement would receive.

The compensation question deserves equal attention. If a society’s payout is tied to how often its content is surfaced through the MCP, the operator’s influence over what gets returned suddenly has direct financial consequences. Performance-based compensation sounds intuitive, but in a system where the operator controls the results, it creates an incentive structure worth examining carefully. A lump-sum annual licensing model removes that particular pressure, but it also removes the signal. And once the signal is gone, something more consequential goes with it: you no longer have an independent measure of what your content was actually worth to researchers in that moment. Your value is not something you can observe, benchmark, or negotiate from. It gets handed back to you as a number produced by the party that controlled your visibility all along.

The right questions for any publisher or society evaluating an MCP partner go beyond features. They are about what the operator can see, what they can change, what constraints are built into the system, and what recourse exists when the system does not behave as expected.

Getting It Right

The questions raised here are not arguments against MCP. They are arguments for approaching it with the same rigor the scholarly publishing community has always brought to consequential infrastructure decisions.

The pace of commercial deployment is moving faster than the governance frameworks around it. That gap is closeable, but it requires the right questions to be asked early: Who controls the corpus? Who controls the ranking logic? What does the reporting actually measure? What happens to entitlement infrastructure when a new intermediary sits between the researcher and the content? These are not technical questions for technologists to answer in private. They are business and mission questions for publishers, societies, and libraries to put directly to anyone they are considering partnering with.

MCP is a significant development in how research is accessed and used. Getting it right requires the same care that the scholarly publishing community has always brought to the infrastructure of knowledge. That care does not disappear because the interface is conversational.

Ask the Right Questions: A Technologist’s Advice on MCPs

Whose Corpus Are You Actually Connecting?

The Financial Reality

The Mechanics That Shape What Researchers Find

The Trust Layer: Incentives, Infrastructure, and Content Integrity

Getting It Right

Contact Info

Silverchair

Ask the Right Questions: A Technologist’s Advice on MCPs

Whose Corpus Are You Actually Connecting?

The Financial Reality

The Mechanics That Shape What Researchers Find

The Trust Layer: Incentives, Infrastructure, and Content Integrity

Getting It Right

Contact Info

Silverchair

Sign Up For Our Newsletter