It’s difficult to get an exact count of how many journal articles are published each year.
- Elsevier publishes more than 500,000 articles annually.
- PubMed comprises more than 30 million citations just for biomedical literature.
- It’s estimated that approximately 2,500,000 articles are published in 30,000 journals per year.
- PLOS One published 16,371 articles in 2020.
A journal publisher’s entire collection is deep and complex. It can be difficult to get a handle on the attributes of that content, particularly if collections span several decades. How do you strategize and plan for organizational changes to the development, production, and distribution of your content if you don't have a clear picture of what you already have? What legacy issues might be embedded in your content structure from the various DTDs used over the years? Do you know how many links are broken in your corpus?
Deep Analysis and Insight
As a long-time partner to Silverchair, Data Conversion Laboratory (DCL) conducts deep audits of a publisher’s entire catalog in preparation for content migration onto the Silverchair Platform. Over the years, we’ve uncovered a long list of content structure “clarity checks.” Once issues, conflicts, and errors are identified, they can be corrected in the files. This process ensures a smooth migration of a publisher’s library onto the Silverchair Platform and helps publishers maximize the value of their content and the experience of researchers using the platform.
Last week, we rolled this service out to the wider market as Content Clarity.
The clarity checks consistently reveal insights that improve content interoperability and discovery:
- Invalid assets
- Missing callouts
- DOI conflicts
- Missing DOCTYPE
- Invalid XML/Parsing errors
- DTD list
- ISSN info
- Bad date
- Missing ref-list title
- Missing article title
- Duplicate IDs within an article
- Missing self-uri PDF
- Missing volume/issue
- Subject categories
- Number of files
- Byte count
- Full text vs header-only count
- PDF pages
- and so much more
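To make a few of these checks concrete, here is a minimal sketch in Python of how an audit of a single article file might look. It is an illustration only, not DCL's actual tooling: the function name `clarity_checks` is hypothetical, and the element names (`article-title`, `ref-list`, `title`) assume JATS-style XML.

```python
import re
import xml.etree.ElementTree as ET
from collections import Counter


def clarity_checks(xml_text: str) -> list[str]:
    """Return issue labels for one article (hypothetical helper, JATS-style tags assumed)."""
    issues = []

    # ElementTree discards the DOCTYPE declaration, so look for it in the raw text.
    if not re.search(r"<!DOCTYPE\s", xml_text):
        issues.append("missing DOCTYPE")

    # A file that does not parse cannot be inspected any further.
    try:
        root = ET.fromstring(xml_text)
    except ET.ParseError:
        issues.append("invalid XML")
        return issues

    # An absent or whitespace-only <article-title> counts as missing.
    title = root.find(".//article-title")
    if title is None or not "".join(title.itertext()).strip():
        issues.append("missing article title")

    # Every <ref-list> is expected to carry its own <title> child.
    for ref_list in root.iter("ref-list"):
        if ref_list.find("title") is None:
            issues.append("missing ref-list title")

    # The same id value used on more than one element within the article.
    ids = Counter(el.get("id") for el in root.iter() if el.get("id"))
    duplicates = sorted(i for i, n in ids.items() if n > 1)
    if duplicates:
        issues.append("duplicate IDs: " + ", ".join(duplicates))

    return issues


sample = """<article>
  <front><article-meta><title-group><article-title> </article-title></title-group></article-meta></front>
  <body><sec id="s1"><p>Text</p></sec><sec id="s1"><p>More</p></sec></body>
  <back><ref-list><ref id="r1"/></ref-list></back>
</article>"""

print(clarity_checks(sample))
# ['missing DOCTYPE', 'missing article title', 'missing ref-list title', 'duplicate IDs: s1']
```

Run across an entire corpus, per-file results like these could be tallied into the kind of report described above. Checks such as DOI conflicts or broken links need corpus-wide or network lookups and are outside this sketch.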
Metadata Inventory
Revisiting metadata enrichment in light of contemporary understanding and updated taxonomies is important for publishers. Content that can be navigated only by keywords chosen decades ago may fail to engage today’s researchers. Access to past publications, although supported by the latest technology, may be based on an outdated taxonomy, which limits access. Content Clarity unveils metadata across current and legacy content and is the first step when updating subject metadata to conform to new taxonomies or ontologies. Following a Content Clarity audit, DCL can programmatically clean up the files and enrich the content so it is optimized for modern platforms and researchers.
"Content Clarity helps us triage issues across content platforms. The reporting was thorough, and more information was provided than I expected."
Join Our Discussion
Want to learn more? Silverchair’s Craig Griffin and Data Conversion Laboratory's Brian Trombley invite you to join them for a conversation about what you can learn when diving deep into your content structure to understand issues that impact downstream interoperability and discoverability.
Bring plenty of questions for Craig and Brian, and join us for:
The Silverchair Universe Presents Data Conversion Laboratory: CONTENT CLARITY: AUDIT, ANALYSIS, AND INSIGHT ACROSS YOUR CORPUS
March 16, 2021 at 11:00 am ET