Chip Gettinger and I teamed up last November to tackle the topic of Global Structured Content Strategies. We had so much fun that we decided to keep the conversation going on topics common to our respective backgrounds in structured content and globalization. There’s a lot to talk about, so we have divided our conversation into three parts. Here in the first part, we’ll talk about taxonomy and structured content.
Jessica Roland: Chip, to start off, let’s give the readers a quick recap on what the terms “taxonomy” and “terminology” mean.
Chip Gettinger: These two words are related and sometimes confused. I hope everyone saw the excellent blog post Val Swisher of Content Rules wrote in February. Val stated that a taxonomy is a way to classify words in hierarchical groupings. Terminology is a system of words that belong to something in common. She discussed the way terminology and taxonomy relate to each other and their point of intersection.
JR: It seems like taxonomy is dependent on, or at least highly related to, metadata. How do these concepts fit together?
CG: A traditional approach to product documentation has been to embed metadata as part of the content itself. For example, in DITA, the prolog is where authors add metadata elements to a topic, and a map carries the equivalent in its topicmeta. This metadata is commonly used when publishing content with navigation, indexes and search.
It is challenging to manage changes or additions to that metadata later. You have to go back to the source topic to make changes, and that can be onerous. A best practice is to externalize the metadata and make it separate from, but associated with, the topic. This is similar to the concept of separation of presentation from structured content. XML has always promoted separation of structure from presentation…a taxonomy can be organized the same way.
Consistent taxonomy across content silos is an emerging goal for many organizations today. Whether it be knowledge articles, rich media, DAM, web content, SharePoint, Wikis…you can share a taxonomy across many repositories where you’re storing and managing content.
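As a rough illustration of what externalized metadata can look like, here is a minimal Python sketch. It is not how any particular product stores its data: the term IDs, content paths and function names (TAXONOMY, TAGS, labels_for, path_for, rename_term) are all hypothetical. The point it shows is that the content itself carries no metadata, the tags live in a separate structure keyed by content ID, and several repositories point at the same term IDs.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Set


@dataclass
class Term:
    """One node in the shared taxonomy (hypothetical structure)."""
    label: str
    parent: Optional[str] = None  # term ID of the parent, None for a root


# A single taxonomy shared by every repository. IDs are stable; labels may change.
TAXONOMY: Dict[str, Term] = {
    "prod": Term("Products"),
    "prod-router-x": Term("Router X", parent="prod"),
    "plat": Term("Platforms"),
    "plat-linux": Term("Linux", parent="plat"),
}

# Externalized metadata: content ID -> term IDs, stored outside the content.
# The keys stand in for items held in different silos (DITA, knowledge base, video).
TAGS: Dict[str, Set[str]] = {
    "dita/install-router.dita": {"prod-router-x", "plat-linux"},
    "kb/article-1042": {"prod-router-x"},
    "video/setup-walkthrough.mp4": {"prod-router-x", "plat-linux"},
}


def labels_for(content_id: str) -> List[str]:
    """Resolve the display labels for one piece of content."""
    return sorted(TAXONOMY[term_id].label for term_id in TAGS.get(content_id, set()))


def path_for(term_id: str) -> str:
    """Walk up the hierarchy to build a readable path such as 'Platforms > Linux'."""
    path = []
    current: Optional[str] = term_id
    while current is not None:
        term = TAXONOMY[current]
        path.append(term.label)
        current = term.parent
    return " > ".join(reversed(path))


def rename_term(term_id: str, new_label: str) -> None:
    """Change a label once; every tagged item picks it up via its term ID."""
    TAXONOMY[term_id].label = new_label
```

Because the topic, the knowledge article and the video only reference the ID prod-router-x, calling rename_term("prod-router-x", "Router X Pro") changes what labels_for returns for all three without touching any source file.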
JR: So why is the concept of taxonomy important today?
CG: Taxonomies are frequently in use by organizations and are a critical part of the web for focused search, navigation, facets and classifications, online purchases, etc. When you are shopping online for wine, shoes, electronic equipment – whatever – it’s taxonomy driving your selections. For example, if you’re shopping for shoes and you want to narrow your selection to find just the right ones to buy, you might first select athletic shoes, then running shoes, then memory foam insole, then size 10 EE, and so forth. Most eCommerce sites are totally driven by taxonomy – it’s well established in the eCommerce world.
But much of this is absent in the documentation world, which traditionally has deployed full-text search in simple tri-pane HTML help or PDF files, not fully integrated with your larger company website. These approaches follow a table-of-contents structure with breadcrumbs, but online customers can easily become lost when navigating that structure.
We see product content posted on websites converging from multiple sources within your company – documentation, videos, training, support articles, release notes and so on. So if you have a rich taxonomy, it doesn’t matter what the source is because customers can locate what they need by using a taxonomy-based discovery and navigation approach they have become familiar with thanks to its popularization by eCommerce. The content is not only organized in a logical way, but it is tagged with metadata to make it easily discoverable.
Even more cool is when a taxonomy can start to reflect relationships between content. That’s called an “ontology.” This deeper type of taxonomy can recognize patterns and habits, like “oh, you bought this…so how about this?” You’ve seen it on eCommerce sites, and now you’re starting to see that in online product documentation too, with suggestions of a knowledge base article, for example. Some advanced DITA users are using the Subject Scheme map to display content based on classifications, but the metadata may not be easily managed or integrated with the taxonomy in use by other parts of a website.
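To make the shoe-shopping style of narrowing concrete for product content, here is a small Python sketch of faceted filtering. It assumes each item, whatever its source, has already been tagged with the same facets; the sample records, facet names and the functions narrow and facet_counts are invented for illustration and do not come from any real site or product.

```python
from collections import Counter
from typing import Dict, List

# Content from different sources, all tagged with the same (hypothetical) facets.
CONTENT: List[Dict[str, str]] = [
    {"id": "doc-101", "source": "documentation", "product": "Router X",
     "platform": "Linux", "task": "install"},
    {"id": "vid-007", "source": "video", "product": "Router X",
     "platform": "Linux", "task": "install"},
    {"id": "kb-1042", "source": "support article", "product": "Router X",
     "platform": "Windows", "task": "troubleshoot"},
    {"id": "doc-202", "source": "documentation", "product": "Switch Y",
     "platform": "Linux", "task": "install"},
]


def narrow(items: List[Dict[str, str]], **selected: str) -> List[Dict[str, str]]:
    """Keep only the items that match every selected facet value."""
    return [
        item for item in items
        if all(item.get(facet) == value for facet, value in selected.items())
    ]


def facet_counts(items: List[Dict[str, str]], facet: str) -> Counter:
    """Count how many remaining items fall under each value of a facet."""
    return Counter(item[facet] for item in items if facet in item)
```

A customer who picks Router X and then Linux is effectively calling narrow(CONTENT, product="Router X", platform="Linux"), and facet_counts on the remaining items supplies the numbers shown next to each further choice. Note that the source of each item plays no special role: it is just another facet.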
JR: All the classification and figuring out metadata and tagging sounds like it could take a lot of time if done manually…
CG: Yes, and that is why there’s now a category of software called “taxonomy management systems,” which allow you to synchronize taxonomy across silos and centralize it so you only have to make changes in one place. This is a big benefit. For example, if you need to update a product name or add a platform, you can do all that metadata management in one place. There are also established standards for taxonomies and controlled vocabularies from ANSI, NISO, ISO and others.
Another consideration is: who applies the metadata for documentation? In a traditional world, the tech writer is responsible for the content as well as the metadata. We may find in the future that the person writing the content may not be applying the metadata, and instead there could be a group of experts doing the tagging. Or, increasingly, we’ll see companies automating the task. Our partner Smartlogic provides this type of system. It can crawl your content, intelligently derive metadata, and apply the metadata based on relevancy. This takes a lot of the painful manual work away.
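For a feel of what automated tagging involves at its very simplest, here is a deliberately naive Python sketch. It is not how Smartlogic or any commercial system works; it only counts how often a term’s known synonyms appear in a text and keeps the terms that clear a relevance threshold. The term IDs, synonym lists, the suggest_tags function and the 0.05 threshold are all assumptions made up for this example.

```python
import re
from collections import Counter
from typing import Dict, List

# Hypothetical taxonomy terms with the words that should trigger them.
TERM_SYNONYMS: Dict[str, List[str]] = {
    "prod-router": ["router", "routers"],
    "plat-linux": ["linux"],
    "task-install": ["install", "installing", "installation"],
}


def suggest_tags(text: str, min_score: float = 0.05) -> Dict[str, float]:
    """Suggest term IDs for a text, scored by the share of words that match.

    The score is (number of matching words) / (total words), a crude
    stand-in for relevance. Terms scoring below min_score are dropped.
    """
    words = re.findall(r"[a-z0-9]+", text.lower())
    if not words:
        return {}
    counts = Counter(words)
    suggestions: Dict[str, float] = {}
    for term_id, synonyms in TERM_SYNONYMS.items():
        hits = sum(counts[word] for word in synonyms)
        score = hits / len(words)
        if hits and score >= min_score:
            suggestions[term_id] = round(score, 3)
    return suggestions
```

Real systems go well beyond word counts, but even this sketch shows the division of labor: the taxonomy is maintained in one place, and the tagging step proposes terms that a person or a rule can then accept or reject.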
JR: What are some potential pitfalls in making this taxonomy connection?
CG: Like everything, it always comes down to people and governance! It’s often challenging to get people to work together on taxonomy, even when the benefits are clear. The real pitfalls are old methodologies, old ways of doing things. And deciding to really organize your content with a taxonomy is a huge shift in strategy that impacts the whole company. It requires governance, and for that people have to work together, make decisions and communicate, because understanding every content type is too much responsibility for one person.
You also have to think ahead to publishing. The metadata needs to travel along with the content as it’s published. In SDL Knowledge Center, if you attach metadata to content, it will travel together throughout the content lifecycle, and you can update and version it as you go along.
JR: What do we need to be thinking about in the next few years?
CG: Enterprise content management has evolved. But there is too much specialization required in the variety of systems needed to bring all the content together. We won’t see the content silos go away within companies, but content WILL work in unison via taxonomies. It’s virtual enterprise content management, and taxonomy is the common language to unify our customer experience.
Thanks to Chip for his comments and insights! Check back soon for Part Two of this interview, which will explore how taxonomy can be a huge leverage point for multilingual content and localization.
To learn more about taxonomy, see our Connected Content Resource Page.