Everything You Thought You Knew About Meta Data…But Were Afraid to Ask

Kudos to Thad McIlroy for his article on Meta Data. He tells you what it is, why it’s important and what to do to take advantage of it in a publishing environment. He’s written a book: The Metadata Handbook.


It’s no surprise that there’s a lot of confusion around metadata for books. It’s complicated. If only they hadn’t used the “M” word—metadata. It reeks of digital complexity. And then you read the standard definition: “Metadata is data about data.” Gee, thanks. As if your eyes hadn’t already glazed over.

Here’s what metadata really is: It’s all of the title information that used to reside just in your title catalog and your EDI (electronic data interchange) system, like Pubnet. Now that online is the top sales channel for digital and physical books combined, that title information has become paramount: it’s the only way publishers can make sure online retailers list their books correctly.

To put it another way: Metadata is your book online.

Online, your customers can’t grab a copy off the shelf, read the back cover blurb, and thumb through the pages to scan a selection of the text. That’s all metadata now: The back cover becomes the title description. The advance reviews are carefully tagged for online. “About the author” is now a separate web page on Amazon.com. And that’s the easy part!

Read this story in our digital edition, or download to get free access to all our great magazine content on your tablet.

More challenging are tasks such as making sure that the online preview doesn’t waste precious pages on the prelims; that your author Brian Smith doesn’t link by mistake to Brian W. Smith, author of My Husband’s Love Child—a Novella; that the video shows up online; and many more.

Basic metadata is pretty easy, but ONIX, the metadata standard, now defines more than 200 fields. And these days there’s intermediate metadata, and then there’s advanced.

Today we’re going to test your skills: Beginner, Intermediate and Advanced.

You Think You’re A Beginner

You’ve got the basics in hand. If you’re in the U.K., you know they’re called BIC Basic; in the U.S., it’s BISAC Core Data Elements. Did you know:

1. Metadata matters.

No matter what you might think, metadata matters. A lot. It won’t make bad books sell well. But with literally millions of books for sale online, it makes a huge difference for those deserving titles that are slipping down your backlist, no longer carried in bricks and mortar stores, waiting to be discovered online.

Most people think about famous titles when they think about metadata. Like 50 Shades of Whatever. Perversely, metadata doesn’t matter quite as much for books that feature the best metadata—new books and bestsellers. Readers are going to find those titles one way or another. But Pets and Heaven—What the Bible Says About Our Animal Friends could use some additional metadata to move up the “Pet Loss Grief” section of Amazon from its current position at No. 852,554 in books.

2. Know which metadata matters the most.

You’ve got to walk with metadata before you run. Metadata mastery takes time. Title and author are searched more often than publisher, subtitles more than the number of pages.

Different national organizations that oversee book metadata have different names for the most important metadata. The U.S.’s BISAC/BISG talks about the “Core Metadata Elements.” There are 31 in all. Some, like “Case Pack/Carton Quantity,” are mainly important to resellers rather than to readers. BIC in the U.K. is more down-to-earth with its 11 BIC Basic elements. They’re mostly what you would expect: title, price, pub date, ISBN, and so on.

Included in both lists is something you may not think of as metadata: the cover. Nielsen’s metadata study, “The Link Between Metadata and Sales,” released in early 2012, proved that the cover is in fact the most important metadata element. Sales for titles with all 11 elements, including the cover, were 473 percent higher than for titles missing the cover. No other metadata element comes close for sales impact. Covers still sell books.
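For readers who want to audit their own list for gaps like a missing cover, here is a minimal Python sketch. The field names and the REQUIRED_FIELDS list are assumptions made for illustration (a handful of the elements named above), not the official BIC Basic or BISAC lists, and the sample records are invented.

```python
# Hypothetical subset of basic elements; NOT the official BIC Basic or BISAC list.
REQUIRED_FIELDS = ["title", "author", "isbn", "price", "pub_date", "cover_image"]


def missing_fields(record):
    """Return the required fields that are absent or blank in a title record."""
    return [
        field
        for field in REQUIRED_FIELDS
        if not str(record.get(field) or "").strip()
    ]


def audit(catalog):
    """Map each incomplete record's ISBN (or title) to its missing fields."""
    report = {}
    for record in catalog:
        gaps = missing_fields(record)
        if gaps:
            key = record.get("isbn") or record.get("title") or "<unknown>"
            report[key] = gaps
    return report


# Invented sample records for demonstration only.
catalog = [
    {"title": "A Frontlist Title", "author": "A. Writer", "isbn": "9780000000002",
     "price": "24.95", "pub_date": "2012-09-01", "cover_image": "front.jpg"},
    {"title": "A Backlist Title", "author": "B. Writer", "isbn": "9780000000019",
     "price": "14.95", "pub_date": "", "cover_image": None},
]

print(audit(catalog))
```

Run against a real export, a report like this gives a quick list of the titles that need attention first, with the cover image at the top of the priority list.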

3. There’s a difference between findability and discoverability.

Discussions about the value of metadata move quickly to proclaim that metadata is essential to discoverability. Discoverability isn’t defined: We’re left with the vague sense suggested by the word’s root “discover.” Most people know that there are too many books and too much information, so the idea that it’s challenging to find a book resonates. Certainly everyone who’s part of the publishing supply chain, whether author, publisher or reseller, is well aware of the problem. They know that just because a book is good doesn’t mean it will ever be found.

Metadata’s first task is mere findability—and the distinction is important. For argument’s sake let’s assume that half of the books purchased or borrowed from libraries are searched for just by title or by author. Those books must be found, not discovered. It’s the other half that will be discovered, whether by wandering through the stacks at the library or strolling the aisles of a bookstore (or the online equivalent).

Findability is the challenge of locating exactly what you’re looking for (even if you have incomplete or inaccurate information about the book). Discoverability is the process by which a book appears in front of you at a point where you were not looking for that specific title (although you are looking for something in the same direction).
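The distinction can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the catalog is a toy, the subject labels are made up, "Saying Goodbye to Your Dog" is an invented title, and fuzzy matching from the standard difflib module stands in for whatever a real retailer's search engine does.

```python
import difflib

# Invented sample catalog; subject labels are placeholders, not real BISAC codes.
CATALOG = [
    {"title": "Pets and Heaven", "subjects": {"pet loss", "religion"}},
    {"title": "Saying Goodbye to Your Dog", "subjects": {"pet loss", "grief"}},
    {"title": "Panama", "subjects": {"fiction"}},
]


def find(query, catalog=CATALOG):
    """Findability: locate a specific title, tolerating a slightly wrong query."""
    by_lower = {book["title"].lower(): book["title"] for book in catalog}
    hits = difflib.get_close_matches(query.lower(), list(by_lower), n=1, cutoff=0.6)
    return by_lower[hits[0]] if hits else None


def discover(subject, catalog=CATALOG):
    """Discoverability: surface titles the reader did not ask for by name."""
    return sorted(book["title"] for book in catalog if subject in book["subjects"])


print(find("pets in heaven"))   # a misremembered title still resolves
print(discover("pet loss"))     # browsing a subject surfaces unknown titles
```

Findability leans on accurate title and author fields; discoverability leans on subject categories and descriptions, which is why the two place different demands on the same record.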

Metadata plays an important role in both of these tasks, although the role it plays is substantially different in each.

4. Metadata is your sacred duty.

These days most publishers are doing a pretty good job of assigning basic metadata to their new titles as they’re published. You won’t find many bestsellers missing a description and a sampling of the advance reviews. The book cover shows up loud and clear.

It’s the backlist that’s a mess.

The larger and wealthier publishers have mostly digitized their backlists, which afforded them an opportunity to refresh the basic metadata for those titles. It’s the millions of print titles that have yet to be digitized that suffer most. These days I can’t help noticing metadata when I’m ordering books online and I cringe at the mess of metadata afflicting most publishers’ print backlists.

I understand that the ROI isn’t certain. Fixing all that muddled metadata is a big task. But you owe it to the book. You owe it to your authors. Most of them don’t know enough to complain. If a book is worth keeping in print it’s worth maintaining the metadata via your regular ONIX feeds, particularly the cover image. It’s your sacred duty as a publisher.

You Think You’re Intermediate

Your company looks to you to make sure the metadata is accurate, not just on your site, but on Amazon, Apple, Barnes & Noble, Google and more. Did you know:

1. A partner can relieve the pain of good metadata.

If at any point while reading this article you feel that your inner pinball machine just tilted and froze, it’s time to look for a metadata partner. Potential partners offer services ranging from basic to full hand-holding, and their prices are thoroughly reasonable. You just need to find the right one for your scale and mission.

If you use a distributor for print or digital books, speak to that distributor. They probably can help you pin down accurate data. Meanwhile Bowker in the U.S. and Nielsen in the U.K. are each national agencies for book data. BookNet Canada fills the same role you know where.

If you’re farming out your e-book conversions there’s a pretty good chance that your vendor will also be able to help with metadata. They’ll certainly have some names to recommend.

There are also companies in North America and the U.K. that live and die by the excellent quality of the book data they produce. Two companies that I’ve had good experiences with are Firebrand in the U.S. and BooksoniX in the U.K.


2. Different retailers support different metadata.

And they support it differently for self-publishers loading one title at a time versus larger publishers feeding data coded in ONIX format.

I watch the way Amazon, Apple, Barnes & Noble, Google and Kobo each handle metadata and I scratch my head. This is just data. It’s based on a standard called ONIX. Why do they treat the subject categories differently? Why do they treat reviews differently? Goshdarnit, why can’t they even respect the book’s subtitle?

Sadly, when you reach intermediate status as a metadata technician you’ll need to learn the idiosyncrasies of each online reseller’s handling of the main metadata elements—if indeed they handle them at all. Keeping up with these changing idiosyncrasies is another reason to find an able partner.

My pet peeve is Amazon’s handling of subtitles for self-published e-books. Amazon insists they be included in the same data field following the title. For example: “The End of the Line: Romney vs. Obama: the 34 days that decided the election: Playbook 2012 (POLITICO Inside Election 2012) (Kindle Single) [Kindle Edition].” They’re then treated as part of the title, rather than in their lesser role. It leads to all kinds of trouble in findability.

3. Foreign editions are a leading cause of chaos.

God created the world in a week, and soon thereafter publishers began treating the two largest English-speaking markets as completely separate. As a result, most upper-mid-list titles and higher find a home with a separate publisher in each country. The two national publishers rarely release the same book at the same time (unless it’s Harry Potter). The price is set separately. Sometimes the subtitle changes. Even the title.

This arrangement made reasonably good sense in the world before the Internet. It doesn’t make sense today. As I know publishers have no intention of discontinuing the practice, all that can be asked is that they think hard about the metadata. ONIX offers fields that define the situation. You must specify in which countries you hold rights and chase e-tailers that ignore your specification.
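To make the rights specification concrete, here is a minimal Python sketch that builds the kind of territorial statement ONIX 3.0 allows. The tag names (PublishingDetail, SalesRights, SalesRightsType, Territory, CountriesIncluded) and the code values reflect my reading of the ONIX 3.0 reference tags and code lists and should be treated as assumptions to verify against the specification; the country lists are invented, and this is a fragment, not a complete product record.

```python
import xml.etree.ElementTree as ET


def sales_rights(rights_type, countries):
    """Build one <SalesRights> composite for a list of ISO country codes."""
    rights = ET.Element("SalesRights")
    ET.SubElement(rights, "SalesRightsType").text = rights_type
    territory = ET.SubElement(rights, "Territory")
    ET.SubElement(territory, "CountriesIncluded").text = " ".join(countries)
    return rights


# Assumed code values: "01" = for sale with exclusive rights,
# "03" = not for sale, in the listed countries. Verify against the code lists.
publishing_detail = ET.Element("PublishingDetail")
publishing_detail.append(sales_rights("01", ["US", "CA"]))
publishing_detail.append(sales_rights("03", ["GB", "IE", "AU", "NZ"]))

xml_text = ET.tostring(publishing_detail, encoding="unicode")
print(xml_text)
```

Stating both where the edition is for sale and where it is not leaves a retailer less room to guess, which is the point of chasing those that ignore the specification.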

4. Enhanced metadata matters.

As we point out in The Metadata Handbook, the term “enhanced” when used with “metadata” is, like the term “basic,” also subject to variations from one national agency to the next. It’s best just to think the term through logically. If basic metadata includes the descriptive elements, how can you enhance that “data”? We list several choices, some of them obvious:

Tables of contents

Author and contributor biographies and interviews

Digital images beyond the cover image

Video trailers and author interviews on video


One way to think of it is that basic metadata brings customers into the store while enhanced metadata closes the sale.

You Think You’re Advanced

You’re in charge of data services for a mid-size publisher, or maybe larger. Metadata is a core responsibility, and yours alone. Did you know:

1. You need to get to work with ONIX 3.

Metadata has a standard. It’s called ONIX. Most publishers are not using any version of it. Those that use ONIX are usually using version 2.1, released in 2004. They’re not yet using version 3.0, released in 2009.

Version 3.0 was established mainly because of the “need to improve the handling of digital products.” Digital products are routinely 20 percent of publisher sales, but publishers are describing their products with an older metadata standard designed for print.

The publishers are not solely to blame. The prime culprits are the big online resellers, from Amazon to Sony. They’re taking their own sweet time migrating to version 3.0.

Nonetheless, advanced metadata mavens need to get started creating ONIX 3 because it will soon be supported more widely and because it does make a difference for findability. It takes time to relearn ONIX, so now is the time to get started.

2. International metadata varies.

Publishing is by nature language-centric, and countries are dialect specific. American English is different than Canadian English and different than British English, and that’s ignoring the regional dialects in each country. Authors have struggled with this for many years, but publishers always played to their home market. Those days have ended.

With the vast internationalization of the retailing of English books, publishers will soon see that their market is as much the millions of Chinese readers with English as a second language as it is the folks back home. At this stage it’s a fine point, but soon publishers will look to language experts who know how to describe a book with a reduced and simplified vocabulary that plays as effectively for a native speaker as it does for the student in Serbia.

3. The book’s web site should be the No. 1 source of metadata.

If you have one of those authors who never bothered to build a web site, you’ve probably built one for them, or at least for their new book. The web designer you assigned to the task might be contracted to maintain it for a few months after the book is published. Afterwards the site will be abandoned. It will become a site that advertises how little the author and the publisher now care about the book and its readers.

But not to worry. Most people looking for the book will never see the site. Amazon’s listing for the book will grab the top spot on Google and Bing, further solidifying Amazon’s stranglehold on book retail.

That’s all wrong. Because you don’t control the content on Amazon. Amazon won’t publish the full text of the rave review that sells a copy every time it’s read. Amazon won’t link to the author’s articles and professional CV.

The book’s web site should be at the top of the list whenever anyone searches for the book. If it’s not, you’re doing something (probably several things) wrong.

4. The book’s contents are its richest mine of metadata.

This is an act of faith: The entire book must be searchable online. Yep, the whole thing. What if five years from now a reader can’t remember the author’s name or the book title, and only remembers there’s a character named Chet who says “The night had written a check that daylight couldn’t cash.” Yep, Panama by Thomas McGuane. If the author weren’t so well-known, the only route to the book would be its full text, searchable online.

The easiest way to make a book searchable online is through Google Books. The best way is on the book’s own site. There are many ways to partially obscure the text or to make it challenging to read via a browser, so theft should not be an issue. Not being found; that’s the issue. You have to pull out all the stops.
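Here is a minimal Python sketch of what "searchable by remembered quote" means in practice. It is an illustration only: the library holds one line built around the quote above plus an invented filler text, the helper names are hypothetical, and a real site would use a proper search index rather than scanning every book.

```python
import re


def normalize(text):
    """Lowercase and drop punctuation so a loosely remembered quote still matches."""
    text = text.lower().replace("\u2019", "'")  # treat curly apostrophes as straight
    return " ".join(re.findall(r"[a-z0-9']+", text))


def search_quote(quote, library):
    """Return titles whose full text contains the remembered phrase."""
    needle = normalize(quote)
    return [title for title, text in library.items() if needle in normalize(text)]


# Toy "full text": the framing words and the second book are invented.
library = {
    "Panama": "Chet said, \"The night had written a check that daylight couldn't cash.\"",
    "Some Other Novel": "It was a dark and stormy night.",
}

print(search_quote("the night had written a check", library))
```

The reader supplies neither title nor author, only a fragment of the text, and the book's own contents do the work that no catalog field could.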

I hope that you picked up a trick or two when you passed the test. You can see that there’s an awful lot more to metadata than the ISBN and suggested retail price. Take your time—the task is far bigger than most publishers assume.

Metadata won’t turn a dog into a pony. But you don’t publish dogs, do you? Metadata will make your very fine frontlist and backlist both findable and discoverable, and will give you the opportunity to market every title to its full readership potential.


Thad McIlroy is an electronic publishing analyst with The Future of Publishing, based in San Francisco and Vancouver, BC. He is the co-author with Renée Register of The Metadata Handbook (DataCurate, 2012).
