
A guided introduction to structured data

Structured data is no longer a nice-to-have but an essential part of your SEO strategy. Take a look through our guided introduction to structured data.

Google, and the other major search engines, are changing the rules of the SERPs. The increased prevalence of enhanced search features means that search engine marketers need to think beyond the traditional ‘ten blue links’ approach.

Key to this is structured data. All of the search engines’ rich result features are pulled from sites’ structured data, so without it, you’ll stand little to no chance of featuring in these increasingly important rich results.

Despite its importance, however, there’s quite a bit of confusion as to what exactly structured data is, and the correct terminology to use both for structured data itself and the enhanced search results it drives. 

The information you need also tends to be scattered all over the place: there are lots of excellent in-depth articles, but no real overall guide to what structured data is and how it works. So, we decided to put together our ‘in a nutshell’ guide for you.

We’ll cover:

  1. The Fundamentals: The World Wide Web & Search Engines
  2. Structured Data: What It Is
    1. Structured & Linked Data
    2. Schema.org
    3. Microdata
    4. JSON-LD
  3. Structured Data: What It’s For
    1. Better Quality Traffic
    2. Rich Results Eligibility
  4. Rich Results: The New-Look SERPs
    1. Old Terminology
      1. Rich Snippets
      2. Rich Cards
    2. New Terminology
      1. Rich Results
      2. Enriched Search Results
      3. Knowledge Graph Results
      4. Carousel
      5. Featured Snippets

Structured data

  • Structured data is additional information added to the HTML of a page. It is based on name value pairs, and helps search engines both to better understand the unstructured information of a web page, and to show that content for relevant user queries
  • Schema.org is an easy-to-use vocabulary understood by all major search engines that you can use to add structured data to your page
  • Microdata is one method of adding structured data, by adding in-line markup to the HTML
  • Schema.org JSON-LD is the recommended method of adding structured data to a page, and has several advantages over microdata.

Rich results

  • Rich snippets were the original name for enhanced organic results 
  • Rich cards were an expansion on rich snippets that provided a new way to showcase snippets of information from your page
  • Rich results is the new umbrella term for all Google’s enhanced search features
  • Enriched search results are special rich results that act as interactive portals for users (e.g. Google for Jobs)
  • Featured snippets are snippets from a top 10 ranked page that Google thinks best matches a user query
  • Featured snippets are unrelated to rich snippets and do not require structured data
  • Knowledge Cards use data from the Knowledge Graph and are different to both featured snippets and regular rich results.

The fundamentals: the World Wide Web and search engines

To understand structured data, you first need to understand how it fits into the basic principles of search engines and the World Wide Web.


The World Wide Web, since its first conception, has been about the sharing of information: making information open and accessible to all. It was an immensely ambitious project and to this day remains one of the world’s finest achievements.

As the web exploded in size, however, it raised a key problem: there was simply far too much information to be able to navigate it easily. Search engines try to solve this problem by indexing the whole web but only returning the most relevant information for user searches.

To be able to do this, search engines therefore have to be able to do two key things: 

  1. Collect, index and categorise all the information on the internet
  2. Interpret what a user is searching for, and return the content that best matches their search

This may sound simple enough, but it's actually a vastly complicated and difficult process, and structured data plays a key part in both of these steps.

Understanding the content

Step one for a search engine is to understand the information it has indexed. Information can be roughly pooled into two categories: structured and unstructured. Structured information is in a nice, clean, machine-readable format - e.g. an Excel spreadsheet. It’s predictable and easy for computers to understand. 

Unstructured information is any information that doesn’t follow a clearly defined format. In the context of the internet, that is mostly text, but can include dates, places, infographics, and more. 

The problem for search engines is that most of the information on the web is unstructured: messy, unpredictable, and resource intensive both to index and return results from. 

[Image: structured vs unstructured data]

Technologies such as HTTP and URIs are core components of the web, and standards built on top of them, such as RDF and RDFa, help with organising and structuring this data.

However, additional protocols and methods have been developed to build on this core functionality and to help categorise and organise the unstructured information of the web.

One of these methods is structured data.

Understanding the query

Step two for a search engine is to understand the user query. There are three main types of queries: informational, navigational, and transactional (or "know, go, do", if you prefer).

To best answer a user’s query, search engines use two key tactics: semantic search and semantic queries.


Semantic search relates to interpreting the semantics of a user’s search and natural language processing. It’s a fascinating subject, but it’s not that relevant here so we won’t focus on it. 

Semantic queries, according to Wikipedia, ‘enable the retrieval of both explicitly and implicitly derived information based on syntactic, semantic and structural information contained in data’. In short, this involves the search engine understanding the context of the indexed information, and relating it to the context of the user’s query.

To do this, a search engine must be able to understand how information relates both to real world things, and to other information in the index.

Structured data (and the related linked data) are key tools that search engines use for this.

Structured data: what it is

Now that we’ve covered the basic principles of what structured data is for, we can look in more detail at how it works and how you can implement it.

Structured data and linked data

Structured data is additional HTML content added to a site that doesn’t change the visible content of the page, whilst providing search engines with machine readable data on that page.

At the top level, it works by pairing a name with a value: 

  • i.e. “[name]”: “[value]”
  • e.g. “url”: “example.com”

As long as you use a recognised vocabulary to provide these names and values in one of a list of accepted formats (more on that below), you will give search engines a way to better understand the content of your site.

This additional information helps search engines understand how a piece of information relates to a real world thing, and gives them pointers about how to index and categorise your content. 

Linked data, meanwhile, is a method of publishing structured data that allows it to be linked to other structured data elsewhere on the web. This gives search engines contextual information that helps them understand how the content of a page relates to other content around the web. 

Knowing this allows them to use the semantic query method we mentioned earlier, and serve your content to relevant, related user queries.
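To make the difference between structured and linked data a little more concrete, here is a minimal sketch in JSON-LD (a format we cover in more detail further down). It is an illustration under our own assumptions rather than a recommended template: the organisation name and every URL are placeholders. In practice, the `sameAs` values would be real profile or reference pages that describe the same organisation.

```html
<!-- Illustrative sketch only: every name and URL below is a placeholder. -->
<script type="application/ld+json">
{
  "@context": "https://schema.org",
  "@type": "Organization",
  "@id": "https://example.com/#organization",
  "name": "Example Agency",
  "url": "https://example.com/",
  "sameAs": [
    "https://example.org/profiles/example-agency",
    "https://example.net/directory/example-agency"
  ]
}
</script>
```

The name-value pairs on their own are the structured data. The `@id` gives the organisation an identifier that other data can point to, and `sameAs` tells a search engine that those other URLs describe the same real world thing, which is the ‘linked’ part.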

Schema.org

Because structured data is so useful for search engines, it makes sense for them to encourage content publishers to use it. So, in 2011, Google, Bing and Yahoo (joined later that year by Yandex) collaborated to produce Schema.org: a universal structured data vocabulary that was both machine readable, and easy for non-technical users to read and write.

There are various vocabularies you can use for structured data, but Schema.org has become by far the most widely used, understood, and accepted.

So, whenever you see the term ‘schema markup’, or ‘schema structured data’, or just ‘schema’, what this generally means is structured data added to the HTML markup of a page, and written in Schema.org vocabulary. 

The full Schema.org vocabulary, and the various things you can use it for, can be found on the Schema.org website.

There are two main ways of implementing Schema.org markup on your page: microdata, and JSON-LD. You could also use RDFa (or microformats, an older approach with its own separate vocabulary), but these are much less widely used, so we won’t cover them here.

Microdata

Microdata is just a way of adding additional information to HTML by nesting it in-line in the content.

For this, we’ll use a very simple example from an imaginary ‘About Us’ page:

‘We are Distinction, a full service digital agency creating outcomes, not outputs. We were founded by James and Greg Bloor in 2001.’

In its raw form, this is unstructured data and not much use to a search engine. This is what it would look like with Schema.org microdata added:

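<section itemscope itemtype="https://schema.org/Organization">
	We are <span itemprop="name">Distinction</span>, 
	<span itemprop="description">a full service digital agency creating outcomes, not outputs</span>.
	We were founded by
	<!-- Each founder is nested as its own Person item rather than plain text -->
	<span itemprop="founder" itemscope itemtype="https://schema.org/Person"><span itemprop="name">James Bloor</span></span>
	and
	<span itemprop="founder" itemscope itemtype="https://schema.org/Person"><span itemprop="name">Greg Bloor</span></span>
	in
	<span itemprop="foundingDate">2001</span>.
</section>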

 

Microdata works with HTML5 and is understood by all major search engines. However, it is trickier to manage as it can be spread out over a page, is harder to implement (as you need to add it in-line), and is less human readable.

For this reason, the recommended way of implementing Schema.org markup is via JSON-LD (see below). 

JSON-LD

JSON-LD (JavaScript Object Notation for Linked Data) is built on the popular and widely used JSON format, but designed specifically for encoding Linked Data (see above). Unlike microdata, where structured data is added in-line directly into the HTML, JSON-LD keeps the structured data in a single script block, usually placed in the head of the page.

JSON-LD is very flexible, easy to understand, and easy to manage, even for non-technical users. As a result, it is the most widely used way of implementing structured data on a site. When you hear people referring to ‘schema’ or ‘schema markup’, what they are generally referring to is Schema.org JSON-LD.

Using the above example, this is what the Schema.org markup would look like in JSON-LD format:

<script type="application/ld+json">
{
  "@context": "https://schema.org",
  "@type": "Organization",
  "name": "Distinction",
  "description": "A full service digital agency creating outcomes, not outputs",
  "url": "https://distinction.co.uk/",
  "founder": [
    { "@type": "Person", "name": "James Bloor" },
    { "@type": "Person", "name": "Greg Bloor" }
  ],
  "foundingDate": "2001-06-26"
}
</script>

As you can see, this is a much cleaner and much easier to manage format, and we can easily add to it to showcase, for example, information about our business addresses or the awards we have won.
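As a rough sketch of what that kind of extension could look like, the block below adds an `address` and an `award` to the Organization markup. Both properties exist in the Schema.org vocabulary, but the values here are invented placeholders for illustration and are not Distinction’s real address or awards.

```html
<!-- Sketch only: the address and award values are invented placeholders. -->
<script type="application/ld+json">
{
  "@context": "https://schema.org",
  "@type": "Organization",
  "name": "Distinction",
  "url": "https://distinction.co.uk/",
  "address": {
    "@type": "PostalAddress",
    "streetAddress": "1 Example Street",
    "addressLocality": "Example Town",
    "postalCode": "EX1 1EX",
    "addressCountry": "GB"
  },
  "award": "Example Agency of the Year (placeholder)"
}
</script>
```

Note that the address is a nested object with its own `@type`, in the same way that a name-value pair can hold another set of name-value pairs.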

Because all the structured data is housed in a single place (typically the head of the page) instead of being in-line, it is less scattered and much easier to manage than microdata. It is also easy for search engines to read, and loads more quickly than microdata. Plus, you can inject it via tag management systems like Google Tag Manager for even easier management.

That’s why we and most search engines recommend using JSON-LD for your structured data implementation.

Structured data: what it’s for 

What does adding structured data to your site mean for you, and what can it look like in practice?

We won’t fully cover why it’s important to have structured data on your site, as we go over this in depth in the second article in this series. There are, however, two key practical benefits.

Better quality traffic

First things first: structured data is not currently a direct ranking factor (at least for Google), so you won’t get better ranking positions simply for having structured data on your pages. 

However, what you can get is improved targeting. Because structured data helps Google and other search engines better understand the content of your pages, they are better able, and more likely, to rank and serve your content for relevant searches. This can result in better quality, more valuable search traffic.

Rich results eligibility

The second benefit is that it significantly improves your chances of (though does not guarantee) earning one of the enhanced search features that are becoming increasingly dominant in the SERPs.

Search engines, and in particular Google, have been steadily moving away from the well-known ‘ten blue links’, and adding more and more enhanced search features to their SERPs. 

This is not without its drawbacks for content producers: you can see my own thoughts on what this could mean for smaller brands here. However, these features are here to stay, and getting one of these rich results can certainly bring real life practical benefits. These include increased traffic and CTRs, greater authority, and the opportunity to pick up traffic you may otherwise not have got.

Rich results: the new-look SERPs

Most people are familiar with what the different search features look like, but there is a lot of confusion over what exactly these rich results should be called. 

In forums and articles about structured data and search features, you’ll often see things like ‘rich snippets’, ‘featured snippets’ and ‘the knowledge graph’ used interchangeably. In truth, however, these are all very different things and it’s important to know the difference. Collectively, they’re often referred to as ‘SERP features’, which is fine, if a bit of a catch-all name.

Google hasn’t helped matters by changing its mind about what to name things and not always being 100% clear about its own nomenclature. So, to help, here is a breakdown of the old and the new, currently accepted terminology* on rich results.

* Partly for ease, and partly because Google is far and away the largest search engine and the one as an SEO you should care about the most, we’re only focusing on Google terminology here. We’re also ignoring paid ad elements.

The old terminology

Rich snippets

A ‘snippet’ is literally a small or brief extract. Google uses this term very broadly, and it encompasses everything from the humble meta description all the way up to the most complex of their rich results.

‘Rich snippets’ were first introduced back in 2009, and they had fairly humble beginnings: starting just as a way to enhance your standard search results (using structured data), with reviews etc.

[Image: rich snippet introduction]

This steadily expanded over the years to recipes and more. You will frequently hear of rich results being referred to as ‘rich snippets’, but this isn’t accurate (more on that later). There’s also a lot of confusion between rich and featured snippets, but they’re actually quite different things (see below) and should not be conflated. 
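For a sense of the kind of markup that sits behind a review-style rich snippet, here is a minimal sketch using the Schema.org `Product` and `AggregateRating` types. The product name and rating figures are placeholders we have made up for illustration. Markup like this only makes a page eligible for an enhanced result and does not guarantee one, and Google’s own documentation sets out the full requirements.

```html
<!-- Sketch only: the product name and rating figures are placeholders. -->
<script type="application/ld+json">
{
  "@context": "https://schema.org",
  "@type": "Product",
  "name": "Example Product",
  "aggregateRating": {
    "@type": "AggregateRating",
    "ratingValue": "4.5",
    "reviewCount": "27"
  }
}
</script>
```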

Rich cards

Rich cards were rolled out in 2016. They were sold as an expansion on rich snippets, as a new way of previewing your site content. They tended to appear in carousel format. They still exist, but the term itself has now been superseded.

[Image: rich cards on mobile]

The new terminology

Rich results 

Partly because of Google’s major new focus on enhanced search features and also probably due to the general confusion surrounding proper terminology (rich vs featured snippets, rich snippets vs rich cards, rich cards vs Knowledge Graph cards, etc.), Google mercifully decided to simplify things in 2017.

They rolled all their rich search features under a single umbrella: rich results. This can apply to a variety of content types (articles, books, recipes, products etc.) with a variety of available features (breadcrumbs, reviews, cards, etc).

So, from now on, if you refer to any non-standard organic feature as a ‘rich result’, you’re probably on the safe side. However, there are still a few important exceptions (see below).
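As one small illustration of a rich result feature, the sketch below marks up a breadcrumb trail with the Schema.org `BreadcrumbList` type. The page names and `example.com` URLs are placeholders assumed for the example, not a real site structure.

```html
<!-- Sketch only: page names and URLs are placeholders. -->
<script type="application/ld+json">
{
  "@context": "https://schema.org",
  "@type": "BreadcrumbList",
  "itemListElement": [
    { "@type": "ListItem", "position": 1, "name": "Insights", "item": "https://example.com/insights/" },
    { "@type": "ListItem", "position": 2, "name": "Marketing", "item": "https://example.com/insights/marketing/" },
    { "@type": "ListItem", "position": 3, "name": "Structured data guide", "item": "https://example.com/insights/marketing/structured-data-guide/" }
  ]
}
</script>
```

Each `ListItem` carries a `position`, so the search engine knows the order of the trail without having to infer it from the page layout.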

Enriched search results

Slightly confusingly, Google has decided to distinguish between ‘rich results’ and ‘enriched search results’. Enriched search results are enhanced, interactive classes of regular rich results, and are (at the time of writing) available for job postings, recipes, and events.

[Image: rich search results on mobile]

Users can then interact with and filter these search results (e.g. searching for recipes under a set number of calories, or job listings from a particular employer or in a certain wage bracket, etc).

They effectively act as separate ‘portals’ that you can browse and interact with without ever needing to leave the SERPs.

[Image: new jobs schema]
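To show where filters such as employer and wage bracket could come from, here is a hedged sketch of job posting markup using the Schema.org `JobPosting` type. Every value (job title, dates, employer, location and salary) is a placeholder assumed for illustration. Google’s job posting documentation lists which properties it actually requires.

```html
<!-- Sketch only: all values below are placeholders, not a real vacancy. -->
<script type="application/ld+json">
{
  "@context": "https://schema.org",
  "@type": "JobPosting",
  "title": "Example Job Title",
  "description": "Placeholder description of the role.",
  "datePosted": "2024-01-15",
  "validThrough": "2024-02-15",
  "employmentType": "FULL_TIME",
  "hiringOrganization": {
    "@type": "Organization",
    "name": "Example Employer",
    "sameAs": "https://example.com/"
  },
  "jobLocation": {
    "@type": "Place",
    "address": {
      "@type": "PostalAddress",
      "addressLocality": "Example Town",
      "addressCountry": "GB"
    }
  },
  "baseSalary": {
    "@type": "MonetaryAmount",
    "currency": "GBP",
    "value": {
      "@type": "QuantitativeValue",
      "minValue": 30000,
      "maxValue": 35000,
      "unitText": "YEAR"
    }
  }
}
</script>
```

Because the employer and salary range are stated as explicit name-value pairs, a search engine has machine readable values it can use to filter listings.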

Knowledge Graph results

The Knowledge Graph is a form of ‘knowledge base’, a system which stores a complex mix of both structured and unstructured information. The Knowledge Graph is a huge collection of information pulled from lots of sources that Google uses as a reference point to answer many of the billions of user queries it receives each day.

Using data from the Knowledge Graph, Google will display relevant information in a Knowledge Card for certain searches. This can be a big boost for a brand that manages to acquire one of these elusive results, but getting into the Knowledge Graph is not easy.

Two of the major sources of information Google uses to populate the Knowledge Graph are Wikipedia and Wikidata, so just having all the usual steps fulfilled (good markup, local business data submitted etc.) will probably not be enough if you aren’t deemed noteworthy enough to have your own Wikipedia page.

Carousel

A carousel is a list-like display containing multiple rich results. Carousels tend to appear only on mobile, and currently the only content types that can appear in them are recipes, courses, and articles. You’ll need to follow additional structured markup guidelines if you want to stand a chance of being featured in a carousel - see Google’s developer guide for more info.

[Image: recipes]
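As a rough idea of the additional list markup involved, the sketch below uses the Schema.org `ItemList` type on a summary page that links out to individual recipe pages. It assumes each linked page carries its own `Recipe` markup, and the URLs are placeholders. Treat it as an illustration only and check Google’s developer guide for the current requirements.

```html
<!-- Sketch only: URLs are placeholders; each linked page is assumed to have its own Recipe markup. -->
<script type="application/ld+json">
{
  "@context": "https://schema.org",
  "@type": "ItemList",
  "itemListElement": [
    { "@type": "ListItem", "position": 1, "url": "https://example.com/recipes/first-recipe/" },
    { "@type": "ListItem", "position": 2, "url": "https://example.com/recipes/second-recipe/" },
    { "@type": "ListItem", "position": 3, "url": "https://example.com/recipes/third-recipe/" }
  ]
}
</script>
```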

Featured snippet

That leaves us with the anomaly: the featured snippet. People frequently, and incorrectly, conflate this with rich results. The two are not the same thing.

A featured snippet is simply a snippet of content from a page that Google promotes to the hallowed ‘position 0’, above all the organic results.

[Image: featured snippet]

 

Unlike rich results, featured snippets do not rely on structured data. Structured data won’t hurt, but there’s no evidence of a strong correlation between the two. Well-written content, particularly if it’s written to answer a specific question, is key.

You will need to be in the top 10 pages, though, to stand a chance of having a featured snippet.

 

Summary

We’ve covered quite a lot there, so let’s just recap briefly:

Structured data

  • Structured data is additional information added to the HTML of a page. It is based on name value pairs, and helps search engines both to better understand the unstructured information of a web page, and to show that content for relevant user queries
  • Schema.org is an easy-to-use vocabulary understood by all major search engines that you can use to add structured data to your page
  • Microdata is one method of adding structured data, by adding in-line markup to the HTML
  • Schema.org JSON-LD is the recommended method of adding structured data to a page, and has several advantages over microdata.

Rich results

  • Rich snippets were the original name for enhanced organic results 
  • Rich cards were an expansion on rich snippets that provided a new way to showcase snippets of information from your page
  • Rich results is the new umbrella term for all Google’s enhanced search features
  • Enriched search results are special rich results that act as interactive portals for users (e.g. Google for Jobs)
  • Featured snippets are snippets from a top 10 ranked page that Google thinks best matches a user query
  • Featured snippets are unrelated to rich snippets and do not require structured data
  • Knowledge Cards use data from the Knowledge Graph and are different to both featured snippets and regular rich results.

And that, in a fairly lengthy nutshell, is structured data. 

If you want to learn more, we’d recommend you read up on why structured data is so important for SEO.

We hope this has been helpful. Let us know your thoughts in the comments below!

Need help with implementing or optimising your structured data? We offer a comprehensive SEM service for clients across all different industries. Get in touch today and see how we can help.

Written by: Henry France