Google uses structured data that it finds on the web to understand the content of the page, as well as to gather information about the web and the world in general. In this post, I will explain how I create JSON-LD structured data for my blog, powered by Jekyll.

Include JSON-LD in Post Generation

To include JSON-LD structured data in a blog post, you first need to find out which HTML template(s) generate that page. In my case, the HTML is generated by:

/_layouts/post.html

So I need to edit post.html and add a new <script> element:

---
layout: default
---

<script type="application/ld+json">
{
  "@context": "https://schema.org",
  "@type": "BlogPosting",
  "url": "{{ site.url }}{{ page.url }}",
  "name": {{ page.title | jsonify }},
  "headline": {{ page.title | jsonify }},
  "keywords": {{ page.tags | join: ',' | jsonify }},
  "description": {{ page.excerpt | strip_html | strip_newlines | strip | jsonify }},
  "articleBody": {{ page.content | strip_html | jsonify }},
  "datePublished": {{ page.date | jsonify }},
  "dateModified": {{ page.last_modified_at | default: page.date | jsonify }},
  "author": {
    "@type": "Person",
    "name": {{ site.author_name | jsonify }},
    "givenName": {{ site.author_first_name | jsonify }},
    "familyName": {{ site.author_last_name | jsonify }},
    "email": {{ site.email | jsonify }}
  },
  "publisher": {
    "@type": "Organization",
    "name": {{ site.title | jsonify }},
    "url": "{{ site.url }}",
    "logo": {
      "@type": "ImageObject",
      "width": 32,
      "height": 32,
      "url": "{{ site.url }}/icon/favicon.ico"
    }
  },
  "mainEntityOfPage": {
    "@type": "WebPage",
    "@id": "{{ site.url }}{{ page.url }}"
  },
  "image": {
    "@type": "ImageObject",
    "width": {{ page.img_width | default: site.img_width }},
    "height": {{ page.img_height | default: site.img_height }},
    "url": "{{ site.url }}{{ page.img_url | default: site.img_url }}"
  }
}
</script>

Once you’ve added the code above, the generated JSON-LD structured data snippet should be included in your blog post as follows (simplified):

<script type="application/ld+json">
{
  "@context": "https://schema.org",
  "@type": "BlogPosting",
  "url": "https://mincong-h.github.io/2018/08/21/why-you-should-use-auto-value-in-java/",
  "name": "Why You Should Use Auto Value in Java?",
  "headline": "Why You Should Use Auto Value in Java?",
  "keywords": "java,auto-value",
  "description": "Auto Value generates immutable value classes during Java compilation, including equals(), hashCode(), toString(). It lighten your load from writing these boilerplate source code.",
  "datePublished": "2018-08-21 07:22:49 +0000",
  "dateModified": "2018-08-21 07:22:49 +0000",
  "author": {
    "@type": "Person",
    "name": "Mincong Huang",
    "givenName": "Mincong",
    "familyName": "Huang",
    "email": "mincong.h@gmail.com"
  }
}
</script>

Quite easy, right? In the following sections, we’ll see how to choose the right schema and test the generated data. If you have questions about the Jekyll or Liquid expressions, I explain them at the end of this post, in the section “Advanced Configuration”.

Which Schema Should I Use?

I use BlogPosting for my blog posts. Some websites, such as Medium, use NewsArticle. I’m not really sure which is the best choice, but I believe we should use Article or any schema derived from it.

https://schema.org is a useful website for choosing the right schema. Schema.org is a collaborative, community activity with a mission to create, maintain, and promote schemas for structured data on the Internet, on web pages, in email messages, and beyond. Each schema can be found at a URL of the following form:

https://schema.org/${mySchema}

https://webmasters.stackexchange.com is another useful website for choosing the right schema. It has many questions and answers about schemas, and about webmaster topics in general. For example, there’s a discussion about Using Schema.org for blogging: Article VS BlogPosting.

Test Your Structured Data

The Google Structured Data Testing Tool is an easy and useful way to validate your structured data and, in some cases, to preview a feature in Google Search. Try it out:

Google Structured Data Testing Tool

You can provide either a URL or the HTML source code directly. In my opinion, submitting the HTML source code is a good choice, because it lets you validate the local preview (localhost) before pushing the changes to production.

The Google Structured Data Testing Tool is also useful for finding out which fields the target schema requires. My technique is to submit an empty schema to the testing tool, with only the schema name filled in, and let Google tell me which fields are missing.

Note: the image URL will always fail when you submit a blog post generated on localhost, because Google does not recognize the local image URL. This does not matter: the problem goes away once the changes are pushed to production.
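
Before pasting the HTML into the testing tool, it can help to check locally that the generated snippet is at least valid JSON, since an empty Liquid value can leave a gap in the output. The following is a minimal sketch in Python (not part of Jekyll), using only the standard library. The names `JsonLdExtractor`, `check_json_ld` and `sample_html` are hypothetical, and the list of expected keys is illustrative, not an official list of required fields.

```python
import json
from html.parser import HTMLParser


class JsonLdExtractor(HTMLParser):
    """Collects the text of every <script type="application/ld+json"> element."""

    def __init__(self):
        super().__init__()
        self._in_json_ld = False
        self._buffer = []
        self.blocks = []

    def handle_starttag(self, tag, attrs):
        if tag == "script" and dict(attrs).get("type") == "application/ld+json":
            self._in_json_ld = True
            self._buffer = []

    def handle_data(self, data):
        if self._in_json_ld:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._in_json_ld:
            self.blocks.append("".join(self._buffer))
            self._in_json_ld = False


def check_json_ld(html, expected_keys):
    """Parse each JSON-LD block and return the expected keys missing from each one."""
    extractor = JsonLdExtractor()
    extractor.feed(html)
    extractor.close()
    results = []
    for block in extractor.blocks:
        data = json.loads(block)  # raises json.JSONDecodeError on invalid JSON
        results.append([key for key in expected_keys if key not in data])
    return results


# Hypothetical page content; in practice, read a file generated by Jekyll instead.
sample_html = """
<html><head>
<script type="application/ld+json">
{
  "@context": "https://schema.org",
  "@type": "BlogPosting",
  "headline": "Why You Should Use Auto Value in Java?"
}
</script>
</head><body></body></html>
"""

# Illustrative keys only; the testing tool remains the reference for required fields.
print(check_json_ld(sample_html, ["@context", "@type", "headline", "datePublished"]))
```

This only checks JSON syntax and the presence of top-level keys. It does not replace the testing tool, which knows what each schema actually requires.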

Advanced Configuration

Now, let’s talk about the Jekyll and Liquid expressions used in JSON-LD. If the above code fits your needs, you can skip this section.

Use “jsonify” to convert data to JSON. You can apply the jsonify filter to a string or an array to create a valid JSON value. Note that for a string, the generated output already contains the double quotes ("), so do not add them again yourself.

{{ your.property | jsonify }}

For example, the input message:

Line 1
Line 2

is converted to the following output (with double quotes):

"Line 1\nLine 2"

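The same idea can be tried outside Jekyll. The sketch below uses Python’s `json.dumps` as a stand-in for jsonify. This is an analogy only: Jekyll itself is written in Ruby, and the whitespace of the two encoders can differ for arrays and objects. It also shows why wrapping the result in a second pair of quotes breaks the JSON.

```python
import json

# Python stand-in for the idea behind the jsonify filter (an analogy, not Jekyll itself).
message = "Line 1\nLine 2"
encoded = json.dumps(message)

# The surrounding quotes and the escaped newline are part of the encoded value.
print(encoded)

# Adding quotes around an already-encoded string yields invalid JSON.
double_quoted = '"' + encoded + '"'
try:
    json.loads(double_quoted)
except json.JSONDecodeError:
    print("extra quotes produce invalid JSON")
```
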
Use “default” to provide a fallback value. Liquid’s default filter lets you provide a default value for a property of your blog post. It’s useful for optional properties such as the image URL or the modification date. These might not be present in your post, in which case the template falls back to the default image URL (the image of your blog) or to the creation date.

"{{ site.url }}{{ page.img_url | default: site.img_url }}"

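To make the fallback behaviour concrete, here is a rough Python model of the default filter, assuming it falls back when the input is nil, false, or empty, as the Liquid documentation describes. The function `liquid_default` and the `site` and `page` dictionaries are hypothetical and only mirror the variable names used in the template.

```python
def liquid_default(value, fallback):
    """Rough model of Liquid's default filter: fall back on nil, false, or empty."""
    if value is None or value is False:
        return fallback
    if isinstance(value, (str, list, dict)) and len(value) == 0:
        return fallback
    return value


# Hypothetical site and page data, named after the variables used in the template.
site = {"url": "https://example.com", "img_url": "/assets/blog.png"}
page_with_image = {"img_url": "/assets/post.png"}
page_without_image = {}

# The first page keeps its own image; the second falls back to the site image.
for page in (page_with_image, page_without_image):
    print(site["url"] + liquid_default(page.get("img_url"), site["img_url"]))
```
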
Limits of JSON-LD

Even though JSON-LD might be the best format for structured data, not every actor recognizes it. Social networks such as LinkedIn do not read these schemas when creating their metadata. So a JSON-LD snippet alone is not enough: you still need some HTML meta tags.

Next Steps

So far, we have added the JSON-LD snippet to the blog post generation. Is it good enough? No, not yet. You still need to:

  1. Test the real blog URL in Google Structured Data Testing Tool to ensure everything works well, including the images.
  2. Follow the structured data discovery on Google Search Console to ensure that Google Search really discovered them. This process might take a few days.
  3. Continuously improve your JSON-LD snippet to ensure it’s relevant to the content and matches users’ expectations (their search queries). For example, you can edit the content, provide more optional fields, or add new schemas.

Conclusion

In this post, we learnt how to create JSON-LD (JSON for Linking Data) for Jekyll, how to choose a schema, how to test the result, what the limits of JSON-LD are, and which tasks remain once JSON-LD is embedded. By doing this, there’s a chance that your blog posts will rank better and get more views in the coming months. I hope you enjoyed this article. See you next time!

References