Combine Drupal, HTML5, and microdata

Make your content easier to find and reuse
Skill Level: Intermediate
Lin Clark
Drupal Developer
Digital Enterprise Research Institute, NUI Galway
01 Nov 2011
With Google, Yahoo, and Bing's announcement of schema.org, microdata is quickly
gaining ground as a way to create applications that rely on data from many different
websites. In this article, learn how to use Drupal to add microdata to your pages.
Easily make your content available for use in applications such as Google's Rich
Snippets.
In June of 2011, the triumvirate of Google, Yahoo, and Bing announced schema.org
and got everyone talking about structured data. Schema.org is a new way for search
engines to understand web pages. If web content authors add a little bit of metadata
to their pages (just a few vocabulary terms), then their search results show up
better in all three search engines.
The extra markup hasn't yet changed the way search results are displayed for many
sites that have implemented schema.org. Web content authors are still eager,
though, to get their pages marked up and ready for consumption by the big three.
Schema.org poses a challenge for web authors who don't have experience with the
different syntaxes for adding structured data to HTML. The syntaxes are:
• Microformats
• RDFa
Combine Drupal,HTML5,and microdata Trademarks
© Copyright IBM Corporation 2011 Page 1 of 18
• Microdata
To add to the challenge, Google (the most influential search engine for many web
authors) indicated that it will only process microdata. Microdata, which is the newest
of the three syntaxes, does not yet have much tool support.
In this article, learn to use Drupal to add microdata to your pages. Prepare your
content so it can be used in applications such as Google's Rich Snippets.
Download the source code for this article.
What is microdata?
Frequently used abbreviations
• FOAF: Friend of a Friend
• RDF: Resource Description Framework
• RDFa: RDF in attributes
Microdata is a simple way to add structured data to pages. It defines a few
attributes, such as itemtype and itemprop, that can be placed on HTML tags to
indicate what the page is about. Microdata was introduced by Ian Hickson, the editor
of the HTML5 specification, in 2009. The roots of the idea existed much earlier than
that.
Microdata is based on RDFa, which is a way of placing RDF in HTML. The idea for
RDFa was introduced by Mark Birbeck in 2004 with a note published by the W3C.
The idea was then incorporated into the next version of XHTML. RDFa introduced
several new HTML attributes, such as property and about, and reused some
attributes, such as rel.
RDFa is powerful, but it can be difficult for authors to know if their RDFa is correct
due to the sometimes-complex interactions of the attributes. RDFa also inherited
some features of XML, such as namespace prefixes, which can be confusing.
Microformats, another version of structured data in XHTML, was launched a little
more than a year later by a grassroots group of developers. In contrast to RDFa,
microformats reuse existing XHTML attributes that web content authors were
already used to, such as the rel attribute on links. Microformats also add a little bit
of semantics within those attributes. An emphasis was placed on only marking up
visible content; it's easy for invisible content to be abused or go out of sync with
visible content.
One problem with microformats is that there isn't a generic way to parse them.
Instead, support has to be added for each microformat. For example, if you want to
process both calendar data and address data, you have to make sure your parser
supports both or use two different parsers. It can also be difficult to get a new
microformat published through the community process.
Microdata brings together good ideas from both microformats and RDFa. Microdata:
• Reduces the complexity of RDFa by reducing the number of attributes
and the options for their placement.
• Eliminates the namespace prefixes.
• Maintains the generic parsing of RDFa, which makes it much easier to
make tools that work on top of published data.
• Maintains the ability for different groups of people to create their own sets
of attribute values, called vocabularies, to use with microdata.
Placing the schema.org vocabulary with microdata
Schema.org is a vocabulary that works well with microdata. Because no approval
body is in charge of vocabularies, the search engine owners were able to devise
their own vocabulary to meet their needs. Most of the vocabulary deals with the
kinds of things Google already focused on for its Rich Snippets: people, places,
events, entertainment, and commerce.
Several good examples (see Resources) demonstrate how to place schema.org
terms on a site. For example, Listing 1 shows simple markup for a description of a
movie enhanced with schema.org terms.
Listing 1. Simple markup for a movie enhanced with schema.org
<div itemscope itemtype="http://schema.org/Movie">
  <h1 itemprop="name">Avatar</h1>
  <span>Director: <span itemprop="director">James Cameron</span>
  (born August 16, 1954)</span>
  <span itemprop="genre">Science fiction</span>
  <a href="../movies/avatar-theatrical-trailer.html"
    itemprop="trailer">Trailer</a>
</div>
What the extra markup does might not be immediately clear. To get an idea, publish
a page with this snippet to the web. You can then enter the URL for that page in
Google's Rich Snippets Testing Tool (see Resources), as in Figure 1. If you don't
have easy access to a web server, you can also copy and paste the snippet into the
live microdata testing tool provided by Opera developer Philip Jägenstedt (see
Resources).
Figure 1. Schema.org microdata extracted from the example in Listing 1
The tool pulled out information about two things: the movie and its director.
The two main concepts in microdata are items and properties of those items. A
property can be set either to a string or another item. For example, the movie is an
item. It has a name, which is a property with a string value. It also has a director,
which is a property with an item value: the person.
To let the parser know you're starting to talk about an item, use the itemscope
attribute. You can also use the itemtype attribute to let the parser know what type
of thing you're talking about.
The itemtype determines which properties can be used in the itemprop
attribute. For example, on the schema.org page for the Movie itemtype you'll find a
list of properties that can be used on the movie (see Resources). Other properties
outside this list can also be used if you use the full URL of the property. For example,
the FOAF vocabulary also specifies a name property. You could use
itemprop="http://xmlns.com/foaf/0.1/name" to use the FOAF name
property instead of the schema.org name property.
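The difference between a short property name and a full-URL property name can be sketched in a few lines of PHP. This is only an illustration: sketch_describe_itemprop() is a hypothetical helper that is not part of Drupal or the Microdata module, and it assumes that PHP's filter_var() URL check is a good enough test for "is this a full URL".

```php
<?php
// Hypothetical helper for illustration; not part of Drupal or the Microdata
// module. It reports which vocabulary is assumed to define a property name.
function sketch_describe_itemprop($itemprop, $itemtype) {
  // filter_var() returns FALSE for short names such as "name" because they
  // have no scheme, and returns the string itself for absolute URLs.
  if (filter_var($itemprop, FILTER_VALIDATE_URL) !== FALSE) {
    return array('property' => $itemprop, 'defined_by' => $itemprop);
  }
  // Short names are interpreted by the vocabulary of the enclosing itemtype.
  return array('property' => $itemprop, 'defined_by' => $itemtype);
}

$short = sketch_describe_itemprop('name', 'http://schema.org/Movie');
$full = sketch_describe_itemprop('http://xmlns.com/foaf/0.1/name', 'http://schema.org/Movie');
```

Here $short is attributed to the Movie itemtype, while $full carries its own definition and does not depend on the itemtype at all.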
All of the properties inside of the Movie's <div> are understood as properties of the
movie until you reach the end of the div or until you reach an itemscope on a div
inside the Movie, as in Listing 1. The itemscope attribute indicates that you are
now talking about a different thing (a Person in this case), so the birthplace
property is understood as an attribute of the Person instead of the Movie.
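The scoping rule can be made concrete with a small reader written against PHP's DOM extension. Treat it as a rough sketch under several assumptions rather than a conforming parser: the sketch_ function names are invented for this illustration, the sample markup is an assumed variant of the movie example with a nested Person, and the code ignores itemref, itemid, multiple names in one itemprop, and URL-valued elements such as links and images.

```php
<?php
// Collect the properties of one item. Hypothetical sketch, not a full parser.
function sketch_read_properties(DOMElement $root) {
  $properties = array();
  foreach ($root->childNodes as $child) {
    if (!($child instanceof DOMElement)) {
      continue;
    }
    if ($child->hasAttribute('itemprop')) {
      $name = $child->getAttribute('itemprop');
      // A property whose element also has itemscope is itself an item.
      $properties[$name][] = $child->hasAttribute('itemscope')
        ? sketch_read_item($child)
        : trim($child->textContent);
    }
    // Descend only when the child does not start a new item, so that nested
    // properties stay with the nested item.
    if (!$child->hasAttribute('itemscope')) {
      foreach (sketch_read_properties($child) as $nested_name => $values) {
        foreach ($values as $value) {
          $properties[$nested_name][] = $value;
        }
      }
    }
  }
  return $properties;
}

// Describe an item by its type and its properties.
function sketch_read_item(DOMElement $element) {
  return array(
    'type' => $element->getAttribute('itemtype'),
    'properties' => sketch_read_properties($element),
  );
}

// Assumed sample markup: a Movie item that contains a Person item.
$html = <<<HTML
<div itemscope itemtype="http://schema.org/Movie">
  <h1 itemprop="name">Avatar</h1>
  <div itemprop="director" itemscope itemtype="http://schema.org/Person">
    <span itemprop="name">James Cameron</span>
  </div>
  <span itemprop="genre">Science fiction</span>
</div>
HTML;

libxml_use_internal_errors(TRUE);
$document = new DOMDocument();
$document->loadHTML($html);
$movie = sketch_read_item($document->getElementsByTagName('div')->item(0));
```

The director's name ends up under the Person item, not next to the movie's own name, because the reader stops descending when it meets the inner itemscope.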
Because you added a little structure to your content, it's easy for either of the tools to
extract the relevant information. By adding the attributes in the HTML, you made the
data in your page easy to process, almost as if it were in an Excel spreadsheet or a
database.
Though microdata is fairly simple, it can still be difficult to place and maintain the
markup by hand. Some tools support the production of microdata, including Drupal's
Microdata module (see Resources).
Using Drupal to add microdata to your pages
Drupal is a content management system that powers an estimated 2% of the web.
With its user interface, site administrators can create forms to collect content from
users. Drupal then automatically creates the appropriate tables and fields in the
database for the form data and handles the display of the data in a configurable way.
Drupal is particularly well suited for outputting structured data because of the way
the content is handled: as discrete things (called entities) that have properties in the
form of field values. With Drupal 7, the capability to add structured data to HTML
using RDFa was incorporated into the Drupal core.
Since the schema.org announcement on 02 June 2011, work has progressed to also
add the same support for microdata output. The microdata module is still under
development and isn't ready for use on live sites. For experimentation on testing
sites, you can use the microdata module to generate microdata for fields and test the
Rich Snippet displays based on that microdata.
Start by recreating the example above using Drupal. See Resources to download
and enable the latest release of the following modules:
• Microdata
• Entity API
• CTools
Marking up the content type
A content type allows users to define what field values are collected and stored for
an entity. For example, you might create a product content type that has form fields
for collecting the price, available colors, sizes, and manufacturer's model number,
which makes it easy to maintain an inventory.
For this exercise, you'll create a movie content type. Go to Structure > Content
types, click the Add content type link, and enter the following information.
• Name: Movie
• Description: A page describing a movie
• Comment settings: Select Closed. You don't need the comment function
on that page.
• Microdata settings: Add the itemtype http://schema.org/Movie.
The title is a special kind of field and does not have its own edit screen,
so you add the title's mapping here as well. Use the schema.org name property
to mark up the title.
You can test whether this example worked by creating a new Movie item. Go to Add
content to create the Movie. After you create it, use the Rich Snippets testing tool to
determine if you can extract the data from the page. You should see a single item
with a Type of http://schema.org/Movie and name of Cool Hand Luke, as in
Figure 2.
Figure 2. Microdata extracted after mapping the content type and title
The content type was recognized as being a Movie with a title. However, there's
more information about this movie.
Marking up text fields
Fields are attached to content types to collect extra information about the content. In
the example, add the genre of a movie as its own field.
To add the genre to the content type, go to Structure > Content types and click
Manage fields for the Movie content type. You'll use a text field to collect the genre.
Enter the following information.
• Label: Genre
• Field name: genre
• Field type: Text
• Field widget: Text field
Click Save field settings on the next page. At the bottom of the field instance
configuration form, you will see Genre Microdata Mapping, as in Figure 3. Set the
field property to genre and click Save.
Figure 3. Interface for mapping the text field
Edit your piece of content and add the genre of the movie. Refresh the Rich Snippet.
The genre now displays with the type and name.
Marking up image fields
Though the example didn't demonstrate images, you can add an image, such as the
movie poster, to this content type. A thumbnail of the image then displays for the
Rich Snippet.
To add the image to the content type, go to Structure > Content types and click
Manage fields. Enter the following information.
• Label: Poster
• Field name: poster
• Field type: Image
• Field widget: Image
Use the schema.org image property for the poster. In the field property field,
enter image, as in Figure 4.
Figure 4. Interface for mapping the image field
Save and edit the movie to add an image. Retest the Rich Snippet. You should see
the image property with the URL of the uploaded image, as
in Figure 5. The single item also has a Type of http://schema.org/Movie, a
name of Cool Hand Luke, and a genre of prison drama.
Figure 5. Microdata extracted from the text and image field
You might also see a Rich Snippet displayed with a thumbnail of the poster, as in
Figure 6. Google's testing tool is under very active development; the display of the
Rich Snippet for the same markup changes over time. This Rich Snippet was
captured on 14 September, but the display changed by 19 September.
Figure 6. Rich Snippet displayed for movie
Enabling microdata in field formatters
Text and image fields cover a lot of the data that people usually put on a site, but
there are other types of data. To cover all kinds of data that a site administrator
might need, Drupal's field system gives users a selection of basic field types and
provides an API so that modules can define new field types. Within these modules,
you can define different data collection forms (widgets), data storage, and display
(formatters) for each field type. Site administrators can then install such field
modules and configure the widgets and formatters without having to write any code.
Microdata has strict requirements about where to place the microdata attributes in
the HTML, so each field type in Drupal needs to define where to place the attribute
within its formatters. While microdata is supported for most field types defined by
core, many widely used field types still do not support microdata.
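One reason placement matters is that a microdata parser reads a property's value from a different place depending on the element that carries the itemprop. The following PHP sketch lists the common cases. The function name sketch_value_attribute() is hypothetical, and the table is a simplified reading of the microdata specification's value rules rather than a complete copy of them.

```php
<?php
// Hypothetical helper: returns the attribute that carries a property's value
// for a given tag name, or NULL when the element's text content is used.
function sketch_value_attribute($tag_name) {
  $map = array(
    // Links take their value from the link target.
    'a' => 'href',
    'area' => 'href',
    'link' => 'href',
    // Embedded media take their value from the media source.
    'img' => 'src',
    'audio' => 'src',
    'video' => 'src',
    'embed' => 'src',
    'iframe' => 'src',
    'source' => 'src',
    // A few elements have their own dedicated attribute.
    'meta' => 'content',
    'object' => 'data',
    'time' => 'datetime',
  );
  $tag_name = strtolower($tag_name);
  return isset($map[$tag_name]) ? $map[$tag_name] : NULL;
}

$link_source = sketch_value_attribute('a');
$image_source = sketch_value_attribute('img');
$text_source = sketch_value_attribute('span');
```

A formatter that puts an itemprop meant for visible text on an <a> tag would therefore publish the link target instead of the text, which is why each formatter has to decide where the attribute goes.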
To use a field formatter defined in a contributed module, you can check the table
that tracks microdata support. Even if the field formatter isn't supported yet, that
doesn't mean you can't use it. It's easy to add microdata support to a field formatter.
You can even contribute microdata support back to the module by creating a patch
with your changes. This is a great way to get started with the Drupal developer
community.
In the schema.org example, a link to the movie's trailer was marked up. At the
time of this writing, the link field formatter defined by the Drupal Link module doesn't
support microdata, but you can change that.
You'll add microdata support to the Link module. The examples below use the Link
module code from 20 September 2011, which is provided in the download file with
this article. (The current version of the Link module has changed and might already
contain microdata support.)
Registering properties
The link field has two different bits of data that you might want to expose using
microdata:
• The URL for the link
• The text that is linked to that URL
At this point, you need to notify the system of these two properties through the
Entity Property API, which is provided by the Entity API module.
You must add the information to the field definition, which is registered by
link_field_info. Add the property_type for the field itself and the
property_callbacks, as in Listing 2.
Listing 2. Add property information for the field to link_field_info
/**
 * Implements hook_field_info().
 */
function link_field_info() {
  return array(
    'link_field' => array(
      'label' => t('Link'),
      'description' => t('Store a title, href, and attributes in the database to assemble a link.'),
      // Other keys defined by the Link module are omitted here.
      'property_type' => 'field_item_link',
      'property_callbacks' => array('link_field_property_info_callback'),
    ),
  );
}
The property type lets the system know the data type of the field. Because
field_item_link isn't a recognized data type or entity, the data type defaults to
struct when it is processed. This struct acts as a container for the properties
that you mark up (the link URL and linked text). Because it is simply a container, you
don't enable microdata for the field itself, only for its properties.
The property callback is a function that registers the same property type information
for the component properties. To mark up the properties with microdata, set
microdata to TRUE for each property, as in Listing 3. This provides the graphical user
interface for adding microdata for these properties.
Listing 3. Register the field's properties with the property callback
/**
 * Additional callback to adapt the property info of link fields.
 *
 * @see entity_metadata_field_entity_property_info().
 */
function link_field_property_info_callback(&$info, $entity_type, $field, $instance, $field_type) {
  $property = &$info[$entity_type]['bundles'][$instance['bundle']]['properties'][$field['field_name']];
  $property['property info'] = array(
    'title' => array(
      'label' => t('The title of the link.'),
      'microdata' => TRUE,
    ),
    'url' => array(
      'label' => t('The URL of the link.'),
      'microdata' => TRUE,
    ),
  );
  // Do not expose the title property when the instance has no link title.
  if ($instance['settings']['title'] == 'none') {
    unset($property['property info']['title']);
  }
}
The user interface pulls the label from the property information and uses the type to
determine which kind of form fields to display. If the property is an item instead of a
string, an itemtype field also displays. Figure 7 shows an example for two
properties of a trailer: the link title and link URL.
Figure 7. Link microdata mapping form
You can now specify which vocabulary terms to use for the field's properties on the
field configuration form. However, the attributes aren't inserted into the HTML until
you add a little more code.
Adding microdata to the themed output
To place the microdata, you need to change the HTML output for the field. For
example, to add a link to a software application, you might want the link text (the
name of the software) to use the name property and the link itself to use the url
property. Listing 4 shows how to do this by adding the itemprop of the URL to the
<a> tag and inserting a span with the itemprop of the text around the text content.
Listing 4. A link before and after adding microdata
<a href="">Drupal</a>
<a itemprop="url" href=""><span itemprop="name">Drupal</span></a>
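The before-and-after markup can be produced by a small amount of string building. The sketch below runs outside Drupal: sketch_attributes() is a hypothetical stand-in for what drupal_attributes() does, sketch_link() is an invented helper, and http://example.com/ is a placeholder URL chosen for the illustration.

```php
<?php
// Hypothetical stand-in for Drupal's drupal_attributes(): turns an array of
// attribute names and values into an escaped attribute string.
function sketch_attributes(array $attributes) {
  $output = '';
  foreach ($attributes as $name => $value) {
    $output .= ' ' . $name . '="' . htmlspecialchars($value, ENT_QUOTES, 'UTF-8') . '"';
  }
  return $output;
}

// Builds a link, optionally wrapping the text in a span that carries the
// attributes meant for the text (for example, an itemprop).
function sketch_link($text, $url, array $link_attributes = array(), array $text_attributes = array()) {
  $text = htmlspecialchars($text, ENT_QUOTES, 'UTF-8');
  if (!empty($text_attributes)) {
    $text = '<span' . sketch_attributes($text_attributes) . '>' . $text . '</span>';
  }
  // Put href first, then any extra attributes for the <a> tag.
  $link_attributes = array('href' => $url) + $link_attributes;
  return '<a' . sketch_attributes($link_attributes) . '>' . $text . '</a>';
}

$plain = sketch_link('Drupal', 'http://example.com/');
$marked_up = sketch_link('Drupal', 'http://example.com/', array('itemprop' => 'url'), array('itemprop' => 'name'));
```

The point of the design is that the two itemprop values travel separately: one set of attributes lands on the <a> tag and the other on the span around the text.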
Things would be easier if you could get the Link module to insert these attributes. To
transform the content from the database for the field into HTML, each field formatter
module has its own view function. Within the view function, some formatters use
theme functions to generate the HTML. An example is
theme_link_formatter_link_default(). Often, the microdata attributes
need to be passed from the field_formatter_view function into the theme
function.
In the Link module, the formatter already passes an array of attributes to be placed
on the <a> tag using the item variable. You can add the URL itemprop to that
array to have it automatically output where you need it, as in Listing 5.
Listing 5. Adding microdata in hook_field_formatter_view
/**
 * Implements hook_field_formatter_view().
 */
function link_field_formatter_view($entity_type, $entity, $field, $instance, $langcode, $items, $display) {
  $elements = array();
  $microdata = array();
  // If the microdata module is enabled, the microdata mapping will have been
  // passed in via the entity.
  if (module_exists('microdata')) {
    $microdata = $entity->microdata[$field['field_name']];
  }
  foreach ($items as $delta => $item) {
    // Add the url attributes to $item['attributes'] because the theme function
    // will pass it through to l(), properly placing the itemprop for the url.
    if (isset($microdata['url'])) {
      $item['attributes'] += $microdata['url']['#attributes'];
    }
    // Pass the microdata array to the theme function so it can be used to place
    // the link title's attribute.
    $elements[$delta] = array(
      '#markup' => theme('link_formatter_' . $display['type'], array('element' => $item,
        'field' => $instance, 'microdata' => $microdata)),
    );
  }
  return $elements;
}
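The += in Listing 5 is PHP's array union operator, which behaves differently from a plain overwrite: keys that already exist on the left side are kept. The sketch below isolates that step so the behavior can be seen without Drupal. The function name sketch_add_url_attributes() and the sample values are assumptions, and the shape of the mapping array is inferred from Listing 5 rather than taken from the Microdata module's documentation.

```php
<?php
// Hypothetical helper that mirrors the attribute merge in Listing 5.
function sketch_add_url_attributes(array $item, array $microdata) {
  if (isset($microdata['url']['#attributes'])) {
    // Make sure the left operand is an array before using the union operator.
    if (!isset($item['attributes']) || !is_array($item['attributes'])) {
      $item['attributes'] = array();
    }
    // Array union: existing keys in $item['attributes'] win, new keys are added.
    $item['attributes'] += $microdata['url']['#attributes'];
  }
  return $item;
}

// Assumed mapping for a trailer link.
$mapping = array('url' => array('#attributes' => array('itemprop' => 'trailer')));

$merged = sketch_add_url_attributes(array('attributes' => array('target' => '_blank')), $mapping);
$kept = sketch_add_url_attributes(array('attributes' => array('itemprop' => 'url')), $mapping);
$created = sketch_add_url_attributes(array('title' => 'Trailer'), $mapping);
```

In $merged the itemprop is added next to the existing target attribute, in $kept the itemprop that was already present survives, and in $created the attributes array is created first so that the union has something to work on.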
There is no automatic way to place the attributes for the text content, however. You
have to pass them into the theme function and change the theme function to use
them.
After you pass the microdata variables to the theme function, you can add the
<span> tag containing the itemprop around the title. The code checks to see
whether there is an itemprop for the text and, if there is, adds the microdata, as
in Listing 6.
Listing 6. Add microdata in the theme function
/**
 * Theme function for 'default' text field formatter.
 */
function theme_link_formatter_link_default($vars) {
  $url = $vars['element']['url'];
  $microdata = $vars['microdata'];
  // If there is an itemprop set for the title, wrap the title in a span and
  // add the itemprop to that span.
  if (!empty($microdata['title'])) {
    $title = '<span' . drupal_attributes($microdata['title']['#attributes']) . '>'
      . $vars['element']['title'] . '</span>';
  }
  else {
    $title = $vars['element']['title'];
  }
  // Create the array of options to pass to l().
  $link_options = $vars['element'];
  // Display a normal link if both title and URL are available.
  if (!empty($title) && !empty($url)) {
    return l($title, $url, $link_options);
  }
  // If only a title, display the title.
  elseif (!empty($title)) {
    return check_plain($title);
  }
  // If only a url, display the full url as a link.
  elseif (!empty($url)) {
    return l($url, $url, $link_options);
  }
}
You can now test the microdata output for the formatter.
Contributing your changes back to the community
One of the things that makes Drupal a powerful technical solution is the large
number of contributors that make up its community. Contributors aren't just people
who live and breathe Drupal; many contributors make the occasional code fix for their
own sites, which they then post as a patch for others to use.
If you add microdata to a field formatter for your own project, you can contribute that
work back to the Drupal community. Simply post an issue in the issue queue for the
module and suggest that the module support microdata. This type of issue is called
a feature request. You can then post a patch with your changes on the issue. (There
are some great tutorials that demonstrate how to create patches for Drupal projects.)
Once you've posted the patch, mark the issue as "needs review."
In this article, you learned to use Drupal to add microdata to your pages so your
content can be used in applications like Google's Rich Snippets. With the new
microdata module you can configure microdata output for basic field types and add
microdata output to custom field types. Now your data is available for others to build
applications on top of it.
Download: Article source code (820KB, HTTP)
Resources
• Schema.org: Learn more about this collection of schemas, which are HTML
tags that webmasters can use to mark up their pages in ways recognized by
major search providers.
• Getting started with schema.org: With these tutorials, learn to mark up your
content using microdata and to use the schema.org vocabulary. Advanced
topics are also covered.
• Itemtype URL: Find the properties that you can use on an item by
visiting the itemtype URL (http://schema.org/Movie, for example).
• Microdata support: Find out if a field formatter has microdata support.
• Data types: See how microdata in Drupal uses the entity properties.
• The Semantic web, Linked Data and Drupal, Part 1: Expose your data using
RDF (Lin Clark, developerWorks, April 2011): Make your web data more
interoperable and your data sharing more efficient. An example shows how to
use Drupal 7 to publish Linked Data by exposing content with RDF.
• The Semantic web, Linked Data and Drupal, Part 2: Combine linked datasets
with Drupal 7 and SPARQL Views (Stéphane Corlosquet and Lin Clark,
developerWorks, May 2011): Learn to use the existing Linked Data available
today on the web of data, and how to enrich a Drupal 7 site with data coming
from different endpoints.
• Creating patches for Drupal projects: Learn what patches are and how to work
with them in the context of the Drupal project. From Drupal's HTML5 initiative
lead, Jacine Luisi.
• Scientific American article on the Semantic web: Read this seminal article by
Tim Berners-Lee, James Hendler, and Ora Lassila.
• Linked Data: Read the ReadWriteWeb interview about linked data with Tim
Berners-Lee.
• Linked Data Design Issues: Learn more about linked data from Tim
Berners-Lee.
• Rich snippets (microdata, microformats, and RDFa) - Webmaster Tools Help:
Learn more about Google Rich snippets and how to label your web content to
indicate clearly the data type, such as a restaurant name, an address, or a
rating.
• Implement Semantic web standards in your Web site (Rob Crowther,
developerWorks, May 2008): Create a simple social networking site using PHP
and MySQL, which implements Semantic web standards such as hCard and
Friend of a Friend (FOAF) as part of a semantic Uniform Resource Identifier
(URI) scheme.
• Developing Drupal publications to support standards-based XML (Garrick
Bodine and Stephanie Schlitz, developerWorks, Feb 2011): Learn how to
customize your Drupal installation to support the publication of TEI (or other)
XML documents.
• Drupal Installation Guide: Read about preparing for installation, running the
installation script itself, and the steps to do after running the installation script.
• Install Drupal 7 with the Acquia Stack Installer: Get step-by-step instructions in
this video.
• FOAF Vocabulary Specification 0.98: Explore the FOAF language, defined as a
dictionary of named properties and classes using W3C's RDF technology.
• Dublin Core Metadata Initiative (DCMI): Learn about this open organization
engaged in the development of interoperable metadata standards that support a
broad range of purposes and business models.
• SIOC (Semantically-Interlinked Online Communities) Core Ontology
Specification: Learn the main concepts and properties required to describe
information from online communities (such as message boards, wikis, or
weblogs) on the Semantic web.
• SPARQL Explorer: A demonstration query
interface available on the web.
• New to XML? Get the resources you need to learn XML.
• XML area on developerWorks: Find the resources you need to advance your
skills in the XML arena, including DTDs, schemas, and XSLT. See the XML
technical library for a wide range of technical articles and tips, tutorials,
standards, and IBM Redbooks.
• IBM XML certification: Find out how you can become an IBM-Certified
Developer in XML and related technologies.
• developerWorks technical events and webcasts: Stay current with technology in
these sessions.
• developerWorks on Twitter: Join today to follow developerWorks tweets.
• developerWorks podcasts: Listen to interesting interviews and discussions for
software developers.
• developerWorks on-demand demos: Watch demos ranging from product
installation and setup for beginners to advanced functionality for experienced
developers.
Get products and technologies
• Acquia Drupal: Get the freely available packaged distribution of the open source
Drupal social publishing system.
• Google's Rich Snippets Testing Tool: Test your markup.
• Google Rich Snippets, Field collection, and Entity API: Download the modules
and be sure to get the development releases.
• Live Microdata testing tool: Get another tool, created by Opera developer Philip
Jägenstedt, for testing microdata.
• IBM product evaluation versions: Download or explore the online trials in the
IBM SOA Sandbox and get your hands on application development tools and
middleware products from DB2®, Lotus®, Rational®, Tivoli®, and
WebSphere®.
• developerWorks profile: Create your profile today and set up a watchlist.
• XML zone discussion forums: Participate in any of several XML-related
discussions.
• The developerWorks community: Connect with other developerWorks users
while exploring the developer-driven blogs, forums, groups, and wikis.
About the author
Lin Clark
Lin Clark is a Drupal developer specializing in Linked Data. She is the
maintainer of multiple Drupal modules, such as Microdata and SPARQL
Views, and is an active participant in the W3C's HTML Data Task Force
and Drupal's HTML5 initiative. She attended Carnegie Mellon University
and is finishing a research master's degree at the Digital Enterprise
Research Institute at NUI Galway. More information is available at
IBM, the IBM logo, ibm.com, DB2, developerWorks, Lotus, Rational, Tivoli, and
WebSphere are trademarks or registered trademarks of International Business
Machines Corporation in the United States, other countries, or both. These and other
IBM trademarked terms are marked on their first occurrence in this information with
the appropriate symbol (® or ™), indicating US registered or common law
trademarks owned by IBM at the time this information was published. Such
trademarks may also be registered or common law trademarks in other countries.
See the current list of IBM trademarks.
Adobe, the Adobe logo, PostScript, and the PostScript logo are either registered
trademarks or trademarks of Adobe Systems Incorporated in the United States,
and/or other countries.
Other company, product, or service names may be trademarks or service marks of
others.