9. Data Layers

Lesson Overview

This comprehensive course covers everything you will need to become a Manage power user, from deploying tags, data layers, and event tracking to organizing your TMS strategy and properly performing QA and troubleshooting.

Learners should have a basic grasp of client-side web technologies.


Hello and welcome back to the "Ensighten Manage Training Series." The topic of this video isn't one that is found inside of the TMS, but it's one that can be critical to a good TMS implementation. This is the topic of data layers. A data layer is an object on your site that carries useful, dynamic data about the page, the user, and anything else that might be relevant for your tagging or data needs. Dynamic is the key word here because there's no sense in wasting your or your developers' time maintaining variables that only hold static values. Examples of static data are the user's browser or even the current URL. These are elements that are unlikely to change based on user activity or context and can easily be retrieved by existing methods.

Dynamic data is everything else: data that is difficult or impossible to retrieve without it being explicitly provided, data that may change based on user activity like current cart items, or data that isn't necessarily found on or related to the current page but might still be desirable. Data layer structure has taken a few forms over the years, but the one that's solidified is a JSON object that is set in the JavaScript of your page and is supplied with its information by server-side and/or client-side means. A properly implemented data layer can be the difference between perfect accuracy in your analytics and attribution tools and variance caused by factors outside of your control.

A data layer on your page functions as a sort of abstraction layer between all of the data you could possibly have and the data you want to provide to your external endpoints. In the past, this need led to a lot of duplication across vendors and to missing or incorrect data retrieved from the wrong sources. With a proper data layer, you can be certain that the data is both available and correct at all times, and your tags will always be accessing this data from a single source of truth.

In the past, this sort of data was created as part of the base page HTML via metatags and sometimes with JavaScript defined for each individual vendor. The modern method is a single object in JavaScript at the window level that contains all useful data in a nicely formatted fashion for ease of access and understanding. This vendor-agnostic style of data layer creation removes the risk of your data disappearing when you change vendors over the years.
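As a rough illustration of that single, window-level object, here is a minimal sketch. The name `siteData` and every field in it are assumptions made up for this example; they are not an Ensighten or W3C schema. `globalThis` is used so the sketch also runs outside a browser; in a browser it refers to the window.

```typescript
// Hypothetical shape: field names are illustrative only.
interface SiteDataLayer {
  page: { name: string; category: string };
  user: { loggedIn: boolean; id: string | null };
  cart: { itemCount: number; total: number };
}

// In a browser, globalThis is the window object, so the data layer
// becomes reachable as window.siteData from any tag.
const globalScope = globalThis as unknown as { siteData?: SiteDataLayer };

globalScope.siteData = {
  page: { name: "Product Detail", category: "Footwear" },
  user: { loggedIn: true, id: "u-1001" },
  cart: { itemCount: 2, total: 149.98 },
};

// Any vendor tag reads from the same object instead of keeping its own copy.
function readCartTotal(): number {
  return globalScope.siteData?.cart.total ?? 0;
}

console.log(readCartTotal());
```

Because every tag goes through the same object, swapping one vendor for another changes only the tag, not the data behind it.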

Some clear advantages, and a few disadvantages too, come along with the data layer setup. It follows good design principles: by separating form from function and placing your data in a single location, you're better able to keep a clean and efficient setup. It reduces vendor lock-in by centralizing the data rather than only creating and managing pieces as they pertain to a particular vendor. It no longer requires cumbersome vendor-specific formatting, leaving IT free to create a more universal structure while the tags handle any modifications. It allows much quicker implementation of new marketing tools, as the data is already available at all times. It gives you better data consistency and governance because you know exactly where the data is coming from and have better control over when it arrives. And it's easier to troubleshoot and QA when you only need to look in one place for any issue you suspect is with your dataset.

For cons, we see an increase in required effort, time, and potentially cost at the beginning of projects. This is because there is an increase in development load. You need to add the data layer to the page and make sure that it is ready when the tags need it. You also see the usual requirement of engaging with your IT or development resources, whether they be in-house or a third party, because these resources need to create the data layer on your behalf.

The pros will almost always outweigh the cons for any project, but there are scenarios in which they don't. For example, your site has no use for complex data and you aren't interested in any analytics data beyond what's available as part of the user's journey. If your tagging needs are very simple and your data requirements are little to none, a data layer may not be necessary or helpful in satisfying your tagging needs.

There are alternatives to the data layer, of course, but they're more custom and significantly more fragile. If the information desired would normally be available on the page in the HTML, then a technique called DOM scraping can be employed. JavaScript is certainly capable of requesting just about anything from the current state of the page, but issues arise as time passes. No website is static. Sites change and update over time, and that means when you use DOM scraping, you have to change and update your tagging too. If your initial implementation relies on locating the order ID by searching for a specific HTML element with a class named something like "user order ID" and then, six months later, a site redesign removes that element, you can expect your tags will at minimum fail to find the number and may also break outright, which loses all of your data instead of just the one field.

DOM scraping also only has access to data that's on the page. If you need to know, once the user reaches the confirmation page, which campaign ID they landed on the site with, you would have had to store it in some HTML element the entire way for DOM scraping to be capable of giving it to you.
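The fragility described above can be sketched as follows. This is only an illustration: the markup, the `user-order-id` class name, and the `orderDataLayer` object are hypothetical, and a regular expression over an HTML string stands in for real DOM scraping so the example runs without a browser.

```typescript
// Hypothetical markup before and after a site redesign.
const pageBeforeRedesign = '<span class="user-order-id">A-1042</span>';
const pageAfterRedesign = '<p data-order="A-1042">Thanks for your order!</p>';

// Stand-in for DOM scraping: return the text of the element with the expected class.
function scrapeOrderId(html: string): string | null {
  const match = /class="user-order-id"[^>]*>([^<]*)</.exec(html);
  return match?.[1] ?? null;
}

// The data layer carries the same value no matter how the page is marked up.
const orderDataLayer = { order: { id: "A-1042" } };

console.log(scrapeOrderId(pageBeforeRedesign)); // found while the class exists
console.log(scrapeOrderId(pageAfterRedesign)); // null once the redesign removes it
console.log(orderDataLayer.order.id); // unaffected by the redesign
```

A tag that assumes the scraped value is always a string would need its own guard against the `null` case, which is exactly the maintenance burden a data layer avoids.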

As you heard earlier, a data layer's most common and effective format is called JSON, or JavaScript Object Notation, and it is typically set globally at the window level of your page. A W3C community group has published a specification for JSON data layers, if you're looking for a great example. And if you decide that's not the format you want to follow, then a tag manager is certainly capable of adapting to whatever you land on.

Now that I've convinced you that you definitely need a data layer, let's talk about placement. For your site as a whole, you should strongly consider placing the data layer on every single page. But if that's not an option for some reason, then be sure it's at least on all pages that do or could ever have tagging. Once you're down to the page level, the ideal option would be that the data layer is both created and populated synchronously in the head above the bootstrap. That would mean that the data is always available and complete by the time any tagging could possibly need it.

However, we've learned that this is a bit unrealistic in practice. Sometimes information simply is not available until partway through page construction, and that means async and unreliable timing. In that scenario, the preferred action is a two-step approach. One, define the complete data layer in the head above the bootstrap, but only populate it with information available at that time. Two, as the remaining information becomes available during page load, populate the data layer appropriately. Additionally, as an optional step three, consider triggering an event of some kind to indicate that the data layer has completely populated. We'll discuss handling these events in a future video.

Step one is still beneficial, as any tags that need data will at least know that there is a data layer, and if the data isn't available yet, they can be set to wait for it. When your data is loading during the async period, you lose some guarantees of data accuracy and also the ability to perform some real-time actions like A/B testing, if that test would rely on unreliable data. Tags relying on that data also need to be set up to fire later or with some form of monitoring to react when that data becomes available.
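The two-step approach, the optional completion signal, and a tag that waits for it can be sketched roughly like this. All names here (`checkoutData`, `whenDataLayerReady`, `completeOrderData`) are hypothetical and are not part of the Ensighten Manage API; a plain callback list stands in for whatever event mechanism your site uses.

```typescript
// Hypothetical names throughout; this is not the Ensighten Manage API.
interface CheckoutDataLayer {
  page: { name: string };
  order: { id: string | null; total: number | null };
  ready: boolean;
}

type ReadyListener = (layer: CheckoutDataLayer) => void;
const readyListeners: ReadyListener[] = [];

// Step one: define the complete structure up front, filled only with what is known.
const checkoutData: CheckoutDataLayer = {
  page: { name: "Order Confirmation" },
  order: { id: null, total: null },
  ready: false,
};

// A tag registers interest; it runs immediately if the data is already complete.
function whenDataLayerReady(listener: ReadyListener): void {
  if (checkoutData.ready) {
    listener(checkoutData);
  } else {
    readyListeners.push(listener);
  }
}

// Step two: fill in the late values. Step three: signal that population is complete.
function completeOrderData(id: string, total: number): void {
  checkoutData.order = { id, total };
  checkoutData.ready = true;
  for (const listener of readyListeners.splice(0)) {
    listener(checkoutData);
  }
}

const firedTags: string[] = [];
whenDataLayerReady((layer) => firedTags.push(`conversion tag for ${layer.order.id}`));
console.log(firedTags.length); // the tag is still waiting at this point
completeOrderData("A-1042", 59.5);
console.log(firedTags);
```

The tag never has to guess about timing: it either finds the data complete or is called back once it is.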

Some important best practices can really help avoid headaches and further work down the line. Avoid using static data. Data layers are best suited to carry dynamic data that changes from page to page or based on user action. Static data can be hardcoded into tags since it never changes. Populate your data layer using server-side means. A data layer's variables should be populated using server-side sources of data during page construction, not scraped out of the DOM during or post-load. The latter is no better than if we did so ourselves and comes with all the same problems. Keep it tag and vendor agnostic. This dataset will be useful for all vendors: tag management, analytics, marketing, and more. There are no benefits to defining vendor-specific variables that will eventually become irrelevant when you inevitably move on from that vendor or the vendor's variable needs change.

Avoid deeply nested hierarchies or unnecessary flatness. The data layer structure can become confusing when you ignore good organizational planning. Only use as many levels of hierarchy as are necessary. You want a data layer that's as flat as it can be without being so flat that it has one layer and everything is difficult to find due to similar naming. Avoid including elements that are already universally available or easy to locate. Examples of these would be elements like the current page URL or its referrer.
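To make the nesting advice concrete, here is a small sketch comparing three shapes for the same data. The field names are invented for illustration and do not come from any particular specification.

```typescript
// Too flat: everything sits at one level, so similar names are hard to tell apart.
const tooFlat = {
  userId: "u-1001",
  userLoginStatus: true,
  cartItemCount: 2,
  cartItemTotal: 149.98,
};

// Too deep: the extra wrappers add nothing but longer access paths.
const tooDeep = {
  site: { visitor: { account: { identity: { id: "u-1001" } } } },
};

// Balanced: one level of grouping, with related values kept together.
const balanced = {
  user: { id: "u-1001", loggedIn: true },
  cart: { itemCount: 2, total: 149.98 },
};

// All three carry the same user ID; only the path a tag must know differs.
console.log(tooFlat.userId, tooDeep.site.visitor.account.identity.id, balanced.user.id);
```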

When constructing a data layer, a good design flow to follow would be this. One, determine all necessary and useful dynamic data points. A good audit of your existing vendor tools and their data needs can provide a great base for building your data layer. Two, ensure there's no overlap in those data points. Reviewing the data you use currently can ensure you don't have unnecessary duplication. Three, assign those data points a uniquely identifying name, and if it makes sense to do so, group them with others like them. For example, information about the user such as their login status and user ID could be grouped under a parent structure of some sort. Four, specify acceptable values and default states for each data point. You should know what kind of data to expect in each of these. Is it going to be a string of characters, a number, or some true or false Boolean value? And when the value is unknown, does it default to null, or empty, or maybe zero? Five, define the scope of each data point. Which pages should that variable be populated on? Does it need to be present at all times, like a page name, or should it only exist in certain locations, such as your shopping cart's item details?
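One way to write down the outcome of that design flow is a small specification table that records each data point's type, default, and scope. The sketch below is only an assumption about how such a table might look; the keys, scopes, and the `isValidDataPoint` helper are hypothetical.

```typescript
// Hypothetical specification format for data points.
interface DataPointSpec {
  type: "string" | "number" | "boolean";
  defaultValue: string | number | boolean | null;
  scope: "all-pages" | "cart" | "confirmation";
}

const dataPointSpecs: Record<string, DataPointSpec> = {
  "page.name": { type: "string", defaultValue: "", scope: "all-pages" },
  "user.loggedIn": { type: "boolean", defaultValue: false, scope: "all-pages" },
  "user.id": { type: "string", defaultValue: null, scope: "all-pages" },
  "cart.itemCount": { type: "number", defaultValue: 0, scope: "cart" },
};

// Returns true when a value is the declared default or has the declared type.
function isValidDataPoint(key: string, value: unknown): boolean {
  const spec = dataPointSpecs[key];
  if (spec === undefined) return false; // not part of the agreed data layer
  return value === spec.defaultValue || typeof value === spec.type;
}

console.log(isValidDataPoint("cart.itemCount", 3));
console.log(isValidDataPoint("cart.itemCount", "3"));
```

A check like this could run during QA to catch a data point that arrives as the wrong type, for example an item count delivered as a string.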

That wraps up this video. I know it was a long one, so make sure you're taking a break and letting the information settle in. See you next time, when we'll be talking about data definitions in Manage.