Tech post: Resurrecting GA tracking in cross-domain iframes (ITP cleanup)
Who among us is using cross-domain iframes? More than we’d like to admit, I think. Technically, they’re just so so SO terrible, but governance-wise, they’re a necessary evil for many organizations. Here are a couple of good reasons to wade into this type of setup:
Iframing-in third-party functionality like a payment flow or online brochure, webinar, etc. that lives on a vendor’s domain
In global enterprises with distributed teams that need to capture leads in local markets, CRM and forms are often managed locally, while most web content is handled centrally; so, the local forms are deployed on local domains, and displayed via iframes within the global web content
But DID YOU KNOW that when ITP is enabled in Safari, no GA data at all gets sent from within these iframes?? (For many users - some technical exceptions are below.)
After an arduous trip into the depths of Safari, I now understand what’s happening and how to fix it.
Wait, what? How did I not notice this?
This issue originally came to me as a report of missing data from a few countries in the Middle East (for a global enterprise). The report was that none of their forms were sending form submit data to GA from iOS. More information about my debugging process is below (it was an interesting journey) but I’ll just cut to the chase: that report wasn’t quite right, but it definitely wasn’t all wrong, either. It was Safari, not iOS, that had the problem. But some Safari (and iOS) data absolutely did exist, because:
Old Safari versions (pre-ITP) are fine
Safari users who have disabled ITP are unaffected
If a form is viewed outside of its iframe wrapper, data is unaffected
If the user has a pre-existing _ga cookie on the iframe content’s domain, data is unaffected
Chrome and IE are unaffected
So if you’re looking at form data in GA, there’s still traffic to your iframe content being recorded - even Safari traffic. Unless you dig into the page + browser data, you may simply not notice. But it’s happening, and your data has vanished.
The scene:
Let’s imagine I have a wrapper page, on mywrapper.com, which loads a form in an iframe from myiframeform.com.
I’ve already implemented cross-domain tracking on this iframe (see Google’s docs for more info), so the URL of the content on myiframeform.com already contains a _ga (linker) parameter. Both domains are in my Referral Exclusion list, of course. This is backed up by my GA data.
For ease of discussion, let’s also assume that the mywrapper.com page has a link to the same myiframeform.com page. The link also gets the same _ga parameter.
Also, I have event tracking implemented on submission of the form on myiframeform.com.
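For reference, here’s a minimal sketch of how the linker parameter in this scene might get onto the iframe’s URL. It assumes analytics.js, where tracker.get('linkerParam') returns a string like "_ga=1.199239214.1624002396.1440697407". The helper name appendLinkerParam and the element id "lead-form" are made up for illustration, and this isn’t necessarily how your own linker setup does it.

```javascript
// Append a linker parameter (e.g. "_ga=1.199239214.1624002396.1440697407")
// to a URL, keeping any existing query string and hash intact.
// appendLinkerParam is a hypothetical helper, not part of analytics.js.
function appendLinkerParam(url, linkerParam) {
  var parsed = new URL(url);
  var parts = linkerParam.split('=');
  // parts[0] is the parameter name ("_ga"); the rest is its value.
  parsed.searchParams.set(parts[0], parts.slice(1).join('='));
  return parsed.toString();
}

// Browser-only wiring, shown as comments because it needs analytics.js
// and a real page. The element id "lead-form" is an assumption.
// ga(function (tracker) {
//   var iframe = document.getElementById('lead-form');
//   iframe.src = appendLinkerParam(iframe.src, tracker.get('linkerParam'));
// });
```

The same helper would work for the plain link on the wrapper page, since both need the identical _ga value.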
What’s wrong?
Before you start, in Safari, clear all cookies and make sure ITP is enabled by checking “Prevent cross-site tracking” in the Preferences dialog. Unless you’ve previously disabled it, it’s probably already checked.
Then, here’s the recipe:
1. Navigate to the page on mywrapper.com.
   - A GA PV from mywrapper.com will go out. You can observe this in Web Inspector and/or GA’s real-time reports.
   - If the iframe loads automatically, you should immediately see a hit get sent to GA from myiframeform.com. But, you will not.
   - If you submit the form on myiframeform.com, that hit won’t go out, either.
   - If you check - a _ga cookie now exists on mywrapper.com, but not on myiframeform.com.
2. Back on mywrapper.com, click the link to myiframeform.com. The same content now loads outside of an iframe.
   - A GA PV from myiframeform.com will go out, as it should.
   - If you submit the form on myiframeform.com, that hit will go out as well.
   - A _ga cookie now exists on myiframeform.com.
3. Return to mywrapper.com, and refresh the same wrapper page. This is where it gets weird:
   - A GA PV from mywrapper.com is sent
   - A GA PV from myiframeform.com is sent
   - If you submit the form on myiframeform.com, that hit is sent as well
   - All hits are received properly by GA.
   - All hits correctly use the same client ID.
If you repeat this with ITP disabled, all hits are sent as expected - so step 1 above behaves like step 3.
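If you’d rather not eyeball the cookie list at each step of the recipe, a small helper can pull the client ID out of a cookie string. This is a sketch that assumes the usual _ga cookie shape ("GA1.2.1234567890.1580000000", where the last two fields are the client ID); readGaClientId is a name I’ve made up here.

```javascript
// Extract the GA client ID from a cookie string such as document.cookie.
// Returns null when no _ga cookie is present.
// readGaClientId is a hypothetical helper name.
function readGaClientId(cookieString) {
  // Match "_ga=" only at the start or right after a ";" separator,
  // so cookies like "_gat" or "_gid" are not picked up by mistake.
  var match = /(?:^|;\s*)_ga=([^;]*)/.exec(cookieString);
  if (!match) return null;
  var fields = match[1].split('.');
  if (fields.length < 4) return null;
  // The client ID is the last two dot-separated fields.
  return fields.slice(-2).join('.');
}

// In the browser console, run this once in the wrapper page's context and
// once in the iframe's context:
// readGaClientId(document.cookie);
```

In the first step of the recipe you’d expect a value on mywrapper.com and null inside the iframe; by the third step both should return the same client ID.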
First-party vs Third-party cookies
All analytics people know to fear ITP due to the short cookie expiration periods on GA (and other) first-party cookies, but ITP completely prevents third-party cookies from being set at all. What does this have to do with GA, you may ask?
I always thought that 3rd-party == 'not the domain of the page that's trying to set the cookie', but I realize now that it's == 'not the domain *that matches where the user thinks they are, based on the browser's location bar*'.
If you think about it, this makes sense. If it wasn’t this way, all trackers would just use iframes to get around the third-party cookie limitations. Instead of “the CreepyTracker pixel I loaded on mydomain.com was blocked from setting a cookie on creepytracker.com”, it would be “the CreepyTracker JS I loaded invisibly iframed-in content from creepytracker.com, which set a cookie on its own domain (creepytracker.com)”. And then all of the limitations would be pointless.
So, where the iframed content is from a domain different from the wrapper page’s domain, anything loaded in that iframe is treated as 3rd-party. GA is a third-party tracking tool in this case. When GA tries to drop its cookie, the cookie simply isn’t set (when ITP is enabled).
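To make that definition concrete, here’s a deliberately naive sketch of the comparison. It treats the last two hostname labels as the “site”, which is an assumption for illustration only: real browsers use the Public Suffix List, so this gets suffixes like co.uk wrong. The function names are made up.

```javascript
// Naive "site" approximation: the last two labels of the hostname.
// Assumption for illustration only; browsers consult the Public Suffix List.
function naiveSite(url) {
  return new URL(url).hostname.split('.').slice(-2).join('.');
}

// A frame is third-party when its site differs from the site shown in the
// location bar (the top-level page), regardless of who owns both domains.
function isThirdPartyFrame(topUrl, frameUrl) {
  return naiveSite(topUrl) !== naiveSite(frameUrl);
}

// Same form page, two different outcomes:
var framed = isThirdPartyFrame('https://www.mywrapper.com/page', 'https://myiframeform.com/form');
var direct = isThirdPartyFrame('https://myiframeform.com/form', 'https://myiframeform.com/form');
```

Here framed is true and direct is false, even though the form page itself hasn’t changed at all. That’s the whole problem in two lines.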
What does the cookie have to do with blocking the hit?
I always assumed that GA sent its hits, then set its cookies - but that’s wrong too; it’s the other way around. You can see this in GA’s list of tasks in the Tasks API, which are listed in the order in which they run: checkStorageTask comes before both buildHitTask and sendHitTask.
Not only does GA check for available storage before even building the hit (let alone sending it), but if storage isn’t available - i.e. if a cookie can’t be set - then GA aborts the hit! Luckily, it’s also the Tasks API that can save us here: it’s possible to override this so that the hit building and sending just continues. (Code below!)
…but wait. There must be a reason that GA does that, right? Well, one way I’m thinking of it is this: if GA can’t set a cookie (or use some other storage method, but we’ll stick with cookies in this example), that means that it’s going to generate a new clientId on every single page. In a way, that would completely break the concepts of users and sessions; each PV would look like a new user and new session, which would (virtually) always be a bounce. It’s bad, broken data. Do you want bad, broken data? You do not.
So if we’re going to override checkStorageTask, we probably want to be judicious, and leave ourselves some breadcrumbs. Google has some documentation about how to do a cookieless implementation, but we don’t want to do cookieless all the time - only when absolutely necessary. And only when we have a known good clientId to use.
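To picture why a failed storage check kills the hit, here’s a toy model of the task order. It is not GA’s actual implementation: it only assumes what the Tasks documentation describes, namely that tasks run in sequence and that a task throwing an exception aborts the rest. The runHit function and the reduced task list are invented for this sketch.

```javascript
// A reduced, illustrative subset of the analytics.js task order.
var TASK_ORDER = ['customTask', 'checkStorageTask', 'buildHitTask', 'sendHitTask'];

// Run the tasks in order. If any task throws, stop and report the hit as
// not sent. runHit is a toy stand-in, not a real analytics.js function.
function runHit(tasks) {
  var ran = [];
  try {
    TASK_ORDER.forEach(function (name) {
      tasks[name](); // a throwing task ends the loop here
      ran.push(name);
    });
  } catch (e) {
    return { sent: false, ran: ran, reason: e.message };
  }
  return { sent: true, ran: ran, reason: null };
}

function noop() {}

// Storage check fails: the hit never reaches buildHitTask or sendHitTask.
var blocked = runHit({
  customTask: noop,
  checkStorageTask: function () { throw new Error('storage unavailable'); },
  buildHitTask: noop,
  sendHitTask: noop
});

// Storage check passes (or is overridden to swallow the error): hit is sent.
var allowed = runHit({
  customTask: noop,
  checkStorageTask: noop,
  buildHitTask: noop,
  sendHitTask: noop
});
```

In this model, blocked.sent is false with only customTask completed, while allowed.sent is true. An override of checkStorageTask that catches the error turns the first case into the second, which is why it should only happen when there’s a trustworthy clientId to send.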
The solution
As Simo Ahava has written about extensively, we can use customTask to update/override GA’s built-in hit processing, including checkStorageTask. We’ll catch an error if one occurs, continue the hit with some modifications if a known good clientId is available, and pass the error on if not (which will abort the hit). In addition, I want to be able to easily see how many “storage-blocked” hits I’m sending, so I can make sure they’re working correctly, and hopefully give myself a baseline to estimate how much traffic I missed before this solution was in place. I’ll record that in custom dimension 189 below. I’ll also use one other GTM variable I’ve created called {{query _ga}}, which is exactly what it sounds like: a URL variable that grabs the value of the _ga query parameter, if it exists.
I’ve added the following as a GTM variable, then used it in my standard GA Settings (as described in my prior post about tracking hit size) as the value for the “customTask” field:
function() {
  return function(model) {
    // Update checkStorageTask for Safari ITP:
    // If GA can't set cookies, it normally aborts the hit;
    // instead, if the _ga param is present in the URL, send the hit anyway & ignore the error

    // Grab a reference to the default function.
    var originalCheckStorageTask = model.get('checkStorageTask');

    // Set a new checkStorageTask
    model.set('checkStorageTask', function(checkStorageTaskModel) {
      // First, try the standard checkStorageTask and catch errors
      try {
        originalCheckStorageTask(checkStorageTaskModel);
      } catch(e) {
        // An error has occurred - cookie can't be set
        // Make sure the _ga param is in the URL,
        // using a GTM var that does exactly that
        if ({{query _ga}}) {
          // The _ga query is already in the model as the clientId
          // Set 'storage' to 'none' to be tidy
          checkStorageTaskModel.set('storage', 'none');
          // Modify the hit slightly
          // In my case - identify this with a custom dimension value
          // so we know we did this
          // Breadcrumbs make it easy to debug and measure
          // Comment this out if you don't want to do that
          checkStorageTaskModel.set('dimension189', 'storage-blocked hit');
        } else {
          // Otherwise, allow the hit to be aborted as usual
          throw e;
        }
      }
    });
  }
}
(Note that in reality I added it directly alongside my existing track-hit-size-and-cleanse-PII script; it’s easy to add both in the same customTask variable, and particularly straightforward since they modify different tasks.)
IT WORKS
Try the bug recipe again with this published, and you’ll get all of your hits!
…except…
This solution only works for hits sent on the first page that’s viewed in the iframe, because it only handles hits where the _ga value is present in the URL. [EDITED 2/10/2020 - In fact it only works for the first 2 minutes of the life of the first pageview, before the linker (_ga) param expires. Many thanks to Simo (again!) for pointing out that a named tracker would persist the value from the _ga parameter for the life of the page; that’s worth pursuing but it’s not without its issues.] You could work around this by storing that value in localStorage or sessionStorage, then looking for it there as well as in the query string, but I didn’t. Simo covered that as well, so check that out for more info. I didn’t bother with this because I really only need the PVs and form submits on the first iframed pages, but I may add that later on.
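A minimal sketch of that storage fallback, under some assumptions: sessionStorage (or whatever storage object you pass in) is usable inside the iframe, and the key name "ga_linker_value" and the function names are invented here. It only covers finding the value again; it is not a drop-in replacement for the {{query _ga}} variable.

```javascript
var STORAGE_KEY = 'ga_linker_value'; // hypothetical key name

// Read one parameter from a query string like "?_ga=1.2.3.4&x=1".
function getQueryParam(search, name) {
  return new URLSearchParams(search).get(name);
}

// Prefer the _ga value in the URL and remember it; otherwise fall back to
// whatever was remembered on an earlier page. Returns null if neither exists.
function resolveLinkerValue(search, storage) {
  var fromQuery = getQueryParam(search, '_ga');
  if (fromQuery) {
    try {
      storage.setItem(STORAGE_KEY, fromQuery);
    } catch (e) {
      // Storage may be blocked as well; the URL value is still usable.
    }
    return fromQuery;
  }
  try {
    return storage.getItem(STORAGE_KEY);
  } catch (e) {
    return null;
  }
}

// Browser usage (commented out because it needs a real page):
// var linkerValue = resolveLinkerValue(window.location.search, window.sessionStorage);
```

Note what’s left out: on later pages the stored value still has to be turned into a clientId and set on the tracker, because GA only reads the linker parameter from the URL, and the 2-minute linker expiry mentioned above needs its own handling.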
Surprising cases where this isn’t necessary
If the user already had a _ga cookie set on myiframeform.com, then GA doesn’t need to drop a new one, and the hit goes out. Nothing’s blocking the hit from being sent (except GA itself!) - ITP is only stopping the cookie from being set. So if the cookie exists somehow, all is well. That means that:
If myiframeform.com is visited outside of an iframe, then the cookie already exists and data will be unaffected. If this is the case for many users, the effect on your overall data may be small.
If the user previously visited myiframeform.com via the wrapper page/iframe pre-ITP, then the cookie should still exist, and is there waiting to be used (so data should be sent per usual)
If the wrapper & iframe content are on different subdomains of the same root domain, and the _ga cookie’s cookieDomain field is set to ‘auto’, then the cookie is set by the wrapper page, and the iframe page can read the already-existing cookie - so everything works normally. (Unless you’re somehow firing your iframe page’s GA PV hit before the wrapper page’s GA PV hit, which would be odd but could happen.)
As mentioned before (but worth repeating), if the user has disabled ITP or is using a different browser, everything works normally. In different browsers though, there are generally still ways to disable third-party cookies, which would create the same problem. So, it’s possible that this solution will bring back some data from other browsers as well.
Conclusion
If you have cross-domain iframes with GA tracking inside the iframe, you’re currently missing what is probably a significant portion of your data. You can do a few things with that:
Nothing. Wave goodbye to the data.
Move your cross-domain content onto the same root domain (a subdomain is fine) that the wrapper page uses. Then, things work automatically.
Stop iframing the content in - just link to it.
Add a customTask to handle situations where the cookie is rejected.
I will definitely deploy my solution everywhere it might be helpful, but when the topic of cross-domain iframes arises, I’ll continue to think of a presentation which included this inspiring slide:
We can make it work, but let’s avoid if at all possible.
Addendum: debugging
It took a long time to figure out what the problem was here. Reports were clear (ish?) but I couldn’t reproduce them, and it was one of those “unknown unknowns” situations. So I thought I’d summarize how I even got to a point of finding this issue.
As I mentioned above, this issue came as a report of missing data from a few countries in the Middle East (for a global enterprise). The report was that none of their forms were sending form submit data to GA from iOS. I asked for clarification, and was told that the pageviews of the form were appearing in GA, but the form submits were not.
My first step in this kind of case is to check the data itself. So, I navigated to a form page, saw that the form itself was in an iframe (as expected), got the URL of the iframed content, and looked for the iframe pageviews in GA - there were some, but not many, from iOS. Then I looked at the form submit events: again, there were a few from iOS. When I reviewed the conversion rate between PVs/submits, it was approximately the same across OSes and browsers. Nothing in that data indicated that anything was wrong, so I suspected a misunderstanding somewhere. It seemed like the user thought that the overall iOS numbers should be higher due to their marketing campaigns; but I thought that there could be some mis-targeting somewhere (media agencies do make mistakes like that) - maybe they just weren’t showing ads to iOS, for example.
“iOS” can be somewhat synonymous with “Safari”, and debugging on desktop is so much easier than mobile, so I tried it myself. At the form’s wrapper page, I filled out and submitted the form with no problems. I saw the pageviews and the form submit events; the hits went out (I viewed them in Safari’s Web Inspector), and the data appeared in GA’s real-time reports. Everything looked fine. We did some more back-and-forth, and the discussion stalled - I couldn’t reproduce the bug and we were just stuck.
The reporter - who by the way was on the other side of the world, so it was difficult to do a debugging session together - tried to replicate this himself. He’s experienced with GA, and he told me that when he viewed and submitted the form on his phone, he saw the pageview, but not the form submit. That’s pretty weird. I started wondering if my PII-cleansing script might be conflicting with some Arabic character encoding, and maybe that was causing an error. (Have you ever tried to debug encoding in iframes on Safari? It’s….not….fun.)
Simultaneously - but unrelated - I was trying to update and streamline my overall cross-domain iframe handling. It’s a separate story, but I was testing some postMessage clientId-passing functionality on a different set of iframed forms, and when I tried to send hits using my postMessage clientId instead of using the sometimes-already-existing _ga cookie on the iframed content, I lost data. I’d narrowed that down to Safari as well, and again, it worked often, but not always. In that process, I tried to debug with a combination of enabling/disabling ITP, and clearing cookies on one or both domains. The behavior was spotty - sometimes data was sent to GA, and sometimes not. Inconsistent results are the worst, because it means you haven’t really found the problem. I was ready to give up on that, and turned some of my attention back to trying to replicate the form submit problem.
Now though - I was able to replicate the problem! Why weren’t my submits going through now? And where were my pageviews? I started opening the iframe source directly and trying there, thinking maybe it was a Safari logging problem. Everything worked there, outside of the iframe. Then I tried the iframe again, and the data WENT THROUGH. What?!?! Was I losing my mind?
A few main things:
At the beginning of all of this, I had ITP disabled (surely because I was debugging something else earlier). These issues only apply when it’s enabled.
When I cleared cookies, I wasn’t always clearing all cookies - I sometimes just cleared cookies for the wrapper page’s domain. As long as the _ga cookie on the iframed content’s domain already existed, ITP didn’t block it from being used - so that’s why some hits went out.
I often opened the wrapper page in one tab, and the iframed content in another tab, so I could compare them to try to figure out if maybe my cross-domain linking was breaking something. But every time I opened the iframed content outside of the iframe, my GA cookie was set, and my whole test was invalidated - though I didn’t know it at the time. I had a lot of “I SWEAR that didn’t happen when I tested this a second ago” moments.
Speaking of which, there were times when a hit appeared to be missing in Safari’s Network panel, but it appeared immediately in GA’s real-time reports. I still don’t know what that was about. I don’t have any crazy filtering applied in Safari, so it makes no sense; my only takeaway there is to always check RT when debugging Safari. <shrug>
Debugging GTM on Safari is terrible when ITP is enabled! I tried the workarounds to accept GTM’s 3rd-party cookies so I could use the debugger, but even if I could get this to work, it was broken every time I cleared cookies to test my ITP problem. I gave up on this and instead wound up pushing versions with plenty of console logging live - not dangerous exactly, but not my favorite thing, and definitely not great workflow-wise.
While trying to streamline my iframe tracking via postMessage, I had a hard time debugging in Safari. It’s easy to switch contexts in Chrome, to view the iframed content’s messages separately from the wrapper page. I couldn’t find a way to do that in Safari, and things were working correctly in Chrome (so it couldn’t help). This slowed me down.
When the original-bug-reporter said that pageviews were being tracked, he meant that wrapper page pageviews existed in GA. But I was looking at iframe page pageviews. So my Safari conversion rate was based only on pageviews where the cookie was already set, but his was based on all Safari views of the wrapper (of which there were many more). If I’d noticed that, I think I would have reached this conclusion more quickly.
Since the issue originally came from a single market that uses a non-Latin script, I thought the problem might be related to encoding. In reality it’s a worldwide problem and has nothing to do with language, but no other market happened to notice it - probably because there are still form submits, GA tracking never perfectly matches CRM data, there’s still traffic from Safari, and most people would only look at the wrapper page (which is fine) to measure pageviews.
That was long, but hopefully worth it. Thanks for reading and please get in touch - Twitter works, or angela (at) angelagrammatas (dot) com - if you have any thoughts or questions. Happy iframing!