How to Find and Fix PII in Google Analytics Data
Receiving a policy violation notice from Google can be a jarring experience.
The threat of being cut off from a significant revenue-driving marketing channel is nothing to take lightly.
Here are more specifics from Google around this policy:
To protect user privacy, Google policies mandate that no data be passed to Google that Google could use or recognize as personally identifiable information (PII). PII includes, but is not limited to, information such as email addresses, personal mobile numbers, and social security numbers.
Many contracts, terms of service, and policies for Google’s advertising and measurement products refer to “Personally Identifiable Information” (PII). You may find in such contracts, terms of service, and policies a prohibition against passing information to Google that Google could use or recognize as PII.
What Google Considers PII
Google interprets PII as information that could be used on its own to directly identify, contact, or precisely locate an individual. This includes:
- email addresses
- mailing addresses
- phone numbers
- full names or usernames
How to Look for PII in Your Google Analytics
There are a few ways to do this.
The easiest is to go to:
Google Analytics > Behavior > Site Content > All Pages
Then filter the report with the @ symbol.
This will bring up any pageviews whose page paths contain an @, which is how most email addresses show up.
Another option is to use the GA Debugger Google Chrome extension and inspect the hit details it prints to the browser console.
Look for email addresses
If you need a stricter check that matches a full address like firstname.lastname@example.org (instead of just the @ symbol), insert this regex into the filter field:
This is stricter because it looks for the full email format.
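As a rough illustration of what a full-format email pattern might look like, here is a minimal sketch in Python. The pattern and the sample page paths are assumptions made up for this example, not necessarily the regex the filter field should receive, and the report filter may support a slightly different regex syntax than Python does.

```python
import re

# Hypothetical pattern: local part, "@", domain, a dot, then a TLD of 2+ letters.
EMAIL_PATTERN = r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"

def find_emails(page_path: str) -> list[str]:
    """Return every email-like substring found in a page path."""
    return re.findall(EMAIL_PATTERN, page_path)

# Made-up page paths, e.g. copied from an exported All Pages report.
paths = [
    "/account?email=firstname.lastname@example.org",
    "/blog/follow-us-@-twitter",
    "/products/widget",
]
for path in paths:
    print(path, find_emails(path))
```

Only the first path produces a match. The second contains an @ but not a full address, which is the kind of noise the plain @ filter would also return.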
Look for Social Security numbers
This regex looks for the common Social Security number format of 111-11-1111:
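A minimal sketch of a pattern for the 111-11-1111 shape, checked in Python. The pattern and the sample paths are assumptions for illustration only.

```python
import re

# Hypothetical pattern: 3 digits, hyphen, 2 digits, hyphen, 4 digits.
SSN_PATTERN = r"[0-9]{3}-[0-9]{2}-[0-9]{4}"

def contains_ssn(page_path: str) -> bool:
    """Return True if the page path contains an SSN-shaped string."""
    return re.search(SSN_PATTERN, page_path) is not None

# Made-up page paths for illustration.
for path in ["/apply?ssn=111-11-1111", "/archive/2023-11-05", "/products/widget"]:
    print(path, contains_ssn(path))
```

The date-style path does not match because its last group has only two digits, but other hyphenated numbers with the same 3-2-4 shape would, so matches still deserve a manual look.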
Look for addresses
This regex looks for common address terms, but the right list is very subjective, so you will need to adapt it to your own needs. The pipe symbol | is an OR condition.
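To make the OR idea concrete, here is a minimal sketch in Python. The keyword list is a hypothetical starting point, not a recommended set, and the sample paths are made up.

```python
import re

# Hypothetical keyword list; each | means OR. Adapt it to your own site.
ADDRESS_PATTERN = r"address|street|avenue|boulevard|zipcode|postal"

def looks_like_address(page_path: str) -> bool:
    """Return True if the page path contains any address-related keyword."""
    return re.search(ADDRESS_PATTERN, page_path, re.IGNORECASE) is not None

print(looks_like_address("/checkout?addr=123+Main+Street"))  # True
print(looks_like_address("/blog/sesame-street-toys"))        # True (a false positive)
print(looks_like_address("/products/widget"))                # False
```

The second path shows why the list needs tuning: a keyword match alone does not prove an address is present.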
Look for phone numbers
This is very similar to the Social Security regex but can be modified:
This matches the format of 800-867-5309. If you wanted to remove the hyphens, it would look like this:
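A minimal sketch of both variants in Python. Both patterns are assumptions for illustration, and the sample paths are made up.

```python
import re

# Hypothetical patterns for a 10-digit phone number.
PHONE_WITH_HYPHENS = r"[0-9]{3}-[0-9]{3}-[0-9]{4}"  # e.g. 800-867-5309
PHONE_NO_HYPHENS = r"[0-9]{10}"                     # e.g. 8008675309

def has_phone(page_path: str, pattern: str) -> bool:
    """Return True if the page path contains a match for the given pattern."""
    return re.search(pattern, page_path) is not None

print(has_phone("/callback?phone=800-867-5309", PHONE_WITH_HYPHENS))  # True
print(has_phone("/callback?phone=8008675309", PHONE_NO_HYPHENS))      # True
print(has_phone("/callback?phone=8008675309", PHONE_WITH_HYPHENS))    # False
```

The hyphen-free variant matches any run of ten digits, so long order IDs or timestamps in a URL can also trigger it.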
Look for names
This one is a bit more difficult to nail down, but you can start with a regex like this that looks for names that are explicitly labeled:
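One way to read "labeled" is a query parameter whose key says it holds a name. The sketch below assumes hypothetical parameter names such as firstname, last_name, and fullname; your site may use different labels, and the sample paths are made up.

```python
import re

# Hypothetical pattern: a first/last/full name label, "=", then the value up to the next "&".
NAME_PATTERN = r"(first|last|full)[_-]?name=[^&]+"

def find_labeled_names(page_path: str) -> list[str]:
    """Return each 'label=value' pair that looks like a labeled name."""
    return [m.group(0) for m in re.finditer(NAME_PATTERN, page_path, re.IGNORECASE)]

print(find_labeled_names("/signup?firstname=Jane&lastname=Doe"))
print(find_labeled_names("/signup?first_name=Jane"))
print(find_labeled_names("/products/widget?color=blue"))
```

Names that appear without a label, for example inside a free-text search term, will not be caught by this approach.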
How to Remove PII from Pageview Hits
The only real way to remove PII from your own Google Analytics pageview hits is to prevent it from being sent to GA in the first place.
And the only way to fully protect yourself is to put a safeguard in place in Google Tag Manager that strips this data out of your hits before they are sent to GA.
NOTE: Filters do not remove this data, because the hit has already been sent to Google by the time a filter is applied. Do not put filters in place and assume this fixes your issue.
If you are on Shopify then you can use our Google Tag Manager Suite App which has this PII redaction tag already in place.
This redaction was made possible by GTM guru Simo Ahava, who utilizes the customTask function via a custom HTML tag to redact this data within the pageview hit sent to Google Analytics.
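The core of any such safeguard is a substitution step: find PII-shaped substrings in the URL and replace them with a placeholder before the hit leaves the browser. The Python sketch below only illustrates that step. It is not Simo's code or the app's tag, the real implementation would be JavaScript running in GTM, and the rule list, patterns, and the `[REDACTED SSN]` label are assumptions.

```python
import re

# Hypothetical redaction rules: (placeholder, pattern). Extend to suit your site.
REDACTION_RULES = [
    # Email, with "@" either literal or URL-encoded as %40.
    ("[REDACTED EMAIL]", r"[A-Za-z0-9._%+-]+(?:@|%40)[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    # Social Security number in 111-11-1111 form.
    ("[REDACTED SSN]", r"[0-9]{3}-[0-9]{2}-[0-9]{4}"),
]

def redact(url: str) -> str:
    """Replace every PII-shaped substring in the URL with its placeholder."""
    for placeholder, pattern in REDACTION_RULES:
        url = re.sub(pattern, placeholder, url)
    return url

print(redact("/account?email=jane@example.com"))
print(redact("/account?email=jane%40example.com"))
print(redact("/apply?ssn=111-11-1111"))
print(redact("/products/widget"))
```

Keeping the rules in one list makes it easy to add the phone, address, or name patterns later without touching the function itself.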
Once you’ve implemented one of these methods:
- Installing GTM Suite App and migrating Google Analytics hit data to GTM
- Implementing Simo’s method of sitewide GA tracking via GTM
Then it’s time to test.
It’s pretty simple to test this. All you have to do is go to your website and put an email into your URL like this:
Then you should start seeing the REDACTED EMAIL within your pageview hits like this:
Once you’ve implemented this PII redaction, it’s time to move on to mitigating bounce rate issues.