How to Make Google Analytics HIPAA Compliant

Lesley Van De Mortel • Dec 02, 2023

10 Steps to Ensure Google Analytics (GA4) is HIPAA Compliant by De-Identifying Your Data

In this article, you will learn the steps required to ensure Google Analytics 4 complies with the HIPAA rules by de-identifying the data before it gets stored on Google's servers.

Screenshot of Google Analytics showing Redact data details for HIPAA compliance.

Before diving into the technical setup to ensure Google Analytics is HIPAA compliant, we must first understand what HIPAA is and what it aims to protect. HIPAA (Health Insurance Portability and Accountability Act) was enacted in 1996 primarily to modernize and improve the portability and continuity of health insurance coverage for individuals when they change or lose their jobs.


One of the most significant aspects of HIPAA is the establishment of rules and regulations to safeguard the privacy and security of patients' health information. This is achieved through the Privacy Rule, which sets standards for the protection of individually identifiable health information, and the Security Rule, which establishes security standards for electronic health information.


In short, it includes strict rules and regulations regarding the unauthorized disclosure of Protected Health Information (PHI). When talking about PHI, most assume that PHI consists of treatment information, billing information, and other patient data. While this is considered PHI, there are lesser-known unique identifiers that aren't so obvious, but are still considered PHI.


For example, IP addresses are unique identifiers, and user_id cookies stored in browsers and collected by Google Analytics are considered unique identifiers. These identifiers become PHI once connected to a healthcare provider's (covered entity) website and are then protected under HIPAA. The HHS' Office for Civil Rights issued a bulletin back in December of 2022 explaining in detail that whenever PII (Personally Identifiable Information) is stored in the same dataset as health data, it becomes PHI, even when the individual does not have an existing relationship with the covered entity.


Here's a direct quote from the bulletin:


"All such IIHI collected on a regulated entity's website or mobile app generally is PHI, even if the individual does not have an existing relationship with the regulated entity and even if the IIHI, such as IP address or geographic location, does not include specific treatment or billing information like dates and types of health care services. This is because when a regulated entity collects the individual's IIHI through its website or mobile app, the information connects the individual to the regulated entity (i.e., it is indicative that the individual has received or will receive health care services or benefits from the covered entity), and thus relates to the individual's past, present, or future health or health care or payment for care."

10 steps to ensure the use of Google Analytics is HIPAA compliant

There are certain steps we must take to ensure the use of Google Analytics complies with the HIPAA rules. Google Analytics doesn't sign Business Associate Agreements (BAA) with covered entities, which is a requirement under HIPAA when the business associate transfers, stores, or has access to PHI. This means that in order to be compliant, we must ensure that Google's servers do not collect and store identifiers.


Since we cannot sign a BAA with Google, we implement the 10 steps outlined in this article instead to ensure the use of Google Analytics complies with the HIPAA rules.


In this article, I will walk you through the 10 required steps to prevent Personally Identifiable Information from being collected and stored in Google Analytics, even when using Google Tag Manager.


If you're not using Google Tag Manager, the easiest way to redact data is from within Google Analytics 4. Let's start with GA4 without (S)GTM first.

Step 1 - Google Analytics 4:

Log into your Google Analytics account and navigate to your Admin section as shown in the screenshot below.

Screenshot of Google Analytics showing Data collection details.

Step 2 - Google Analytics 4:

After Step 1, navigate to your Data collection, as shown in Step 2 in the screenshot above.

Now that you're in the Data collection settings, ensure all boxes are unchecked. They are unchecked by default, but they may have been turned on previously by other team members or an agency. These settings allow Google to collect "Signals," primarily based on unique identifiers such as geolocation and other device identifiers.


Google Signals leverages third-party cookies to link user data collected from websites with the user's login information on Google services. As a reporting identity, it enhances cross-browser and cross-device reporting, but of course, this is not compliant with the HIPAA rules, and there are state privacy laws where this could be an issue if misconfigured.

Screenshot of Google Analytics showing Google Signals data details.

Step 3 - Google Analytics 4:

Next, navigate to the Data retention settings on your left, as shown in the screenshot below, set the Event data retention to the minimum setting of 2 months, and ensure the Reset user data on new activity toggle is switched off. Configuring these two settings to the minimum will help you comply with the HIPAA Minimum Necessary Standard, which is part of the HIPAA Privacy Rule.

Screenshot of Google Analytics showing Data retention details.

Step 4 - Google Analytics 4:

Next, head back to your Admin dashboard and navigate to the Reporting Identity, as shown in the screenshot below.

Screenshot of Google Analytics showing the Reporting Identity details.

Step 5 - Google Analytics 4:

Ensure that your setting is set to Device-based inside the Reporting Identity settings, as shown in the screenshot below. This may already be set to Device-based because we turned off Google Signals in the previous steps, but double-check.

Screenshot of Google Analytics showing the Reporting Identity settings details.

Step 6 - Google Analytics 4:

After completing Steps 1 to 5, navigate back to your Admin dashboard and go to your Data streams, as shown in the screenshot below.

Screenshot of Google Analytics showing Data streams details.

Step 7 - Google Analytics 4:

After clicking into your Data Stream, you have to click through one more time to access the settings of your Data Stream, as shown in the screenshot below.

Screenshot of Google Analytics showing how to add Data streams details.

Step 8 - Google Analytics 4:

Inside your Data Stream, scroll down to where it says "Redact data," and click to open the settings.

Screenshot of Google Analytics showing Redact data details for HIPAA compliance.

Step 9 - Google Analytics 4:

Inside the Redact data settings, you have two main options: enabling the redaction of email addresses, and URL query parameters. To start, toggle on Email redaction so that Google Analytics does not collect and store email addresses. You can read Google's [GA4] Data redaction article for a more detailed explanation.

Screenshot of Google Analytics showing Redact data settings for HIPAA compliance.
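To make the idea behind email redaction more concrete, here is a minimal Python sketch that replaces anything resembling an email address in a URL with a placeholder. It only illustrates the concept and is not Google's implementation; the regular expression, the function name redact_emails, and the placeholder text are all assumptions made for this example.

```python
import re

# Simplified email pattern (an assumption for illustration only).
# It will not catch URL-encoded addresses such as jane%40example.com.
EMAIL_PATTERN = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def redact_emails(text: str, placeholder: str = "(redacted)") -> str:
    """Replace every email-like substring in text with the placeholder."""
    return EMAIL_PATTERN.sub(placeholder, text)


if __name__ == "__main__":
    # Hypothetical URL, as it might appear after a form submission.
    url = "https://example.com/thank-you?email=jane.doe@example.com&source=form"
    print(redact_emails(url))
```

A quick test like this against a handful of your own thank-you page URLs can help you judge which values would be caught by pattern-based redaction and which would need an explicit query parameter entry.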

Step 10 - Google Analytics 4:

After enabling the Email redaction, it is time to add query parameters. Please list all those that could contain unique identifiers. Obviously, this is different for every healthcare organization, depending on the type of data they collect. However, a good starting point would be to add first name, last name (or first_name and last_name, depending on your setup), and phone, as shown in the screenshot above. You'll notice that I added the email parameter here again and this is probably overkill, but I use it as a wildcard for "just in case."


Again, if your organization collects additional information in their web forms such as zip code, city, or any registration number fields that could contain unique identifiers, you should list them there too.
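As a rough illustration of what redacting query parameters means in practice, the Python sketch below replaces the values of a configurable list of parameters in a URL. This is a conceptual model rather than how GA4 does it internally; the function name redact_query_params, the REDACTED placeholder, and the parameter names are assumptions you would swap for your own.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Hypothetical parameter names; list whatever your own web forms actually send.
REDACTED_PARAMS = {"first_name", "last_name", "phone", "email"}


def redact_query_params(url: str, names=REDACTED_PARAMS, placeholder: str = "REDACTED") -> str:
    """Return url with the values of the listed query parameters replaced."""
    parts = urlsplit(url)
    pairs = parse_qsl(parts.query, keep_blank_values=True)
    cleaned = [(key, placeholder if key.lower() in names else value) for key, value in pairs]
    return urlunsplit(parts._replace(query=urlencode(cleaned)))


if __name__ == "__main__":
    url = "https://example.com/thanks?first_name=Jane&phone=555-0100&utm_source=google"
    print(redact_query_params(url))
```

Parameters that are not on the list, such as utm_source in the example, pass through untouched, which is why the list needs to cover every field your forms can place in the URL.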

For most smaller healthcare practices, having completed the 10 steps above, your use of Google Analytics 4 should now comply with the HIPAA rules. However, this is not a guarantee because it depends on your setup, so be sure to verify it with your legal department.


But what if I don't know which URL query parameters my website and web forms collect?


I've got you covered. Let's dive into that in the next section of this article.

How to find URL query parameters in Google Analytics that may contain PHI

As mentioned earlier, every healthcare organization is different and collects different information, but this is equally true for their technology stack.


There are so many different Content Management Systems (CMS), from WordPress to Duda, Wix, Squarespace, and many others. Whichever CMS you use, there are potential pitfalls to HIPAA compliance, and to learn more about how to build a HIPAA-compliant website, you can read this article by HIPAAjournal.com.


Since Google Analytics 4, by default, doesn't store IP addresses, you don't have to worry about them for now. If your organization assigns user_id in the web browser and stores that in a cookie to be collected by Google Analytics as part of your reporting setup, you must assess whether that data can be used to identify users when it gets stored in GA. If this is the case, you must stop collecting that data to comply with HIPAA. Google Analytics 4 doesn't assign user_id by default, so you don't need to worry about them if your organization isn't setting these cookies manually.


There are various ways of identifying potentially harmful URL parameters that Google Analytics 4 may be collecting, and we'll go through one of them now: a PII Redaction Report.


We'll create an Exploration report in Google Analytics 4 to identify URL parameters based on conversion events (form submissions, for example).


I'll walk you through the four steps below.

PII Redaction Report - Step 1:

Log back into your Google Analytics 4 account and navigate to the "Explore" tab on your left, as shown in the screenshot below.

Screenshot of Google Analytics showing Explore report details.

PII Redaction Report - Step 2:

Click into the Explore tab and select a blank canvas to create a new PII Redaction report for one of your conversion events in Google Analytics 4.

Screenshot of Google Analytics showing Explore blank canvas.

PII Redaction Report - Step 3:

In your blank canvas, as shown in the screenshot below, you would need to select the following:


  • Dimensions: page_location and event_name
  • Metrics: sessions
  • Rows: page_location
  • Filters: event_name exactly matches generate_lead


Look at the screenshot below and add these parameters precisely where I did.


In this instance, I have selected generate_lead as this is the event that is triggered when a user submits a web form on the website. Using a conversion event like generate_lead for this report will show you the URLs and their query parameters sent to Google Analytics upon submitting the web form, which, of course, has the highest likelihood of containing PHI.


Select your date range, whether the last 30 days or the last six months, and look through the list of URLs shown on your right for any parameters containing unique identifiers.


If there aren't any parameters that "potentially" contain unique identifiers, you don't need to do anything else other than the ten steps listed in the first section of this article, and your use of Google Analytics 4 should comply with the HIPAA rules; if you do see parameters that contain unique identifiers, then you would add those individual parameters to the Redact data settings in your Data streams as shown in the first section of this article.

Screenshot of Google Analytics showing an Explore report with URL parameters.
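If the list of URLs in the report is long, reading it by eye gets tedious. The Python sketch below assumes you have copied or exported the page_location values into a list, counts every query parameter name it finds, and flags names that look like they could carry identifiers. The hint words and function names are assumptions for illustration, and a flagged name still needs a human to look at it.

```python
from collections import Counter
from urllib.parse import parse_qsl, urlsplit

# Hypothetical substrings that suggest a parameter may hold an identifier.
SUSPECT_HINTS = ("name", "email", "phone", "zip", "city", "dob")


def count_query_params(urls):
    """Count how often each query parameter name appears across the URLs."""
    counts = Counter()
    for url in urls:
        for key, _value in parse_qsl(urlsplit(url).query, keep_blank_values=True):
            counts[key.lower()] += 1
    return counts


def flag_suspects(param_names, hints=SUSPECT_HINTS):
    """Return the parameter names containing any of the hint substrings."""
    return sorted(name for name in param_names if any(hint in name for hint in hints))


if __name__ == "__main__":
    # Hypothetical page_location values taken from the Exploration report.
    urls = [
        "https://example.com/thank-you?first_name=Jane&utm_source=google",
        "https://example.com/thank-you?phone=555-0100&utm_source=google",
        "https://example.com/contact",
    ]
    counts = count_query_params(urls)
    print(counts)
    print(flag_suspects(counts))
```

Any name that gets flagged is a candidate for the Redact data settings in your Data stream.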

PII Redaction Report - Step 4:

IMPORTANT: Depending on your marketing and website, you may have different websites with different web forms or portals, and different conversion events. If so, you will have to duplicate the PII Redaction report multiple times and change the filter to reflect the other conversion actions or login portal URLs (authenticated) to ensure those actions don't collect any PII in the URL parameters.


Once you have all your reports built out, you should check these reports periodically to ensure no new parameters are sending unique identifiers over to GA4.
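One way to make the periodic check less manual is to keep a list of parameter names you have already reviewed and compare each new batch of URLs against it. The following Python sketch shows the idea; the function name find_unreviewed_params and the sample parameter names are hypothetical.

```python
from urllib.parse import parse_qsl, urlsplit


def find_unreviewed_params(urls, reviewed):
    """Return query parameter names found in urls that are not in the reviewed set."""
    seen = set()
    for url in urls:
        for key, _value in parse_qsl(urlsplit(url).query, keep_blank_values=True):
            seen.add(key.lower())
    return sorted(seen - {name.lower() for name in reviewed})


if __name__ == "__main__":
    # Names you have already assessed (either harmless or already redacted).
    reviewed = {"utm_source", "utm_medium", "first_name", "phone"}
    # Hypothetical URLs from this month's report.
    urls = [
        "https://example.com/thank-you?utm_source=google&member_id=12345",
        "https://example.com/thank-you?utm_medium=cpc",
    ]
    print(find_unreviewed_params(urls, reviewed))
```

Anything the function returns is a parameter that appeared since your last review and should be assessed before it keeps flowing into GA4.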


Congratulations! While these steps DO NOT GUARANTEE that your Google Analytics 4 setup is HIPAA compliant, you are now in a much better place than before and you shouldn't be too far off.


One thing to note: HIPAA compliance differs from state privacy laws, and while your setup may comply with the HIPAA rules, it may not comply with your state's privacy laws just yet. For example, Washington state has recently enacted one of the most stringent health privacy laws, which has a scope much broader than HIPAA, and since HIPAA does not preempt state laws that are more stringent, you'd need to comply with that state law as well.


If an external agency handles your marketing and analytics, or if you're looking for a HIPAA-compliant marketing agency, I wrote an article on how to find a HIPAA-compliant marketing agency that you may find useful.


But what if I use Google Tag Manager to build a more complex tracking setup and to collect more web data than I can with the standard setup discussed above?


For this, you would need a Google Tag Manager (client-side) setup in combination with a Google Tag Manager Server-Side setup (yes, there are two different versions of Google Tag Manager). No worries, I've got you covered with this too. 

How to de-identify your data in Google Analytics using Google Tag Manager Server-Side (SGTM)

Many healthcare organizations leverage a tag manager solution like Google Tag Manager (client-side) deployed by their in-house marketing team or external marketing agency. Google Tag Manager, in combination with Google Tag Manager Server-Side, is the better way of managing website tracking as it allows for much more control and flexibility in tracking various actions and events while staying compliant with HIPAA and other privacy regulations such as CCPA, CPRA, and GDPR.


We will continue this article by diving deep into Google Tag Manager Server-Side and how we must set this up to ensure no PHI gets collected or stored by Google Analytics 4.


Before we dive into the technical setup of SGTM and the GA4 configuration, I must inform you that the above steps could conflict with the information I will go through below. This is because the GA4 configuration in GTM will exclude the parameters before the GA4 client in SGTM claims the requests, and this may or may not interfere with what you're trying to achieve in SGTM. Of course, you can always TBV (trust but verify) this in preview mode and see whether this is an issue for your particular use case.


A final caveat before diving in: the following strategy is for individuals with a solid foundation in Google Tag Manager. If you have no experience with GTM, you will not understand the following sections, and my recommendation is to learn how to implement a standard GTM tracking setup first. For those familiar with a standard GTM implementation (client-side) but not Server-Side, I recommend taking the Server-Side Tagging in Google Tag Manager course by Simo Ahava.


Right, now we've got that out of the way, let's dive in...

First, a sneak peek at the end result

The screenshot below shows a URL parameter called ip_override, which collects the IP address. To prevent the raw value from being sent over to Google Analytics, I've leveraged a transformation inside of SGTM to hash it.


For this example, we'll use the ip_override parameter that is showing in the screenshot below, as this is the only unique identifier left in this dataset. You'll see a ton more data in the screenshot below, but none of that data is considered identifiable and, therefore, not protected by HIPAA. Remember, HIPAA only applies when unique identifiers are found in the exact same dataset as healthcare data. Without this combination, there is no PHI (Protected Health Information).


This doesn't look too difficult at first glance. Unfortunately, it is. Hashing that particular parameter is relatively easy, but it requires building a Google Cloud Run server infrastructure to accomplish this little task. While I won't be covering how to build a Google Cloud Run server infrastructure (maybe in a future blog), I will guide you through removing PII/PHI from your dataset using Google Tag Manager Server-Side and its transformations.

Screenshot of Google Tag Manager showing preview mode details.

Again, the following sections assume you already have experience working with Google Tag Manager and have set up a standard client-side GTM implementation that sends data to your SGTM environment. If you don't have this yet, and you're unsure about how to set this up, feel free to book a consultation call, or apply to work with me, and I will help you with this.


The screenshot below shows a standard client-side implementation that is configured to send all data to a dedicated server.

Screenshot of Google Tag Manager showing the GA4 configuration tag.

The screenshot below shows the GA4 Client configuration inside of SGTM. The sole task of the GA4 Client is to claim the requests from the server and process that data before it gets sent to third-party platforms through the tags.

Screenshot of Google Tag Manager showing the GA4 configuration tag with server connection.

In the screenshot below, you'll see a Transformation configuration that is set up to hash the ip_override parameter in the Event Data before it gets sent to Google Analytics 4. In the next few sections, we'll go over how to set this up exactly.

Screenshot of Google Tag Manager showing a transformation hashing PHI for HIPAA compliance.

How to Hash unique identifiers leveraging Transformations in SGTM before sending de-identified data to Google Analytics

Follow the steps below in sequential order. We'll go over the Transformations and all the variables that need to be set up to isolate the individual parameters and then hash that data before it gets sent to Google Analytics.


Let's dive in.

Google Tag Manager (SGTM) - Step 1:

Head over to your Variables section in SGTM and add a new User-Defined Variable. The sha256 Hasher isn't listed in the default options, so you will need to click into the Community Template Gallery. Simply type in "hash," and you'll see the variable template appear. Select the sha256 Hasher variable by Simo Ahava as shown in the screenshot below, and add it to your workspace.

Screenshot of Google Tag Manager showing the hashing variable for HIPAA compliance.

Google Tag Manager (SGTM) - Step 2:

The following couple of steps are relatively easy, but they have to be done in the exact order shown for this to work. First, click into the sha256 Hasher variable and then click the grey building block, which will open another selection of variables, as shown in the screenshot below this one.

Screenshot of Google Tag Manager showing the sha256 hasher variable for HIPAA compliance.

Google Tag Manager (SGTM) - Step 3:

Since there is no pre-built Event Data variable, we'll have to create our own by going to the top right corner of your screen and clicking into the +.

Screenshot of Google Tag Manager showing a variable for HIPAA compliance.

Google Tag Manager (SGTM) - Step 4:

Here we select the Event Data variable.

Screenshot of Google Tag Manager showing an event data variable for HIPAA compliance.

Google Tag Manager (SGTM) - Step 5:

In the Event Data variable, we add the parameter "ip_override," and name this variable: Event Data - ip_override. Then hit the blue save button in the top right corner.

Screenshot of Google Tag Manager showing an event data variable for the sha256 variable.

Google Tag Manager (SGTM) - Step 6:

Now you will see your Event Data - ip_override variable show in the sha256 Hasher variable, as shown in the screenshot below. Name this variable: Hash - ip_override and save this variable again.

Screenshot of Google Tag Manager showing the ip_override variable hashed for HIPAA compliance.
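For readers who want to see what the hashing itself does to a value, here is a minimal Python sketch using the standard library. It is not the code of the sha256 Hasher template; it assumes hex-encoded output and uses a made-up IP address from a range reserved for documentation.

```python
import hashlib


def sha256_hex(value: str) -> str:
    """Return the SHA-256 digest of value as a 64-character hex string."""
    return hashlib.sha256(value.encode("utf-8")).hexdigest()


if __name__ == "__main__":
    # Hypothetical ip_override value (203.0.113.0/24 is reserved for documentation).
    raw_ip = "203.0.113.7"
    hashed = sha256_hex(raw_ip)
    print(hashed)
    print(len(hashed))
```

The same input always produces the same digest, so the hashed value can still be compared across events without the original IP address appearing in the dataset.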

Google Tag Manager (SGTM) - Step 7:

The Hash - ip_override variable will now show in your transformation below the Augment event, as shown in the screenshot below. Be sure to add ip_override in the Name section, as shown below.


Under the Matching Conditions, you would need to add Client Name equals GA4, and under Affected Tags, choose Some Tags (not All Tags) and select the Google Analytics 4 tag.


Again, this SGTM section of the article is based on the assumption that you already have this portion of SGTM up and running.

Screenshot of Google Tag Manager showing the augmented event transformation for HIPAA compliance.
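The effect of this transformation can be modelled in a few lines of Python. This is a conceptual sketch only: SGTM transformations are configured in the interface rather than written as code, and the function name augment_event, the dictionary shape, and the sample values are assumptions for illustration.

```python
import hashlib


def augment_event(event_data: dict, client_name: str) -> dict:
    """Return a copy of event_data with ip_override hashed.

    The value is only changed when the request was claimed by the client
    named "GA4", mirroring the matching condition described above.
    """
    result = dict(event_data)  # never mutate the caller's dictionary
    if client_name == "GA4" and "ip_override" in result:
        raw = str(result["ip_override"])
        result["ip_override"] = hashlib.sha256(raw.encode("utf-8")).hexdigest()
    return result


if __name__ == "__main__":
    # Hypothetical event data as it might look after the GA4 client claims a request.
    event = {
        "event_name": "generate_lead",
        "page_location": "https://example.com/thank-you",
        "ip_override": "203.0.113.7",
    }
    print(augment_event(event, client_name="GA4"))
```

Every other key in the event passes through unchanged, which matches what you should see in preview mode: the same event, with only the ip_override value replaced by its hash.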

Congrats, you now have a Google Analytics 4 tracking solution using GTM and SGTM to prevent PII from flowing to Google Analytics and being stored on Google's servers.


Please note: The ip_override parameter is used as an example only and is the only specific parameter covered in this article; there may be more or different parameters in your setup that could send PII to Google Analytics, and you need to double-check this before you can be confident that you comply with the HIPAA rules.


Here are some other parameters that contain unique identifiers:


  • gclid = Google's click ID - when you're running Google Ads
  • fbclid = Facebook's click ID - when you're running Facebook ads (Meta)
  • msclkid = Microsoft's click ID - when you're running Microsoft ads (Bing)


The above parameters are automatically appended to the URL when you're running paid ads on their platforms and may not be the only parameters sending unique identifiers through. You or your team may be advertising on other platforms, and they will probably append parameters to the URLs when users click on their links. When you identify other potentially harmful parameters, repeat the above steps and hash all those parameters.
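To illustrate what hashing these click IDs would do to a landing page URL, here is a short Python sketch. It is a conceptual model, not an SGTM feature; the function name hash_click_ids is hypothetical, and a leading underscore in the parameter name is tolerated in case your setup names them that way.

```python
import hashlib
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Click ID parameters listed above; extend with those of other ad platforms.
CLICK_ID_PARAMS = {"gclid", "fbclid", "msclkid"}


def hash_click_ids(url: str, names=CLICK_ID_PARAMS) -> str:
    """Return url with the values of click ID parameters replaced by SHA-256 hashes."""
    parts = urlsplit(url)
    pairs = parse_qsl(parts.query, keep_blank_values=True)
    hashed = []
    for key, value in pairs:
        # Compare case-insensitively and ignore any leading underscore.
        if key.lower().lstrip("_") in names:
            value = hashlib.sha256(value.encode("utf-8")).hexdigest()
        hashed.append((key, value))
    return urlunsplit(parts._replace(query=urlencode(hashed)))


if __name__ == "__main__":
    # Hypothetical landing page URL after an ad click.
    print(hash_click_ids("https://example.com/?gclid=abc&utm_source=google"))
```

Campaign parameters such as utm_source are left alone, so attribution by source and medium keeps working while the per-click identifier is no longer stored in its raw form.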


But what if I want to leverage Google and Facebook ads to generate more leads for my practice? How would I send conversion events back to those advertising platforms for reporting and optimization purposes without sending PHI and violating the HIPAA rules?

Using SGTM Transformations to send conversion data back to advertising platforms

I will cover this in-depth in my next article. However, that approach would be similar to the SGTM/GA4 setup covered in this article, except you would do the inverse by removing the treatment data (URLs/Titles) from the dataset instead of the unique identifiers in the case of GA4. This is because Google Analytics already has all the URLs and page titles stored, and advertising platforms already have all the unique identifiers on their end, as most people are logged into their services. 


Here's a screenshot of what that could look like:

Screenshot of Google Tag Manager showing the augmented event transformation for HIPAA-compliant advertising.
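The inverse approach can be sketched in Python as well: instead of hashing identifiers, the event is stripped of the keys that describe what the visitor was looking at before it goes to an advertising platform. This is only a rough model of the idea; the function name strip_treatment_data and the list of keys are assumptions, and your own events may carry treatment information in other fields.

```python
# Hypothetical event data keys that can reveal treatment information
# through URLs and page titles.
TREATMENT_KEYS = {"page_location", "page_title", "page_referrer"}


def strip_treatment_data(event_data: dict, keys=TREATMENT_KEYS) -> dict:
    """Return a copy of event_data without the keys that carry URLs or titles."""
    return {key: value for key, value in event_data.items() if key not in keys}


if __name__ == "__main__":
    # Hypothetical conversion event before it is forwarded to an ad platform.
    event = {
        "event_name": "generate_lead",
        "page_location": "https://example.com/treatments/knee-surgery/thank-you",
        "page_title": "Thank you - Knee Surgery Consultation",
        "gclid": "abc",
    }
    print(strip_treatment_data(event))
```

The conversion event and the click ID remain, so the advertising platform can still attribute the lead, but nothing in the payload says which treatment page it came from.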

Conclusion: Healthcare Organizations Can Use Google Analytics in a Way That Complies With The HIPAA Rules

Ensuring that Google Analytics 4 complies with the HIPAA rules is not an easy undertaking, but with the help of this article and the steps outlined above, it is very doable.


Just remember, as mentioned before, Google Analytics isn't a HIPAA-compliant analytics platform, as it will not sign a business associate agreement. However, you can configure its settings to comply with the HIPAA rules because HIPAA only seeks to protect Protected Health Information (PHI) and not Personally Identifiable Information (PII). PII only becomes PHI when stored in a dataset containing health data, such as specific URLs containing treatment information. Ensuring that both components aren't stored in the same dataset will help you comply with the HIPAA Privacy and Security Rules.


I recommend following the steps outlined in this article first, and then engaging with your legal department. They can advise you on any additional steps you may need to take for your unique business compliance requirements, and regarding your obligations under state law, which could differ depending on the state you are located in. This article does not cover any specifics regarding state privacy laws, and you are advised to do your due diligence on that part.


Please book a consulting call with me if you need help implementing the above steps to ensure your use of Google Analytics is HIPAA compliant. I'd be happy to help you with this.


Thank you for taking the time to read this article, and I genuinely hope it will help you on your journey toward HIPAA compliance. If you would like us to manage the entire process from A to Z, please apply to work with us.

About the Author


LESLEY VAN DE MORTEL

HIPAA Marketing Consultant

Lesley is a CDMP-certified digital marketing consultant and a CHPSE® Certified HIPAA Privacy Security Expert. With over seven years of in-depth experience building profitable HIPAA-compliant patient acquisition systems for private healthcare organizations across the United States, Lesley has worked, and still works, with some of the leading healthcare organizations in their respective field and has helped several of them scale their organization across multiple cities and states by leveraging high-performance, HIPAA-compliant patient acquisition systems.
