



The Google Analytics API provides access to Google Analytics (GA) reporting data such as pageviews, sessions, traffic sources and bounce rate.

Google's official documentation explains that it can be used to build custom dashboards that display GA data, automate complex reporting tasks, and integrate with other applications.

You can access the API response from several languages, including Java, PHP, and JavaScript, but this article focuses specifically on accessing and exporting data using Python.

This article describes several methods you can use to access different subsets of your data using different metrics and dimensions.

I would like to write a follow-up guide that explores different ways to analyze, visualize, and combine data.

Setting up the API

Creating a Google service account

The first step is to create a project or select a project within your Google service account.

Once this has been created, the next step is to select the + Create Service Account button.

Google Cloud screenshots, December 2022

You will then be prompted to add details such as name, id and description.

Google Cloud screenshots, December 2022

Once the service account is created, go to the Keys section and add a new key.

Google Cloud screenshots, December 2022

This will prompt you to create and download a private key. For this example, select JSON, create the file, and wait for it to download.

Google Cloud screenshots, December 2022

Adding to the Google Analytics account

You will also need a copy of the email address that was generated for your service account. This can be found on your main account page.

Google Cloud screenshots, December 2022

The next step is to add that email as a Google Analytics user with analyst permissions.

Google Analytics screenshots, December 2022

Enabling the API

The last and perhaps most important step is enabling access to the API. To do this, make sure you are in the correct project and follow this link to enable access.

Then follow the steps to enable it when prompted.

Google Cloud screenshots, December 2022

This is required to access the API. If you don’t do this step, you will be prompted to complete it the first time you run the script.

Accessing the Google Analytics API using Python

Now that we have everything set up in our service account, we can start writing scripts to export data.

I chose Jupyter Notebook to create this, but you can also use other integrated development environments (IDEs) such as PyCharm or VSCode.

Installing the libraries

The first step is to install the necessary libraries to run the rest of our code.

Some are specific to the analytics API, others will be useful in future sections of code.

!pip install --upgrade google-api-python-client
!pip3 install --upgrade oauth2client
from apiclient.discovery import build
from oauth2client.service_account import ServiceAccountCredentials
!pip install connect
!pip install functions
import connect

Note: When you use pip in a Jupyter notebook, prefix the command with !. If you are running it from the command line or another IDE, the ! is not needed.

Create a service build

The next step is to set the scope, which is the read-only Analytics API authentication link.

This is followed by the client secrets JSON file that was downloaded when creating the private key. It is used in a similar way to an API key.

To easily access this file in your code, make sure the JSON file is saved in the same folder as your code file. It can then be referenced through the KEY_FILE_LOCATION variable.

Finally, add the view ID from your analytics account that you will use to access the data.

Author screenshot, December 2022

Altogether, this looks like the following. These variables are referenced throughout the rest of the code.

SCOPES = ['https://www.googleapis.com/auth/analytics.readonly']
KEY_FILE_LOCATION = 'client_secrets.json'
VIEW_ID = 'XXXXX'
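Before building credentials, it can help to confirm that the key file is where the script expects it, and to read back the service account email that needs to be added as a Google Analytics user. The sketch below is only an illustration: the helper name read_service_account_email is made up for this example, and it assumes the downloaded JSON key stores the address under a "client_email" field.

```python
import json
from pathlib import Path


def read_service_account_email(key_file_location):
    """Return the client_email stored in a service account JSON key file."""
    key_path = Path(key_file_location)
    if not key_path.is_file():
        # Fail early with the resolved path so a misplaced file is easy to spot.
        raise FileNotFoundError(f"Key file not found: {key_path.resolve()}")
    with key_path.open(encoding="utf-8") as handle:
        key_data = json.load(handle)
    # Assumption: the generated address lives under the "client_email" key.
    return key_data["client_email"]
```

Calling read_service_account_email('client_secrets.json') from the folder that holds the key file should return the address to add in Google Analytics.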

Once the private key file is in place, you can build your credentials from it by passing the file location and the scope to ServiceAccountCredentials.

Next, set up the service build, calling the Analytics Reporting API v4 with the credentials defined above.

credentials = ServiceAccountCredentials.from_json_keyfile_name(KEY_FILE_LOCATION, SCOPES)
service = build('analyticsreporting', 'v4', credentials=credentials)

Writing the request body

Once everything is set up and defined, the real fun begins.

From your API service build, you can select the elements of the response that you want to access. This is called a ReportRequest object and requires at least the following:

A valid view ID in the viewId field.
At least one valid entry in the dateRanges field.
At least one valid entry in the metrics field.
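The three requirements above can be checked locally before a request is sent. This is a minimal sketch rather than anything the API provides: missing_required_fields is a hypothetical helper, and it only looks for the three fields listed in this article.

```python
def missing_required_fields(report_request):
    """Return the names of required ReportRequest fields that are absent or empty."""
    missing = []
    if not report_request.get("viewId"):
        missing.append("viewId")
    # dateRanges and metrics must each contain at least one entry.
    for field in ("dateRanges", "metrics"):
        if not report_request.get(field):
            missing.append(field)
    return missing


example_request = {
    "viewId": "XXXXX",  # placeholder view ID, as used in this article
    "dateRanges": [{"startDate": "30daysAgo", "endDate": "today"}],
    "metrics": [{"expression": "ga:sessions"}],
}

print(missing_required_fields(example_request))      # []
print(missing_required_fields({"viewId": "XXXXX"}))  # ['dateRanges', 'metrics']
```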

View ID

As mentioned earlier, there are a few things we need at this build stage, starting with the viewId. Instead of typing out the whole view ID again, you can reference the variable defined earlier (VIEW_ID).

That way, if you want to collect data from a different Analytics view in the future, you only need to change the ID in the first code block rather than in both places.
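Taking that idea one step further, the request body can come from a small function so that the view ID is the only thing that changes between calls. This is a sketch under assumptions: body_for_view is a hypothetical helper and the two view IDs are invented placeholders.

```python
def body_for_view(view_id):
    """Build a sessions-by-medium request body for a single view ID."""
    return {
        "reportRequests": [
            {
                "viewId": view_id,
                "dateRanges": [{"startDate": "30daysAgo", "endDate": "today"}],
                "metrics": [{"expression": "ga:sessions"}],
                "dimensions": [{"name": "ga:medium"}],
            }
        ]
    }


# Hypothetical view IDs, used purely for illustration.
view_ids = ["11111111", "22222222"]
bodies = {view_id: body_for_view(view_id) for view_id in view_ids}

for view_id, body in bodies.items():
    print(view_id, body["reportRequests"][0]["viewId"])
```

Each body could then be passed to service.reports().batchGet(body=...) in turn.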

Date range

Then you can add the date range for which you want to collect data. It consists of a start date and an end date.

There are several ways to write this within a build request.

For example, to select data between two set dates, add the dates in the format year-month-day: 'startDate': '2022-10-27', 'endDate': '2022-11-27'.

Alternatively, if you want to see data for the last 30 days, you can set the start date to '30daysAgo' and the end date to 'today'.
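If the dates are calculated elsewhere in your script, the year-month-day strings can be produced from Python date objects. The following is a minimal sketch: date_range and last_n_days are hypothetical helper names, and last_n_days simply spells out the relative range as two explicit dates.

```python
from datetime import date, timedelta


def date_range(start, end):
    """Build one dateRanges entry from two datetime.date objects."""
    if start > end:
        raise ValueError("start date must not be after end date")
    # isoformat() yields the year-month-day form used in this article.
    return {"startDate": start.isoformat(), "endDate": end.isoformat()}


def last_n_days(days, today=None):
    """Build a dateRanges entry covering the given number of days up to today."""
    today = today or date.today()
    return date_range(today - timedelta(days=days), today)


print(date_range(date(2022, 10, 27), date(2022, 11, 27)))
# {'startDate': '2022-10-27', 'endDate': '2022-11-27'}
```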

Metrics and dimensions

The final step in the basic response call is setting metrics and dimensions. Metrics are quantitative measurements from Google Analytics such as number of sessions, session duration, bounce rate, etc.

Dimensions are characteristics of users, their sessions, and their actions, such as page paths, traffic sources, and keywords used.

There are many different metrics and dimensions you can access. We won’t cover all of them in this article, but you can find all the additional information and attributes collectively here.

Anything you can access in Google Analytics can be accessed in the API. This includes goal conversions, starts, and values; the browser and device used to access your website; landing pages; second-page path tracking; internal search; site speed; and audience metrics.

Both metrics and dimensions are added in dictionary form, using key:value pairs. For metrics, the key is 'expression', followed by a colon (:) and then the value, which is the metric name in its ga: format.

For example, to get a count of all sessions, add 'expression': 'ga:sessions'. To get a count of all new users, add 'expression': 'ga:newUsers'.

For dimensions, the key is 'name', again followed by a colon and the value of the dimension. For example, to extract the different page paths, add 'name': 'ga:pagePath'.

Alternatively, to see the different traffic source referrals to your site, add 'name': 'ga:medium'.
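Because the two lists differ only in their key, they are easy to generate from plain names. This is a small sketch with made-up helper names (as_metrics and as_dimensions); the metric and dimension names are the ones used in this article.

```python
def as_metrics(*expressions):
    """Wrap metric names in the {'expression': ...} form."""
    return [{"expression": expression} for expression in expressions]


def as_dimensions(*names):
    """Wrap dimension names in the {'name': ...} form."""
    return [{"name": name} for name in names]


print(as_metrics("ga:sessions", "ga:newUsers"))
# [{'expression': 'ga:sessions'}, {'expression': 'ga:newUsers'}]
print(as_dimensions("ga:pagePath", "ga:medium"))
# [{'name': 'ga:pagePath'}, {'name': 'ga:medium'}]
```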

Combining dimensions and metrics

The real value lies in combining metrics and dimensions to extract the key insights that matter most to you.

For example, to see the number of all sessions created from various traffic sources, set the metric to ga:sessions and the dimension to ga:medium.

response = service.reports().batchGet(
    body={
        'reportRequests': [
            {
                'viewId': VIEW_ID,
                'dateRanges': [{'startDate': '30daysAgo', 'endDate': 'today'}],
                'metrics': [{'expression': 'ga:sessions'}],
                'dimensions': [{'name': 'ga:medium'}]
            }
        ]
    }
).execute()

Creating a DataFrame

The response you get from the API is in the form of a dictionary where all data are key-value pairs. You can convert the data to a Pandas dataframe for easier viewing and analysis of the data.

To convert the response to a DataFrame, we first need to create some empty lists to hold our metrics and dimensions.

Then loop through the response, appending each dimension value to the empty dimension list and each metric value to the empty metric list.

This extracts the data and fills the previously empty lists.

dim = []
metric = []

for report in response.get('reports', []):
    columnHeader = report.get('columnHeader', {})
    dimensionHeaders = columnHeader.get('dimensions', [])
    metricHeaders = columnHeader.get('metricHeader', {}).get('metricHeaderEntries', [])
    rows = report.get('data', {}).get('rows', [])

    for row in rows:
        dimensions = row.get('dimensions', [])
        dateRangeValues = row.get('metrics', [])

        # Collect the dimension value(s) for this row.
        for header, dimension in zip(dimensionHeaders, dimensions):
            dim.append(dimension)

        # Collect the metric value(s) for each date range in this row.
        for i, values in enumerate(dateRangeValues):
            for metricHeader, value in zip(metricHeaders, values.get('values')):
                metric.append(int(value))
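To try the parsing step without calling the API, the same traversal can be run against a hand-written response. The sketch below is illustrative only: sample_response mimics the nesting that the loop above reads but its values are invented, and rows_as_dicts is a hypothetical variant that returns one dictionary per row and reads only the first date range.

```python
# Invented sample data that follows the nesting read by the loop above.
sample_response = {
    "reports": [
        {
            "columnHeader": {
                "dimensions": ["ga:medium"],
                "metricHeader": {
                    "metricHeaderEntries": [{"name": "ga:sessions", "type": "INTEGER"}]
                },
            },
            "data": {
                "rows": [
                    {"dimensions": ["organic"], "metrics": [{"values": ["120"]}]},
                    {"dimensions": ["referral"], "metrics": [{"values": ["45"]}]},
                ]
            },
        }
    ]
}


def rows_as_dicts(response):
    """Flatten a report response into a list of {column name: value} dicts."""
    records = []
    for report in response.get("reports", []):
        header = report.get("columnHeader", {})
        dimension_names = header.get("dimensions", [])
        metric_entries = header.get("metricHeader", {}).get("metricHeaderEntries", [])
        metric_names = [entry["name"] for entry in metric_entries]
        for row in report.get("data", {}).get("rows", []):
            record = dict(zip(dimension_names, row.get("dimensions", [])))
            # Only the first date range is read in this sketch.
            date_range_values = row.get("metrics", [])
            first_range = date_range_values[0].get("values", []) if date_range_values else []
            record.update(zip(metric_names, first_range))
            records.append(record)
    return records


print(rows_as_dicts(sample_response))
# [{'ga:medium': 'organic', 'ga:sessions': '120'}, {'ga:medium': 'referral', 'ga:sessions': '45'}]
```

Metric values are left as strings here, so convert them with int() or float() before doing any arithmetic. A list of dictionaries like this can also be passed straight to pd.DataFrame().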

Add response data

Once your data is in these lists, you can easily convert them to a DataFrame by defining the column names in square brackets and assigning the list values to each column.

import pandas as pd

df = pd.DataFrame()
df["Sessions"] = metric
df["Medium"] = dim
df = df[["Medium", "Sessions"]]
df.head()

Other request response examples

Multiple metrics

You can also combine multiple metrics, with each pair wrapped in curly brackets and separated by a comma.

'metrics': [
    {'expression': 'ga:pageviews'},
    {'expression': 'ga:sessions'}
]

Filtering

You can also request that the API response only returns metrics that meet specific criteria by adding metric filters. These use the following format:

if {metricName} {operator} {comparisonValue}, return the metric

For example, if you only want to extract pages with more than 10 pageviews:

response = service.reports().batchGet(
    body={
        'reportRequests': [
            {
                'viewId': VIEW_ID,
                'dateRanges': [{'startDate': '30daysAgo', 'endDate': 'today'}],
                'metrics': [{'expression': 'ga:pageviews'}],
                'dimensions': [{'name': 'ga:pagePath'}],
                'metricFilterClauses': [{
                    'filters': [{
                        'metricName': 'ga:pageviews',
                        'operator': 'GREATER_THAN',
                        'comparisonValue': '10'
                    }]
                }]
            }
        ]
    }
).execute()
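If you expect to reuse filters with different thresholds, the clause can be assembled by a helper and attached to an existing request. This is a sketch under assumptions: metric_filter_clause and with_metric_filter are hypothetical names, and the comparison value is converted to a string to match the request shown above.

```python
def metric_filter_clause(metric_name, operator, comparison_value):
    """Build one metricFilterClauses entry holding a single filter."""
    return {
        "filters": [
            {
                "metricName": metric_name,
                "operator": operator,
                # The request above passes the threshold as a string.
                "comparisonValue": str(comparison_value),
            }
        ]
    }


def with_metric_filter(report_request, clause):
    """Return a copy of the request with the clause appended; the input is not mutated."""
    updated = dict(report_request)
    updated["metricFilterClauses"] = list(report_request.get("metricFilterClauses", [])) + [clause]
    return updated


base_request = {
    "viewId": "XXXXX",  # placeholder view ID
    "dateRanges": [{"startDate": "30daysAgo", "endDate": "today"}],
    "metrics": [{"expression": "ga:pageviews"}],
    "dimensions": [{"name": "ga:pagePath"}],
}

filtered_request = with_metric_filter(
    base_request, metric_filter_clause("ga:pageviews", "GREATER_THAN", 10)
)
print(filtered_request["metricFilterClauses"])
```

Returning a copy keeps base_request reusable for an unfiltered report.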

Filters work similarly for dimensions, but filter expressions differ slightly depending on the dimension characteristics.

For example, if you want to extract only pageviews from users who visit your site using the Chrome browser, you can set the EXACT operator and use 'Chrome' as the expression.

response = service.reports().batchGet(
    body={
        'reportRequests': [
            {
                'viewId': VIEW_ID,
                'dateRanges': [{'startDate': '30daysAgo', 'endDate': 'today'}],
                'metrics': [{'expression': 'ga:pageviews'}],
                'dimensions': [{'name': 'ga:browser'}],
                'dimensionFilterClauses': [{
                    'filters': [{
                        'dimensionName': 'ga:browser',
                        'operator': 'EXACT',
                        'expressions': ['Chrome']
                    }]
                }]
            }
        ]
    }
).execute()

Expressions

Since metrics are quantitative measures, we also have the ability to write formulas that work similarly to calculated metrics.

This involves defining an alias to represent the formula and completing the math function with the two metrics.
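As a minimal sketch of that alias-plus-formula pairing, the helper below builds a ratio metric from two metric names. The name calculated_metric is hypothetical, and the pageviews-per-session ratio is just an illustration using two metrics that appear earlier in this article.

```python
def calculated_metric(numerator, denominator, alias):
    """Build a metrics entry that divides one metric by another under an alias."""
    return {"expression": f"{numerator}/{denominator}", "alias": alias}


print(calculated_metric("ga:pageviews", "ga:sessions", "pageviews per session"))
# {'expression': 'ga:pageviews/ga:sessions', 'alias': 'pageviews per session'}
```

The returned dictionary goes inside the 'metrics' list of a report request, in place of a plain 'expression' entry.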

For example, you can calculate the number of completions per user by dividing the number of completions by the number of users.

response = service.reports().batchGet(
    body={
        'reportRequests': [
            {
                'viewId': VIEW_ID,
                'dateRanges': [{'startDate': '30daysAgo', 'endDate': 'today'}],
                'metrics': [
                    {
                        'expression': 'ga:goal1completions/ga:users',
                        'alias': 'completions per user'
                    }
                ]
            }
        ]
    }
).execute()

Histograms

The API also lets you bucket dimensions with an integer (numeric) value into ranges using histogram buckets.

For example, to bucket the session count dimension into four buckets of 1-9, 10-99, 100-199, and 200-399, you can use the HISTOGRAM_BUCKET order type and define the ranges in histogramBuckets.

response = service.reports().batchGet(
    body={
        'reportRequests': [
            {
                'viewId': VIEW_ID,
                'dateRanges': [{'startDate': '30daysAgo', 'endDate': 'today'}],
                'metrics': [{'expression': 'ga:sessions'}],
                'dimensions': [
                    {
                        'name': 'ga:sessionCount',
                        'histogramBuckets': ['1', '10', '100', '200', '400']
                    }
                ],
                'orderBys': [
                    {
                        'fieldName': 'ga:sessionCount',
                        'orderType': 'HISTOGRAM_BUCKET'
                    }
                ]
            }
        ]
    }
).execute()

Screenshot by author, December 2022
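To see how the five boundary values map to the four ranges named above, the bucketing can be reproduced locally. This is only a sketch of the range arithmetic with hypothetical helper names (bucket_labels and bucket_for); it does not reproduce how the API labels values that fall outside those ranges.

```python
from bisect import bisect_right


def bucket_labels(boundaries):
    """Turn sorted boundaries such as [1, 10, 100, 200, 400] into range labels."""
    return [f"{low}-{high - 1}" for low, high in zip(boundaries, boundaries[1:])]


def bucket_for(value, boundaries):
    """Return the label of the range containing value, or None if it is outside all ranges."""
    index = bisect_right(boundaries, value) - 1
    if index < 0 or index >= len(boundaries) - 1:
        return None
    return bucket_labels(boundaries)[index]


boundaries = [1, 10, 100, 200, 400]
print(bucket_labels(boundaries))   # ['1-9', '10-99', '100-199', '200-399']
print(bucket_for(57, boundaries))  # 10-99
```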

We hope this provided a basic guide to accessing the Google Analytics API, making various requests, and gathering meaningful insights in an easy-to-read format.

I have added the build and request code, along with the snippets shared here, to this GitHub file.

I would love to hear if you have any plans to try any of these and explore the data further.

