Google Analytics 4 (GA4) supplies useful insights into consumer conduct throughout web sites and apps. However what if it’s essential to mix GA4 knowledge with different sources or carry out deeper evaluation? That’s the place Amazon Redshift and Amazon AppFlow are available. Amazon AppFlow bridges the hole between Google functions and Amazon Redshift, empowering organizations to unlock deeper insights and drive data-informed choices. On this publish, we present you the best way to set up the info ingestion pipeline between Google Analytics 4, Google Sheets, and an Amazon Redshift Serverless workgroup.
Amazon AppFlow is a completely managed integration service that you should use to securely switch knowledge from software program as a service (SaaS) functions, reminiscent of Google BigQuery, Salesforce, SAP, HubSpot, and ServiceNow, to Amazon Net Providers (AWS) companies reminiscent of Amazon Easy Storage Service (Amazon S3) and Amazon Redshift, in just some clicks. With Amazon AppFlow, you possibly can run knowledge flows at practically any scale and on the frequency you select—on a schedule, in response to a enterprise occasion, or on demand. You possibly can configure knowledge transformation capabilities reminiscent of filtering and validation to generate wealthy, ready-to-use knowledge as a part of the circulate itself, with out further steps. Amazon AppFlow mechanically encrypts knowledge in movement, and permits you to prohibit knowledge from flowing over the general public web for SaaS functions which can be built-in with AWS PrivateLink, decreasing publicity to safety threats.
Amazon Redshift is a quick, scalable, and absolutely managed cloud knowledge warehouse that permits you to course of and run your complicated SQL analytics workloads on structured and semi-structured knowledge. It additionally helps you securely entry your knowledge in operational databases, knowledge lakes, or third-party datasets with minimal motion or copying of information. Tens of 1000’s of shoppers use Amazon Redshift to course of giant quantities of information, modernize their knowledge analytics workloads, and supply insights for his or her enterprise customers.
Conditions
Earlier than beginning this walkthrough, it’s essential to have the next stipulations in place:
- An AWS account.
- In your Google Cloud mission, you’ve enabled the next APIs:
- Google Analytics API
- Google Analytics Admin API
- Google Analytics Knowledge API
- Google Sheets API
- Google Drive API
For extra data, consult with Amazon AppFlow help for Google Sheets.
For the steps to allow these APIs, see Allow and disable APIs on the API Console Assist for Google Cloud Platform.
Structure overview
The next structure reveals how Amazon AppFlow can rework and transfer knowledge from SaaS functions to processing and storage locations. Three sections seem from left to proper within the diagram: Supply, Transfer, Goal. These sections are described within the following part.
- Supply – The leftmost part on the diagram represents completely different functions performing as a supply, together with Google Analytics, Google Sheets, and Google BigQuery.
- Transfer – The center part is labeled Amazon AppFlow. The part accommodates bins that signify Amazon AppFlow operations reminiscent of Masks Fields, Map Fields, Merge Fields, Filter Knowledge, and others. On this publish, we deal with organising the info motion utilizing Amazon AppFlow and filtering knowledge primarily based on begin date. The opposite transformation operations reminiscent of mapping, masking, and merging fields should not coated on this publish.
- Vacation spot – The part on the correct of the diagram is labeled Vacation spot and represents targets reminiscent of Amazon Redshift and Amazon S3. On this psot, we primarily deal with Amazon Redshift because the vacation spot.
This publish has two elements. The first half covers integrating from Google Analytics. The second half focuses on connecting with Google Sheets.
Utility configuration in Google Cloud Platform
Amazon AppFlow requires OAuth 2.0 for authentication. You must create an OAuth 2.0 shopper ID, which Amazon AppFlow makes use of when requesting an OAuth 2.0 entry token. To create an OAuth 2.0 shopper ID within the Google Cloud Platform console, comply with these steps:
- On the Google Cloud Platform Console, from the initiatives checklist, choose a mission or create a brand new one.
- If the APIs & Providers web page isn’t already open, select the menu icon on the higher left and choose APIs & Providers.
- Within the navigation pane, select Credentials.
- Select CREATE CREDENTIALS, then select OAuth shopper ID, as proven within the following screenshot.

- Choose the applying sort Net utility, enter the title demo-google-aws, and supply URIs for Licensed JavaScript origins
https://console.aws.amazon.com. For Licensed redirect URIs, addhttps://us-east-1.console.aws.amazon.com/appflow/oauth. Select SAVE, as proven within the following screenshot.

- The OAuth shopper ID is now created. Choose demo-google-aws.

- Beneath Further data, as proven within the following screenshot, observe down the Consumer ID and Consumer secret.

Knowledge ingestion from Google Analytics 4 to Amazon Redshift
On this part, you configure Amazon AppFlow to arrange a connection between Google Analytics 4 and Amazon Redshift for knowledge migration. This process may be categorised into the next steps:
- Create a connection to Google Analytics 4 in Amazon AppFlow
- Create an IAM position for Amazon AppFlow integration with Amazon Redshift
- Arrange Amazon AppFlow connection for Amazon Redshift
- Arrange desk and permission in Amazon Redshift
- Create knowledge circulate in Amazon AppFlow
Create a connection to Google Analytics 4 in Amazon AppFlow
To create a connection to Google Analytics 4 in Amazon AppFlow, comply with these steps:
- Register to the AWS Administration Console and open Amazon AppFlow.
- Within the navigation pane on the left, select Connections.
- On the Handle connections web page, for Connectors, select Google Analytics 4.
- Select Create connection.
- Within the Connect with Google Analytics 4 window, enter the next data. For Consumer ID, enter the shopper ID of the OAuth 2.0 shopper ID in your Google Cloud mission created within the earlier part. For Consumer secret, enter the shopper secret of the OAuth 2.0 shopper ID in your Google Cloud mission created within the earlier part.
- (Non-obligatory) underneath Knowledge encryption, select Customise encryption settings (superior) if you wish to encrypt your knowledge with a buyer managed key in AWS Key Administration Service (AWS KMS). By default, Amazon AppFlow encrypts your knowledge with an AWS KMS key that AWS creates, makes use of, and manages for you. Select this selection if you wish to encrypt your knowledge with your personal AWS KMS key as a substitute.
The next screenshot reveals the Connect with Google Analytics 4 window.

Amazon AppFlow encrypts your knowledge throughout transit and at relaxation. For extra data, see Knowledge safety in Amazon AppFlow.
If you wish to use an AWS KMS key from the present AWS account, choose this key underneath Select an AWS KMS key. If you wish to use an AWS KMS key from a special AWS account, enter the Amazon Useful resource Identify (ARN) for that key:
- For Connection title, enter a reputation in your connection
- Select Proceed
- Within the window that seems, check in to your Google account and grant entry to Amazon AppFlow
On the Handle connections web page, your new connection seems within the Connections desk. Once you create a circulate that makes use of Google Analytics 4 as the info supply, you possibly can choose this connection.

Create an IAM position for Amazon AppFlow integration with Amazon Redshift
You need to use Amazon AppFlow to switch knowledge from supported sources into your Amazon Redshift databases. You want an IAM position as a result of Amazon AppFlow wants authorization to entry Amazon Redshift utilizing an Amazon Redshift Knowledge API.
- Register to the AWS Administration Console, ideally as admin consumer, and within the navigation pane of the IAM dashboard, select Insurance policies.
- Select Create coverage.
- Choose the JSON tab and paste within the following coverage. Amazon AppFlow wants the next permissions to realize entry and run SQL statements with the Amazon Redshift database.
- Select Subsequent, present the Coverage title as
appflow-redshift-policy, Description as appflow redshift coverage, and select Create coverage.

- Within the navigation pane, select Roles and Create position. Select Customized belief coverage and paste within the following. Select Subsequent. This belief coverage grants Amazon AppFlow the power to imagine the position for Amazon AppFlow to entry and course of knowledge.
- Seek for coverage
appflow-redshift-policy, test the field subsequent to it, and select Subsequent.

- Present the position title
appflow-redshift-access-roleand Description and select Create position.
Arrange Amazon AppFlow connection for Amazon Redshift
To arrange an Amazon AppFlow connection for Amazon Redshift, comply with these steps:
- On the Amazon AppFlow console, within the navigation pane, select Connectors, choose Amazon Redshift, and select Create connection.

- Enter the connection title appflow-redshift-connection. You possibly can both use Amazon Redshift provisioned or Amazon Redshift Serverless, however on this instance we’re utilizing Amazon Redshift Serverless. Choose Amazon Redshift Serverless and enter the workgroup title and database title.
- Select the S3 bucket and enter the bucket prefix.

- For Amazon S3 entry, choose the IAM position hooked up to the Redshift cluster or namespace in the course of the creation of the Redshift cluster. Moreover, for the Amazon Redshift Knowledge API, select the IAM position
appflow-redshift-access-rolecreated within the earlier part after which select

Arrange a desk and permission in Amazon Redshift
To arrange desk and permission in Amazon Redshift, comply with these steps:
- On the Amazon Redshift console, select Question editor v2 in Explorer.
- Connect with your current Redshift cluster or Amazon Redshift Serverless workgroup.
- Create a desk with the next Knowledge Definition Language (DDL).
The next screenshot reveals the profitable creation of this desk in Amazon Redshift:

The next step is barely relevant to Amazon Redshift Serverless. If you’re utilizing a Redshift provisioned cluster, you possibly can skip this step.
- Grant the permissions on the desk to the IAM consumer utilized by Amazon AppFlow to load knowledge into Amazon Redshift Serverless, for instance,
appflow-redshift-access-role.
Create knowledge circulate in Amazon AppFlow
To create an information circulate in Amazon AppFlow, comply with these steps:
- On the Amazon AppFlow console, select Flows and choose Amazon Redshift. Select Create circulate and enter the circulate title and the circulate description, as proven within the following screenshot.

- In Supply title, select Google Analytics 4. Select the Google Analytics 4 connection.
- Choose the Google Analytics 4 object, then select Amazon Redshift because the vacation spot, deciding on the
publicschema andstg_ga4_daily_summarydesk in your Redshift occasion.

- For Circulation set off, select Run on demand and select Subsequent, as proven within the following screenshot.
You possibly can run the circulate on schedule to drag both full or incremental knowledge refresh. For extra data, see Schedule-triggered flows.

- Choose Manually map fields. From the Supply subject title dropdown menu, choose the attribute
date, and from the Vacation spot subject title, chooseevent_dateand select Map fields, as proven within the following screenshot.

- Repeat the earlier step (step 5) for the next attributes after which select Subsequent. The next screenshot reveals the mapping.
The Google Analytics API supplies varied dimensions and metrics for reporting functions. Consult with API Dimensions & Metrics for particulars.

- In Area title, enter the filter
start_end_dateand select Subsequent, as proven within the following screenshot. The Amazon AppFlow date filter helps each a begin date (criteria1) and an finish date (criteria2) to outline the specified date vary for knowledge switch. We’re utilizing the date vary as a result of now we have pattern knowledge created for this vary.

- Evaluation the configurations and select Create circulate.
- Select Run circulate, as proven within the following screenshot, and look ahead to the circulate execution to be accomplished.

- On the Amazon Redshift console, select Question editor v2 in Explorer.
- Connect with your current Redshift cluster or Amazon Redshift Serverless workgroup.
- Enter the next SQL to confirm the info in Amazon Redshift.
The screenshot under reveals the outcomes loaded into the stg_ga4_daily_summary desk.

Knowledge ingestion from Google Sheets to Amazon Redshift
Ingesting knowledge from Google Sheets to Amazon Redshift utilizing Amazon AppFlow streamlines analytics, enabling seamless switch and deeper insights. On this part, we exhibit how enterprise customers can keep their enterprise glossary in Google Sheets and combine that utilizing Amazon AppFlow with Amazon Redshift and get significant insights.
For this demo, you possibly can add the Nation Market phase file to your Google sheet earlier than continuing to the following steps. These steps present the best way to configure Amazon AppFlow to arrange a connection between Google Sheets and Amazon Redshift for knowledge migration. This process may be categorised into the next steps:
- Create Google Sheets connection in Amazon AppFlow
- Arrange desk and permission in Amazon Redshift
- Create knowledge circulate in Amazon AppFlow
Create Google Sheets connection in Amazon AppFlow
To create a Google Sheets connection in Amazon AppFlow, comply with these steps:
- On the Amazon AppFlow console, select Connectors, choose Google Sheets, then select Create connection.
- Within the Connect with Google Sheets window, enter the next data. For Consumer ID, enter the shopper ID of the OAuth 2.0 shopper ID in your Google Sheets mission. For Consumer secret, enter the shopper secret of the OAuth 2.0 shopper ID in your Google Sheets mission.
- For Connection title, enter a reputation in your connection.
- (Non-obligatory) Beneath Knowledge encryption, select Customise encryption settings (superior) if you wish to encrypt your knowledge with a buyer managed key in AWS KMS. By default, Amazon AppFlow encrypts your knowledge with an AWS KMS key that AWS creates, makes use of, and manages for you. Select this selection if you wish to encrypt your knowledge with your personal AWS KMS key as a substitute.
- Select Join.
- Within the window that seems, check in to your Google account and grant entry to Amazon AppFlow.
Arrange desk and permission in Amazon Redshift
To arrange a desk and permission in Amazon Redshift, comply with these steps:
- On the Amazon Redshift console, select Question editor v2 in Explorer
- Connect with your current Redshift cluster or Amazon Redshift Serverless workgroup
- Create a desk with the next DDL
he following steps are solely relevant to Amazon Redshift Serverless. If you’re utilizing a Redshift provisioned cluster, you possibly can skip this step.
- Grant the permissions on the desk to the IAM consumer utilized by Amazon AppFlow to load knowledge into Amazon Redshift Serverless, for instance,
appflow-redshift-access-role
Create knowledge circulate in Amazon AppFlow
- On the Amazon AppFlow console, select Flows and choose Google Sheets. Select Create circulate, enter the circulate title and circulate description, and select Subsequent.
- Choose Google Sheets in Supply title and select the Google Sheets connection.
- Choose the Google Sheets object
nation_market_segment#Sheet1. - Select the Vacation spot title as Amazon Redshift, then choose
stg_nation_market_segmentas your Amazon Redshift object, as proven within the following screenshot.

- For Circulation set off, choose On demand and select Subsequent.
You possibly can run the circulate on schedule to drag full or incremental knowledge refresh. Learn extra at Schedule-triggered flows.
- Choose Manually map fields. From the Supply subject title dropdown menu, choose Map all fields instantly. When a dialog field pops up, select the respective attribute values and select Map fields, as proven within the following screenshot. Select Subsequent.

The next screenshot reveals the mapping.

- On the Add Filters web page, select Subsequent.
- On the Evaluation and create web page, select Create circulate.
- Select Run circulate and look ahead to the circulate execution to complete.
The screenshot under reveals the execution particulars of the circulate job.

- On the Amazon Redshift console, select Question editor v2 in Explorer.
- Connect with your current Redshift cluster or Amazon Redshift Serverless workgroup.
- Run the next SQL to confirm the info in Amazon Redshift.
The screenshot under reveals the outcomes loaded into the stg_nation_market_segment desk.

- Run the next SQL to organize a pattern dataset in Amazon Redshift.
- Run the next SQL to do the info analytics utilizing Google Sheets enterprise knowledge classification within the Amazon Redshift dataset.
The screenshot under reveals the outcomes from the aggregated question in Amazon Redshift from knowledge loaded utilizing Amazon Appflow.

Clear up
To keep away from incurring costs, clear up the assets in your AWS account by finishing the next steps:
- On the Amazon AppFlow console, within the navigation pane, select Flows.
- From the checklist of flows, choose the circulate title created and delete it.
- Enter “delete” to delete the circulate.
- Delete the Amazon Redshift workgroup.
- Clear up assets in your Google account by deleting the mission that accommodates the Google BigQuery assets. Observe the documentation to clear up the Google assets.
Conclusion
On this publish, we walked you thru the method of utilizing Amazon AppFlow to combine knowledge from Google Advertisements and Google Sheets. We demonstrated how the complexities of information integration are minimized so you possibly can deal with deriving actionable insights out of your knowledge. Whether or not you’re archiving historic knowledge, performing complicated analytics, or making ready knowledge for machine studying, this connector streamlines the method, making it accessible to a broader vary of information professionals.
For extra data, consult with Amazon AppFlow help for Google Sheets and Google Advertisements.
Concerning the authors
Ritesh Kumar Sinha is an Analytics Specialist Options Architect primarily based out of San Francisco. He has helped prospects construct scalable knowledge warehousing and large knowledge options for over 16 years. He likes to design and construct environment friendly end-to-end options on AWS. In his spare time, he loves studying, strolling, and doing yoga.
Tahir Aziz is an Analytics Answer Architect at AWS. He has labored with constructing knowledge warehouses and large knowledge options for over 13 years. He loves to assist prospects design end-to-end analytics options on AWS. Outdoors of labor, he enjoys touring and cooking.
Raza Hafeez is a Senior Product Supervisor at Amazon Redshift. He has over 13 years {of professional} expertise constructing and optimizing enterprise knowledge warehouses and is keen about enabling prospects to appreciate the ability of their knowledge. He makes a speciality of migrating enterprise knowledge warehouses to AWS Trendy Knowledge Structure.
Amit Ghodke is an Analytics Specialist Options Architect primarily based out of Austin. He has labored with databases, knowledge warehouses and analytical functions for the previous 16 years. He loves to assist prospects implement analytical options at scale to derive most enterprise worth.
