BigQuery Integration: Storing Your Event Data

BigQuery is your data warehouse for all server-side tracking events. Here's how to set it up and use it effectively.

Why BigQuery?

Benefits

  • Petabyte-scale storage with no infrastructure to manage
  • Fast SQL queries over large datasets
  • Advanced analytics (window functions, BigQuery ML)
  • Integration with BI tools and the wider Google Cloud ecosystem
  • Pay-as-you-go pricing that scales with usage

Use Cases

  • Store all events for analysis
  • Create custom reports
  • Build data models
  • Integrate with BI tools
  • Long-term data retention

BigQuery Setup

Step 1: Create Dataset

CREATE SCHEMA IF NOT EXISTS `your-project.tracking_data`
OPTIONS(
  description = "Server-side tracking event data",
  location = "EU"  -- or "US", based on your needs
);

Step 2: Create Event Table

CREATE TABLE IF NOT EXISTS `your-project.tracking_data.events` (
  event_name      STRING NOT NULL,
  event_timestamp TIMESTAMP NOT NULL,
  user_id         STRING,
  session_id      STRING,
  page_url        STRING,
  page_title      STRING,
  referrer        STRING,
  utm_source      STRING,
  utm_medium      STRING,
  utm_campaign    STRING,
  utm_term        STRING,
  utm_content     STRING,
  device_type     STRING,
  browser         STRING,
  os              STRING,
  country         STRING,
  city            STRING,
  event_data      JSON,
  created_at      TIMESTAMP DEFAULT CURRENT_TIMESTAMP()
)
PARTITION BY DATE(event_timestamp)
CLUSTER BY event_name, user_id;
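
To see how the columns line up, here is a hand-written row (values are illustrative); in production, rows arrive via streaming inserts rather than DML:

INSERT INTO `your-project.tracking_data.events`
  (event_name, event_timestamp, user_id, session_id, page_url,
   utm_source, utm_medium, device_type, event_data)
VALUES
  ('purchase', CURRENT_TIMESTAMP(), 'anon-123', 'sess-456',
   'https://example.com/checkout',
   'google', 'cpc', 'mobile',
   -- JSON literal: event-specific fields that don't warrant their own column
   JSON '{"transaction_id": "T-1001", "value": 49.90}');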

Step 3: Configure GTM to Stream to BigQuery

  1. In your server-side GTM container, add a tag that writes to BigQuery
  2. Grant the container's service account write access to the dataset (e.g., the BigQuery Data Editor role)
  3. Set up streaming inserts so events arrive in near real time
  4. Send test events and verify rows appear in the table

Data Schema Design

Core Fields

  • event_name: Type of event (page_view, form_submit, purchase)
  • event_timestamp: When the event occurred (store in UTC)
  • user_id: Anonymous user identifier
  • session_id: Session identifier
  • page_url: URL where the event occurred
  • utm_*: Campaign tracking parameters (source, medium, campaign, term, content)

Extended Fields

  • event_data: JSON payload with event-specific data
  • device_type / browser / os: device information
  • country / city: coarse geolocation
  • Custom dimensions: business-specific values, stored inside event_data
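
Values stored in event_data can be pulled out at query time with BigQuery's JSON functions; for example (the field names inside the JSON are illustrative):

SELECT
  event_name,
  JSON_VALUE(event_data, '$.transaction_id') AS transaction_id,
  CAST(JSON_VALUE(event_data, '$.value') AS FLOAT64) AS purchase_value
FROM `your-project.tracking_data.events`
WHERE event_name = 'purchase'
  AND DATE(event_timestamp) = CURRENT_DATE();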

Querying BigQuery Data

Example Queries

1. Daily Event Count:

SELECT
  DATE(event_timestamp) AS date,
  event_name,
  COUNT(*) AS event_count
FROM `your-project.tracking_data.events`
WHERE event_timestamp >= TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL 30 DAY)
GROUP BY date, event_name
ORDER BY date DESC, event_count DESC;

2. Conversion Funnel:

SELECT
  COUNT(DISTINCT CASE WHEN event_name = 'page_view' THEN user_id END) AS visitors,
  COUNT(DISTINCT CASE WHEN event_name = 'form_submit' THEN user_id END) AS form_submits,
  COUNT(DISTINCT CASE WHEN event_name = 'purchase' THEN user_id END) AS purchases
FROM `your-project.tracking_data.events`
WHERE event_timestamp >= TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL 7 DAY);

3. Attribution Analysis:

SELECT
  utm_source,
  utm_medium,
  utm_campaign,
  COUNT(DISTINCT user_id) AS users,
  COUNT(DISTINCT CASE WHEN event_name = 'purchase' THEN user_id END) AS conversions
FROM `your-project.tracking_data.events`
WHERE event_timestamp >= TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL 30 DAY)
GROUP BY utm_source, utm_medium, utm_campaign
ORDER BY conversions DESC;

Data Retention & Partitioning

Partitioning Strategy

  • Partition by day on event_timestamp
  • Queries that filter on the partition column scan only the partitions they need
  • Partition-level expiration handles old data automatically
  • Less data scanned means lower query costs

Data Retention

  • Set a retention policy (e.g., 2 years)
  • Archive older data to Cloud Storage if you need it beyond that
  • Align retention with GDPR and your privacy policy
  • Shorter retention also reduces storage costs
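
Assuming the partitioned table created above, one way to implement retention is partition expiration, which drops daily partitions once they pass a given age:

-- Keep roughly two years of data; older partitions are deleted automatically.
ALTER TABLE `your-project.tracking_data.events`
SET OPTIONS (partition_expiration_days = 730);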

Cost Optimization

BigQuery Pricing

  • Storage: roughly $0.02/GB/month for active storage
  • Queries (on-demand): billed per TB scanned (historically around $5/TB)
  • Streaming inserts: roughly $0.01 per 200 MB
  • Prices vary by region and change over time; confirm on the current BigQuery pricing page

Cost Optimization Tips

  1. Use partitioning (reduces scan costs)
  2. Use clustering (improves query performance)
  3. Limit data retention
  4. Optimize queries
  5. Use scheduled queries for reports
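
Tips 1 and 4 combine in practice: filtering on the partition column lets BigQuery skip partitions entirely, and selecting only the columns you need avoids paying for the full row width. Compare:

-- Expensive: scans every partition and every column.
-- SELECT * FROM `your-project.tracking_data.events`;

-- Cheaper: prunes to 7 daily partitions and reads two columns.
SELECT event_name, user_id
FROM `your-project.tracking_data.events`
WHERE DATE(event_timestamp) >= DATE_SUB(CURRENT_DATE(), INTERVAL 7 DAY);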

Next Step: Learn about Platform Integrations to connect with GA4, Meta, and LinkedIn.