Setting Up Analytics with BigQuery and Looker Studio

6 min read
Python · GCP · BigQuery · Looker Studio

Intro

After running my Gemini-powered Slack bot for a while, I wanted to understand how it was being used: who was asking it questions, what they asked, how long responses took, and how often they failed.

Cloud Run logs are great for debugging, but not for analytics. I needed structured data I could query and visualize.

Here's how I set up a complete analytics pipeline: Structured Logging → BigQuery → Looker Studio.

1. Adding Structured Logging

Cloud Logging can automatically parse JSON printed to stdout as structured jsonPayload. This makes it queryable in BigQuery.

The Analytics Module

I created a reusable analytics module:

# app/utils/analytics.py
import json
import sys
import time
from contextlib import contextmanager
from typing import Optional


def _log_json(data: dict) -> None:
    """Output structured JSON for Cloud Logging."""
    log_entry = {
        "severity": "INFO",
        "message": "bot_analytics",
        **data,
    }
    print(json.dumps(log_entry, ensure_ascii=False), file=sys.stdout, flush=True)


@contextmanager
def track_response(event_type: str, user_id: str, channel: str, question: str):
    """Context manager to track bot response timing."""
    start_time = time.time()
    tracker = ResponseTracker(event_type, user_id, channel, question, start_time)
    try:
        yield tracker
        tracker.log_success()
    except Exception as e:
        tracker.log_error(str(e))
        raise


class ResponseTracker:
    def __init__(self, event_type, user_id, channel, question, start_time):
        self.event_type = event_type
        self.user_id = user_id
        self.channel = channel
        self.question = question[:500]
        self.start_time = start_time
        self.response: Optional[str] = None

    def set_response(self, response: str) -> None:
        self.response = response[:500]

    def _elapsed_ms(self) -> int:
        return int((time.time() - self.start_time) * 1000)

    def log_success(self) -> None:
        _log_json({
            "event_type": self.event_type,
            "user_id": self.user_id,
            "channel": self.channel,
            "question": self.question,
            "response": self.response,
            "response_time_ms": self._elapsed_ms(),
            "success": True,
        })

    def log_error(self, error: str) -> None:
        _log_json({
            "event_type": self.event_type,
            "user_id": self.user_id,
            "channel": self.channel,
            "question": self.question,
            "response_time_ms": self._elapsed_ms(),
            "success": False,
            "error": error[:200],
        })

Using It in Handlers

The context manager makes logging clean and automatic:

from app.utils.analytics import track_response

@app.event("app_mention")
def handle_mention(event, say, logger):
    user = event.get("user")
    channel = event.get("channel")
    text = event.get("text", "")

    try:
        with track_response("mention_response", user, channel, text) as tracker:
            response = gemini.generate_response_sync(user, text)
            tracker.set_response(response)
            say(text=response)
    except Exception as e:
        logger.error(f"Error: {e}")
        say(text=ERROR_MESSAGE)
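If the bot also answers direct messages, the same tracker can wrap that handler with a different event_type, so the two paths stay separable in BigQuery later. A hypothetical sketch (the DM handler and the dm_response label are not part of the original bot):

@app.event("message")
def handle_dm(event, say, logger):
    # Only handle DMs; channel mentions are covered by handle_mention above.
    # Also skip messages posted by bots to avoid reply loops.
    if event.get("channel_type") != "im" or event.get("bot_id"):
        return

    user = event.get("user")
    channel = event.get("channel")
    text = event.get("text", "")

    try:
        with track_response("dm_response", user, channel, text) as tracker:
            response = gemini.generate_response_sync(user, text)
            tracker.set_response(response)
            say(text=response)
    except Exception as e:
        logger.error(f"Error: {e}")
        say(text=ERROR_MESSAGE)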

The context manager automatically:

    • records the start time and computes response_time_ms when the block exits
    • logs a success entry on normal completion (including the response, if set_response was called)
    • logs an error entry and re-raises if the block throws an exception

What Gets Logged

Each request produces a structured log entry:

{
  "severity": "INFO",
  "message": "bot_analytics",
  "event_type": "mention_response",
  "user_id": "U1234567890",
  "channel": "C0123456789",
  "question": "What is the release checklist?",
  "response": "Here's the release checklist...",
  "response_time_ms": 4523,
  "success": true
}
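You can sanity-check the format locally before deploying. This throwaway snippet (assuming the module lives at app/utils/analytics.py as above) prints one JSON line shaped like the entry above:

# Run the tracker once and eyeball the JSON it writes to stdout.
from app.utils.analytics import track_response

with track_response("mention_response", "U_TEST", "C_TEST", "ping?") as tracker:
    tracker.set_response("pong")
# Expect a single line with success=true and a small response_time_ms.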

2. Setting Up BigQuery

Create a Dataset

In the GCP Console:

  1. Go to BigQuery
  2. Click your project → Create dataset
  3. Settings:
    • Dataset ID: slack_bot_logs
    • Location: asia-northeast1 (same region as Cloud Run)
    • Default table expiration: Leave blank or set to 90 days

Or via CLI:

bq mk --dataset --location=asia-northeast1 YOUR_PROJECT:slack_bot_logs
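If you'd rather create the dataset from a setup script, the google-cloud-bigquery client can do the same thing; a minimal sketch, assuming Application Default Credentials and the same IDs as above:

from google.cloud import bigquery

client = bigquery.Client(project="YOUR_PROJECT")

dataset = bigquery.Dataset("YOUR_PROJECT.slack_bot_logs")
dataset.location = "asia-northeast1"  # keep it in the same region as Cloud Run

# exists_ok=True makes the script safe to re-run
client.create_dataset(dataset, exists_ok=True)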

Create a Log Sink

A log sink routes matching logs from Cloud Logging to BigQuery.

  1. Go to Logging → Log Router
  2. Click Create Sink
  3. Configure:
    • Sink name: your-bot-analytics
    • Sink destination: BigQuery dataset → select your dataset
    • Inclusion filter:
    resource.type="cloud_run_revision"
    resource.labels.service_name="your-service"
    jsonPayload.message="bot_analytics"
  4. Click Create Sink
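If you prefer to script this step too, the google-cloud-logging client can create the same sink; a sketch under the assumption that the filter and dataset match the values above:

from google.cloud import logging

client = logging.Client(project="YOUR_PROJECT")

log_filter = (
    'resource.type="cloud_run_revision" '
    'AND resource.labels.service_name="your-service" '
    'AND jsonPayload.message="bot_analytics"'
)
destination = "bigquery.googleapis.com/projects/YOUR_PROJECT/datasets/slack_bot_logs"

sink = client.sink("your-bot-analytics", filter_=log_filter, destination=destination)
if not sink.exists():
    sink.create()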

Grant Permissions

The sink creates a service account that needs write access to BigQuery.

Find the service account:

gcloud logging sinks describe your-bot-analytics --format="value(writerIdentity)"
# Output: serviceAccount:service-XXXX@gcp-sa-logging.iam.gserviceaccount.com

Grant it access. The simplest route is the Console: BigQuery → your dataset → Sharing → Permissions → add the writer identity with the BigQuery Data Editor role. From the command line you can grant the role at the project level instead (broader than a dataset-level grant, but fine for a single-project setup):

gcloud projects add-iam-policy-binding YOUR_PROJECT \
  --member="serviceAccount:service-XXXX@gcp-sa-logging.iam.gserviceaccount.com" \
  --role="roles/bigquery.dataEditor"

Verify Data Is Flowing

After sending a few messages to your bot, check BigQuery:

SELECT
  timestamp,
  jsonPayload.event_type,
  jsonPayload.user_id,
  jsonPayload.question,
  jsonPayload.response_time_ms,
  jsonPayload.success
FROM `YOUR_PROJECT.slack_bot_logs.run_googleapis_com_stdout_*`
WHERE jsonPayload.message = "bot_analytics"
ORDER BY timestamp DESC
LIMIT 10

Tables are auto-created based on log structure. It may take 5-10 minutes for the first data to appear.
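If you'd rather check from code than the console, the same query runs fine through the google-cloud-bigquery client; a minimal sketch, assuming the default stdout export table name used above:

from google.cloud import bigquery

client = bigquery.Client(project="YOUR_PROJECT")

sql = """
SELECT timestamp, jsonPayload.question, jsonPayload.response_time_ms
FROM `YOUR_PROJECT.slack_bot_logs.run_googleapis_com_stdout_*`
WHERE jsonPayload.message = "bot_analytics"
ORDER BY timestamp DESC
LIMIT 10
"""

# Print the most recent entries; an empty result means the sink isn't flowing yet.
for row in client.query(sql).result():
    print(row.timestamp, row.question, row.response_time_ms)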

3. Creating Looker Studio Dashboards

Connect to BigQuery

  1. Go to lookerstudio.google.com
  2. Click Create → Report
  3. Click Add data → BigQuery
  4. Navigate to your dataset and select the table
  5. Click Add

Useful Calculated Fields

Create these calculated fields for better analysis:

Response Time Bucket (for histograms):

CASE
  WHEN jsonPayload.response_time_ms < 3000 THEN "0-3s"
  WHEN jsonPayload.response_time_ms < 5000 THEN "3-5s"
  WHEN jsonPayload.response_time_ms < 10000 THEN "5-10s"
  WHEN jsonPayload.response_time_ms < 15000 THEN "10-15s"
  ELSE "15s+"
END

Timestamp (JST) (convert from UTC):

DATETIME_ADD(timestamp, INTERVAL 9 HOUR)

Hour of Day:

HOUR(timestamp)

Suggested Charts

1. Scorecard - Total Requests

2. Scorecard - Average Response Time

3. Time Series - Requests Over Time

4. Bar Chart - Response Time Distribution

5. Table - Recent Queries

6. Pie Chart - Success vs Error

Add a Date Range Control

  1. Insert → Date range control
  2. Place it at the top of your report
  3. Set default to "Last 7 days"

All charts automatically filter when you change the date range.

Timezone Considerations

All timestamps are stored in UTC. Looker Studio may auto-convert to your local timezone, but for consistency:

  1. Use the JST calculated field for display
  2. Or go to Resource → Manage added data sources → Edit to check timezone settings

Cost Considerations

BigQuery: the free tier includes 10 GB of active storage and 1 TB of query processing per month, which a small bot's logs won't come close to.

Looker Studio: free to use and share.

For a low-traffic bot, expect less than $1/month.
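As a rough, hypothetical sizing example: 100 requests a day at roughly 1 KB of structured log data each is about 3 MB per month, and the dashboard only queries that one table, so both storage and query volume stay comfortably inside the free tier.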

Example Queries

Average response time by day:

SELECT
  DATE(timestamp) as date,
  COUNT(*) as requests,
  AVG(jsonPayload.response_time_ms) as avg_response_ms
FROM `YOUR_PROJECT.slack_bot_logs.run_googleapis_com_stdout_*`
WHERE jsonPayload.message = "bot_analytics"
GROUP BY date
ORDER BY date DESC

Most active users:

SELECT
  jsonPayload.user_id,
  COUNT(*) as question_count
FROM `YOUR_PROJECT.slack_bot_logs.run_googleapis_com_stdout_*`
WHERE jsonPayload.message = "bot_analytics"
GROUP BY jsonPayload.user_id
ORDER BY question_count DESC
LIMIT 10

Error rate:

SELECT
  COUNTIF(jsonPayload.success = false) as errors,
  COUNT(*) as total,
  ROUND(COUNTIF(jsonPayload.success = false) / COUNT(*) * 100, 2) as error_rate_pct
FROM `YOUR_PROJECT.slack_bot_logs.run_googleapis_com_stdout_*`
WHERE jsonPayload.message = "bot_analytics"

Wrapping Up

Setting up analytics took about 30 minutes:

  1. Structured logging - Print JSON to stdout with a message field for filtering
  2. BigQuery dataset + Log sink - Route logs automatically
  3. Looker Studio - Connect and build dashboards

Now I can see at a glance how many questions the bot handles, how fast Gemini responds, who the most active users are, and whether errors are creeping up.

The best part: it's almost free and fully automated. Every bot response is logged, streamed to BigQuery, and ready to query.

Quick Reference

Log sink filter:

resource.type="cloud_run_revision"
resource.labels.service_name="YOUR_SERVICE"
jsonPayload.message="bot_analytics"

Basic BigQuery query:

SELECT * FROM `PROJECT.DATASET.run_googleapis_com_stdout_*`
WHERE jsonPayload.message = "bot_analytics"
ORDER BY timestamp DESC
LIMIT 100

Structured log format:

print(json.dumps({
    "severity": "INFO",
    "message": "bot_analytics",
    "your_field": "your_value",
}, ensure_ascii=False), flush=True)
