Setting Up Analytics with BigQuery and Looker Studio

6 min read
Python · GCP · BigQuery · Looker Studio

Intro

After running my Gemini-powered Slack bot for a while, I wanted to understand how it was being used: who was asking it questions, what they asked, how long responses took, and how often they failed.

Cloud Run logs are great for debugging, but not for analytics. I needed structured data I could query and visualize.

Here's how I set up a complete analytics pipeline: Structured Logging → BigQuery → Looker Studio.

1. Adding Structured Logging

Cloud Logging can automatically parse JSON printed to stdout as structured jsonPayload. This makes it queryable in BigQuery.

The Analytics Module

I created a reusable analytics module:

# app/utils/analytics.py
import json
import sys
import time
from contextlib import contextmanager
from typing import Optional


def _log_json(data: dict) -> None:
    """Output structured JSON for Cloud Logging."""
    log_entry = {
        "severity": "INFO",
        "message": "bot_analytics",
        **data,
    }
    print(json.dumps(log_entry, ensure_ascii=False), file=sys.stdout, flush=True)


@contextmanager
def track_response(event_type: str, user_id: str, channel: str, question: str):
    """Context manager to track bot response timing."""
    start_time = time.time()
    tracker = ResponseTracker(event_type, user_id, channel, question, start_time)
    try:
        yield tracker
        tracker.log_success()
    except Exception as e:
        tracker.log_error(str(e))
        raise


class ResponseTracker:
    def __init__(self, event_type, user_id, channel, question, start_time):
        self.event_type = event_type
        self.user_id = user_id
        self.channel = channel
        self.question = question[:500]
        self.start_time = start_time
        self.response: Optional[str] = None

    def set_response(self, response: str) -> None:
        self.response = response[:500]

    def _elapsed_ms(self) -> int:
        return int((time.time() - self.start_time) * 1000)

    def log_success(self) -> None:
        _log_json({
            "event_type": self.event_type,
            "user_id": self.user_id,
            "channel": self.channel,
            "question": self.question,
            "response": self.response,
            "response_time_ms": self._elapsed_ms(),
            "success": True,
        })

    def log_error(self, error: str) -> None:
        _log_json({
            "event_type": self.event_type,
            "user_id": self.user_id,
            "channel": self.channel,
            "question": self.question,
            "response_time_ms": self._elapsed_ms(),
            "success": False,
            "error": error[:200],
        })

Using It in Handlers

The context manager makes logging clean and automatic:

from app.utils.analytics import track_response

@app.event("app_mention")
def handle_mention(event, say, logger):
    user = event.get("user")
    channel = event.get("channel")
    text = event.get("text", "")

    try:
        with track_response("mention_response", user, channel, text) as tracker:
            response = gemini.generate_response_sync(user, text)
            tracker.set_response(response)
            say(text=response)
    except Exception as e:
        logger.error(f"Error: {e}")
        say(text=ERROR_MESSAGE)
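If the bot also answers direct messages, the same tracker can wrap that handler with a different event_type, so the two paths stay separable in BigQuery later. A hypothetical sketch (the DM handler and the dm_response label are not part of the original bot):

@app.event("message")
def handle_dm(event, say, logger):
    # Only handle DMs; channel mentions are covered by handle_mention above.
    # Also skip messages posted by bots to avoid reply loops.
    if event.get("channel_type") != "im" or event.get("bot_id"):
        return

    user = event.get("user")
    channel = event.get("channel")
    text = event.get("text", "")

    try:
        with track_response("dm_response", user, channel, text) as tracker:
            response = gemini.generate_response_sync(user, text)
            tracker.set_response(response)
            say(text=response)
    except Exception as e:
        logger.error(f"Error: {e}")
        say(text=ERROR_MESSAGE)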

The context manager automatically:

    • records the start time and computes response_time_ms when the block exits
    • logs a success entry on normal completion (including the response, if set_response was called)
    • logs an error entry and re-raises if the block throws an exception

What Gets Logged

Each request produces a structured log entry:

{
  "severity": "INFO",
  "message": "bot_analytics",
  "event_type": "mention_response",
  "user_id": "U1234567890",
  "channel": "C0123456789",
  "question": "What is the release checklist?",
  "response": "Here's the release checklist...",
  "response_time_ms": 4523,
  "success": true
}
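You can sanity-check the format locally before deploying. This throwaway snippet (assuming the module lives at app/utils/analytics.py as above) prints one JSON line shaped like the entry above:

# Run the tracker once and eyeball the JSON it writes to stdout.
from app.utils.analytics import track_response

with track_response("mention_response", "U_TEST", "C_TEST", "ping?") as tracker:
    tracker.set_response("pong")
# Expect a single line with success=true and a small response_time_ms.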

2. Setting Up BigQuery

Create a Dataset

In the GCP Console:

  1. Go to BigQuery
  2. Click your project → Create dataset
  3. Settings:
    • Dataset ID: slack_bot_logs
    • Location: asia-northeast1 (same region as Cloud Run)
    • Default table expiration: Leave blank or set to 90 days

Or via CLI:

bq mk --dataset --location=asia-northeast1 YOUR_PROJECT:slack_bot_logs
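If you'd rather create the dataset from a setup script, the google-cloud-bigquery client can do the same thing; a minimal sketch, assuming Application Default Credentials and the same IDs as above:

from google.cloud import bigquery

client = bigquery.Client(project="YOUR_PROJECT")

dataset = bigquery.Dataset("YOUR_PROJECT.slack_bot_logs")
dataset.location = "asia-northeast1"  # keep it in the same region as Cloud Run

# exists_ok=True makes the script safe to re-run
client.create_dataset(dataset, exists_ok=True)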

Create a Log Sink

A log sink routes matching logs from Cloud Logging to BigQuery.

  1. Go to Logging → Log Router
  2. Click Create Sink
  3. Configure:
    • Sink name: your-bot-analytics
    • Sink destination: BigQuery dataset → select your dataset
    • Inclusion filter:
    resource.type="cloud_run_revision"
    resource.labels.service_name="your-service"
    jsonPayload.message="bot_analytics"
  4. Click Create Sink
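If you prefer to script this step too, the google-cloud-logging client can create the same sink; a sketch under the assumption that the filter and dataset match the values above:

from google.cloud import logging

client = logging.Client(project="YOUR_PROJECT")

log_filter = (
    'resource.type="cloud_run_revision" '
    'AND resource.labels.service_name="your-service" '
    'AND jsonPayload.message="bot_analytics"'
)
destination = "bigquery.googleapis.com/projects/YOUR_PROJECT/datasets/slack_bot_logs"

sink = client.sink("your-bot-analytics", filter_=log_filter, destination=destination)
if not sink.exists():
    sink.create()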

Grant Permissions

The sink creates a service account that needs write access to BigQuery.

Find the service account:

gcloud logging sinks describe your-bot-analytics --format="value(writerIdentity)"
# Output: serviceAccount:service-XXXX@gcp-sa-logging.iam.gserviceaccount.com

Grant it access. The simplest route is the Console: BigQuery → your dataset → Sharing → Permissions → add the writer identity with the BigQuery Data Editor role. From the command line you can grant the role at the project level instead (broader than a dataset-level grant, but fine for a single-project setup):

gcloud projects add-iam-policy-binding YOUR_PROJECT \
  --member="serviceAccount:service-XXXX@gcp-sa-logging.iam.gserviceaccount.com" \
  --role="roles/bigquery.dataEditor"

Verify Data Is Flowing

After sending a few messages to your bot, check BigQuery:

SELECT
  timestamp,
  jsonPayload.event_type,
  jsonPayload.user_id,
  jsonPayload.question,
  jsonPayload.response_time_ms,
  jsonPayload.success
FROM `YOUR_PROJECT.slack_bot_logs.run_googleapis_com_stdout_*`
WHERE jsonPayload.message = "bot_analytics"
ORDER BY timestamp DESC
LIMIT 10

Tables are auto-created based on log structure. It may take 5-10 minutes for the first data to appear.
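If you'd rather check from code than the console, the same query runs fine through the google-cloud-bigquery client; a minimal sketch, assuming the default stdout export table name used above:

from google.cloud import bigquery

client = bigquery.Client(project="YOUR_PROJECT")

sql = """
SELECT timestamp, jsonPayload.question, jsonPayload.response_time_ms
FROM `YOUR_PROJECT.slack_bot_logs.run_googleapis_com_stdout_*`
WHERE jsonPayload.message = "bot_analytics"
ORDER BY timestamp DESC
LIMIT 10
"""

# Print the most recent entries; an empty result means the sink isn't flowing yet.
for row in client.query(sql).result():
    print(row.timestamp, row.question, row.response_time_ms)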

3. Creating Looker Studio Dashboards

Connect to BigQuery

  1. Go to lookerstudio.google.com
  2. Click Create → Report
  3. Click Add data → BigQuery
  4. Navigate to your dataset and select the table
  5. Click Add

Useful Calculated Fields

Create these calculated fields for better analysis:

Response Time Bucket (for histograms):

CASE
  WHEN jsonPayload.response_time_ms < 3000 THEN "0-3s"
  WHEN jsonPayload.response_time_ms < 5000 THEN "3-5s"
  WHEN jsonPayload.response_time_ms < 10000 THEN "5-10s"
  WHEN jsonPayload.response_time_ms < 15000 THEN "10-15s"
  ELSE "15s+"
END

Timestamp (JST) (convert from UTC):

DATETIME_ADD(timestamp, INTERVAL 9 HOUR)

Hour of Day:

HOUR(timestamp)

Suggested Charts

1. Scorecard - Total Requests

2. Scorecard - Average Response Time

3. Time Series - Requests Over Time

4. Bar Chart - Response Time Distribution

5. Table - Recent Queries

6. Pie Chart - Success vs Error

Add a Date Range Control

  1. Insert → Date range control
  2. Place it at the top of your report
  3. Set default to "Last 7 days"

All charts automatically filter when you change the date range.

Timezone Considerations

All timestamps are stored in UTC. Looker Studio may auto-convert to your local timezone, but for consistency:

  1. Use the JST calculated field for display
  2. Or go to Resource → Manage added data sources → Edit to check timezone settings

Cost Considerations

BigQuery: the free tier includes 10 GB of active storage and 1 TB of query processing per month, which a small bot's logs won't come close to.

Looker Studio: free to use and share.

For a low-traffic bot, expect less than $1/month.
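As a rough, hypothetical sizing example: 100 requests a day at roughly 1 KB of structured log data each is about 3 MB per month, and the dashboard only queries that one table, so both storage and query volume stay comfortably inside the free tier.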

Example Queries

Average response time by day:

SELECT
  DATE(timestamp) as date,
  COUNT(*) as requests,
  AVG(jsonPayload.response_time_ms) as avg_response_ms
FROM `YOUR_PROJECT.slack_bot_logs.run_googleapis_com_stdout_*`
WHERE jsonPayload.message = "bot_analytics"
GROUP BY date
ORDER BY date DESC

Most active users:

SELECT
  jsonPayload.user_id,
  COUNT(*) as question_count
FROM `YOUR_PROJECT.slack_bot_logs.run_googleapis_com_stdout_*`
WHERE jsonPayload.message = "bot_analytics"
GROUP BY jsonPayload.user_id
ORDER BY question_count DESC
LIMIT 10

Error rate:

SELECT
  COUNTIF(jsonPayload.success = false) as errors,
  COUNT(*) as total,
  ROUND(COUNTIF(jsonPayload.success = false) / COUNT(*) * 100, 2) as error_rate_pct
FROM `YOUR_PROJECT.slack_bot_logs.run_googleapis_com_stdout_*`
WHERE jsonPayload.message = "bot_analytics"

Wrapping Up

Setting up analytics took about 30 minutes:

  1. Structured logging - Print JSON to stdout with a message field for filtering
  2. BigQuery dataset + Log sink - Route logs automatically
  3. Looker Studio - Connect and build dashboards

Now I can see at a glance how many questions the bot handles, how fast Gemini responds, who the most active users are, and whether errors are creeping up.

The best part: it's almost free and fully automated. Every bot response is logged, streamed to BigQuery, and ready to query.

Quick Reference

Log sink filter:

resource.type="cloud_run_revision"
resource.labels.service_name="YOUR_SERVICE"
jsonPayload.message="bot_analytics"

Basic BigQuery query:

SELECT * FROM `PROJECT.DATASET.run_googleapis_com_stdout_*`
WHERE jsonPayload.message = "bot_analytics"
ORDER BY timestamp DESC
LIMIT 100

Structured log format:

print(json.dumps({
    "severity": "INFO",
    "message": "bot_analytics",
    "your_field": "your_value",
}, ensure_ascii=False), flush=True)
