BigQuery dataset for Go build history

Brad Fitzpatrick

Apr 25, 2017, 2:40:12 PM
to golang-dev
I've shared the build system's BigQuery dataset for those who want to play.


There are two tables:

Builds -- summary of every build (1.8M and counting)
Spans -- summary of every step of every build (19.8M and counting)

Example queries: 

Fastest make.bash by builder in the past two weeks:

SELECT Builder, AVG(Seconds) AS AvgSec, COUNT(*) AS Count
FROM
  [symbolic-datum-552.builds.Spans]
WHERE
  Event = "make"
  AND StartTime > DATE_ADD(CURRENT_TIMESTAMP(), -14, "DAY")
  AND Error = ""
GROUP BY Builder
ORDER BY AvgSec;


Trybot builder speeds (e.g. why are trybots slow?):

SELECT
  Builder,
  NTH(951, QUANTILES(Seconds, 1001)) as Secs95p,
  STDDEV(Seconds) as StdDevSec,
  AVG(Seconds) as AverageSec,
  COUNT(*) AS count
FROM
  [symbolic-datum-552:builds.Builds]
WHERE
  Result="ok"
  AND IsTry = TRUE
  AND Repo = "go"
  AND StartTime > DATE_ADD(CURRENT_TIMESTAMP(), -14, "DAY")
GROUP BY
  Builder
ORDER BY
  AverageSec DESC;
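
For anyone wondering where the 951 and 1001 in the query above come from: in legacy SQL, QUANTILES(expr, n) returns n boundary values including the minimum and maximum, and NTH is 1-based, so the 951st of 1001 boundaries sits at 950/1000, i.e. the 95th percentile. Here is a minimal Go sketch of that index arithmetic. The function names and the sample durations are made up for illustration, and it computes boundaries exactly by rank, whereas BigQuery's QUANTILES is approximate.

```go
package main

import (
	"fmt"
	"sort"
)

// quantileBoundaries returns n boundary values for samples: the minimum,
// n-2 interior cut points, and the maximum. It mirrors the shape of what
// legacy SQL QUANTILES(expr, n) returns, but is computed exactly by rank.
// (Hypothetical helper, not part of any BigQuery client library.)
func quantileBoundaries(samples []float64, n int) []float64 {
	if len(samples) == 0 || n < 2 {
		return nil
	}
	sorted := append([]float64(nil), samples...)
	sort.Float64s(sorted)
	out := make([]float64, n)
	for i := range out {
		// Boundary i sits at fraction i/(n-1) of the sorted data.
		idx := i * (len(sorted) - 1) / (n - 1)
		out[i] = sorted[idx]
	}
	return out
}

// nth returns the 1-based nth element, like legacy SQL NTH(n, list).
// It panics if n is out of range.
func nth(n int, list []float64) float64 {
	return list[n-1]
}

func main() {
	// Hypothetical build durations: 0, 1, ..., 1000 seconds.
	secs := make([]float64, 1001)
	for i := range secs {
		secs[i] = float64(i)
	}
	// Same indices as NTH(951, QUANTILES(Seconds, 1001)) in the query.
	p95 := nth(951, quantileBoundaries(secs, 1001))
	fmt.Println(p95) // prints 950
}
```

The same pattern gives other percentiles: with 1001 boundaries, the pth percentile is boundary number 10*p + 1.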


etc

If you find anything fun or make anything pretty, please share!

elli...@google.com

Apr 26, 2017, 8:51:52 AM
to golang-dev
Top five slowest builds for each repo in the past week:

#standardSQL
SELECT
  Repo AS repo,
  ARRAY_AGG(
    STRUCT(
      Seconds AS elapsed_time,
      Arch AS architecture,
      Builder AS builder,
      ID AS build_id,
      IsTry AS is_try_bot
    )
    ORDER BY Seconds DESC LIMIT 5
  ) AS slowest_builds
FROM `symbolic-datum-552.builds.Builds`
WHERE
  StartTime BETWEEN
    TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL 7 DAY) AND
    CURRENT_TIMESTAMP() AND
  Result = 'ok'
GROUP BY repo;

Successful/unsuccessful builds by repo:

#standardSQL
SELECT
  Repo AS repo,
  COUNTIF(Result = 'ok') AS successful_builds,
  COUNTIF(Result = 'fail') AS failed_builds,
  COUNTIF(Result = 'ok') * 100 / COUNT(*) AS percent_successful
FROM `symbolic-datum-552.builds.Builds`
WHERE
  StartTime BETWEEN
    TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL 7 DAY) AND
    CURRENT_TIMESTAMP()
GROUP BY repo
ORDER BY percent_successful DESC;

Jan Mercl

Apr 26, 2017, 9:56:27 AM
to Brad Fitzpatrick, golang-dev
On Tue, Apr 25, 2017 at 8:40 PM Brad Fitzpatrick <brad...@golang.org> wrote:

> I've shared the build system's BigQuery dataset for those who want to play.

https://bigquery.cloud.google.com/dataset/symbolic-datum-552:builds

The link sends me to a "Create project" page. Do I need to create a project? How can I see your dataset, and where can I query it?

Sorry, I have never used or even seen BigQuery before.

--

-j

Elliott Brossard

Apr 26, 2017, 10:02:14 AM
to Jan Mercl, Brad Fitzpatrick, golang-dev
Hi Jan,

You'll need to create a Google Cloud project to be able to run queries, but you don't need to set up billing or anything, since the first terabyte of data that you query each month is free. After creating a project, you can go to https://bigquery.cloud.google.com/ and then click "Compose Query" to try the examples.

