Telemetry in Go: How We’ve Improved It

Go 1.23 provides a new way for you to help improve the Go toolchain. By enabling telemetry uploading, you can choose to share data about toolchain programs and their usage with the Go team. This data will help Go contributors fix bugs, avoid regressions, and make better decisions.
By default, Go telemetry data is stored only on your local computer. If you enable uploading, a limited subset of your data is published each week to telemetry.go.dev.
Starting with Go 1.23, you can enable uploading of your local telemetry data with the following command:
go telemetry on
To disable even the local collection of telemetry data, run the following command:
go telemetry off
The telemetry documentation contains a more detailed description of the implementation.
A brief history of Go telemetry
Although software telemetry is not a new idea, the Go team went through many iterations in search of a telemetry implementation that met Go's requirements for performance, portability, and transparency.
The initial design aimed to be so unobtrusive, open, and privacy-preserving that it would be acceptable to enable by default, but many users raised concerns in a lengthy public discussion, and the design was ultimately changed to require explicit user consent for remote uploading.
The new design was accepted in April 2023 and implemented over that summer.
Telemetry in gopls
The first iteration of Go telemetry shipped in v0.14 of the Go language server gopls in October 2023. After the launch, around 100 users enabled uploading, perhaps motivated by the release notes or a discussion in the Gophers Slack channel, and data started to trickle in. It wasn't long before telemetry found its first bug in gopls:
A stack trace that Dan noticed in his uploaded telemetry data led to a bug being reported and fixed. It is worth noting that we had no idea who had reported the stack.
IDE prompting
While it was great to see telemetry working in practice, and we appreciated the support of these early adopters, 100 participants are not enough to measure the kinds of things we want to measure.
As Russ Cox pointed out in his original blog posts, a drawback of the off-by-default approach to telemetry is the continuous need to encourage participation. It takes outreach to maintain a sample of users that is large enough for meaningful quantitative analysis and representative of the user population. While blog posts and release notes can boost participation (and we would appreciate it if you enabled telemetry after reading this!), they lead to a skewed sample. For example, we received almost no data for GOOS=windows from the early adopters of telemetry in gopls.
To help reach more users, we introduced a prompt in the VS Code Go plugin asking users whether they want to enable telemetry:
As of this blog post, the prompt has rolled out to 5% of VS Code Go users, and the telemetry sample has grown to around 1,800 weekly participants:
(The initial bump is probably due to prompting all users of the VS Code Go nightly extension.)
However, the prompt has introduced a noticeable skew toward VS Code users, compared to the most recent Go survey results:
We suspect that VS Code is over-represented in telemetry data.
We plan to address this skew by prompting all LSP-capable editors that use gopls, using a feature of the language server protocol itself.
Telemetry wins
Out of caution, we proposed collecting only a few basic metrics for the initial launch of telemetry in gopls. One of them was the gopls/bug stack counter, which records unexpected or "impossible" conditions encountered by gopls. In effect, it is a kind of assertion, but instead of stopping the program, it records in telemetry that the condition was reached in some execution, along with the stack.
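The idea of an assertion that records instead of stopping can be sketched with the standard library alone. This is only an illustration under stated assumptions, not the gopls implementation: the `stackCounter` type, its `report` method, and the `lookup` helper are hypothetical names, and the counts stay in memory rather than being written to any telemetry file.

```go
package main

import (
	"fmt"
	"runtime"
	"strings"
	"sync"
)

// stackCounter is a hypothetical stand-in for a telemetry stack counter:
// it counts how often each distinct call stack reaches report.
type stackCounter struct {
	mu     sync.Mutex
	counts map[string]int
}

func newStackCounter() *stackCounter {
	return &stackCounter{counts: make(map[string]int)}
}

// report records the caller's stack and returns, rather than panicking.
func (c *stackCounter) report() {
	pcs := make([]uintptr, 16)
	n := runtime.Callers(2, pcs) // skip runtime.Callers and report itself
	frames := runtime.CallersFrames(pcs[:n])
	var sb strings.Builder
	for {
		f, more := frames.Next()
		fmt.Fprintf(&sb, "%s:%d\n", f.Function, f.Line)
		if !more {
			break
		}
	}
	c.mu.Lock()
	c.counts[sb.String()]++
	c.mu.Unlock()
}

// bugs plays the role of a single named counter for "impossible" conditions.
var bugs = newStackCounter()

// lookup assumes callers only pass keys that exist. If that invariant is
// broken, it records the stack and falls back to zero instead of crashing.
func lookup(m map[string]int, key string) int {
	v, ok := m[key]
	if !ok {
		bugs.report()
		return 0
	}
	return v
}

func main() {
	m := map[string]int{"a": 1}
	lookup(m, "a")
	for i := 0; i < 3; i++ {
		lookup(m, "missing")
	}
	for stack, n := range bugs.counts {
		fmt.Printf("reached %d times:\n%s", n, stack)
	}
}
```

Keying the count on the rendered stack means repeated hits from the same call site collapse into one entry, so a report stays small while still showing where the unexpected condition was reached.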
During our gopls scalability work, we had added many assertions of this kind, but we rarely observed them fail in tests or in our own use of gopls. We expected almost all of these assertions to be unreachable.
As we started prompting random VS Code users to enable telemetry, we saw that many of these conditions were reached in practice, and the context of the stack trace was often sufficient for us to reproduce and fix long-standing bugs. We started collecting these issues under the gopls/telemetry-wins label, to keep track of the "wins" facilitated by telemetry.
I have come to think of "telemetry wins" with a second meaning: when you compare gopls development with and without telemetry, telemetry wins.
The most surprising aspect of the bugs coming from telemetry was how many of them were real. Sure, some of them were invisible to users, but a good number of them were actual misbehaviors of gopls: things like missing cross references, or subtly inaccurate completion under certain rare conditions. They were exactly the kind of thing that a user might be mildly annoyed by but probably wouldn't bother to report as an issue.
Perhaps the user would assume that the behavior was intended. If they did report an issue, they might not be sure how to reproduce the bug, or we would need a long back-and-forth on the issue tracker to capture a stack trace. Without telemetry, there is no reasonable way that most of these bugs would have been discovered, much less fixed.
And all of this came from only a few counters. We had only instrumented stack traces for the potential bugs we knew about. What about problems we did not anticipate?
Automated crash reporting
Go 1.23 includes a new runtime/debug.SetCrashOutput API that can be used to implement automated crash reporting via a watchdog process. Starting with v0.15.0, gopls reports a crash/crash stack counter when it crashes, provided gopls itself is built with Go 1.23.
When we released gopls@v0.15.0, only a handful of users in our sample had built gopls using an unreleased development build of Go 1.23, yet the new crash/crash counter still found two bugs.
Given how useful telemetry has proven with only a tiny amount of instrumentation and a fraction of our target sample, the future looks bright.
Go 1.23 records telemetry within the Go toolchain, including the go command and other tools such as the compiler, the linker, and go vet. We have added telemetry to vulncheck and the VS Code Go plugin, and we propose adding it to delve as well.
The original telemetry blog series brainstorms many ideas for how telemetry could be used to improve Go. We look forward to exploring those ideas and more.
Within gopls, we plan to use telemetry to improve reliability and to inform decision-making and prioritization. With the automated crash reporting enabled by Go 1.23, we expect to catch many more crashes in prerelease testing. Going forward, we will add more counters to measure the user experience (the latency of key operations, the frequency of use of various features) so that we can focus our efforts where they will most benefit Go developers.
Go turns 15 this November, and the language and its ecosystem continue to grow. Telemetry will play a critical role in helping Go contributors move faster and more safely, in the right direction.
Credits: Robert Findley
Photo by Anne Nygård on Unsplash