I write code and talk about writing code

Go Sync or Go Home: ErrGroup

Posted on Jul 10, 2023
go

Introduction

This is the second post in my series, Go Sync or Go Home, where I explore lesser-known features of the sync and x/sync packages.

In today’s blog post, we will be diving into the ErrGroup package, which can be seen as an enhanced version of WaitGroup. If you’re not familiar with WaitGroup, I recommend reading my previous post on the topic. I’m excited as this is our first look at a feature from x/sync!

x Packages

Go’s x packages are developed as part of the Go project but are not part of the Go standard library. They are distributed separately, so to use one you need to fetch it with the go get command. The x packages serve a variety of purposes, ranging from expanding on the functionality of the standard library (like x/sync) to experimental features (like x/debug) and additional tools (like x/mod). That’s where ErrGroup comes from. Although it is not yet part of the standard library, there is an open proposal to add it (which you can read about here). If that’s not enough to convince you that ErrGroup contains valuable functionality, maybe this deep dive into its features will!

ErrGroup

ErrGroup provides a powerful mechanism for managing a group of concurrent subtasks, taking into consideration errors, context cancellation, and more.

If any of the following apply to you, you should consider upgrading your WaitGroup to an ErrGroup:

  • You need more fine-tuned control over each goroutine executing a subtask.
  • You want to propagate errors returned from goroutines.
  • You’re using WaitGroup together with other synchronization features (like a semaphore or a mutex).

To illustrate the features of ErrGroup, we’ll apply them to a file transfer app. We’ll start with a basic implementation and progressively improve our code by leveraging the package’s capabilities.

File Transfer App

The goal of our file transfer app is to transfer files to multiple destinations. To establish a connection to each destination, we’re given a Conn interface:

type Conn interface {
    Send(ctx context.Context, file File) error
}

The Send method will send the file over the connection, returning an error if the file is corrupt.
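
To experiment with the examples in this post without a real network, a small in-memory Conn can help. The sketch below is only an illustration: the File fields, the ErrCorruptFile error, fakeConn and trySend are all hypothetical names that do not appear in the original app.

```go
package main

import (
    "context"
    "errors"
    "fmt"
)

// File is a hypothetical stand-in for the post's File type.
type File struct {
    Name    string
    Corrupt bool // assumption: corruption is known up front to keep the sketch small
}

// Conn is the interface from the post.
type Conn interface {
    Send(ctx context.Context, file File) error
}

// ErrCorruptFile is a hypothetical sentinel error for corrupt files.
var ErrCorruptFile = errors.New("file is corrupt")

// fakeConn is an in-memory Conn that records the names of the files it sent.
type fakeConn struct {
    sent []string
}

// Compile-time check that fakeConn satisfies Conn.
var _ Conn = (*fakeConn)(nil)

// Send fails for corrupt files and records every other file as sent.
func (c *fakeConn) Send(ctx context.Context, file File) error {
    if file.Corrupt {
        return fmt.Errorf("send %q: %w", file.Name, ErrCorruptFile)
    }
    c.sent = append(c.sent, file.Name)
    return nil
}

// trySend sends one healthy and one corrupt file over the same fakeConn.
func trySend() (okErr, badErr error, sent int) {
    conn := &fakeConn{}
    ctx := context.Background()
    okErr = conn.Send(ctx, File{Name: "ok.txt"})
    badErr = conn.Send(ctx, File{Name: "bad.txt", Corrupt: true})
    return okErr, badErr, len(conn.sent)
}

func main() {
    okErr, badErr, sent := trySend()
    fmt.Println(okErr, badErr, sent)
}
```

Wrapping the sentinel with %w lets a caller check for it with errors.Is while still seeing which file failed.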

Using a WaitGroup, we can implement a function that transfers one file over multiple connections concurrently:

func TransferFile(ctx context.Context, conns []Conn, file File) {
    wg := sync.WaitGroup{}
    for _, conn := range conns {
        wg.Add(1)
        go func(c Conn) {
            defer wg.Done()
            c.Send(ctx, file)
        }(conn)
    }
    wg.Wait()
}
💡 Side note: We’re passing the Conn as a parameter to the go function so that each invocation uses a copy of the Conn and not the same variable (check out the bug we’re avoiding here).
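
As a rough illustration of the side note, the sketch below passes the loop variable to each goroutine as an argument, so every goroutine works on its own copy. Before Go 1.22 a closure that captured the loop variable directly could see a later iteration’s value; since Go 1.22 each iteration gets its own variable, but passing a parameter behaves the same on every version. The collect function is a hypothetical helper, not part of the file transfer app.

```go
package main

import (
    "fmt"
    "sort"
    "sync"
)

// collect starts one goroutine per name. The name is passed as an argument,
// so each goroutine receives its own copy regardless of the Go version.
func collect(names []string) []string {
    var (
        wg  sync.WaitGroup
        mu  sync.Mutex
        got []string
    )
    for _, name := range names {
        wg.Add(1)
        go func(n string) {
            defer wg.Done()
            mu.Lock()
            defer mu.Unlock()
            got = append(got, n)
        }(name)
    }
    wg.Wait()
    sort.Strings(got) // goroutines finish in any order, so sort for a stable result
    return got
}

func main() {
    fmt.Println(collect([]string{"c", "a", "b"}))
}
```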

See if you can spot the error in this implementation.

The issue with our code is that we’re ignoring any errors returned from Send. A better implementation would propagate the first error returned from any of the Send calls to TransferFile’s caller.

Error Handling with ErrGroup

Let’s look at the parts of ErrGroup’s API that will allow us to easily make this change.

Creation

First, we’ll need to create an ErrGroup, and the way we do that is similar to WaitGroup:

eg := errgroup.Group{}

Go(f func() error)

The Go method takes the function f and runs it in a separate goroutine. Internally, Go does the equivalent of WaitGroup.Add(1) before starting the goroutine and WaitGroup.Done() once the function has finished. The correct way to use this method is to call it once for every concurrent task we want to run.

Wait() error

ErrGroup’s Wait method blocks until all goroutines are finished, just like WaitGroup’s Wait. The difference lies in the return value: if any function passed to Go returned a non-nil error, Wait returns that error. If more than one function failed, only the first error is returned and the rest are discarded.

File Transfer & ErrGroup

Let’s use this feature to propagate the first error returned from a call to Send. First, we’ll create an ErrGroup instead of a WaitGroup. Then, we’ll call the ErrGroup’s Go method instead of creating a new goroutine ourselves. Finally, we’ll return the error returned by Wait.

func TransferFile(ctx context.Context, conns []Conn, file File) error {
    eg := errgroup.Group{}
    for _, conn := range conns {
        func(c Conn) {
            eg.Go(func() error {
                return c.Send(ctx, file)
            })
        }(conn)
    }
    return eg.Wait()
}

This solution may be perfect in certain situations, but in our case we know that Send returns an error only if the file is corrupt, so we can optimize it further. Since encountering an error during any Send operation indicates that all subsequent Send operations will also fail, it is better to halt the remaining Send tasks and immediately return from TransferFile. Fortunately, implementing this will be easy with just a few adjustments.

Context Cancellation with ErrGroup

We can create an ErrGroup that is based on a context.Context. This enables us to use the context to control the execution of our tasks, and stop them immediately upon encountering an error. The API for this is simple:

WithContext(ctx context.Context) (*Group, context.Context)

WithContext returns a new ErrGroup and a new context derived from the context it is given. The derived context is canceled when the original context is canceled, the first time one of the ErrGroup tasks returns a non-nil error, or the first time Wait returns, whichever happens first.
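
The cancellation behavior can be sketched with nothing but the standard library. The code below is a simplified approximation rather than the library’s implementation; ctxGroup, withContext, run and errFailed are hypothetical names.

```go
package main

import (
    "context"
    "errors"
    "fmt"
    "sync"
)

// ctxGroup is a simplified, hypothetical stand-in for the Group
// returned by errgroup.WithContext.
type ctxGroup struct {
    cancel  context.CancelFunc
    wg      sync.WaitGroup
    errOnce sync.Once
    err     error
}

// withContext returns a group and a context that the group cancels itself.
func withContext(parent context.Context) (*ctxGroup, context.Context) {
    ctx, cancel := context.WithCancel(parent)
    return &ctxGroup{cancel: cancel}, ctx
}

// Go runs f in a goroutine. The first non-nil error is stored and
// the derived context is canceled so sibling tasks can stop early.
func (g *ctxGroup) Go(f func() error) {
    g.wg.Add(1)
    go func() {
        defer g.wg.Done()
        if err := f(); err != nil {
            g.errOnce.Do(func() {
                g.err = err
                g.cancel()
            })
        }
    }()
}

// Wait blocks until all tasks return, cancels the context, and
// returns the first recorded error.
func (g *ctxGroup) Wait() error {
    g.wg.Wait()
    g.cancel()
    return g.err
}

var errFailed = errors.New("task failed")

// run starts one failing task and one task that only stops on cancellation.
func run() error {
    g, ctx := withContext(context.Background())
    g.Go(func() error { return errFailed })
    g.Go(func() error {
        <-ctx.Done() // unblocked by the failure of the first task
        return ctx.Err()
    })
    return g.Wait()
}

func main() {
    fmt.Println(run())
}
```

The second task would block forever on a plain group; here the first task’s failure cancels the shared context and releases it, and Wait still reports the original error rather than the cancellation.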

File Transfer & WithContext

Instead of creating an empty ErrGroup, we’ll create one using WithContext. Then all we need to do is pass the context returned from WithContext to Send:

func TransferFile(ctx context.Context, conns []Conn, file File) error {
    eg, egCtx := errgroup.WithContext(ctx)
    for _, conn := range conns {
        func(c Conn) {
            eg.Go(func() error {
                return c.Send(egCtx, file)
            })
        }(conn)
    }
    return eg.Wait()
}

Send will stop sending the file once the context passed to it is canceled. As a result, this small change causes all in-flight Send tasks to stop once any of them returns an error. Wait returns as soon as those tasks have stopped, so TransferFile returns without waiting for transfers that are bound to fail. Not only is the error propagated faster, but we also avoid spending CPU on work whose result will be discarded.
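
All of this relies on Send actually honoring its context. The post does not show a Send implementation, so the following is only a sketch of what a cooperative one might look like, assuming the file is sent in chunks; File.Chunks, chunkConn and sendTwice are hypothetical.

```go
package main

import (
    "context"
    "fmt"
)

// File is a hypothetical stand-in: a file that is sent chunk by chunk.
type File struct {
    Chunks [][]byte
}

// chunkConn is a hypothetical Conn that counts the bytes it has "sent".
type chunkConn struct {
    bytesSent int
}

// Send checks the context before every chunk and stops as soon as it is done.
func (c *chunkConn) Send(ctx context.Context, file File) error {
    for _, chunk := range file.Chunks {
        if err := ctx.Err(); err != nil {
            return err // context canceled or deadline exceeded
        }
        c.bytesSent += len(chunk) // placeholder for writing the chunk to the network
    }
    return nil
}

// sendTwice sends the same file before and after canceling the context.
func sendTwice() (firstErr, secondErr error, bytesSent int) {
    file := File{Chunks: [][]byte{[]byte("hello "), []byte("world")}}
    ctx, cancel := context.WithCancel(context.Background())
    conn := &chunkConn{}
    firstErr = conn.Send(ctx, file)
    cancel()
    secondErr = conn.Send(ctx, file)
    return firstErr, secondErr, conn.bytesSent
}

func main() {
    fmt.Println(sendTwice())
}
```

A Send that never looks at its context would keep running after cancellation, and Wait would still have to wait for it.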

Limiting active goroutines with ErrGroup

Another valuable feature of ErrGroup is the ability to limit the number of goroutines running at once. This doesn’t limit all goroutines running in your app, only the ones started by the ErrGroup.

SetLimit(n int)

Calling this method sets a limit of n on the number of goroutines running at once in this ErrGroup. If SetLimit isn’t called, or if n is negative, there is no limit. The limit must not be changed while any goroutines in the group are still active.

Go(f func() error)

Once a limit has been set on the ErrGroup, the Go method will block until it is able to run f in a new goroutine and stay under the given limit.

TryGo(f func() error) bool

TryGo behaves exactly like Go, except for when a limit is set on the ErrGroup. If the limit has been reached, TryGo returns immediately with false. If the limit hasn’t been reached, f will be run in a new goroutine and TryGo will return true.
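
One way to picture the difference between Go and TryGo under a limit is a buffered channel used as a semaphore. The sketch below is a simplified approximation, not the library’s code: limitedGroup and demo are hypothetical, and tasks take func() rather than func() error to keep error handling out of the way.

```go
package main

import (
    "fmt"
    "sync"
)

// limitedGroup is a simplified, hypothetical sketch of a group with a limit.
type limitedGroup struct {
    wg  sync.WaitGroup
    sem chan struct{} // nil means no limit
}

// SetLimit caps the number of goroutines running at once.
// In this sketch it must be called before Go or TryGo.
func (g *limitedGroup) SetLimit(n int) {
    g.sem = make(chan struct{}, n)
}

// run starts f in a goroutine and frees its slot when f returns.
func (g *limitedGroup) run(f func()) {
    g.wg.Add(1)
    go func() {
        defer g.done()
        f()
    }()
}

// done releases the slot first, then marks the task as finished.
func (g *limitedGroup) done() {
    if g.sem != nil {
        <-g.sem
    }
    g.wg.Done()
}

// Go blocks until a slot is free, then starts f.
func (g *limitedGroup) Go(f func()) {
    if g.sem != nil {
        g.sem <- struct{}{}
    }
    g.run(f)
}

// TryGo starts f only if a slot is free right now.
func (g *limitedGroup) TryGo(f func()) bool {
    if g.sem != nil {
        select {
        case g.sem <- struct{}{}:
        default:
            return false // limit reached, f is not run
        }
    }
    g.run(f)
    return true
}

// Wait blocks until all started tasks have returned.
func (g *limitedGroup) Wait() { g.wg.Wait() }

// demo shows TryGo succeeding, being rejected, and succeeding again.
func demo() (first, second, third bool) {
    var g limitedGroup
    g.SetLimit(1)
    release := make(chan struct{})
    first = g.TryGo(func() { <-release }) // takes the only slot
    second = g.TryGo(func() {})           // slot is busy, so this is rejected
    close(release)
    g.Wait()
    third = g.TryGo(func() {}) // slot is free again
    g.Wait()
    return first, second, third
}

func main() {
    fmt.Println(demo())
}
```

With the same setup, calling Go instead of the second TryGo would block until release is closed, which is why TryGo is the better fit when the caller has something else to do in the meantime.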

Summary

In summary, ErrGroup offers an upgraded solution for managing concurrency in your Go programs. It enables you to propagate errors, stop running tasks once an error occurs and set limits on the number of tasks running at once.

What’s Next?

If you’re wondering what’s next, the answer is the least imported sync package! Though it’s not very popular, we’ll find out how useful it can be to improve performance when creating a caching mechanism, so stay tuned!