The Case for interface{}
As we begin thinking about the future of Go with events like the Contributors Summit and by writing Experience Reports, one topic that’s near and dear to my heart is Go’s use of the empty interface. While the empty interface is often criticized for being overused, even labeled a “mistake” by programmers who think it erases all type information, it does have use cases for which it is very well suited.
In this post I propose three rules that can be used to determine if your (Go 1) use of the empty interface will work out, or lead to heartache down the line. I also look at two examples of the empty interface in real APIs, one good, and one a necessary evil.
But first:
What is an empty interface?
An interesting consequence of the Go type system (where types implicitly satisfy interfaces if they have the correct methods) is the empty interface:
interface{}
Because the empty interface does not require any methods to be satisfied, it is satisfied by all types. The following is valid Go:
var v interface{}
v = 1
v = "string"
v = 2.3
v = struct{
a int
b string
}{
a: 3,
b: "One hen",
}
Though the underlying type is not entirely lost when we assign a value to a binding that uses the empty interface, it is still a form of type erasure. The underlying type is hidden: it cannot be recovered without type assertions or reflection.
interface{} says nothing.
In this post I’ll take it as a given that reflection should be avoided if possible. Suffice it to say that reflection can sometimes push compile time errors back to runtime, and can make code difficult to read and maintain.
Let us also take it as a given that the ability of the empty interface to hold a value of any type is very easy to misuse and thus, it is (widely and often). Experience has taught us that use of the empty interface is an anti-pattern1, albeit an elegant one.
Empty interface is elegant in the same sense that Java’s Object is elegant: simplicity in form, but actually confusing in practice.
Hypothesis
I posit that 3 conditions must be met in order to use empty interface in an API without making the code difficult to read or maintain:
- The behavior of the type cannot be described by an existing, more specific, interface.
- A new, more specific, interface cannot be written to describe the behavior of the type.
- The code that assigns a value to an empty interface must also be the code that consumes the value in the empty interface.
The first two points are relatively straightforward. Together, they can be simplified to:
If we can use a more specific type, we should.
Unless you’re being forced to use a static type system against your will, I suspect this is not very controversial and I will ignore it for the rest of the post. The third postulate may need a bit of discussion, since there’s actually quite a lot crammed into this seemingly simple statement.
Example: bad use of interface{}
Let’s take a look at the xml.Marshal
function:
// Marshal returns the XML encoding of v.
func Marshal(v interface{}) ([]byte, error)
This is a very simple API, and satisfies our first two postulates.
We can’t always use a more specific type for v
because in Go we cannot define
methods on types from other packages.
To use a more specific interface such as xml.Marshaler
every single type
defined by every single package would need to provide a “marshaler” for every
single type of encoding.
This API does not satisfy the third postulate, however, because the consumer of
the value (Marshal
) is not the same as the producer of the value (the caller
of Marshal
).
The consequence of this is that the default behavior isn’t actually all that
useful and the XML package uses reflection heavily to find information about the
underlying type at runtime to try and avoid hitting the default case for unknown
types and provide a better XML representation.
This package—and other similar packages such as encoding/json
—make good use
of interface{}
in their API, but at the cost of maintainability.
Note that I’m not suggesting that there is a better way to handle serialization
of arbitrary values in Go 1, just that this use of interface{}
is not ideal.
Example: good use of interface{}
So what is a good use of the empty interface that meets all three of our
original requirements?
SASL is a standard
for authentication defined in RFC 4422.
Let’s take a look at my mellium.im/sasl
package.
There are two main APIs in this package: the Negotiator
API is meant for
application developers and is what you use to actually negotiate auth.
Given a representation of a SASL mechanism a Negotiator
ensures that the state
machine cannot enter an invalid state, or step backwards to a previous state
which might be a security vulnerability.
The package also contains a Mechanism
API.
This API is meant for library developers creating new authentication mechanisms.
A library author might create an XOAUTH2 mechanism to be used by application
developers in conjunction with the Negotiator
API from the sasl
package to
perform authorization with an OAuth2 bearer token.
This Mechanism
API makes use of the empty interface.
// NewClient creates a new SASL Negotiator that supports creating authentication
// requests using the given mechanism.
func NewClient(m Mechanism, opts ...Option) *Negotiator
// Mechanism represents a SASL mechanism such as PLAIN or SCRAM-SHA-256 that can
// be used by a Negotiator to generate challenges and responses.
type Mechanism struct {
Name string
Start func(n *Negotiator) (more bool, resp []byte, cache interface{}, err error)
Next func(n *Negotiator, challenge []byte, data interface{})
(more bool, resp []byte, cache interface{}, err error)
}
Because the Negotiator
manages the state machine, Mechanism
’s are stateless.
They can be reused, are easier to write, and are less likely to contain critical
security vulnerabilities.
Having Negotiator
handle all the state does not, of course, guarantee that
there are no bugs or vulnerabilities in the SASL mechanism implementations, it
just makes it easier for authors to write bug free mechanisms.
Unfortunately, on each step of the state machine, the mechanism may generate information that needs to be known in a future step. For example: the SCRAM-SHA-256 mechanism computes a proof-of-possession which includes the bytes of the very first message that was sent at the beginning of auth. If we want our mechanism to be stateless, it can’t remember the bytes it returned earlier.
Instead, the Negotiator
(which is already stateful) keeps track of the
required state for the mechanism that it is using.
This state might be entirely different (if it exists at all) for different
mechanisms so an interface{}
is used to store it.
Each time the state machine advances, the mechanism’s Next
function is called
(or Start
for the first step, which is different for reasons that don’t matter
here), and each time it’s called it may return any state it needs to use later
using the cache
return parameter.
This data is then cached by the Negotiator
and when the Next
function is
invoked again, whatever data was returned from the previous step will be passed
back in via the data
parameter.
The Negotiator
does not know anything about the value it is storing.
It simply stores it after Start
or Next
is called, and blindly passes it
along to future invocations of Next
.
The empty interface “says nothing”, but “nothing” is sometimes exactly what’s
required.
The Mechanism
on the other hand can be written to always know the type of the
value that the data
parameter will have, because it set that value in the
first place.
This meets all of our requirements for a good use of interface{}
:
- We cannot be more specific about the type of data stored by the negotiator because any SASL mechanism spec could define a mechanism that needs to store any type of data with any behavior (requirements 1 and 2).
- The code in the
sasl
package where theinterface{}
was declared needs to know nothing about what is stored in the interface; the code in the package defining the mechanism, which both produces and consumes the value in the empty interface, can always assert on the type of the value (requirement 3).
Conclusion
The empty interface is easy to misuse, and it is not obvious under what circumstances it can be used without making code difficult to read and maintain. Even when experienced users are aware of the dangers of using empty interface too readily, it is often required due to other limitations in the type system. However, it also has use cases for which it is an excellent solution.
Future versions of Go should provide users with alternatives to misusing the empty interface in situations where the type of data is unknown, while still supporting the use case of APIs that may take many different types of data, but always know the type of the data they are using.
Thanks to Christopher Agocs and Dave Cheney for their reviews and criticism. Gopher artwork by Ashley McNamara and based on an original work by Renee French.
Update
Since writing this post I’ve tweaked my rules a little bit and given a talk to the Go dev room at FOSDEM2018. You can see it here:
-
Go Anti Patterns. Edward Muller, GopherCon 2017, https://youtu.be/ltqV6pDKZD8?t=2019 ↩︎