Skip to content

Latest commit

 

History

History
452 lines (330 loc) · 16.3 KB

revisiting-arrays-and-slices-with-generics.md

File metadata and controls

452 lines (330 loc) · 16.3 KB

Revisiting arrays and slices with generics

The code for this chapter is a continuation from Arrays and Slices, found here

Take a look at both SumAll and SumAllTails that we wrote in arrays and slices. If you don't have your version please copy the code from the arrays and slices chapter along with the tests.

// Sum calculates the total from a slice of numbers.
func Sum(numbers []int) int {
	var sum int
	for _, number := range numbers {
		sum += number
	}
	return sum
}

// SumAllTails calculates the sums of all but the first number given a collection of slices.
func SumAllTails(numbersToSum ...[]int) []int {
	var sums []int
	for _, numbers := range numbersToSum {
		if len(numbers) == 0 {
			sums = append(sums, 0)
		} else {
			tail := numbers[1:]
			sums = append(sums, Sum(tail))
		}
	}

	return sums
}

Do you see a recurring pattern?

  • Create some kind of "initial" result value.
  • Iterate over the collection, applying some kind of operation (or function) to the result and the next item in the slice, setting a new value for the result
  • Return the result.

This idea is commonly talked about in functional programming circles, often times called 'reduce' or fold.

In functional programming, fold (also termed reduce, accumulate, aggregate, compress, or inject) refers to a family of higher-order functions that analyze a recursive data structure and through use of a given combining operation, recombine the results of recursively processing its constituent parts, building up a return value. Typically, a fold is presented with a combining function, a top node of a data structure, and possibly some default values to be used under certain conditions. The fold then proceeds to combine elements of the data structure's hierarchy, using the function in a systematic way.

Go has always had higher-order functions, and as of version 1.18 it also has generics, so it is now possible to define some of these functions discussed in our wider field. There's no point burying your head in the sand, this is a very common abstraction outside the Go ecosystem and it'll be beneficial to understand it.

Now, I know some of you are probably cringing at this.

Go is supposed to be simple

Don't conflate easiness, with simplicity. Doing loops and copy-pasting code is easy, but it's not necessarily simple. For more on simple vs easy, watch Rich Hickey's masterpiece of a talk - Simple Made Easy.

Don't conflate unfamiliarity, with complexity. Fold/reduce may initially sound scary and computer-sciencey but all it really is, is an abstraction over a very common operation. Taking a collection, and combining it into one item. When you step back, you'll realise you probably do this a lot.

A generic refactor

A mistake people often make with shiny new language features is they start by using them without having a concrete use-case. They rely on conjecture and guesswork to guide their efforts.

Thankfully we've written our "useful" functions and have tests around them, so now we are free to experiment with ideas in the refactoring stage of TDD and know that whatever we're trying, has a verification of its value via our unit tests.

Using generics as a tool for simplifying code via the refactoring step is far more likely to guide you to useful improvements, rather than premature abstractions.

We are safe to try things out, re-run our tests, if we like the change we can commit. If not, just revert the change. This freedom to experiment is one of the truly huge values of TDD.

You should be familiar with the generics syntax from the previous chapter, try and write your own Reduce function and use it inside Sum and SumAllTails.

Hints

If you think about the arguments to your function first, it'll give you a very small set of valid solutions

  • The array you want to reduce
  • Some kind of combining function, or accumulator

"Reduce" is an incredibly well documented pattern, there's no need to re-invent the wheel. Read the wiki, in particular the lists section, it should prompt you for another argument you'll need.

In practice, it is convenient and natural to have an initial value

My first-pass of Reduce

func Reduce[A any](collection []A, accumulator func(A, A) A, initialValue A) A {
	var result = initialValue
	for _, x := range collection {
		result = accumulator(result, x)
	}
	return result
}

Reduce captures the essence of the pattern, it's a function that takes a collection, an accumulating function, an initial value, and returns a single value. There's no messy distractions around concrete types.

If you understand generics syntax, you should have no problem understanding what this function does. By using the recognised term Reduce, programmers from other languages understand the intent too.

The usage

// Sum calculates the total from a slice of numbers.
func Sum(numbers []int) int {
	add := func(acc, x int) int { return acc + x }
	return Reduce(numbers, add, 0)
}

// SumAllTails calculates the sums of all but the first number given a collection of slices.
func SumAllTails(numbers ...[]int) []int {
	sumTail := func(acc, x []int) []int {
		if len(x) == 0 {
			return append(acc, 0)
		} else {
			tail := x[1:]
			return append(acc, Sum(tail))
		}
	}

	return Reduce(numbers, sumTail, []int{})
}

Sum and SumAllTails now describe the behaviour of their computations as the functions declared on their first lines respectively. The act of running the computation on the collection is abstracted away in Reduce.

Further applications of reduce

Using tests we can play around with our reduce function to see how re-usable it is. I have copied over our generic assertion functions from the previous chapter.

func TestReduce(t *testing.T) {
	t.Run("multiplication of all elements", func(t *testing.T) {
		multiply := func(x, y int) int {
			return x * y
		}

		AssertEqual(t, Reduce([]int{1, 2, 3}, multiply, 1), 6)
	})

	t.Run("concatenate strings", func(t *testing.T) {
		concatenate := func(x, y string) string {
			return x + y
		}

		AssertEqual(t, Reduce([]string{"a", "b", "c"}, concatenate, ""), "abc")
	})
}

The zero value

In the multiplication example, we show the reason for having a default value as an argument to Reduce. If we relied on Go's default value of 0 for int, we'd multiply our initial value by 0, and then the following ones, so you'd only ever get 0. By setting it to 1, the first element in the slice will stay the same, and the rest will multiply by the next elements.

If you wish to sound clever with your nerd friends, you'd call this The Identity Element.

In mathematics, an identity element, or neutral element, of a binary operation operating on a set is an element of the set which leaves unchanged every element of the set when the operation is applied.

In addition, the identity element is 0.

1 + 0 = 1

With multiplication, it is 1.

1 * 1 = 1

What if we wish to reduce into a different type from A?

Suppose we had a list of transactions Transaction and we wanted a function that would take them, plus a name to figure out their bank balance.

Let's follow the TDD process.

Write the test first

func TestBadBank(t *testing.T) {
	transactions := []Transaction{
		{
			From: "Chris",
			To:   "Riya",
			Sum:  100,
		},
		{
			From: "Adil",
			To:   "Chris",
			Sum:  25,
		},
	}

	AssertEqual(t, BalanceFor(transactions, "Riya"), 100)
	AssertEqual(t, BalanceFor(transactions, "Chris"), -75)
	AssertEqual(t, BalanceFor(transactions, "Adil"), -25)
}

Try to run the test

# github.com/quii/learn-go-with-tests/arrays/v8 [github.com/quii/learn-go-with-tests/arrays/v8.test]
./bad_bank_test.go:6:20: undefined: Transaction
./bad_bank_test.go:18:14: undefined: BalanceFor

Write the minimal amount of code for the test to run and check the failing test output

We don't have our types or functions yet, add them to make the test run.

type Transaction struct {
	From string
	To   string
	Sum  float64
}

func BalanceFor(transactions []Transaction, name string) float64 {
	return 0.0
}

When you run the test you should see the following:

=== RUN   TestBadBank
    bad_bank_test.go:19: got 0, want 100
    bad_bank_test.go:20: got 0, want -75
    bad_bank_test.go:21: got 0, want -25
--- FAIL: TestBadBank (0.00s)

Write enough code to make it pass

Let's write the code as if we didn't have a Reduce function first.

func BalanceFor(transactions []Transaction, name string) float64 {
	var balance float64
	for _, t := range transactions {
		if t.From == name {
			balance -= t.Sum
		}
		if t.To == name {
			balance += t.Sum
		}
	}
	return balance
}

Refactor

At this point, have some source control discipline and commit your work. We have working software, ready to challenge Monzo, Barclays, et al.

Now our work is committed, we are free to play around with it, and try some different ideas out in the refactoring phase. To be fair, the code we have isn't exactly bad, but for the sake of this exercise, I want to demonstrate the same code using Reduce.

func BalanceFor(transactions []Transaction, name string) float64 {
	adjustBalance := func(currentBalance float64, t Transaction) float64 {
		if t.From == name {
			return currentBalance - t.Sum
		}
		if t.To == name {
			return currentBalance + t.Sum
		}
		return currentBalance
	}
	return Reduce(transactions, adjustBalance, 0.0)
}

But this won't compile.

./bad_bank.go:19:35: type func(acc float64, t Transaction) float64 of adjustBalance does not match inferred type func(Transaction, Transaction) Transaction for func(A, A) A

The reason is we're trying to reduce to a different type than the type of the collection. This sounds scary, but actually just requires us to adjust the type signature of Reduce to make it work. We won't have to change the function body, and we won't have to change any of our existing callers.

func Reduce[A, B any](collection []A, accumulator func(B, A) B, initialValue B) B {
	var result = initialValue
	for _, x := range collection {
		result = accumulator(result, x)
	}
	return result
}

We've added a second type constraint which has allowed us to loosen the constraints on Reduce. This allows people to Reduce from a collection of A into a B. In our case from Transaction to float64.

This makes Reduce more general-purpose and reusable, and still type-safe. If you try and run the tests again they should compile, and pass.

Extending the bank

For fun, I wanted to improve the ergonomics of the bank code. I've omitted the TDD process for brevity.

func TestBadBank(t *testing.T) {
	var (
		riya  = Account{Name: "Riya", Balance: 100}
		chris = Account{Name: "Chris", Balance: 75}
		adil  = Account{Name: "Adil", Balance: 200}

		transactions = []Transaction{
			NewTransaction(chris, riya, 100),
			NewTransaction(adil, chris, 25),
		}
	)

	newBalanceFor := func(account Account) float64 {
		return NewBalanceFor(account, transactions).Balance
	}

	AssertEqual(t, newBalanceFor(riya), 200)
	AssertEqual(t, newBalanceFor(chris), 0)
	AssertEqual(t, newBalanceFor(adil), 175)
}

And here's the updated code

package main

type Transaction struct {
	From string
	To   string
	Sum  float64
}

func NewTransaction(from, to Account, sum float64) Transaction {
	return Transaction{From: from.Name, To: to.Name, Sum: sum}
}

type Account struct {
	Name    string
	Balance float64
}

func NewBalanceFor(account Account, transactions []Transaction) Account {
	return Reduce(
		transactions,
		applyTransaction,
		account,
	)
}

func applyTransaction(a Account, transaction Transaction) Account {
	if transaction.From == a.Name {
		a.Balance -= transaction.Sum
	}
	if transaction.To == a.Name {
		a.Balance += transaction.Sum
	}
	return a
}

I feel this really shows the power of using concepts like Reduce. The NewBalanceFor feels more declarative, describing what happens, rather than how. Often when we're reading code, we're darting through lots of files, and we're trying to understand what is happening, rather than how, and this style of code facilitates this well.

If I wish to dig in to the detail I can, and I can see the business logic of applyTransaction without worrying about loops and mutating state; Reduce takes care of that separately.

Fold/reduce are pretty universal

The possibilities are endless™️ with Reduce (or Fold). It's a common pattern for a reason, it's not just for arithmetic or string concatenation. Try a few other applications.

  • Why not mix some color.RGBA into a single colour?
  • Total up the number of votes in a poll, or items in a shopping basket.
  • More or less anything involving processing a list.

Find

Now that Go has generics, combining them with higher-order-functions, we can reduce a lot of boilerplate code within our projects, to help our systems be easier to understand and manage.

No longer do you need to write specific Find functions for each type of collection you want to search, instead re-use or write a Find function. If you understood the Reduce function above, writing a Find function will be trivial.

Here's a test

func TestFind(t *testing.T) {
	t.Run("find first even number", func(t *testing.T) {
		numbers := []int{1, 2, 3, 4, 5, 6, 7, 8, 9, 10}

		firstEvenNumber, found := Find(numbers, func(x int) bool {
			return x%2 == 0
		})
		AssertTrue(t, found)
		AssertEqual(t, firstEvenNumber, 2)
	})
}

And here's the implementation

func Find[A any](items []A, predicate func(A) bool) (value A, found bool) {
	for _, v := range items {
		if predicate(v) {
			return v, true
		}
	}
	return
}

Again, because it takes a generic type, we can re-use it in many ways

type Person struct {
    Name string
}

t.Run("Find the best programmer", func(t *testing.T) {
    people := []Person{
        Person{Name: "Kent Beck"},
        Person{Name: "Martin Fowler"},
        Person{Name: "Chris James"},
    }

    king, found := Find(people, func(p Person) bool {
        return strings.Contains(p.Name, "Chris")
    })

    AssertTrue(t, found)
    AssertEqual(t, king, Person{Name: "Chris James"})
})

As you can see, this code is flawless.

Wrapping up

When done tastefully, higher-order functions like these will make your code simpler to read and maintain, but remember the rule of thumb:

Use the TDD process to drive out real, specific behaviour that you actually need, in the refactoring stage you then might discover some useful abstractions to help tidy the code up.

Practice combining TDD with good source control habits. Commit your work when your test is passing, before trying to refactor. This way if you make a mess, you can easily get yourself back to your working state.

Names matter

Make an effort to do some research outside of Go, so you don't re-invent patterns that already exist with an already established name.

Writing a function takes a collection of A and converts them to B? Don't call it Convert, that's Map. Using the "proper" name for these items will reduce the cognitive burden for others and make it more search engine friendly to learn more.

This doesn't feel idiomatic?

Try to have an open-mind.

Whilst the idioms of Go won't, and shouldn't radically change due to generics being released, the idioms will change - due to the language changing! This should not be a controversial point.

Saying

This is not idiomatic

Without any more detail, is not an actionable, or useful thing to say. Especially when discussing new language features.

Discuss with your colleagues patterns and style of code based on their merits rather than dogma. So long as you have well-designed tests, you'll always be able to refactor and shift things as you understand what works well for you, and your team.

Resources

Fold is a real fundamental in computer science. Here's some interesting resources if you wish to dig more into it