Last time, I blogged about the Potter Kata, and I carefully followed the advice, which was to only do the minimum required work to pass the tests. This time, I’m going to break the rule and implement a generic solver for the problem, although it may have no practical interest whatsoever.

Let’s pretend that in the pricing rules applied last time, the discount percentages could change, depending on the mood of a very lunatic sales person. For instance, he (or she) might decide that on a particular day, we remove the discount on the pack of two books, and introduce a “buy 2, get 1 free” offer. The prices would then become:

1 book | No discount | 8.00 EUR |

2 different books | No discount |
8.00 * 2 = 16.00 EUR |

3 different books | 3rd one free |
8.00 * 2 + 0.00 * 1 = 16.00 EUR |

4 different books | 20% discount | 8.00 * 4 *0.80 = 25.60 EUR |

5 different books | 25% discount | 8.00 * 5 *0.75 = 30.00 EUR |

Now, let’s examine the previous “edge-case” basket, composed of:

- 2 copies of the first book
- 2 copies of the second book
- 2 copies of the third book
- 1 copy of the fourth book
- 1 copy of the fifth book

With the new pricing rules, the new best price is obtained by (5 * 8.0 * 0.75) + (2 * 8.00 + 1 * 0.00) and is 46.00 EUR. Here, the greedy algorithm is correct again, so we could just remove our adjustment rule from last time, isn’t it? Of course not! If we add one more book in the basket, the best price becomes… 3 * (2 * 8.00 + 1 * 0.00) = 48.00 EUR.

So how do we solve that ? Well, we could simply try a brute force solution, compute every single possible set of sets of books, price them, and take the cheapest option. But here we have an interesting property: every book has the same price. So evaluating [[1;2];[1;2];[3]] and [[1;2];[1;3];[2]] is exactly the same. When we consider books, we don’t really care about their individual number or title, but only about whether they are distinct from each other or not.

So here is my take on a solver algorithm. First, we take the books and organize them by types, sorted by ascending quantity:

let types = books |> Seq.countBy id |> Seq.map snd |> Seq.sort |> Seq.toList |> List.rev

This means that the following books [1;2;3;1;2;3;4;5] get transformed to the list [1;1;2;2;2], which means “a book that appears only once, then another one, then one that appears twice, then another one, and yet another one”.

From this representation, we can price the basket by splitting it into two simpler baskets, in several ways, price each combinations, and take the cheapest. To split the basket, we remove a single item of each kind of book, for the first *n* types. An example is worth a thousand words here:

From [1;1;2;2;2] we can get the following splittings:

- [1] + [1;2;2;2]
- [1;1] + [2;2;2]
- [1;1;1] + [1;2;2]
- [1;1;1;1] + [1;1;2]
- [1;1;1;1,1] + [1;1;1]

The first half is always priced directly as there is never more than one book of each kind. The second half is priced recursively using the same principle.

The algorithm is the following:

let rec priceTypes priceBucket (types: int list) = let nbTypes = List.length types if nbTypes = 0 then 0.0M else let allPrices = seq { for bucketSize = 1 to nbTypes do let bucketPrice = priceBucket bucketSize let remainingTypes = types |> Seq.mapi (fun i n -> if i < bucketSize then n - 1 else n) |> Seq.filter (fun n -> n > 0) |> Seq.sort |> Seq.toList |> List.rev yield bucketPrice + priceTypes priceBucket remainingTypes } allPrices |> Seq.min

There are **many** combinations. This will lead to **long** computations, even with a low number of books. But many of these computations will share some identical combinations, and they don’t need to be run several times. This is just what memoization is designed for. The principle is simple: when an idempotent function gets the same input twice, it must generate the same output. From there, if we can record the generated output the first time and associate it with the input, the next time we get this input we can directly return the recorded output.

A standard memoize function can then be defined:

let memoize f = let dict = Dictionary<_, _>() fun x -> match dict.TryGetValue(x) with | true, res -> res | false, _ -> let res = f x dict.Add(x, res) res

The things get a little more complicated when recursion is involved because in order to memoize effectively a recursive function, the function itself has to call its memoized version recursively (instead of itself). It gets even more complicated if you need the whole recursive thing to be tail-recursive. For now, hopefully, we don’t need that, because the tree of combinations that we try to build is quite wide, but not very deep.

My implementation of the memoized version is the following:

let rec priceTypes priceBucketFunc = let rec memoizedPriceTypes = let priceTypesRaw (types: int list) = let nbTypes = List.length types if nbTypes = 0 then 0.0M else let allPrices = seq { for bucketSize = 1 to nbTypes do let bucketPrice = priceBucketFunc bucketSize let remainingTypes = types |> Seq.mapi (fun i n -> if i < bucketSize then n - 1 else n) |> Seq.filter (fun n -> n > 0) |> Seq.sort |> Seq.toList |> List.rev yield bucketPrice + memoizedPriceTypes remainingTypes } allPrices |> Seq.min memoize priceTypesRaw memoizedPriceTypes

We can now build a price function that takes a “priceBucket” function as an argument, and returns the actual pricing function:

let price priceBucketFunc books = let types = books |> Seq.countBy id |> Seq.map snd |> Seq.sort |> Seq.toList |> List.rev priceTypes priceBucketFunc types

The function can then be used this way:

let priceBucket = function | 1 -> 8.0M | 2 -> 16.0M | 3 -> 16.0M | 4 -> 25.6M | 5 -> 30.0M | _ -> failwith "Really ? 6 books ?" let price = PotterKataTooMuch.price priceBucket price [1;1;2;2;3;3;4;5]

It works on small to medium lists of books (such as 50 books for instances). Let’s pretend the world is perfect, and that it’s enough for today!

The code from this post and the previous one is available on GitHub.