Property-Based Testing with PropEr, Erlang, and...

Extracted from:

Property-Based Testing withPropEr, Erlang, and Elixir

Find Bugs Before Your Users Do

This PDF file contains pages extracted from Property-Based Testing with PropEr,Erlang, and Elixir, published by the Pragmatic Bookshelf. For more information

or to purchase a paperback or PDF copy, please visit http://www.pragprog.com.

Note: This extract contains some colored text (particularly in code listing). Thisis available only in online versions of the books. The printed versions are blackand white. Pagination might vary between the online and printed versions; the

content is otherwise identical.

Copyright © 2019 The Pragmatic Programmers, LLC.

All rights reserved.

No part of this publication may be reproduced, stored in a retrieval system, or transmitted,in any form, or by any means, electronic, mechanical, photocopying, recording, or otherwise,

without the prior consent of the publisher.

The Pragmatic BookshelfRaleigh, North Carolina

http://www.pragprog.com

Property-Based Testing withPropEr, Erlang, and Elixir

Find Bugs Before Your Users Do

Fred Hebert

The Pragmatic BookshelfRaleigh, North Carolina

Many of the designations used by manufacturers and sellers to distinguish their productsare claimed as trademarks. Where those designations appear in this book, and The PragmaticProgrammers, LLC was aware of a trademark claim, the designations have been printed ininitial capital letters or in all capitals. The Pragmatic Starter Kit, The Pragmatic Programmer,Pragmatic Programming, Pragmatic Bookshelf, PragProg and the linking g device are trade-marks of The Pragmatic Programmers, LLC.

Every precaution was taken in the preparation of this book. However, the publisher assumesno responsibility for errors or omissions, or for damages that may result from the use ofinformation (including program listings) contained herein.

Our Pragmatic books, screencasts, and audio books can help you and your team createbetter software and have more fun. Visit us at https://pragprog.com.

The team that produced this book includes:

Publisher: Andy HuntVP of Operations: Janet FurlowManaging Editor: Susan ConantDevelopment Editor: Jacquelyn CarterCopy Editor: L. Sakhi MacMillanIndexing: Potomac Indexing, LLCLayout: Gilson Graphics

For sales, volume licensing, and support, please contact [email protected].

For international rights, please contact [email protected].

Copyright © 2019 The Pragmatic Programmers, LLC.

All rights reserved. No part of this publication may be reproduced, stored in a retrieval system,or transmitted, in any form, or by any means, electronic, mechanical, photocopying, recording,or otherwise, without the prior consent of the publisher.

ISBN-13: 978-1-68050-621-1Book version: P1.0—January 2019

https://pragprog.com

[email protected]

[email protected]

CHAPTER 1

Foundations of Property-Based TestingTesting can be a pretty boring affair. It’s a necessity you can’t avoid: the onething more annoying than having to write tests is not having tests for yourcode. Tests are critical to safe programs, especially those that change overtime. They can also prove useful to help properly design programs, helpingus write them as users as well as implementers. But mostly, tests are repetitiveburdensome work.

Take a look at this example test that will check that an Erlang function can takea list of presorted lists and always return them merged as one single sorted list:

merge_test() ->[] = merge([]),[] = merge([[]]),[] = merge([[],[]]),[] = merge([[],[],[]]),[1] = merge([[1]]),[1,1,2,2] = merge([[1,2],[1,2]]),[1] = merge([[1],[],[]]),[1] = merge([[],[1],[]]),[1] = merge([[],[],[1]]),[1,2] = merge([[1],[2],[]]),[1,2] = merge([[1],[],[2]]),[1,2] = merge([[],[1],[2]]),[1,2,3,4,5,6] = merge([[1,2],[],[5,6],[],[3,4],[]]),[1,2,3,4] = merge([[4],[3],[2],[1]]),[1,2,3,4,5] = merge([[1],[2],[3],[4],[5]]),[1,2,3,4,5,6] = merge([[1],[2],[3],[4],[5],[6]]),[1,2,3,4,5,6,7,8,9] = merge([[1],[2],[3],[4],[5],[6],[7],[8],[9]]),Seq = seq(1,100),true = Seq == merge(map(fun(E) -> [E] end, Seq)),ok.

This is slightly modified code taken from the Erlang/OTP test suites for thelists module, one of the most central libraries in the entire language. It is

• Click HERE to purchase this book now. discuss

http://pragprog.com/titles/fhproper

http://forums.pragprog.com/forums/fhproper

boring, repetitive. It’s the developer trying to think of all the possible waysthe code can be used, and making sure that the result is predictable. Youcould probably think of another ten or thirty lines that could be added, andit could still be significant and explore the same code in somewhat differentways. Nevertheless, it’s perfectly reasonable, usable, readable, and effectivetest code. The problem is that it’s just so repetitive that a machine could doit. In fact, that’s exactly the reason why traditional tests are boring. They’recarefully laid out instructions where we tell the machine which test to runevery time, with no variation, as a safety check.

It’s not just that a machine could do it, it’s that a machine should do it. We’respending our efforts the wrong way, and we could do better than this withthe little time we have.

This is why property-based testing is one of the software development practicesthat generated the most excitement in the last few years. It promises better,more solid tests than nearly any other tool out there, with very little code.This means, similarly, that the software we develop with it should also getbetter—which is good, since overall software quality is fairly dreadful(including my own code). Property-based testing offers a lot of automation tokeep the boring stuff away, but at the cost of a steeper learning curve and alot of thinking to get it right, particularly when getting started. Here’s whatan equivalent property-based test could look like:

sorted_list(N) -> ?LET(L, list(N), sort(L)).

prop_merge() ->?FORALL(List, list(sorted_list(pos_integer())),

merge(List) == sort(append(List))).

Not only is this test shorter with just four lines of code, it covers more cases.In fact, it can cover hundreds of thousands of them. Right now the property-based test probably looks like a bunch of gibberish that can’t be executed (atleast not without the PropEr framework), but in due time, this will be prettyeasy for you to read, and will possibly take less time to understand than eventhe long-form traditional test would.

Of course, not all test scenarios will be looking so favorable to property-basedtesting, but that is why you have this book. Let’s get started right away: inthis chapter, we’ll see the results we should expect from property-basedtesting, and we’ll cover the principles behind the practice and how theyinfluence the way to write tests. We’ll also pick the tools we need to get going,since, as you’ll see, property-based testing does require a framework to beuseful.

• 6




Promises of Property-Based TestingProperty-based tests are different from traditional ones and require morethinking. A lot more. Good property-based testing is a learned and practicedskill, much like playing a musical instrument or getting good at drawing: youcan get started and benefit from it in plenty of friendly ways, but the skillceiling is very high. You could write property tests for years and still find waysto improve and get more out of them. Experts at property-based testing cando some pretty amazing stuff.

The good news is that even as a beginner, you can get good results out ofproperty-based testing. You’ll be able to write simple, short, and concise teststhat automatically comb through your code the way only the most obsessivetester could. Your code coverage will go high and stay there even as youmodify the program without changing the tests. You’ll even be able to usethese tests to find new edge cases without even needing to modify anything.

With a bit more experience, you’ll be able to write straightforward integrationtests of stateful systems that find complex and convoluted bugs nobody eventhought could be hiding in there. In fact, if you currently feel like a code basewithout unit tests is not trustworthy, you’ll probably discover the same phe-nomenon with property testing real soon. You’ll have seen enough of whatproperties can do to see the shadows of bugs that haven’t been discoveredyet, and just feel something is missing, until you’ve given them a propershakedown.

Overall, you’ll see that property-based testing is not just using a bunch oftools to automate boring tasks, but a whole different way to approach testing,and to some extent, software design itself. Now that’s a bold claim, but weonly have to look at what experts can do to see why that might be so. Anexample can be found in Thomas Arts’ slide set1 and presentation2 from theErlang Factory 2016. In that talk, he mentioned using QuickCheck (thecanonical property-based testing tool) to run tests on Project FIFO,3 an opensource cloud project. With a mere 460 lines of property tests, they covered60,000 lines of production code and uncovered twenty-five important bugs,including:

• Timing errors• Race conditions

1. http://www.erlang-factory.com/static/upload/media/1461230674757746pbterlangfactorypptx.pdf2. https://youtu.be/iW2J7Of8jsE3. https://project-fifo.net


Promises of Property-Based Testing • 7

http://www.erlang-factory.com/static/upload/media/1461230674757746pbterlangfactorypptx.pdf

https://youtu.be/iW2J7Of8jsE

https://project-fifo.net



• Type errors• Incorrect use of library APIs• Errors in documentation• Errors in the program logic• System limits errors• Errors in fault handling• One hardware error

Considering that some studies estimate that developers average six softwarefaults per 1,000 lines of code, then finding twenty-five important bugs with460 lines of tests is quite a feat. That’s finding over fifty bugs per 1,000 linesof test, with each of these lines covering 140 lines of production code. If thatdoesn’t make you want to become a pro, I don’t know what will.

Let’s take a look at some more-expert work. Joseph Wayne Norton ran aQuickCheck suite4 of under 600 lines over Google’s levelDB to find sequencesof seventeen and thirty-one specific calls that could corrupt databases withghost keys. No matter how dedicated to the task someone is, nobody wouldhave found it easy to come up with the proper sequence of thirty-one callsrequired to corrupt a database.

Again, those are amazing results; that’s a surprisingly low amount of codeto find a high number of nontrivial errors on software that was otherwisealready tested and running in production. Property-based testing is soimpressive that it has wedged itself in multiple industries, including mission-critical telecommunication components,5 databases,6 components of cloudproviders’ routing and certificate-management layers, IoT platforms, and evenin cars.7

We won’t start at that level, but if you follow along (and do the exercises andpractice enough), we might get pretty close. Hopefully, the early lessons willbe enough for you to start applying property-based testing in your ownprojects. As we go, you’ll probably feel like writing fewer and fewer traditionaltests and will replace them with property tests instead. This is a really goodthing, since you’ll be able to delete code that you won’t have to ever maintainagain, at no loss in the quality of your software.

4. http://htmlpreview.github.io/?https://raw.github.com/strangeloop/lambdajam2013/master/slides/Norton-QuickCheck.html

5. http://www.erlang-factory.com/conference/ErlangUserConference2010/speakers/TorbenHoffman6. http://www.erlang-factory.com/upload/presentations/255/RiakInside.pdf7. http://www2015.taicpart.org/slides/Arts-TAICPART2015-Presentation.pdf

• 8


http://htmlpreview.github.io/?https://raw.github.com/strangeloop/lambdajam2013/master/slides/Norton-QuickCheck.html

http://htmlpreview.github.io/?https://raw.github.com/strangeloop/lambdajam2013/master/slides/Norton-QuickCheck.html

http://www.erlang-factory.com/conference/ErlangUserConference2010/speakers/TorbenHoffman

http://www.erlang-factory.com/upload/presentations/255/RiakInside.pdf

http://www2015.taicpart.org/slides/Arts-TAICPART2015-Presentation.pdf



But as in playing music or drawing, things will sometimes be hard. When wehit a wall, it can be tempting (and easy) to tell ourselves that we just don’thave that innate ability that experts must have. It’s important to rememberproperty-based testing is not a thing reserved for wizard programmers. Theeffort required to improve is continuous, and the progress is gradual andstepwise. Each wall you hit reveals an opportunity for improvement. We’ll getthere, one step at a time.

Properties FirstBefore jumping to our tools and spitting out lines of code, the first thing todo is to get our fundamentals right and stop thinking property-based testingis about tests. It’s about properties. Let’s take a look at what the differenceis, and what thinking in properties looks like.

Conceptually, properties are not that complex. Traditional tests are oftenexample-based: you make a list of a bunch of inputs to a given program andthen list the expected output. You may put in a couple of comments aboutwhat the code should do, but that’s about it. Your test will be good if you canhave examples that can exercise all the possible program states you have.

By comparison, property-based tests have nothing to do with listing examplesby hand. Instead, you’ll want to write some kind of meta test: you find a rulethat dictates the behavior that should always be the same no matter whatsample input you give to your program and encode that into some executablecode—a property. A special test framework will then generate examples foryou and run them against your property to check that the rule you came upwith is respected by your program.

In short, traditional tests have you come up with examples that indirectlyimply rules dictating the behavior of your code (if at all), and property-basedtesting asks you to come up with the rules first and then tests them for you.It’s a similar distinction as the one you’d get between imperative and declar-ative programming styles, but pushed to the next level.

The Example-Based WayHere’s an example. Let’s say we have a function to represent a cash register.The function should take in a series of bills and coins representing what’salready in the register, an amount of money due to be paid by the customer,and then the money handed by the customer to the cashier. It should returnthe bills and coins to cover the change due to the customer.

An approach based on unit tests may look like the following:


Properties First • 9



%% Money in the cash registerRegister = [{20.00, 1}, {10.00, 2}, {5.00, 4},

{1.00, 10}, {0.25, 10}, {0.01, 100}],%% Change = cash(Register, PriceToPay, MoneyPaid),[{10.00, 1}] = cash(Register, 10.00, 20.00),❶[{10.00, 1}, {0.25, 1}] = cash(Register, 9.75, 20.00),❷[{0.01, 18}] = cash(Register, 0.82, 1.00),[{10.00, 1}, {5.00, 1}, {1.00, 3}, {0.25, 2}, {0.01, 13}]❸

= cash(Register, 1.37, 20.00).

At ❶, the test says that a customer paying a $10 item with $20 should expecta single $10 bill back. The case at ❷ says that for a $9.75 purchase paid with$20, a $10 bill with a quarter should be returned, for a total of $10.25.Finally, the test at ❸ shows a $1.37 item paid with a $20 bill yields $18.63in change, with the specific cuts shown.

That’s a fairly familiar approach. Come up with a bunch of arguments withwhich to call the function, do some thinking, and then write down theexpected result. By listing many examples, we try to cover the full set of rulesand edge cases that describe what the code should do. In property-basedtesting, we have to flip that around and come up with the rules first.

The Properties-Based WayThe difficult part is figuring out how to go from our abstract ideas about theprogram behavior to specific rules expressed as code. Continuing with ourcash register example, two rules to encode as properties could be:

• The amount of change is always going to add up to the amount paid minusthe price charged.

• The bills and coins handed back for change are going to start from thebiggest bill possible first, down to the smallest coin possible. This couldalternatively be defined as trying to hand the customer as few individualpieces of money as possible.

Let’s assume we magically encode them into functioning Erlang code (doingthis for real is what this book is about—we can’t show it all in the firstchapter). Our test, as a property, could look something like this:

for_all(RegisterMoney, PriceToPay, MoneyPaid) ->Change = cash(RegisterMoney, PriceToPay, MoneyPaid),sum(Change) == MoneyPaid - PriceToPayandfewest_pieces_possible(RegisterMoney, Change).

Given some amount of money in the register, a price to pay, and an amountof money given by the customer, call the cash/3 function, and then check the

• 10




change given. You can see the two rules encoded there: the first one checkingthat the change balances, and the second one checking that the smallestamount of bills and coins possible is returned (the implementation of whichis not shown).

This property alone is useless. What is needed to make it functional is aproperty-based testing framework. The framework should figure out how togenerate all the inputs required (RegisterMoney, PriceToPay, and MoneyPaid), andthen it should run the property against all the inputs it has generated. If theproperty always remains true, the test is considered successful. If one of thetest cases fails, a good property-based testing framework will modify thegenerated input until it can come up with one that can still provoke the failure,but that is as small as possible—a process called shrinking. Maybe it wouldfind a mistake with a cash register that has a billion coins and bills in it, andan order price in the hundreds of thousands of dollars, but could then replicatethe same failure with only $5 and a cheap item. If so, our debugging job ismade a lot easier by the framework because a simple input means it’s wayeasier to walk the code to figure out what happened.

For example, such a framework could generate inputs giving a call likecash([{1.00, 2}], 1.00, 2.00). Whatever the denomination, we might expect the cash/3function would return a $1 bill and pass. Sooner or later, it would generatean input such as cash([{5.00, 1}], 20.00, 30.00), and then the program would crashand fail the property because there’s not enough change in the register. Payinga $20 purchase with $30, even if the register holds only $5 is entirely possible:take $10 from the $30 and give it back to the customer. Is that specific amountpossible to give back though? In real life, yes. We do it all the time. But inour program, since the money taken from the customer does not come in asbills, coins, or any specific denomination, there is no way to use part of theinput money to form the output. The interface chosen for this function iswrong, and so are our tests.

Comparing ApproachesLet’s take a step back and compare example-based tests with property testsfor this specific problem. With the former, even if all the examples we hadcome up with looked reasonable, we easily found ourselves working withinthe confines of the code and interface we had established. We were not reallytesting the code, we were describing its design, making sure it conformed toexpectations and demands while making sure we don’t slip in the future. Thisis valuable for sure, but properties gave us something more: they highlighteda failure of imagination.


Properties First • 11



Example-based unit tests made it easy to lock down bugs we could see coming,but those we couldn’t see coming at all were left in place and would probablyhave made it to production. With properties (and a framework to executethem), we can instead explore the problem space more in depth, and findbugs and design issues much earlier. The math is simple: bugs that make itto production are by definition bugs that we couldn’t see coming. If a tool letsus find them earlier, then production will see fewer bugs.

To put it another way, if example-based testing helps ensure that code doeswhat we expect, property-based testing forces the exploration of the program’sbehavior to see what it can or cannot do, helping find whether our expectationswere even right to begin with. In fact, when we test with properties, the designand growth of tests requires an equal part of growth and design of the programitself. A common pattern when a property fails will be to figure out if it’s thesystem that is wrong or if it’s our idea of what it should do that needs tochange. We’ll fix bugs, but we’ll also fix our understanding of the problemspace. We’ll be surprised by how things we thought we knew are far morecomplex and tricky than we thought, and how often it happens.

That’s thinking in properties.

• 12




Property-Based Testing with PropEr, Erlang, and...

Documents

Transcript of Property-Based Testing with PropEr, Erlang, and...