
Saturday, August 09, 2014

spies versus mocks, 2014 edition

Johnny says to learn to use your debugger, kids. Also, chew your gum.

After 6 years of mostly doing low-level C and C++ agile engineering (along with some personal C# projects), I came back into the Ruby and JavaScript fold last year when I accepted a CTO position. I hadn't paid especially close attention to the Rails and front-end JavaScript worlds in the meantime, but was very much looking forward to seeing how things had improved from a developer practices and tooling perspective.

One interesting thing is that edit/test cycles are so fast at this point with interpreted (read: non-compiled) languages that many developers don't really use the debugger. To them, it's easier to iterate 2 or 3 times by adding console.log() or Rails.logger.* calls, re-running the test (sometimes manually), and checking the output. An older engineer I worked with over a decade ago called this practice of incrementally adding temporary log messages "tombstoning": tombstones are what you trip over when running through a graveyard. This practice isn't really viable in large C/C++ codebases, where even the fastest build+test cycles are still in the high double-digit seconds.
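For anyone who hasn't seen the term in action, a tombstoning session might look something like the following (a hypothetical Rails-flavored method and log messages, purely for illustration):
--
def deliver_invitations(invitations)
  Rails.logger.debug "got here with #{invitations.size} invitations"   # tombstone #1
  invitations.each do |invitation|
    Rails.logger.debug "about to deliver #{invitation.inspect}"        # tombstone #2
    invitation.deliver
  end
  Rails.logger.debug "made it past the delivery loop"                  # tombstone #3
end
--
Each tombstone gets added (and, hopefully, later removed) one test re-run at a time.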

Despite the process being fast enough to iterate this way thanks to guard-livereload (and other great tools), I still see the waste of context-switching in this practice -- especially when the cycle involves going from the code editor to a web browser, sometimes having to refresh manually, then manually clicking through a use case. With the several people I paired with on these Rails, Node.js, and ReactJS+Backbone components over the last year, I had to re-introduce and gently push the practice of using the debugger to set a breakpoint, to cut the number of times we had to re-run the test (and context-switch between several other developer tools). Chrome, Firefox, and Internet Explorer 11 all have fantastic debuggers (each with their own strengths and weaknesses), and RubyMine's Ruby/Rails debugger integration works *extremely* well in version 6.3.x on OS X Mavericks. What I often heard is that both JavaScript and Ruby debuggers had *previously* been flaky and/or hard to use, and so people fall back on 1960s-style debugging practices.
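On the Ruby side, the breakpoint equivalent of the tombstones above is a one-liner -- a minimal sketch assuming the byebug gem (pry-byebug or RubyMine's built-in debugger serve the same purpose):
--
require 'byebug'

def deliver_invitations(invitations)
  # Execution pauses here on the next test run: inspect `invitations`,
  # step through the loop, and walk up and down the stack frames.
  byebug
  invitations.each(&:deliver)
end
--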

Wait, isn't this post supposed to be about Strict Mocks versus Spies? Yes, it was -- and it is. The perceived maturity and reliability of debuggers, and of tooling in general, play into a root cause of why many developers I have been working with favor spies over strict mocks. There is also seemingly still a general affinity in the Ruby community toward optimizing code for readability, often by leveraging deep Ruby features and optional syntax to create DSLs that read left-to-right, at the expense (or as an afterthought) of other medium-term aspects of developer productivity* -- namely, how efficiently you can figure out what's going on when a test FAILS at runtime.

We have all heard the adage from Bob Martin and Kent Beck that code will be read many more times than it is written, and that is the economy to optimize for. I totally agree, but, like many XP practices, the principle behind this is a reaction against an anti-pattern of "macho" coding where people are proud of their 280-character one-liner with no spaces whatsoever. The principle of being mindful of your current and future peers -- and your future self -- with regard to making code, especially tests, approachable and "as simple as possible, no simpler", is an important one. Unfortunately, between the DSL-mania at the end of the last decade and the perception of poor tooling, people focused purely on the static approachability of code and tests and not the runtime aspects.

The nice thing about Spies is that they blend well with existing assertions, after arranging and invoking the module/class/function being tested. Consistency is good. The downside is when a Spy assertion like this fails:
--
let(:invitation) { double('invitation').as_null_object }

before do
  InvitationSender.send(invitation)
end

it "delivers invitations to both recipients" do
  expect(invitation).to have_received(:deliver).twice
end
--

Let's say the test fails because the invitation.deliver method is called three times instead of the specified two times. Where did the third call come from? At this point, the next step might be to do a grep/find in the sources looking for references to the deliver method and then eyeball the code, playing computer in our heads to see where the logic fell down. Now, if we are keeping our changes small and our intervals between test feedback short, then both of those practices would reduce the scope we have to manually review. Still, let's say we just upgraded a Ruby gem, Node.js package, or what have you -- there's no easy way to slice a common development task like that any thinner.

Manually reviewing in an instance like this often means trying to follow the call stack down in the IDE and ending up in your third-party gems/modules. This huge context switch is a major productivity killer, and can be even more frustrating if you and your pair were previously cruising along. What we really want to know is, "where did the unexpected call come from?" -- more specifically, "what is the call stack (and conditional branches) that led to the unexpected call?".

This is where strict mocks come in handy: in this instance, when the expected number of calls is exceeded, or a parameter constraint does not match, an exception will be thrown/raised, and we'll get a call stack in our console output. If we need more context, we can just re-run the test, this time under the debugger. Under the debugger, when the invariant violation for the mock is thrown, we'll be able to switch across the frames of the call stack and inspect local variables right then. It's possible we'll need to do one more cycle where we set specific breakpoints, but often the cycle ends there -- inspecting the variables in the stack frames and in the global context gives us enough information. Also, we reduce the context switches between different tools that have differing visual layouts, key bindings, etc.
--
let(:invitation) { double('invitation').as_null_object }

it "delivers invitations to both recipients" do
  expect(invitation).to receive(:deliver).twice
  InvitationSender.send(invitation)
end
--

So, there's a pretty good case for using strict mocks instead of spies when our test is pinning down the *maximum* number of calls expected for a mocked method/function, or constraining the parameters passed to the mock. What about other kinds of constraints, like saying a method is called at *least* a certain number of times? In that instance, we can't fail the test until the actions of the object under test are finished -- there's no real way of catching the object misbehaving "in the act" like there is when setting an expectation for a maximum number of calls. So, a spy/verify approach is perfectly reasonable.
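For example, an at-least constraint reads naturally in the spy/verify style -- a sketch in RSpec, reusing the hypothetical invitation double from above:
--
it "delivers the invitation at least twice" do
  InvitationSender.send(invitation)

  # Verify after the fact; there is no "in the act" moment to catch,
  # since any number of calls >= 2 satisfies the constraint.
  expect(invitation).to have_received(:deliver).at_least(:twice)
end
--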

At this point, some people may be worrying about consistency: if I use strict mocks only some of the time (during the setup/arrange part of a test) and then spies some of the time, then it'll be 1) harder to read and 2) harder to understand when to do which style for people who are new to testing/mocks/etc. Both points are valid, and I therefore would recommend consistently using strict mocks. Note that it is possible to go overboard with strict mocks, especially with parameter constraints or ordered calls, and end up with tests that are too tightly coupled to the implementation details. This can result in 1) tests that have a near-parallel change rate to the implementation, making for higher test maintenance; and 2) cascading failures where many tests all fail for the same reason -- both issues are classic ways of reducing the ROI of developer testing. Sometimes this is a smell that the class(es) you are mocking have too granular of a public interface. Sometimes the tight coupling between tests and implementation details comes from someone with a design-by-contract (DBC) background not realizing the longer-term maintenance issues created by putting things into the test that do not relate to the specific scenario the test is documenting. It can also come from someone who's just plain excited about the mock framework being used and wants to try every feature of it. It's also possible to write spies that are tightly coupled to the implementation, so neither style is a panacea for the problem of properly onboarding and mentoring new developers in your codebase and developer testing practices at large.
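As a hypothetical illustration of going overboard, an expectation like the following pins down argument values and call ordering that have nothing to do with "both recipients get an invitation", so nearly any refactoring of InvitationSender will break it (the channel and retry arguments are made up for the sketch):
--
it "delivers invitations to both recipients" do
  # Over-specified: exact arguments and strict ordering are implementation
  # details, not part of the scenario this test is documenting.
  expect(invitation).to receive(:deliver).with(:email, retries: 3).ordered
  expect(invitation).to receive(:deliver).with(:sms, retries: 3).ordered
  InvitationSender.send(invitation)
end
--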

In summary, all Ruby and JavaScript (front-end and Node.js) developers should take half a day and learn to use the debugger in their IDE/runtime of choice. Add a throw/raise and make sure the debugger stops, inspect the different stack frames, and explore the variables/objects in those stack frames. Consumers of test/mock frameworks, and the designers of those frameworks, need to be mindful of optimizing not just the writing and reading aspects, but also the debugging aspects when a given assertion or mock constraint fails. If the observe->understand->fix->re-test cycle is too expensive, then the longer-term maintenance cost/friction of a reasonably growing test suite will undermine the ROI of developer testing and potentially cause a backlash. Like most things when it comes to development practices, the whys and the hows are fairly complex and nuanced -- there's no bumper-sticker way to drum up zealotry.

* I think a big part of why medium- to long-term developer productivity issues don't come into play is that these frameworks are often driven by consultants who never have to deal with a given product/project beyond the first 3-6 months or the first 6-12 full-time engineers. As a result, a lot of the perceptions, practices, and frameworks put forth by shorter-term consultants tend to over-optimize for the up-front inception/bootstrapping of a project and its initial developers.