TDD, Mocks and Design

Mike Feathers has posted an exploration of some ideas about and misconceptions of TDD. I wish that more people were familiar with this story that he mentions:
John Nolan, the CTO of a startup named Connextra [...] gave his developers a challenge: write OO code with no getters. Whenever possible, tell another object to do something rather than ask. In the process of doing this, they noticed that their code became supple and easy to change.
That's right: no getters. Well, Steve Freeman was amongst those developers and the rest is history. Tim Mackinnon tells another part of the story. I think that there's actually a little bit missing from Michael's description. I'll get to it at the end.


A World Without Getters

Suppose that we want to print a value that some object can provide. Rather than writing something like statement.append(account.getTransactions()), we would instead write something more like account.appendTransactionsTo(statement). We can test this easily by passing in a mocked statement that expects to have a call like append(transaction) made. Code written this way does turn out to be more flexible, easier to maintain and also, I submit, easier to read and understand, partly because this style lends itself well to the use of Intention Revealing Names.
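As a minimal sketch of that idea, assuming Java and a hand-rolled recording object in place of a mocking library: Transaction, record() and RecordingStatement are hypothetical names introduced for illustration, and only appendTransactionsTo(statement) and append(transaction) come from the text above.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical value type; the post does not say what a transaction looks like.
final class Transaction {
    private final String description;
    Transaction(String description) { this.description = description; }
    @Override public String toString() { return description; }
}

// The narrow role the account talks to: anything that can have a transaction appended.
interface Statement {
    void append(Transaction transaction);
}

// No getTransactions(): the account is told to push its transactions somewhere.
final class Account {
    private final List<Transaction> transactions = new ArrayList<>();

    // Hypothetical command for getting transactions into the account.
    void record(Transaction transaction) { transactions.add(transaction); }

    void appendTransactionsTo(Statement statement) {
        for (Transaction transaction : transactions) {
            statement.append(transaction);
        }
    }
}

// Hand-rolled stand-in for a mock: it records the calls it receives so a test can check them.
final class RecordingStatement implements Statement {
    final List<Transaction> appended = new ArrayList<>();
    @Override public void append(Transaction transaction) { appended.add(transaction); }
}

final class TellDontAskDemo {
    public static void main(String[] args) {
        Account account = new Account();
        account.record(new Transaction("deposit"));
        account.record(new Transaction("withdrawal"));

        RecordingStatement statement = new RecordingStatement();
        account.appendTransactionsTo(statement);

        System.out.println(statement.appended); // prints [deposit, withdrawal]
    }
}
```

A mocking library would replace RecordingStatement with an expectation that append is called once per transaction; the shape of the test is the same either way.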

This is the real essence of TDD with Mocks. It happens to be true that we can use mocks to stub out databases or web services or what all else, but we shouldn't. Not doing that leads us to write code for each sub-domain within our application in terms of very narrow, very specific interfaces with other sub-domains and to write transducers that sit at the boundaries of those domains. This is a good thing. At the largest scale, with functional tests, it leads to hexagonal architecture. And that can apply equally well recursively down to the level of individual objects.

The next time someone tries to tell you that an application has a top and a bottom and a one-dimensional stack of layers in between like pancakes, try exploring with them the idea that what systems really have is an inside and an outside and a nest of layers like an onion. It works remarkable wonders.

If we've decided that we don't mock infrastructure, and we have these transducers at domain boundaries, then we write the tests in terms of the problem domain and get a good OO design. Nice.


The World We Actually Live In

Let's suppose that we work in a mainstream IT shop, doing in-house development. Chances are that someone will have decided (without thinking too hard about it) that the world of facts that our system works with will live in a relational database. It also means that someone (else) will have decided that there will be an object-relational mapping layer, based on the inference that since we are working in Java(C#) which is deemed by Sun(Microsoft) to be an object-oriented language then we are doing object-oriented programming. As we shall see, this inference is a little shaky.

Well, a popular approach to this is to introduce a Data Access Object as a facade onto wherever the data actually lives. The full-blown DAO pattern is a hefty old thing, but note the "transfer object" which the data source (inside the DAO) uses to pass values to and receive values from the business object that's using the DAO. These things are basically structs, their job is to carry a set of named values. And if the data source is hooked up to an RDBMS then they more-or-less represent a row in a table. And note that the business object is different from the transfer object. The write-up that I've linked to is pretty generic, but the inference seems to be invited that the business object is a big old thing with lots of logic inside it.

A lot of the mechanics of this are rolled up into nice frameworks and tools such as Hibernate. Now, don't get me wrong in what follows: Hibernate is great stuff. I do struggle a bit with how it tends to be used, though. Hibernate shunts data in and out of your system using transfer objects, which are (let's say) Java Beans festooned with getters and setters. That's fine. The trouble begins with the business objects.


Irresponsible and Out of Control

In this world another popular approach is, whether it's named as such or not, whether it's explicitly recognized or not, robustness analysis. A design found by robustness analysis (as I've seen it in the wild, which may well not be what's intended, see comments on ICONIX) is built out of "controllers", big old lumps of logic, and "entities", bags of named values. (And a few other bits and bobs.) Can you see where this is going? There are rules for robustness analysis and one of them is that entities are not allowed to interact directly, but a controller may have many entities that it uses together.

Can you imagine what the code inside the update method on the GenerateStatementController (along with its Statement and Account entities) might look like?
Hmmm.
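
One guess at an answer, as a hedged sketch in Java. Everything here is assumed for illustration: the entity classes, their fields and the signature of update are hypothetical, and only the name GenerateStatementController and the idea of Statement and Account entities come from the text.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical "entities": bags of named values with accessors and no behaviour.
final class TransactionEntity {
    private String description;
    String getDescription() { return description; }
    void setDescription(String description) { this.description = description; }
}

final class AccountEntity {
    private List<TransactionEntity> transactions = new ArrayList<>();
    List<TransactionEntity> getTransactions() { return transactions; }
    void setTransactions(List<TransactionEntity> transactions) { this.transactions = transactions; }
}

final class StatementEntity {
    private List<String> lines = new ArrayList<>();
    List<String> getLines() { return lines; }
    void setLines(List<String> lines) { this.lines = lines; }
}

// The "controller": all of the logic lives here, reaching into entities
// that are not allowed to talk to each other directly.
final class GenerateStatementController {
    void update(AccountEntity account, StatementEntity statement) {
        List<String> lines = new ArrayList<>(statement.getLines());
        for (TransactionEntity transaction : account.getTransactions()) {
            lines.add(transaction.getDescription());
        }
        statement.setLines(lines);
    }
}
```

Every line of update is a get or a set on somebody else's data, which is the pattern the rest of this post argues against.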


Classy Behaviour

Whenever I've taught robustness analysis I've always contrasted it with Class Responsibility Collaboration, a superficially similar technique that produces radically different results. The lesson has been that RA-style controllers always, but always, hide valuable domain concepts.

It's seductively easy to bash in a controller for a use case and then bolt on a few passive entities that it can use without really considering the essence of the domain. What you end up with is the moral equivalent of stored procedures and tables. That's not necessarily wrong, and it's not even necessarily bad depending on the circumstances. But it is completely missing the point of the last thirty-odd years' worth of advances in development technique. One almost might as well be building the system in PRO*C.

Anyway, with CRC all of the objects we find are assumed to have the capability of knowing things and doing stuff. In RA we assume that objects either know stuff or do stuff. And how's a know-nothing stuff-doer get the information to carry out its work? Why, it uses a passive knower, an entity which (ta-daaah!) pops ready made out of a DAO in the form of a transfer object.

And actually that is bad.


Old Skool

Back in the day the masters of structured programming[pdf] worried a lot about various coupling modes that can occur between two components in a system. One of these is "Stamp Coupling". We are invited to think of the "stamp" or template from which instances of a struct are created. Stamp coupling is considered (in the structured design world) one of the least bad kinds of coupling. Some coupling is inevitable, or else your system won't work, so one would like to choose the least bad ones, and (as of 1997) stamp coupling was a recommended choice.

OK, so the thing about stamp coupling is that it implicitly couples together all the client modules of a struct. If one of them changes in a way that requires the shape of the struct to change then all the clients are impacted, even if they don't use the changed or new or deleted field. That actually doesn't sound so great, but if you're bashing out PL/1 it's probably about the best you can do. Stamp coupling is second best, with only "data" coupling as preferable: the direct passing of atomic values as arguments. Atomic data, eh? We'll come back to that.

However, the second worst kind of coupling that the gurus identified was "common coupling". What that originally meant was something like a COMMON block in Fortran, or global variables in C, or pretty much everything in a COBOL program: just a pile of values that all modules/processes/what have you can go and monkey with. Oops! Isn't that what a transfer object that comes straight out of a (single, system-wide) database ends up being? This is not looking so good now.

What about those atomic data values? What was meant back in the day was what we would now call native types: int, char, that sort of thing. The point being that these are safe because it's profoundly unlikely that some other application programmer is going to kybosh your programming effort by changing the layout of int.
And the trouble with structs is that they can. And the trouble with transfer objects covered in getters and setters is that they can, too. But what if there were none...


Putting Your Head in a Bag Doesn't Make you Hidden

David Parnas helped us all out a lot when in 1972 he made some comments[pdf] on the criteria to be used in decomposing systems into modules:
Every module [...] is characterized by its knowledge of a design decision which it hides from all others. Its interface or definition was chosen to reveal as little as possible about its inner workings.
Unfortunately, this design principle of information hiding has become fatally confused with the implementation technique of encapsulation.

If the design of a class involves a member private int count then encapsulating that behind a getter public int getCount() hides nothing. When (not if) count gets renamed, changed to a big integer class, or whatever, all the client classes need to know about it.
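
A small sketch of that ripple effect, assuming Java. The class names, increment() and ifOver() are hypothetical and introduced only for illustration; count and getCount() are the names from the paragraph above, and the tell-style alternative is one possible shape rather than the only one.

```java
import java.math.BigInteger;

// With a getter, the return type publishes the field's representation.
final class CounterWithGetter {
    private int count;
    void increment() { count++; }
    public int getCount() { return count; }
}

// A client that asks and then decides. If count (and so getCount()) becomes
// a BigInteger, this comparison no longer compiles and must be rewritten.
final class AskingClient {
    void warnIfOver(CounterWithGetter counter, int limit, Runnable warning) {
        if (counter.getCount() > limit) {
            warning.run();
        }
    }
}

// Without a getter, the client tells the counter what should happen.
// The switch from int to BigInteger stays inside this class.
final class CounterWithoutGetter {
    private BigInteger count = BigInteger.ZERO;

    void increment() { count = count.add(BigInteger.ONE); }

    void ifOver(long limit, Runnable action) {
        if (count.compareTo(BigInteger.valueOf(limit)) > 0) {
            action.run();
        }
    }
}
```

In the second version the design decision (how the count is represented) is hidden in Parnas's sense, not merely wrapped.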

I hope you can see that if we didn't have any getters on our objects then this whole story unwinds and a nasty set of design problems evaporates before our eyes.


What was the point of all that, again?

John's simple sounding request: write code with no getters (and thoroughly test it, quite important that bit) is a miraculously clever one. He is a clever guy, but even so that's good.

Eliminating getters leads developers down a route that we have known is beneficial for thirty years, without really paying much attention. And the idea has become embedded in the first really new development technique to come along for a long time: TDD. What we need to do now is make sure that mocking doesn't get blurred in the same way as a lot of these other ideas have been.

First step: stop talking about mocking out infrastructure, start talking about mocks as a design tool.

53 comments:

James Justin Harrell said...

Seems like removing getters in that way would lead to an enormous amount of duplication and to really enormous interfaces. Every class would need a method for everything you wanted to do with every property.

When (not if) count gets renamed, changed to a big integer class, or whatever, all the client classes need to know about it.

But in the example you provided, this would not help any.
account.appendTransactionTo(statement)
Renaming "transaction" to "transfer" would require changing every call from "appendTransactionTo" to "appendTransferTo" just as much as changing "count" to "amount" would require changing "getCount" to "getAmount". And changing the type of the transaction would require the arguments to the method be changed.

You've ended up with tons of duplication and enormous interfaces without actually improving anything. Sounds really awful.

jbullock said...

I don't recall the exact link or reference right now, but there's a design piece from some years back titled "Getters and Setters Considered Harmful." Personally, and as you know Keith, I slid into management (bog help me) a while back so I'm a bit out of the conversation here, I never got the notion of getters and setters. If they correspond to internal elements, having them reveals design that should be concealed. (And BTW, what is "encapsulation" that doesn't hide implementation details like these?)

I'd like to hear something about operations that return or influence state without being tied to internal implementation. Things like "reset yourself" or similar. In principle state should be invisible to clients, and even side effects are suspect. Yet, without them, a system has no state.

BTW there's a nice bit on coupling of object systems in Page-Jones' book on object design with UML.

Wolter said...

If the design of a class involves a member private int count then encapsulating that behind a getter public int getCount() hides nothing.

But how is anyone to know that it's simply passing count if count is private? You're not hiding WHAT it provides, but HOW it provides it. With getCount(), you could do some internal calculation, or you could make it fetch from a web service somewhere, and no other part of the program needs to change or know.
This is the WHOLE reason for getters and setters (well, that and the fact that C++ and Java and others do not support class level properties, which is basically what we are emulating here).

As a previous commenter noted, changing the type or name of a public member will have the EXACT SAME effect as if you'd done it with a getter/setter paradigm.


Another problem:
Rather than writing something like statement.append(account.getTransactions()) instead we would write something more like account.appendTransactionTo(statement)

This is a horrible design decision! Now account needs to know what statements are and how to use them. What the hell does an account have to do with statements? It's the other way around! An account should know NOTHING about a report! Thus, you use statement.append(account.getTransactions()), or statement.append(account.transactions) if you want to avoid getters.


just a pile of values that all modules/processes/what have you can go and monkey with. Oops! Isn't that what a transfer object that comes straight out of a (single, system-wide) database ends up being? This is not looking so good now.

What a load of crap. The only time a lot of modules could monkey with a database is if you designed the system like a moron. DAOs are the gatekeepers to the data source, be it a database, a file, something across a network, even a COBOL call. You might as well rail against the computer's filesystem, because a file is something "all modules/processes/what have you can go and monkey with".
You partition off areas specifically so that other parts of the program don't screw around with it. You write DAOs so that you can call getUsersWithUnpaidBills() instead of going directly in with "select name, balance from users, settlement where balance > 0", which the rest of your program has no business knowing about. You further safeguard these DAOs by keeping them behind a business layer that ensures access control and sanity checking.
Above all else, your data objects know NOTHING about how they are being used. It's "need to know", see?
And if you want to avoid globals or the overused singleton pattern, go with an IOC framework and save yourself a lot of hassle.

Greg Jorgensen said...

Keith, thanks for linking to your article in the comments to my Doing it wrong: getters and setters. I'm working on another "Doing It Wrong" article about ORMs and the "Active Record" pattern and how they distort both OOP and RDBMSs, and it looks like you've covered some of the same ground.

keithb said...

@Jim. Several other people have mentioned that same article to me, and none of them can remember where it is, either.

What you say about state is interesting. Side effects are deeply suspect (and for what else does a setter exist?), although I would say that without them a system has no mutable state, an important distinction. There's plenty of prior art on building systems without (or, with very, very well hidden) mutable state. It turns out that pushing hard on the TDD with mocks route leads to a de-emphasis on member variables.

They are hard to test with, so we tend only to put them in when we really need to. Which turns out to be a lot less often than people think.

keithb said...

@James Justin: in practice when code is written this way the interfaces become very small, because only the methods actually required for the objects' interactions are exposed.

That's what saves us from what would otherwise be the problem you mentioned. In designing with mocks the case I show would probably end up with a very small interface (let's say: TransactionReporter) carrying appendTransactionTo(TransactionHistory h) and another very small interface (TransactionHistory) carrying append(Transaction t).

In that case, renaming transaction to transfer affects exactly and only the collaborators of TransactionReporter, whereas if we use an Account that exposes lots of properties, when any one of them changes all collaborators are affected, whether they have any interest in that property or not.
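
A sketch of the two small interfaces described in this comment, assuming Java. TransactionReporter, TransactionHistory, appendTransactionTo and append are the names given above; the Transaction class, record() and the bodies are hypothetical filler.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical value type standing in for whatever a transaction really is.
final class Transaction {
    private final String description;
    Transaction(String description) { this.description = description; }
    @Override public String toString() { return description; }
}

// One method: something that can have a transaction appended to it.
interface TransactionHistory {
    void append(Transaction t);
}

// One method: something that can report its transactions to a history.
interface TransactionReporter {
    void appendTransactionTo(TransactionHistory h);
}

// Account plays the TransactionReporter role. Whatever else it holds stays private,
// so collaborators depend only on the one method they actually use.
final class Account implements TransactionReporter {
    private final List<Transaction> transactions = new ArrayList<>();

    // Hypothetical command for getting transactions into the account.
    void record(Transaction t) { transactions.add(t); }

    @Override public void appendTransactionTo(TransactionHistory h) {
        for (Transaction t : transactions) {
            h.append(t);
        }
    }
}

// Statement plays the TransactionHistory role and knows nothing about Account.
final class Statement implements TransactionHistory {
    private final StringBuilder text = new StringBuilder();

    @Override public void append(Transaction t) { text.append(t).append('\n'); }

    @Override public String toString() { return text.toString(); }
}
```

Neither Account nor Statement names the other; each depends only on the role interface it collaborates with.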

keithb said...

@ Wolter: I'd want to have a more constructive approach than dismissing someone's actions as moronic.

My thesis is exactly that the current habit of building systems around state exposed in transfer objects tightly bound to database tables without really considering any alternative does result in poor design much of the time.

Michael Schuerig said...

Keith, I think the article(s) you're looking for are actually multiple articles by Allen Holub.

http://www.javaworld.com/javaworld/jw-09-2003/jw-0905-toolbox.html

http://www.javaworld.com/jw-07-1999/jw-07-toolbox.html
http://www.javaworld.com/jw-09-1999/jw-09-toolbox.html
http://www.javaworld.com/jw-10-1999/jw-10-toolbox.html

dibblego said...

You kids need to let go of your addiction to impure functions, their consequences and all your layers of hacks on hacks. All your arguments (including those you protest against) become completely meaningless when the premise is corrected. Stop taking painkillers and remove the tumour already.

Mister Bean said...

While Wolter's tone is not at the level most people would like, I think his points are well taken. I would hope you'd answer them.

His comment re: "account.appendTransactionTo(statement)" is on point. You've reversed the relationship between report and transaction, so that now they're completely mingled in order to adhere to a no-getter rule. Doesn't seem like an improvement, IMHO.

keithb said...

@dibblego: what you say is true, and yet is unhelpful because for too many programmers in too many programming shops there is no way to get there from here.

Lots of people are writing mutation-infested systems all over the world right now. Telling them to switch to pure functional approaches just won't help. It would help them if they did, but they can't, so it doesn't help to tell them that.

keithb said...

@Mr Bean: I'm increasingly dismayed by the viciously rude and personally insulting tone of much supposedly technical discussion on the net. I don't care how insightful or not Wolter's points may be, life is too short to deal with ugly behaviour.

Since you raise the point in a more civil way, I'm happy to respond to you.

To say "account.appendTransactionTo(statement)" tells us a lot about a certain interaction between objects playing roles and not much about anything else.

In the TDD with mocks style we tend to find methods first, then choose what interface to put them on, and then find what classes should implement those interfaces, rather than starting with a static model of some data and bolting behavior onto it.

There is nothing in that code fragment to imply any very strong need for the account object mentioned in it to be an instance of the same class as the transfer objects that pass between us and a data source, let's say.

Wolter said...

Keith: My people skills suck, as you have noticed. However, my intent was not to say that your approach is moronic. What I'm saying is, you need to have a structure that protects sensitive resources from the morons who are going to be mucking about with the code.
If you leave it all in the open (like with globals), some idiot is going to touch it. If you have clearly defined access paths, you'll stop all but the most malicious coders.

Unless you have a structure that guides people away from pulling out the main support pegs, you're going to be in a world of hurt during the maintenance schedule.

Wolter said...

One more thing if I haven't pissed everyone off enough already:

To say "account.appendTransactionTo(statement)" tells us a lot about a certain interaction between objects playing roles and not much about anything else.

Why not use statement.appendTransaction(account)?

Using account.appendTransactionTo(statement) is once again forcing account to know about statement, when it should only be handling account data, not reporting.

keithb said...

@Wolter: statement.appendTransactionFrom(account) would do equally well, so far as avoiding getters goes.

And in that case, statements know about accounts, when one could argue that really they should only know about reporting transactions.

Which is preferable? There simply can't be one right answer to that which we can figure out without knowing a lot more about statements, accounts and transactions in the actual context being implemented.

You claim that it's wrong for account to know about statement, that it should only be handling account data, not reporting. I submit that without a load more information about the domain and the use cases in question you simply have no basis upon which to make that claim.

PS: I agree that sensitive resources should be protected. That's one more reason for not exposing them to all and sundry through setters and getters. It's difficult to tell through all the abusive language what you think you're arguing against but I don't think that it's the point I think that I'm making.

Wolter said...

And in that case, statements know about accounts, when one could argue that really they should only know about reporting transactions.
Which is preferable? There simply can't be one right answer to that which we can figure out without knowing a lot more about statements, accounts and transactions in the actual context being implemented.


As you said, it depends on what exactly these objects are. However, we are obviously talking about a financial system (the industry in which I work), and in the common situation, the account deals with money in an account, and statements deal with reporting about things such as balances, transactions, etc. This means that there IS one right answer.

Actually, in the real world, you're probably not going to have account.getTransactions() anyway; you'd have transactionDAO.getTransactionsInvolving(account), because there could be a LOT of transactions that you don't want hanging around in memory (and making account able to look up stuff automatically in a database when requested is dangerous unless everyone using the system REALLY knows how it works and has a LOT of discipline).

But getting back to the point, a report by its very nature needs to know a lot of stuff about a lot of things. An account, on the other hand, could never justify knowing about a report.
And if you absolutely wanted the report to be agnostic, you'd use some kind of helper or DAO (such as I mentioned earlier) to get the transactions and then provide them to the report. Actually, if you wanted fully agnostic reports, you'd only be passing in a context object containing all the pertinent information, not a list of transactions.

PS. the abusive tone is not intentional. I have Asperger's syndrome, and everything I say tends to come out wrong.

rwallace said...

This is a topic I've been coming back to repeatedly the past several years without ever being fully satisfied. In general, I don't like using setters and getters either. But I can never quite come up with an alternative that satisfies all my application's needs. Let's take a simple example, like representing a Contact in an AddressBook.

In the AddressBook, we can avoid exposing a getContacts() method that returns a collection of Contacts by putting add(Contact), remove(Contact), find(ContactSpecification) and some kind of forEach(ContactHandler) method, where ContactHandler has a method like handle(Contact). So that takes care of the AddressBook and we haven't exposed any getters. The only thing that returns something is the find method, and even that we can change to a forEach(ContactSpecification, ContactHandler) if we wanted to.
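
The AddressBook described above might be sketched like this in Java. The method names add, remove, forEach and handle come from the comment; isSatisfiedBy and the Contact body are assumptions. How a specification is supposed to inspect a Contact without getters is left open here, since that is exactly the question the comment goes on to ask.

```java
import java.util.ArrayList;
import java.util.List;

// Deliberately opaque: what a Contact exposes is the open question.
final class Contact {
    private final String name;
    Contact(String name) { this.name = name; }
    @Override public String toString() { return name; }
}

interface ContactHandler {
    void handle(Contact contact);
}

// Hypothetical shape for the specification; the comment does not define it.
interface ContactSpecification {
    boolean isSatisfiedBy(Contact contact);
}

final class AddressBook {
    private final List<Contact> contacts = new ArrayList<>();

    void add(Contact contact) { contacts.add(contact); }

    void remove(Contact contact) { contacts.remove(contact); }

    // No getContacts(): callers hand in behaviour instead of pulling out the list.
    void forEach(ContactHandler handler) {
        // Iterate over a copy so a handler may add or remove contacts safely.
        for (Contact contact : new ArrayList<>(contacts)) {
            handler.handle(contact);
        }
    }

    // The find(ContactSpecification) variant, reworked so that nothing is returned.
    void forEach(ContactSpecification specification, ContactHandler handler) {
        forEach(contact -> {
            if (specification.isSatisfiedBy(contact)) {
                handler.handle(contact);
            }
        });
    }
}
```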

But then we come to the actual Contact class. It has properties like name, email address, postal address, etc. How in the world are we supposed to expose these properties to allow them to be displayed and modified without getters and setters?

Wolter said...

Oh, another reason not to have account.getTransactions() or account.transactions is security.

Once the account object is passed out of the business layer, there can be no further control over what transactions can be pulled out of that account object, unless you tie the lookups back to the business layer, which will create ugly coupling.

Wolter said...

But then we come to the actual Contact class. It has properties like name, email address, postal address, etc. How in the world are we supposed to expose these properties to allow them to be displayed and modified without getters and setters?

This actually is a situation where you can simply expose the members publicly, PROVIDED you are 100% sure that they'll ONLY ever be just data storage, and not lookups.

If you were using a language that supported object level properties, there'd be no need for getters/setters (or this conversation) since they only exist to compensate for a deficiency in the language.

Colin Jack said...

On removing all getters (Holub style), I respect the approach but often I'm choosing one design over another based on different considerations. For example on your first example, do I really want the Account object involved in appending transactions to a statement...the coupling might bother me and I'm not sure if Account ends up being cohesive (so I'd perhaps be trading off SRP).

I'm not arguing for the controller style, but I do find getters quite useful despite that, not least to preserve my layering and to allow me to choose where I don't want to put a responsibility for X with the class that has that data (as with the statement/account case).

I guess my favorite article on this sort of discussion is (surprise surprise) by Martin Fowler http://martinfowler.com/bliki/GetterEradicator.html.

I'm also interested in how you personally handle CRUD and display with objects without getters/setters. I know it's possible but I'm wondering which approach you use?


"You claim that it's wrong for account to know about statement, that it should only be handling account data, not reporting. I submit that without a load more information about the domain and the use cases in question you simply have no basis upon which to make that claim."

Good point, as you say it's not just about the aspects of design (coupling/cohesion) but also about careful domain analysis.


"First step: stop talking about mocking out infrastructure, start talking about mocks as a design tool."

Couldn't agree more. I'm not mock mad but I think you guys have interesting views but they are being lost in translation. Truth is that if an article talks about mocking right now then it'll discuss mocking for convenience, as in "I can't rely on this e-mail server so I extract an interface and mock it". Arguments about the design of the interface, how you come up with it (upfront) and the effects on the design of the system are hardly ever voiced.

I've tried the mocking approach and I found it somewhat satisfying but I ran into a lot of issues. I'm sure you guys have addressed these issues so reading more about how you use your techniques in real situations would be useful.

garethm said...

To remove the accessors for the Contact object, you have a couple of options:

1. foreach(ContactAttributeHandler) where ContactAttributeHandler has a method handle(attributeName, attributeValue).

2. Change your ContactHandler to have a method handle(name, email, address)

The only way of avoiding mutators that I can think of is to perhaps use a replace function of some kind - AddressBook.replace(Contact old, Contact new). You pass the contact you got through foreach() with a newly constructed Contact instance that represents the new data for that contact.
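
A sketch of option 2 together with the replace idea, assuming Java. handle(name, email, address) and AddressBook.replace come from this comment; describeTo and the rest of the scaffolding are hypothetical names added for illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Option 2: the handler is handed the values; nobody asks the Contact for them.
interface ContactHandler {
    void handle(String name, String email, String address);
}

// Immutable, with no accessors and no mutators.
final class Contact {
    private final String name;
    private final String email;
    private final String address;

    Contact(String name, String email, String address) {
        this.name = name;
        this.email = email;
        this.address = address;
    }

    // Hypothetical name: the contact pushes its own values to the handler.
    void describeTo(ContactHandler handler) {
        handler.handle(name, email, address);
    }
}

final class AddressBook {
    private final List<Contact> contacts = new ArrayList<>();

    void add(Contact contact) { contacts.add(contact); }

    // Instead of setters: swap the old contact for a newly constructed one.
    void replace(Contact oldContact, Contact newContact) {
        int index = contacts.indexOf(oldContact);
        if (index >= 0) {
            contacts.set(index, newContact);
        }
    }

    void forEach(ContactHandler handler) {
        for (Contact contact : contacts) {
            contact.describeTo(handler);
        }
    }
}
```

A display component would implement ContactHandler; an edit would build a new Contact from the edited values and call replace.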

Dan North said...

@James: Every class would need a method for everything you wanted to do with every property..

Not really. Every class would need a method for everything it was supposed to do, which in a well-factored application will not be much. If it uses lots of properties to do this, well that's its own secret. If that secret is let out, by means of getters, then suddenly it can be used for "everything you wanted to do with every property", which as you argue would be bad, right?

@Wolter: the Account doesn't need to know about a Statement. The method could be declared as Account.appendTransactionTo(TransactionReceiver) and Statement would implement the interface TransactionReceiver. In OO terms this means it could take the role of a transaction receiver. This decouples the Account from knowing anything about the target of the method - just that it knows how to receive transactions.

Philip Schwarz said...

jbullock said

I don't recall the exact link or reference right now, but there's a design piece from some years back titled "Getters and Setters Considered Harmful."

Here it is:

Why getter and setter methods are evil

Wolter said...

dan: That will work just as well, so long as account is never exposed outside of the business layer. If an account object with such a method ever got out, it would provide an access point to sensitive data that likely should not be given carte blanche.

Another point that can screw this up is distributed systems.
Since the accounts system is likely to be useful to a number of programs, you'll eventually end up giving it a network interface such as web services or the like.
If you implement a pull interface [getTransactionsFor(account)], the implementation is simpler and less error prone than a push interface [appendTransactionTo(TransactionReceiver)].
Marshalling/unmarshalling, which is automated and so on the surface a non-issue, is actually very unsafe in the big bad world. All those morons writing moronic programs will inevitably come up with TransactionReceivers that contain unbelievably HUGE amounts of data, which the marshaller will dutifully encode and choke up your bandwidth. That's if they manage to come up with a concrete class that can be serialized/deserialized properly in the first place.
And you wouldn't DARE expose such an interface to a large audience. The potential for DOS attacks would skyrocket.
Passing a simple data object "account" (actually call it "accountData" or "accountInformation" instead), with all its publicly accessible properties, avoids these issues at the cost of using a pull interface to get at the account data.

The central OO tenets work fine within an ivory tower (where everyone is trusted), but in the real world you are often faced with a choice: purity or security.
The pragmatic choice is: OO where you can, ease-of-use where others will access, paranoia at all times.

Isaiah said...

I agree with what Dan is saying: the Account doesn't need to know anything about a Statement. Now since an Account holds all info about its transactions, IMO it should be the Account's responsibility to push this information out in messages (or methods) to an appropriate collaboration role (in this case something like TransactionReceiver) to do the specific processing.
In my opinion doing "statement.appendTransaction(account)" means the Statement will have to pull information out of the Account object, and use this info to do some processing. This violates the "Tell, don't ask" principle. Minimizing getters makes your objects more focused on behaviour and responsibilities, which interact with specific roles. I think this is what Keith is trying to get across and I totally agree with him.

Colin Jack said...

"Minimizing getters makes your objects more focused on behaviour and responsibilities, which interact with specific roles. I think this is what Keith is trying to get across and I totally agree with him."

We might well be happy for the Account to get involved in statement generation. However we might also want to send representations of the Account to external systems, we want to serialize them to disk in a custom manner, we want to save them to the DB, we want to display them in the GUI and allow updates.

We could get the Account heavily involved in each of these cases (using the technique you describe for the statement generation). However I'd probably aim to keep my Account class concentrated on the important domain/business logic rather than having it get involved in all sorts of supplementary work.

Wolter said...

"We could get the Account heavily involved in each of these cases (using the technique you describe for the statement generation). However I'd probably aim to keep my Account class concentrated on the important domain/business logic rather than having it get involved in all sorts of supplementary work."

Exactly. The whole point of encapsulation is the separation of concerns, and you'd be hard pressed to convince me that an account need be concerned about reporting, or most of the other multitude of operations that could conceivably involve an account.
The more things you have an object do or know about, the harder it is to maintain.

Another issue is implementation stability. Which is more likely to change: An account or a report?
If your account knows about the report, then a change to the report will likely require a change to account, as well as any other central objects that are used in the generation of that report (transactions, balances, etc).
The reverse is also true, but far less likely.

Colin Jack said...

"Another issue is implementation stability. Which is more likely to change: An account or a report?"

Yeah, that's also a key decision-making factor. If we want to swap to displaying a summary of the transactions or whatever, do we really want to be changing the Account?

I think this is what Martin Fowler is discussing at the end of his "GetterEradicator" post (in that keeping things that you believe will change together close together is a good idea).

Wolter said...

Agreed.
For the most part, this discussion comes down to knowing the rules vs knowing the game.
You can teach someone poker in 5 minutes, but if they simply follow the rules, they'll get creamed.

The "avoid getters and setters" rule is in general a good rule to a point, but you'd hardly call them "evil". You wouldn't put them in business objects, or the user interface, or even DAOs (excepting setters for IOC). You WOULD, however, put getters and setters in data transfer objects.

When people start quoting rules-of-thumb as gospel, or harp on about purity, they betray their lack of experience. Most builders know how to use a circular saw. The seasoned builder knows not to use it to cut into a wall near a power outlet. If you're not thinking consequences, you're not thinking right.

That's why it is wisely written: You gotta know when to hold 'em, know when to fold 'em, know when to walk away, know when to run.

Isaiah said...

"However I'd probably aim to keep my Account class concentrated on the important domain/business logic ...."
I agree, I would also keep Account concerned with key domain concepts. An Account has transactions, and the role of TransactionReceiver is meaningful to an Account, so Account.appendTransactionTo(TransactionReceiver) makes sense; IMO it seems right to give the Account this responsibility. Things like persisting an Account to DB, serialising it to some format, or displaying it on a GUI don't make sense in the domain of Account, so I wouldn't place all these responsibilities on the Account.

"If your account knows about the report, then a change to the report will likely require a change to account ...."
The Account doesn't know about reports or anything else; all it knows about is the role of a TransactionReceiver. With Account.appendTransactionTo(TransactionReceiver), it's simply sending its transaction information to this object. So any object can implement the TransactionReceiver interface and provide its own specific implementation of how the transaction information is processed, whether it should be displayed as a summary or whatever.
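A rough Java sketch of the role-based design argued for above. TransactionReceiver and the push-style method come from the discussion; the append signature, the record method and both receiver implementations (ListingStatement, SummaryStatement) are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

// The only thing Account knows about: a role, not a report or a statement.
interface TransactionReceiver {
    void append(String description, long amountInPence);
}

class Account {
    private final List<String> descriptions = new ArrayList<>();
    private final List<Long> amounts = new ArrayList<>();

    void record(String description, long amountInPence) {
        descriptions.add(description);
        amounts.add(amountInPence);
    }

    // Push the transactions out; nothing reaches in to pull them.
    void appendTransactionsTo(TransactionReceiver receiver) {
        for (int i = 0; i < descriptions.size(); i++) {
            receiver.append(descriptions.get(i), amounts.get(i));
        }
    }
}

// One receiver lists every transaction...
class ListingStatement implements TransactionReceiver {
    private final StringBuilder text = new StringBuilder();

    public void append(String description, long amountInPence) {
        text.append(description).append(": ").append(amountInPence).append('\n');
    }

    String text() { return text.toString(); }
}

// ...another only keeps a summary. Account is unchanged either way.
class SummaryStatement implements TransactionReceiver {
    private long total;
    private int count;

    public void append(String description, long amountInPence) {
        total += amountInPence;
        count++;
    }

    String text() { return count + " transactions, net " + total; }
}
```

Swapping the listing for the summary means writing a new receiver, not editing Account, which is the stability point raised earlier in the thread.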

"You wouldn't put them in business objects, or the user interface, or even DAOs (excepting setters for IOC). You WOULD, however, put getters and setters in data transfer objects." I agree, I don't think anyone is saying there should be no getters/setters at all; there are certainly cases where getters make a lot of sense even in a domain object. The whole point is that getters/setters in general are overused.

Wolter said...

"all it knows about is the role of a TransactionReceiver. With Account.appendTransactionTo(TransactionReceiver), it's simply sending its transaction information to this object. So any object can implement the TransactionReceiver interface and provide its own specific implementation"

Which is fine on paper, but once issues of distributed computing and security creep in, that solution won't look quite so elegant anymore.

Wolter said...

Actually, I should qualify that more:

If this scenario happens entirely inside the business layer, and the object is never exposed outside, then there's no problem.
Once you pass through the business membrane, however, the rules of engagement change drastically.

Colin Jack said...

@Isaiah
"Things like persisting an Account to DB, serialising it to some format, or displaying it on a GUI don't make sense in the domain of Account, so I wouldn't place all these responsibilities on the Account."

So from this I'm thinking that you have multiple models. For example you'd have the one with the complex business object, one to handle CRUD and so on?

Personally I find that I can often have one class supporting both CRUD/display (though not databinding) and domain/business logic but that does mean getters/setters on those domain classes.


@Wolter
"You wouldn't put them in business objects, or the user interface, or even DAOs (excepting setters for IOC). You WOULD, however, put getters and setters in data transfer objects."

This bit interested me. If you have a domain/business class that you wanted to do CRUD for, but you also have more complex logic in it, would you still avoid getters/setters? If so, how do you approach the situation: do you have multiple classes, or would you perhaps send the domain/business classes little DTOs (messages) explaining at a high level what changes you want them to make?

Wolter said...

"If you have a domain/business class that you wanted to do CRUD for, but you also have more complex logic in it, would you still avoid getters/setters?"

Hmm... It would depend on the situation, really.
If exposing the object were a security risk, I'd go with a DTO, or maybe even put a DTO facade on the object itself if I could maintain integrity (I'd think long and hard before trying something like that, though!)

If the security risks were low enough not to warrant that level of paranoia when crossing boundaries, and if there was little chance of that part going distributed, I'd consider other, more OO-like designs.

Technically, I shouldn't really be considering distributed systems at the early design phase, but it happens so often, and can be so expensive to refactor for, that I tend to err on the side of safety.

Philip Schwarz said...

Hi Keith,

great blog entry, full of meaty stuff; I agree with a lot of what you said.

I just wanted to clarify something about robustness analysis (RA). I am no expert on the subject but I have recently finished reading Use Case Driven Object Modeling with UML - Theory and Practice, which promotes the use of the ICONIX process, in which Robustness Analysis (RA) plays several essential roles.

In essence, the authors say that robustness analysis (which they call one of the industry's most useful and yet best kept secrets) is preliminary design, whereas detailed design (e.g. using sequence diagrams) is where you do responsibility allocation.

The best thing you can do is to read the book (unless of course you have already done that). I am just going to quote some of your statements and follow them with some excerpts from the book that seem to contradict your statements.

You said: A design found by robustness analysis is built out of "controllers", big old lumps of logic, and "entities", bags of named values. (And a few other bits and bobs) Can you see where this is going? There are rules for robustness analysis and one of them is that entities are not allowed to interact directly, but a controller may have many entities that it uses together.

Book says: preliminary design (RA) is all about discovery of classes (aka object discovery), ... detailed design is, by contrast, about allocating behaviour (aka behaviour allocation) - that is, allocating the software functions you've identified into the set of classes you discovered during preliminary design.

You said: It's seductively easy to bash in a controller for a use case and then bolt on a few passive entities that it can use without really considering the essence of the domain.

Book says: ...notice that there are no controller objects on the sequence diagram (although there could be). This is because when you draw the sequence diagrams, the controllers (the verbs) are turned into messages on the boundary and entity objects (the nouns). Sometimes you'll find real controller classes, such as a "manager", or a "dispatcher" class, and sometimes a framework (self: e.g. Spring MVC) might tempt you to litter your design with dozens of tiny "controller classes", but as a general rule of thumb, 80% or so of the controllers from the RA diagrams can be implemented as one or more operations on the entity and boundary classes.

You said: In RA we assume that objects either know stuff or do stuff.

Book says: Having performed RA, you should by now have identified at least three quarters of the attributes (the data) on your classes, but very few, if any, operations (the behaviour). ...we advocate a two-pass approach to design: The first pass (preliminary design) is driven by thinking about attributes while deliberately ignoring "who's doing what to whom". Then the second pass (self: detailed design i.e. drawing sequence diagrams) focuses all your attention on that exact question.

keithb said...

@Philip: Thanks for these pointers. I'm not familiar with ICONIX, nor have I read that book.

Pondering these quotes has led me to a new realization of how much my ideas about discovering objects have changed over the years. What's described here looks as if it could well be a misinterpretation of what's in the Syntropy process, which I used extensively in my first programming job. It's enlightening to see more clearly how far and where I've come. Thanks.

What's described here is rather different from what I see in the wild, so I've amended the posting to suit.

Meanwhile, some thoughts on the quotes themselves.

"preliminary design (RA) is all about discovery of classes (aka object discovery), ... detailed design is, by contrast, about allocating behaviour" — firstly, allocating behaviour is not a matter of "detail"

"[...] that is, allocating the software functions you've identified into the set of classes you discovered during preliminary design" — but why would I introduce a class other than to support a "software function"?

"[...] there are no controller objects on the sequence diagram [...] because when you draw the sequence diagrams, the controllers (the verbs) are turned into messages on the boundary and entity objects (the nouns)." — I really don't like this noun/verb thing. To start with, it's a bogus analogy: This sentence no verb. And further more, verbing weirds nouns. That said, it's good that they would recommend getting rid of the controllers. Why not not have them in the first place?

"sometimes a framework might tempt you to litter your design with dozens of tiny "controller classes"" — damn straight they sometimes. Again, good advice to avoid this.

" (preliminary design) is driven by thinking about attributes while deliberately ignoring "who's doing what to whom". Then the second pass focuses all your attention on that exact question" — Well, in Syntropy we seek to delay describing a solution in terms of explicit message sends until as late as possible, so as to avoid premature allocation of responsibilities.

But that doesn't mean a first cut that ignores behaviour. Instead, in Syntropy behaviour is first captured in terms of objects and (broadcast, instantaneous, asynchronous) events. A very different notion.

Winterstream said...

This is an interesting thought. It almost verges more towards an actor-based approach. When I write process-oriented code, my messages end up telling the receiver to do something and I can't think of times when I ask explicitly about a process's inner state (the equivalent of getters and setters).

I suspect (and I would like to know whether you agree) that writing code in this fashion also eases the drawing of error handling boundaries around various modules. I'm asking, because at least to me, process oriented programming makes this easy (and I see getter-free style programming as similar).

keithb said...

@winterstream: I think I do agree. After all, objects want to be actors when they grow up.

I don't think that this is a coincidence, either. There was a heavy Smalltalk influence at Connextra, although they worked in Java, so a message-passing model was in their minds.

As error handling goes, yes I think that this approach does help because we spend more time with state on the stack (so we are also moving towards a functional approach).

Philip Schwarz said...

"It's seductively easy to bash in a controller for a use case and then bolt on a few passive entities that it can use without really considering the essence of the domain. What you end up with is the moral equivalent of stored procedures and tables."

That's right, if you use the Transaction Script pattern, you end up with an Anemic Domain Model.

"That's not necessarily wrong, and it's not even necessarily bad depending on the circumstances."

That's right, according to the Design Stamina Hypothesis, it is sometimes not worth investing in a rich Domain Model.

Philip Schwarz said...

In Holub on Patterns (great book), the author (also the author of Why getter and setter methods are evil) talks a lot about getters and setters. Here is how he describes the basic issues around setters and getters:

* The maintainability of a program is inversely proportional to the amount of data that flows between objects.

* Exposing implementation harms maintainability. Make sure that the accessor or mutator really is required before you add it.

* Classes that directly model the system at the domain level, sometimes called business objects, hardly ever need accessors or mutators. You can think of the program as partitioned broadly into generic libraries that have to relax the no-getter/no-setter rule and domain-specific classes that should fully encapsulate their implementation. Getters and setters at this level are an indication that you didn't do enough up-front design work. In particular, you probably didn't do enough dynamic modeling.

* By keeping the design process in the problem ("business") domain as long as possible, you tend to design messaging systems that don't use getters and setters because statements such as "Get this" or "Set that" don't come up in the problem domain.

* The closer you get to the procedural boundary of an OO system (the database interface, the UI-construction classes, and so on), the harder it is to hide implementation. The judicious use of accessors and mutators has a place in the boundary layer.

* Completely generic libraries and classes also can't hide implementation completely so will always have accessors and mutators.

* Sometimes it's not worth the trouble to fully encapsulate the implementation. Think of trivial classes such as Point and Dimension. ...

BTW, by dynamic modeling he means the kind of modeling that you do when you act out use cases with Class Responsibility Collaboration (CRC) cards.

MoffDub said...

Nice post. Your post has been linked to in my post about the same topic: http://moffdub.wordpress.com/2008/06/16/the-getter-setter-debate/

Colin Jack said...

@Philip Schwarz
Although I've started reading Holub's book a couple of times I never stuck with it so it was good to read a summary, makes me want to follow through with it at some stage.

Anyway you say:

"you tend to design messaging systems that don't use getters and setters because statements such as "Get this" or "Set that" don't come up in the problem domain"

This is kinda what I was expecting and it makes a lot of sense, so instead of setting Customer.Name to be a new Name you pass in a message describing the name change and allow the Customer to process that as it sees fit. Seems sensible but you'll presumably end up with a lot of message classes (I guess I could read the book to find out :)).
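A small Java sketch of the reading given above, where a name change is passed to the Customer as a message object instead of going through a setter. All of the names (NameChange, handle, writeNameTo) and the blank-name rule are hypothetical.

```java
// Hypothetical message object describing the requested change.
final class NameChange {
    final String newName;

    NameChange(String newName) {
        this.newName = newName;
    }
}

class Customer {
    private String name;

    Customer(String name) {
        this.name = name;
    }

    // The Customer processes the request "as it sees fit":
    // here, an assumed rule that blank names are silently ignored.
    void handle(NameChange change) {
        String candidate = change.newName == null ? "" : change.newName.trim();
        if (!candidate.isEmpty()) {
            name = candidate;
        }
    }

    // Tell-style output in place of a getName() accessor.
    void writeNameTo(StringBuilder out) {
        out.append(name);
    }
}
```

As the comment anticipates, this style needs one small class per kind of change, which is where the "lot of message classes" concern comes from.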

"The closer you get to the procedural boundary of an OO system (the database interface, the UI-construction classes, and so on), the harder it is to hide implementation. The judicious use of accessors and mutators has a place in the boundary layer."

OK, so let's take a Customer domain class: I need to display it suitably in a GUI. If I use the Customer class then it will have getters (and presumably setters) and Holub advises against that for domain classes (where possible). So would you instead use a different class and leave the Customer domain class to do just the domain logic (no CRUD or display maybe)?

keithb said...

@Colin: It does seem as if it's awkward to do CRUD operations on domain classes, doesn't it? One theoretical answer is indeed to have the domain object push a data transfer object through an interaction domain to the GUI. Seems a bit clunky. We might also expect an MVP style of thing, with interactors coming inbound which record the user's intention rather than brute-force updates to the model.

The subtlety here (I think) is that in a world free of getters and setters we have to be more imaginative than merely providing the user with a long and convoluted path through a bunch of objects only to be running SELECTs and UPDATEs on an RDBMS. We really want to present rich objects that the user can interact with to augment their work.

Colin Jack said...

@keithb
I may be going into too much detail; if so I apologize, but it is an interesting topic...

Oh and just to be clear upfront, although I use getters/setters on my domain classes we don't do direct databinding so we do have a lot of control. For example we can use patterns such as using value objects (in the DDD sense) which does help a lot.

Anyway, can I add another option: would you even involve the domain classes in the CRUD work, or would you just have two separate models, one customized for CRUD and the other for the more complex behavior (assuming your app doesn't just do CRUD and validation)? This splits your model in ways that could be unnatural, and if you are totally against getters/setters then I guess it's not an option, but I know some people do seem to go down this road.

On the higher level messaging approach, another advantage would be that the messages can be sent outside the system, allowing you to get into the advanced design approaches that others use. So after a Customer object has processed a message relating to an address change, that message (or a transformation of it) is sent out to other interested parties. Greg Young (and others) are driving forward this sort of thinking within the ALT.NET community and you might find his blog entries interesting (see DDDD posts):

http://codebetter.com/blogs/gregyoung/

Anyway all very interesting stuff and looking forward to reading more about it and how you personally use these practices in real systems.

keithb said...

@Colin: The issue for me is that CRUD is a solution domain concept.

I want the business problem expressed in the business objects in the model of my application. It may be that some of these objects are long-lived, and it may be that therefore some data from them might be persisted to a database. Why should this be of concern to the user? And yet the idea permeates so many systems, all the way up to the UI.

The error seems to creep in even at the requirements level. I see too many users and analysts who've been trained by their IT folks into thinking in terms of data on forms as not merely one presentation aspect of one solution option, but as the only way of thinking about their problem. We, as an industry, have really screwed up on this one.

If we approach problems with the idea that the solution will have at the front end data in forms (maybe even with a "save" button!) and at the back end data in tables, it's not too hard to see why the bit in the middle ends up the way it so often does.

So no, I wouldn't have a business model and a CRUD model. I'd have a business model and treat the CRUDdiness as an implementation detail best well hidden from the user. And from most of the implementors.

Colin Jack said...

@keithb
I'm guessing you'll be blogging more about this so I'm definitely looking forward to reading about the specifics of the approach you guys use.

Philip Schwarz said...

@Colin Jack

You said:

"Although I've started reading Holub's book a couple of times I never stuck with it so it was good to read a summary, makes me want to follow through with it at some stage."

I must stress that it is not my summary; I reproduced it verbatim from the book (p34). I found it very useful because Holub's style (in my opinion) means that some ideas are repeatedly described, but sometimes only partially, over many pages. This means that when you look for something in his book, you sometimes have to re-read large-ish sections of it: you can't zero in on where he said this or that.

You said:

"instead of setting Customer.Name to be a new Name you pass in a message describing the name change and allow the Customer to process that as it sees fit. Seems sensible but you'll presumably end up with a lot of message classes"

No, when Holub says messaging systems he is using 'message' as the fundamental way of getting things done in an OO system. As Meilir Page-Jones says in Fundamentals Of OO Design in UML:

A message is the vehicle by which a sender object O1 conveys to a target object O2 a demand for object O2 to apply one of its methods.

And as Rebecca Wirfs-Brock says in Designing Object-Oriented Software:

A message consists of the name of an operation and any required arguments. When one object sends a message to another object, the sender is requesting that the receiver of the message perform the named operation and (possibly) return some information.
...When a receiver receives the message, it performs the requested operation in any manner it knows... by executing a method.

E.g. you can send the print message to a document and it will execute its print method.

I rarely hear developers talking about sending messages in their programs. At least not Java programmers. Smalltalk programmers probably actually do.

E.g. consider the following program: Document d = getNextDocument(); d.print(); where Document is an abstract class with many concrete subclasses or an interface with many implementations.

Instead of saying that the program sends, to the object referenced by d, a message asking it to execute its no-args print method, we just say the program calls the document's print method.

The reason why some people still talk in terms of messages is that they want to stress the important difference between calling a routine, which is what procedural programs do, and which will simply result in the execution of a specific method that is known at compile-time, and making a polymorphic call, which is what OO programs do, and which will result in the execution of a yet-unspecified method that can only be determined at run-time.
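The distinction can be sketched in Java around the snippet quoted above. Document, getNextDocument and print appear in the comment; the Invoice and Letter classes, the DocumentSource holder and the String return value (used only so the result can be checked) are assumptions.

```java
import java.util.ArrayDeque;
import java.util.Queue;

// The "message" is print(); which method answers it depends on the receiver.
interface Document {
    String print();
}

class Invoice implements Document {
    public String print() { return "invoice printed"; }
}

class Letter implements Document {
    public String print() { return "letter printed"; }
}

// Hypothetical source of documents; callers only ever see the Document type.
class DocumentSource {
    private final Queue<Document> pending = new ArrayDeque<>();

    void add(Document document) { pending.add(document); }

    Document getNextDocument() { return pending.remove(); }
}
```

At the call site, `Document d = source.getNextDocument(); d.print();` names only the message. Whether Invoice.print or Letter.print runs is decided at run-time by the object that d happens to reference.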

When I saw Kent Beck stressing this distinction in his Smalltalk Best Practice Patterns, I thought it might partly be due to the fact that the concept of message is key in Smalltalk, but he still makes this distinction in Implementation Patterns, which is aimed at Java developers.

You said:

"If I use the Customer class then it will have getters (and presumably setters) and Holub advises against that for domain classes (where possible). So would you instead use a different class and leave the Customer domain class to do just the domain logic (no CRUD or display maybe)?"

I have not yet seen domain objects that don't provide getters for fields that need to be displayed in a UI. Holub seems to be suggesting that the way we get away with domain objects without getters is either by getting the domain objects to display themselves (in simple cases), or by using the Builder pattern to separate a business object from implementation-specific details such as how to display the business object on the screen. I am afraid I'll have to refer you to his book (p212) for more details, because I have yet to read that section. Basically, instead of ASKING the BO for its details so you can display them, you TELL (don't ask, tell - The Law of Demeter) it to build you a representation of itself that you can then pass to the UI.
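One way the "tell it to build a representation" idea might look in Java. This is a guess at the general shape of the Builder approach described above, not code from Holub's book; CustomerExporter, exportTo and HtmlCustomerExporter are hypothetical names.

```java
// Builder role: the business object tells it what to add, piece by piece.
interface CustomerExporter {
    void addName(String name);
    void addCity(String city);
}

class Customer {
    private final String name;
    private final String city;

    Customer(String name, String city) {
        this.name = name;
        this.city = city;
    }

    // No getters: the Customer drives the export and keeps its fields private.
    void exportTo(CustomerExporter exporter) {
        exporter.addName(name);
        exporter.addCity(city);
    }
}

// One concrete builder; display details live here, not in Customer.
class HtmlCustomerExporter implements CustomerExporter {
    private final StringBuilder html = new StringBuilder();

    public void addName(String name) {
        html.append("<h1>").append(name).append("</h1>");
    }

    public void addCity(String city) {
        html.append("<p>").append(city).append("</p>");
    }

    // The finished representation is what gets handed to the UI.
    String result() {
        return html.toString();
    }
}
```

A different UI would supply a different exporter. The accessor has not vanished altogether: it has moved onto the built representation, at the boundary, which fits the point in the quoted summary that accessors have a place in the boundary layer.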

Philip Schwarz said...

@Keith

Use Case Driven Object Modeling with UML - Theory and Practice says: "(preliminary design) is driven by thinking about attributes while deliberately ignoring "who's doing what to whom". Then the second pass focuses all your attention on that exact question"


You replied:
Well, in Syntropy we seek to delay describing a solution in terms of explicit message sends until as late as possible, so as to avoid premature allocation of responsibilities. But that doesn't mean a first cut that ignores behaviour. Instead, in Syntropy behaviour is first captured in terms of objects and (broadcast, instantaneous, asynchronous) events. A very different notion.

It seems that you too share my concern for the two pass approach advocated by the book:
We advocate a two-pass approach to design: The first pass (preliminary design) is driven by thinking about attributes while deliberately ignoring "who's doing what to whom". Then the second pass (detailed design) focuses all your attention on that exact question.

This approach is criticized by David West in his excellent book Object Thinking. On page 155 he groups the plethora of object development methods advanced in the nineties into three general categories: data-driven (e.g. OMT), software-engineering, and behavioural (e.g. CRC).

Of the data-driven group, he says (p124): In a data-driven approach, the attributes of an object are discovered first, and then the responsibilities are meted out as a function of which object holds which data. A behavioural approach mandates the assignment of responsibilities first. Only when you are satisfied with the distribution of responsibilities among your objects are you ready to make a decision about what they need to know to fulfil their responsibilities and which parts of that knowledge they need to keep as part of their structure - in instance variables or attributes. This is the biggest difference in the definition between data-driven and behaviour-driven, or responsibility-driven approaches to objects.

Of Data-driven methods he also says (p156): data-driven methods do not bring about the object paradigm shift...followers of this type of method "think like data", or "think like a relational database"...[the data-driven approach] isn't consistent with decomposition of the world in a natural way because the world isn't composed of computationally efficient data structures. There are no "natural joins" in the domain that map to normalized entities.
...
Data-driven methods tend to create objects with more frequent and tighter coupling than do other object methods.


Keith: how does your following statement relate to these ideas: in Syntropy we seek to delay describing a solution in terms of explicit message sends until as late as possible, so as to avoid premature allocation of responsibilities

keithb said...

@Philip
I'm not sure what sort of answer you're looking for.

Syntropy (presumably missing from West's survey?) puts equal emphasis on data and behaviour. And very different emphases on other things.

It's actually a three-pass model (which should be iterated, of course): one pass to understand (someone's view of a situation in) the problem domain, one to specify a system/components to deal with that, and one to design an implementation of a system that meets that specification to deal with that situation.

All three passes produce models that capture both data and behaviour together. Data in class diagrams, behaviour in statecharts (and other things).

A specification model is the closest to what folks seem to do with UML these days: objects send and respond to messages. As I understand it Cook and Daniels decided that these semantics were inappropriate for specification. In those kinds of model behaviour is captured using events, which objects respond to, maybe by causing other events to occur. And this was viewed as inappropriate for modelling the world, in essential/domain/business models objects change state when events occur, but do not themselves cause events to occur (almost a monadic view—Leibnitz monads, not Haskell ones).

In this way models speak throughout of what objects know, and how they respond, but only when appropriate do we say exactly how they work internally to achieve that. In particular, in an essential model objects may not generate new events. This is what I meant by delaying allocation of responsibilities.

As you might imagine, candidates who come to interview with me and who have asserted an expertise in "object modelling" tend to have an interesting time.

Colin Jack said...

Please note there is a related discussion in a wiki set up to discuss aspects of the Entity Framework functionality that MS are adding in the .NET space:

http://entities.pbwiki.com/Getters+and+Setters+Anti-Pattern

You need to sign up to comment but you might be interested in having your say.

Philip Schwarz said...

@Keith

Thanks for that...interesting to learn more about Syntropy and how it relates to data and behaviour.

Anonymous said...

You might want to check with Jerry Weinberg about TDD being "the first really new development technique to come along for a long time...".

Seems it was there at the start and was then lost for a time, to be rediscovered recently.

Shards Henry said...

This is an interesting read, Keith. The comments are very insightful as well. As somebody who recently learned and used Objective-C quite a bit, which uses much of the Smalltalk lingo, I've started to appreciate message passing quite a bit.

It helps a lot when doing TDD. I find myself setting up mocks quickly since there's very little to set up. With getters (setters not so much), more often than not we would have to set up not just the mock containing the getter, but also the object being returned by the getter. Without knowing the details of how the object fetched by the getter is used by the class under test, it becomes really hard to write tests quickly.
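A hand-rolled Java sketch of why the tell-style test needs so little setup. It follows the post's example of a mocked statement expecting append calls; the Statement interface, the record method and the MockStatement class are hypothetical, and a mocking library would normally replace the hand-written mock.

```java
import java.util.ArrayList;
import java.util.List;

// Role played by whatever receives the transactions.
interface Statement {
    void append(String transaction);
}

class Account {
    private final List<String> transactions = new ArrayList<>();

    void record(String transaction) {
        transactions.add(transaction);
    }

    void appendTransactionsTo(Statement statement) {
        for (String transaction : transactions) {
            statement.append(transaction);
        }
    }
}

// Minimal mock: it only records expected and received calls.
// It never has to return an object that the test must also set up.
class MockStatement implements Statement {
    private final List<String> expected = new ArrayList<>();
    private final List<String> received = new ArrayList<>();

    MockStatement expectAppend(String transaction) {
        expected.add(transaction);
        return this;
    }

    public void append(String transaction) {
        received.add(transaction);
    }

    void verify() {
        if (!expected.equals(received)) {
            throw new AssertionError("expected " + expected + " but received " + received);
        }
    }
}
```

A test is then three steps: state the expected append calls, call account.appendTransactionsTo(mock), and call mock.verify(). With a getter-based design the mock would also have to hand back a realistic transactions object for the class under test to pick apart.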