Things learned while pairing

Why not unit test private methods? Does an object know if it lives or dies? Beats me.

I spent a Sunday in January at Corey Haines’s Code Retreat, out on the LeanDog boat. Aside from being too darned early in the morning, and aside from the fact that I had to bail at the 3pm break because my kid thinks she needs a ride to college, it was a fantastic use of time.

The event is like this: You show up, hang out, grab a pair partner, and start doing test-driven development (TDD) on a specific programming problem. After forty-five minutes, you stop programming, join the big group again for a debriefing, and do it again from the beginning with a new pair partner.

And you do this like six times during the day. (I only stuck around for four iterations.)

The problem space

The actual programming problem was to implement Conway’s Game of Life. It’s a devilishly clever assignment, partly because anyone who’s taken more than one Computer Science class thinks they’ve already solved it. But oh no.

If you don’t know about the Conway game, it’s not really a game you win or lose. You have an unlimited “board” of square cells arranged in rows and columns. Each cell is considered either “alive” or “dead.” On each “turn” every cell counts its eight immediate neighbors (including diagonals): a live cell with fewer than two live neighbors “dies” of loneliness; a live cell with more than three live neighbors “dies” of overcrowding; a live cell with two or three just stays alive; and a dead cell with exactly three live neighbors “comes to life.” That is all. You just watch it go, turn after turn. It’s just cool. There is no object other than watching the patterns that “grow” from various starting configurations.

(If the above sounds like a really stupid waste of time, you are not a geek. Please find another career.)
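
In code, the entire rule set is tiny. Here’s a minimal C# sketch of it, my own formulation rather than anything we wrote on the boat:

[sourcecode language=”csharp”]
// Next state of one cell, given its current state and its live-neighbor count.
static bool NextState(bool alive, int liveNeighbors)
{
    if (alive)
        return liveNeighbors == 2 || liveNeighbors == 3; // otherwise lonely or crowded
    return liveNeighbors == 3; // exactly three: a dead cell comes to life
}
[/sourcecode]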

Harder than it looks

This is actually kind of hard to implement in a real object-oriented way. How do you design your classes? Is there a Board that contains (“has-many”) an infinite number of Cell objects? That won’t work quite the way I made it sound, will it? How about keeping a collection (like a C# generic) of just the live Cells? (It’s a lot easier if you get to assume a limited playing field. But alas.)
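
To make that last question concrete, here’s one hedged sketch of the live-cells-only idea; every name in it is mine:

[sourcecode language=”csharp”]
using System.Collections.Generic;
using System.Drawing; // Point serves as a ready-made coordinate pair

public class World
{
    // Only live cells are stored; the "unlimited" board costs nothing,
    // because dead cells are represented by their absence.
    private HashSet<Point> live = new HashSet<Point>();

    public void SetAlive(int x, int y) { live.Add(new Point(x, y)); }
    public bool IsAlive(int x, int y)  { return live.Contains(new Point(x, y)); }

    public int LiveNeighbors(int x, int y)
    {
        int count = 0;
        for (int dx = -1; dx <= 1; dx++)
            for (int dy = -1; dy <= 1; dy++)
                if ((dx != 0 || dy != 0) && IsAlive(x + dx, y + dy))
                    count++;
        return count;
    }
}
[/sourcecode]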

What I learned

Somewhat importantly, Corey didn’t expect us to use any particular programming language. He encouraged us to pair with colleagues who were using languages we might not even know yet.

I learned a bit about object-oriented design, if by “learn” you mean “find things to argue with.” Briefly, master crafters like Corey seem to prefer a lot more abstraction in class design than I feel like wrapping my head around. I usually like a one-to-one relationship between code space and problem space; so to me an object is “a thing that does things,” and a class is the idea of what that object looks like and does. If I can’t think of it as a thing, it’s not an object; it’s a method. Maybe.

I didn’t get very far with my pair partners on any of the iterations. We kept getting hung up on the TDD cycle, confused with the problem space, or stuck on mundane tactical issues. In that way it was frustrating.

On the good side, I got to hang out and pair with Angela Harms on an iteration. There was a mutual aha moment when I said out loud, “A cell knows if it’s going to live or die.” Angela thought it was some Zen kind of cool. I thought I was having fun anthropomorphizing the code. We’re both right.

My man Chris Sanyk was there too. He wrote about it, and wrote about it again, and wrote about it yet one more time. Our favorite insight occurred when Corey said something about running unit tests only on public methods, which annoys me because sometimes all the interesting logic is in private methods. I was launching into a “rant” on this topic, according to Chris, when Chris simply said, “So write the test method inside your class.”

I was stunned for a moment, then opened my mouth to argue, realized what I was about to say would be wrong, then started to say something else that was wrong, and finally looked at Chris and went “aaaaaaah.”

Is that the right answer? Beats me. It’s not mainstream as far as I know. But it turned my idea of unit testing on its side.
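
For the curious, here’s roughly what Chris’s suggestion might look like with NUnit. This is my own invented example, not anything we wrote that day (and whether you want your production assembly referencing the test framework is a whole separate argument):

[sourcecode language=”csharp”]
using NUnit.Framework;

[TestFixture]
public class CellCounter // the production class itself
{
    // The "interesting logic" hiding in a private method.
    private int LiveNeighbors(bool[,] grid, int row, int col)
    {
        int count = 0;
        for (int dr = -1; dr <= 1; dr++)
            for (int dc = -1; dc <= 1; dc++)
            {
                if (dr == 0 && dc == 0) continue;
                int r = row + dr, c = col + dc;
                if (r >= 0 && r < grid.GetLength(0) &&
                    c >= 0 && c < grid.GetLength(1) && grid[r, c])
                    count++;
            }
        return count;
    }

    // The test lives inside the class, so it can see the private method.
    [Test]
    public void LiveNeighbors_SeesAllEightDirections()
    {
        bool[,] grid = new bool[3, 3];
        grid[0, 0] = grid[0, 1] = grid[1, 0] = true;
        Assert.AreEqual(3, LiveNeighbors(grid, 1, 1));
    }
}
[/sourcecode]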

What to ask first

Sometimes you can’t make the project not suck right away. You need to do a pre-unsucking project to figure out the actual unsucking.

Oh my. Here’s a Project That Sucks. It sucks so much I seriously don’t (yet) know how to make it not suck. How crazy is that?

Good thing that Making It Not Suck isn’t the immediate task. My job now is Figuring Out How It Might Potentially Be Made To Not Suck. In other words, it’s a pre-unsucking evaluation. The deliverables: one report with a recommended technical process, and one report with business recommendations.

Continue reading “What to ask first”

When they won’t let you do Agile

If management won’t let you do Agile, don’t fight it. Find the best ways to apply the Principles regardless.

If you’re a fan of this blog, or if you’ve had concrete experience working on Agile software teams, you probably have a good sense of how well the Agile values and practices can really work.

Problem: Not everybody sees that. Especially (perhaps) your boss or project sponsor.

Continue reading “When they won’t let you do Agile”

The W part of DTSTTCPW

When developing iteratively, what exactly constitutes premature optimization? When doing the simplest thing that could possibly work, what does it mean to “work”?

A couple of weeks ago, a question turned up on the BaseCamp site we use to coordinate one of my projects. One programmer asked what we thought of a certain calculation he was setting up on the database. It had to do with accumulating “rating” points of an item in a tree-shaped threaded discussion.

Continue reading “The W part of DTSTTCPW”

The things you learn

There are no seven-page proofs in a 200-level class. Even when it’s Adelberg.

It was something like the spring of 1984, and I was (as usual) struggling in Professor Adelberg’s class on Series and Differential Equations. And as usual, I took advantage of his office hours, on the second floor of ARH.

I don’t remember where the proof was supposed to end up, but I had about seven handwritten pages that sort of got me there. Not quite.

Continue reading “The things you learn”

One of those seamless migrations

So when the client said it had to work on Oracle, like two years ago, but they would eventually migrate everything in the whole enterprise to Microsoft SQL Server (sigh), we went with Doing The Simplest Thing That Could Possibly Work (DTSTTCPW). In retrospect, that was kind of brilliant.

Working alongside my geek pal Beth, I wrote the Web Service to work only with Oracle, figuring the Microsoft issue was for later. You literally can’t save time as such–time goes by whether you’re doing anything with it or not!–so it didn’t make sense to write both interfaces at once. Doing so could only have “saved time” in the sense of spending the present on something unnecessary.

Let’s not get crazy here

There’s a difference between “the simplest thing” and “the simplest thing that could possibly work,” though, depending on what you mean by “work.” For this project, making it “work” definitely meant not painting ourselves into the proverbial corner. Obviously it made sense to separate the database-specific stuff from most of the business logic. And we implemented a fair amount of code in stored procedures, which would definitely have to be rewritten after the engine migration. That’s okay.

Shazam!

Then a funny thing happened with implementation. The DBA group informed Beth that they’d have to review all of our stored procedures and charge the hourly cost back to our project. Which didn’t have a budget for that. She went back to the client and said we’d have to rewrite a lot of our code to do without stored procedures or they’d have to figure out the chargeback issue with the DBA staff.

The client chose the former, so we took all those lovingly hand-crafted Oracle stored procedures and converted them to ugly C# logic sprinkled with OracleCommand objects.

Guess what?

Right there, we more than paid for the decision not to support both Oracle and Microsoft at the outset. We would have written stored procedures for Microsoft SQL Server, or at least the client-side support for them, for no reason at all. Skipping that work must have saved eighty hours or so.

Anyway

A couple of weeks ago, Beth let me know the client was finally ready for the Microsoft migration. We decided that we wanted both Oracle and Microsoft support in the runtime, so the switch between database engines could be done in configuration rather than at build time. Which meant we couldn’t just drop in MS equivalents of all our Oracle client calls. We had to set up both.

The technique we came up with was pretty clean, probably about what you’re thinking of.

  1. Hit every Web Method, wrapping every command parameter in an overloaded method called AddInputParameter() or one called AddInputParameterWithValue(), or in a few cases AddOutputParameter(). These methods instantiated an OracleParameter, initialized it where required, and added it to the indicated IDbCommand’s parameter list. (There’s a sketch of one of these wrappers just after this list.)
  2. Every place that used an OracleConnection was actually fine with an IDbConnection. Search and replace.
  3. Ditto OracleCommand and IDbCommand.
  4. (Later on) modified the getConnection() factory method we’d already made to return an OracleConnection object so that it returned either an OracleConnection or a SqlConnection depending on a compiler setting. But the declared return type was IDbConnection for compatibility. That meant we could throw connection objects around without caring about where they came from. (The magic of polymorphism.)
  5. Ditto the getCommand() factory method.
  6. Had to rewrite one particularly gnarly method that copied a record in a table, more or less in place except for the primary key.
  7. Safely ignored all the stored procedure client calls. We knew they weren’t being used anyway.
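
Here’s a hedged reconstruction of one of those wrappers from step 1. The method names come from the list above; the bodies are my guesswork:

[sourcecode language=”csharp”]
using System.Data;
using System.Data.OracleClient; // or whichever Oracle provider you use

// Wrap the Oracle-specific parameter plumbing so calling code only ever
// sees the provider-neutral IDbCommand.
static void AddInputParameter(IDbCommand com, string name, DbType type)
{
    OracleParameter p = new OracleParameter();
    p.ParameterName = name;
    p.DbType = type;
    p.Direction = ParameterDirection.Input;
    com.Parameters.Add(p);
}

static void AddInputParameterWithValue(IDbCommand com, string name, object value)
{
    OracleParameter p = new OracleParameter();
    p.ParameterName = name;
    p.Value = value; // the provider infers the type from the value
    p.Direction = ParameterDirection.Input;
    com.Parameters.Add(p);
}
[/sourcecode]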

Now this got us to the point where the Web Service class had methods that touched Oracle and methods that touched those methods. Period.

It was easy to drag the Oracle methods into a class called FooDbOracle. It was also fairly easy to convert all existing calls to those methods to go through the FooDbOracle object. Thus:

[sourcecode language=”csharp”]
IDbCommand com = getCommand();
AddInputParameter(com, "xyzzy", DbType.Int32);
[/sourcecode]

became

[sourcecode language=”csharp”]
IDbCommand com = getCommand();
db.AddInputParameter(com, "xyzzy", DbType.Int32);
[/sourcecode]

The db pseudo-variable was itself just a property of the main class:

[sourcecode language=”csharp”]
FooDbOracle db
{
    get
    {
        return new FooDbOracle();
    }
}
[/sourcecode]

We ran the unit tests–the few that we’d actually bothered to write ahead of time anyway–and found success.

Given that the FooDbOracle class worked so well, it was easy to abstract an interface called FooDb. And then I re-implemented that interface as FooDbMsSql, so we then had two engine-specific classes that implemented the same FooDb interface.
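
Roughly what that extraction might have looked like; the member list here is my guess, since the real FooDb presumably mirrored everything FooDbOracle had grown by then:

[sourcecode language=”csharp”]
using System.Data;

public interface FooDb
{
    void AddInputParameter(IDbCommand com, string name, DbType type);
    void AddInputParameterWithValue(IDbCommand com, string name, object value);
    void AddOutputParameter(IDbCommand com, string name, DbType type);
    // ...plus whatever else the data-access surface had accumulated.
}
[/sourcecode]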

The next thing was to figure out at runtime which implementation of the FooDb interface to use. Since we set things up so the calling code always went through that localized db property, it was really easy to modify its get to act as something like a factory method. It produces a FooDbOracle or a FooDbMsSql depending on a configuration setting (which is not interesting here).

[sourcecode language=”csharp”]
FooDb db
{
    get
    {
        switch (which_engine_config_file_says_to_use) // <== obvious pseudocode
        {
            case ORACLE:
                return new FooDbOracle();
            case MSSQL:
                return new FooDbMsSql();
            default:
                throw new NotImplementedException("Only Microsoft SQL Server and Oracle are supported!");
        }
    }
}
[/sourcecode]

Regrets

We didn’t really have enough unit tests to make this a safe upgrade. I wrote much of the original code before being comfortable enough with NUnit to rely on it, and in the intervening couple of years Beth just didn’t feel like maintaining the unit tests to keep up with all her refactoring. So there’s an awkward homemade test suite that doesn’t cover very much.

Also, this code is clearly not optimized for performance. We’re creating that db object over and over again when it should obviously be cached. And there should be connection pooling too. But the service wasn’t designed to support persistence, and we’d probably have to pay some attention to concurrency if we’re going to pool connections, and it honestly just wasn’t worth the time to think about as this application doesn’t handle a very high volume of interactions.
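
For the record, the caching we skipped would be about a three-line change, something like this (field name invented; note it still ignores the concurrency questions, which is part of why we skipped it):

[sourcecode language=”csharp”]
private FooDb _db; // hypothetical cache field

FooDb db
{
    get
    {
        if (_db == null)
        {
            _db = new FooDbOracle(); // or go through the config-driven switch above
        }
        return _db;
    }
}
[/sourcecode]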

Again, DTSTTCPW dictates we go with the techniques that work pretty well in the simple cases that we’re actually facing in production. We can worry about the load issues later if they ever arise, but they probably won’t. (I’m pretty sure that about 95% of anticipated performance issues are imaginary.)

Is this perfect?

Definitely not. It could be faster. It could be easier to maintain and more flexible than it is. But it got done! When we tally up the time for billing purposes it might end up around, oh, like fifty hours total. Best of all, we can roll this out in every shop that is ready, and flip the configuration switch when the DBAs give the all-clear.

Coding sideways

You don’t have to hack legacy code from top to bottom. You can go bottom to top too!

Am I right that some of the hardest programming is when you’re modifying some existing code that is not quite well enough documented?

I’m looking at this scientific application. I’ve struggled through all the code that leads up to drawing some graphs, and the graphs are still obviously wrong. Between me and success I have a few layers of trigonometry and matrix algebra.

Obviously the physicist who wrote the original code had some intention for the methods with names like get_x() and InverseTransform(). And I’ve gotten through about half of this stuff with revelations like, “Oh, this converts the photo grid coordinates into screen coordinates!” or “I can cut this scaling out entirely because I already have a transformation matrix on the Graphics context.”

But you know what would really help?

Continue reading “Coding sideways”

Access: Why not?

I don’t suppose Access is ever the only solution to a problem. But in my experience it’s frequently a reasonable one and it spares a lot of drama over acquiring and installing a database engine.

I blogged a few times recently about ways to make Access databases do kind of what you want when you’re programming with .NET. There was this one about multiple JOIN syntax. Then this one about “parameter” errors. And finally this one about weird column names. Yuck!

Continue reading “Access: Why not?”

When your C# has to be fast

It’s a pain to test, and it’s sometimes hazardous, and it contaminates your entire application with the “unsafe” label, but dropping under the CLR and writing fast code with pointers is sometimes the only way to get acceptable runtime speed.

When Microsoft came up with the .NET environment and the snazzy new C# programming language to go with it, one of the design goals was to support code that could be shoved willy-nilly across insecure networks. Thus the Common Language Runtime (CLR), which among other things creates a runtime environment that’s a little like the Java Virtual Machine (JVM) from Sun.

Code for .NET doesn’t execute directly on the CPU, and it doesn’t talk directly to the operating system. Instead, the CLR–part of that big download when you pull “the .NET Framework” from Microsoft’s download site–treats your code as input and does whatever needs to be done at the CPU and operating-system level for you.

Safety first, usually

Why would anyone bother with that? Among other things, the CLR itself is assumed to be “safe.” The CLR won’t do anything to interfere with processes that it’s not supposed to control; it observes security restrictions; and it doesn’t let you access code or data through pointers. So in theory, and most of the time in practice, CLR code won’t mess up your computer and therefore it is safe to run CLR code from untrusted sources.

The downside, again one among many, is that as a programmer you don’t have access to pointers in safe code. They’re automatically considered “unsafe.” Any code with pointers in it is “unsafe.” Any code that uses code with pointers in it… is “unsafe.” Any code that uses code that uses code… blah blah blah.

When it has to be fast

Right now I’m working on a .NET application that needs to pull a few megabytes of data from a legacy device driver and do a lot of math on it. It’s too slow to load all the data into a big byte[] array, and then to access each element one by one for calculations. Instead, I wrote something like this, with the names changed to protect the innocent:

[sourcecode language=”csharp”]
using System;
using System.Collections.Generic;
using System.IO;
using System.Runtime.InteropServices;

public unsafe class BitContainer
{
    private const long FRAME_WIDTH = 640; // example value; the real width comes from the device

    private IntPtr data;

    public static int Size; // initialized elsewhere

    public static int Count // however many ushorts can fit into Size
    {
        get { return Size / sizeof(ushort); }
    }

    public BitContainer()
    {
        // Unmanaged buffer: the CLR never sees it, so it never bounds-checks it.
        data = Marshal.AllocHGlobal(Size);
    }

    ~BitContainer()
    {
        Marshal.FreeHGlobal(data);
    }

    public void LoadFromStream(BinaryReader input)
    {
        ushort* p = (ushort*) data.ToPointer();
        for (int i = 0; i < Count; i++)
        {
            *p = input.ReadUInt16();
            p++;
        }
    }

    public void GetItemsOverThreshold(int threshold, int max, ref List<Item> items)
    {
        items.Clear();
        ushort* p = (ushort*) data.ToPointer();
        for (long i = 0; i < Count; i++, p++)
        {
            if (*p >= threshold)
            {
                if (items.Count >= max)
                {
                    throw new ApplicationException("Too many items passing! Turn down the gain?");
                }
                long x = i % FRAME_WIDTH;
                long y = i / FRAME_WIDTH;
                items.Add(new Item(x, y));
            }
        }
    }
}

// Minimal stand-in for the application's real Item type.
public class Item
{
    public long X;
    public long Y;
    public Item(long x, long y) { X = x; Y = y; }
}
[/sourcecode]

The alternative to tearing through data with a C-style pointer would have been to pick through the elements of a .NET array of byte or short values. Each array access goes through the CLR for bounds-checking and the like, and when you’re hitting literally millions of input values that’s just too slow. I originally did the BitContainer code above that way, and it took hours.
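
For contrast, the safe version looked more or less like this (a sketch, reusing the invented names from the block above); every data[i] access pays for a CLR bounds check:

[sourcecode language=”csharp”]
// Same scan as GetItemsOverThreshold, but over a managed array.
public void GetItemsOverThresholdSafe(ushort[] data, int threshold, List<Item> items)
{
    items.Clear();
    for (long i = 0; i < data.Length; i++)
    {
        if (data[i] >= threshold)
        {
            items.Add(new Item(i % FRAME_WIDTH, i / FRAME_WIDTH));
        }
    }
}
[/sourcecode]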

The takeaway

It’s a pain to test, and it’s sometimes hazardous, and it contaminates your entire application with the unsafe label, but dropping under the CLR and writing fast code with pointers is sometimes the only way to get acceptable runtime speed.

Code Farming: Sprouting Some Methods

You probably can’t impose unit testing on a whole system all at once. But you probably can increase the portion of the code under test a little bit at a time. When you’re busy but need to make small changes, consider Sprout Method (59).

I am trying, with only partial success, to apply what I’ve learned in Working Effectively With Legacy Code by Michael C. Feathers.

Feathers is a huge advocate of test-driven development. He puts it out there on page xvi: “Code without tests is bad code.” He defines “legacy code” as, strictly speaking, any code that isn’t already under unit tests. At first it struck me as a funny definition, because obviously lots of code is written today–even by me–without unit tests, and how can it be right to refer to software nobody’s even thought of yet as “legacy”? But for purposes of the book it works.
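
For anyone who hasn’t read the book: Sprout Method means growing the new behavior as a fresh, testable method and calling it from the untested legacy code, rather than editing the legacy method in place. A hedged illustration, with names I made up rather than anything from Feathers:

[sourcecode language=”csharp”]
using System.Collections.Generic;
using System.Linq;

public class Invoice { public bool Posted; }

public class InvoicePoster // legacy class, not under test
{
    public void PostInvoices(List<Invoice> invoices)
    {
        // ...the existing untested logic stays untouched...
        foreach (Invoice invoice in SelectPostable(invoices))
        {
            // ...post it, as before...
        }
    }

    // The "sprout": the new behavior in its own method, trivially testable.
    public static List<Invoice> SelectPostable(List<Invoice> invoices)
    {
        return invoices.Where(i => !i.Posted).ToList();
    }
}
[/sourcecode]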

Continue reading “Code Farming: Sprouting Some Methods”