When Microsoft came up with .NET environment and the snazzy new C# programming language to go with it, one of the design goals was to support code that could be shoved willy-nilly across unsecure networks. Thus the Common Language Runtime (CLR), which among other things creates a runtime environment that’s a little like the Java Virtual Machine (JVM) from Sun.
Code for .NET doesn’t execute directly on the CPU, and it doesn’t talk directly to the operating system. Instead, the CLR runtime–part of that big download when you pull “the .NET Framework” from Microsoft’s download site–treats the code as input and does whatever needs to be done at the CPU and operating system level for you.
Safety first, usually
Why would anyone bother with that? Among other things, the CLR itself is assumed to be “safe.” The CLR won’t do anything to interfere with processes that it’s not supposed to control; it observes security restrictions; and it doesn’t let you access code or data through pointers. So in theory, and most of the time in practice, CLR code won’t mess up your computer and therefore it is safe to run CLR code from untrusted sources.
The downside, again one among many, is that as a programmer you don’t have access to pointers in safe code. They’re automatically considered “unsafe.” Any code with pointers in it is “unsafe.” Any code that uses code with pointers in it… is “unsafe.” Any code that uses code that uses code… blah blah blah.
When it has to be fast
Right now I’m working on a .NET application that needs to pull a few megabytes of data from a legacy device driver and do a lot of math on it. It’s too slow to load all the data into a big
byte array, and then to access each element one by one for calculations. Instead, I wrote something like this, with the names changed to protect the innocent:
unsafe public class BitContainer
private IntPtr data;
public static int Size; // initialized elsewhere
public static int Count // however many shorts can fit into Size
return Size / sizeof(short);
data = Marshal.AllocHGlobal(Size);
public void LoadFromStream(BinaryReader input)
ushort* p = (ushort*) data.ToPointer();
for (int i = 0; i < Count; i++)
*p = input.ReadUInt16();
public void GetItemsOverThreshold(int threshold, int max, ref List<Item> items)
ushort* p = (ushort *) data.ToPointer();
for (long i = 0; i < Count; i++)
if (*p >= threshold)
if (items.Count >= max)
throw new ApplicationException("Too many items passing! Turn down the gain?");
long x = i % FRAME_WIDTH;
long y = i / FRAME_WIDTH;
items.Add(new Item(x, y));
The alternative to tearing through data with a C-style pointer would have been to pick through the elements of a .NET array of byte or short values. Each array access goes through the CLR for bounds-checking and the like, and when you’re hitting literally millions of input values that’s just too slow. I originally did the BitContainer code above that way, and it took hours.
It’s a pain to test, and it’s sometimes hazardous, and it contaminates your entire application with the unsafe label, but dropping under the CLR and writing fast code with pointers is sometimes the only way to get acceptable runtime speed.
I am trying, with only partial success, to apply what I’ve learned in Working Effectively With Legacy Code by Michael C. Feathers.
Feathers is a huge advocate of test-driven development. He puts it out there on page xvi: “Code without tests is bad code.” He defines “legacy code” as, strictly speaking, any code that isn’t already under unit tests. At first it struck me as a funny definition, because obviously lots of code is written today–even by me–without unit tests, and how can it be right to refer to software nobody’s even thought of yet as “legacy”? But for purposes of the book it works.
It happens a lot, especially when working on legacy code, that you can’t figure out a “business logic” algorithm that isn’t already well documented. Sure, it’s in the code, but so are a million other things, and you can’t eyeball the part that does all the calculation. The client is asking for a change or a fix and you’re not sure where to start.
That’s when I think you can do three things at once: improve the overall structure, impose some unit testing, and solve the problem you were asked to. Refactoring does all of these.More
Because it’s the beginning of the week, I’m again presenting more about programming .NET with an Access database.
This question just came up on Stack Overflow. It reflects a pretty common misunderstanding of how C-style strings are represented by char pointers in both C and C++.
Greatly condensed, it goes:
- You read some data into a std:string object. You display the contents; it’s all there.
- You invoke c_str() on that std:string, and display its contents; it’s not all there.
I learned an interesting, although in retrospect somewhat obvious, technique from Michael Feathers’s great book, Working Effectively With Legacy Code.
Suppose you have a class that can’t be unit-tested in any automated way because it has side effects or requires user input. Or, in the case of my application, it drives hardware that I don’t actually have readily at hand. NUnit and the like provide input to methods and test the resulting output. How can you run tests on something where the “output” is the motion of a sensor arm, or the “input” is a stream of weather data?
The short answer is that you really can’t, but that shouldn’t stop you from setting up unit tests.
Last night I regaled The Smartest Guy I Know, as well as drsweetie’s high school pal Tony, with tales of how Kids These Days Don’t Know Anything and Nobody Does Things Right. It was a scene to warm the bitter soul of any software curmudgeon.
“So there I was,” I began, “optimizing the heck out of an ASP.NET application…” And my audience groaned–although I’m still not sure whether it was because they know how the story simply had to end, or because I was the teller and they couldn’t be sure that it would ever end. Because my stories get to be like that sometimes.