Wednesday, December 3, 2008

null is not a boolean

One coding practice I have a problem with is returning null to signify "no value here". This leads to null-check-littered code ans consequently decreases readability. Davy Brion has elaborated on this subject here and I can only agree wholeheartedly.

So, how do we deal with data that may or may not be present?

If possible, the best approach is to use the null object pattern. This allows the calling code to completely ignore the fact that the object is undefined. I'll not write anything more about it, check the web for more information.

There are however cases where the caller actually need to know if there is a value present or not. Consider the following code:
class Person
{
public Animal Pet { get; }
}
In this example, a person may or may not have a pet. If we use null to signify the absence of a pet then code that accesses a Person object would have to write code like:
void WritePet(Person person)
{
if (person.Pet != null)
{
Console.WriteLine("Pet name is: " + person.Pet.Name);
}
}
The first issue with this code is the "person.Pet != null" check. We're using the Pet reference do the extra work of keeping track of if there is a value present or not. The Pet property should be used to access the Pet, not to determine wither there is a value there or not.

The second issue is that nothing in this code indicates that null is a valid value. You could add this information in a comment but relying on comments should be avoided when possible.

I prefer the following construction:
class Person
{
private Animal pet = null;
public bool PetIsValid { get { return pet != null; } }
public Animal Pet { get { return pet; } }
}

Yes, this adds a bit of code to the class. And yes, I'm using the pet field as a boolean myself. The difference though is that I only perform the pet != null check internally in the class. It is not unreasonable for the class to know that the pet value may be null.

We can now rewrite the calling code as:
void WritePet(Person person)
{
if (person.PetIsValid)
{
Console.WriteLine("Pet name is: " + person.Pet.Name);
}
}

The first benefit of this pattern is that the calling code is simply more readable. (Yes, that is subjective.)

A second and more important benefit of applying this pattern consistently in a code base is that when you find a property that does not have an associated XyzIsValid property you can be sure that it will not be null!


(A small note on the naming: I think that HasPet would be a better name than PetIsValid. The drawback is that it would be harder to discover. Using the PetIsValid naming convention this method will appear in the intellisense list when you type person.Pet, hinting that you should probably check validity before proceeding).