Design Tip – None or NA Instead of NULL

We are working in a world of existing applications with known technical issues couple with development and business techniques that emphasize moving quickly to fail quickly and learn from the mistake. Now that does not mean you should be selectively ignorant when beginning a new design and forget all the fundamentals. If you are the developer, then it should not be a problem to change. If you work more on the database side of things, then it can take some work to convince your developer colleagues to try something new.

The SQL Language and the underlying relation database theory does not handle NULLs well. Consequently, if you can design to avoid them, you should.

Two Fundamentals to consider

The NULL value is problematic when it comes to SET theory.

Imagine a table containing address information with a column for State or Province. The table contains 3 records with Florida as the state, 2 with NULL value as the state, and the other 45 records have various other states. 50 records in total. A query searching for all the Florida records would be rather simple and the WHERE clause looked like STATE = FLORIDA. You run the query and the 3 records are returned. Now if you want the records that are not in Florida intuitively the WHERE clause would look like STATE <> FLORIDA. The query will return 45 records omitting the 2 records with NULL value as the state. Not What you would expect. Your WHERE clause would need the extra or condition of STATE <> FLORIDA or STATE IS NULL.

The NULL values are not indexed.

Imagine a situation where the absence of a value was a criteria for retrieving and processing a record. So, something that most developers are familiar with is a ticketing system for software errors. The screen where you would login display all the records from a ticket table where the Assigned developer was equal to NULL(is NULL is the keyword) Logically this works and if you deployed this application it would appear to work normally as long as the table itself resided in the database server’s memory buffers. You would not notice until the table was very large that there was a problem and the search become slow.

Academically this is not discussed much because the slow behavior is related more to how the RDBMS implements the execution of SQL statements rather than the theory of the language. Basically it is platform specific. But if you replaced the NULL value with a string like unassigned. The problem is solved. The value unassigned will become part of the index and you can benefit from the index when the table grows in size.

Here is some of my favorite string replacements for NULL

  • NA
  • None
  • Unavailable
  • Undefined

So, are you convinced?

Let me know in the comments section what you think about limiting NULL values when possible. I know that there are NULL values very often in Date/Time that cannot be populated with a arbitrary value when the field has no value. That will be the subject of another blog post.

Have a nice day.

Published by Marek

Just a Database Administrator from Montreal.

Leave a Reply

Fill in your details below or click an icon to log in: Logo

You are commenting using your account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s

%d bloggers like this: