Interpreting NULL Values

Objective

After completing this lesson, you will be able to interpret NULL values in databases and identify why their presence can lead to unexpected query results.

NULL Values: Interpretation and Origin

A NULL value can be used to express the following:

  • The corresponding value exists in principle but is unknown temporarily or permanently for the row at hand (such as the birthday of a person).

  • The corresponding value does not exist, for example the owner of an unregistered car.

  • NULL is a special value in SQL.

    • For numeric datatypes, NULL does not equal 0.

    • For string datatypes, NULL does not equal an empty string or a space character.

NULL Values: Origin

NULL values can result in the following cases:

  • When reading rows (the SELECT statement).

    • If the table queried from already contains NULL values.

    • When using outer joins.

  • When inserting or updating rows (INSERT and UPDATE statements).

    • If NULLs are permitted in a column, no value for that column is provided, and no default value is defined.

    • If a NULL value is explicitly specified for a column and NULL values are allowed.

  • When adding a column to a table (ALTER TABLE statement). If the table has at least one row, NULL values are allowed for the added column and no default value is defined for the added column.

NULL Values Lead to Three-Value Logic

By allowing NULL values, a third truth value is necessary. In addition to TRUE and FALSE there is also the truth value UNKNOWN in SQL. Any comparison with a NULL value other than the IS NULL or IS NOT NULL predicates results in the truth value UNKNOWN.

This applies to all value-based comparison operators: =, >, >=, <, <=, <>

AA = 7A <> 7A >= 6
5falsetruefalse
7truefalsetrue
NULLunknownunknownunknown

Three-Value Logic Extends Logical Operators: AND and OR

Two-value logic is a subset of three-value logic. When NULL values are eliminated, the AND test simplifies.

XYX AND Y
truetruetrue
truefalsefalse
falsetruefalse
falsefalsefalse
trueunknownunknown
falseunknownfalse
unknowntrueunknown
unknownfalsefalse
unknownunknownunknown

When NULL values are eliminated, the OR test also simplifies.

XYX AND Y
truetruetrue
truefalsetrue
falsetruetrue
falsefalsefalse
trueunknowntrue
falseunknownunknown
unknowntruetrue
unknownfalseunknown
unknownunknownunknown

The introduction of the logical value UNKNOWN and the way expressions are evaluated leads to an unexpected result. That is that logical expressions that always evaluate to true in usual programming languages and according to common sense no longer evaluate to TRUE in all cases. The following list shows some examples:

  • A = A
  • (A<5) OR (A>=5)
  • 0*A = 0
  • 2*A = A+A
  • A OR (NOT A)
  • MAX(...) >= ALL (<set of all values>)
  • MIN(...) <= ALL (<set of all values>)

Do these Queries have the Same Result?

Do the following two SQL queries have the same result?

  1. Code Snippet
    123
    SELECT Name, Overtime FROM Official WHERE Overtime <= 10 OR Overtime > 10;
  2. Code Snippet
    12
    SELECT Name, Overtime FROM Official;

The Queries May Have Different Results!

The two SQL queries have different results!

  1. Only officials where OvertimeIS NOT NULL applies:

    Code Snippet
    123
    SELECT Name, Overtime FROM Official WHERE Overtime <= 10 OR Overtime > 10;
    NAMEOVERTIME
    Mr A10
    Mr B10
    Ms C20
    Mr E10
    Mr F18
    Ms G22
  2. All officials (including those with: Overtime IS NULL )

    Code Snippet
    12
    SELECT Name, Overtime FROM Official;
    NAMEOVERTIME
    Mr A10
    Mr B10
    Ms C20
    Ms D?
    Mr E10
    Mr F18
    Ms G22
    Ms H?
    Mr I?