What does it mean to do a "null check" in C or C++?

  • I have been learning C++ and I am having a hard time understanding null. In particular, the tutorials I have read mention doing a "null check", but I am not sure what that means or why it's necessary.

    • What exactly is null?
    • What does it mean to "check for null"?
    • Do I always need to check for null?

    Any code examples would be much appreciated.

    I would advise getting some better tutorials, if all the ones you read talk about null checks without ever explaining them or providing example code...

  • In C and C++, pointers are inherently unsafe, that is, when you dereference a pointer, it is your own responsibility to make sure it points somewhere valid; this is part of what "manual memory management" is about (as opposed to the automatic memory management schemes implemented in languages like Java, PHP, or the .NET runtime, which won't allow you to create invalid references without considerable effort).

    A common solution that catches many errors is to set all pointers that don't point to anything to NULL (or, in C++, 0, or nullptr as of C++11), and to check for that before accessing the pointer. Specifically, it is common practice to initialize all pointers to NULL (unless you already have something to point them at when you declare them), and to set them to NULL when you delete or free() them (unless they go out of scope immediately after that). Example (in C, but also valid C++):

    void fill_foo(int* foo) {
        *foo = 23; // this will crash and burn if foo is NULL
    }
    

    A better version:

    #include <stdio.h>

    void fill_foo(int* foo) {
        if (!foo) { // this is the NULL check
            printf("This is wrong\n");
            return;
        }
        *foo = 23;
    }
    

    Without the null check, passing a NULL pointer into this function will typically cause a segfault (formally, the behavior is undefined), and there is nothing you can do - the OS will simply kill your process and maybe core-dump or pop up a crash report dialog. With the null check in place, you can perform proper error handling and recover gracefully - correct the problem yourself, abort the current operation, write a log entry, notify the user, whatever is appropriate.
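
    To make the "recover gracefully" part concrete, here is a minimal sketch, assuming C++11 for nullptr. The names fill_foo_checked and read_value are hypothetical and not part of the original example; the idea is only that the function reports the problem and the caller decides what to do about it.

```cpp
#include <cstdio>

// Hypothetical variant of fill_foo that reports failure instead of printing.
// Returns true on success, false if foo was a null pointer.
bool fill_foo_checked(int* foo) {
    if (foo == nullptr) {   // the null check
        return false;       // let the caller decide how to recover
    }
    *foo = 23;
    return true;
}

// Hypothetical caller: logs the problem and falls back to a default value.
int read_value(int* maybe_null) {
    if (!fill_foo_checked(maybe_null)) {
        std::fprintf(stderr, "fill_foo_checked: null pointer, using default\n");
        return -1;
    }
    return *maybe_null;
}
```

    Returning a status is only one option; throwing an exception or logging and aborting the current operation would fit the same pattern.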

    But this works better in a language that initialises all variables to zero automatically. In C++, when you create a pointer, it can point anywhere, and NULL checks won't work.

    @MrLister what do you mean, null checks don't work in C++? You just have to initialise the pointer to null when you declare it.

    What I mean is, you must remember to set the pointer to NULL or it won't work. And if you remember, in other words if you *know* that the pointer is NULL, you won't have a need to call fill_foo anyway. fill_foo checks if the pointer has a value, not if the pointer has a *valid* value. In C++, pointers are not guaranteed to be either NULL or have a valid value.

    An assert() would be a better solution here. There's no point trying to "be safe". If NULL was passed in, it's obviously wrong, so why not just crash explicitly to make the programmer fully aware? (And in production, it doesn't matter, because you've *proven* that nobody will call fill_foo() with NULL, right? Really, it's not that hard.)
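
    For comparison, the assertion-based approach suggested in the comment above might look roughly like this. It is a sketch only: fill_foo_asserted is a hypothetical name, and nullptr assumes C++11.

```cpp
#include <cassert>

// Assertion-based variant: a null argument is treated as a programming error.
// In a debug build, assert() aborts and reports the file and line.
// When NDEBUG is defined, the check is compiled out entirely.
void fill_foo_asserted(int* foo) {
    assert(foo != nullptr && "fill_foo_asserted: foo must not be null");
    *foo = 23;
}
```

    The trade-off is that a release build performs no check at all, so this only helps if debug builds are actually exercised.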

    @MrLister in a very simple application, this may be true. I can think of many more complex situations where you may want to try dereferencing a pointer without knowing whether or not it is valid. If you ensure all pointers are initialised to null in your constructor initialisation, and you set a pointer back to null when destroying the object it references, then you can assume that the pointer will either have a valid value or be NULL.

    @AmbrozBizjak Trying to dereference a pointer may not be the be-all/end-all of the application. It's perfectly reasonable to expect that the "error" may be recoverable. I believe tdammers' example was probably meant to be simplistic to get a point across, as in "this is what null checking looks like in C++".

    Don't forget to mention that an even better version of this function should use references instead of pointers, making the NULL check obsolete.

    @MrLister but you make a good point that people working with C++ should be aware it doesn't auto-initialise values on variables.

    @MrLister I don't think it's too much hassle to remember to initialise your variables.

    @AmbrozBizjak If you're providing a framework instead of an application you can't prove anything, and have to think like the worst programmer

    @James: same applies for frameworks. Document the interface and put assertions in, and expect the framework users to read the documentation, or at the very least test with a debug build of your framework with assertions enabled. If they don't do either of those, they obviously have no intention of writing reliable code.

    This is not what manual memory management is about, and a managed program will blow up too (or at least raise an exception, just like a native program will in most languages) if you try to dereference a null reference.

    @TZHX: No, it doesn't auto-initialize *some primitives*. The two are not the same. Oh, and, in C++, it's `nullptr`.

    It's only `nullptr` for the lucky few that can use a compiler that supports C++11. Most of us are still stuck on older compilers. If you have an older compiler, you are stuck with NULL.
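
    One practical difference between 0 and `nullptr`, for those who do have a C++11 compiler, shows up in overload resolution. The two describe overloads below are hypothetical and exist only to show which one gets picked.

```cpp
#include <string>

// Two hypothetical overloads, used only to show which one is selected.
std::string describe(int)  { return "int"; }
std::string describe(int*) { return "pointer"; }

// describe(0) selects the int overload: the literal 0 is an int first and
// only a null pointer constant by conversion.
// describe(nullptr) selects the pointer overload: nullptr has its own type
// (std::nullptr_t), which converts to pointer types but not to int.
// describe(NULL) is deliberately not used here: depending on how the
// implementation defines NULL, it may select the int overload or be ambiguous.
```

    With these definitions, describe(0) yields "int" and describe(nullptr) yields "pointer".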

    @MasonWheeler: Manual memory management is about taking care of memory allocation yourself. Knowing which of your pointers point to something valid is a vital part of that. I admit my wording was a bit unfortunate, I'll edit in a second.

    @MrLister: Read the answer again. Even though the examples don't show it, I explicitly recommend initializing all your pointers to NULL.

  • The other answers pretty much covered your exact question. A null check is made to be sure that the pointer you received actually points to a valid instance of a type (objects, primitives, etc).

    I'm going to add my own piece of advice here, though. Avoid null checks. :) Null checks (and other forms of Defensive Programming) clutter code up, and actually make it more error prone than other error-handling techniques.

    My favorite technique when it comes to object pointers is to use the Null Object pattern. That means returning a (pointer - or even better, reference to an) empty array or list instead of null, or returning an empty string ("") instead of null, or even the string "0" (or something equivalent to "nothing" in the context) where you expect it to be parsed to an integer.
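
    As a rough illustration of the "return an empty list instead of null" idea, here is a sketch assuming C++11. ScoreBook, scores_for, and total_score are hypothetical names invented for this example.

```cpp
#include <map>
#include <string>
#include <vector>

// Hypothetical lookup table from a user name to that user's scores.
class ScoreBook {
public:
    void add(const std::string& user, int score) {
        scores_[user].push_back(score);
    }

    // Null Object style: for an unknown user, return a reference to a shared
    // empty vector instead of a null pointer.
    const std::vector<int>& scores_for(const std::string& user) const {
        static const std::vector<int> kEmpty;
        auto it = scores_.find(user);
        return it != scores_.end() ? it->second : kEmpty;
    }

private:
    std::map<std::string, std::vector<int>> scores_;
};

// The caller loops without a null check; an empty result simply means
// the loop body never runs.
int total_score(const ScoreBook& book, const std::string& user) {
    int total = 0;
    for (int s : book.scores_for(user)) {
        total += s;
    }
    return total;
}
```

    Note that the caller can no longer tell "unknown user" apart from "known user with no scores"; whether that matters depends on the context.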

    As a bonus, here's a little something you might not have known about the null pointer, which was (first formally) implemented by C.A.R. Hoare for the Algol W language in 1965.

    I call it my billion-dollar mistake. It was the invention of the null reference in 1965. At that time, I was designing the first comprehensive type system for references in an object oriented language (ALGOL W). My goal was to ensure that all use of references should be absolutely safe, with checking performed automatically by the compiler. But I couldn't resist the temptation to put in a null reference, simply because it was so easy to implement. This has led to innumerable errors, vulnerabilities, and system crashes, which have probably caused a billion dollars of pain and damage in the last forty years.

    Null Object is even worse than just having a null pointer. If an algorithm X requires data Y which you do not have, then that is a *bug in your program*, which you are simply hiding by pretending that you do.

    It depends on the context, and either way testing for "data presence" beats testing for null in my book. From my experience, if an algorithm works on, say, a list, and the list is empty, then the algorithm simply has nothing to do, and it accomplishes that by just using standard control statements such as for/foreach.

    If the algorithm has nothing to do, then why are you even calling it? And the reason you might have wanted to call it in the first place is *because it does something important*.

    @DeadMG Because programs are about input, and in the real world, unlike homework assignments, input can be irrelevant (e.g. empty). Code still gets called either way. You have two options: either you check for relevance (or emptiness), or you design your algorithms so that they read and work well without explicitly checking for relevance using conditional statements.

    I came here to make almost the same comment, so gave you my vote instead. However, I would also add that this is representative of a bigger problem of *zombie objects* - anytime you have objects with multi-stage initialisation (or destruction) that are not fully live but not quite dead. When you see "safe" code in languages without deterministic finalisation that has added checks in every function to see if the object has been disposed, it is this general problem rearing its head. You should never if-null, you should work with states that have the objects they need for their lifetime.

    Which also means no Null Object...

  • The null pointer value represents a well-defined "nowhere"; it is an invalid pointer value that is guaranteed to compare unequal to any other pointer value. Attempting to dereference a null pointer results in undefined behavior, and will usually lead to a runtime error, so you want to make sure a pointer is not NULL before attempting to dereference it. A number of C and C++ library functions will return a null pointer to indicate an error condition. For example, the library function malloc will return a null pointer value if it cannot allocate the number of bytes that have been requested, and attempting to access memory through that pointer will (usually) lead to a runtime error:

    int *p = malloc(sizeof *p * N);
    p[0] = ...; // this will (usually) blow up if malloc returned NULL
    

    So we need to make sure the malloc call succeeded by checking the value of p against NULL:

    int *p = malloc(sizeof *p * N);
    if (p != NULL) // or just if (p)
      p[0] = ...;
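
    A complete version of the same check, written as C++ (where the result of malloc needs a cast), might look like the sketch below. make_filled is a hypothetical helper, and overflow of n * sizeof(int) is not handled here.

```cpp
#include <cstddef>
#include <cstdlib>

// Hypothetical helper: allocates n ints and sets each one to `value`.
// Returns a null pointer if n is zero or if malloc fails.
// The caller is responsible for calling free() on a non-null result.
int* make_filled(std::size_t n, int value) {
    if (n == 0) {
        return nullptr;
    }
    // C++ requires the cast from void*; C does not.
    int* p = static_cast<int*>(std::malloc(n * sizeof *p));
    if (p == nullptr) {      // the null check on malloc's result
        return nullptr;      // propagate the failure to the caller
    }
    for (std::size_t i = 0; i < n; ++i) {
        p[i] = value;
    }
    return p;
}
```

    The caller then has to perform its own null check on the returned pointer before using it.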
    

    Now, hang on to your socks a minute, this is going to get a bit bumpy.

    There is a null pointer value and a null pointer constant, and the two are not necessarily the same. The null pointer value is whatever value the underlying architecture uses to represent "nowhere". This value may be 0x00000000, or 0xFFFFFFFF, or 0xDEADBEEF, or something completely different. Do not assume that the null pointer value is always 0.

    The null pointer constant, OTOH, is always a 0-valued integral expression. As far as your source code is concerned, 0 (or any integral expression that evaluates to 0) represents a null pointer. Both C and C++ define the NULL macro as the null pointer constant. When your code is compiled, the null pointer constant will be replaced with the appropriate null pointer value in the generated machine code.

    Also, be aware that NULL is only one of many possible invalid pointer values; if you declare an auto pointer variable without explicitly initializing it, such as

    int *p;
    

    the value initially stored in the variable is indeterminate, and may not correspond to a valid or accessible memory address. Unfortunately, there's no (portable) way to tell if a non-NULL pointer value is valid or not before attempting to use it. So if you're dealing with pointers, it's usually a good idea to explicitly initialize them to NULL when you declare them, and to set them to NULL when they're not actively pointing to anything.

    Note that this is more of an issue in C than C++; idiomatic C++ shouldn't use pointers all that much.
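
    The advice to initialize pointers to null and to reset them once they no longer point at anything can be sketched as follows, assuming C++11. Slot is a hypothetical type made up for this illustration; idiomatic C++ would more likely use std::unique_ptr.

```cpp
// Hypothetical owner of one heap-allocated int.
struct Slot {
    int* value = nullptr;              // initialized to null at declaration

    Slot() = default;
    Slot(const Slot&) = delete;        // copying would lead to a double delete
    Slot& operator=(const Slot&) = delete;
    ~Slot() { clear(); }

    void set(int v) {
        if (value == nullptr) {
            value = new int(v);        // nothing allocated yet
        } else {
            *value = v;                // reuse the existing allocation
        }
    }

    void clear() {
        delete value;                  // delete on a null pointer is a no-op
        value = nullptr;               // reset so later null checks are meaningful
    }

    bool has_value() const { return value != nullptr; }
};
```

    Because value is always either null or pointing at a live int, has_value() is a reliable check, and calling clear() twice is harmless.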

  • There are a couple of methods, all essentially do the same thing.

    int *foo = NULL;  //sometimes set to 0x00 or 0 or 0L instead of NULL
    

    null check (check if the pointer is null), version A

    if( foo == NULL)
    

    null check, version B

    if( !foo )  //a null pointer converts to false, so !foo is true when foo is null
    

    null check, version C

    if( foo == 0 )
    

    Of the three, I prefer to use the first check as it explicitly tells future developers what you were trying to check for AND it makes it clear that you expected foo to be a pointer.
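
    To show that the three forms agree, here is a small sketch that wraps each check in a function. The is_null_a, is_null_b, and is_null_c names are hypothetical.

```cpp
#include <cstddef>  // defines NULL

// Each function applies one of the three checks above.
bool is_null_a(const int* foo) { return foo == NULL; }  // version A
bool is_null_b(const int* foo) { return !foo; }         // version B
bool is_null_c(const int* foo) { return foo == 0; }     // version C
```

    All three return true for a null pointer and false for a pointer to an object, so the choice between them is a matter of readability.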

  • You don't. The only reason to use a pointer in C++ is because you explicitly want the presence of null pointers; else, you can take a reference, which is both semantically easier to use and guarantees non-null.
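
    A reference-taking version of the earlier fill_foo example might look like the sketch below. fill_foo_ref and fill_foo_ptr are hypothetical names, and nullptr assumes C++11.

```cpp
// Reference-taking variant: the callee needs no null check, because a
// reference must be bound to an object.
void fill_foo_ref(int& foo) {
    foo = 23;
}

// Callers that only hold a pointer check it once, at the boundary,
// before dereferencing it to form the reference.
bool fill_foo_ptr(int* foo) {
    if (foo == nullptr) {
        return false;
    }
    fill_foo_ref(*foo);
    return true;
}
```

    The null check does not disappear entirely; it moves to the one place where a pointer is turned into a reference.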

    What if you're in kernel mode, have no exception handler and need to find out if a 'new' failed?

    @James: Is a completely special case and has no relevance to general C++ advice.

    @James: 'new' in kernel mode?

    @DeadMG What, pray tell, is 'general C++'?

    @NemanjaTrifunovic Yes, why not? You can't do it for real-time sure, but there's nothing stopping you otherwise.

    @James: An implementation of C++ which represents the capabilities that a significant majority of C++ coders enjoy. That includes *all* C++03 language features (except `export`) and all C++03 library features *and* TR1 *and* a good chunk of C++11.

    I *do* wish people wouldn't say that "references guarantee non-null." They don't. It is as easy to generate a null reference as a null pointer, and they propagate the same way.

    @mjfgates: It is most assuredly guaranteed. Dereferencing a NULL pointer is UB, whereas having one is not, and it is, generally, impossible to gain a NULL reference unless you have already violated your invariants elsewhere, as the most common way to gain a reference is not at all by de-referencing a pointer, and binding it to a variable or temporary guarantees non-NULL.

    -1. Another C++ idealist chooses to tell the op the correct way to use C++, while at the same time *completely ignoring the question*. How could advice on how to avoid NULL be useful when he doesn't even know what NULL is? Answer the question first. Evangelize correct coding principles after.

    @Stargazer: The question is 100% redundant when you just use the tools the way the language designers and good practice suggest you should.

    @DeadMG, it doesn't matter whether it is redundant. You *didn't answer the question*. I'll say it again: -1.

    The OP also asks about C

    References _promise_ non-null. And null references are undefined behaviour, so a correct program will never have a null reference. But that doesn't mean _your_ program is correct and doesn't run into a null reference. It does mean unfortunately that when you write "if (&ref == NULL)..." the compiler can say that null references are UB, so &ref _cannot_ be NULL (even if it is NULL).

    If you don't check for NULL, especially when the pointer points to a struct, you may introduce a security vulnerability: a NULL pointer dereference. A NULL pointer dereference can lead to other serious security problems, such as buffer overflows or race conditions, that can allow an attacker to take control of your computer.

    Many software vendors, such as Microsoft, Oracle, Adobe, and Apple, release software patches to fix these security vulnerabilities. I think you should check every pointer for NULL :)

Licensed under CC-BY-SA with attribution


Content dated before 6/26/2020 9:53 AM