The Road to KDE Devland – step 2


Can anyone give me some pointers?

After one week with Sams Teach Yourself C++ in 21 Days, it feels like I have the basics of C++ down: constants and variables, functions, some operators, loops, if and switch statements etc. Object oriented programming was introduced surprisingly early (Day 6), and the memory discussion at the end of Day 5 took some time to digest, but other than that everything went smoothly.

In the second week, the book takes up a topic that I’ve found pretty hard: pointers and references. In this step, I’ll write about some of the things about pointers and references that confused me. It’s assumed that you already know the basics of pointers/references.

Why pointers?

Pointers are introduced with some silly examples that don’t make much sense – why manipulate a variable using a pointer instead of directly assigning a value to the variable? Soon, however, the book tells you why pointers exist. (In this case, I think I would’ve preferred the other way round – first present the problem, then talk about the solution(s). It’s probably the way Accelerated C++ approaches the subject, although I haven’t arrived at pointers in that book yet).

If you’ve studied some C++, you have probably come across the stack and the free store (the heap). Local variables (and function parameters) live on the stack, and are destroyed when the function returns. Objects on the heap, however, remain until you free the memory manually or the program ends.

In C++, you allocate space on the heap with new. new returns a pointer, which is how you access data on the heap. I like the analogy in Sams Teach Yourself C++ in 21 Days: let’s think of the memory address as a telephone number, say, to the local pizza store. Since you eat pizza all the time, you program your phone to call that number when you press a special button. You don’t have to remember the phone number anymore, and you don’t need to know where the pizzeria is located – you can still reach it by pressing that button on your phone.

Now back to programming. You can think of the pizzeria as being on the heap (it’s “somewhere”). Isn’t it good that our phone has this handy button? You probably know where I’m going – that button is our pointer to the pizzeria. In a similar way, you can access variables on the heap just by using pointers!
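Here is a minimal sketch of what that looks like in code, as far as I understand it. The names makeCounter and heapExample are made up for this example; the point is only that the int created with new survives the function that created it, and that somebody has to delete it later.

```cpp
#include <cassert>

// Hypothetical helper: creates an int on the heap and returns a pointer to it.
// The int outlives this function call; the caller is responsible for delete.
int *makeCounter(int start)
{
    int *counter = new int(start);  // allocated on the heap
    return counter;                 // the local pointer goes away, the int does not
}

void heapExample()
{
    int *counter = makeCounter(5);
    *counter += 1;          // access the heap value through the pointer
    assert(*counter == 6);
    delete counter;         // free the memory manually
    counter = nullptr;      // don't keep pointing at freed memory
}
```

If the delete is forgotten, the int stays allocated until the program ends, which is exactly the kind of leak the stack saves you from.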

Another use of pointers is to pass variables by reference to functions. A classic example is the void swap(int a, int b) function. If you pass two variables to the function, you’ll notice that they aren’t swapped at all! This is because the function receives copies of the values, not the actual variables. One way to solve this problem is to pass pointers to the variables instead: void swap(int *a, int *b). This way the function can access the original variables and do the swap.
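To see the difference side by side, here is a small sketch. I've named the two versions swapByValue and swapByPointer (my own names, just so both can live in the same file).

```cpp
#include <cassert>

// Receives copies, so the caller's variables are left untouched.
void swapByValue(int a, int b)
{
    int temp = a;
    a = b;
    b = temp;
}

// Receives addresses, so it can reach the caller's variables.
void swapByPointer(int *a, int *b)
{
    assert(a != nullptr && b != nullptr);  // guard against null pointers
    int temp = *a;
    *a = *b;
    *b = temp;
}

void swapExample()
{
    int first = 1;
    int second = 2;

    swapByValue(first, second);
    assert(first == 1 && second == 2);  // nothing happened

    swapByPointer(&first, &second);
    assert(first == 2 && second == 1);  // swapped
}
```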

Pointers vs References

If you want to use the swap function from the previous section, you have to pass it the address of the variables (on the stack) using the address-of operator (&):

swap(&first, &second);

In the swap function body, you have to dereference the pointers with *:

int temp = *second;
*second = *first;
*first = temp;

Quite troublesome, isn’t it? (Here, we’ve even skipped checking whether the pointers are null.) Fortunately, you can accomplish the same thing with references. A reference is an alias for an object. If we pass references to our function, void swap(int &a, int &b), we can do everything the “normal” way since a and b in the function body are aliases for the variables we pass into the function (and not copies!).
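A sketch of the reference version, for comparison. The name swapByReference is my own; note that neither the function body nor the call site needs any * or &.

```cpp
#include <cassert>

// a and b are aliases for the caller's variables, not copies.
void swapByReference(int &a, int &b)
{
    int temp = a;
    a = b;
    b = temp;
}

void referenceSwapExample()
{
    int first = 1;
    int second = 2;
    swapByReference(first, second);  // no address-of operator at the call site
    assert(first == 2 && second == 1);
}
```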

Seeing how easy it is with references, pointers feel kind of, well, pointless. So when should I choose to use pointers instead of references?

The rule of thumb seems to be:

Don’t use a pointer when a reference can serve the same purpose.

With that said, there are a few situations when you want to use pointers according to Johnny Bigert (Swedish):

  1. If it’s possible for the object to be null. Sams Teach Yourself C++ mentions that some compilers support null references, but recommends avoiding them; instead, you should use a pointer, which can be set to null (0).
  2. If you want to change the object being pointed to. References can’t be reassigned, so you should use a pointer in this case.
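Both situations can be sketched in a few lines. findNegative and reseatExample are hypothetical names I picked for this illustration; the first shows a pointer used to say “nothing found”, the second shows that assigning to a reference does not make it refer to something else.

```cpp
#include <cassert>

// Hypothetical lookup: returns a pointer to the first negative element,
// or a null pointer if there is none. A reference could not express "nothing".
int *findNegative(int *values, int count)
{
    for (int i = 0; i < count; ++i) {
        if (values[i] < 0)
            return &values[i];
    }
    return nullptr;
}

void reseatExample()
{
    int a = 1;
    int b = 2;

    int *pointer = &a;
    pointer = &b;        // the pointer now points to b; a is untouched
    *pointer = 20;
    assert(a == 1 && b == 20);

    int &reference = a;
    reference = b;       // NOT a reassignment: this copies b's value into a
    assert(a == 20);
    assert(&reference == &a);  // the reference still refers to a
}
```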

klebezettel posted a link in the comments that explained the standpoint of Qt Software:

“Most C++ books recommend references whenever possible, according to the general perception that references are “safer and nicer” than pointers. In contrast, we at Qt Software tend to prefer pointers because they make the user code more readable.”

See the link for more details and also Karellen’s comment.

The use of * and &

If you look at the swap example, you see a lot of * and &. If the function prototype looks like void swap(int &a, int &b);, should you also prefix the arguments with * or & when you call the function?

This is how I thought in the beginning, and obviously it was not the right approach. What I did was to confuse the address-of operator with the reference operator, and the same for *. The compiler can see the difference from the context, but how do you do it?

It’s very simple, really. You declare a pointer by writing the type followed by * and finally followed by the pointer name, for example

int *pointer;

Personally I think it makes more sense to think that * belongs to int, to declare a pointer to an int. The problem is that if you write it like

int* pointer1, pointer2;

you would expect two pointers, but what you really get is only one pointer – pointer2 is an int. How the whitespace is placed varies from programmer to programmer.
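If you don't want to take my word for it, you can ask the compiler. This is a small sketch that assumes a C++11 (or later) compiler, since static_assert and decltype are newer than my book; the file only compiles if the types are what the messages claim.

```cpp
#include <type_traits>

int* pointer1, pointer2;  // only pointer1 is a pointer

static_assert(std::is_same<decltype(pointer1), int*>::value, "pointer1 is an int*");
static_assert(std::is_same<decltype(pointer2), int>::value, "pointer2 is a plain int");

// Declaring one variable per line avoids the surprise.
int *pointer3;
int *pointer4;

static_assert(std::is_same<decltype(pointer4), int*>::value, "pointer4 is an int*");
```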

When you use the dereference operator (*), you don’t have a type in front of the *, for example

someFunction(*dereference);

It’s the same for &: there is a type in front when declaring a reference (type &ref), and none when using it as the address-of operator (&variable).

Now back to functions. When looking at a function prototype, I now think: “what kind of arguments does it expect”? Let’s say it looks like this:

void increment(int *var);

The function takes a pointer to an int. I can pass it a pointer, or I can use the address-of operator (&) to pass it the address of an int:

int i = 1;
increment(&i);

If the function takes a reference,

void increment(int &var);

I just think that the function, if properly written, will adjust my variable in some way.
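Put together, the two variants might look something like this. The names incrementByPointer and incrementByReference are mine, chosen so that both versions fit in one file; the bodies are just my guess at what a “properly written” increment would do.

```cpp
#include <cassert>

void incrementByPointer(int *var)
{
    if (var != nullptr)   // a pointer may be null, so check before dereferencing
        ++(*var);
}

void incrementByReference(int &var)
{
    ++var;
}

void incrementExample()
{
    int i = 1;
    incrementByPointer(&i);   // pass the address of i
    int *p = &i;
    incrementByPointer(p);    // or pass an existing pointer
    incrementByReference(i);  // pass i itself, no operator needed
    assert(i == 4);
}
```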

There are times when you don’t want to modify the original variable, but still want to pass by reference. This has to do with performance – when passing by reference the function can access the original variables and doesn’t need to make a copy. When passing an int, it won’t make a big difference; but when it comes to big classes, you can gain a lot by passing function arguments by reference.

What you want to do is to tell the function, “OK, I’ll give you access to my variable, but don’t you dare touch it!”. You already know how to do the first part, and the second can be accomplished with the const keyword. References are always “constant” (they can’t be reassigned), so const together with a reference is always intended to make the object referred to constant:

int const &a = …; // correct
const int &b = …; // correct
int &const c = …; // not valid

as litb pointed out. However, this doesn’t apply to pointers:

const int * pointer1; // pointer to constant int - the value can't be changed through pointer1
int * const pointer2 = …; // constant pointer to int - pointer2 can't be made to point to another object

Of course, you can combine the two:

const int * const pointer3 = …; // neither the pointer nor the value it points to can be changed

The trick taught in Sams Teach Yourself C++ is to look to the right of const. In the first example, the int (pointed to) is constant; in the second, pointer2 (the pointer) is constant.
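To convince myself, I find it helps to write the lines that should fail as comments next to the ones that work. This is only a sketch: totalLength and constPointerExample are made-up names, and the vector of strings just stands in for “a big object you don't want to copy”.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical function: reads a possibly large vector without copying or modifying it.
std::size_t totalLength(const std::vector<std::string> &words)
{
    std::size_t total = 0;
    for (std::size_t i = 0; i < words.size(); ++i)
        total += words[i].size();
    // words.push_back("x");  // would not compile: words is a reference to const
    return total;
}

void constPointerExample()
{
    int first = 1;
    int second = 2;

    const int *pointer1 = &first;   // pointer to constant int
    pointer1 = &second;             // OK: the pointer itself may change
    // *pointer1 = 5;               // error: can't modify the int through pointer1

    int *const pointer2 = &first;   // constant pointer to int
    *pointer2 = 10;                 // OK: the int may change
    // pointer2 = &second;          // error: pointer2 can't point elsewhere

    const int *const pointer3 = &second;  // neither may change
    assert(*pointer1 == 2 && first == 10 && *pointer3 == 2);
}
```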

Finally, just a few words about the cute -> operator. If we want to access members of an object using a pointer, we have to dereference the pointer first:

(*pointer).anotherFunction();

Since this is something we’ll do quite often, C++ provides the -> operator for indirect access. That means that we can just as well write it like this:

pointer->anotherFunction();
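A tiny sketch with a made-up Counter class shows that the two spellings do the same thing. The parentheses in the first form matter: without them, the member access would be applied to the pointer itself before the dereference, which doesn't compile.

```cpp
#include <cassert>

// Hypothetical class used only for illustration.
class Counter
{
public:
    Counter() : value(0) {}
    void increment() { ++value; }
    int current() const { return value; }

private:
    int value;
};

void arrowExample()
{
    Counter counter;
    Counter *pointer = &counter;

    (*pointer).increment();  // dereference, then access the member
    pointer->increment();    // same thing with the -> operator
    assert(pointer->current() == 2);
}
```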

Pointers in Qt

You’ll see -> being used a lot in Qt – in fact, almost all widgets (pushbuttons, labels etc.) are created on the heap in a way similar to this:

QPushButton *okButton = new QPushButton("OK");

Why aren’t they created on the stack? That’s something I wondered about for a long time, and The Book of Qt 4: The Art of Building Qt Applications by Daniel Molkentin finally cleared it up for me. In C++, you have to remember to delete objects on the heap. Qt makes memory management easier by providing a parent-child hierarchy for objects. All objects derived from the QObject class can benefit from this; when a parent is deleted, it also deletes all its children. If a child acts as a parent for some other widgets, it also deletes its children, and so it goes on until all descendants are deleted.

If I understand Daniel Molkentin correctly, objects have to lie on the heap to take advantage of Qt’s memory management. illissius commented that “that’s sort of backwards”; make sure to read his whole comment.

When you create a new object, you can specify the parent:

QWidget window;
QVBoxLayout *mainLayout = new QVBoxLayout(&window);

In this case window lies on the stack, since it’s the top-level widget.

When using some functions, for example mainLayout->addWidget(okButton), the widget gets automatically added to a parent-child hierarchy. Here, mainLayout’s parent (window) assumes parentage of okButton. mainLayout is not a parent of okButton, as one might think.
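To get a feeling for the cascading delete described in this section, I sketched the idea in plain C++. This is NOT Qt code and says nothing about how QObject is actually implemented: Node, liveCount and parentChildExample are names I made up, and the sketch only works for children created with new.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for the idea of parent-child cleanup (not Qt's implementation).
class Node
{
public:
    explicit Node(Node *parent = nullptr)
    {
        ++liveCount;
        if (parent != nullptr)
            parent->children.push_back(this);  // the parent takes ownership
    }

    ~Node()
    {
        // Deleting a parent deletes its children, which delete theirs, and so on.
        for (std::size_t i = 0; i < children.size(); ++i)
            delete children[i];
        --liveCount;
    }

    Node(const Node &) = delete;             // copying would confuse ownership
    Node &operator=(const Node &) = delete;

    static int liveCount;  // number of Node objects currently alive

private:
    std::vector<Node *> children;
};

int Node::liveCount = 0;

void parentChildExample()
{
    {
        Node window;                       // top-level object on the stack
        Node *layout = new Node(&window);  // child on the heap
        new Node(layout);                  // grandchild on the heap
        assert(Node::liveCount == 3);
    }   // window goes out of scope here and takes its descendants with it
    assert(Node::liveCount == 0);
}
```

In this simplified version a child must live on the heap, because the parent calls delete on it. I wouldn't read that as a statement about Qt itself.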

Phew, this was quite a long step. Remember, however, that there is much more to learn about pointers and references – I’ve only picked a few areas that I’ve found hard. For example, I skipped the topic of dangerous pointers entirely – something you shouldn’t do if you’re studying C++.

I hope to keep the next steps shorter, to only report my progress. I’ll blog again when I’ve finished my Sams Teach Yourself C++ book and started with some basic Qt (yay, screenshots?). Now, I’m waiting for the obligatory comment


17 Responses to “The Road to KDE Devland – step 2”

  1. Hans Chen (mogger) 's status on Saturday, 25-Jul-09 22:57:59 UTC - Identi.ca Says:

    […] The Road to KDE Devland – step 2 « Who Says Penguins Can’t Fly? […]

  2. ABCD Says:

    One correction:

    The following two lines of code mean the exact same thing:
    const int *pointer1;
    int const *pointer2;

    What you were trying to get at was:
    int *const pointer3;

    Which is read as “constant pointer to int” (read it from right to left), whereas “const int*” or “int const*” both mean “pointer to constant int”. Also, “const int const*” won’t compile (unless I’m mistaken, and even if it did, it wouldn’t do what you would want it to do). What you would want is “const int *const” (“constant pointer to constant int”).

  3. illissius Says:

    “If I understand Daniel Molkentin correctly, objects have to lie on the heap to take advantage to Qt’s memory management.”

    Actually, that’s sort of backwards. Objects which you only need to use until the end of the current scope (function, if statement, etc.) can be created on the stack — they are automatically destroyed at the end of the scope (the closing curly brace). This is very convenient — you don’t have to worry about memory management, it’s all automatic — but restrictive. If you want something to stick around longer than that, you have to create it on the heap, where they aren’t destroyed automatically. (QWidgets fall into this category — by definition, you want them to stick around and be displayed on the screen, rather than be created and then destroyed again almost immediately). The drawback of this is that when you no longer need the object, you have to delete it manually — the memory management is explicit.

    What Qt’s parent-hierarchy based memory management does is make this last task easier, because you no longer have to delete every single thing manually (which is tedious and error-prone) — instead, you can specify a parent object for it, and Qt will delete it for you when the parent object is destroyed. (This makes sense when you only need to use the child object for as long as the parent object is still around.)

  4. Coque Says:

    Hans,

    Thank you for these posts. I think they are very useful for beginners like me. Hope you talk about the project setup soon.

    Regards.

  5. Karellen Says:

    The rule of thumb I tend to use for passing pointer or reference parameters to functions is:
    If the function modifies the value passed in, use a pointer; otherwise use a reference.
    The idea behind this is that you can look at a function call, without knowing anything about the semantics of the function, and still be able to tell what its input and output parameters are. So, imagine you have the following:

    int foo(int xyzzy);
    int bar(int & xyzzy);
    int baz(int * xyzzy);

    foo() does some calculation on xyzzy and returns a result. bar() and baz() both do the same, but both modify the int they have a reference/pointer to. In your code, you have:

    int a = 5;
    int b = foo(a);
    int c = bar(a);
    int d = baz(&a);

    From there, without knowing anything about foo, it’s obvious that the call to baz() *could* cause “a” to change. We also know that, because foo() takes the parameter by value, the call to it *cannot* make “a” change.

    Unfortunately, the call to bar() looks like a call to foo(), when it is in fact much more like (from the caller’s perspective) a call to baz().

    The idea is to make it as easy as possible to follow what the code is doing. If it looks like you’re passing a parameter by value, that’s what should be happening. If you’re passing a value “by ref”, where the value you’re passing in can be changed, then making that obvious is a good thing.

    Note that if bar were instead:

    int bar(int const & xyzzy);

    then that is fine, because bar() cannot change the variable referenced by xyzzy, and as far as the caller is concerned, they’re just passing in a value.

    Note that this is not held by the standard library. As you pointed out, std::swap() violates this rule. That’s OK by me. I know the standard library well and have room in my head to remember all^H^H^Hmost of its quirks. That doesn’t mean I can, or want to have to, remember every quirk of every API call in every third-party library that I use.

  6. klebezettel Says:

    And of course you want to be a good api programmer, too:

    http://qt.gitorious.org/qt/pages/ApiDesignPrinciples

    Especially, take a look at the “Little Manual of API Design” there.

    And if you want to see really good code, look into, e.g. the qtcreator sources.

  7. Hans Says:

    Thanks for your insightful comments!

    @ABCD:
    Hah, I knew some errors would slip through. Post updated, thank you for pointing it out.

    @illissius:
    Excellent explanation. I’ll add it to the main post shortly.

    @Coque:
    When I see comments such as yours, it makes me very motivated to continue to blog. 🙂
    I assume that you want to know which IDE etc. I use when you say “project setup”? Sounds like a good idea, I’ll do it in the next step.

    @Karellen:
    According to klebezettel’s link, it seems like the people at Qt Software prefer pointers for the same reason as you explained. I’ll make a note about it.

    @klebezettel:
    That’s a cool link, thanks.

  8. Diederik van der Boor Says:

    Nice post. 🙂

    Something I got in my mind to write a long time ago. One thing I just thought of: would it be nice to draw some scheme of what an object with a “pointer on the stack” looks like in comparison with an “object allocated on the stack”? 😉

  9. atomopawn Says:

    Another reason to use pointers over references is that pure C doesn’t really support reference parameters. So when you’re doing (for instance) kernel programming, you have to go back to pointers and de-referencing.

    I find it helpful to think of a reference as “a different name for the same variable” and to think of pointers as “the memory address of a variable”.

    I do think it’s confusing that the & and * symbols have two different meanings when used on the left and right side of an assignment. But with practice, it becomes a little easier to remember.

  10. Hans Says:

    @Diederik van der Boor:
    Books usually have pretty good pictures, and I’m too lazy to do them. 😛

    I think pictures are useful when introducing pointers, but I’m not sure if a picture would help much here?

  11. The Road to KDE and Qt Development | veracity Says:

    […] Road to KDE devland Step 0 Road to KDE devland Step 1 Road to KDE devland Step 2 […]

  12. The Road to KDE and Qt Development « cogito ergo vagus Says:

    […] Road to KDE devland Step 0 Road to KDE devland Step 1 Road to KDE devland Step 2 […]

  13. litb Says:

    Nice summary. Some comments follow.

    “References are always “constant” (they can’t be reassigned), so const together with a reference will always make the object constant. However, this doesn’t apply to pointers:”

    This needs some elaboration, i think. Only when the const appears at the type referred to, then you create a reference to const. Otherwise, your code is not valid:

    int &const a = …; // not valid
    int const &a = …; // correct way

    However, as you say, a reference conceptually is const. A const added to a reference type using a typedef is ignored (the same applies to template parameter: this is done to simplify generic programming):

    typedef int &intref_type;
    intref_type const a = …;
    // valid, equivalent to “int &a = …;”

    Regarding whether to pass by const reference or by value, i follow the following rule:

    “If the function always takes a copy, then pass by value”.

    For example, in the following case, pass by value:

    string toupper(string s) {
    /* some code that changes s directly */
    return s;
    }

    Passing by reference would look like the following:

    string toupper(string const &s) {
    string tmp(s);
    /* some code that changes tmp */
    return tmp;
    }

    In general, make the function prototype as transparent to the caller as possible, which opens a good range of optimization possibilities for the caller (compiler).

  14. AhmedG. Says:

    Hey nice article! your explanation of pointers was well done. I wish it was around when I first learned pointers!

  15. Top Posts « WordPress.com Says:

    […] The Road to KDE Devland – step 2 Can anyone give me some pointers? After one week with Sams Teach Yourself C++ in 21 Days, it feels like I have the […] […]

  16. Hans Says:

    @litb:
    “Only when the const appears at the type referred to, then you create a reference to const. Otherwise, your code is not valid:”
    Post updated, hope it’s more clear now. Thanks for taking your time to explain. 🙂

    @AhmedG.:
    Thank you for your comment. It’s always encouraging to see that people find the article useful.

  17. The Road to KDE Devland – step 4 « Who Says Penguins Can't Fly? Says:

    […] parent-child mechanism is out of scope for this article. I touched upon the subject in step 2, when I wrote about pointers in […]

