nullstream weblog - Coding Style: Part, the first


Coding Style: Part, the first


January 2, 2005 03:04 AM PST

Over the years, I've encountered a number of coding styles from different companies, books and open source projects. I find that I am really picky about coding consistency, and that I get annoyed with code that looks ugly.

Therefore, a brief summary of stuff that I find annoying:

Hungarian Notation

Encoding the type into a variable name in a statically typed language is a really brain dead idea. What happens when you make a design change and need to alter the type of a variable from:
    hash_map<int, char *>
Uh oh, now you have to rename all those variables. Hello compile errors!

A particularly egregious example of this sort of stupidity is encoding "sz" (meaning string that is zero terminated) into string variables: "char szName[] = ...". WTF?!? Strings in C and C++ are always zero terminated unless you go out of your way to make your own string storage (and let's not get into how bad an idea that is). This is left over junk from the Pascal string days where the length of the string was encoded into the first byte, but who uses Pascal today? C'mon! Hungarian. <sigh>: anyone using it deserves a boot to the head.
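
To make the renaming problem concrete, here is a minimal sketch. Every name in it (`nUserCount`, `userCount`, the `Count...` functions) is made up for illustration, and the int to `std::size_t` change is an assumed example of a "design change", not one taken from a real project:

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Before the design change: the count is an int, and the Hungarian
// prefix "n" (integer) is baked into the variable name.
int CountBefore(const std::vector<std::string> &users) {
    int nUserCount = static_cast<int>(users.size());
    return nUserCount;
}

// After the change the count is a std::size_t. The code still compiles,
// but the "n" prefix now lies about the type until every use is renamed.
std::size_t CountAfterHungarian(const std::vector<std::string> &users) {
    std::size_t nUserCount = users.size();   // prefix says int, type says size_t
    return nUserCount;
}

// The undecorated name survives the same change untouched.
std::size_t CountAfterPlain(const std::vector<std::string> &users) {
    std::size_t userCount = users.size();
    return userCount;
}

// Small usage check: all three versions agree on the value.
void CheckCounts() {
    std::vector<std::string> users = {"ann", "bob", "cy"};
    assert(CountBefore(users) == 3);
    assert(CountAfterHungarian(users) == 3u);
    assert(CountAfterPlain(users) == 3u);
}
```

Either you rename every use of the variable, or you leave a prefix in place that no longer matches the type. The plain name never has that problem.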

Camel notation

I generally don't have a problem with this, with the exception of function names. Functions should be capitalized and variables should not:

    int myInt, my_int;        // These are okay
    int MyInt;                // Over my dead body

    void SrDmsOpenConn();     // Yes
    void SRDmsOpenConn();     // Meh, I guess
    void srDmsOpenConn();     // No way, you Java weenie
    void SRDMSOpenConn();     // Why are you yelling at me? No.

Note that in the above functions, SR is the product name (Self Reliant) and DMS is the module name (Distributed Messaging System). Public APIs should include the product / company name and the module name: SrDmsOpenConn(), whereas private or static APIs should only include the module name: DmsOpenConn(). The above example applies to C, since C++ methods implicitly have a namespace as defined by the class they belong to:

    // A publicly usable class, declared in a
    // header that is given to the customer:

    class SrHttpConn {
    public:
            void OpenConn();
    };

    // A private class, declared in a definition file (.cpp/.cc)
    // or a header not distributed to a customer:

    class HttpConnection : public Connection {
    public:
            void OpenConn();
    };
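
For the plain C case described above, the public/private split could look roughly like the sketch below. The file names, the `port` parameter, the return codes and the `DmsValidatePort` helper are all assumptions made up for the example. Only the `Sr` and `Dms` prefixes come from the text. The code is written in the C-compatible subset of C++:

```cpp
#include <cassert>

// --- sr_dms.h: hypothetical public header shipped to the customer ---
// Public API: product prefix (Sr) plus module prefix (Dms).
int SrDmsOpenConn(int port);

// --- sr_dms.c: hypothetical implementation file ---
// Private helper: module prefix only. "static" gives it internal linkage,
// so it cannot clash with symbols in other modules or in customer code.
static int DmsValidatePort(int port) {
    return port > 0 && port < 65536;
}

// Returns 0 on success and -1 on a bad port. A real implementation
// would open a connection here; this sketch only validates the input.
int SrDmsOpenConn(int port) {
    return DmsValidatePort(port) ? 0 : -1;
}

// Small usage check.
void CheckOpenConn() {
    assert(SrDmsOpenConn(8080) == 0);
    assert(SrDmsOpenConn(0) == -1);
}
```

The product prefix only earns its keep on symbols that share a global namespace with someone else's code. Anything marked `static` never leaves its translation unit, so the shorter name is enough.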


Returning to the first code sample, I'm not so concerned about using either camel or underscore notation for variables as long as it is consistent. An interesting idea is to use camel notation since the STL uses underscores. This will make it very clear in the code whether a function call is an STL call or a custom call:

    class IntContainer {
    public:
            void InsertValue(int i);
    };

    int main(int argc, char *argv[]) {
        vector<int> v;
        v.push_back(17);        // STL method

        IntContainer i;
        i.InsertValue(17);      // Custom method
    }


C++ member variables are an interesting case: how to distinguish between a local variable and a member variable? Here are some conventions:

    class Foo {
            int _i;    // style 1
            int i;     // style 2
            int i_;    // style 3
            int mI;    // style 4
            int m_i;   // style 5
    };

I'm sure there are more styles, but I've already had 1.5 glasses of wine. Onwards.

The argument against style 1 is that apparently, some C/C++ compilers create an internal representation of user code by prepending an underscore character to the variable names, hence there may be a clash which may not be a simple thing to detect. Lord, save us from stupid compiler engineers! The compiler should work around my code, not the other way around. I promise not to use LANGUAGE keywords, and the compiler should promise not to do stupid stuff that is easy to avoid (the VxWorks compilers exempt themselves from not doing stupid stuff, apparently: try using nested varargs... I dare you!).
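
For what it's worth, the C++ standard does spell out which underscore names belong to the implementation. Identifiers that begin with an underscore followed by an uppercase letter, or that contain a double underscore, are reserved everywhere. Any identifier with a leading underscore is reserved in the global namespace. A member named with an underscore plus a lowercase letter falls outside those rules. The sketch below uses a made-up `Counter` class to show which forms are fine and which are not:

```cpp
#include <cassert>

// Hypothetical class, for illustration only.
class Counter {
public:
    void Increment() { ++_count; }
    int Value() const { return _count; }

private:
    int _count = 0;      // OK: underscore + lowercase letter, at class scope
    // int _Count;       // reserved: underscore + uppercase letter
    // int __count;      // reserved: contains a double underscore
};

// int _total;           // reserved: leading underscore in the global namespace

// Small usage check.
void CheckCounter() {
    Counter c;
    c.Increment();
    c.Increment();
    assert(c.Value() == 2);
}
```

So style 1 is safe as long as the character after the underscore is lowercase, which it always is if variables are never capitalized.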

I don't like style 3 because I read from left to right, and the "memberness" of a variable should come first:

    class Foo {
            void DoSomethingCool();

            Obj *_worker;    // style 1
            Obj *worker_;    // style 3
    };

    void Foo::DoSomethingCool() {
        Obj *worker = new Obj();
        _worker->Swap(worker);    // Clear and obvious: Yay!
        worker_->Swap(worker);    // ASCII line noise:  boo.
    }

The prefix version of the underscore is more readable than the postfix version. It is instantly apparent that it is a member of the "this" object, even though the difference is only 6 characters in distance. Style 3 just looks ugly and awkward. Style 2 would require that the locally created object is named something else to avoid masking, so I'd highly discourage using style 2. Style 4 is used by Mozilla in conjunction with the camel style, which seems pretty consistent to me. Style 5 can be a little jarring if you are using camel notation for everything, and you suddenly throw in an underscore... but it does serve the purpose of making the code immediately obvious that the variable in question is a member, and not a local. If you are going to pick a style, I'd recommend 1, 4 and 5 in that order.
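
The masking problem with style 2 is easy to hit by accident. Here is a small sketch with two made-up classes (`Style2` and `Style1` are hypothetical names, and the members are public only to keep the example short):

```cpp
#include <cassert>

// Style 2: the member has no decoration.
class Style2 {
public:
    void SetValue(int v) {
        int value = v;   // declares a new local that hides the member
        (void)value;     // silences the unused-variable warning
    }
    int value = 0;       // the member is never written by SetValue()
};

// Style 1: the leading underscore keeps local and member apart.
class Style1 {
public:
    void SetValue(int v) {
        int value = v;   // local
        _value = value;  // member
    }
    int _value = 0;
};

// Small usage check.
void CheckMasking() {
    Style2 a;
    a.SetValue(17);
    assert(a.value == 0);    // the write went to the local, not the member

    Style1 b;
    b.SetValue(17);
    assert(b._value == 17);
}
```

The `Style2` version compiles cleanly and does the wrong thing without complaint, unless you happen to build with a shadowing warning such as `-Wshadow` turned on.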

That's all for now folks! Join me in my next episode, wherein I tackle:

Comments (1)
J, January 3, 2005 11:56 PM:

Well, I don't capitalize function names out of Java habit - Objects start with a capital, methods don't. So it is consistent. With C I just stick with this to avoid pressing an extra shift key... I guess not a big deal. I agree with not putting 'Sr' in front of non-extern functions. Way better than sr_dms_open_conn(). Ugh. That always looks like nasty driver code to me.

Also, I like plain /* */ C comments for blocks, not leading '//' or ' *' in front of each line. I find they're just a lot of work to maintain when adding text or indenting the comments. The one drawback is compilers don't support nested /* */ comments, but that's something I only find is necessary when temporarily hacking out code, which can just as easily be done with #if 0. Anyway, I imagine you'll cover comments later.

Ohhh... I can't wait for getters and setters. Textbook Java is that you wrap every accessible variable with a getter and setter (yuck). Classic academic vs. practical battle!
