Skip to content

GraphflowDB Coding Style

Guodong Jin edited this page May 15, 2022 · 10 revisions

This is a document for our cpp coding style. The goal of this document is to make sure that our styles inside the codebase are consistent, and these style rules exist to keep the codebase manageable as time goes by and the team grows. This document should be used as a reference whenever someone in the team is not sure about a piece of code satisfying the style or not, and we also use this document to solve conflicts between different styles in our existing codebase. This document partially references Google C++ Style Guide.

Header files

In general, every .cpp file should have an associated .h file, and we use .h instead of .hpp in our codebase. There are some common exceptions that no .h files are associated, such as unit tests and small .cpp files containing just a main() function.

A header file should follow the following template:

#pragma once

#include <xxx>

#include “yyy”

namespace nnn {

Class ccc {}

} // namespace nnn

The header file starts with #pragma once as the first line with an empty line afterwards. Then the header file goes with #include statements, and the ordering of include statements should follow our clang-format file. Finally, main content is wrapped inside namespace {}, and notice that usually the namespace is nested, such as

namespace graphflow {
namespace common {
// empty line here
    ...
// empty line here
} // namespace common
} // namespace graphflow

we always keep an empty line both after the inner most namespace { and before the inner most } // namepsace ....

Function Declarations in header files

Functions in header files should only contain declarations without implementations, except for following cases:

  • The function is inlined.
  • The function is a template function.
  • The function is an empty constructor with only initializer list and optional assert statements.

In-line Functions

Define functions inline only when they are very small, usually 1 or 2 lines, and shouldn’t be larger than 5 lines.

Forward Declaration

TODO

Classes

Structs vs. Classes

Use a struct only for passive objects that carry data; everything else is a class.

Structs vs. Pairs and Tuples

Prefer to use a struct instead of a pair or a tulle whenever the elements can have meaningful names. While using pairs and tuples can avoid the need to define a custom type, potentially saving work when writing code, a meaningful field name will almost always be much clearer when reading code than .first, .second, or std::get<X>. While C++14's introduction of std::get<Type> to access a tuple element by type rather than index (when the type is unique) can sometimes partially mitigate this, a field name is usually substantially clearer and more informative than a type. Pairs and tuples may be appropriate in generic code where there are not specific meanings for the elements of the pair or tuple. Their use may also be required in order to interoperate with existing code or APIs.

Destructors

For virtual classes, always add a virtual destructor when it's meant to be manipulated polymorphically, i.e., derived classes have different destructors implemented and to be called. otherwise, it might cause undefined behavior when the deletion of a derived object is through the base class. See this post for an example.

"If the static type of the object to be deleted is different from its dynamic type, the static type shall be a base class of the dynamic type of the object to be deleted and the static type shall have a virtual destructor or the behavior is undefined."

Ownership and Smart Pointers

(partially borrowed from Google C++ Style Guide).

Prefer to have single, fixed owners for dynamically allocated objects. Prefer to transfer ownership with smart pointers.

"Ownership" is a bookkeeping technique for managing dynamically allocated memory (and other resources). The owner of a dynamically allocated object is an object or function that is responsible for ensuring that it is deleted when no longer needed. Ownership can sometimes be shared, in which case the last owner is typically responsible for deleting it. Even when ownership is not shared, it can be transferred from one piece of code to another.

"Smart" pointers are classes that act like pointers, e.g., by overloading the * and -> operators. Some smart pointer types can be used to automate ownership bookkeeping, to ensure these responsibilities are met. std::unique_ptr is a smart pointer type which expresses exclusive ownership of a dynamically allocated object; the object is deleted when the std::unique_ptr goes out of scope. It cannot be copied, but can be moved to represent ownership transfer. std::shared_ptr is a smart pointer type that expresses shared ownership of a dynamically allocated object. std::shared_ptrs can be copied; ownership of the object is shared among all copies, and the object is deleted when the last std::shared_ptr is destroyed.

unique_ptr

If dynamic allocation is necessary, prefer to keep ownership with the code that allocated it. If other code needs access to the object, consider passing it a copy, or passing a pointer or reference (prefer reference if nullptr is not allowed) without transferring ownership. Prefer to use std::unique_ptr to make ownership transfer explicit. For example:

std::unique_ptr<Foo> FooFactory();

void FooConsumer(std::unique_ptr<Foo> foo); // move the ownership of Foo from the caller to this function.
bool FooCheck(const Foo& foo); // pass the reference without moving the ownership of foo.

void func() {
    auto foo = FooFactory();
    if (FooCheck(*foo)) {
        FooConsumer(move(foo));
    }
}

shared_ptr

Do not design your code to use shared ownership without a very good reason. One such reason is to avoid expensive copy operations, but you should only do this if the performance benefits are significant, and the underlying object is immutable (i.e., std::shared_ptr<const Foo> ). If you do use shared ownership, prefer to use std::shared_ptr.

auto_ptr

Never use std::auto_ptr. Instead, use std::unique_ptr.

Smart pointers as parameters

We use smart pointers (both unique_ptr and shared_ptr) as a parameter only when we intend to make changes to the ownership. In this case, for unique_ptr, we never pass it by reference, such as void func(unique_ptr<Foo>& foo); for shared_ptr, we either pass it by value or by reference, such as

Bar::Bar(shared_ptr<Foo> foo): foo{move(foo)} {}

or

string func(shared_ptr<Foo>& foo) { return foo.name; }

Notice that we pass the shared_ptr by value and move it when we intend to assign it to another local variable, such as in the constructor; and we pass it by reference when we only intend to read or change its value.

Otherwise, we use raw pointers or references to the object as parameters.

Comments

TODOs

TODOs should include the string TODO in all caps, followed by the name, bug ID, or other identifier of the person or issue with the best context about the problem referenced by the TODO. The main purpose is to have a consistent TODO that can be searched to find out how to get more details upon request. A TODO is not a commitment that the person referenced will fix the problem. Thus when you create a TODO with a name, it is almost always your name that is given.

// TODO(Guodong): change this to use relations.
// TODO(issue #12345): remove the "Last visitors" feature.

Exceptions

TODO

Bazel

TODO