Real Modules for C++

May 27, 2008 by olsner

C++ sucks. C++ needs a proper module system, where you can actually separate modules from each other rather than tangle them together in circular header include trees. This is it! Or rather, once the thinking has terminated and culminated in the start of an implementation, this post will have been what started it. Think of it as a collection of a few fluffy ideas that will transform C++ hell into the cozy wonderland it should be ;-)

I'm not entirely sure about this, but maybe I'll call it M++ as in "C++ with Modules". In any case, this will become a C++ dialect as there is no hope of not breaking any code. There will also probably never be a very easy transition path for existing large-ish C++ codebases. (That's not even a goal until this thing is sufficiently awesome to motivate someone to convert a large body of C++ code to it...)

What is M++?

Basically, it is C++ with "modules", where modules are somewhat like ordinary C++ translation units, but with special magic for how names are imported between modules.

Importing a module imports a well-defined set of names into the local scope. Contrast with C/C++, where module importing is implemented by inclusion of header files. (You know this already, but I'll repeat it for the sake of rhetoric.) Header files may #define just about anything and wreak whatever havoc they like on the environment of the files that follow. This means that any kind of higher-level reasoning about the effects of header changes on the actual code is pretty much moot. Any compiler must recompile every included header file from every including source file every time any part of this pool of mud changes. A wonky define in module A may produce error messages in system headers included from module B, while compiling module C. Modules A and B may not even be your code, or may be code you can't change. World of woe!
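The woe is easy to reproduce in miniature. Here is a rough sketch, with two made-up headers (A.h and B.h, and every name in them, are assumptions for illustration) pasted into one file the way the preprocessor would paste them. A macro from A silently changes what B means:

```cpp
// A minimal sketch of macro leakage between headers. A.h, B.h, C.cpp and
// all names below are hypothetical; the "headers" are pasted inline, which
// is what #include does at the text level anyway.

// ---- contents of A.h (third-party code you cannot change) ----
#define LOGGING_ENABLED 0

// ---- contents of B.h (a different library, included after A.h) ----
#ifndef LOGGING_ENABLED
#define LOGGING_ENABLED 1   // B's intended default, never reached here
#endif

namespace B {
    // B's author expects this to return true by default.
    inline bool logging_enabled() { return LOGGING_ENABLED != 0; }
}

// ---- contents of C.cpp, which merely included A.h before B.h ----
inline bool c_sees_logging() { return B::logging_enabled(); }
```

Delete the A.h part and c_sees_logging() flips to true, even though nothing in B.h or C.cpp changed. With a symbol-table-level import there would be no channel for A to do that.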

With a proper module system, declarations from different modules will never clash, except maybe in the module that is importing conflicting names from more than one module. But that would be caused by code in your module, and you would have the tools to solve the problem!
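Plain C++ namespaces already behave roughly this way, so here is a small sketch of what "a clash only in the importing module" could look like. File, Socket and App are hypothetical modules modeled as ordinary namespaces, not anything M++ defines:

```cpp
#include <string>

// Two hypothetical modules that both export a function named open.
// Their definitions never clash with each other.
namespace File {
    inline std::string open(const std::string& path) { return "file:" + path; }
}
namespace Socket {
    inline std::string open(const std::string& host) { return "socket:" + host; }
}

// The importing module pulls in both. Only here can the names collide.
namespace App {
    using namespace File;
    using namespace Socket;

    // An unqualified open("x") would be ambiguous at this point, because
    // both using-directives make an open visible. Qualifying the call is
    // the tool for resolving it, and the fix lives in your own code.
    inline std::string load()    { return File::open("config.txt"); }
    inline std::string connect() { return Socket::open("example.org"); }
}
```

The conflict is local to App and so is the cure; neither File nor Socket had to know about the other.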

Guiding principles

  • Keep as much as possible of the syntax and semantics of C++
  • Remove the need for preprocessor inclusion
  • Keep the preprocessor around for e.g. importing external interfaces
  • Replace the current text-level importing with a symbol-table-level import
  • Provide good means of separating unrelated components by:
    • Limiting the set of exported symbols from components
    • Providing easy means to cherry-pick subcomponents (using namespace::name;)
    • Providing robust non-conflicting importing of whole components (using namespace;)
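The last group of principles can be approximated in today's C++, which gives a feel for what M++ would make the default. The sketch below is only an analogy under my own assumptions (Geometry, Client and all function names are made up): an unnamed namespace stands in for "not exported", and a using-declaration does the cherry-picking.

```cpp
// Not exported: names in an unnamed namespace have internal linkage, so
// other translation units cannot link against them.
namespace {
    int square(int x) { return x * x; }
}

// The exported surface of a hypothetical Geometry component.
namespace Geometry {
    int area(int side)      { return square(side); }
    int perimeter(int side) { return 4 * side; }
}

// A hypothetical client that wants exactly one name from Geometry.
namespace Client {
    using Geometry::area;   // cherry-pick a single name

    int tile_area() { return area(3); }

    // perimeter(3) would not compile here without the Geometry:: prefix,
    // since only area was imported.
}
```

The difference in M++ would be that this separation holds across files without any header having to repeat the declarations.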

Basic proof-of-something example

For context, this module would reside in a file Main.mpp (maybe even just call these files cpp?), and exports the class Main::Foo, the method Main::Foo::Bar and the function Main::main(...).

// This is where the *really* cool part is. These modules (namespaces) are
// automatically imported by searching the system module path. Either they
// are defined by local source tree cpp (mpp?) files named e.g. Win32.cpp
// and Net/HTTP.cpp (Net_HTTP.cpp would also work), or they could be
// binary self-describing modules. Nothing said here on how that binary
// self-description would look. That's the magic left as an exercise for the reader.
using namespace Win32;
using namespace Net::HTTP;

// These are some provisional ideas on how to wrap the C/C++ standard library in
// M++ form. More on importing legacy C/C++ functions and classes later on.
using stdlib::atoi;
using stdio::printf;

// This is a private function. It would not be linkable from other translation
// units (default linkage in the top-level is static), but is usable from all
// definitions in the Main namespace below.
void Foo_Bar()
{
}

namespace Main
{
    // Normal classes here
    class Foo
    {
    public:
        void Bar();
    };

    // Maybe the main function could move into the Main namespace from the
    // global namespace like this:
    int main(int argc, char *argv[])
    {
        Foo foo;
        return 0;
    }
}

Unresolved/other issues

  • How do you expose things in the global namespace from M++ modules (these exposed names should follow the relevant C++ ABI and mix-and-match with C++ code.)

This one is slightly harder than the latter question. Many solutions here are bad. So I think I'll just let this one noodle for a while ;-)

  • How do you import external C or C++ classes/functions into M++?

I'm thinking you'd #include the external headers in one module, then write using ::asdf; to import the top-level declarations into the namespace exported from that module. This means that an M++ compiler must be able to understand the full wonderful ambiguity of C++. But hopefully, the modularization through use of namespaces would mean that only one of a large number of modules needs to go through the hassle of actually parsing all that crud in order to build a small symbol table.
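The wrapping half of this already works in standard C++, so here is a minimal sketch of such a wrapper module. The namespace names stdlib and stdio follow the earlier example; parse_and_double is a made-up function just to show the import being used:

```cpp
// The one "module" that has to parse the legacy headers.
#include <stdio.h>
#include <stdlib.h>

// Re-export selected top-level C declarations under tidy names.
namespace stdlib {
    using ::atoi;
}
namespace stdio {
    using ::printf;
}

// Everyone else imports from the wrappers, never from the headers.
namespace Main {
    using stdlib::atoi;

    // Hypothetical helper: parses a decimal string and doubles it.
    inline int parse_and_double(const char* text) { return 2 * atoi(text); }
}
```

In plain C++ the #include lines still leak everything into the global namespace of this file, of course. The hope for M++ is that the leak would stop at the module boundary and only the re-exported names would be visible to importers.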

One remaining issue is how to distribute the kind of macros that are required/useful (e.g. file/line-tracking allocation functions, asserts, debug/release-dependent code). Would you be including some small number of headers into each module to do that, or would the module system somehow get involved in preprocessing and let macros be imported as part of a namespace?
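To make the problem concrete, here is a small sketch of a file/line-tracking allocator. Everything in it (the Debug namespace, AllocRecord, tracked_alloc, TRACKED_ALLOC) is a made-up illustration, not a proposal. The function can live in a namespace like any other, but the thin macro around it cannot, because __FILE__ and __LINE__ have to expand where the caller is:

```cpp
#include <cstddef>
#include <cstdlib>

namespace Debug {
    // Where the most recent tracked allocation came from.
    struct AllocRecord {
        const char* file;
        int line;
        std::size_t size;
    };

    inline AllocRecord& last_alloc() {
        static AllocRecord record = { "", 0, 0 };
        return record;
    }

    // An ordinary function: a symbol-table import handles this fine.
    inline void* tracked_alloc(std::size_t size, const char* file, int line) {
        AllocRecord& record = last_alloc();
        record.file = file;
        record.line = line;
        record.size = size;
        return std::malloc(size);
    }
}

// The awkward part: this has to be a macro so that __FILE__ and __LINE__
// are expanded at the call site, and a macro belongs to no namespace.
#define TRACKED_ALLOC(size) ::Debug::tracked_alloc((size), __FILE__, __LINE__)
```

A call like TRACKED_ALLOC(16) records the caller's file and line, which is exactly the part a namespace import has no way to carry.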

Getting rid of macros in the first place is a pretty good idea anyway, but exactly how far can you practically take it? Some headers like stdint.h define a large quantity of useful macros. In an ideal world, these would be constant variables (const int32_t INT32_MAX; etc.) defined in a suitable namespace (perhaps something like stdint, since they come from stdint.h).
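A wrapper module could do that conversion once. The sketch below assumes a namespace called stdint and lower-case constant names of my own choosing; lower-case because the upper-case spellings are still macros in this file and would be rewritten by the preprocessor before the compiler ever saw them. fits_in_int32 is a made-up function to show the constants in use:

```cpp
#include <stdint.h>

// Hypothetical wrapper: turn the limit macros into typed constants.
namespace stdint {
    const int32_t  int32_min  = INT32_MIN;
    const int32_t  int32_max  = INT32_MAX;
    const uint32_t uint32_max = UINT32_MAX;
}

namespace Main {
    using stdint::int32_max;

    // True when value can be stored in an int32_t without loss.
    inline bool fits_in_int32(int64_t value) {
        return value >= stdint::int32_min && value <= int32_max;
    }
}
```

Unlike the macros, these obey scoping, have a type, and can be cherry-picked with a using-declaration like any other name.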