<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>unsafePerformHack &#187; Projects</title>
	<atom:link href="http://olsner.se/category/projects/feed/" rel="self" type="application/rss+xml" />
	<link>http://olsner.se</link>
	<description>Perversions in Computer Science</description>
	<lastBuildDate>Tue, 03 May 2011 23:32:31 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.1</generator>
		<item>
		<title>Some current projects&#8230;</title>
		<link>http://olsner.se/2011/02/23/some-current-projects/</link>
		<comments>http://olsner.se/2011/02/23/some-current-projects/#comments</comments>
		<pubDate>Wed, 23 Feb 2011 20:02:28 +0000</pubDate>
		<dc:creator>olsner</dc:creator>
				<category><![CDATA[Projects]]></category>

		<guid isPermaLink="false">http://olsner.se/?p=36</guid>
		<description><![CDATA[A bit of a mixed bag. I&#8217;ve started working on a compiler for my programming language, see m3.git. Since code generation is boring, it targets LLVM. And anything related to &#8220;M++&#8221; is gone, and so are all object oriented features. This is simply an imperative language with modules, and maybe some type-system features. That hasn&#8217;t [...]]]></description>
			<content:encoded><![CDATA[<p>A bit of a mixed bag.</p>

<p>I&#8217;ve started working on a compiler for my programming language, see <a href="http://git.olsner.se/m3.git/">m3.git</a>. Since code generation is boring, it targets LLVM. And anything related to &#8220;M++&#8221; is gone, and so are all object oriented features. This is simply an imperative language with modules, and maybe some type-system features.</p>

<p>That hasn&#8217;t been moving for some time though, due to my replacement project: <a href="http://git.olsner.se/os.git/">an operating system</a>. This is going to be a dirt simple but (at least theoretically) fully functional operating system for x86-64/amd64/em64t, written in Assembly. I&#8217;ve decided that it&#8217;s not going to be running C code until I finish support for loadable processes and/or kernel modules. As a general rule, I&#8217;m aiming for a microkernel structure. In part because microkernels are cool, in part because Assembly is hard to write so I really want to keep the &#8220;kernel&#8221; part as small as possible!</p>

<p>Now, the kernel thingy needs to do some memory management internally to keep track of some data structures where most of them are not as big as a page. I have a simple page-frame allocator, but now I need virtual memory management (which will reside inside the microkernel part of the OS) and processes, and to do that conveniently I need a kernel malloc to allocate my metadata.</p>

<p>There goes the third interrupting project: <a href="http://git.olsner.se/?p=mallog.git;a=tree;f=sbmalloc;h=d3c3b0d0a29d72229437f5bd3189617cd38e6778;hb=malloc">the malloc</a>. Before trying to get something working in the kernel, written in assembly, I thought it best to start by implementing a malloc at all. So this is a malloc that you build into a .so (you can also link to it statically, if you compile it for that yourself) that loads into any program using LD_PRELOAD to replace the C library&#8217;s malloc. That is, on Linux. I haven&#8217;t tried it on anything else <img src='http://olsner.se/wp-includes/images/smilies/icon_smile.gif' alt=':)' class='wp-smiley' /> </p>

<p>I&#8217;ve stolen most of the ideas from a paper about phkmalloc, so I guess it&#8217;s not very interesting in terms of malloc design. And it&#8217;s single-threaded (protected by a single lock, if thread safety is enabled) so there will be no scalability contest between this and e.g. jemalloc! Then again, it shouldn&#8217;t be excessively hard to take this allocator and essentially have one heap per thread/core, with a shared pagepool in the bottom somewhere.</p>

<p>My malloc currently &#8220;works&#8221; with any programs I&#8217;ve thrown at it, except that it doesn&#8217;t actually free anything back to the OS. Properly implementing that seems to require a few changes to the underlying data structures. Oops! Which is precisely the reason I&#8217;m doing it at all &#8211; to find these problems out so I can redo it correctly in a write-once language like assembly <img src='http://olsner.se/wp-includes/images/smilies/icon_smile.gif' alt=':)' class='wp-smiley' /> </p>

<p>Anyway, after the malloc works, the next step is to translate it into the kernel code, implement virtual memory management in the kernel and to use it instead of the hardcoded page tables that it currently sets up for user processes. Then it&#8217;ll be time to implement dynamically allocating processes based on an empty or existing address space, starting them up, scheduling them, ending them, and waiting for them! That&#8217;ll be exciting <img src='http://olsner.se/wp-includes/images/smilies/icon_smile.gif' alt=':)' class='wp-smiley' /> </p>
]]></content:encoded>
			<wfw:commentRss>http://olsner.se/2011/02/23/some-current-projects/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>M++ Design, Part 2 of N</title>
		<link>http://olsner.se/2008/07/02/m-design-part-2-of-n/</link>
		<comments>http://olsner.se/2008/07/02/m-design-part-2-of-n/#comments</comments>
		<pubDate>Wed, 02 Jul 2008 20:26:58 +0000</pubDate>
		<dc:creator>olsner</dc:creator>
				<category><![CDATA[Projects]]></category>
		<category><![CDATA[m++]]></category>
		<category><![CDATA[specs]]></category>

		<guid isPermaLink="false">http://olsner.se/?p=20</guid>
		<description><![CDATA[A short post about Syntax and Compiler usage. SPECS (new syntax for C++, see below) I have been dabbling with previously, and always found quite interesting. Regarding compiler usage, I guess my inspiration lies in the way Haskell compilation with ghc works (at least in the ghc --make variant). It should never be any harder [...]]]></description>
			<content:encoded><![CDATA[<p>A short post about Syntax and Compiler usage. SPECS (new syntax for C++, see below) I have been dabbling with previously, and always found quite interesting. Regarding compiler usage, I guess my inspiration lies in the way Haskell compilation with ghc works (at least in the <code>ghc --make</code> variant). It should never be any harder than that.</p>

<h3>Compiler usage</h3>

<p>I&#8217;m thinking it would be left up to the compiler to keep track of dependencies and determine which modules need recompilation. Rather than running the compiler on each source file to make a .o file and then linking your hundreds of .o files into a shared library or executable, you&#8217;d tell the compiler to either build module Main (which can have another name) into an executable, or to build a library exposing one or more given namespaces. The names declared in these namespaces would be exported as if in the global namespace (following the target C++ ABI, most likely &#8211; I would like to keep binary compatibility as far as possible).</p>

<p><b>Example 1:</b> Compile the program contained in module Main and all dependent modules into a.out.</p>

<p><div>
<pre class="txt" style="font-family:monospace;">m++ --main Main -o a.out</pre>
</div></p>

<p><b>Example 2:</b> Compile modules A, B, C into a shared library.</p>

<p><div>
<pre class="txt" style="font-family:monospace;">m++ --export A --export B --export C -o libstuff.so</pre>
</div></p>

<p>As mentioned in the previous posts, modules referenced in the source (e.g. <code>Net::HTTP</code>) would be automatically looked up as ./Net/HTTP.mpp in the current include path. It is up to the compiler to apply necessary magic to that file in order to extract the information it needs to compile the current module. I&#8217;m thinking there&#8217;d probably be a local file containing a parsed representation of the module source which is automatically updated if the source file is outdated.</p>

<p>Due to the way this works, I think mutually recursive modules are basically impossible to write in this new language. C++, using headers with declarations separate from implementation files, lets you get around this problem in some small way by e.g. using forward declarations in the header. It may be possible for the compiler to apply a similar workaround automatically by parsing declarations and implementations separately, but I actually think it is a good thing not to be able to build mutually recursive modules.</p>
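
<p>For reference, the forward-declaration trick looks roughly like this in plain C++. This is only a sketch: <code>Parent</code> and <code>Child</code> are made-up names, and the two &#8220;headers&#8221; are collapsed into one listing.</p>

<p><div>
<pre class="cpp" style="font-family:monospace;">// Forward declaration: enough to declare pointers to Child before it is defined.
class Child;

class Parent
{
public:
    explicit Parent(Child *child) : child_(child) {}
    int childValue() const; // needs the full Child, so it is defined further down
private:
    Child *child_;
};

class Child
{
public:
    explicit Child(int value) : parent_(nullptr), value_(value) {}
    void setParent(Parent *parent) { parent_ = parent; }
    Parent *parent() const { return parent_; }
    int value() const { return value_; }
private:
    Parent *parent_;
    int value_;
};

// Both classes are complete at this point, so the cycle can be closed.
inline int Parent::childValue() const { return child_-&gt;value(); }</pre>
</div></p>

<p>The cycle only compiles because the declarations come first and the one function body that needs both types comes last, which is the same split a compiler would have to make on its own to support mutually recursive modules.</p>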

<p>Code generation would be driven entirely by the need to output the exported names (or just main() in the case of <code>--main</code>), so only names used, directly or indirectly, by those functions would be code-generated at all. In ordinary C++, it is very hard to control the set of functions you&#8217;re exposing to the world. In M++, it should be very easy and very explicit.</p>

<h3>Syntax</h3>

<p>I originally thought using the syntax of C++ would be a good thing. After all, what I secretly want this language to do is replace C++ entirely, which I thought more likely to happen using a syntax familiar to the old fashioned C/C++ tradition. Worked wonders for e.g. Java, C# and JavaScript, didn&#8217;t it? Too bad all of them punted on the opportunity to actually replace C++, rather than take some small niche where you never really needed C/C++.</p>

<p>However, since this is a C++ dialect anyway, why not go the extra mile and just throw out all the syntax, and all that ugly legacy that would come along with the syntax? For example, there&#8217;s this <a href="http://www.csse.monash.edu.au/~damian/papers/HTML/ModestProposal.html">Modest Proposal</a> for a new syntax for C++. Quite interesting read, and using a syntax like this would probably eliminate a lot of the quirky issues you run into when trying to parse C++. The proposal also suggests fixing a few quirks of the C++ language, such as making <code>this</code> a reference rather than a pointer. One of the most wonderful parts of SPECS (Significantly Prettier and Easier C++ Syntax) is the complete re-working of type syntax &#8211; even a complicated type signature like that of a pointer to an array of pointers to functions returning a pointer to a function is easily readable in SPECS:</p>

<p><div>
<pre class="txt" style="font-family:monospace;">type ComplicatedType : ^ [7] ^(int -&gt; ^(int -&gt; int));</pre>
</div></p>

<p>To read the SPECS typedef, just start at the left and read:</p>

<p><div>
<pre class="txt" style="font-family:monospace;">pointer to array of 7 pointers to a function taking int and returning a pointer to a function from int to int</pre>
</div></p>

<p>Contrast the equivalent C++:</p>

<p><div>
<pre class="cpp" style="font-family:monospace;"><span style="color: #0000ff;">typedef</span> <span style="color: #0000ff;">int</span> <span style="color: #008000;">&#40;</span><span style="color: #000040;">*</span><span style="color: #008000;">&#40;</span><span style="color: #000040;">*</span><span style="color: #008000;">&#40;</span><span style="color: #000040;">*</span>ComplicatedType<span style="color: #008000;">&#41;</span><span style="color: #008000;">&#91;</span><span style="color: #0000dd;">7</span><span style="color: #008000;">&#93;</span><span style="color: #008000;">&#41;</span><span style="color: #008000;">&#40;</span><span style="color: #0000ff;">int</span><span style="color: #008000;">&#41;</span><span style="color: #008000;">&#41;</span><span style="color: #008000;">&#40;</span><span style="color: #0000ff;">int</span><span style="color: #008000;">&#41;</span><span style="color: #008080;">;</span></pre>
</div></p>

<p>There is a somewhat reliable technique for reading this kind of nested type definition (start in the middle and go outwards? something like that&#8230;), but in my opinion: don&#8217;t bother. Give the original author a proper spanking and make them rewrite it as a set of typedefs instead <img src='http://olsner.se/wp-includes/images/smilies/icon_biggrin.gif' alt=':D' class='wp-smiley' /> </p>
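
<p>For what it&#8217;s worth, that rewrite could look something like the sketch below. The intermediate names are made up for illustration, and the <code>static_assert</code> (C++11) is only there to check that the chain of typedefs names the same type as the one-liner.</p>

<p><div>
<pre class="cpp" style="font-family:monospace;">#include &lt;type_traits&gt;

// The one-liner from above.
typedef int (*(*(*ComplicatedType)[7])(int))(int);

// The same type built up in named steps, innermost piece first.
typedef int (*IntToInt)(int);           // ^(int -&gt; int)
typedef IntToInt (*MakeIntToInt)(int);  // ^(int -&gt; ^(int -&gt; int))
typedef MakeIntToInt MakerArray[7];     // [7] ^(int -&gt; ^(int -&gt; int))
typedef MakerArray *ReadableType;       // ^ [7] ^(int -&gt; ^(int -&gt; int))

// Compile-time check that both spellings are the same type.
static_assert(std::is_same&lt;ComplicatedType, ReadableType&gt;::value,
              "the typedef chain should match the one-liner");</pre>
</div></p>

<p>Read bottom-up, the typedef chain ends up in the same order as the SPECS version.</p>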

<p>SPECS would be so much nicer than C++ syntax, but then I&#8217;m much more talking about an entirely new language than a C++ dialect (even though some semantics would be similar).</p>

<p>So: keep C++ and the world of pain it represents or write something using a new (and therefore scary and controversial) syntax?</p>
]]></content:encoded>
			<wfw:commentRss>http://olsner.se/2008/07/02/m-design-part-2-of-n/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Real Modules for C++</title>
		<link>http://olsner.se/2008/05/27/real-modules-for-c/</link>
		<comments>http://olsner.se/2008/05/27/real-modules-for-c/#comments</comments>
		<pubDate>Mon, 26 May 2008 23:02:56 +0000</pubDate>
		<dc:creator>olsner</dc:creator>
				<category><![CDATA[Projects]]></category>

		<guid isPermaLink="false">http://olsner.se/?p=19</guid>
		<description><![CDATA[C++ sucks. C++ needs a proper module system, where you can actually separate modules from each other rather than tangle them together in circular header include trees. This is it! Or, rather, when thinking about it has terminated and culminated into the start of an implementation, this will be the start of it that started [...]]]></description>
			<content:encoded><![CDATA[<p>C++ sucks. C++ needs a proper module system, where you can actually separate modules from each other rather than tangle them together in circular header include trees. This is it! Or, rather, when thinking about it has terminated and culminated into the start of an implementation, this will be the start of it that started it. Think of it as a collection of a few fluffy ideas that will transform C++ hell into the cozy wonderland it should be <img src='http://olsner.se/wp-includes/images/smilies/icon_wink.gif' alt=';-)' class='wp-smiley' /> </p>

<p>I&#8217;m not entirely sure about this, but maybe I&#8217;ll call it M++ as in &#8220;C++ with Modules&#8221;. In any case, this will become a C++ <b>dialect</b> as there is no hope of not breaking any code. There will also probably never be a very easy transition path for existing large-ish C++ codebases. (That&#8217;s not even a goal until this thing is sufficiently awesome to motivate someone to convert a large body of C++ code to it&#8230;)</p>

<h2>What is M++?</h2>

<p>Basically, it is C++ with &#8220;modules&#8221;, where modules are somewhat like ordinary C++ translation units, but with special magic for how names are imported between modules.</p>

<p><b>Importing a module imports a well-defined set of names into the local scope.</b> Contrast with C/C++ where module importing is implemented by inclusion of header files. (You know this already, but I&#8217;ll repeat it for the sake of rhetoric.) Header files may <code>#define</code> just about anything and wreak whatever havoc on the environment of following files. This means that any kind of higher-level reasoning on the effects of changes in header files on the actual code is pretty much moot. Any compiler must recompile every included header file from every including source file every time any part of this pool of mud changes. A wonky define in module A may produce error messages in system headers included from module B, while compiling module C. Modules A and B may not even be your code, or may be code you can&#8217;t change. World of woe!</p>

<p>With a proper module system, declarations from different modules will never clash, except maybe in the module that is importing conflicting names from more than one module. But that would be caused by code in your module, and you would have the tools to solve the problem!</p>

<h3>Guiding principles</h3>

<ul>
<li>Keep as much as possible of the syntax and semantics of C++</li>
<li>Remove the need for preprocessor inclusion</li>
<li>Keep the preprocessor around for e.g. importing external interfaces</li>
<li>Replace the current text-level importing with a symbol-table-level import</li>
<li>Provide good means of separating unrelated components by:

<ul>
<li>Limiting the set of exported symbols from components</li>
<li>Providing easy means to cherry-pick subcomponents (using namespace::name;)</li>
<li>Providing robust non-conflicting importing of whole components (using namespace;)</li>
</ul></li>
</ul>
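
<p>Plain C++ namespaces already give a taste of the last two bullets. A rough sketch follows; this is ordinary C++ and not M++, and every name in it (<code>Cache</code>, <code>CherryPick</code>, <code>WholeImport</code> and the functions) is invented for illustration.</p>

<p><div>
<pre class="cpp" style="font-family:monospace;">namespace Net { namespace HTTP {
    int get() { return 200; }
    int post() { return 201; }
} }

namespace Cache {
    int get() { return 304; }
}

namespace CherryPick {
    // Cherry-pick exactly one name; nothing else from Net::HTTP is visible.
    using Net::HTTP::get;
    int fetch() { return get(); }
}

namespace WholeImport {
    // Import two whole namespaces that both contain a get().
    using namespace Net::HTTP;
    using namespace Cache;

    // post() exists in only one of them, so it needs no qualification.
    int send() { return post(); }

    // An unqualified get() would be ambiguous here. The conflict only shows
    // up in the importing code, and qualifying the name resolves it.
    int fetch() { return Cache::get(); }
}</pre>
</div></p>

<p>The conflict between the two <code>get</code> functions is only an error if the importing code actually uses the unqualified name, so the fix always lives in your own module.</p>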

<h3>Basic proof-of-something example</h3>

<p>For context, this module would reside in a file Main.mpp (maybe even just call these files cpp?), and exports the class Main::Foo, the method Main::Foo::Bar and the function Main::main(&#8230;).</p>

<p><div>
<pre class="cpp" style="font-family:monospace;"><span style="color: #666666;">// This is where the *really* cool part is. These modules (namespaces) are</span>
<span style="color: #666666;">// automatically imported by searching the system module path. Either they</span>
<span style="color: #666666;">// are defined by local source tree cpp (mpp?) files named e.g. Win32.cpp</span>
<span style="color: #666666;">// and Net/HTTP.cpp (Net_HTTP.cpp would also work), or they could be</span>
<span style="color: #666666;">// binary self-describing modules. Nothing said here on how that binary</span>
<span style="color: #666666;">// self-description would look. That's the magic left as an exercise for the reader.</span>
<span style="color: #0000ff;">using</span> <span style="color: #0000ff;">namespace</span> Win32<span style="color: #008080;">;</span>
<span style="color: #0000ff;">using</span> <span style="color: #0000ff;">namespace</span> Net<span style="color: #008080;">::</span><span style="color: #007788;">HTTP</span><span style="color: #008080;">;</span>
&nbsp;
<span style="color: #666666;">// These are some provisional ideas on how to wrap the C/C++ standard library in</span>
<span style="color: #666666;">// M++ form. More on importing legacy C/C++ functions and classes later on.</span>
<span style="color: #0000ff;">using</span> stdlib<span style="color: #008080;">::</span><span style="color: #0000dd;">atoi</span><span style="color: #008080;">;</span>
<span style="color: #0000ff;">using</span> stdio<span style="color: #008080;">::</span><span style="color: #0000dd;">printf</span><span style="color: #008080;">;</span>
&nbsp;
&nbsp;
<span style="color: #666666;">// This is a private function. It would not be linkable from other translation</span>
<span style="color: #666666;">// units (default linkage in the top-level is static), but is usable from all</span>
<span style="color: #666666;">// definitions in the Main namespace below.</span>
<span style="color: #0000ff;">void</span> Foo_Bar<span style="color: #008000;">&#40;</span><span style="color: #008000;">&#41;</span>
<span style="color: #008000;">&#123;</span><span style="color: #008000;">&#125;</span>
&nbsp;
&nbsp;
<span style="color: #0000ff;">namespace</span> Main
<span style="color: #008000;">&#123;</span>
    <span style="color: #666666;">// Normal classes here</span>
    <span style="color: #0000ff;">class</span> Foo
    <span style="color: #008000;">&#123;</span>
    <span style="color: #0000ff;">public</span><span style="color: #008080;">:</span>
        <span style="color: #0000ff;">void</span> Bar<span style="color: #008000;">&#40;</span><span style="color: #008000;">&#41;</span><span style="color: #008080;">;</span>
    <span style="color: #008000;">&#125;</span><span style="color: #008080;">;</span>
&nbsp;
    <span style="color: #666666;">// Maybe the main function could move into the Main namespace from the</span>
    <span style="color: #666666;">// global namespace like this:</span>
    <span style="color: #0000ff;">int</span> main<span style="color: #008000;">&#40;</span><span style="color: #0000ff;">int</span> argc, <span style="color: #0000ff;">char</span> <span style="color: #000040;">*</span>argv<span style="color: #008000;">&#91;</span><span style="color: #008000;">&#93;</span><span style="color: #008000;">&#41;</span>
    <span style="color: #008000;">&#123;</span>
        Foo foo<span style="color: #008080;">;</span>
        foo.<span style="color: #007788;">Bar</span><span style="color: #008000;">&#40;</span><span style="color: #008000;">&#41;</span><span style="color: #008080;">;</span>
        Foo_Bar<span style="color: #008000;">&#40;</span><span style="color: #008000;">&#41;</span><span style="color: #008080;">;</span>
        <span style="color: #0000dd;">printf</span><span style="color: #008000;">&#40;</span><span style="color: #FF0000;">&quot;Baz<span style="color: #000099; font-weight: bold;">\n</span>&quot;</span><span style="color: #008000;">&#41;</span><span style="color: #008080;">;</span>
    <span style="color: #008000;">&#125;</span>
<span style="color: #008000;">&#125;</span></pre>
</div></p>

<h3>Unresolved/other issues</h3>

<ul>
<li>How do you expose things in the global namespace from M++ modules (these exposed names should follow the relevant C++ ABI and mix-and-match with C++ code.)</li>
</ul>

<p>This one is slightly harder than the next question. Many solutions here are bad. So I think I&#8217;ll just let this one noodle for a while <img src='http://olsner.se/wp-includes/images/smilies/icon_wink.gif' alt=';-)' class='wp-smiley' /> </p>

<ul>
<li>How do you import external C or C++ classes/functions into M++?</li>
</ul>

<p>I&#8217;m thinking you&#8217;d #include the external headers in one module, then use <code>using ::asdf;</code> to import the top-level declarations into the namespace exported from that module. This means that an M++ compiler must be able to understand the full wonderful ambiguity of C++. But hopefully, the modularization through use of namespaces would mean that only one of a large number of modules needs to go through the hassle of actually parsing all that crud in order to build a small symbol table.</p>
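
<p>The re-export half of this already works in today&#8217;s C++, so here is a small sketch of the idea. The <code>stdlib</code> namespace matches the provisional name used in the example above; <code>parsePort</code> is a made-up helper, and in real M++ the two namespaces would live in separate modules.</p>

<p><div>
<pre class="cpp" style="font-family:monospace;">// The one "module" that has to parse the legacy C header.
#include &lt;stdlib.h&gt;

namespace stdlib {
    // Re-export the global declaration under stdlib::
    using ::atoi;
}

// Everyone else imports only the name and never sees the header.
namespace Main {
    using stdlib::atoi;

    int parsePort(const char *text) { return atoi(text); }
}</pre>
</div></p>

<p>Only the wrapper has to deal with whatever else stdlib.h drags in; importers just get a symbol table with one entry.</p>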

<p>One remaining issue is how to distribute the kind of macros that are required/useful (i.e. file/line-tracking allocation functions, asserts, debug/release-dependent code). Would you be including some small number of headers into each module to do that, would the module system somehow get involved in preprocessing and let macros be imported as part of a namespace?</p>

<p>Getting rid of macros in the first place is a pretty good idea anyway, but exactly how far can you practically take it? Some things like stdint.h define a large quantity of useful macros. In an ideal world, these would be constant variables (<code>const int32_t INT32_MAX;</code> etc.) defined in a suitable namespace (perhaps something like stdint, since they come from stdint.h).</p>
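
<p>As a sketch of what such a namespace could look like in plain C++: the namespace name <code>stdint_consts</code> and the lowercase constant names are assumptions, chosen because the uppercase names are macros and cannot be reused as identifiers while the header is included.</p>

<p><div>
<pre class="cpp" style="font-family:monospace;">#include &lt;cstdint&gt;
#include &lt;limits&gt;

namespace stdint_consts {
    // Typed constants instead of preprocessor macros. They obey namespaces
    // and scoping, so they can be imported or hidden like any other name.
    const std::int32_t int32_max = std::numeric_limits&lt;std::int32_t&gt;::max();
    const std::int32_t int32_min = std::numeric_limits&lt;std::int32_t&gt;::min();
}

// Still usable where a compile-time constant is required (C++11).
static_assert(stdint_consts::int32_max == 2147483647,
              "int32_t has a fixed 32-bit range");</pre>
</div></p>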
]]></content:encoded>
			<wfw:commentRss>http://olsner.se/2008/05/27/real-modules-for-c/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
	</channel>
</rss>

