Profiling DMD's Compilation Time with dmdprof

Compilation time is frequently linked to productivity, and is of particular interest to D programmers. Walter Bright designed the language to compile quickly (especially compared to C++), and DMD, the reference D compiler implementation, is itself quite optimized.

Direct quote from a colleague (after a long day of bisecting a regression in Clang, each step taking close to an hour):

Whaaaaaaaaaaat? How can a compiler compile in 3 seconds?!?

Indeed, a clean build of DMD itself (about 170’000 lines of D and 120’000 lines of C/C++) takes no longer than 4 seconds to build on a rather average developer machine. Code which takes advantage of more advanced language features, like string mixins and CTFE, compiles slower; on this subject, Dmitry Oslhansky is working on CTFECache for caching CTFE execution across compiler invocations, and there’s of course Stefan Koch’s new CTFE implementation.

Walter has long touted the importance of using profilers. Though we have plenty of the kind that instrument or sample the program being profiled, including two in DMD itself, there aren’t as many of the kind that allow profiling the input data, i.e. the D source code being compiled. Generic tools for reducing input data sets according to some condition are theoretically applicable, but specialized tools are generally much more effective.

One such tool is DBuildStat, which enumerates the list of modules used in a program and then proceeds to collect various metrics about their compilation time. The subject came up recently in the discussion of Phobos pull request #5916, “Add a global convenience package file”; a curious outlier was std.net.curl, which took an entire 40 milliseconds to import:

Though DBuildStat provided useful insight on a larger scale, it did not answer the question: yes, but why does std.net.curl take an entire 40 milliseconds to import?

In theory, we could do something like this:

  • Start compiling the D code;
  • Stop the compiler at an arbitrary point;
  • Look at its stack trace and make note of what code it was compiling at that time;
  • Resume the compilation;
  • Repeat the steps above until a satisfactory number of samples has been gathered.

This way, we could get a picture of which parts of our D code take the longest to compile.

Putting theory to practice with a bit of GDB Python scripting, a dash of postprocessing, a healthy dose of visualization, and some obligatory GraphViz, we get something actually usable. Enter DMDProf:

Click the above image to see the full graph, and to get hover tooltips and clickable hyperlinks.

The data has hierarchy, meaning that we can see which modules import which (and at what cost), which code instantiates which templates, and how it all comes together. At the root, the profiler leaves some nodes from the DMD stack trace verbatim, so that we can see how much time each compilation stage uses, and what code is compiled in which stage.

Following the links is entertaining in seeing how Phobos modules interact and how deep some of these interactions go, but also allows quickly identifying immediately actionable improvements.

Hopefully, the tool will come in useful to D users as well as D developers in the future.

Source code and detailed usage instructions can be found in the GitHub repository:

My thanks go to:

  • Sebastian Wilzbach, for his work on related topics;
  • José Fonseca, for the gprof2dot tool used for visualization;
  • Walter Bright and Michael Parker, for their input on this article.

Note: this article was written on the date below, but published only on 2018-11-06.