Code Generation Process

arthuro555 · August 25, 2020, 4:06pm

Hey!

So I am pretty familiar with GDevelop’s source now, but one part I try to read and understand every week, and don’t manage to, is the Code Generation. I don’t really know what people mean by spaghetti code (I think just bad code in general?) but that’s what I would call it, considering it is so over-modularized.

The code is currently split into GDCore and GDJS, which is fine for the Platform architecture, but then there are many MANY functions: some are overridden, some use the ones from another file, etc. And then the code generation is split into files for Layers, Functions, etc. And then there is the Common Instructions Extension, which defines the generator of each event used by the code generator, and those generators themselves use the code generator.

Well, all that rant to say I just can’t understand that code: every time, I jump from file to file while trying to follow what’s happening and lose track. Would it be possible for someone who understands it (@4ian?) to explain what the most important functions are, and/or what an export process usually looks like?

Example:
→ Start Project Generation
----> For Each event sheet
------> Find used extensions and add to global includes
------> Read some data from objects and use the data to create this function …

I have many contribution ideas that I can’t do because I don’t understand that code.

4ian · August 26, 2020, 8:54pm

Code generation is indeed a complex part of the app. That’s due both to its nature and to the fact that it was made to support C++ and JS code generation (which is now a bit useless and could be made simpler).

Here is how an export works:

The GDJS Exporter class is called. It calls into ExporterHelper, notably, for example, this function for previews.
- This is the function launching all the “steps”.
You can find the function launching the export of each scene events here.
- It’s creating a LayoutCodeGenerator and calling it, which is just an interface calling this function that will do the export of events of a scene to code.
- It’s then calling this function which is doing the export of events (from a scene or not).

This is where things get interesting. You can see that EventsCodeGenerator in GDJS is a subclass of EventsCodeGenerator in Core.

[EventsCodeGenerator in Core] is important because it’s the class that will traverse the events (events are a “tree”), and for each event, it will call the function to generate the code of the event.
As each event is different, each has a function that is called to generate the code.

This function is visible in CommonInstructionsExtension (because events are provided by an extension :D). See here for a standard event.
When the function is called, the Core EventsCodeGenerator is calling it, passing itself as a parameter.

Why do that? It’s to pass all the tools needed by the event to generate its code. For example, almost all events will need to generate actions, so there is a method to do that.

When this method is called, we’re back inside EventsCodeGenerator. For example, in the method to generate the code for actions. See notably the code to generate a single action.
At this point, we need to generate the action in C++ or JS. How do we do that? Well, we’re inside EventsCodeGenerator, which is subclassed in GDCpp and GDJS. So we have functions generating the code either in C++ or in JS! GenerateActionCode will call GenerateFreeActionCode, GenerateObjectActionCode or GenerateBehaviorActionCode, all of which are overridden in GDJS EventsCodeGenerator.

So here we are: we are now capable of generating the code of events.
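To make the flow above more concrete, here is a minimal sketch of it in TypeScript. It is not the real code: every name (`SketchEvent`, `CoreEventsCodeGenerator`, `registerEvent`, the generated strings, etc.) is a simplified stand-in I made up for illustration, and conditions and behaviors are left out. It only shows the shape: the base class traverses the events, the event's own function receives the generator as a parameter, and the subclass decides what the output language looks like.

```typescript
// Hypothetical, simplified stand-ins: none of these names are the real GDevelop API.
interface SketchAction { type: string; objectName?: string; args: string[]; }
interface SketchEvent { kind: string; actions: SketchAction[]; subEvents: SketchEvent[]; }

// Each kind of event provides a function that receives the generator itself.
type EventCodeGenerationFn = (event: SketchEvent, generator: CoreEventsCodeGenerator) => string;

abstract class CoreEventsCodeGenerator {
  private eventGenerators: { [kind: string]: EventCodeGenerationFn } = {};

  registerEvent(kind: string, fn: EventCodeGenerationFn): void {
    this.eventGenerators[kind] = fn;
  }

  // Common logic: traverse the tree of events and call each event's function.
  generateEventsListCode(events: SketchEvent[]): string {
    let code = "";
    for (const event of events) {
      const fn = this.eventGenerators[event.kind];
      if (!fn) throw new Error("Unknown event kind: " + event.kind);
      code += fn(event, this); // The generator passes itself as "the tools".
    }
    return code;
  }

  // A tool used by almost all events.
  generateActionsListCode(actions: SketchAction[]): string {
    let code = "";
    for (const action of actions) code += this.generateActionCode(action);
    return code;
  }

  generateActionCode(action: SketchAction): string {
    return action.objectName !== undefined
      ? this.generateObjectActionCode(action.objectName, action)
      : this.generateFreeActionCode(action);
  }

  // Language-specific parts, left to the subclass.
  protected abstract generateFreeActionCode(action: SketchAction): string;
  protected abstract generateObjectActionCode(objectName: string, action: SketchAction): string;
}

class JsEventsCodeGenerator extends CoreEventsCodeGenerator {
  protected generateFreeActionCode(action: SketchAction): string {
    return action.type + "(" + action.args.join(", ") + ");\n";
  }
  protected generateObjectActionCode(objectName: string, action: SketchAction): string {
    return objectName + "." + action.type + "(" + action.args.join(", ") + ");\n";
  }
}

const generator = new JsEventsCodeGenerator();
// What an extension would do for a "standard" event: wrap actions and sub events in a block.
generator.registerEvent("standard", (event, gen) =>
  "{\n" + gen.generateActionsListCode(event.actions) + gen.generateEventsListCode(event.subEvents) + "}\n"
);

const sketchEvents: SketchEvent[] = [{
  kind: "standard",
  actions: [
    { type: "setScore", args: ["10"] },
    { type: "setX", objectName: "player", args: ["0"] },
  ],
  subEvents: [],
}];
const generatedCode: string = generator.generateEventsListCode(sketchEvents);
console.log(generatedCode);
```

Notice how you already have to jump between the base class, the subclass and the registered function to follow a single event. That's the same effect as in the real code, just much smaller.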

The rationale is:

- Core EventsCodeGenerator contains the common logic.
- GDJS/GDCpp EventsCodeGenerator do the part specific to JS/C++.
- Events in extensions have a function called to generate the code specific to the events.

This is almost the “Strategy Pattern”.

Note that this is not so simple because we have also to:

- handle objects! We have to declare lists of objects and filter them. For that, we have a class that acts as the context to know which object lists should be declared, used, etc.: EventsCodeGenerationContext
- we have events, actions, conditions… but we also have expressions! How do we translate formulas into code? Well, for this we have ExpressionParser2.
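For the first point (the context for object lists), here is a rough sketch of the idea, not the real EventsCodeGenerationContext. The class name, its methods and the generated strings (`scene.getObjects`, `inherited.…`) are all assumptions made for illustration. The point is only that each event gets a context that knows its parent, so the generator can tell whether a list must be freshly declared or reused from what a parent event already picked.

```typescript
// Hypothetical sketch: names and generated strings are invented for illustration.
class SketchGenerationContext {
  private neededLists: string[] = [];

  constructor(private parentContext: SketchGenerationContext | null = null) {}

  // Called when an action or condition of the event uses an object.
  objectsListNeeded(objectName: string): void {
    if (this.neededLists.indexOf(objectName) === -1) this.neededLists.push(objectName);
  }

  // True if a parent event (at any level) already works on a list of this object.
  private isDeclaredByAncestor(objectName: string): boolean {
    let ancestor: SketchGenerationContext | null = this.parentContext;
    while (ancestor !== null) {
      if (ancestor.neededLists.indexOf(objectName) !== -1) return true;
      ancestor = ancestor.parentContext;
    }
    return false;
  }

  // Output the declarations to put at the start of the event's code.
  generateDeclarations(): string {
    let code = "";
    for (const objectName of this.neededLists) {
      code += this.isDeclaredByAncestor(objectName)
        ? "var " + objectName + "List = inherited." + objectName + "List.slice();\n" // keep the parent's filtering
        : "var " + objectName + 'List = scene.getObjects("' + objectName + '");\n'; // start from all instances
    }
    return code;
  }
}

const rootContext = new SketchGenerationContext();
rootContext.objectsListNeeded("Player");

// A sub event using Player (already filtered by its parent) and Enemy (new).
const childContext = new SketchGenerationContext(rootContext);
childContext.objectsListNeeded("Player");
childContext.objectsListNeeded("Enemy");

const rootDeclarations: string = rootContext.generateDeclarations();
const childDeclarations: string = childContext.generateDeclarations();
console.log(rootDeclarations + childDeclarations);
```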

ExpressionParser2 is, as the name implies, a “parser”. It’s basically a part of a small compiler that I wrote inside GDevelop. Its goal is to turn the formula into a tree of “nodes” representing the expression.
When you have this tree of nodes (called an Abstract Syntax Tree), you can traverse it (just like events, conditions and actions!) and output anything that you want… including code. This is what the ExpressionCodeGenerator is doing.
It’s using the “Visitor Pattern”: it has functions, called for each node, that create code.
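As an illustration of the Visitor Pattern applied to an expression tree, here is a small sketch. The node classes, the visitor interface and the `sketchRuntime.random` target are invented for the example and are not the real ExpressionParser2 nodes. The tree is built by hand here, since the parsing step itself is out of scope.

```typescript
// Hypothetical node and visitor types, much simpler than the real ones.
interface ExprVisitor<T> {
  visitNumber(node: NumberNode): T;
  visitOperator(node: OperatorNode): T;
  visitFunctionCall(node: FunctionCallNode): T;
}

abstract class ExprNode {
  // Each node calls back the visitor function matching its own type.
  abstract accept<T>(visitor: ExprVisitor<T>): T;
}

class NumberNode extends ExprNode {
  constructor(public value: number) { super(); }
  accept<T>(visitor: ExprVisitor<T>): T { return visitor.visitNumber(this); }
}

class OperatorNode extends ExprNode {
  constructor(public op: string, public left: ExprNode, public right: ExprNode) { super(); }
  accept<T>(visitor: ExprVisitor<T>): T { return visitor.visitOperator(this); }
}

class FunctionCallNode extends ExprNode {
  constructor(public functionName: string, public args: ExprNode[]) { super(); }
  accept<T>(visitor: ExprVisitor<T>): T { return visitor.visitFunctionCall(this); }
}

// A visitor that outputs code: one function per kind of node.
class SketchExpressionCodeGenerator implements ExprVisitor<string> {
  // Maps an expression name to the (made up) function to call in the generated code.
  constructor(private runtimeFunctions: { [expressionName: string]: string }) {}

  visitNumber(node: NumberNode): string {
    return String(node.value);
  }

  visitOperator(node: OperatorNode): string {
    return "(" + node.left.accept<string>(this) + " " + node.op + " " + node.right.accept<string>(this) + ")";
  }

  visitFunctionCall(node: FunctionCallNode): string {
    const target = this.runtimeFunctions[node.functionName];
    if (target === undefined) throw new Error("Unknown expression: " + node.functionName);
    const args = node.args.map((arg) => arg.accept<string>(this));
    return target + "(" + args.join(", ") + ")";
  }
}

// The tree a parser could produce for the formula: 1 + Random(10)
const ast: ExprNode = new OperatorNode(
  "+",
  new NumberNode(1),
  new FunctionCallNode("Random", [new NumberNode(10)])
);
const expressionCode: string = ast.accept<string>(
  new SketchExpressionCodeGenerator({ Random: "sketchRuntime.random" })
);
console.log(expressionCode);
```

The nice property is that the nodes know nothing about code generation. Adding another kind of output only means writing another visitor.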

There are other “visitors” (that I called “workers”) that can be used for example to change the name of an expression. Indeed, when you have nodes, you can browse them, change the name of a function, then use another worker to “print” again the expression to text!
So we have a printer that does not print code but prints the original expression: ExpressionParser2NodePrinter.h (in 4ian/GDevelop on GitHub, at commit ca6f11b55aeae7f38d0bf31120aac2c078929d19).

This is useful to avoid doing search/replace on the text, which is error-prone and would not work well at all.
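Here is a sketch of why working on the nodes beats search/replace. For brevity it uses plain recursive functions over a tagged union instead of visitor classes, and the types and helper names (`SketchExpr`, `renameFunction`, `printExpr`) are made up for this example. One function renames a function in the tree, another prints the tree back to text.

```typescript
// Hypothetical, minimal expression tree (not the real ExpressionParser2 nodes).
type SketchExpr =
  | { kind: "number"; value: number }
  | { kind: "operator"; op: string; left: SketchExpr; right: SketchExpr }
  | { kind: "call"; functionName: string; args: SketchExpr[] };

// "Worker" 1: rename a function everywhere in the tree (modifies the tree in place).
function renameFunction(node: SketchExpr, oldName: string, newName: string): void {
  if (node.kind === "operator") {
    renameFunction(node.left, oldName, newName);
    renameFunction(node.right, oldName, newName);
  } else if (node.kind === "call") {
    if (node.functionName === oldName) node.functionName = newName;
    for (const arg of node.args) renameFunction(arg, oldName, newName);
  }
}

// "Worker" 2: print the tree back to the text of the expression.
function printExpr(node: SketchExpr): string {
  if (node.kind === "number") return String(node.value);
  if (node.kind === "operator") {
    return printExpr(node.left) + " " + node.op + " " + printExpr(node.right);
  }
  return node.functionName + "(" + node.args.map((arg) => printExpr(arg)).join(", ") + ")";
}

// The tree for: Random(10) + RandomFloat(1)
const parsedExpr: SketchExpr = {
  kind: "operator",
  op: "+",
  left: { kind: "call", functionName: "Random", args: [{ kind: "number", value: 10 }] },
  right: { kind: "call", functionName: "RandomFloat", args: [{ kind: "number", value: 1 }] },
};

// Naive text replacement also breaks RandomFloat:
const naiveText: string = printExpr(parsedExpr).split("Random").join("Pick");
console.log(naiveText); // Pick(10) + PickFloat(1)

// Renaming on the tree only touches the function that is really called Random:
renameFunction(parsedExpr, "Random", "Pick");
const renamedText: string = printExpr(parsedExpr);
console.log(renamedText); // Pick(10) + RandomFloat(1)
```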

So now we are able to create code for events, conditions, actions and expressions. That’s basically it.

In terms of code health:

- The ExpressionParser2 is new and “state of the art” in the sense that it follows good practices. But I highly recommend that you read about parsers (welcome to the wonderful world of compilers! Beware, there are dragons) if you want to understand it. Note that it’s unlikely you will need to make changes here! It’s pretty stable.
- The EventsCodeGenerator is admittedly a bit too complicated for what it is. The main suspect? Inheritance! Because I used inheritance, it’s hard to identify what is in GDJS and what is in Core.

In my defense, at the time inheritance was the “usual” way of reusing logic. The idea is legit:

- Everything that is common to C++/JS goes into Core EventsCodeGenerator.
- The rest goes in GDJS EventsCodeGenerator (or GDCpp), by overriding methods.

Unfortunately, it’s harder to spot “what belongs” where. If it was to be done again, I would split EventsCodeGenerator into:

- EventsCodeGenerator in Core only, doing the job of traversing events.
- Another, different class called something like EventsCodeWriter, one in GDCpp and one in GDJS, that would be called by EventsCodeGenerator to write C++ or JS.

This would be an interesting refactoring.
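A very rough sketch of what that split could look like, using composition instead of inheritance. Nothing here exists in the codebase: `EventsCodeWriter` is only the name suggested above, and the interface methods, the other class names and the generated strings are assumptions for the sake of the example.

```typescript
// Hypothetical sketch of the proposed split. None of this exists in GDevelop.
interface WriterAction { name: string; args: string[]; }
interface WriterEvent { actions: WriterAction[]; subEvents: WriterEvent[]; }

// Everything that depends on the output language sits behind this interface.
interface EventsCodeWriter {
  writeAction(action: WriterAction): string;
  writeBlock(body: string): string;
}

// Core only: traverses the events and knows nothing about C++ or JS.
class TraversingEventsCodeGenerator {
  constructor(private writer: EventsCodeWriter) {}

  generate(events: WriterEvent[]): string {
    let code = "";
    for (const event of events) {
      let body = "";
      for (const action of event.actions) body += this.writer.writeAction(action);
      body += this.generate(event.subEvents);
      code += this.writer.writeBlock(body);
    }
    return code;
  }
}

// Would live in GDJS.
class JsCodeWriter implements EventsCodeWriter {
  writeAction(action: WriterAction): string {
    return "actions." + action.name + "(" + action.args.join(", ") + ");\n";
  }
  writeBlock(body: string): string {
    return "{\n" + body + "}\n";
  }
}

// Would live in GDCpp.
class CppCodeWriter implements EventsCodeWriter {
  writeAction(action: WriterAction): string {
    return "actions::" + action.name + "(" + action.args.join(", ") + ");\n";
  }
  writeBlock(body: string): string {
    return "{\n" + body + "}\n";
  }
}

const writerEvents: WriterEvent[] = [
  { actions: [{ name: "setScore", args: ["10"] }], subEvents: [] },
];
const jsOutput: string = new TraversingEventsCodeGenerator(new JsCodeWriter()).generate(writerEvents);
const cppOutput: string = new TraversingEventsCodeGenerator(new CppCodeWriter()).generate(writerEvents);
console.log(jsOutput + cppOutput);
```

With this shape, "what belongs where" is answered by the interface: anything language-specific has to be a method of the writer, and the traversal can be tested with a fake writer.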
ExpressionParser2 is very well tested and should be bulletproof, but EventsCodeGenerator is not well tested (there are a few tests in GDevelop.js, though), so manual testing is needed to ensure events still work.

Hope this helps! It’s a complex topic, somewhere in between compilers and algorithms, so I understand it’s not easy.