[GSoC] WebAssembly for collisions idea

mo7sen · February 21, 2020, 9:17pm

Hey! I’m interested in the WASM idea for GSoC. Are there any specific things that I should get familiar with to gain a better understanding of the project? Thanks in advance!

4ian · February 22, 2020, 12:49pm

Hey! Thanks for your interest
Basically, take a look at this: Experimental object collision and positions handling improvements by 4ian · Pull Request #1393 · 4ian/GDevelop · GitHub

Try to first ynderstand what I’m trying to do: it’s basically a better collision manager, but all in JavaScript. It unfortunately still need work to avoid being less performant than the current version in some cases (read the thread).

The goal would be to rewrite a part of it (gdjs.ObjectPositionsManager) into WebAssembly.

For the language, it would be Rust/C++/AssemblyScript/whatever actually! It’s up to you but you would have to tell the rationale behind picking one (in terms of code size notably).

It’s a medium/hard project: first you need a good understanding of my current PR, then see what can be done in WebAssembly, ensure this will be performant (what data structure to use), verify the improvement on various games and benchmarks, etc…

mo7sen · February 27, 2020, 6:35am

Ok so I read most of the gdjs.ObjectPositionsManager and I think I’m starting to get it. Might take some time to think about the data structures for WebAssembly, though.

I think the best language for the job would be Rust. That’s primarily because of how smart its compiler is and how it could detect non-logical bugs at compile-time. I don’t have much experience with WebAssembly on C++ but I don’t think there would be much of a difference aside from the fact that Rust has some macros to make boilerplate less of a pain.

Is gdjs.ObjectPositionsManager that you’re planning on changing to WebAssembly?

4ian · February 27, 2020, 6:13pm

I don’t have much experience with WebAssembly on C++ but I don’t think there would be much of a difference aside from the fact that Rust has some macros to make boilerplate less of a pain.

There are difference of size of the generated wasm file, potentially performance (but depends on your implementation) or availability of existing data structures (C++ is used a lot in game development, so might have more ready made things).

The size of the wasm file is particularly important => We won’t bundle a 5 megabytes library with games (unless it’s optional).

Is gdjs.ObjectPositionsManager that you’re planning on changing to WebAssembly?

Yes In fact I designed gdjs.ObjectPositionsManager in JavaScript, but made it to be easy to port to WebAssembly.
Unfortunately for now I had to stop working on it because of the performance gain/loss that are not always satisfying when dealing with lots of objects moving at the same time. Depending on how things are going, I hope to be able to continue it a bit before GSoC, but not sure!

mo7sen · February 27, 2020, 10:20pm

According to this article, it is possible to reduce the final binary size by removing features that won’t be useful in this situation, like some string operations. So it might be more worthy to give rust a shot as it is less of a pain when dealing with bugs and can end up being of a small size.
It is also mentioned in this article that the difference in size is between C and rust. And that C++ should have a much larger binary size than C.
Also when it comes to size, I don’t think a couple of extra megabytes will make much of a difference, Assuming that the difference will be that big.

This part of the rust wasm book also has a section called “Small .wasm sizes” which means that it is probably a priority of the rust libraries for wasm. The book also has some tips on how to shrink the size of the binaries substantially.

This article also does a way better job at saying why rust might be a better choice.

triptych · February 29, 2020, 5:15pm

I would just pipe in to add a vote for using Rust vs C++ if only because you can be a first mover in this area and help the transition to Rust.

mo7sen · February 29, 2020, 7:05pm

I do not mind using C++ if that’s the preferred option. But I think that Rust would be better unless @4ian doesn’t want to introduce another language to the repo to avoid turning it into a mess of different languages.

4ian · March 1, 2020, 12:55pm

That’s an experiment so we could try Rust. Note that there is already implementation of Polygon collision in C++ (was used for the previous native engine):

github.com

4ian/GDevelop/blob/master/Core/GDCore/Extensions/Builtin/SpriteExtension/Polygon2d.cpp

/*
 * GDevelop Core
 * Copyright 2008-2016 Florian Rival (Florian.Rival@gmail.com). All rights
 * reserved. This project is released under the MIT License.
 */
#include "Polygon2d.h"
#include <SFML/System/Vector2.hpp>
#include <cmath>
#include <iostream>

void Polygon2d::Rotate(float angle) {
  float t, cosa = cos(-angle),
           sina = sin(-angle);  // We want a clockwise rotation

  for (std::size_t i = 0; i < vertices.size(); ++i) {
    t = vertices[i].x;
    vertices[i].x = t * cosa + vertices[i].y * sina;
    vertices[i].y = -t * sina + vertices[i].y * cosa;
  }
}

This file has been truncated. show original

https://github.com/4ian/GDevelop/blob/master/GDCpp/GDCpp/Runtime/PolygonCollision.cpp

But this should be “easy” to port to Rust. I’ve already done it in JavaScript (see polygon.js).
The challenge is more about how to integrate the (Rust/C++/whatever) Wasm library in the game engine so that it’s efficient to call (how to pass the data from JS to this library? Most probably the objects should not store their polygons. They must use gdjs.ObjectPositionsManager to declare polygons, and these can then be transmitted to the (Rust/C++/whatever) Wasm library.

Well that’s just an idea, I’m just cautious about the overhead that can exist when calling Wasm from JS - you probably want to do as much work as possible in Wasm and have all the data in Wasm. The JS should just call the Wasm library to update the “world” representation.

mo7sen · March 1, 2020, 6:00pm

Can you please elaborate on what I should do with these 2 files? Should I just turn them into Rust? Do I need to make them wasm? If so, where do I plug the generated binary for usage?

4ian · March 1, 2020, 6:38pm

We’re discussing about a GSoC project that should be ~3 months long - so there is certainly more than than
I’ll leave you to investigate what is needed and try to define boundaries of the project. If you’re working on GSoC, you’ll need this autonomy.
I’ll for sure be there to clarify things and help starting - but you still must be in capacity of doing a proposal about something that you understand by yourself. It’s normal to have some things that are unsure and you can always ask questions, but I expect things like “how do I plug the generated binary” to be precised by you.

mo7sen · March 1, 2020, 6:52pm

Oh, that’s not what I meant. When you said “That’s an experiment so we could try Rust”, I thought you meant that those 2 files were a separate thing that we can try applying Rust to so that we can check its effectiveness. But now that I look at it again, I think you meant that the project itself was kind of an experiment so Rust wouldn’t be a problem.

4ian · March 1, 2020, 7:10pm

Ah sorry for the misunderstanding!

So yeah I mean the whole project as an experiment. But this being said, it would be interesting to do some smaller experiment before fully commiting to Rust, C++ or another language I’m thinking of creating a small library with a few example functions, then call them from JavaScript and measure the performance (how much can we call? Is there an overhead compared to calling a JavaScript function). Would be also interesting to see how we can share memory, like should we pass the polygons of objects as an array of integers representing points?

Making a few smaller experiments like this will be very useful to better understand what we can achieve in this project (and this can be done independently of GSoC - will be useful in all cases!)

mo7sen · March 2, 2020, 2:27am

I can do the “small library with example functions” thing. What kind of functions would most simulate the real work that the Wasm library might do?

4ian · March 2, 2020, 8:46am

I would say a function that is taking a large array of integers, to simulate the loading of updates to the object in the collision manager (see Experimental object collision and positions handling improvements by 4ian · Pull Request #1393 · 4ian/GDevelop · GitHub). The array of integers should in fact be an array of arrays of integers. The arrays of integers are representing the gdjs.ObjectPosition: Experimental object collision and positions handling improvements by 4ian · Pull Request #1393 · 4ian/GDevelop · GitHub

This function can store them (in a list). ObjectPosition have a unique number called objectId.
We can then have another function to do some fake work on this list. The parameters would be an array of integers, representing the list of objectId to do the work on. For example, you can imagine this is a function doing a collision checking between two lists of objects: you need two arrays of integers as parameters. The return value would be something like two lists of integers, which are the containing the ids of the objects in collision.

This should give a first idea at how easy and costly it is to:

send some datas (here, array of (arrays of) integers).
get back some datas
check the cost of calling a function every frame (if you manage to get the library, we can integrate it in a GDevelop game or even a Pixi.js demo, call your two functions and see how it performs).

Let me know if you’re interested in trying these few functions Will be very useful!

mo7sen · March 3, 2020, 10:01am

I wrote the first function (the one taking the large array of integers). I made it so that 24 bytes are passed per object. I don’t know how big are the numbers we’re dealing with so I decided to use 64 bits per number to get an idea about how it would be in a worst-case scenario.

pub struct GDObjectPosition {
    objectId: u64,    // 8 bytes
    x       : i64,    // 8 bytes
    y       : i64,    // 8 bytes
}

For the sake of the test, the objects were passed to the wasm function as follows:

PositionManager.addObjects([
    [1, 2, 3],
    [2, 2, 3],
        .
        .
        .
    [9, 2, 3],
]);

This function then parses each array to a GDObjectPosition object and is added to a HashMap in the wasm memory. It was then tested with 1000, 10000, 100000 and 1000000 objects being loaded.

 ---------------------------------------------------------
|  # of objects  |  1000  | 10000  |  100000  |  1000000  |
|----------------|--------|--------|----------|-----------|
|  Time (in ms)  |     3  |    12  |      96  |     1400  |
|----------------|--------|--------|----------|-----------|
|  After changes |     2  |    10  |      50  |      450  |
 ---------------------------------------------------------

So apparently, it takes exponential time to load the objects into the PositionManager. It is worth mentioning that this is far from being optimized thoroughly, and can probably be sped up when doing the real thing.

Some changes to the way the parsing was done led to a massive speed up, especially with a big number of objects.

4ian · March 5, 2020, 8:13am

Interesting! Thanks for doing that Would you be able to post your code on a git?
Can you try with 0 and 1 objects? I would be interested to know what’s the overhead here (basically to call a function doing nothing) - because 3ms is quite a lot if you have a game where you want to render at 60 fps (you only have 16ms in a frame).
Not sure if the 3ms are coming from the library, for the call to wasm, or both? Would be interesting to get the result in micro or nanoseconds to be also a bit more precise?

mo7sen · March 5, 2020, 10:57am

This is only the loading of the objects, which I don’t think happens every frame. I only measured milliseconds so far but when it’s a small number of objects, the time displayed ends up being 0ms.

I am using an ‘insert’ operation for each object in a hashmap, so the hashing function may even be playing a part in this. There is probably a better approach than that but this is what I came up with on the spur of the moment.

Also when passing the objects to the wasm function, I made it so that they are serialized and deserialized because that was a bit too convenient and should provide a worst case scenario for the performance. I’ll make the fake operation function, try to make the current addObjects(...) function a bit faster, then I’ll push everything to a git repository.

4ian · March 5, 2020, 8:02pm

Yep, that’s why you must find a way to get more precise measurements In particular, you can run the function multiple times.

Also when passing the objects to the wasm function, I made it so that they are serialized and deserialized because that was a bit too convenient and should provide a worst case scenario for the performance. I’ll make the fake operation function, try to make the current addObjects(...) function a bit faster, then I’ll push everything to a git repository.

Sure, thanks! There might be multiple way to serialize stuff, but I tried to have the “object position” to be consisting only of numbers for the reason of being able to quickly send them to wasm (at least sending a bunch of numbers can’t be worse than sending more complex data structures).

This is only the loading of the objects, which I don’t think happens every frame

If 1000 objects are moving at every frame, this means that we’ll have to send every frame 1000 changes in object positions - so that’s why it’s important to check the overhead Hopefully the gdjs.ObjectPositionsManager will “batch” the calls so that a call to wasm is only done when really needed (i.e: when a collision is checked, or at least once a frame but not more if not needed).

mo7sen · March 6, 2020, 8:12am

Changed add_objects from “loop and parse each object” to “parse the whole thing then loop over it”
- Speed Up by 2x-3x (the higher the number of objects, the more significant the speed up becomes)
- Increased the cost of calling add_objects with empty parameters from 600ns to 900ns
Assuming that DOS attacks are not to be expected in the HashMap, it was changed to FxHashMap:
- Faster hashing function, but not cryptographic
- Speed up not measured, but exists
Created the fake_collision_check function:
- Need to reimplement as I don’t know what is the expected complexity
- When the 2 arrays passed are of size 500 each, and the return arrays are empty, it takes approximately 3.2-3.5us to just call the function without any operations happening inside it
- The current fake_collision_check is a nested loop with the outer loop being over the first id list and the inner one being over the second id list. After that, an if condition (that I wrote randomly. I literally have no idea about what it does, but I know that it is true in most cases, if not all) is used and within it another if condition that does almost the same thing whether it’s true or false (it adds the object ids to different lists). Because of how badly it is written, I’m assuming that the return lists are way bigger than they should be (over 100x the length that they can be in reality). However, this produces a delay of average 6ms.
- Considering that this is 6ms in a messed up function that wastes a lot of resources. The real deal (with the RTree and all) will definitely cause a huge speed up.

New table for the add_objects function benchmark:

 ---------------------------------------------------------
|  # of objects  |  1000  | 10000  |  100000  |  1000000  |
|----------------|--------|--------|----------|-----------|
|  Time (in ms)  |     3  |    12  |      96  |     1400  |
|----------------|--------|--------|----------|-----------|
|  After changes | 890us  | 4.7ms  |   41.5ms |    405ms  |
 ---------------------------------------------------------

Also, here is the repo

4ian · March 8, 2020, 11:04pm

I was not able to spend much time reviewing this during this week-end, but I’ll give it a look tomorrow!
Thanks for your work so far!

Faster hashing function, but not cryptographic

Yep that’s perfectly fine. We’re almost only interested in speed in this use case

Because of how badly it is written, I’m assuming that the return lists are way bigger than they should be (over 100x the length that they can be in reality)

That’s indeed a fair assumption.

Considering that this is 6ms in a messed up function that wastes a lot of resources. The real deal (with the RTree and all) will definitely cause a huge speed up.

The RTree (or any spatial data structure) should indeed reduce this cost

New table for the add_objects function benchmark

Can you add a benchmark for calling the function 1000 times with 1 object? And 500 times with 2 objects?
I want to get an idea of the cost of just calling a function.

There might be a case where the JS is calling the function to update the spatial data structure a lot - even if not a lot of things have changed (for example, if you move an object, then immediately check its collision, then we are forced to call the Wasm module. And then you can imagine doing this for 100 objects in a row).