Creating Minecraft in One Week with C++ and Vulkan

I took a crack at recreating Minecraft in one week using a custom C++ engine and Vulkan. I was inspired by Hopson, who did the same using C++ and OpenGL. Hopson was, in turn, inspired by Shane Beck, who was inspired by Minecraft, which was inspired by Infiniminer, which was presumably inspired by real world mining.

The GitHub repo for this project is here. Each day has its own git tag.

Of course, I’m not planning to literally recreate Minecraft. This is meant to be an educational project. I want to explore how to use Vulkan in something more complicated than vulkan-tutorial.com or Sascha Willems’s demos. Therefore, the main focus will be on the design of the Vulkan-based engine, not on the design of the game.

Goals

Vulkan is a lot slower to develop with than OpenGL, so I can’t include a lot of features from actual Minecraft. No mobs, no crafting, no redstone, no block physics, etc. From the start, the goals of the project are:

  • Create a terrain rendering system
    • Meshing
    • Lighting
  • Create a terrain generator system
    • Terrain
    • Trees
    • Biomes
  • Add the ability to modify terrain and move blocks

I’ll have to find a way to do all of this without a GUI in the final game, since I can’t find any GUI libraries that both work with Vulkan and are easy to integrate.

Libraries

Of course, I don’t want to write a Vulkan application completely from scratch. I’m going to use existing libraries where possible to speed up development. These include:

  • GLFW, for windowing and input
  • GLM, for vector and matrix math
  • entt, for the entity component system and signals
  • Vulkan Memory Allocator, for Vulkan memory management

Day 1

On the first day, I set up the Vulkan boilerplate and the skeleton of the engine. A lot of the code was boilerplate and could just be copied and pasted from vulkan-tutorial.com. This included the trick of storing the vertex data as part of the vertex shader, which meant I didn’t even need to have memory allocation set up. The result was a simple pipeline that could do one thing: draw a triangle.

The engine is just complex enough to support this triangle renderer. It has a single window, and a game loop that systems can be attached to. The extent of the GUI is the framerate printed in the window’s title.

The project is divided into two parts: VoxelEngine and VoxelGame.

[Image: my_first_triangle.png]

Day 2

I integrated the Vulkan Memory Allocator library. This library handles a lot of the boilerplate around Vulkan memory allocation, such as memory types, device heaps, and suballocation.
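
Roughly, allocating a vertex buffer through VMA looks something like this (a minimal sketch, not the engine’s actual code):

#include <vk_mem_alloc.h>

//Sketch: allocate a device-local vertex buffer through VMA.
//In real code the VmaAllocation is kept alongside the buffer,
//so the pair can be freed later with vmaDestroyBuffer.
VkBuffer createVertexBuffer(VmaAllocator allocator, VkDeviceSize size) {
    VkBufferCreateInfo bufferInfo = {};
    bufferInfo.sType = VK_STRUCTURE_TYPE_BUFFER_CREATE_INFO;
    bufferInfo.size = size;
    bufferInfo.usage = VK_BUFFER_USAGE_VERTEX_BUFFER_BIT | VK_BUFFER_USAGE_TRANSFER_DST_BIT;

    VmaAllocationCreateInfo allocInfo = {};
    allocInfo.usage = VMA_MEMORY_USAGE_GPU_ONLY;    //let VMA pick a device-local memory type

    VkBuffer buffer;
    VmaAllocation allocation;
    vmaCreateBuffer(allocator, &bufferInfo, &allocInfo, &buffer, &allocation, nullptr);
    return buffer;
}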

Now that I had memory allocation, I made classes for meshes and vertex buffers. I updated the triangle renderer to use the mesh class instead of arrays embedded in the shader. Right now, transferring the mesh data to the GPU is handled manually by the triangle renderer.

Not much has changed

Day 3

I added a render graph system. This class is based on this blog post, but heavily simplified. My render graph has only the bare minimum needed to handle Vulkan synchronization.

The render graph allows me to define nodes and edges. The nodes represent work done on the GPU; the edges are data dependencies between nodes. Each node gets its own command buffer to record into. The graph handles double buffering the command buffers and syncing them with previous frames. The edges are used to automatically insert pipeline barriers around each node’s command buffer. These pipeline barriers synchronize each resource’s usage and transfer ownership between queues. Edges also insert semaphores between nodes.

The nodes and edges form a directed acyclic graph. The render graph then performs a topological sort on the nodes, which produces a flat list of nodes sorted so that every node comes after all of the nodes that it depends on.
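
One way to implement such a sort is Kahn’s algorithm. Here’s a self-contained sketch using plain node indices and adjacency lists (not the engine’s actual types):

#include <cstddef>
#include <queue>
#include <vector>

//Sketch: topological sort via Kahn's algorithm.
//edges[i] lists the nodes that depend on node i.
std::vector<size_t> topologicalSort(const std::vector<std::vector<size_t>>& edges) {
    std::vector<size_t> inDegree(edges.size(), 0);
    for (const auto& outgoing : edges) {
        for (size_t to : outgoing) {
            inDegree[to]++;
        }
    }

    std::queue<size_t> ready;
    for (size_t i = 0; i < edges.size(); i++) {
        if (inDegree[i] == 0) ready.push(i);
    }

    std::vector<size_t> sorted;
    while (!ready.empty()) {
        size_t node = ready.front();
        ready.pop();
        sorted.push_back(node);
        for (size_t to : edges[node]) {
            if (--inDegree[to] == 0) ready.push(to);
        }
    }
    return sorted;  //every node comes after all of its dependencies
}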

The engine provides three nodes: AcquireNode, which acquires an image from the swapchain; TransferNode, which transfers data from the CPU to the GPU; and PresentNode, which presents the swapchain image to be displayed.

Each node can implement preRender, render, and postRender, which are executed every frame. The AcquireNode acquires a swapchain image during preRender. The PresentNode presents that image during postRender.
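
The interface looks roughly like this (a hypothetical sketch; the real class layout may differ):

#include <vulkan/vulkan.h>

//Hypothetical sketch of the node interface described above.
class Node {
public:
    virtual ~Node() = default;
    virtual void preRender() {}                     //e.g. AcquireNode acquires a swapchain image here
    virtual void render(VkCommandBuffer cmd) = 0;   //recorded into the node's own command buffer
    virtual void postRender() {}                    //e.g. PresentNode presents the image here
};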

I refactored the triangle renderer to use the render graph system instead of handling everything itself. There is an edge between AcquireNode and TriangleRenderer and between TriangleRenderer and PresentNode. This ensures that the swapchain image is properly synchronized as it is used throughout the frame.

I swear it’s changing under the hood

Day 4

I created the camera and 3D rendering system. The camera gets its own uniform buffer and descriptor pool for now.

I got bogged down this day trying to figure out the right configuration for rendering 3D with Vulkan. Most material online is about rendering with OpenGL, which uses slightly different coordinate systems than Vulkan. In OpenGL, the clip-space Z-axis is defined as [-1, 1] and the top of the screen is at Y = 1. In Vulkan, the Z-axis is defined as [0, 1] and the top of the screen is at Y = -1. These slight differences mean that GLM’s default projection matrices don’t work correctly, since they are designed for OpenGL.

GLM provides the GLM_FORCE_DEPTH_ZERO_TO_ONE option, which fixes the Z-axis problem. The Y-axis problem can then be fixed just by negating the element at (1, 1) in the projection matrix (GLM uses 0-based indexing).
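
Putting both fixes together looks something like this (a minimal sketch):

#define GLM_FORCE_DEPTH_ZERO_TO_ONE //clip-space Z in [0, 1], matching Vulkan
#include <glm/glm.hpp>
#include <glm/gtc/matrix_transform.hpp>

//Sketch: build a Vulkan-friendly projection matrix.
glm::mat4 makeProjection(float fovY, float aspect, float zNear, float zFar) {
    glm::mat4 proj = glm::perspective(fovY, aspect, zNear, zFar);
    proj[1][1] *= -1.0f;    //negate element (1, 1) to flip the Y-axis for Vulkan
    return proj;
}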

Flipping the Y-axis means that the vertex data also needs to be flipped, since until now it used the negative Y-axis as the up direction.

Now in 3D!

Day 5

I added user input and the ability to fly the camera around the scene with mouse look. The input system is a bit over-engineered, but it smooths over some oddities with GLFW’s input. Specifically, I ran into a problem with how the mouse position changes when the cursor is locked.

The key and mouse button handling is basically a thin wrapper over GLFW, just exposed through some entt signal handlers.
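
Something like this, although the names here are illustrative rather than the engine’s actual code:

#include <GLFW/glfw3.h>
#include <entt/signal/sigh.hpp>

//Sketch: forward GLFW key events through an entt signal.
entt::sigh<void(int, int)> keySignal;

void keyCallback(GLFWwindow* window, int key, int scancode, int action, int mods) {
    keySignal.publish(key, action);
}

void onKey(int key, int action) {
    //react to the key event
}

int main() {
    glfwInit();
    GLFWwindow* window = glfwCreateWindow(800, 600, "Input", nullptr, nullptr);
    glfwSetKeyCallback(window, keyCallback);

    entt::sink sink{keySignal};
    sink.connect<&onKey>(); //listeners attach through the sink

    while (!glfwWindowShouldClose(window)) {
        glfwPollEvents();
    }
    glfwTerminate();
}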

Just for comparison, this is about the same place that Hopson was after Day 1 of his project.

Day 6

I started adding the code to generate and render voxel chunks. Writing the meshing code was easy since I’ve done this before and knew of some abstractions that make it less error-prone.

One abstraction is creating a template class ChunkData<T, chunkSize> that defines a cube of type T that is chunkSize on each side. This class stores the data in a 1D array and handles indexing the data with a 3D coordinate. The size of each chunk is 16 x 16 x 16, so the underlying data is just an array of length 4096.
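
A minimal sketch of what this class might look like (the exact index order is an assumption; the real class may differ):

#include <array>
#include <cstddef>
#include <glm/glm.hpp>

//Sketch: a cube of T, chunkSize on each side, stored as a flat 1D array.
template <typename T, size_t chunkSize>
class ChunkData {
public:
    T& operator[](glm::ivec3 pos) {
        //assumed layout: x, then y, then z
        return m_data[pos.x + pos.y * chunkSize + pos.z * chunkSize * chunkSize];
    }

private:
    std::array<T, chunkSize * chunkSize * chunkSize> m_data;
};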

Another abstraction is creating a position iterator that generates coordinates from (0, 0, 0) to (15, 15, 15). These two classes ensure that the chunk data is iterated in a linear order, to improve cache locality. The 3D coordinate is still available for other operations that need it. For example:

for (glm::ivec3 pos : Chunk::Positions()) {
    auto& data = chunkData[pos];
    glm::ivec3 offset = ...;
    auto& neighborData = chunkData[pos + offset];
}

I have several static arrays which define offsets that are commonly used in the game. For example, Neighbors6 defines the 6 neighbors that a cube shares a face with.

static constexpr std::array<glm::ivec3, 6> Neighbors6 = {
    glm::ivec3(1, 0, 0),    //right
    glm::ivec3(-1, 0, 0),   //left
    glm::ivec3(0, 1, 0),    //top
    glm::ivec3(0, -1, 0),   //bottom
    glm::ivec3(0, 0, 1),    //front
    glm::ivec3(0, 0, -1)    //back
};

Neighbors26 is all neighbors that a cube shares a face, edge, or vertex with. That is, a 3x3x3 grid without the center. There are similar arrays for other sets of neighbors and for 2D neighbor sets.
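
For illustration, the Neighbors26 set could be generated like this (the engine actually defines it as a static array):

#include <vector>
#include <glm/glm.hpp>

//Sketch: every offset in a 3x3x3 grid except the center, 26 in total.
std::vector<glm::ivec3> makeNeighbors26() {
    std::vector<glm::ivec3> result;
    for (int x = -1; x <= 1; x++) {
        for (int y = -1; y <= 1; y++) {
            for (int z = -1; z <= 1; z++) {
                if (x == 0 && y == 0 && z == 0) continue;   //skip the center
                result.push_back(glm::ivec3(x, y, z));
            }
        }
    }
    return result;
}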

There is an array which defines the data needed to produce one face of a cube. The directions of each face in this array match the directions in the Neighbors6 array.

static constexpr std::array<FaceArray, 6> NeighborFaces = {
    //right face
    FaceArray {
        glm::ivec3(1, 1, 1),
        glm::ivec3(1, 1, 0),
        glm::ivec3(1, 0, 1),
        glm::ivec3(1, 0, 0),
    },
    ...
};

The mesher is then very simple. It walks over the chunk’s data and adds a face whenever a block is solid and its neighbor is not. It simply checks each face of each block in the chunk. This is the same as the “naive” method described here.

for (glm::ivec3 pos : Chunk::Positions()) {
    Block block = chunk.blocks()[pos];
    if (block.type == 0) continue;  //skip air blocks

    for (size_t i = 0; i < Chunk::Neighbors6.size(); i++) {
        glm::ivec3 offset = Chunk::Neighbors6[i];
        glm::ivec3 neighborPos = pos + offset;

        //NOTE: bounds checking omitted

        if (chunk.blocks()[neighborPos].type == 0) {
            //neighbor is air, so this face is visible
            const Chunk::FaceArray& faceArray = Chunk::NeighborFaces[i];
            for (size_t j = 0; j < faceArray.size(); j++) {
                m_vertexData.push_back(pos + faceArray[j]);
                //unsigned bytes: pos * 16 can reach 240, which overflows a signed byte
                m_colorData.push_back(glm::u8vec4(pos.x * 16, pos.y * 16, pos.z * 16, 0));
            }
        }
    }
}

I replaced the TriangleRenderer with the ChunkRenderer. I also added a depth buffer so the chunk mesh would render properly. Another edge has to be added to the render graph, between TransferNode and ChunkRenderer. This edge transfers queue family ownership of resources between the transfer queue and the graphics queue.

Then I updated the engine so that it can properly handle window resize events. This is simple in OpenGL, but rather involved in Vulkan. Since the swapchain must be explicitly created and has a fixed size, it must be recreated when the window is resized, along with all resources that depend on it.

Any commands that depend on the swapchain (which right now, is all of the drawing commands) must complete execution before the old swapchain is destroyed. This means the entire GPU must be stalled.

The graphics pipeline must be updated to allow dynamic viewport and scissor sizes.

The swapchain can’t be created at all if the window’s size is 0 on the X or Y-axis. This includes when the window is minimized. So the entire game pauses when this happens and only resumes when the window is restored.
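
Put together, the resize handling looks roughly like this (a sketch; recreateSwapchain is a hypothetical stand-in for the engine’s actual recreation path):

#include <vulkan/vulkan.h>
#include <cstdint>

void recreateSwapchain(VkDevice device, uint32_t width, uint32_t height) {
    //destroy the old swapchain and everything that depends on it,
    //then create a new one at the new size (details omitted)
}

void onResize(VkDevice device, uint32_t width, uint32_t height) {
    if (width == 0 || height == 0) return;  //minimized: pause until the window is restored

    vkDeviceWaitIdle(device);   //stall until every command using the old swapchain finishes
    recreateSwapchain(device, width, height);
}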

The mesh right now is just a 3D checkerboard. The RGB color of each vertex is set to its XYZ position multiplied by 16.

Day 7

I updated the game to handle multiple chunks at once, instead of just one. The chunks and their meshes are managed by entt’s ECS. Then I refactored the chunk renderer to render any chunks that are in the ECS. There’s still only one chunk, but I could add more if I felt like it.
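
The renderer’s loop over the ECS looks roughly like this (the Chunk and ChunkMesh component types here are placeholders, not the game’s real ones):

#include <cstdint>
#include <entt/entt.hpp>
#include <glm/glm.hpp>

struct Chunk { glm::ivec3 position; };      //placeholder component: block data
struct ChunkMesh { uint32_t vertexCount; }; //placeholder component: vertex buffers

//Sketch: render every entity that has both a Chunk and a ChunkMesh.
void renderChunks(entt::registry& registry) {
    auto view = registry.view<Chunk, ChunkMesh>();
    for (auto entity : view) {
        auto& mesh = view.get<ChunkMesh>(entity);
        //record draw commands for this chunk's mesh
        (void)mesh;
    }
}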

I refactored the mesh to allow its data to be updated after its creation. This will allow me to update a chunk’s mesh in the future, when I add the ability to add and remove blocks.

When you add or remove a block, the number of vertices in the mesh can potentially increase or decrease. The previously allocated vertex buffer can be reused if the new mesh is the same size or smaller. But if the mesh is larger, then new vertex buffers need to be created.

The previous vertex buffer cannot be destroyed immediately. There may be command buffers from previous frames still executing that depend on that specific VkBuffer object. The engine must keep that buffer alive until those command buffers finish. That is, if you draw a mesh on frame i, the GPU may be using that buffer until frame i + 2 starts. The buffer can’t be deleted on the CPU until the GPU is finished using it. So I updated the render graph to track resource lifetimes.

If a render graph node wants to use a resource (a buffer or image), it must call the sync method inside the preRender method. This method takes a shared_ptr to the resource, which makes sure the resource is kept alive while the command buffers that use it are executing. (This isn’t a good solution for performance reasons. More on that later.)
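
The idea in miniature (a sketch of the mechanism, not the engine’s actual code):

#include <memory>
#include <vector>

//Sketch: each in-flight frame holds shared_ptrs to every resource its
//command buffer uses, and drops them once the frame's fence signals.
struct FrameInFlight {
    std::vector<std::shared_ptr<void>> syncedResources;
};

void sync(FrameInFlight& frame, std::shared_ptr<void> resource) {
    frame.syncedResources.push_back(std::move(resource));   //keep alive while the GPU works
}

void onFrameComplete(FrameInFlight& frame) {
    frame.syncedResources.clear();  //references dropped; unused buffers can now be destroyed
}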

The chunk now has its mesh regenerated every frame.

Conclusion

That’s all for this week. I set up the basics for rendering a world with multiple voxel chunks. Check out week two of this one-week-long project. We’re only slightly behind schedule, so things are looking good.
