Releases: ConfettiFX/The-Forge
Release 1.48 - May 20th, 2021 - Aura | New FSL Shader Language Translator | Run-time API Switching | Variable Rate Shading | MSAA | OpenGL ES 2 Update | PVS Studio
This is our biggest update since we started this repository more than three years ago. This update is one of those "what we have learned from the last couple of projects that are using TF" updates and a few more things.
- Aura - Dynamic Global Illumination - we developed this system in the 2010 / 2011 time frame. It is hard to believe it is 10 years ago now :-) ... it shipped in Agents of Mayhem at some point and was implemented and used in other games. We are just putting the "base" version without any game specific modifications in our commercial Middleware repository on GitHub. The games that used this system made specific modifications to the code base to align with their art asset and art style.
By today's standards this system still fulfills the requirements of a stable rasterizer-based Global Illumination system. It runs efficiently on the original XBOX One, which was the original target platform, but might require art asset modifications in a game level.
It works with an unlimited number of light sources with minimal memory footprint. You can also cache the reflective shadow maps for directional, point and spotlights the same way you currently cache shadow maps. At some point we did a demo running on a second generation integrated Intel GPU with 256 lights that emitted direct and indirect light and had shadow maps in 2011 at GDC? :-)
It is best to integrate that system in a custom game engine that can cache shadow maps in an intelligent way.
Aura - Windows DirectX 12 GeForce 980 Ti 1080p Driver 466.47
Aura - Windows Vulkan GeForce 980 Ti 1080p Driver 466.47
Aura - Ubuntu Vulkan GeForce RTX 2080 1080p
Aura - PS4
Aura - XBOX One original
- Forge Shader Language (FSL) translator - after struggling with writing a shader translator for 1 1/2 years, we restarted from scratch. This time we developed everything in Python, because it is cross-platform. We also picked a really "low-tech keep it simple" approach. The idea is that a small game team can actually maintain the code base and write shaders efficiently. We wanted a shader translator that translates an FSL shader to the native shader language of each of the platforms. This way whatever shader compiler is used on that platform can take over the actual job of compiling the native code.
The reason why we are doing this lies mostly in the unreliability of DXC and SPIR-V in general and also their lack of reliability when it comes to cross-platform translation. There is a Wiki entry that holds an FSL language primer and general information on how this works here:
- Run-Time API Switching - we had some sort of run-time API switching in an early version of The Forge. At the time we were not expecting this to be very useful because most game teams do not switch APIs on the fly. In the meantime we found a use case on Android, where we have to reach a large number of devices. So we came up with a better solution that is more consistent with the overall architecture and works on at least PC and Android platforms.
On Windows PC one can switch between DX12, Vulkan and DX11 if all are supported. On Android one can switch between Vulkan and OpenGL ES 2.0. The latter allows us to target a much larger group of devices for business application frameworks. We could easily extend this architecture to other platforms like consoles.
This new API switching required us to change the rendering interfaces. So it is a breaking change to existing implementations, but we think it is not much effort to upgrade, and the resulting code is easier to read and maintain and improves the code base overall by being more consistent.
- Device Reset - This was implemented together with API switching. Windows forces game developers to respond to a crashing device driver by resetting the device. We already implemented the functionality in the last update here on GitHub. This update integrates it better into the OS base layer.
We also verified that the life cycle management for Windows in each application based on the IApp interface now works for device change, device reset and API switching, so that we can cover all cases of losing and recovering the device. The functions for API switching and device reload and reset are:
void onRequestReload();
void onDeviceLost();
void onAPISwitch();
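How an application might react to these three callbacks can be sketched as follows. This is a minimal, self-contained model and not The Forge's actual implementation: the `MockApp` type, the `RendererApi` enum, the request flags, and the policy of deferring the work to one unload / load cycle per frame are all assumptions made for illustration. Only the three callback names come from the list above.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical list of back ends; the notes name DX12, Vulkan and DX11 for Windows PC.
enum class RendererApi { D3D12, Vulkan, D3D11 };

// Hypothetical stand-in for an application; this is not The Forge's IApp.
struct MockApp
{
    RendererApi mActiveApi = RendererApi::D3D12;
    RendererApi mRequestedApi = RendererApi::D3D12;
    bool mReloadRequested = false;
    bool mDeviceLost = false;
    bool mApiSwitchRequested = false;
    bool mLoaded = false;
    std::vector<std::string> mLog;

    void Load()
    {
        assert(!mLoaded);
        mLoaded = true;
        mLog.push_back("Load");
    }

    void Unload()
    {
        assert(mLoaded);
        mLoaded = false;
        mLog.push_back("Unload");
    }

    // Assumption: the callbacks only record the request; the actual work is
    // deferred to a safe point in the frame.
    void onRequestReload() { mReloadRequested = true; }
    void onDeviceLost() { mDeviceLost = true; }
    void onAPISwitch() { mApiSwitchRequested = true; }

    // Called once per frame. Returns true if resources were recreated.
    bool processPendingRequests()
    {
        if (!mReloadRequested && !mDeviceLost && !mApiSwitchRequested)
            return false;

        Unload();
        if (mApiSwitchRequested)
            mActiveApi = mRequestedApi; // the new back end takes over before Load()
        Load();

        mReloadRequested = false;
        mDeviceLost = false;
        mApiSwitchRequested = false;
        return true;
    }
};
```

In this model all three events funnel into the same unload / load path, so an application only has to get one recovery path right.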
- Variable Rate Shading (VRS) - we implemented VRS in a new unit test 35_VariableRateShading. It is only supported by DirectX 12 on Windows and XBOX Series S / X.
In this demo, we demonstrate two main ways of setting the shading rate:
- Per-tile Shading Rate:
A shading rate lookup texture is generated on-the-fly. It is used for drawing the color palette that makes up the background. The rate decreases the further the pixels are located from the center. Artifacts become visible at aggressive rates, such as 4x4. There is also a slider in the UI to modify the center of the circle.
- Per-draw Shading Rate:
The cubes are drawn with a different shading rate. They follow the per-draw rate, which can be changed via the dropdown menu in the UI.
By using a combiner that overrides the screen rates, we ensure that the cubes are drawn at an independent rate.
The cubes are using per-draw shading rate while the background is using per-tile shading rate.
- Notes:
- There is a debug view showing the shading rates and the tiles' size.
- Per-tile method may not be available on certain GPUs even if they support the Per-draw method.
- The tile size is enforced by the GPU and is readable, as shown in the example.
- The shading rates available can vary based on the active GPU.
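The per-tile idea described above, where the rate gets coarser with distance from a movable center, can be sketched on the CPU as follows. This is only an illustration under stated assumptions: the `ShadingRate` enum, the `ShadingRateMap` type, the `buildShadingRateMap` function and the two distance thresholds are hypothetical, and the unit test generates its lookup texture on the GPU rather than with code like this.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical rate set; real hardware may expose more or fewer rates.
enum class ShadingRate : std::uint8_t { Rate1x1, Rate2x2, Rate4x4 };

struct ShadingRateMap
{
    std::uint32_t tilesX = 0;
    std::uint32_t tilesY = 0;
    std::vector<ShadingRate> rates; // row-major, one entry per screen tile
};

// Builds one shading rate per tile. The rate gets coarser the further the
// tile center is from (centerX, centerY), measured in pixels.
ShadingRateMap buildShadingRateMap(std::uint32_t width, std::uint32_t height,
                                   std::uint32_t tileSize, float centerX, float centerY)
{
    assert(width > 0 && height > 0 && tileSize > 0);

    ShadingRateMap map;
    map.tilesX = (width + tileSize - 1) / tileSize;  // round up partial tiles
    map.tilesY = (height + tileSize - 1) / tileSize;
    map.rates.resize(std::size_t(map.tilesX) * map.tilesY);

    const float diagonal = std::sqrt(float(width) * float(width) + float(height) * float(height));

    for (std::uint32_t y = 0; y < map.tilesY; ++y)
    {
        for (std::uint32_t x = 0; x < map.tilesX; ++x)
        {
            const float dx = (float(x) + 0.5f) * float(tileSize) - centerX;
            const float dy = (float(y) + 0.5f) * float(tileSize) - centerY;
            const float distance = std::sqrt(dx * dx + dy * dy) / diagonal;

            // Hypothetical thresholds, as a fraction of the screen diagonal.
            ShadingRate rate = ShadingRate::Rate4x4;
            if (distance < 0.15f)
                rate = ShadingRate::Rate1x1;
            else if (distance < 0.35f)
                rate = ShadingRate::Rate2x2;

            map.rates[std::size_t(y) * map.tilesX + x] = rate;
        }
    }
    return map;
}
```

The tile size is a parameter here because, as noted above, it is enforced by the GPU and has to be queried rather than chosen.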
- Multi-Sample Anti-Aliasing (MSAA) - we added a dynamic way of picking MSAA to unit test 9 and the Visibility Buffer example on all platforms.
- Android & OpenGL ES 2 - the OpenGL ES 2 layer for Android is now more stable and tested and closer to production code. As mentioned above, on an Android phone one can switch between Vulkan and OpenGL ES 2 dynamically if both are supported.
Android & OpenGL ES 2 now additionally support unit test 17 - Entity Component System Test.
In general we are testing many Android phones at the moment, following the two Android projects we are currently working on, which sit at the low and high ends of the spectrum.
- PVS Studio - we did another manual pass on the code base with PVS Studio, a static code analyzer, to increase code quality.
Release 1.47 - December 18th, 2020 - OpenGL ES 2.0 Android support | Device Reset Support | DRED / Breadcrumb support | Lua driven functional tests | DX11 refactor | YUV support through Vulkan
As the year slowly winds down, we finally found time to do another release. First of all, Happy Holidays and a Happy New Year!
Most of us will take time off over the holiday season and spend time with our families. We should be back online in the middle of January 2021.
- OpenGL ES 2.0: TF will probably run on several hundred million mobile devices in the future. It will be the rendering layer of business application frameworks. For this use case, we added OpenGL ES 2.0 support only for Android. The OpenGL ES 2.0 layer only supports unit tests 1, 5, 12 and 31 at the moment.
- Device change / reset: we finally implemented all the code that can deal with device changes, device resets or device removed scenarios on all platforms. The underlying design was always there but it took us 3+ years to finally add the functionality :-)
When you go into any of the *OSBase.* files you can find a snippet of code that looks like this:
if (pApp->mSettings.mResetGraphics)
{
    pApp->Unload();
    pApp->Load();
    pApp->mSettings.mResetGraphics = false;
}
- DRED / Breadcrumb support: to be able to better tell what the reason behind a removed device is, we implemented DRED support on PC with DirectX 12 and XBOX. We integrated this into the first functional test 01_Transformations. Here is a screenshot. Look for the "Simulate crash" button:
Breadcrumbs are user-defined markers used to pinpoint which command has caused the GPU to stall.
In the Breadcrumb unit test, two markers get injected into the command list.
Pressing the crash button would result in a GPU hang.
In this situation, the first marker would be written before the draw command, but the second one would stall for the draw command to finish.
Due to the infinite loop in the shader, the second marker won't be written, and we can reason that the draw command has caused the GPU to hang.
We log the markers' information to verify this.
Check out this link for more info: D3D12 Device Removed Extended Data (DRED)
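The reasoning behind the two markers can be modeled on the CPU with a small simulation. This is a sketch under stated assumptions and does not use the D3D12 or DRED API: the `Command` type and the functions `executeUntilHang` and `findSuspectCommand` are hypothetical names, and the model assumes that every draw is followed by a marker.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

enum class CommandType { WriteMarker, Draw };

// Hypothetical, simplified command record.
struct Command
{
    CommandType type;
    std::uint32_t markerValue; // only used by WriteMarker
    bool hangs;                // only used by Draw: models the infinite loop in the shader
};

// Plays the command list in order and returns the marker values that were
// written. Execution stops at the first draw that hangs.
std::vector<std::uint32_t> executeUntilHang(const std::vector<Command>& commands)
{
    std::vector<std::uint32_t> written;
    for (const Command& cmd : commands)
    {
        if (cmd.type == CommandType::WriteMarker)
            written.push_back(cmd.markerValue);
        else if (cmd.hangs)
            break; // nothing after this point is ever reached
    }
    return written;
}

// Returns the index of the draw that sits right before the first marker that
// was NOT written, or -1 if every marker was written.
int findSuspectCommand(const std::vector<Command>& commands, std::size_t writtenMarkerCount)
{
    std::size_t markersSeen = 0;
    int lastDraw = -1;
    for (std::size_t i = 0; i < commands.size(); ++i)
    {
        if (commands[i].type == CommandType::Draw)
        {
            lastDraw = int(i);
            continue;
        }
        if (markersSeen == writtenMarkerCount)
            return lastDraw; // first missing marker: blame the draw before it
        ++markersSeen;
    }
    // The caller cannot report more written markers than the list contains.
    assert(markersSeen == writtenMarkerCount);
    return -1;
}
```

With the list marker, draw, marker and a hanging draw, only the first marker is written, so the draw at index 1 is reported as the suspect.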
- More Lua Scripting support for all functional tests:
- For the scripted testing of the Unit Tests, this layer provides automated function registration of the UI elements to Lua State.
- Any UI element added to the GUI will add a function or a pair of functions (getter / setter) to the Lua state for use in any script.
Lua function name resolution will work like this:
- UI Widget "label" name will be included in the function name as follows,
- For Widget events: label name + "Event Name". e.g., Lua Function name for label - "Press", and event - OnEdited : "PressOnEdited"
- For Widget modifiers such as ints / floats: "Set" and "Get" function pair will be added as a prefix to label name e.g., "X" variable will have "SetX" and "GetX" pair of functions.
- After writing the scripts, you can let the layer know about the scripts using AddTestScripts() function call and run them on any frame by RunTestScript() defined in UIApp class. There are examples of these test scripts in most of the UTs showing how you can also add these scripts to UI and test them on runtime.
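The naming rules listed above can be written down as three small string helpers. This is a sketch only: the helper names are hypothetical and not part of The Forge, and it assumes that labels are used verbatim, without any stripping of spaces or special characters.

```cpp
#include <cassert>
#include <string>

// Widget events: label name + event name, e.g. "Press" + "OnEdited".
std::string luaEventFunctionName(const std::string& label, const std::string& eventName)
{
    assert(!label.empty() && !eventName.empty());
    return label + eventName;
}

// Widget modifiers such as ints / floats: "Set" + label name.
std::string luaSetterName(const std::string& label)
{
    assert(!label.empty());
    return "Set" + label;
}

// Widget modifiers such as ints / floats: "Get" + label name.
std::string luaGetterName(const std::string& label)
{
    assert(!label.empty());
    return "Get" + label;
}
```

For the examples given above, `luaEventFunctionName("Press", "OnEdited")` yields `PressOnEdited`, and a widget labeled `X` yields `SetX` and `GetX`.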
Here is what the current Lua support in the functional tests might look like:
- DX11 refactor: we re-wrote the DX11 run-time a few times. We ended up with the most straightforward version. This version only recently shipped in Hades along with the Vulkan run-time on PC.
- YUV support: we now have YUV support for all our Vulkan API platforms: PC, Linux, Android and Switch. There is a new functional test for YUV. It runs on all these platforms:
- Audio: we removed the audio functional test. It was the only test that was released unfinished and didn't run on all our platforms. Our customers show love for FMOD ... it would make more sense to show an integration of that.
- GitHub issues fixed:
Numerous other fixes ...
Release 1.46 - October 1st, 2020 - Supergiant's Hades | Windows Management | AMD FX Stochastic SS Reflection
- Supergiant's Hades: we have been working with Supergiant since 2014. One of the on-going challenges was that their run-time was written in C#. At the beginning of last year, we offered to help them build a new cross-platform game engine in C/C++ from scratch with The Forge. The project started in April 2019 and the first version of this new engine launched in May this year. Hades was then released for Microsoft Windows, macOS, and Nintendo Switch on September 17, 2020. The game can run on all platforms supported by The Forge.
Here is a screenshot of Hades running on Switch:
Here is an article by Forbes about Hades being at the top of the Nintendo Switch Charts.
Hades is also a technology showcase for Intel's integrated GPUs on macOS and Windows. The target group of the game seems to often own those GPUs.
- Windows management: there is a new functional test named 32_Window that demonstrates windows management on Windows, Linux and macOS.
- The window layout, position, and size are now driven by the client dimensions, meaning that the values that the client demands are the exact values the client area will be represented with, regardless of the window style. This allows for much greater flexibility and consistency, especially when working with a fullscreen window.
- Multi-monitor support has also been improved significantly, offering smooth consistent transitions between client displays and guaranteeing correct window behavior and data retention. Media layer functionality has been expanded, allowing the client to control mouse positioning, mouse visibility, and mouse visual representation.
- It is now possible to create independent mouse cursors to further customize the application.
Here are the screenshots:
- Screen-Space reflections: we renamed the functional test "10_PixelProjectedReflections" to 10_ScreenSpaceReflections. You now have two choices: you can pick either Pixel Projected Reflections or AMD's FX Stochastic Screen Space Reflection. We just made AMD's FX code cross-platform. It now runs on Windows, Linux, macOS, Switch, PS and XBOX.
Here are the screenshots:
- Resolved GitHub issues:
- Issue #183 - VERTEX_ATTRIB_RATE_INSTANCE ignored on macOS 10.12, iOS 10.0
Release 1.44 - July 16th, 2020 - Android | Linux
- Mobile Devices: DPI scaling is properly handled now so we shouldn't see messed up UI anymore on mobile devices
- Android: the following Unit-tests are now included for Android:
- gamepad support: tested with PS4 controller
- sample size reduction
- proper closing of apps with the back button
- proper handling of vSync
- .zip filesystem handling
- shader compile #include directive support
- overall stability improvements
- improved swapchain creation process and proper handling of current frame index
- Linux:
- Window management is improved
- Borderless fullscreen is supported
- Implemented full screen toggle (usually alt-enter)
- Cursor position is now correct
- Camera movement with mouse now works properly
- Resources are freed properly
Release 1.43 - May 22nd, 2020 - MTuner | macOS / iOS run-time
- Filesystem: it turns out the file system is still confusing and not intuitive. It mixes up several concepts but is not consistent and somehow favors Windows folder naming conventions, which do not exist on most of our target platforms. We took a small first step with this release. We need to make a deeper change with the next release.
- DirectX 11: the DirectX 11 run-time gets a lot of mileage now. For one game it has now successfully gone through a test center. This release holds a wide range of changes, especially for multi-threaded rendering.
- MTuner: we are making another attempt at integrating MTuner into the framework. We need it to tune memory usage in some game titles. The current version only reliably supports Windows but we are trying to extend it to more platforms.
- Integrated Milos Tosic’s MTuner SDK into the Windows 10 runtime of The Forge. Combined with mmgr, this addition will provide the following features:
- Automatic generation of .MTuner capture file alongside existing .memleaks file.
- In-depth analysis of the generated file using MTuner’s user-friendly UI app.
- Clear and efficient highlighting of memory leaks and usage hotspots.
- Support for additional platforms coming soon!
MTuner
MTuner was integrated into the Windows 10 runtime of The Forge following a request for more in-depth memory profiling capabilities by one of the developers we support. It has been adapted to work closely with our framework and its existing memory tracking capabilities to provide a complete picture of a given application’s memory usage.
To use The Forge’s MTuner functionality, simply drag and drop the .MTuner file generated alongside your application’s executable into the MTuner host app, and you can immediately begin analyzing your program’s memory usage. The intuitive interface and exhaustive supply of allocation info contained in a single capture file make it easy to identify usage patterns and hotspots, as well as to track memory leaks down to the file and line number. The full documentation of MTuner can be found [here](https://milostosic.github.io/MTuner/).
Currently, this feature is only available on Windows 10, but support for additional platforms provided by The Forge is forthcoming.
Here is a screenshot of an example capture done on our first Unit Test, 01_Transformations:
- Multi-Threading system: especially for the Switch run-time we extended our multi-threading system to support a "preferred" core.
- macOS / iOS run-time got another makeover. This time we brought the overall architecture a bit closer to the rest of the rendering system and we are also working towards supporting lower end hardware like a 2015 MacBook Air and macOS 10.13.6. Those requirements were based on the Steam Hardware Survey.
- there are now functions that help you calculate the memory usage supported by all APIs: look for calculateMemoryUse / freeMemoryStats
- Windows management got a bit more flexible by offering borderless windows and more style attributes
- 17_EntityComponentSystem now runs better on AMD CPUs ... there were some inefficiencies in the unit test ...
Release 1.42 - April 15th, 2020 - macOS / iOS run-time
Most of us are working from home now due to the Covid-19 outbreak. We are all trying to balance life and work in new ways. Since the last release we made a thorough pass through the macOS / iOS run-time, so that it is easier to make macOS your main development environment for games.
Unit-tests fixes:
- Fixed wrong project ordering in XCode
- Fixed build-time macOS / iOS warnings.
- Fixed 03_Multithread unit test not showing the appropriate charts for all threads
- Fixed warnings in 06_MaterialPlayground due to wrong GLTF validation
- Fixed visual artifacts in 08_GLTFViewer not producing correct normals due to wrong Metal shader.
- Added missing projects to macOS / iOS workspace and removed unnecessary ones.
- Fixed unit test 14_WaveIntrinsics macOS on AMD iMac. Implemented workaround for AMD driver issue on macOS.
Metal runtime fixes:
- Fixed Metal issue handling barriers: scheduled barriers were being ignored, introducing visual artifacts due to read-write race condition.
Closed Issues:
Release 1.41 - March 5th, 2020 - Path Tracing Benchmark | CPU Cacheline alignment | Improved Profiler | D3D12 Memory Allocator
- Based on requests, we are providing a Path Tracing Benchmark in 16_RayTracing. It allows you to compare the performance of three platforms:
- Windows with DirectX 12 DXR
- Windows with Vulkan RTX
- Linux with Vulkan RTX
We believe that every benchmarking tool should be open-source, so that everyone can see what the source code is doing. We will extend this benchmark to the non-public platforms we support to compare the PC performance with console performance.
The benchmark comes with batch files for all three platforms. Each run generates an HTML output file from the microprofiler that is integrated in TF. The default number of iterations is 64 but you can adjust that. There is a Readme file in the 16_RayTracing folder that describes the options.
Windows DirectX 12 DXR, GeForce RTX 2070 Super, 3840x1600, NVIDIA Driver 441.99
Windows Vulkan RTX, GeForce RTX 2070 Super, 3840x1600, NVIDIA Driver 441.99
Linux Vulkan RTX, GeForce RTX 2060, 1920x1080, NVIDIA Driver 435.21
We will adjust the output of the benchmark to what users request.
- With this release we also aligned the whole renderer interface better to 64 byte CPU cache lines. We trimmed down all the structs substantially and removed many. This is a breaking change for the renderer interface and a major change to the whole code base.
- DirectX 12
- D3D12 Memory Allocator: we are now using AMD's D3D12 memory allocator for DirectX after having used the Vulkan equivalent for more than two years. We also extended it to support Multi-GPU.
- We upgraded to the latest dxgi factory interface in DirectX 12
- Microprofiler: because we need the microprofiler to offer the QA department help in reporting performance problems for some of the games that will be shipping with TF (and the benchmark mentioned above), we did another pass on its functionality and ease of use, especially on console platforms. The idea is that QA can quickly and easily store a screenshot or HTML file in a bug report. This is still work in progress and with every shipping game will probably be improved.
- Now that GDC 2020 was postponed, we will also postpone our GDC related activities. The user meeting and our GDC talk will be postponed until the next GDC happens. If there is a need we can also do a user meeting in an online conference room or in Discord in a private area. Let us know.
- Renamed CustomMiddleware back to Custom-Middleware ...
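The alignment of the renderer interface to 64 byte CPU cache lines mentioned in this release can be illustrated with a small sketch. The `HotBufferState` struct and its members are hypothetical and are not taken from The Forge's renderer interface. The point is only to show how `alignas` and `static_assert` can keep a frequently accessed struct within exactly one cache line.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

constexpr std::size_t kCacheLineSize = 64; // 64 byte cache lines, as stated in the notes

// Hypothetical "hot" per-resource state, padded and aligned to one cache line.
struct alignas(kCacheLineSize) HotBufferState
{
    std::uint64_t mSize;
    std::uint64_t mGpuAddress;
    void*         mCpuMappedAddress;
    std::uint32_t mFlags;
    std::uint32_t mStride;
};

// Compilation fails as soon as someone grows the struct past one cache line.
static_assert(sizeof(HotBufferState) == kCacheLineSize, "HotBufferState must fit one cache line");
static_assert(alignof(HotBufferState) == kCacheLineSize, "HotBufferState must be cache line aligned");

// Number of cache lines an aligned object of the given size spans.
std::size_t cacheLinesTouched(std::size_t structSize)
{
    assert(structSize > 0);
    return (structSize + kCacheLineSize - 1) / kCacheLineSize;
}
```

Trimming a struct from, say, 200 bytes down to 64 bytes reduces the number of cache lines touched per access from four to one in this model.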
Release 1.40 - February 20th, 2020 - Resource Loader | glTF as Geometry Container | GDC Talk | User Group Meeting
This release took much longer than expected ... :-)
- We are going to give a talk at GDC during the GPU Summit day. It will cover our skydome system Ephemeris 2: GDC 2020 Ephemeris
- We will also have a user group meeting during GDC: The Forge User Group
- A new resource loader can now stream textures, buffers and additionally geometry (extracted from glTF) asynchronously. We replaced assimp with this loader to save compile time and space on GitHub. We still use assimp for our internal tools. Here are the underlying design principles of the resource loader:
- Generally glTF is just a geometry container for us. We do not apply any of the underlying principles like material or mesh or scene management that it offers because they are not tailored to our needs. The resource loader only loads a glTF file, extracts its geometry and stores this data (including hair and ozz animation system data) in a vertex and index buffer stream.
- All texture loading and material loading is the responsibility of the app. Scene partitioning or material support is not used from glTF. Those remain on the App level. Each app has its own lighting and material models and it shouldn't be restricted to the very limiting architecture of glTF.
- There is no glTF code in any of the unit tests or app examples with the exception of the glTF viewer. The resource loader loads geometry with just an addResource call, the same way it loads textures and buffers ... it can generate a vertex and index buffer stream with offset values for draw calls or for ExecuteIndirect ...
- All model art assets were converted to glTF
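The idea of a single vertex and index buffer stream with per-mesh offset values can be sketched as follows. This is a simplified model and not the resource loader's code: the types `MeshData`, `DrawArgs` and `GeometryStream` and the function `packMeshes` are hypothetical, and a real loader would carry more vertex attributes than a position.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

struct Vertex { float position[3]; };

// Hypothetical geometry extracted from one mesh of a glTF file.
struct MeshData
{
    std::vector<Vertex> vertices;
    std::vector<std::uint32_t> indices; // relative to this mesh's first vertex
};

// Offsets a draw call (or an indirect argument buffer) would need.
struct DrawArgs
{
    std::uint32_t indexCount;
    std::uint32_t startIndex;   // first index inside the shared index buffer
    std::uint32_t vertexOffset; // value added to every index of this mesh
};

struct GeometryStream
{
    std::vector<Vertex> vertexBuffer;
    std::vector<std::uint32_t> indexBuffer;
    std::vector<DrawArgs> draws;
};

// Appends all meshes into one vertex stream and one index stream and records
// where each mesh starts.
GeometryStream packMeshes(const std::vector<MeshData>& meshes)
{
    GeometryStream stream;
    for (const MeshData& mesh : meshes)
    {
        for (std::uint32_t index : mesh.indices)
            assert(index < mesh.vertices.size()); // indices must stay mesh-local

        DrawArgs args;
        args.indexCount = std::uint32_t(mesh.indices.size());
        args.startIndex = std::uint32_t(stream.indexBuffer.size());
        args.vertexOffset = std::uint32_t(stream.vertexBuffer.size());

        stream.vertexBuffer.insert(stream.vertexBuffer.end(), mesh.vertices.begin(), mesh.vertices.end());
        stream.indexBuffer.insert(stream.indexBuffer.end(), mesh.indices.begin(), mesh.indices.end());
        stream.draws.push_back(args);
    }
    return stream;
}
```

Keeping the indices mesh-local and storing a vertex offset per draw means the index data never has to be rewritten when meshes are appended.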
- libzip was replaced with zip because it is easier to maintain.
- Console support: at the end of last year before our three week break, we made the PS4 and Switch run-times ready to ship games (we will see first games shipping this year). We also started on the PS5 and XBOX Series X support. You need to be an accredited developer to receive the source code for any consoles. We will be asking the console owner for permission before we provide you with any source code. That means you have to be part of their developer program.
- Improved Windows 7 support: one of the games TF is launching with requires Windows 7 support. This means we are now testing the Windows 7 run-time more rigorously and committed fixes with this release
- Math library: added missing vec2 functions
- Updated copyright statement
- Resolved issues on GitHub:
- issue 162 - 13_UserInterface - Crash
- issue 161 - 18_VirtualTexture breaks with dx and vk: only fairly decent cards support virtual textures. We added tracking support in the *.cfg system and throw an error message when the GPU doesn't support the feature.
- issue 124 - Missing KeyKpAdd mapping
Release 1.45 - July 29th, 2020 - TressFX | File System Rewrite
- TressFX: we upgraded TressFX a bit and retuned the lighting.
- File system: our old file system was designed more for tools or Windows applications than for games. It consumed more memory than the whole rendering system and used Windows file methods extensively. That is the opposite of what you want in a game. It took us several months to correct the mistake and come up with a file system that is tailored towards games. That means that the interface changed substantially. Thanks to all those who pointed this out. Sometimes it takes a couple of iterations to land on a design that is efficient.
If you look at the new interface there are still path related functions in there. They will be removed step-by-step.
Please check out the new file system interface and let us know what you think.
- Android Vulkan: validation layer is now supported
Release 1.39 - November 26th - Sparse Virtual Texture Support | Stormland
The Forge now has support for Sparse Virtual Textures on Windows and Linux with DirectX 12 / Vulkan. Sparse texture (also known as "virtual texture", "tiled texture", or "mega-texture") is a technique for working with huge textures (such as 16k x 16k or more) in GPU memory.
It breaks the original texture down into small square or rectangular tiles so that only the visible parts need to be loaded.
The unit test 18_Virtual_Texture is using 7 sparse textures:
- Mercury: 8192 x 4096
- Venus: 8192 x 4096
- Earth: 8192 x 4096
- Moon: 16384 x 8192
- Mars: 8192 x 4096
- Jupiter: 4096 x 2048
- Saturn: 4096 x 4096
There is a unit test that shows a solar system where you can approach planets with Sparse Virtual Textures attached and the resolution of the texture will increase when you approach.
Linux 1080p NVIDIA RTX 2060 Vulkan Driver version 435
Windows 10 1080p AMD RX550 DirectX 12 Driver number: Adrenaline software 19.10.1
Windows 10 1080p NVIDIA 1080 Vulkan Driver number: 418.81
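The tile arithmetic behind a sparse texture can be sketched in a few lines. This is an illustration under stated assumptions: the `TileGrid` type and the helper functions are hypothetical, and the 128 x 128 texel tile size used in the usage note is an assumed value, since the actual tile shape depends on the API, the GPU and the texture format.

```cpp
#include <cassert>
#include <cstdint>

struct TileGrid
{
    std::uint32_t tilesX;
    std::uint32_t tilesY;
};

// Splits a texture into tiles; partial tiles at the border count as full tiles.
TileGrid computeTileGrid(std::uint32_t width, std::uint32_t height,
                         std::uint32_t tileWidth, std::uint32_t tileHeight)
{
    assert(width > 0 && height > 0 && tileWidth > 0 && tileHeight > 0);
    TileGrid grid;
    grid.tilesX = (width + tileWidth - 1) / tileWidth;
    grid.tilesY = (height + tileHeight - 1) / tileHeight;
    return grid;
}

std::uint64_t tileCount(const TileGrid& grid)
{
    return std::uint64_t(grid.tilesX) * grid.tilesY;
}

// Number of tiles overlapped by the texel rectangle [x0, x1) x [y0, y1).
// Only these tiles would need to be resident if the rectangle is all that is visible.
std::uint64_t tilesTouchedByRegion(std::uint32_t tileWidth, std::uint32_t tileHeight,
                                   std::uint32_t x0, std::uint32_t y0,
                                   std::uint32_t x1, std::uint32_t y1)
{
    assert(tileWidth > 0 && tileHeight > 0 && x1 > x0 && y1 > y0);
    const std::uint32_t firstX = x0 / tileWidth;
    const std::uint32_t lastX = (x1 - 1) / tileWidth;
    const std::uint32_t firstY = y0 / tileHeight;
    const std::uint32_t lastY = (y1 - 1) / tileHeight;
    return std::uint64_t(lastX - firstX + 1) * (lastY - firstY + 1);
}
```

Under the assumed 128 x 128 tile size, the 16384 x 8192 Moon texture splits into 128 x 64 = 8192 tiles, while a visible 1024 x 512 texel region of it touches only 32 of them.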
Ephemeris 2 - the game Stormland from Insomniac was released. This game is using a custom version of Ephemeris 2. We worked for more than six months on this project.
Head over to Custom Middleware to check out the source code.