FEX 2207 Tagged!

This is going to be a very interesting release this month for users. Quite a large number of features landed for this release!

Automatic TSO mode migration

When FEX is running a single-threaded application, we can be optimistic and disable the heavy TSO-emulation features. This significantly speeds up some single-threaded applications. Once the program creates a thread, FEX disables this optimization and clears its code cache to be safe.
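As a rough mental model of that migration, here is a toy sketch. The `Emulator` class, its method names, and the string "code" it produces are all hypothetical and are not FEX's actual internals; the only point is that code compiled under the relaxed mode must be thrown away when the first extra thread appears.

```python
class Emulator:
    """Toy model of the optimistic single-threaded mode (hypothetical names)."""

    def __init__(self):
        self.tso_enabled = False  # optimistic: a lone thread needs no TSO emulation
        self.code_cache = {}      # guest address -> "compiled" code (a string here)

    def compile_block(self, guest_addr):
        # Compile on first use, tagging the block with the mode it was built for.
        if guest_addr not in self.code_cache:
            flavor = "tso" if self.tso_enabled else "relaxed"
            self.code_cache[guest_addr] = f"{flavor}-code@{guest_addr:#x}"
        return self.code_cache[guest_addr]

    def on_thread_create(self):
        # The first extra thread ends the optimistic mode. Everything compiled
        # without TSO emulation is now unsafe, so the whole cache is dropped.
        if not self.tso_enabled:
            self.tso_enabled = True
            self.code_cache.clear()
```

After `on_thread_create()` runs, the next `compile_block()` call for the same address rebuilds the block in the TSO flavor.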

EroFS rootfs image support

FEX has supported SquashFS for a long time, and we are now adding support for EroFS as well. The big advantage of EroFS is that it doesn’t serialize accesses to a single thread. When dozens of threads are accessing the filesystem, that serialization is a real bottleneck. Low-end devices would end up with a single CPU core maxed out inside the squashfuse application while multiple threads were trying to request data.

erofsfuse solves this by allowing multi-threaded decompression that scales quite well depending on the number of file requests in flight. We can see how this scales in the following benchmark graphs.

As one can see, erofsfuse scales quite well with multiple threads, while squashfuse stays pretty much flat the entire time. The downside of EroFS is that the compression ratio of its LZ4HC compression isn’t quite as good as ZSTD, causing the rootfs to be larger. But the reduction in memory usage and read amplification, plus the higher bandwidth, is worth it. It also seriously improves performance when using a rootfs over a network-mapped share, like some people do.

An additional problem is that the erofsfuse application requires erofs-utils version 1.5, which came out on 2022-06-13. This is really bleeding edge currently.

Nevertheless, FEXRootFSFetcher will now allow you to download a FEX rootfs image with this compression format. Just ensure you have erofsfuse installed.

FEXServer

This is a fairly significant change to how FEX-Emu operates in the background. Similar to how Wine has a wineserver, FEX now requires a FEXServer to always be running.

For now, FEXServer takes over duty for rootfs image mounting and acts as a logging server. In the future this will be expanded to also handle code caching services and more. FEXServer starts automatically when FEX is invoked and keeps running in the background until all instances of FEX close.

Pressure-vessel and Proton Fixes

FEX-Emu now officially works inside of pressure-vessel. This is the tool that Steam uses for running Proton games. Thunking doesn’t yet work in this case but it is coming.

If you want to test Proton games, make sure to sign up for the latest SteamLinuxRuntime_soldier beta in the settings and give it a go. It’s not currently the speediest, but it should work.

Disable FEXServer rootfs when running under pressure-vessel

Pressure-vessel sets up an x86-64 rootfs. FEX shouldn’t be using the FEXServer-provided rootfs in this case. We now detect when running inside of pressure-vessel and disable the FEXServer rootfs.

Enable Hypervisor bit

This change allows pressure-vessel to detect FEX-Emu and do FEX specific setup for games.

Fix open syscall path emulation

The open syscall is fairly rarely used, so this has gone unnoticed for a while. We weren’t wrapping this syscall in our filesystem emulation, which was preventing applications from running. With this fixed, the latest Proton Experimental branch from Steam now works!
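One simplified picture of what "wrapping" a path-taking syscall can mean is sketched below. The helper names `resolve_guest_path` and `emulated_open`, and the rule "prefer the copy inside the rootfs if it exists", are assumptions made for illustration; the real lookup rules in FEX may differ.

```python
import os


def resolve_guest_path(rootfs, path):
    """Return the rootfs copy of an absolute path if one exists, else the path itself.

    Hypothetical helper: illustrates path redirection, not FEX's exact rules.
    """
    if os.path.isabs(path):
        candidate = os.path.join(rootfs, path.lstrip("/"))
        if os.path.lexists(candidate):
            return candidate
    return path


def emulated_open(rootfs, path, flags):
    # Every path-taking syscall has to go through the same redirection.
    # A syscall that skips it sees the host filesystem instead of the rootfs.
    return os.open(resolve_guest_path(rootfs, path), flags)
```

A syscall that bypasses a helper like this would look for x86-64 libraries and configuration on the host and fail to find them.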

Support thunks in pressure-vessel

Pressure-vessel overrides a bunch of environment variables to change where libraries are located inside of its chroot. FEX now supports this. While this is a step toward getting thunks working inside of pressure-vessel, they are not yet supported there.

Thunks

There are lots of improvements to thunks, too many to capture them all. A heavy amount of infrastructure work is going on here to make thunks more robust and stable, starting with Vulkan and GL.

  • Work around lack of generic callback support in VK_EXT_debug_report (4771a340)
    • Disable debug report callback (751b66d4)
  • Allow building thunks on a wider range of platforms (ad6fd5ab)
  • Add fex:is_lib_loaded (88b94bef)
  • Support returning host function pointers to the guest (04a1ac96)

Fix clone3 syscall’s stack pointer again

In an edge case of how FEX-Emu handles clone3, it wasn’t handling the stack pointer size correctly again. Resolving this edge case once again gets Steam’s web helper working with glibc 2.34.

Fix 32-bit memory allocation range scanning

When scanning for free chunks of memory in the 32-bit range, FEX-Emu needs to use a custom allocator to ensure everything returned ends up in the lower 32-bit memory space. This fixes a bug where large allocations would never find an empty space. Fixes X-Plane 11!
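The general shape of such a scan can be sketched as a first-fit search over the existing mappings. The function name, the `(start, end)` tuple representation, and the default `low` bound below are assumptions for illustration and do not mirror FEX's allocator.

```python
def find_free_range(mappings, size, low=0x10000, high=1 << 32):
    """Return the lowest address of a free gap of at least `size` bytes, or None.

    `mappings` is a list of half-open (start, end) ranges already in use.
    Only addresses in [low, high) are considered, so every result fits in
    the 32-bit address space when `high` is 1 << 32.
    """
    cursor = low
    for start, end in sorted(mappings):
        if start >= high:
            break  # everything from here on is above the 32-bit range
        if start - cursor >= size:
            return cursor  # the gap before this mapping is big enough
        cursor = max(cursor, end)
    # Check the tail gap between the last mapping and the top of the range.
    return cursor if high - cursor >= size else None
```

Checking the tail gap after the loop matters for large requests: no gap between existing mappings may be big enough, even when plenty of space remains above the last one.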

Optimize file descriptor to filename mapping

It is a common occurrence that FEX needs to map an open file descriptor back to a file path. This used to take 14 system calls. Since each system call was querying filesystem metadata, these could take some time. With the optimized approach it now takes only one system call, significantly lowering file IO overhead!
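The post doesn't say which system call is used, so the following is only an assumption: on Linux, a common way to get from a file descriptor to a path in a single call is to read the symlink under /proc/self/fd. The function name `fd_to_path` is made up for this sketch.

```python
import os


def fd_to_path(fd):
    """Map an open file descriptor to its path with one readlink call (Linux only)."""
    # /proc/self/fd/<n> is a symlink the kernel maintains for every open descriptor.
    return os.readlink(f"/proc/self/fd/{fd}")
```

For example, `fd_to_path(f.fileno())` on a regular open file returns its absolute path. Descriptors that aren't backed by a file, such as pipes and sockets, come back as pseudo-names like `pipe:[12345]`.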

Enable Wine application profiles

Wine applications typically only showed up to FEX-Emu as wine or wine-preloader while executing. We now work around this by scanning the arguments to find the executable name, which allows application profiles to function. Now we can easily support SonicMania.exe.json!
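A minimal sketch of that argument scan follows. The launcher list and the "first argument ending in .exe" rule are assumptions made for illustration; the actual matching logic in FEX may be more involved.

```python
import ntpath

# Hypothetical list of launcher names. The post only mentions wine and wine-preloader.
WINE_LAUNCHERS = {"wine", "wine64", "wine-preloader", "wine64-preloader"}


def profile_name(argv):
    """Pick the application profile file name for a command line."""
    # ntpath.basename splits on both "/" and "\", so it handles Linux and Windows paths.
    program = ntpath.basename(argv[0])
    if program in WINE_LAUNCHERS:
        # Look past the launcher for the Windows executable it is running.
        for arg in argv[1:]:
            if arg.lower().endswith(".exe"):
                program = ntpath.basename(arg)
                break
    return program + ".json"
```

With this, a command line such as `wine C:\Games\SonicMania.exe` resolves to `SonicMania.exe.json` rather than `wine.json`.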

FEXRootFSFetcher fix to file hashing

It was discovered that this tool was hashing files incorrectly. The new version now hashes correctly, and the image files have been updated to use the new hash. Nothing to see here.

Fix 32-bit DRM ioctl DRM_IOCTL_WAIT_VBLANK

This ioctl does exactly what it says on the tin. Due to a copy and paste error, this wasn’t actually waiting on vblank.

Fix 32-bit ioctl structure copying

A feature of the DRM subsystem allows you to extend ioctl struct definitions safely. The kernel knows the size of the ioctl structure and if it differs from what the userspace application passes in, then it will only copy the smaller amount of data and zero out the rest. This allows older userspaces to safely work with newer kernels. FEX wasn’t reproducing this with its ioctl emulation in some V3D ioctls, resulting in unsafe execution of ioctls. This has been resolved.
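The copy rule described above can be illustrated on plain byte strings. This is a sketch of the rule only; `copy_ioctl_struct` is a made-up name, and real ioctl emulation works on guest memory rather than Python bytes.

```python
def copy_ioctl_struct(user_data: bytes, kernel_size: int) -> bytes:
    """Copy the smaller of the two sizes and zero-fill the remainder."""
    # Slicing never reads past the end of user_data, so this copies
    # min(len(user_data), kernel_size) bytes.
    copied = user_data[:kernel_size]
    # Pad with zeros so fields the older userspace doesn't know about are cleared.
    return copied + bytes(kernel_size - len(copied))
```

An older application passing a 2-byte struct to a kernel that expects 4 bytes gets its 2 bytes followed by 2 zero bytes. A struct larger than the kernel expects is truncated to the kernel's size.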

Support CLMUL Extension

This instruction is heavily used to accelerate CRC and other hashing algorithms. It also maps perfectly onto the equivalent AArch64 instruction, so implementing it was very straightforward!
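For readers unfamiliar with the operation, carry-less multiplication is ordinary long multiplication with the additions replaced by XOR. The reference model below is a plain software sketch for illustration, not how FEX implements it; the name `clmul64` is made up.

```python
def clmul64(a: int, b: int) -> int:
    """Carry-less multiply of two 64-bit values, giving a result of up to 127 bits."""
    mask = (1 << 64) - 1
    a &= mask
    b &= mask
    result = 0
    shift = 0
    while b:
        if b & 1:
            # XOR in the shifted partial product instead of adding it,
            # so no carries propagate between bit positions.
            result ^= a << shift
        b >>= 1
        shift += 1
    return result
```

For example, `clmul64(0b11, 0b11)` is `0b101`, whereas the ordinary product of 3 and 3 is `0b1001`.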

Self-modifying-code frontend improvements

FEX can now track code pages while decoding in the frontend. This fixes some issues where code can be changed while we are decoding it. FEX now detects this and throws away what it compiled.

Developer specific improvements

Check for binfmt_misc conflict before installing

To ensure building from source doesn’t result in a broken configuration, cmake will now check for conflicting binfmt_misc files before installing. How to uninstall the conflicting binfmt_misc files is specific to how the user has installed them, so it is left up to them to find out how.

Auto CI fetching

If the CI systems need an updated rootfs, the config can now be updated and they will fetch the latest.

Unit tests no longer recompile forever

ASM unit tests would always reglob when building, which took time. This is now fixed.

Fix ASAN bug in how register allocation data was allocated

This was hard to track down. This annoying bug, which has gone back and forth a bit, has finally been resolved!

ARM64 CPU feature detection for ASM unit tests

Automatically disables some incompatible unit tests on ARM64 devices that don’t support some features. No more confusing failures.

GDB integration

This allows a plugin to be loaded in GDB to show more information than we would otherwise have, giving us both backtraces and source inside of GDB even through the JIT. This should make debugging the JIT that much easier.

See the 2207 Release Notes or the detailed change log on GitHub.

Written on July 7, 2022