For core stuff:
Just remove the unique_ptrs that don't need any pointer stability at all (after all, each one is an allocation within an allocation).
For fibers:
The main reasoning is that virtualBuffer<> is needlessly expensive, and it also clutters my fstat view.
ALSO mmap is a syscall, and syscalls cost performance.
ALSO std::vector<> is better suited to this kind of fixed-size buffer that is big but not THAT big (512 KiB per fiber isn't going to kill your memory usage...).
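A minimal sketch of the std::vector<> approach for a fiber stack. `FiberStackSketch` and `StackTop` are hypothetical names used only for illustration, not the actual fiber class. Note that the allocator may still obtain large blocks from the OS however it likes; the point is only that our code no longer issues its own mapping call per fiber.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for a fiber's private state; not the real class.
struct FiberStackSketch {
    // 512 KiB, the per-fiber stack size mentioned above.
    static constexpr std::size_t kStackSize = 512 * 1024;

    // One ordinary heap allocation, value-initialised to zero.
    std::vector<std::uint8_t> stack = std::vector<std::uint8_t>(kStackSize);

    // Stacks grow downwards on common ABIs, so the initial stack
    // pointer is the one-past-the-end address of the buffer.
    std::uint8_t* StackTop() {
        assert(!stack.empty());
        return stack.data() + stack.size();
    }
};
```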
For core.cpp stuff:
- inlines members into std::optional<> instead of std::unique_ptr<> (because we already create the Impl from a unique_ptr, so allocating within that allocation is unnecessary)
- reorganizes the structures a bit so padding doesn't waste space (it's not perfect, but it saves a measly 44 bytes)
- removes unused/dead code
- uses std::vector<> instead of std::deque<>
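To illustrate the padding point: ordering members from largest to smallest alignment usually lets the compiler pack them with less padding. `Loose` and `Tight` below are made-up structs, not ones from core.cpp, and the exact sizes depend on the ABI, so the sketch only checks that the reordered layout is no larger.

```cpp
#include <cstddef>
#include <cstdint>

// Small members interleaved with large ones force padding before
// each more strictly aligned member.
struct Loose {
    bool a;
    std::uint64_t b;
    bool c;
    std::uint32_t d;
};

// Same members, ordered from largest to smallest alignment.
struct Tight {
    std::uint64_t b;
    std::uint32_t d;
    bool a;
    bool c;
};

// Reordering never makes this particular layout bigger.
static_assert(sizeof(Tight) <= sizeof(Loose));

// Bytes saved per object by the reordering on the current ABI.
constexpr std::size_t PaddingSaved() {
    return sizeof(Loose) - sizeof(Tight);
}
```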
No perf impact expected; maybe a small initialisation boost, but very minimal impact nonetheless.
LTO gets rid of most calls anyway. The heavy issue is with shared_ptr and the cache-coherency traffic from its atomics, but I clumped them together because they mostly do not suffer from cache coherency issues (hopefully not a mistake).
This balloons the size of Impl to about 1.67 MB, which is fine because we throw it in the stack anyways.
REST OF INTERFACES: most of them ballooned in size as well, but the overhead is OK since each is an allocation within an allocation, so no stack is used (when it comes to storing these, I mean).
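A rough sketch of the std::optional<> inlining described above. `ServiceSketch`, `ImplSketch` and `IsStoredInline` are hypothetical names, not the real core.cpp types. The optional keeps its value inside the owning object, so the only heap allocation is the one that creates the owner, and the owner grows by the size of the member.

```cpp
#include <cassert>
#include <memory>
#include <optional>

// Hypothetical member type that needs late construction.
struct ServiceSketch {
    explicit ServiceSketch(int v) : value(v) {}
    int value;
};

// Hypothetical owner, itself created through std::make_unique.
struct ImplSketch {
    // Stored inline: constructing it later does not allocate again.
    std::optional<ServiceSketch> service;

    void Init() { service.emplace(42); }
    void Shutdown() { service.reset(); }
};

// Returns true when the engaged member lives inside the owner's storage.
bool IsStoredInline(const ImplSketch& impl) {
    assert(impl.service.has_value());
    const auto* begin = reinterpret_cast<const unsigned char*>(&impl);
    const auto* member = reinterpret_cast<const unsigned char*>(&*impl.service);
    return member >= begin && member < begin + sizeof(ImplSketch);
}
```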
Signed-off-by: lizzie <lizzie@eden-emu.dev>
Reviewed-on: https://git.eden-emu.dev/eden-emu/eden/pulls/3306
Reviewed-by: CamilleLaVey <camillelavey99@gmail.com>
Reviewed-by: MaranBr <maranbr@eden-emu.dev>
Co-authored-by: lizzie <lizzie@eden-emu.dev>
Co-committed-by: lizzie <lizzie@eden-emu.dev>
This reworks the logic to improve performance in many games that heavily rely on DMA. It can help all platforms, but on desktop the performance boost can be noticeable, especially on dedicated GPUs. The option Sync Memory Operations must be enabled.
Reviewed-on: https://git.eden-emu.dev/eden-emu/eden/pulls/3179
Co-authored-by: MaranBr <maranbr@outlook.com>
Co-committed-by: MaranBr <maranbr@outlook.com>
The GPU Accuracy level is now divided into Performance, Balanced and Accurate.
1. Performance prioritizes speed at all costs. It's faster, but it can be unstable and may have some bugs (which is expected).
2. Balanced maintains excellent performance and is safer against bugs and shader corruption.
3. Accurate is the most precise and the most expensive in terms of hardware. Only a few games still need this level to work properly.
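As an illustration only, the three levels could be modelled as an ordered enum. `GpuAccuracySketch` and `Describe` are hypothetical names and the real setting may be declared differently.

```cpp
#include <string_view>

// Hypothetical enum; enumerator order goes from cheapest to most precise.
enum class GpuAccuracySketch { Performance, Balanced, Accurate };

// Short summary of each level, paraphrasing the list above.
constexpr std::string_view Describe(GpuAccuracySketch level) {
    switch (level) {
    case GpuAccuracySketch::Performance:
        return "fastest, may be unstable";
    case GpuAccuracySketch::Balanced:
        return "fast, safer against bugs and shader corruption";
    case GpuAccuracySketch::Accurate:
        return "most precise, most expensive";
    }
    return "unknown";
}
```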
The Release Early Fences toggle has also been removed by @PavelBARABANOV, as it's not needed anymore.
Co-authored-by: PavelBARABANOV <pavelbarabanov94@gmail.com>
Reviewed-on: https://git.eden-emu.dev/eden-emu/eden/pulls/3129
Reviewed-by: Caio Oliveira <caiooliveirafarias0@gmail.com>
Reviewed-by: Maufeat <sahyno1996@gmail.com>
Co-authored-by: MaranBr <maranbr@outlook.com>
Co-committed-by: MaranBr <maranbr@outlook.com>
Mainly because, while we can just give out an AppImage and call it a day, building natively should be an option for all major distros.
And "base" stable Debian doesn't provide a new enough g++/clang++, so we need to make some "fixups".
Signed-off-by: lizzie <lizzie@eden-emu.dev>
Reviewed-on: https://git.eden-emu.dev/eden-emu/eden/pulls/2763
Reviewed-by: crueter <crueter@eden-emu.dev>
Reviewed-by: MaranBr <maranbr@eden-emu.dev>
Co-authored-by: lizzie <lizzie@eden-emu.dev>
Co-committed-by: lizzie <lizzie@eden-emu.dev>
This adds an option to control the DMA precision level at runtime.
Co-authored-by: crueter <crueter@eden-emu.dev>
Co-authored-by: PavelBARABANOV <pavelbarabanov94@gmail.com>
Reviewed-on: https://git.eden-emu.dev/eden-emu/eden/pulls/304
Reviewed-by: crueter <crueter@eden-emu.dev>
Co-authored-by: MaranBr <maranbr@outlook.com>
Co-committed-by: MaranBr <maranbr@outlook.com>
This improves the DMA logic and adds an option to sync memory operations.
Thanks to Higgs for the new DMA logic.
Co-authored-by: PavelBARABANOV <pavelbarabanov94@gmail.com>
Co-authored-by: crueter <crueter@eden-emu.dev>
Reviewed-on: https://git.eden-emu.dev/eden-emu/eden/pulls/276
Reviewed-by: crueter <crueter@eden-emu.dev>
Co-authored-by: MaranBr <maranbr@outlook.com>
Co-committed-by: MaranBr <maranbr@outlook.com>
- Modify DmaPusher to use safe memory reads when handling compute
  operations at High GPU accuracy
- Prevent potential memory corruption issues that could lead to
  invalid dispatch parameters
- Previously, unsafe reads could result in corrupted launch_description
  data in KeplerCompute::ProcessLaunch, causing invalid vkCmdDispatch
  calls
- By enforcing safe reads specifically for compute operations, we
  maintain performance for other GPU tasks while ensuring compute
  dispatch stability
This change requires >= High GPU accuracy level to take effect.
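The decision described above can be sketched roughly as follows. The enums and `UseSafeRead` are hypothetical placeholders, not the actual DmaPusher code; the sketch only captures the rule "safe reads for compute work, and only at High accuracy or above".

```cpp
// Hypothetical accuracy levels, ordered from least to most strict.
enum class AccuracySketch { Normal, High, Extreme };

// Hypothetical tag for the engine a command list targets.
enum class EngineSketch { Graphics3D, Compute, Other };

// Safe (slower) reads are used only for compute work at High accuracy
// or above; every other combination keeps the fast unsafe path.
constexpr bool UseSafeRead(AccuracySketch accuracy, EngineSketch engine) {
    return accuracy >= AccuracySketch::High && engine == EngineSketch::Compute;
}
```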
This formats all copyright comments according to SPDX formatting guidelines.
Additionally, this resolves the remaining GPL-2.0-only licensed files by relicensing them to GPL-2.0-or-later.
Reimplement the buffer cache using cached bindings and page level
granularity for modification tracking. This also drops the usage of
shared pointers and virtual functions from the cache.
- Bindings are cached, which allows skipping work when the game changes
  only a few bits between draws.
- OpenGL Assembly shaders no longer copy when a region has been modified
  from the GPU to emulate constant buffers; instead, GL_EXT_memory_object
  is used to alias sub-buffers within the same allocation.
- OpenGL Assembly shaders stream constant buffer data using
  glProgramBufferParametersIuivNV, from NV_parameter_buffer_object. In
  theory this should save one hash table resolve inside the driver
  compared to glBufferSubData.
- A new OpenGL stream buffer is implemented based on fences for drivers
  that are not Nvidia's proprietary, due to their low performance on
  partial glBufferSubData calls synchronized with 3D rendering (that
  some games use a lot).
- Most optimizations are shared between APIs now, allowing Vulkan to
  cache more bindings than before, skipping unnecessary work.
This commit adds the necessary infrastructure to use Vulkan objects from
OpenGL. Overall, it improves performance and fixes some bugs present in
the old cache. There are still some edge cases hit by some games that
harm performance on some vendors; these are planned to be fixed in later
commits.
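Page-level granularity for modification tracking, as mentioned at the top of this message, can be pictured with one dirty flag per page. `PageTrackerSketch` and its 4 KiB page size are assumptions made for this example; the real buffer cache is considerably more involved.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical tracker: one dirty flag per page of a buffer.
class PageTrackerSketch {
public:
    static constexpr std::uint64_t kPageBits = 12;  // assumed 4 KiB pages
    static constexpr std::uint64_t kPageSize = std::uint64_t{1} << kPageBits;

    explicit PageTrackerSketch(std::uint64_t size_bytes)
        : dirty_((size_bytes + kPageSize - 1) >> kPageBits, false) {}

    // Flags every page overlapping [offset, offset + size) as modified.
    void MarkModified(std::uint64_t offset, std::uint64_t size) { Fill(offset, size, true); }

    // Clears the flags of every page overlapping the range.
    void MarkClean(std::uint64_t offset, std::uint64_t size) { Fill(offset, size, false); }

    // True if any page overlapping the range is flagged.
    bool IsModified(std::uint64_t offset, std::uint64_t size) const {
        if (size == 0) return false;
        const std::size_t end = LastPage(offset, size);
        for (std::size_t page = offset >> kPageBits; page < end; ++page) {
            if (dirty_[page]) return true;
        }
        return false;
    }

private:
    // One past the last page touched by the range, clamped to the buffer.
    std::size_t LastPage(std::uint64_t offset, std::uint64_t size) const {
        const std::uint64_t end = ((offset + size - 1) >> kPageBits) + 1;
        return static_cast<std::size_t>(std::min<std::uint64_t>(end, dirty_.size()));
    }

    void Fill(std::uint64_t offset, std::uint64_t size, bool value) {
        if (size == 0) return;
        const std::size_t end = LastPage(offset, size);
        for (std::size_t page = offset >> kPageBits; page < end; ++page) {
            dirty_[page] = value;
        }
    }

    std::vector<bool> dirty_;
};
```

A two-byte write that straddles a page boundary flags both pages, while untouched pages stay clean and can be skipped on the next upload.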
This commit ensures that the host GPU is constantly fed with commands to
work with, while the guest GPU keeps producing the rest of the commands.
This reduces syncing time between the host and guest GPUs.
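One way to picture "constantly fed" is incremental submission: instead of handing everything to the host at the end, pending commands are flushed in batches as they are recorded. This is a single-threaded sketch under that assumption; `IncrementalSubmitterSketch` and the threshold are hypothetical and not taken from the actual change.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

// Hypothetical recorder that hands work to the host in small batches.
class IncrementalSubmitterSketch {
public:
    explicit IncrementalSubmitterSketch(std::size_t threshold) : threshold_(threshold) {
        assert(threshold > 0);
    }

    // Records one guest command and flushes once enough are pending,
    // so the host has work before the guest finishes producing.
    void Record(std::uint32_t command) {
        pending_.push_back(command);
        if (pending_.size() >= threshold_) {
            Flush();
        }
    }

    // Hands all pending commands to the host as one batch.
    void Flush() {
        if (pending_.empty()) return;
        batches_.push_back(std::move(pending_));
        pending_.clear();  // leave the moved-from vector in a known state
    }

    // Batches submitted so far, in submission order.
    const std::vector<std::vector<std::uint32_t>>& Batches() const { return batches_; }

private:
    std::size_t threshold_;
    std::vector<std::uint32_t> pending_;
    std::vector<std::vector<std::uint32_t>> batches_;
};
```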