Commit Graph

57 Commits

Author SHA1 Message Date
lizzie 83a28dc251
[common, core] remove uneeded memory indirection overhead at startup (#3306)
for core stuff:
just remove unique ptrs that dont need any pointer stability at all (afterall its an allocation within an allocation so yeah)

for fibers:
Main reasoning behind this is because virtualBuffer<> is stupidly fucking expensive and it also clutters my fstat view
ALSO mmap is a syscall, syscalls are bad for performance or whatever
ALSO std::vector<> is better suited for handling this kind of "fixed size thing where its like big but not THAT big" (512 KiB isn't going to kill your memory usage for each fiber...)

for core.cpp stuff
- inlines stuff into std::optional<> as opposed to std::unique_ptr<> (because yknow, we are making the Impl from an unique_ptr, allocating within an allocation is unnecessary)
- reorganizes the structures a bit so padding doesnt screw us up (it's not perfect but eh saves a measly 44 bytes)
- removes unused/dead code
- uses std::vector<> instead of std::deque<>

no perf impact expected, maybe some initialisation boost but very minimal impact nonethless
lto gets rid of most calls anyways - the heavy issue is with shared_ptr and the cache coherency from the atomics... but i clumped them together because well, they kinda do not suffer from cache coherency - hopefully not a mistake

this balloons the size of Impl to about 1.67 MB - which is fine because we throw it in the stack anyways

REST OF INTERFACES: most of them ballooned in size as well, but overhead is ok since its an allocation within an alloc, no stack is used (when it comes to storing these i mean)

Signed-off-by: lizzie lizzie@eden-emu.dev
Reviewed-on: https://git.eden-emu.dev/eden-emu/eden/pulls/3306
Reviewed-by: CamilleLaVey <camillelavey99@gmail.com>
Reviewed-by: MaranBr <maranbr@eden-emu.dev>
Co-authored-by: lizzie <lizzie@eden-emu.dev>
Co-committed-by: lizzie <lizzie@eden-emu.dev>
2026-01-16 23:39:16 +01:00
MaranBr 8d31484f64
[video_core] Reduce synchronization overhead and improve performance in DMA operations (#3179)
This reworks the logic to improve performance in many games that heavily rely on DMA. It can help all platforms, but on desktop the performance boost can be noticeable, especially on dedicated GPUs. The option Sync Memory Operations must be enabled.

Reviewed-on: https://git.eden-emu.dev/eden-emu/eden/pulls/3179
Co-authored-by: MaranBr <maranbr@outlook.com>
Co-committed-by: MaranBr <maranbr@outlook.com>
2025-12-31 01:00:11 +01:00
MaranBr 9da38715fe
[video_core] Rework GPU Accuracy levels and remove Early Release Fences toggle (#3129)
The GPU Accuracy level is now divided into Performance, Balanced and Accurate.

1. Performance prioritizes speed at all costs. It's faster, but it can be unstable and may have some bugs (which is expected).

2. Balanced maintains excellent performance and is safer against bugs and shader corruption.

3. Accurate is the most precise and the most expensive in terms of hardware. Only a few games still need this level to work properly.

The Release Early Fences toggle has also been removed by @PavelBARABANOV, as it's not needed anymore.

Co-authored-by: PavelBARABANOV <pavelbarabanov94@gmail.com>
Reviewed-on: https://git.eden-emu.dev/eden-emu/eden/pulls/3129
Reviewed-by: Caio Oliveira <caiooliveirafarias0@gmail.com>
Reviewed-by: Maufeat <sahyno1996@gmail.com>
Co-authored-by: MaranBr <maranbr@outlook.com>
Co-committed-by: MaranBr <maranbr@outlook.com>
2025-12-09 18:11:05 +01:00
lizzie f55e560ac5
[compat] Debian stable gcc12/clang14 compilation fixes (#2763)
Mainly because - while we can just give out an AppImage and call it a day - building natively should be an option for all major distros.
And "base" stable debian doesn't provide a new enough g++/clang++ so... we need to make some "fixups".

Signed-off-by: lizzie <lizzie@eden-emu.dev>

Reviewed-on: https://git.eden-emu.dev/eden-emu/eden/pulls/2763
Reviewed-by: crueter <crueter@eden-emu.dev>
Reviewed-by: MaranBr <maranbr@eden-emu.dev>
Co-authored-by: lizzie <lizzie@eden-emu.dev>
Co-committed-by: lizzie <lizzie@eden-emu.dev>
2025-10-18 01:54:43 +02:00
MaranBr 19036c59b5
[video_core] Simplify DMA options (#525)
This simplifies DMA options in a clearer and more objective way.

Co-authored-by: PavelBARABANOV <pavelbarabanov94@gmail.com>
Reviewed-on: https://git.eden-emu.dev/eden-emu/eden/pulls/525
Reviewed-by: crueter <crueter@eden-emu.dev>
Reviewed-by: Shinmegumi <shinmegumi@eden-emu.dev>
Co-authored-by: MaranBr <maranbr@outlook.com>
Co-committed-by: MaranBr <maranbr@outlook.com>
2025-09-16 18:42:48 +02:00
MaranBr 5b864d406d
[video_core] Add option to control the DMA precision level at runtime (#304)
This adds an option to control the DMA precision level at runtime.

Co-authored-by: crueter <crueter@eden-emu.dev>
Co-authored-by: PavelBARABANOV <pavelbarabanov94@gmail.com>
Reviewed-on: https://git.eden-emu.dev/eden-emu/eden/pulls/304
Reviewed-by: crueter <crueter@eden-emu.dev>
Co-authored-by: MaranBr <maranbr@outlook.com>
Co-committed-by: MaranBr <maranbr@outlook.com>
2025-08-23 19:42:10 +02:00
MaranBr 7bfa2404a6
[video_core] Improve DMA logic and add an option to sync memory operations (#276)
This improves DMA logic and add an option to sync memory operations.

Thanks to Higgs for the new DMA logic.

Co-authored-by: PavelBARABANOV <pavelbarabanov94@gmail.com>
Co-authored-by: crueter <crueter@eden-emu.dev>
Reviewed-on: https://git.eden-emu.dev/eden-emu/eden/pulls/276
Reviewed-by: crueter <crueter@eden-emu.dev>
Co-authored-by: MaranBr <maranbr@outlook.com>
Co-committed-by: MaranBr <maranbr@outlook.com>
2025-08-20 00:21:25 +02:00
crueter f1e74f6855
[meta] remove MicroProfile (#185)
Signed-off-by: crueter <crueter@eden-emu.dev>
Reviewed-on: https://git.eden-emu.dev/eden-emu/eden/pulls/185
Reviewed-by: Lizzie <lizzie@eden-emu.dev>
2025-08-06 07:48:11 +02:00
MrPurple666 2d2e9208d2 Unified torzu and sudachi friend.cpp + fix android build on dma_pusher 2025-04-04 03:40:49 +02:00
Zephyron 0071e980b8 video_core: Enforce safe memory reads for compute dispatch
- Modify DmaPusher to use safe memory reads when handling compute
  operations at High GPU accuracy
- Prevent potential memory corruption issues that could lead to
  invalid dispatch parameters
- Previously, unsafe reads could result in corrupted launch_description
  data in KeplerCompute::ProcessLaunch, causing invalid vkCmdDispatch
  calls
- By enforcing safe reads specifically for compute operations, we
  maintain performance for other GPU tasks while ensuring compute
  dispatch stability

This change requires >= High GPU accuracy level to take effect.
2025-04-04 03:40:49 +02:00
Fernando Sahmkow b206089ea7 Core: Clang format and other small issues. 2024-01-18 21:12:30 -05:00
Fernando Sahmkow 9db159da71 SMMU: Initial adaptation to video_core. 2024-01-18 21:12:30 -05:00
Fernando Sahmkow 94dd857cda VideoCore: Implement DispatchIndirect 2023-08-27 04:26:22 +02:00
Fernando Sahmkow 8208becc49 DMA Pusher: Fix regression caused by guest memory optimizations 2023-08-26 22:00:43 +02:00
Kelebek1 42638691b5 Use spans over guest memory where possible instead of copying data. 2023-07-02 23:09:48 +01:00
Fernando Sahmkow 4bf1ee5bdc DMAPusher: Improve collection of non executing methods 2023-01-01 16:43:57 -05:00
Fernando Sahmkow d2643a61c3 Revert Buffer cache changes and setup additional macros. 2023-01-01 16:43:57 -05:00
Fernando Sahmkow 12a76465b9 MacroHLE: Reduce massive calculations on sizing estimation. 2023-01-01 16:43:57 -05:00
Fernando Sahmkow b4fcb0b2b2 MacroHLE: Refactor MacroHLE system. 2023-01-01 16:43:57 -05:00
Fernando Sahmkow b5b0ec9429 MacroHLE: Implement DrawIndexedIndirect & DrawArraysIndirect. 2023-01-01 16:43:57 -05:00
Fernando Sahmkow f2f2784817 MacroHLE: Add MultidrawIndirect HLE Macro. 2023-01-01 16:43:57 -05:00
ameerj 4d5adfb3c9 scratch_buffer: Explicitly defing resize and resize_destructive functions
resize keeps previous data intact when the buffer grows
resize_destructive destroys the previous data when the buffer grows
2022-12-19 22:40:50 -05:00
ameerj 284582a0b2 dma_pusher: Rework command_headers usage
Uses ScratchBuffer and avoids overwriting the command_headers buffer with the prefetch_command_list
2022-12-19 18:08:04 -05:00
Fernando Sahmkow 42ef10060a VideoCore: Refactor fencing system. 2022-10-06 21:00:52 +02:00
Fernando Sahmkow 8847b6645c VideoCore: implement channels on gpu caches. 2022-10-06 21:00:51 +02:00
Morph 2b87305d31 general: Convert source file copyright comments over to SPDX
This formats all copyright comments according to SPDX formatting guidelines.
Additionally, this resolves the remaining GPLv2 only licensed files by relicensing them to GPLv2.0-or-later.
2022-04-23 05:55:32 -04:00
ameerj b837219423 video_core: Reduce unused includes 2022-03-19 15:01:31 -04:00
Fernando Sahmkow d9fc759460 BufferCache: Additional download fixes. 2021-07-09 22:20:36 +02:00
ReinUsesLisp 2dfce2fca6 video_core: Reimplement the buffer cache
Reimplement the buffer cache using cached bindings and page level
granularity for modification tracking. This also drops the usage of
shared pointers and virtual functions from the cache.

- Bindings are cached, allowing to skip work when the game changes few
  bits between draws.
- OpenGL Assembly shaders no longer copy when a region has been modified
  from the GPU to emulate constant buffers, instead GL_EXT_memory_object
  is used to alias sub-buffers within the same allocation.
- OpenGL Assembly shaders stream constant buffer data using
  glProgramBufferParametersIuivNV, from NV_parameter_buffer_object. In
  theory this should save one hash table resolve inside the driver
  compared to glBufferSubData.
- A new OpenGL stream buffer is implemented based on fences for drivers
  that are not Nvidia's proprietary, due to their low performance on
  partial glBufferSubData calls synchronized with 3D rendering (that
  some games use a lot).
- Most optimizations are shared between APIs now, allowing Vulkan to
  cache more bindings than before, skipping unnecesarry work.

This commit adds the necessary infrastructure to use Vulkan object from
OpenGL. Overall, it improves performance and fixes some bugs present on
the old cache. There are still some edge cases hit by some games that
harm performance on some vendors, this are planned to be fixed in later
commits.
2021-02-13 02:17:22 -03:00
Lioncash 2f181b6a90 video_core: Resolve more variable shadowing scenarios
Resolves variable shadowing scenarios up to the end of the OpenGL code
to make it nicer to review. The rest will be resolved in a following
commit.
2020-12-04 16:19:09 -05:00
bunnei 0b6324b3a6 video_core: dma_pusher: Remove integrity check on command lists.
- This seems to cause softlocks in Breath of the Wild.
2020-11-07 00:08:19 -08:00
bunnei af7ab45b45 video_core: dma_pusher: Add support for integrity checks.
- Log corrupted command lists, rather than crash.
2020-11-01 01:52:38 -07:00
bunnei 69f4a66d23 video_core: dma_pusher: Add support for prefetched command lists. 2020-11-01 01:52:38 -07:00
David Marcec 67d7c0f45e DmaPusher: Remove dead code in step 2020-05-16 12:42:27 +10:00
Fernando Sahmkow 4c11487d1e VideoCore/GPU: Delegate subchannel engines to the dma pusher. 2020-04-27 22:07:21 -04:00
Fernando Sahmkow ef3a0ae64a DMAPusher: Propagate multimethod writes into the engines. 2020-04-23 08:52:55 -04:00
Fernando Sahmkow fda21f5a93 GPU: Delay Fences. 2020-04-22 11:36:08 -04:00
Fernando Sahmkow de53bc96c0 BufferCache: Implement OnCPUWrite and SyncGuestHost 2020-04-22 11:36:07 -04:00
Fernando Sahmkow c689dc6804 GPU: Refactor synchronization on Async GPU 2020-04-22 11:36:06 -04:00
Lioncash 8a37c63b9e dma_pusher: Remove reliance on the global system instance
With this, the video core is now has no calls to the global system
instance at all.
2020-04-19 16:12:08 -04:00
ReinUsesLisp 005f5ca883 video_core: Reintroduce dirty flags infrastructure 2020-02-28 17:56:41 -03:00
ReinUsesLisp c2d3732176 gl_rasterizer: Remove dirty flags 2020-02-28 16:39:27 -03:00
Fernando Sahmkow e82d641357 GPU: Flush commands on every dma pusher step.
This commit ensures that the host gpu is constantly fed with commands to
work with, while the guest gpu keeps producing the rest of the commands.
This reduces syncing time between host and guest gpu.
2019-07-26 16:54:22 -04:00
Fernando Sahmkow 7c50842226 Maxwell3D: Rework the dirty system to be more consistant and scaleable 2019-07-17 17:29:49 -04:00
Fernando Sahmkow fc9a1b81cb Dma_pusher: ASSERT on empty command_list
This is a measure to avoid crashes on command list reading as an empty
command_list is considered a NOP.
2019-05-19 10:48:31 -04:00
bunnei 673cfd89c1 Merge pull request #2322 from ReinUsesLisp/wswitch
video_core: Silent -Wswitch warnings
2019-04-28 22:24:58 -04:00
ReinUsesLisp 7a56d07632 video_core: Silent -Wswitch warnings 2019-04-18 15:54:39 -03:00
Fernando Sahmkow 994393bd02 Use ReadBlockUnsafe for fetyching DMA CommandLists 2019-04-16 11:22:34 -04:00
Lioncash 44d91d561a video_core/texures/texture: Remove unnecessary includes
Nothing in this header relies on common_funcs or the memory manager.

This gets rid of reliance on indirect inclusions in the OpenGL caches.
2019-04-06 00:03:35 -04:00
bunnei d3f26c1546 video_core: Refactor to use MemoryManager interface for all memory access.
# Conflicts:
#	src/video_core/engines/kepler_memory.cpp
#	src/video_core/engines/maxwell_3d.cpp
#	src/video_core/morton.cpp
#	src/video_core/morton.h
#	src/video_core/renderer_opengl/gl_global_cache.cpp
#	src/video_core/renderer_opengl/gl_global_cache.h
#	src/video_core/renderer_opengl/gl_rasterizer_cache.cpp
2019-03-16 00:38:48 -04:00