Commit Graph

7176 Commits

Author SHA1 Message Date
liamwhite b932f304ad Merge pull request #11544 from Kelebek1/reduce_stream_buffer_renderdoc
Allow GPUs without rebar to open multiple RenderDoc captures
2023-10-07 12:49:19 -04:00
liamwhite d14f8815b1 Merge pull request #11688 from Kelebek1/x8d42
Implement X8_D24 pixel format
2023-10-07 10:55:14 -04:00
liamwhite bab15ef979 Merge pull request #11684 from Kelebek1/disable_push_descriptor_maxwell
Disable push descriptor for Pascal and older nVidia architectures
2023-10-07 10:54:52 -04:00
Squall Leonhart c3658018b1 update shader to confirmed format copy 2023-10-07 18:28:09 +11:00
Kelebek1 a1df96e84d Allow GPUs without rebar to open multiple RenderDoc captures 2023-10-06 07:52:06 +01:00
Kelebek1 5063305487 Implement X8_D24 format 2023-10-06 00:58:30 +01:00
Kelebek1 39bcdb4fe4 Rework nvidia architecture detection, disable push descriptor for Pascal and older 2023-10-05 03:13:42 +01:00
Kelebek1 294ffa29cc Mark a buffer GPU modified after the buffers are confirmed, do not double synch them 2023-10-05 00:19:11 +01:00
Squall-Leonhart 15a624a6df Let's not convert depth to greyscale, since this makes the exhaust and tire smoke light gray/white
Tire smoke should be a darker gray.
2023-10-05 03:14:53 +11:00
Squall-Leonhart 680081ea94 Fix CI Formatting check 2023-10-04 19:12:08 +11:00
Squall-Leonhart ec6ba091cf Implements D32_Float to A8B8G8R8_UNORM format copy
Corrects some visual issues in games such as Disney SpeedStorm
2023-10-04 19:07:05 +11:00
Liam 79e055318c vk_present_manager: recreate surface on any surface loss 2023-10-02 19:07:18 -04:00
Liam 445d504f94 ci: fix new codespell errors 2023-10-02 18:03:05 -04:00
Fernando Sahmkow ef38379737 Query Cache: Fix memory leak. 2023-10-01 11:47:14 +02:00
Fernando S dcf6de7bdf Merge pull request #11622 from liamwhite/qcr-reg1
renderer_vulkan: fix query cache for homebrew
2023-09-29 06:01:18 +02:00
Kelebek1 dd2d450e3f Enable depth test on stencil clear path 2023-09-28 21:19:51 +01:00
liamwhite da04fbdc2e Merge pull request #11402 from FernandoS27/depth-bias-control
Vulkan: Implement Depth Bias Control
2023-09-28 09:35:37 -04:00
Liam cb11232753 renderer_vulkan: fix query cache for homebrew 2023-09-27 19:11:47 -04:00
GPUCode 30c67e5bb0 host_shaders: More proper handling of x2 MSAA copies 2023-09-25 09:20:32 -04:00
GPUCode 5529df01e3 renderer_vulkan: Implement MSAA copies 2023-09-25 09:20:32 -04:00
liamwhite 8936ff8f89 Merge pull request #11225 from FernandoS27/no-laxatives-in-santas-cookies
Y.F.C: Rework the Query Cache.
2023-09-25 09:18:29 -04:00
liamwhite dab6876db5 Merge pull request #11562 from GPUCode/srgb-madness
vk_texture_cache: Limit srgb block to transcoding only
2023-09-24 10:50:28 -04:00
liamwhite 70126192aa Merge pull request #11165 from Morph1984/ds_blit
vulkan_device: Return true if either depth/stencil format supports blit
2023-09-24 10:50:04 -04:00
Fernando Sahmkow e0477e40bd Query Cache: Fix Prefix Sums 2023-09-23 23:05:30 +02:00
Fernando Sahmkow 509ebe61c6 Query Cache: Fix behavior in Normal Accuracy 2023-09-23 23:05:30 +02:00
Fernando Sahmkow 6b0a777d19 Query Cache: Simplify Prefix Sum compute shader 2023-09-23 23:05:30 +02:00
Fernando Sahmkow c2880497ce Query Cache: Implement host side sample counting. 2023-09-23 23:05:30 +02:00
Fernando Sahmkow 170c82ae7f Query Cache: Fix guest side sample counting 2023-09-23 23:05:30 +02:00
Fernando Sahmkow 93cd3d8efd Query Cache: address issues 2023-09-23 23:05:30 +02:00
Fernando Sahmkow a8fe81b3be QueryCache: Implement dependant queries. 2023-09-23 23:05:29 +02:00
Fernando Sahmkow 2221256e90 Macro HLE: Add DrawIndirectByteCount 2023-09-23 23:05:29 +02:00
Fernando Sahmkow 5ea12207f3 Query Cache: Fully rework Vulkan's query cache 2023-09-23 23:05:29 +02:00
Fernando Sahmkow 7f78d844ab Query Cache: Setup Base rework 2023-09-23 23:05:29 +02:00
liamwhite 8216a30e35 Merge pull request #11557 from GPUCode/brr-format
renderer_vulkan: Correct component order for A4B4G4R4_UNORM
2023-09-22 09:56:04 -04:00
Kelebek1 ac61186061 Fix DMA engine register offsets 2023-09-21 20:21:00 +01:00
GPUCode 20994b9e95 vk_texture_cache: Limit srgb block to transcoding only 2023-09-21 21:46:35 +03:00
GPUCode 400b9449ac renderer_vulkan: Correct component order for A4B4G4R4_UNORM 2023-09-21 15:33:44 +03:00
Squall-Leonhart 55e400acd9 Reuse part of my previous idea to use num_levels to check within AdjustMipBlockSize
The partial revert was not enough for Tsukihime, this might do the trick
2023-09-20 03:27:13 +10:00
liamwhite 8f2351603d Merge pull request #11258 from Squall-Leonhart/Z16_Assert_Fix
Fix a logged assert in the format lookup table for Z16
2023-09-18 09:31:05 -04:00
Squall Leonhart 0ec7d7ec28 Partial revert of #10433
The if block in this change caused some 2D textures to be treated as if their mip 0 were a 3D slice. This could be seen because the same texture rendered fine when viewed from a distance, but looked like a decoding failure up close.

It also resulted in some 3D ASTC textures not being scaled appropriately, leading to broken graphical effects such as the jagged TOTK recall animation being a circle, as the if block was only accepting the image based on its original info without any adjustments applied.
2023-09-18 23:28:53 +10:00
Charles Lombardo a8e3f2652d android: Use 1 worker for shader compilation for all devices 2023-09-16 21:38:28 -04:00
Fernando Sahmkow dcf5c4bec0 Vulkan: add temporary workaround for AMDVLK 2023-09-16 11:59:20 -04:00
Fernando Sahmkow 6dcc62ae86 Vulkan: Implement Depth Bias Control 2023-09-16 11:58:55 -04:00
Kelebek1 517702f3f8 Look for the most recently modified image for present 2023-09-11 03:11:29 +01:00
liamwhite 4d34ba4d9d Merge pull request #11470 from GPUCode/bundle-vvl
android: Add option to bundle validation layer
2023-09-10 13:40:18 -04:00
GPUCode 065305c627 vk_buffer_cache: Respect max vertex bindings in BindVertexBuffers (#11471) 2023-09-10 02:19:45 +02:00
GPUCode 75213f8c49 renderer_vulkan: Remove debug report
* VVL has implemented the more modern alternative, thus we don't need to support it anymore
2023-09-08 23:28:46 +03:00
Feng Chen 666bdc1125 video_core: Fix d24r8/s8d24 convert shader build error in moltenvk 2023-09-07 18:01:36 +08:00
Feng Chen a356e6b8d5 video_core: Add missing scissor update when viewport scale offset disable 2023-09-07 18:01:30 +08:00
liamwhite 230e40a2d6 Merge pull request #11383 from FernandoS27/are-you-a-wabbit
Fix regressions that damaged compute indirect & use reinterpret for copies with different byteblocksizes
2023-09-02 14:42:42 -04:00
liamwhite a2971a3540 Merge pull request #11393 from FernandoS27/bayo-got-busted-up
Maxwell3D: Improve Index buffer size estimation.
2023-09-02 14:42:28 -04:00
Danila Malyutin beec962363 Use initial_frame to check interlaced flag
If final frame was transferred from GPU, it won't carry the props.

Fixes #11089
2023-08-28 00:48:53 +04:00
Fernando Sahmkow a571250875 Maxwell3D: Improve Index buffer size estimation. 2023-08-27 22:14:37 +02:00
Fernando S bbbe7c3b11 Merge pull request #11389 from FernandoS27/discard-fix
Buffer Cache: fix discard writes.
2023-08-27 04:26:59 +02:00
Fernando Sahmkow 94dd857cda VideoCore: Implement DispatchIndirect 2023-08-27 04:26:22 +02:00
Fernando Sahmkow 8fcab24644 Shader Recompiler: Auto stub special registers and dump pipelines on exception. 2023-08-27 03:47:04 +02:00
Fernando Sahmkow 47d921e04d Buffer Cache: fix discard writes. 2023-08-27 03:45:43 +02:00
liamwhite 0560b267b3 Merge pull request #11317 from Kelebek1/macro_dumps
Mark decompiled macros on dump, dump shaders after translation
2023-08-26 19:14:25 -04:00
Fernando Sahmkow 8208becc49 DMA Pusher: Fix regression caused by guest memory optimizations 2023-08-26 22:00:43 +02:00
Kelebek1 334a0eaa9c Mark decompiled macros as decompiled on dump, dump shaders after translation 2023-08-25 21:47:47 -04:00
Feng Chen ce0c210173 video_core: set vertex buffer num to 16, because mvk have when using more than 16 2023-08-23 23:22:55 +08:00
liamwhite ed224b8712 Merge pull request #11302 from vonchenplus/vulkan_macos
Add macos moltenvk bundle, Add copy moltevk dylib script
2023-08-22 13:10:26 -04:00
Feng Chen ec643e7e9d Add macos moltenvk bundle, Add copy moltevk dylib script 2023-08-22 10:22:28 +08:00
liamwhite 62fbba8575 Merge pull request #11149 from ameerj/astc-perf-prod
host_shaders: ASTC compute shader optimizations
2023-08-21 16:08:51 -04:00
Kelebek1 5d1961ad67 Masked depthstencil clears 2023-08-19 03:29:46 +01:00
liamwhite 1f584c14e7 Merge pull request #11278 from Kelebek1/dma_sync
Mark accelerated DMA destination buffers and images as GPU-modified
2023-08-18 09:12:27 -04:00
Feng Chen c8c4aa6ef7 video_core: Fix vulkan assert error 2023-08-18 14:40:11 +08:00
liamwhite 71bb69c1f4 Merge pull request #11282 from ameerj/glasm-xfb
gl_graphics_pipeline: GLASM: Fix transform feedback with multiple buffers
2023-08-14 09:19:20 -04:00
liamwhite ade4a97659 Merge pull request #11283 from ameerj/glasm-pipeline-detection
gl_graphics_pipeline: Fix GLASM storage buffer detection
2023-08-14 09:19:10 -04:00
liamwhite b3c9497bc2 Merge pull request #11263 from liamwhite/my-feature-branch
vulkan_device: disable features associated with unloaded extensions
2023-08-14 09:18:47 -04:00
Ameer J 01638cfe35 gl_texture_cache: Enable async downloads 2023-08-13 23:17:59 -05:00
Ameer J b260471154 gl_buffer_cache: Enable async downloads 2023-08-13 23:17:54 -05:00
Ameer J dc665a7024 gl_staging_buffer_pool: Refactor allocation variables into a struct 2023-08-13 23:17:47 -05:00
Ameer J c6aafc55ab gl_graphics_pipeline: Fix GLASM storage buffer detection 2023-08-13 17:06:45 -04:00
Ameer J 4e1813a2c3 gl_graphics_pipeline: GLASM: Fix transform feedback with multiple buffers 2023-08-13 16:50:01 -04:00
Kelebek1 5de54129b3 Mark accelerated DMA destination buffers and images as GPU-modified 2023-08-13 02:22:39 +01:00
Liam 7b579a7708 vulkan_device: disable features associated with unloaded extensions 2023-08-11 14:54:12 -04:00
Squall-Leonhart e7602b1012 Needed to make this an extra case so it didn't also start asserting in BOTW.
Thanks Liam
2023-08-11 08:45:15 +10:00
Squall Leonhart 443f35e5db Fix an assert in the format lookup table for Z16
Came across this while looking into Asterix and Obelix XXL glitching
2023-08-11 08:18:54 +10:00
Liam 3e4076c2ac general: fix apple clang build 2023-08-09 22:38:37 -04:00
Ameer J 4c1cf94f3a flatten color_values 2023-08-09 18:45:52 -04:00
Ameer J 18328533d0 flatten encoding_values 2023-08-09 18:38:37 -04:00
Ameer J 6077bac118 flatten result vector 2023-08-09 18:34:57 -04:00
Ameer J 433d7cbd52 GetUnquantizedWeightVector 2023-08-09 17:45:39 -04:00
liamwhite c918db9514 Merge pull request #11216 from lat9nq/no-mesa-astc
gl_device: Detect Mesa to disable their ASTC
2023-08-07 11:34:22 -04:00
Ameer J 903280955a Revert "HACK: Avoid swizzling and reuploading ASTC image every frame"
This reverts commit 476ac42b61.
2023-08-06 14:55:05 -04:00
Ameer J 476ac42b61 HACK: Avoid swizzling and reuploading ASTC image every frame 2023-08-06 14:54:58 -04:00
Ameer J 3f114d8e5e Compute Replicate 2023-08-06 14:54:58 -04:00
Ameer J 166a17f4ba minor 2023-08-06 14:54:58 -04:00
Ameer J 4c40c8be29 undo uint 2023-08-06 14:54:58 -04:00
Ameer J b57854fb5f Revert "vulkan dims specialization"
This reverts commit e6243058f2269bd79ac8479d58e55feec2611e9d.
2023-08-06 14:54:58 -04:00
ameerj 9c5c5cbf06 vulkan dims specialization 2023-08-06 14:54:58 -04:00
Ameer J 790010da61 small_block opt 2023-08-06 14:54:58 -04:00
Ameer J cc6abe21ea remove TexelWeightParams 2023-08-06 14:54:57 -04:00
Ameer J 9085d26036 error/void extent funcs 2023-08-06 14:54:57 -04:00
Ameer J 827cb40765 more packing 2023-08-06 14:54:57 -04:00
Ameer J 20b7b4c2b7 Revert "uint result index"
This reverts commit 0e978786b5a8e7382005d8b1e16cfa12f3eeb775.
2023-08-06 14:54:57 -04:00
Ameer J aa28865ff7 Revert "bfe instead of mod"
This reverts commit 86006a3b09e8a8c17d2ade61be76736a79e3f58a.
2023-08-06 14:54:57 -04:00
Ameer J 74d905d5cd Revert "global endpoints"
This reverts commit d8f5bfd1df2b7469ef6abcee182aa110602d1751.
2023-08-06 14:54:57 -04:00
Ameer J 97810e725b global endpoints 2023-08-06 14:54:57 -04:00
Ameer J a08e31d053 bfe instead of mod 2023-08-06 14:54:57 -04:00
Ameer J 48862223ae uint result index 2023-08-06 14:54:57 -04:00
Ameer J d14b1929bc amd opts 2023-08-06 14:54:57 -04:00
Ameer J 6678ade989 gl 2023-08-06 14:54:57 -04:00
Ameer J 950680f29f const, pack result_vector and replicate tables,
undo amd opts
2023-08-06 14:54:57 -04:00
Ameer J dc851097e6 minor redundancy cleanup 2023-08-06 14:54:57 -04:00
Ameer J aa1ab95ea3 extractbits robustness 2023-08-06 14:54:57 -04:00
Ameer J cf252bb6d3 reuse vectors memory 2023-08-06 14:54:57 -04:00
Ameer J 2dcddf8fb2 EncodingData pack 2023-08-06 14:54:57 -04:00
Ameer J 81f838f0fd flattening 2023-08-06 14:54:57 -04:00
Ameer J 42e19b3833 weights refactor 2023-08-06 14:54:57 -04:00
Ameer J de6bc91933 params.max_weight 2023-08-06 14:54:57 -04:00
Ameer J e582e0032c skip bits 2023-08-06 14:54:57 -04:00
Ameer J 96261ab592 restrict 2023-08-06 14:54:57 -04:00
lat9nq f34bc9cc98 gl_device: Filter more specifically for slow ASTC
Adds a check to find if the renderer is Intel DG (i.e. DG2).

gl_device: Detect Mesa to disable their ASTC

In our testing, our own ASTC decoder has shown itself to perform faster
than the included one from the driver. Disable theirs when Mesa is
detected.

Mesa detection depends on the vendor string. Some drivers never appear
outside of *nix contexts, so only check those in the *nix context.

gl_device: Internalize Intel DG detection
2023-08-05 15:19:16 -04:00
liamwhite ba751d2200 Merge pull request #11212 from Kelebek1/shader_stuff
Fix various misc pipeline/shader things
2023-08-05 12:58:39 -04:00
Kelebek1 770130b6c2 Fix shader dumps with nvdisasm
skip fragment shaders when rasterizer is disabled
initialize env_ptrs
2023-08-03 15:30:27 +01:00
Ameer J 09cb3bf896 vulkan_device: Fix subgroup_size_control detection on Vulkan 1.3 2023-08-02 20:45:03 -04:00
Ameer J 7f86685948 vulkan_device: Fix VK_EXT_subgroup_size_control detection 2023-08-02 19:25:14 -04:00
liamwhite cf4994e81e Merge pull request #11202 from abouvier/vulkan-config
vulkan: centralize config
2023-08-02 14:26:03 -04:00
liamwhite 28b236b988 Merge pull request #10839 from lat9nq/pgc-plus
general: Reimplement per-game configurations
2023-08-02 14:25:52 -04:00
Liam da8c1cfbdd vulkan_device: disable EDS3 blending on all AMD drivers 2023-08-01 20:46:05 -04:00
Alexandre Bouvier 9a86e4e431 vulkan: centralize config 2023-08-02 00:05:14 +02:00
Morph cc8aba1380 vulkan_device: Test depth stencil blit support by format 2023-07-31 19:14:20 -04:00
liamwhite 0bcd04d9a3 Merge pull request #11188 from abouvier/vma-fix
vma: enable options everywhere
2023-07-31 15:28:35 -04:00
liamwhite 5bb1371404 Merge pull request #11169 from GPUCode/desc-stuff
vk_descriptor_pool: Disallow descriptor set free
2023-07-31 09:11:19 -04:00
Alexandre Bouvier f663418ff5 vma: enable options everywhere 2023-07-31 13:01:21 +02:00
Moonlacer 00ba53057f Formatting fix 2023-07-30 23:02:07 -05:00
Moonlacer 4aa1ebb802 Match log warning 2023-07-30 22:50:22 -05:00
Moonlacer 699ab3050c Formatting fix 2023-07-30 04:29:51 -05:00
Moonlacer 30a5e8e165 Address feedback and change log warning 2023-07-30 04:01:29 -05:00
Moonlacer 3ca86ca6b2 Revert "Revert "Blacklist EDS3 blending from new AMD drivers"" 2023-07-30 00:21:51 -05:00
GPUCode 25bc2dbedb vk_descriptor_pool: Disallow descriptor set free 2023-07-27 18:08:56 +03:00
Morph a8f6941fd6 vulkan_device: Return true if either depth/stencil format supports blit
On devices that don't support D24S8 but support D32S8, this should still return true if D32S8 supports src and dst blit
2023-07-26 20:21:37 -04:00
Moonlacer 9d21ddd2c1 Revert "Blacklist EDS3 blending from new AMD drivers" 2023-07-26 15:02:48 -05:00
liamwhite a28a0c47f8 Merge pull request #10990 from comex/ubsan
Fixes and workarounds to make UBSan happier on macOS
2023-07-26 10:33:28 -04:00
liamwhite ddb55725a1 Merge pull request #11098 from GPUCode/texel-buffers
buffer_cache: Increase number of texture buffers
2023-07-22 11:17:27 -04:00
lat9nq ed14cd8748 settings,opengl,yuzu-qt: Fix AA, Filter maximums
The new enum macros don't support setting values directly.
For LastAA and LastFilter, this means we need a simpler approach to loop
around the toggle in the frontend...
2023-07-21 10:56:55 -04:00
lat9nq 78f92086ca settings,general: Rename non-conforming enums 2023-07-21 10:56:54 -04:00
lat9nq 4a5f3e4733 configure_graphics_advance: Generate UI at runtime
We can iterate through the AdvancedGraphics settings and generate the UI
during runtime. This doesn't help runtime efficiency, but it helps a ton
in reducing the amount of work a developer needs in order to add a new
setting.
2023-07-21 10:56:07 -04:00
lat9nq fc30b04714 settings,video_core: Consolidate ASTC decoding options
Just puts them all neatly into one place.
2023-07-21 10:56:07 -04:00
lat9nq aa21a2ea3c vk_buffer_cache: Format 2023-07-18 19:56:20 -04:00
lat9nq 30e4e8c2f4 general: Silence -Wshadow{,-uncaptured-local} warnings
These occur in the latest commits in LLVM Clang.
2023-07-18 19:31:35 -04:00
GPUCode 7e9f75453f buffer_cache: Increase number of texture buffers 2023-07-15 23:09:58 +03:00
comex 85d77f636c Fixes and workarounds to make UBSan happier on macOS
There are still some other issues not addressed here, but it's a start.

Workarounds for false-positive reports:

- `RasterizerAccelerated`: Put a gigantic array behind a `unique_ptr`,
  because UBSan has a [hardcoded limit](https://stackoverflow.com/questions/64531383/c-runtime-error-using-fsanitize-undefined-object-has-a-possibly-invalid-vp)
  of how big it thinks objects can be, specifically when dealing with
  offset-to-top values used with multiple inheritance.  Hopefully this
  doesn't have a performance impact.

- `QueryCacheBase::QueryCacheBase`: Avoid an operation that UBSan thinks
  is UB even though it at least arguably isn't.  See the link in the
  comment for more information.

Fixes for correct reports:

- `PageTable`, `Memory`: Use `uintptr_t` values instead of pointers to
  avoid UB from pointer overflow (when pointer arithmetic wraps around
  the address space).

- `KScheduler::Reload`: `thread->GetOwnerProcess()` can be `nullptr`;
  avoid calling methods on it in this case.  (The existing code returns
  a garbage reference to a field, which is then passed into
  `LoadWatchpointArray`, and apparently it's never used, so it's
  harmless in practice but still triggers UBSan.)

- `KAutoObject::Close`: This function calls `this->Destroy()`, which
  overwrites the beginning of the object with junk (specifically a free
  list pointer).  Then it calls `this->UnregisterWithKernel()`.  UBSan
  complains about a type mismatch because the vtable has been
  overwritten, and I believe this is indeed UB.  `UnregisterWithKernel`
  also loads `m_kernel` from the 'freed' object, which seems to be
  technically safe (the overwriting doesn't extend as far as that
  field), but seems dubious.  Switch to a `static` method and load
  `m_kernel` in advance.
2023-07-15 12:00:28 -07:00
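To illustrate the `PageTable`/`Memory` item in the commit above, here is a minimal sketch of why integer arithmetic avoids the pointer-overflow UB it mentions. It assumes a scheme where a "host address minus guest base" value is stored and the guest address is added back later. The names `MakeBiasedBase`, `Resolve`, and `RoundTripsThroughWrap` are hypothetical and are not taken from the yuzu source.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: stores "host address minus guest base" as an integer.
// Unsigned subtraction wraps modulo 2^N by definition, so this is well-defined
// even when guest_base is numerically larger than the host address. The same
// computation on a raw pointer would leave the bounds of its array, which is UB.
std::uintptr_t MakeBiasedBase(const std::uint8_t* host_block, std::uintptr_t guest_base) {
    assert(host_block != nullptr);
    return reinterpret_cast<std::uintptr_t>(host_block) - guest_base;
}

// Adds the guest address back. The wrap-around cancels out, and only the final
// in-range integer is converted to a pointer. (The integer-to-pointer mapping is
// implementation-defined; this sketch assumes a flat address space.)
std::uint8_t* Resolve(std::uintptr_t biased_base, std::uintptr_t guest_addr) {
    return reinterpret_cast<std::uint8_t*>(biased_base + guest_addr);
}

// Demonstration: a guest base near the top of the address space makes the
// biased value wrap, and the resolved pointer is compared with the expected one.
bool RoundTripsThroughWrap() {
    static std::uint8_t block[16] = {};
    const std::uintptr_t guest_base = ~std::uintptr_t{0} - 0xFFF;
    const std::uintptr_t biased = MakeBiasedBase(block, guest_base);
    return Resolve(biased, guest_base + 5) == block + 5;
}
```

The design point is that the intermediate value never exists as a pointer, so there is no out-of-bounds pointer for UBSan to flag.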
Alexandre Bouvier dad3ef76a2 cmake: allow using system VMA library 2023-07-12 04:51:45 +02:00
bunnei ab18aeb500 Merge pull request #10996 from Kelebek1/readblock_optimisation
Use spans over guest memory where possible instead of copying data
2023-07-10 18:54:19 -07:00
liamwhite 5688b55070 Merge pull request #10994 from liamwhite/ue4-preferred
vulkan_common: use device local preferred for image memory
2023-07-05 09:23:56 -04:00
liamwhite 81a137aa71 Merge pull request #11012 from gidoly/metroid-fix
Fix regression by unreal engine fix pr #11009
2023-07-05 09:23:34 -04:00
bunnei 66a20ecbc7 video_core: vulkan_device: Disable timeline semaphore on Turnip, fix qcom version check. 2023-07-03 19:25:06 -07:00
bunnei d8cda2c0b6 Merge pull request #10964 from bunnei/gpu-remove-qcom-check
video_core: vulkan_device: Fix S8Gen2 dynamic state checks.
2023-07-03 16:59:29 -07:00
bunnei 3bf2a14213 video_core: vulkan_device: Change to driver version check. 2023-07-03 14:25:06 -07:00
gidoly 66cb683f1e oops re open 2023-07-03 20:25:23 +09:00
Kelebek1 42638691b5 Use spans over guest memory where possible instead of copying data. 2023-07-02 23:09:48 +01:00
liamwhite d81539ed2d Merge pull request #10479 from GPUCode/format-list
Add support for VK_KHR_image_format_list
2023-07-02 17:38:21 -04:00
liamwhite 1bd420593c Merge pull request #10942 from FernandoS27/android-is-a-pain-in-the-a--
Memory Tracking: Add mechanism to register small writes when gpu page is contested by GPU
2023-07-02 11:29:01 -04:00
Liam aa2743de67 vulkan_common: use device local preferred for image memory 2023-07-01 23:44:57 -04:00
Liam c9cbfadcdc Revert "texture_cache: Fix incorrect logic for AccelerateDMA"
This reverts commit e9c07146d8.
2023-07-01 23:37:50 -04:00
liamwhite 2a11936fa3 Merge pull request #10984 from comex/cob
Minor cleanup in BufferCacheRuntime::ReserveNullBuffer
2023-07-01 22:38:33 -04:00
liamwhite 004b9609b0 Merge pull request #10974 from Steveice10/macos_vk
vulkan: Improvements to macOS surface creation
2023-07-01 22:38:26 -04:00
liamwhite ab339d1af3 Merge pull request #10970 from Morph1984/thing
general: Misc changes that did not deserve their own PRs
2023-07-01 22:38:18 -04:00
comex 1e1b0dccaf Minor cleanup in BufferCacheRuntime::ReserveNullBuffer
As far as I can tell, there is no reason to OR this bit in separately.
2023-07-01 12:00:25 -07:00
GPUCode 4270b443f8 renderer_vulkan: Fix some missing view formats
* Many times the format itself wouldn't have been added to the list causing device losses for nvidia GPUs

* Also account for ASTC acceleration storage views
2023-07-01 16:03:35 +03:00
GPUCode b7e726669e renderer_vulkan: Add support for VK_KHR_image_format_list 2023-07-01 16:03:29 +03:00
Steveice10 19a0345f69 vulkan: Use newer VK_EXT_metal_surface to create surface for MoltenVK. 2023-06-30 23:46:03 -07:00
Morph 10f95299eb maxwell_dma: Specify dst_operand.pitch instead of a temp var 2023-06-30 21:49:59 -04:00
Morph b8004b2472 general: Use ScratchBuffer where possible 2023-06-30 21:49:59 -04:00
Fernando S 9cb5d582d6 Merge pull request #10953 from FernandoS27/oh-oopsies-yfc
Texture cache: Fix YFC regression due to code testing
2023-06-30 20:25:09 +02:00
Fernando S 068fdeb0e8 Merge pull request #10956 from FernandoS27/pikmin-another-game-ill-hate
AccelerateDMA: Don't accelerate 3D texture DMA operations
2023-06-30 09:37:07 +02:00
bunnei bdf171633f video_core: vulkan_device: Scope S8Gen2 checks to just Qualcomm. 2023-06-29 18:41:38 -07:00
bunnei de534a8b82 video_core: vulkan_device: Fix S8Gen2 dynamic state checks. 2023-06-29 17:37:42 -07:00
Fernando Sahmkow 71c38a6eb3 AccelerateDMA: Don't accelerate 3D texture DMA operations 2023-06-29 17:23:29 +02:00
Fernando Sahmkow 8efc8dba3e Texture cache: Fix YFC regression due to code testing 2023-06-29 11:58:45 +02:00
Matías Locatti 64640b6d07 Blacklist EDS3 blending from new AMD drivers 2023-06-28 20:10:27 -03:00
Fernando Sahmkow 4f68a8f45a Memory Tracking: Optimize tracking to only use atomic writes when contested with the host GPU 2023-06-28 21:32:45 +02:00
Fernando Sahmkow 7ae0cdbb09 MemoryTracking: Initial setup of atomic writes. 2023-06-28 19:34:21 +02:00
GPUCode 9e58301aec renderer_vulkan: Prevent crashes when blitting depth stencil 2023-06-27 18:00:09 -07:00
GPUCode 5196f05cec video_core: Add BCn decoding support 2023-06-27 18:00:09 -07:00
GPUCode 8a829a12b6 renderer_vulkan: Add more feature checking 2023-06-27 18:00:09 -07:00
GPUCode d8a98f124a renderer_vulkan: Don't assume debug tool with debug renderer
* Causes crashes because mali drivers don't support debug utils
2023-06-27 18:00:09 -07:00
GPUCode 5011526a94 renderer_vulkan: Bump minimum SPIRV version
* 1.3 is guaranteed on all 1.1 drivers
2023-06-27 18:00:09 -07:00
GPUCode 035b4eaf46 renderer_vulkan: Respect viewport limit 2023-06-27 18:00:09 -07:00
GPUCode 1af4dc2ed7 renderer_vulkan: Don't add transform feedback flag if unsupported 2023-06-27 18:00:09 -07:00
GPUCode 843d93b951 renderer_vulkan: Add support for debug report callback 2023-06-27 18:00:09 -07:00
liamwhite 8a679be44b Merge pull request #10867 from Kelebek1/dma_safe
Use safe reads in DMA engine
2023-06-27 11:21:47 -04:00
liamwhite 4f21c05522 Merge pull request #10473 from GPUCode/vma
Use vulkan memory allocator
2023-06-27 11:21:36 -04:00
GPUCode 7a8631cd45 externals: Use cmake subdirectory 2023-06-26 18:59:24 +03:00
Kelebek1 c80b6bfb83 Use safe reads in DMA engine 2023-06-26 11:34:02 +01:00
ameerj 5ae4d9983b OpenGL: Limit lmem warmup to NVIDIA
🐸
2023-06-25 19:06:51 -04:00
ameerj 28cecc6cd8 shaders: Track local memory usage 2023-06-25 18:59:33 -04:00
ameerj b2349d75f4 OpenGL: Add Local Memory warmup shader 2023-06-25 18:43:23 -04:00
liamwhite fa8419f54e Merge pull request #10859 from liamwhite/no-more-atomic-wait
general: remove atomic signal and wait
2023-06-23 09:27:14 -04:00
GPUCode c813876c5a vulkan_common: Remove required flags
* Allows VMA to fallback to system RAM instead of crashing
2023-06-22 20:03:12 +03:00
Liam db40a2f430 general: remove atomic signal and wait 2023-06-22 09:25:23 -04:00
Kelebek1 c7430e51e3 Remove memory allocations in some hot paths 2023-06-22 08:05:10 +01:00
bunnei 72a469b967 Merge pull request #10086 from Morph1984/coretiming-ng-1
core_timing: Use CNTPCT as the guest CPU tick
2023-06-21 21:12:46 -07:00
bunnei 5a5080ba4e Merge pull request #10777 from liamwhite/no-barrier
video_core: optionally skip barriers on feedback loops
2023-06-21 21:10:08 -07:00
liamwhite 10f2beb17a Merge pull request #10818 from vonchenplus/render_target_samples
video_core: add samples check when find render target
2023-06-20 09:55:23 -04:00
liamwhite 5df094850f Merge pull request #10835 from lat9nq/intel-restrict-compute-disable
vulkan_device: Restrict compute disable only to affected Intel drivers
2023-06-20 09:55:14 -04:00
liamwhite 50fe67c0f1 Merge pull request #10840 from Kelebek1/unbug_blinks_brain
Use current GPU address when unmapping GPU pages, not the base
2023-06-20 09:55:01 -04:00
toast2903 f68b01a8cf vulkan_device: Remove brace initializer
Co-authored-by: Tobias <thm.frey@gmail.com>
2023-06-19 17:35:12 -04:00
lat9nq 1ad8df763f video_core: Check broken compute earlier
Checks it as the system is determining what settings to enable. Reduces
the need to check settings while the system is running.
2023-06-19 17:33:30 -04:00
Kelebek1 6bd6e24d6e Use current GPU address when unmapping GPU pages, not the base 2023-06-19 00:19:50 +01:00
lat9nq a74f77bbbc video_core: Formalize HasBrokenCompute
Also limits it to only affected Intel proprietary driver versions.

vulkan_device: Move broken compute determination

vk_device: Remove errant back quote
2023-06-18 16:15:47 -04:00
liamwhite 1ddf844419 Merge pull request #10829 from lat9nq/remove-external-mem
vulkan_device: Remove external memory extension
2023-06-18 09:43:03 -04:00
liamwhite 2f65ed20b7 Merge pull request #10798 from vonchenplus/draw_texture_scale
video_core: drawtexture support upscale
2023-06-18 09:42:41 -04:00
liamwhite e48b4b0b36 Merge pull request #10809 from Kelebek1/reduce_vertex_bindings
Synchronize vertex buffer even when it doesn't require binding
2023-06-18 09:42:32 -04:00
GPUCode 7b3718dc9c renderer_vulkan: Add missing initializers 2023-06-18 14:14:03 +03:00
GPUCode 66d3a1c5c7 renderer_vulkan: Use VMA for buffers 2023-06-18 12:45:18 +03:00
GPUCode d84d595dab renderer_vulkan: Use VMA for images 2023-06-18 12:45:18 +03:00
GPUCode fd9b920d2d memory_allocator: Remove OpenGL interop
* Appears to be unused atm
2023-06-18 12:45:18 +03:00
lat9nq 0a4650cd2b externals: Add vma and initialize it
video_core: Move vma implementation to library
2023-06-18 12:45:12 +03:00
lat9nq 38fe34a43f vulkan_device: Remove external memory extension
Unused in yuzu. Enables yuzu to boot games in Wine using Vulkan.
2023-06-18 01:20:08 -04:00
Liam e62d452bd9 renderer_vulkan: add missing include 2023-06-17 23:57:47 -04:00
Fernando S 06f47d34c8 Merge pull request #10744 from Wollnashorn/af-for-all
video_core: Improved anisotropic filtering heuristics
2023-06-18 00:02:05 +02:00
Kelebek1 547e837f78 Synchronize vertex buffer even when it doesn't require binding 2023-06-17 17:47:00 -04:00
FengChen 255ab12789 video_core: add samples check when find render target 2023-06-17 23:48:51 +08:00
Wollnashorn e10113e853 video_core: Only apply AF to 2D (array) image types 2023-06-17 14:20:44 +02:00
Wollnashorn 62b0b6bde0 video_core: Removed AF for all mip modes option as it's default now 2023-06-17 11:19:39 +02:00
bunnei 853249121d Merge pull request #10783 from liamwhite/memory
video_core: preallocate fewer IR blocks
2023-06-16 16:53:25 -07:00
Feng Chen c362895572 video_core: drawtexture support upscale 2023-06-16 20:51:15 +08:00
Wollnashorn 815f54385a video_core: Use sampler IDs instead of pointers in the pipeline config
The previous approach of storing pointers returned by `GetGraphicsSampler`/`GetComputeSampler` caused UB, as these functions can cause reallocation of the sampler slot vector and therefore invalidate the pointers.
2023-06-16 13:45:14 +02:00
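The sampler commit above describes a general hazard: pointers into a growing vector dangle after reallocation, while integer IDs stay valid. A minimal sketch of the ID-based approach follows. `Sampler`, `SamplerSlots`, and `LookupAfterGrowth` are hypothetical stand-ins for illustration, not yuzu's actual types or functions.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for a sampler object.
struct Sampler {
    std::uint32_t anisotropy;
};

// Hypothetical slot container that hands out integer IDs instead of pointers.
class SamplerSlots {
public:
    // Appends a sampler and returns its ID. The push_back may reallocate the
    // storage, which invalidates every pointer or reference taken earlier.
    std::size_t Insert(Sampler sampler) {
        slots.push_back(sampler);
        return slots.size() - 1;
    }

    // Resolves an ID at the point of use, after any reallocation has happened.
    const Sampler& Get(std::size_t id) const {
        assert(id < slots.size());
        return slots[id];
    }

private:
    std::vector<Sampler> slots;
};

// Stores an ID, grows the container far past its first allocation, then
// resolves the ID and returns the anisotropy value read back through it.
std::uint32_t LookupAfterGrowth() {
    SamplerSlots slots;
    const std::size_t first_id = slots.Insert(Sampler{16});
    for (std::uint32_t i = 0; i < 1000; ++i) {
        slots.Insert(Sampler{1}); // growth like this normally triggers reallocation
    }
    return slots.Get(first_id).anisotropy;
}
```

Holding a `const Sampler*` from before the loop and dereferencing it afterwards would be undefined behaviour, which is why the sketch never keeps one.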
bunnei 837d487905 Merge pull request #10790 from liamwhite/arm-driver-moment
vulkan_device: disable extended_dynamic_state2 on ARM drivers
2023-06-15 18:34:31 -07:00
bunnei 981332d727 Merge pull request #10775 from liamwhite/cb2
renderer_vulkan: propagate conditional barrier support
2023-06-15 17:37:03 -07:00
Wollnashorn eff77dae59 video_core: Fall back to default anisotropy instead of 1x anisotropy 2023-06-15 23:16:26 +02:00
Wollnashorn e405fb1c72 video_core: Disable AF for non-color image formats 2023-06-15 20:59:33 +02:00
Wollnashorn 1f7c69934d video_core: Fixed compilation errors because of name shadowing 2023-06-15 18:46:40 +02:00
Liam 0875e158fe vulkan_device: disable extended_dynamic_state2 on ARM drivers 2023-06-15 12:29:54 -04:00
Wollnashorn 1844cad9d4 video_core: Add per-image anisotropy heuristics (format & mip count) 2023-06-15 18:19:32 +02:00
Liam c913c891e0 video_core: preallocate fewer IR blocks 2023-06-14 21:37:57 -04:00
Liam d0837e10ae video_core: optionally skip barriers on feedback loops 2023-06-14 14:11:46 -04:00
Liam e77190ffab renderer_vulkan: propagate conditional barrier support 2023-06-14 10:49:40 -04:00
Wollnashorn 04782a922d video_core: Apply AF only to samplers with normal LOD range [0, 1+x] 2023-06-14 13:27:27 +02:00
Wollnashorn 9f46c7724b video_core: Fix default anisotropic heuristic 2023-06-14 11:21:22 +02:00
Wollnashorn 614f8a0429 video_core: Never apply AF to None mipmap mode
Should fix some artifacts with the "apply anisotropic filtering for all mipmap modes" option
2023-06-14 03:57:39 +02:00
Wollnashorn ff4c4a45e6 video_core: Disable anisotropic filtering for samplers with depth compare 2023-06-13 21:32:32 +02:00
Morph 9da90de908 buffer_cache_base: Specify buffer type in HostBindings
Avoid reinterpret-casting from void pointer since the type is already known at compile time.
2023-06-13 00:59:42 -04:00
Wollnashorn 6f1fb4c28a video_core: Option to apply anisotropic filtering for all mipmap modes 2023-06-13 03:21:01 +02:00
liamwhite aab6e3098d Merge pull request #10675 from liamwhite/scaler
image_info: adjust rescale thresholds and refactor constant use
2023-06-12 21:16:36 -04:00
Matías Locatti 28e1429daf Merge pull request #10699 from liamwhite/conditional-barrier
shader_recompiler: remove barriers in conditional control flow when device lacks support
2023-06-12 16:50:59 -03:00
bunnei d40c8428a0 Merge pull request #10693 from liamwhite/f64-to-f32
shader_recompiler: translate f64 to f32 when unsupported on host
2023-06-12 12:46:54 -07:00
bunnei 866b7c0632 Merge pull request #10668 from Kelebek1/reduce_vertex_bindings
Combine vertex/transform feedback buffer binding into a single call
2023-06-11 11:33:48 -07:00
bunnei e1402935d9 android: Fix screen orientation & blurriness. 2023-06-10 15:13:06 -07:00
Liam 947a4f6141 shader_recompiler: translate f64 to f32 when unsupported on host 2023-06-10 12:38:49 -04:00
Liam b646ac2908 shader_recompiler: remove barriers in conditional control flow when device lacks support 2023-06-10 12:30:39 -04:00
Liam 2046bead0e image_info: adjust rescale thresholds and refactor constant use 2023-06-08 17:46:40 -04:00
Liam 7e5be01a48 vk_blit_screen: use higher bit depth for fxaa 2023-06-08 11:27:57 -04:00
Kelebek1 ac23abacac Combine vertex/transform feedback buffer binding into a single call 2023-06-08 12:13:27 +01:00
Morph 1b83c7eab4 (wall, native)_clock: Add GetGPUTick
Allows us to directly calculate the GPU tick without double conversion to and from the host clock tick.
2023-06-07 21:44:42 -04:00
Morph 2856fadaa0 core_timing: Use CNTPCT as the guest CPU tick
Previously, we were mixing the raw CPU frequency and CNTFRQ.
The raw CPU frequency (1020 MHz) should never have been used, as CNTPCT (whose frequency is CNTFRQ) is the only counter available.
2023-06-07 21:44:42 -04:00
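The tick arithmetic behind the two clock commits above can be sketched as follows. This is a minimal illustration under stated assumptions, not the project's code: `NsToTicks` and `kCntFrq` are hypothetical names, and the 19.2 MHz value is an assumed CNTFRQ. Only the 1020 MHz figure appears in the message. The idea shown is deriving a tick count from elapsed nanoseconds and a single frequency in one step, rather than passing through a second clock domain.

```cpp
#include <cassert>
#include <cstdint>

// Assumed value for illustration only: counter frequency (CNTFRQ) in Hz.
constexpr std::uint64_t kCntFrq = 19'200'000;

// Converts elapsed nanoseconds to ticks of a counter running at frequency_hz.
// Splitting into whole seconds and a sub-second remainder avoids a 128-bit
// multiply: the remainder is below 1e9, so remainder * frequency fits in
// 64 bits for the frequencies allowed by the assert.
inline std::uint64_t NsToTicks(std::uint64_t ns, std::uint64_t frequency_hz) {
    constexpr std::uint64_t kNsPerSecond = 1'000'000'000;
    assert(frequency_hz <= UINT64_MAX / kNsPerSecond);
    const std::uint64_t seconds = ns / kNsPerSecond;
    const std::uint64_t remainder_ns = ns % kNsPerSecond;
    return seconds * frequency_hz + remainder_ns * frequency_hz / kNsPerSecond;
}
```

For example, one second at the assumed `kCntFrq` yields exactly `kCntFrq` ticks, and half a second at 1020 MHz yields 510,000,000 ticks.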
liamwhite 06a6786a42 Merge pull request #10635 from mrcmunir/l4t-tx1-nvidia
Make VK_EXT_robustness2 optional
2023-06-07 14:04:14 -04:00
liamwhite 93372f503a Merge pull request #10476 from ameerj/gl-memory-maps
OpenGL: Make use of persistent buffer maps in buffer cache
2023-06-07 14:03:57 -04:00
liamwhite c2958ae5b6 Merge pull request #10583 from ameerj/ill-logic
AccelerateDMA: Fix incorrect check in Buffer<->Texture copies
2023-06-07 14:03:40 -04:00
Carlos Estrague / Mrc_munir 1de6e7a3e5 Updated to lexicographical order suggestions 2023-06-06 19:33:52 +02:00
Carlos Estrague / Mrc_munir e450a7d28c Make VK_EXT_robustness2 optional
For some reason, NVIDIA's Vulkan 1.2 implementation for Tegra X1/X2 does not support VK_EXT_robustness2.

Fixes Vulkan on the TX1/TX2 L4T drivers.
2023-06-06 06:32:47 +02:00
bunnei f4dd94ab58 android: vk_presentation_manager: Fix unused needs_recreation. 2023-06-03 00:06:08 -07:00
bunnei 8e9813a618 android: vk_turbo_mode: Remove unnecessary device recreation.
- Fixes a rare crash.
2023-06-03 00:06:08 -07:00
bunnei fb362f0b6e android: renderer_vulkan: Fix crash with surface recreation. 2023-06-03 00:06:07 -07:00
bunnei d57495d3c0 android: Fix presentation layout on foldable and tablet devices. 2023-06-03 00:06:07 -07:00
bunnei 445a1f1b18 video_core: vk_rasterizer: Decrease draw dispatch count for Android. 2023-06-03 00:06:04 -07:00
bunnei 230dd8192d android: GPU: Enable async presentation, increase frames in flight. 2023-06-03 00:06:03 -07:00
bunnei c55db7e03d android: vulkan_device: Skip BGR565 emulation on S8gen2. 2023-06-03 00:06:01 -07:00
bunnei 4e2cdf74a3 android: vulkan_device: Only compile OverrideBcnFormats when used. 2023-06-03 00:06:00 -07:00
Liam 5d9250daf4 android: remove spurious warnings about BCn formats when patched with adrenotools 2023-06-03 00:06:00 -07:00
bunnei ac32fd08e9 android: video_core: Disable some problematic things on GPU Normal. 2023-06-03 00:06:00 -07:00
bunnei baa09b9cef android: video_core: Disable problematic compute shaders.
- Fixes #104.
2023-06-03 00:06:00 -07:00
bunnei 2650faea9d android: vulkan: Recreate surface after suspension & adapt to async. presentation. 2023-06-03 00:05:59 -07:00
bunnei 3571f28cde video_core: Enable support_descriptor_aliasing on Turnip, disable storage atomic otherwise. 2023-06-03 00:05:58 -07:00
bunnei 2810793b17 android: vulkan: Disable vertex_input_dynamic_state on Qualcomm. 2023-06-03 00:05:51 -07:00
bunnei e8efc6121d android: vulkan_debug_callback: Ignore many innocuous errors. 2023-06-03 00:05:50 -07:00
bunnei bf598273e9 android: vulkan_device: Disable VK_EXT_custom_border_color on Adreno.
- Causes crashes on sampler creation with Super Mario Odyssey.
2023-06-03 00:05:48 -07:00
Liam d54605d1a5 build: only enable adrenotools on arm64 2023-06-03 00:05:43 -07:00
liushuyu 44a629e584 video_core: fix clang-format errors 2023-06-03 00:05:33 -07:00
bunnei ea54161dbf video_core: vulkan_device: Correct error message for unsuitable driver. 2023-06-03 00:05:32 -07:00
bunnei 27250ee9ad android: vulkan: Implement adrenotools turbo mode. 2023-06-03 00:05:32 -07:00
bunnei 6ae51eff8a android: vulkan_device: Disable VK_EXT_extended_dynamic_state2 on Qualcomm.
- Newer drivers report this as supported, but it is broken.
2023-06-03 00:05:32 -07:00
bunnei 74e76421e6 android: native: Add support for custom Vulkan driver loading. 2023-06-03 00:05:31 -07:00
bunnei 56600190e4 core: frontend: Refactor GraphicsContext to its own module. 2023-06-03 00:05:31 -07:00
Billy Laws cfbe4b09eb Avoid using VectorExtractDynamic for subgroup mask on Adreno GPUs
This crashes their shader compiler for some reason.
2023-06-03 00:05:31 -07:00
Billy Laws 2beb3051c1 Implement scaled vertex buffer format emulation
These formats are unsupported by mobile GPUs so they need to be emulated in shaders instead.
2023-06-03 00:05:31 -07:00
Billy Laws 58d420937c Disable push descriptors on adreno drivers
Regular descriptors are around 1.5x faster to update.
2023-06-03 00:05:31 -07:00
Billy Laws ca2c3a6d5a Disable VK_EXT_extended_dynamic_state on mali 2023-06-03 00:05:31 -07:00
Billy Laws b2b069279e Disable multithreaded pipeline compilation on Qualcomm drivers
This causes crashes during compilation on several 6xx and 5xx driver versions.
2023-06-03 00:05:31 -07:00
Liam 46927d217c externals: add adrenotools for bcenabler 2023-06-03 00:05:28 -07:00
bunnei b3a74d7f73 video_core: vulkan_device: Device initialization for Adreno. 2023-06-03 00:05:28 -07:00
bunnei ce06e9e7fc video_core: vk_pipeline_cache: Disable support_descriptor_aliasing on Android. 2023-06-03 00:05:28 -07:00
bunnei f6f470fb4b video_core: vk_swapchain: Fix image format for Android. 2023-06-03 00:05:28 -07:00
bunnei 189bb7602c video_core: vk_blit_screen: Rotate viewport for Android landscape. 2023-06-03 00:05:27 -07:00
bunnei 6549cf8bd0 cmake: Integrate bundled FFmpeg for Android. 2023-06-03 00:05:26 -07:00
ameerj e9c07146d8 texture_cache: Fix incorrect logic for AccelerateDMA 2023-06-02 18:07:52 -04:00
liamwhite cd9f88e483 Merge pull request #10091 from Kelebek1/bc_bugggggg
Fix buffer overlap checking skipping a page for stream score right expand
2023-06-01 09:06:07 -04:00
liamwhite 90a3955fbb Merge pull request #10474 from GPUCode/you-left-me-waiting
Remove timeline semaphore wait
2023-06-01 09:05:30 -04:00
Kelebek1 3da7eafba7 Skip BufferCache tickframe with no channel state set 2023-05-30 21:57:13 +01:00
liamwhite a4a3df9e69 Merge pull request #10483 from ameerj/gl-cpu-astc
gl_texture_cache: Fix ASTC CPU decoding with compression disabled
2023-05-28 13:18:31 -04:00
liamwhite 01008297aa Merge pull request #10283 from danilaml/support-interlaced-videos
Add support for deinterlaced video playback
2023-05-28 13:17:58 -04:00
ameerj 514c224679 gl_texture_cache: Fix ASTC CPU decoding with compression disabled
gl_format was incorrectly being overwritten when compression was disabled
2023-05-28 13:14:51 -04:00
ameerj 41dfd9e4ec gl_staging_buffers: Optimization to reduce fence waiting 2023-05-28 00:38:47 -04:00
ameerj 8d223e8092 OpenGL: Make use of persistent buffer maps in buffer cache downloads
Persistent buffer maps were already used by the texture cache; this extends their usage to the buffer cache.

In my testing, using the memory maps for uploads was slower than the existing "ImmediateUpload" path, so the memory map usage is limited to downloads for the time being.
2023-05-28 00:38:46 -04:00
GPUCode 0dc4778654 renderer_vulkan: Remove timeline semaphore wait 2023-05-28 02:39:44 +03:00
Kelebek1 62c747f8a1 Move buffer bindings to per-channel state 2023-05-27 17:04:18 +01:00
Matías Locatti ebcfe440ba Merge pull request #10414 from liamwhite/anv-push-descriptor
vulkan_device: Enable VK_KHR_push_descriptor on newer ANV
2023-05-26 17:36:37 -03:00
Matías Locatti 9eab38567c Merge pull request #10418 from liamwhite/blink-and-youll-miss-it
texture_cache: process aliases and overlaps in the correct order
2023-05-26 17:36:09 -03:00
Kelebek1 eea071bf87 Fix buffer overlap checking skipping a page for stream score right expand 2023-05-26 10:35:46 +01:00
Liam 6c77a107a4 video_core: don't garbage collect during configuration 2023-05-25 12:03:12 -04:00
bunnei 62301e0f65 Merge pull request #10435 from FernandoS27/gotta-clean-mess-ups
Texture cache: revert wrong acceleration assumption
2023-05-24 21:00:53 -07:00
Fernando Sahmkow b0e5aa6725 Texture cache: revert wrong acceleration assumption 2023-05-24 10:52:02 +02:00
Fernando Sahmkow 769b1f0264 Texture Cache Util: Fix block depth adjustment on slices. 2023-05-24 10:06:58 +02:00
Fernando Sahmkow ce9a97ca48 texture_cache: process aliases and overlaps in the correct order 2023-05-24 09:53:42 +02:00
Fernando S 72c3cf6b32 Merge pull request #10422 from liamwhite/gc
video_core: tune garbage collection aggressiveness
2023-05-24 03:58:49 +02:00
Fernando S 178e8a6b0e Merge pull request #10398 from liamwhite/bcn
video_core: add ASTC recompression
2023-05-24 03:55:45 +02:00
Liam 4a54cea69a video_core: tune garbage collection aggressiveness 2023-05-23 12:55:14 -04:00
Liam 011dfe1db7 textures: add BC1 and BC3 compressors and recompression setting 2023-05-23 12:54:40 -04:00
liamwhite a496e853ff Merge pull request #10388 from GPUCode/fence-wait
vk_master_semaphore: Move fence wait on separate thread
2023-05-23 09:42:56 -04:00
liamwhite 7515655327 Merge pull request #10402 from liamwhite/uh
renderer_vulkan: barrier attachment feedback loops
2023-05-23 09:42:49 -04:00
Liam cdd20c6231 vulkan_device: Enable VK_KHR_push_descriptor on newer ANV 2023-05-22 19:53:20 -04:00
Liam 147f6129f4 renderer_vulkan: barrier attachment feedback loops 2023-05-22 18:10:16 -04:00
scorpion81 9c33fade59 Limit the device access memory to 4 GB
Hard-limits the device access memory to 4 GB for integrated Vulkan devices. This keeps the Steam Deck from going above 4 GB of VRAM usage (above that value, the likelihood of a crash also rises once RAM usage exceeds 12 GB).

A detection mechanism will probably be needed to find the real memory limit of integrated Vulkan devices. Those likely have small limits anyway, but what about integrated GPUs on machines with more than 16 GB of RAM?
2023-05-22 16:48:55 +02:00
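The clamp described in the message above might look roughly like this. It is a sketch only: `DeviceAccessMemory`, `is_integrated`, and `reported_bytes` are hypothetical names standing in for values a real backend would query from the driver, and reading "4 GB" as 4 GiB is an assumption.

```cpp
#include <algorithm>
#include <cstdint>

// Assumption: "4 GB" in the message is taken to mean 4 GiB here.
constexpr std::uint64_t kIntegratedLimit = 4ULL << 30;

// Returns how much device-local memory the allocator may use.
// Discrete devices keep the reported amount; integrated devices are capped.
inline std::uint64_t DeviceAccessMemory(bool is_integrated, std::uint64_t reported_bytes) {
    if (!is_integrated) {
        return reported_bytes;
    }
    return std::min(reported_bytes, kIntegratedLimit);
}
```

A fixed cap is the simplest option. The open question raised in the message, detecting the real limit per device, would replace `kIntegratedLimit` with a queried value.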
Danila Malyutin 6ab723eace Add support for deinterlaced videos playback
This is a follow up to #10254 to improve the playback of cut scenes in Layton's Mystery Journey.
It uses ffmpeg's yadif filter for deinterlacing.
2023-05-22 01:43:44 +04:00
GPUCode 7732ce8a92 vk_master_semaphore: Move fence wait on separate thread 2023-05-20 19:23:53 +03:00
Liam f532faa5c3 renderer_vulkan: remove wrong constexpr 2023-05-18 18:01:01 -04:00
lat9nq 6597d2a5d3 vulkan_device: Disable VK_KHR_push_descriptor on ANV
Mesa commit ff91c5ca42bc80aa411cb3fd8f550aa6fdd16bdc breaks
VK_KHR_push_descriptor usage on ANV drivers 22.3.0, so disable it
and allow games to boot.
2023-05-17 22:19:57 -04:00
bunnei de9a79402d Merge pull request #10262 from liamwhite/depth-clamp
vulkan_common: disable depth clamp dynamic state for older radv
2023-05-17 12:19:03 -07:00
liamwhite 12a4dbe8f1 Merge pull request #10217 from Kelebek1/clear_value
Use the rendertarget format of the correct RT rather than the first valid
2023-05-16 10:06:30 -04:00
liamwhite c8356ee137 Merge pull request #10181 from lat9nq/intel-compute-toggle
configure_graphics: Add option to enable compute pipelines for Intel proprietary
2023-05-15 12:05:24 -04:00
liamwhite 896bf929d9 Merge pull request #10249 from FernandoS27/sorry-i-am-late
Buffer Cache: Clear sync code.
2023-05-15 12:03:25 -04:00
liamwhite cee8ef154e Merge pull request #10254 from danilaml/fix-h264-decode
Fix missing pic_order_present_flag in h264 header
2023-05-15 12:03:14 -04:00
Fernando Sahmkow 525cb91e3b Buffer Cache: Clear sync code. 2023-05-15 01:50:21 +02:00
liamwhite 836b8e1d64 Merge pull request #10288 from liamwhite/vram-limits
vulkan_device: reserve extra memory to prevent swaps
2023-05-14 17:02:15 -04:00
Liam 41353d738a vulkan_device: reserve extra memory to prevent swaps 2023-05-14 16:49:59 -04:00
Liam b10b8b7a57 vulkan_common: fix incompatible property flags 2023-05-14 01:13:11 -04:00
Liam 50b42ab980 vulkan_common: disable depth clamp dynamic state for older radv 2023-05-13 00:37:17 -04:00
Danila Malyutin 84df6eb7f9 Fix missing pic_order_present_flag in h264 header
Fixes #9635
2023-05-12 22:30:59 +04:00
Kelebek1 8a5db1aeff Correctly track RT indexes for image aspect lookup during clears 2023-05-12 01:40:21 +01:00
liamwhite 4838605114 Merge pull request #10132 from Kelebek1/fermi_blit2
Allow Fermi blit accelerate to work without images in cache
2023-05-11 10:45:59 -04:00
liamwhite 855502e669 Merge pull request #10216 from Kelebek1/buffer_cache_region_checks
Swap order of checking/setting region modifications in the buffer_cache
2023-05-11 10:45:47 -04:00
Kelebek1 fc6c77f7ae Allow Fermi blit accelerate to add src/dst to the cache if they don't exist already. Use ScratchBuffers in the software blit path. 2023-05-11 06:42:38 +01:00
Liam 66732f3e22 renderer_vulkan: separate guest and host compute descriptor queues 2023-05-10 13:46:48 -04:00
Kelebek1 b72b1f0a4e Use the rendertarget format of the correct RT rather than the first valid 2023-05-09 22:13:15 +01:00
Kelebek1 05dcdf5793 Swap order of checking/setting region modifications in the buffer_cache 2023-05-09 20:21:08 +01:00
Fernando Sahmkow a1317c3a6e Texture Cache: Fix ASTC textures 2023-05-09 02:42:10 +02:00
Fernando Sahmkow 5fa8c8685e Texture cache: Only force flush the dma downloads 2023-05-07 23:46:12 +02:00
Fernando Sahmkow 8203f2d8e1 Buffer Cache: disable reactive flushing in it. 2023-05-07 23:46:12 +02:00
Fernando Sahmkow a7a63d119c Texture cache: reverse immediate flush changes 2023-05-07 23:46:12 +02:00
Fernando Sahmkow 1a2ed85a28 Buffer cache: always use async buffer downloads and fix regression. 2023-05-07 23:46:12 +02:00
Fernando Sahmkow 134c14f089 Address feedback, add CR notice, etc 2023-05-07 23:46:12 +02:00
Fernando Sahmkow dffc48b942 Query cache: stop updating pages as it's not affected by cpu writes 2023-05-07 23:46:12 +02:00
Fernando Sahmkow 62295b5069 Settings: add option to enable / disable reactive flushing 2023-05-07 23:46:12 +02:00
Fernando Sahmkow f1aa574448 Texture cache: sync the first flush. 2023-05-07 23:46:12 +02:00
Fernando Sahmkow 6bc60f78d9 GPU: Add Reactive flushing 2023-05-07 23:46:12 +02:00
liamwhite 28ed548196 Merge pull request #10081 from Kelebek1/copy_overlap_tick
Sort overlap_ids by modification tick before copy
2023-05-07 14:09:10 -04:00
liamwhite de45be2681 Merge pull request #10172 from Kelebek1/debug_validation_names
Log object names with debug renderer, add a GPU address to ImageViews
2023-05-07 14:09:03 -04:00
lat9nq 98f6fbd31c vk_pipeline_cache: Use setting to disable intel compute 2023-05-07 01:06:22 -04:00
bunnei 12c4c09b3f Merge pull request #10125 from lat9nq/vsync-select
configuration: Expose separate swap present modes
2023-05-06 21:55:39 -07:00
Kelebek1 d43a18a6ef Log object names with debug renderer, add a GPU address to ImageViews 2023-05-06 04:48:32 +01:00
liamwhite 64e46e723a Merge pull request #10145 from Kelebek1/code_size
Fix shader code resize to use word size rather than byte size
2023-05-04 14:44:02 -04:00
Fernando S c9a31835b6 Merge pull request #10153 from FernandoS27/a-quickie-fixie
Memory manager: Fix possible softlock
2023-05-04 03:56:53 +02:00
bunnei edac11f6c8 Merge pull request #10142 from FernandoS27/missing-astc
GPU: implement missing ASTC
2023-05-03 16:49:27 -07:00
Fernando Sahmkow d9b4380457 Memory manager: Fix possible softlock 2023-05-04 00:15:21 +02:00
bunnei 6f10c3fcd8 Merge pull request #10088 from FernandoS27/100-gelato-flavor-test-builds-later
Y.F.C Implement Asynchronous Fence manager and Rework Query async downloads
2023-05-03 15:10:22 -07:00
Fernando Sahmkow 94ecd260e3 GPU: implement missing ASTC 2023-05-03 11:33:28 -04:00
liamwhite 58b38d1761 Merge pull request #10151 from GPUCode/no-softlocks-please
Fix softlocks when disabling async present
2023-05-03 10:54:24 -04:00
Morph 2f29ad9d7e Merge pull request #10144 from liamwhite/dont-turbo
vulkan: disable turbo when debugging tool is attached
2023-05-03 10:53:03 -04:00
Morph 5e21f326b2 Merge pull request #10143 from liamwhite/fruit-company-moment
video_core: fix build on Apple Clang
2023-05-03 10:52:56 -04:00
GPUCode 40fa53e6d7 vk_present_manager: Fix softlocks when disabling async present 2023-05-03 07:50:10 +03:00
lat9nq cef9dca85f vk_swapchain: Use certain modes for unlocked
Uses mailbox, then immediate for unlocked framerate depending on
support for either. Also adds support for FIFO_RELAXED.

This function now assumes vsync_mode was originally configured to a value
that the driver supports.

vk_swapchain: ChooseSwapPresentMode determines updates

Simplifies swapchain a bit and allows us to change the present mode
during guest runtime.

vk_swapchain: Fix MSVC error

vk_swapchain: Enforce available present modes

Some frontends don't check the value of vsync_mode before committing it.
Just as well, since a driver update or misconfiguration could cause problems
in the swap chain.

vk_swapchain: Silence warnings

Silences GCC warnings implicit-fallthrough and shadow, which apparently
are not enabled on clang.
2023-05-02 21:52:43 -04:00
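The selection order described in the message above (mailbox, then immediate, for unlocked framerate; only modes the driver offers) can be sketched as below. This is an illustration, not the contents of vk_swapchain: the `PresentMode` enum stands in for the Vulkan present-mode values, `Supports` and the three-argument `ChoosePresentMode` signature are hypothetical, and falling back to FIFO when neither unlocked mode is available is an assumption based on FIFO being the one mode Vulkan requires every implementation to support.

```cpp
#include <algorithm>
#include <vector>

// Stand-ins for the Vulkan present modes named in the message above.
enum class PresentMode { Immediate, Mailbox, Fifo, FifoRelaxed };

// Hypothetical helper: true if the driver reported `mode` as available.
inline bool Supports(const std::vector<PresentMode>& available, PresentMode mode) {
    return std::find(available.begin(), available.end(), mode) != available.end();
}

// Picks the present mode for the swapchain.
// Unlocked framerate: prefer Mailbox, then Immediate.
// Otherwise: use the requested mode only if the driver offers it.
// Anything else falls back to Fifo (assumed fallback).
inline PresentMode ChoosePresentMode(const std::vector<PresentMode>& available,
                                     PresentMode requested, bool unlocked) {
    if (unlocked) {
        if (Supports(available, PresentMode::Mailbox)) {
            return PresentMode::Mailbox;
        }
        if (Supports(available, PresentMode::Immediate)) {
            return PresentMode::Immediate;
        }
        return PresentMode::Fifo;
    }
    return Supports(available, requested) ? requested : PresentMode::Fifo;
}
```

Checking the requested mode against the driver's list at selection time is what lets a stale or misconfigured setting degrade to FIFO instead of failing swapchain creation.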
lat9nq ff2197130f vulkan_surface: Pass only window info for surface creation
We don't need the whole EmuWindow when creating a surface,
and it creates onerous requirements outside of typical usage for
creating a surface elsewhere.
2023-05-02 21:51:30 -04:00
lat9nq 581d8f34ee configuration: Expose separate swap present modes
Previously, yuzu would try and guess which vsync mode to use given
different scenarios, but apparently we didn't always get it right. This
exposes the separate modes in a drop-down the user can select.

If a mode isn't available in Vulkan, it defaults to FIFO.
2023-05-02 21:51:29 -04:00
Kelebek1 3fc1615e28 Fix code resize to use word size rather than byte size 2023-05-02 23:52:21 +01:00
Liam 44b15592e8 vulkan: disable turbo when debugging tool is attached 2023-05-02 18:14:57 -04:00
Liam 2438a0b087 video_core: fix build on Apple Clang 2023-05-02 18:05:30 -04:00
GPUCode d56a40606c vk_present_manager: Add toggle for async presentation 2023-05-01 23:13:24 +03:00
GPUCode f9514cbc51 vk_blit_screen: Recreate FSR when frame is recreated
* Depends on the layout dimensions and thus should be recreated as well
2023-05-01 23:13:24 +03:00
GPUCode 373cfc636c renderer_vulkan: Fix crashing when updating descriptors
* During pipeline configuration, the function would acquire some payload space from the descriptor update queue,
  write the descriptor data on the GPU thread, and give the scheduler a pointer to the beginning of that space to update it later.
  TickFrame resets the payload cursor, which tracks acquires, back to the beginning of the buffer.
  This wasn't a problem before, since WaitWorker was called at the end of the frame, but now it is.
  If a frame writes at the cursor before the scheduler catches up, it will crash.

* To fix this, the payload buffer has been enlarged to account for the in-flight frames that are now allowed to exist.
  TickFrame now switches between the payload spaces instead of resetting.
GPUCode 8eede48a39 renderer_vulkan: Async presentation 2023-05-01 23:13:24 +03:00
Morph 98d1e50fb9 Merge pull request #10084 from FernandoS27/yuzu-goes-broom-broom
Y.F.C Buffer Cache Revamp
2023-05-01 11:08:02 -04:00
Fernando Sahmkow bd8abfe654 BufferCache: Fixes and address feedback 2023-05-01 11:43:26 +02:00
bunnei 4bcb509bbb Merge pull request #10110 from Morph1984/intel-disable-compute
vk_pipeline_cache: Skip compute pipelines on Intel proprietary drivers
2023-04-29 23:02:45 -07:00
Fernando Sahmkow f5d2ae4c5e Texture Cache: Release staging buffers on tick frame 2023-04-29 15:31:38 +02:00
Fernando Sahmkow 6e18a08510 Buffer Cache: Release staging buffers on tick frame 2023-04-29 00:46:31 +02:00
Fernando Sahmkow 917a21317f Clang: format and fix compile errors. 2023-04-29 00:46:31 +02:00
Fernando Sahmkow cd4d4072c7 Implement Async downloads in normal and fix a few issues. 2023-04-29 00:46:31 +02:00
Fernando Sahmkow 139995905e Buffer Cache rework: Setup async downloads. 2023-04-29 00:46:31 +02:00
Fernando Sahmkow 64c9a90c20 Buffer Cache: Fully rework the buffer cache. 2023-04-29 00:46:31 +02:00
Fernando Sahmkow cf34f7c745 Address Feedback & Clang Format 2023-04-29 00:18:21 +02:00
Fernando Sahmkow 3595172637 Maxwell3D: only update parameters on High 2023-04-29 00:18:21 +02:00
Fernando Sahmkow b22e1a2bce Accelerate DMA: Use texture cache async downloads to perform the copies
to host.

WIP
2023-04-29 00:18:21 +02:00
Fernando Sahmkow e2bfd9e8c4 TextureCache: refactor DMA downloads to allow multiple buffers. 2023-04-29 00:18:21 +02:00
Morph 79d97d07e2 vk_pipeline_cache: Skip compute pipelines on Intel proprietary drivers
Intel's SPIR-V shader compiler is broken. For now, skip compiling any compute pipelines until they fix this issue.
This is not a perfect workaround, as there are a small subset of non-compute pipelines that still cause it to crash, but this should cover the majority of crashes.
It is unfortunate that even with a test case reported 6 months ago the issue has not been fixed in favor of fixing "the most popular games and apps".
Intel, you can do better than this.
2023-04-28 17:59:36 -04:00
Fernando Sahmkow 0da4b879eb QueryCache: Fix write invalidation. 2023-04-28 23:53:46 +02:00
Fernando Sahmkow ff3cf7c1d9 MemoryManager: Fix race conditions. 2023-04-28 23:53:02 +02:00
Fernando Sahmkow f606fa3515 Clang format and address feedback 2023-04-24 12:38:47 +02:00
Fernando S f430449ddb Merge pull request #10051 from liamwhite/surface-capabilities
vulkan: pick alpha composite flags based on available values
2023-04-24 12:37:13 +02:00
Fernando S f151023e45 Merge pull request #10069 from liamwhite/log
maxwell_3d: fix out of bounds array access in size estimation
2023-04-24 12:36:24 +02:00
Fernando Sahmkow abe4e83b45 QueryCache: rework async downloads. 2023-04-23 22:04:14 +02:00
Fernando Sahmkow eeffe68b7f Accuracy Normal: reduce accuracy further for perf improvements in Project Lime 2023-04-23 22:03:44 +02:00
Fernando Sahmkow ae99dcd531 Fence Manager: implement async fence management in a separate thread. 2023-04-23 04:48:50 +02:00
Liam b84bab419c maxwell_3d: fix out of bounds array access in size estimation 2023-04-22 10:35:26 -04:00
Kelebek1 477cbd067e Sort overlap_ids by modification tick before copy 2023-04-22 14:02:10 +01:00
Kelebek1 0397e174ae Account for a pre-added offset when using Corner sample mode for 2D blits 2023-04-21 19:08:21 +01:00
Liam fb2af6a41e vulkan: use plain fences when timeline semaphores are not available 2023-04-14 22:53:37 -04:00
bunnei d1e4bc6202 Merge pull request #10030 from Wollnashorn/botw-amd-fix
shader_recompiler: Fix ImageGather rounding on AMD/Intel
2023-04-14 16:56:34 -07:00
Liam e2b2842929 vulkan: pick alpha composite flags based on available values 2023-04-13 16:38:20 -04:00
Wollnashorn 111c02760b video_core: Enable ImageGather rounding fix on AMD open source drivers 2023-04-12 17:11:02 +02:00
liamwhite 84efa203a7 Merge pull request #10008 from vonchenplus/texture_cache
video_core: update imageinfo implement
2023-04-11 11:59:18 -04:00
Wollnashorn dda107ffa7 video_core: Enable ImageGather with subpixel offset on Intel 2023-04-08 16:12:44 +02:00
Wollnashorn 45fb154f0d shader_recompiler: Add subpixel offset for correct rounding at `ImageGather`
On AMD, a subpixel offset of 1/512 of the texel size is applied to the texture coordinates at an ImageGather call to ensure the rounding at the texel centers is done the same way as on Maxwell or other Nvidia architectures.
See https://www.reedbeta.com/blog/texture-gathers-and-coordinate-precision/ for more details on why this might be necessary.

This should fix shadow artifacts at object edges in Zelda: Breath of the Wild (#9957, #6956).
2023-04-08 16:12:30 +02:00
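The coordinate adjustment described in the message above can be written out on the CPU side as follows. This is a sketch, not the recompiler's emitted shader code: `Vec2` and `ApplyGatherOffset` are hypothetical names, coordinates are assumed to be normalized, and the positive direction of the shift is an assumption.

```cpp
#include <cassert>

struct Vec2 {
    float x;
    float y;
};

// Shifts normalized texture coordinates by 1/512 of a texel on each axis.
// `width` and `height` are the texture size in texels. The sign of the shift
// (positive here) is an assumption of this sketch.
inline Vec2 ApplyGatherOffset(Vec2 coord, int width, int height) {
    assert(width > 0 && height > 0);
    constexpr float kSubpixel = 1.0f / 512.0f;
    return {coord.x + kSubpixel / static_cast<float>(width),
            coord.y + kSubpixel / static_cast<float>(height)};
}
```

For a 256-texel-wide texture the shift is 1/131072 in normalized units, small enough to leave ordinary sampling unchanged while nudging coordinates that sit exactly on a texel center.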
liamwhite fa846222da Merge pull request #10004 from Kelebek1/cubemap
[texture_cache] Only upload GPU-modified overlaps
2023-04-03 13:05:52 -04:00
Jan Beich 604f887377 externals: update Vulkan-Headers to v1.3.246 2023-04-01 05:38:54 +00:00
Feng Chen c7675caf71 video_core: Keep the definition of DimensionControl consistent with nvidia open doc 2023-03-31 12:33:07 +08:00
Max Dunbar 8b5becf71b Fixes 'Continous' typo 2023-03-29 19:26:12 -07:00
Kelebek1 de4fc71536 Only upload GPU-modified overlaps 2023-03-28 11:07:39 +01:00
liamwhite c0e0237b21 Merge pull request #9984 from liamwhite/global-memory
memory: rename global memory references to application memory
2023-03-27 12:16:40 -04:00
Morph 9308213232 video_core/macro: Make use of Common::HashValue 2023-03-25 23:52:26 -04:00
bunnei 82155e4000 Merge pull request #9985 from liamwhite/funny-meme
vulkan: fix scheduler chunk reserve
2023-03-24 23:40:17 -07:00
Ross Schlaikjer ee8f63ac65 Pass GPU page table by reference 2023-03-25 00:25:02 -04:00
Liam aea009216e vulkan: fix scheduler chunk reserve 2023-03-24 09:09:01 -04:00
Morph 1242e360bd Merge pull request #9975 from liamwhite/more-waiting
vulkan: fix more excessive waiting in scheduler
2023-03-24 00:19:43 -04:00
Liam 6eaef51cf2 memory: rename global memory references to application memory 2023-03-23 20:28:47 -04:00
liamwhite c8963299fa Merge pull request #9971 from Morph1984/q
bounded_threadsafe_queue: Use simplified impl of bounded queue
2023-03-23 10:00:31 -04:00
Morph f33cddc400 Merge pull request #9962 from Kelebek1/disable_srgb
[video_core] Disable SRGB border color conversion in samplers
2023-03-23 03:07:00 -04:00
Morph 62fd55e5fe bounded_threadsafe_queue: Deduplicate and add PushModes
Adds the PushModes Try and Wait to allow producers to specify how they want to push their data to the queue if the queue is full.
If the queue is full:
- Try will fail to push to the queue, returning false. Try only returns true if it successfully pushes to the queue. This may result in items not being pushed into the queue.
- Wait will wait until a slot is available to push to the queue, resulting in potential for deadlock if a consumer is not running.
2023-03-21 19:20:21 -04:00
Morph c4314b231f bounded_threadsafe_queue: Use simplified impl of bounded queue
Provides a simplified SPSC, MPSC, and MPMC bounded queue implementation using mutexes.
2023-03-21 19:17:32 -04:00
Liam af8ce05caa vulkan: fix more excessive waiting in scheduler 2023-03-19 13:40:33 -04:00
bunnei 4471e9effe Merge pull request #9778 from behunin/my-box-chevy
gpu_thread: Use bounded queue
2023-03-17 22:14:29 -07:00
Kelebek1 0a90adff87 Disable SRGB border color conversion for now, to fix shadows in Xenoblade. 2023-03-17 04:46:38 +00:00
liamwhite f47a6b3c8d Merge pull request #9955 from liamwhite/color-blend-equation
vulkan: disable extendedDynamicState3ColorBlendEquation on radv
2023-03-15 20:19:45 -04:00
liamwhite f3dfe9e5e1 Merge pull request #9931 from liamwhite/sched
vk_scheduler: split work queue waits and execution waits
2023-03-15 20:19:35 -04:00
Liam 09a866fe79 vulkan: disable extendedDynamicState3ColorBlendEquation on radv 2023-03-15 15:55:07 -04:00
liamwhite fc39bb0ef9 Merge pull request #9933 from vonchenplus/texture_format
video_core: Update texture format
2023-03-14 11:35:37 -04:00
FengChen 0f336df1ea video_core: Better defined ImageInfo parameters 2023-03-14 22:36:34 +08:00
liamwhite 853e5576e6 Merge pull request #9943 from vonchenplus/gentleman
video_core: Fix inline_index and draw_texture error
2023-03-13 13:45:17 -04:00
Liam 11814a4991 vk_scheduler: split work queue waits and execution waits 2023-03-12 17:19:44 -04:00
Liam 5be8a74b0c general: fix spelling mistakes 2023-03-12 11:33:01 -04:00
FengChen e067d314ba video_core: Fix ogl status error when draw_texture 2023-03-12 13:33:31 +08:00
FengChen 5a1d6233b2 video_core: Invalid index_buffer flag when inline_index draw 2023-03-12 13:21:26 +08:00
Fernando S 0edffb460d Merge pull request #9913 from ameerj/acc-dma-refactor
AccelerateDMA: Refactor Buffer/Image copy code and implement for OGL
2023-03-11 20:04:19 +01:00
liamwhite 68e1996e52 Merge pull request #9925 from ameerj/gl-sync-signal
OpenGL: Prefer glClientWaitSync for OGLSync objects
2023-03-10 13:55:22 -05:00
liamwhite 2b8955aaa4 Merge pull request #9917 from Morph1984/the-real-time
native_clock: Re-adjust the RDTSC frequency to its real frequency
2023-03-10 13:55:11 -05:00
Feng Chen 63a0d2661c video_core: Update texture format 2023-03-10 21:48:50 +08:00
liamwhite 89c9a9e145 Merge pull request #9822 from ameerj/buffcache-ssbo-addr
buffer_cache: Add logic for non-NVN storage buffer tracking
2023-03-09 09:18:39 -05:00
ameerj 625d716f56 OpenGL: Prefer glClientWaitSync for OGLSync objects
At least on Nvidia, glClientWaitSync with a timeout of 0 (non-blocking) is faster than glGetSynciv of GL_SYNC_STATUS.
2023-03-08 20:29:25 -05:00
liamwhite a9fc59a998 Merge pull request #9896 from Kelebek1/d24s8
Check all swizzle components for red, not just [0]
2023-03-08 09:16:06 -05:00
Morph ddb330121a core: Promote CPU/GPU threads to time critical
And also demote Audren and CoreTiming to High thread priority.
2023-03-07 21:17:46 -05:00
Liam d55cc3b004 general: fix type inconsistencies 2023-03-07 20:05:19 -05:00
liamwhite 4bdcafda58 Merge pull request #9889 from Morph1984/time-is-ticking
core_timing: Reduce CPU usage on Windows
2023-03-07 10:54:13 -05:00
ameerj bc5a8c664b gl_rasterizer: Implement AccelerateDMA DmaBufferImageCopy 2023-03-06 22:57:52 -05:00
ameerj e901a7f029 Refactor AccelerateDMA code 2023-03-06 22:57:45 -05:00
Fernando Sahmkow 82f37192ec Engines: Implement Accelerate DMA Texture. 2023-03-05 12:18:00 +01:00
Morph e25334b8b3 core_timing: Use higher precision sleeps on Windows
The precision of sleep_for and wait_for is limited to 1-1.5ms on Windows.
Using SleepForOneTick() allows us to sleep for exactly one interval of the current timer resolution.
This allows us to take advantage of systems that have a timer resolution of 0.5ms to reduce CPU overhead in the event loop.
2023-03-05 02:36:31 -05:00
Morph 7f06f21046 Merge pull request #9884 from liamwhite/service-cleanup
service: miscellaneous cleanups
2023-03-03 22:51:17 -05:00