Commit Graph

362 Commits

Author SHA1 Message Date
bunnei c916ad62e7 Merge pull request #3677 from FernandoS27/better-sync
Introduce Predictive Flushing and Improve ASYNC GPU
2020-04-22 22:09:38 -04:00
ReinUsesLisp 910decd9cb vk_pipeline_cache: Fix unintentional memcpy into optional
The intention behind this was to assign a float to from an uint32_t, but
it was unintentionally being copied directly into the std::optional.

Copy to a temporary and assign that temporary to std::optional. This can
be replaced with std::bit_cast<float> once we are in C++20.
2020-04-22 21:36:05 -03:00
Fernando Sahmkow 9fe7972120 Merge pull request #3653 from ReinUsesLisp/nsight-aftermath
renderer_vulkan: Integrate Nvidia Nsight Aftermath on Windows
2020-04-22 11:39:01 -04:00
Fernando Sahmkow 491aea4a91 Async GPU: Correct flushing behavior to be similar to old async GPU behavior. 2020-04-22 11:36:26 -04:00
Fernando Sahmkow d9f1d5a4fd ShaderCache/PipelineCache: Cache null shaders. 2020-04-22 11:36:25 -04:00
Fernando Sahmkow ea522da8b5 Address Feedback. 2020-04-22 11:36:24 -04:00
ReinUsesLisp 0b9454849d vk_fence_manager: Initial implementation 2020-04-22 11:36:19 -04:00
Fernando Sahmkow 1966f1d948 OpenGL: Guarantee writes to Buffers. 2020-04-22 11:36:18 -04:00
Fernando Sahmkow 7986c97ed2 GPU: Implement Flush Requests for Async mode. 2020-04-22 11:36:17 -04:00
Fernando Sahmkow af9f901764 FenceManager: Manage syncpoints and rename fences to semaphores. 2020-04-22 11:36:16 -04:00
Fernando Sahmkow 6092308fe4 Rasterizer: Document SignalFence & ReleaseFences and setup skeletons on Vulkan. 2020-04-22 11:36:14 -04:00
Fernando Sahmkow e7195b5f87 ThreadManager: Sync async reads on accurate gpu. 2020-04-22 11:36:12 -04:00
Fernando Sahmkow de53bc96c0 BufferCache: Implement OnCPUWrite and SyncGuestHost 2020-04-22 11:36:07 -04:00
Fernando Sahmkow c689dc6804 GPU: Refactor synchronization on Async GPU 2020-04-22 11:36:06 -04:00
Rodrigo Locatti 89ba13c7d2 Merge pull request #3718 from ReinUsesLisp/better-pipeline-state
fixed_pipeline_state: Pack structure, use memcmp and CityHash on it
2020-04-21 18:17:58 -03:00
Mat M fe0364e257 Merge pull request #3733 from ambasta/patch-2
Initialize quad_indexed_pass before uint8_pass
2020-04-20 20:36:46 -04:00
Fernando Sahmkow 555a1273c9 Merge pull request #3700 from ReinUsesLisp/stream-buffer-sizes
vk_stream_buffer: Fix out of memory on boot on recent Nvidia drivers
2020-04-20 09:37:42 -04:00
Amit Prakash Ambasta 7915dc7ac9 Initialize quad_indexed_pass before uint8_pass
Fixes Werror=reorder in gcc
2020-04-20 04:53:52 +05:30
bunnei a39273cb95 Merge pull request #3694 from ReinUsesLisp/indexed-quads
vk_compute_pass: Implement indexed quads
2020-04-19 16:52:40 -04:00
Jan Beich cc5e71c5ad renderer_vulkan: assume X11 if not Windows/macOS after 30bbdc653c
Render.Vulkan <Error> video_core/renderer_vulkan/renderer_vulkan.cpp:CreateInstance:131: Presentation not supported on this platform
Render.Vulkan <Error> video_core/renderer_vulkan/renderer_vulkan.cpp:CreateSurface:378: Presentation not supported on this platform
Core <Critical> core/core.cpp:Load:199: Failed to initialize system (Error 5)!
2020-04-19 00:32:23 +00:00
ReinUsesLisp 98266da47c fixed_pipeline_state: Hash and compare the whole structure
Pad FixedPipelineState's size to 384 bytes to be a multiple of 16.

Compare the whole struct with std::memcmp and hash with CityHash. Using
CityHash instead of a naive hash should reduce the number of collisions.
Improve used type traits to ensure this operation is safe.

With these changes the improvements to the hashable pipeline state are:

Optimized structure
Hash:            89 ns
Comparison:     103 ns
Construction*:  164 ns
Struct size:    384 bytes

Original structure
Hash:           148 ns
Equal:          174 ns
Construction*:  281 ns
Size:          1384 bytes

* Attribute state initialization is not measured

These measures are averages taken with std::chrono::high_accuracy_clock
on MSVC shipped on Visual Studio 16.6.0 Preview 2.1.
2020-04-18 19:57:26 -03:00
ReinUsesLisp 89c816a3cf fixed_pipeline_state: Pack blending state
Reduce FixedPipelineState's size to 364 bytes.
2020-04-18 19:23:35 -03:00
ReinUsesLisp d74b3a5a50 fixed_pipeline_state: Pack rasterizer state
Reduce FixedPipelineState's size to 600 bytes.
2020-04-18 19:22:57 -03:00
ReinUsesLisp cc6af27ae7 fixed_pipeline_state: Pack depth stencil state
Reduce FixedPipelineState's size to 632 bytes.
2020-04-18 19:22:11 -03:00
ReinUsesLisp fd2f04bbdc fixed_pipeline_state: Pack attribute state
Reduce FixedPipelineState's size from 1384 to 664 bytes
2020-04-18 19:21:19 -03:00
ReinUsesLisp 153845bba3 vk_stream_buffer: Fix out of memory on boot on recent Nvidia drivers
Nvidia recently introduced a new memory type for data streaming
(awesome!), but yuzu was assuming that all heaps had enough memory
for the assumed stream buffer size (256 MiB).

This worked fine on AMD but Nvidia's new memory heap was smaller than
256 MiB. This commit changes this assumption and allocates a bit less
than the size of the preferred heap, with a maximum of 256 MiB (to avoid
allocating all system memory on integrated devices).

- Fixes a crash on NVIDIA 450.82.0.0
2020-04-17 18:12:48 -03:00
ReinUsesLisp 14ab5c4b65 vk_compute_pass: Implement indexed quads
Implement indexed quads (GL_QUADS used with glDrawElements*) with a
compute pass conversion.

The compute shader converts from uint8/uint16/uint32 indices to uint32.
The format is passed through push constants to avoid having different
variants of the same shader.

- Used by Fast RMX
- Used by Xenoblade Chronicles 2 (it still has graphical due to
synchronization issues on Vulkan)
2020-04-16 21:12:32 -03:00
Fernando Sahmkow 7a9b83b658 Merge pull request #3600 from ReinUsesLisp/no-pointer-buf-cache
buffer_cache: Return handles instead of pointer to handles
2020-04-16 19:58:13 -04:00
ReinUsesLisp c1ad40a3cb buffer_cache: Return handles instead of pointer to handles
The original idea of returning pointers is that handles can be moved.
The problem is that the implementation didn't take that in mind and made
everything harder to work with. This commit drops pointer to handles and
returns the handles themselves. While it is still true that handles can
be invalidated, this way we get an old handle instead of a dangling
pointer.

This problem can be solved in the future with sparse buffers.
2020-04-16 02:33:34 -03:00
Lioncash 3c3928a5f7 video_core: Amend doxygen comment references
Fixes broken documentation references.
2020-04-15 22:33:29 -04:00
Fernando Sahmkow d06795c08a Merge pull request #3612 from ReinUsesLisp/red
shader/memory: Implement RED.E.ADD and minor changes to ATOM
2020-04-15 15:03:49 -04:00
Mat M 077ffb29fd Merge pull request #3668 from ReinUsesLisp/vtx-format-16ui
maxwell_to_vk: Add uint16 vertex formats
2020-04-15 11:43:52 -04:00
ReinUsesLisp 9db905ec1d maxwell_to_vk: Add uint16 vertex formats 2020-04-15 04:06:30 -03:00
ReinUsesLisp f643910f7b maxwell_to_vk: Add missing breaks
Avoid invalid fallbacks.
2020-04-15 04:05:33 -03:00
ReinUsesLisp ffc83d1a57 vk_blit_screen: Initialize all members in VkPipelineViewportStateCreateInfo
When the dynamic state is specified, pViewports and pScissors are
ignored, quoting the specification:

  pViewports is a pointer to an array of VkViewport structures, defining
  the viewport transforms. If the viewport state is dynamic, this member
  is ignored.

That said, AMD's proprietary driver itself seem to read it regardless of
what the specification says.
2020-04-15 03:30:08 -03:00
ReinUsesLisp a1ea6948e2 vk_rasterizer: Default to 1 viewports with a size of 0
Silence validation layer errors.
2020-04-14 04:44:34 -03:00
ReinUsesLisp c4dc51f168 renderer_vulkan: Integrate Nvidia Nsight Aftermath on Windows
Adds optional support for Nsight Aftermath. It is enabled through
ENABLE_NSIGHT_AFTERMATH in cmake. A path to the SDK has to be provided
by the environment variable NSIGHT_AFTERMATH_SDK.

Nsight Aftermath allows an application to generate "minidumps" of the
GPU state when a device loss happens. By analysing these on Nsight we
can know what a game was doing and why it triggered a device loss.

The dump is generated inside %APPDATA%\yuzu\log\gpucrash and this
directory is deleted every time a new instance is initialized with
Nsight enabled.

To enable it on yuzu there has a to be a driver and device capable of
running Nsight Aftermath on Vulkan. That means only Turing based GPUs
on the latest stable driver, beta drivers won't work for now.

It is manually enabled in Configuration>Debug>Enable Graphics Debugging
because when using all debugging capabilities there is a runtime cost.
2020-04-14 00:39:21 -03:00
ReinUsesLisp d351b10ad3 renderer_vulkan: Remove Nvidia checkpoints 2020-04-13 17:33:59 -03:00
ReinUsesLisp 8809b9fe3c renderer_vulkan: Catch device losses in more places 2020-04-13 17:33:59 -03:00
Rodrigo Locatti f2f2262c5e Merge pull request #3636 from ReinUsesLisp/drop-vk-hpp
renderer_vulkan: Drop Vulkan-Hpp
2020-04-13 17:08:04 -03:00
ReinUsesLisp 1f1e80c67d texture_cache: Remove preserve_contents
preserve_contents was always true. We can't assume we don't have to
preserve clears because scissored and color masked clears exist.

This removes preserve_contents and assumes it as true at all times.
2020-04-11 01:51:02 -03:00
ReinUsesLisp 40ef0d3b28 renderer_vulkan: Drop Vulkan-Hpp 2020-04-10 22:49:02 -03:00
bunnei aeac4d2d60 Merge pull request #3594 from ReinUsesLisp/vk-instance
yuzu: Drop SDL2 and Qt frontend Vulkan requirements
2020-04-10 20:06:55 -04:00
Fernando Sahmkow cacbe85d13 VkRasterizer: Eliminate Legacy code. 2020-04-08 18:59:09 -04:00
Fernando Sahmkow f00f6bbdb6 Memory: Address Feedback. 2020-04-08 13:40:46 -04:00
ReinUsesLisp 30bbdc653c yuzu: Drop SDL2 and Qt frontend Vulkan requirements
Create Vulkan instances and surfaces from the Vulkan backend.
2020-04-07 16:32:19 -03:00
ReinUsesLisp 3140098db9 renderer_vulkan: Query device names from the backend 2020-04-07 02:23:23 -03:00
Fernando Sahmkow 02f2fa510d Shader/Pipeline Cache: Use VAddr instead of physical memory for addressing. 2020-04-06 09:23:07 -04:00
Fernando Sahmkow 6c9f7db8af Query Cache: Use VAddr instead of physical memory for adressing. 2020-04-06 09:23:07 -04:00
Fernando Sahmkow 3728c7160f Buffer Cache: Use vAddr instead of physical memory. 2020-04-06 09:23:06 -04:00