Commit Graph

87 Commits

Author SHA1 Message Date
ameerj 899cf73819 vk_blit_screen: Fix non-accelerated texture size calculation
Addresses the potential OOB access in UnswizzleTexture.
2021-08-16 14:28:10 -04:00
yzct12345 46e4e6707f decoders: Optimize swizzle copy performance (#6790)
This makes UnswizzleTexture up to two times faster. It is the main bottleneck in NVDEC video decoding.
2021-08-02 11:18:58 -04:00
lat9nq 8aed753c16 decoders: Break instead of continue
continue causes a memory leak in A Hat in Time.
2021-06-04 05:12:14 -04:00
lat9nq 54f2a92203 decoders: Avoid out-of-bounds access
This is not a real fix, so assert here and continue before crashing.
2021-06-04 05:03:54 -04:00
ameerj cac341dbc7 renderer_vulkan: Accelerate ASTC decoding
Co-Authored-By: Rodrigo Locatti <reinuseslisp@airmail.cc>
2021-03-13 12:16:03 -05:00
ReinUsesLisp 4e4056f581 common/alignment: Rename AlignBits to AlignUpLog2
AlignUpLog2 describes what the function does better than AlignBits.
2021-01-15 04:13:33 -03:00
ReinUsesLisp d25b097e84 video_core: Rewrite the texture cache
The current texture cache has several points that hurt maintainability
and performance. It's easy to break unrelated parts of the cache
when doing minor changes. The cache can easily forget valuable
information about the cached textures by CPU writes or simply by its
normal usage.The current texture cache has several points that hurt
maintainability and performance. It's easy to break unrelated parts
of the cache when doing minor changes. The cache can easily forget
valuable information about the cached textures by CPU writes or simply
by its normal usage.

This commit aims to address those issues.
2020-12-30 03:38:50 -03:00
ReinUsesLisp 9a83f8794b textures/decoders: Fix block linear to pitch copies
There were two issues with block linear copies. First the swizzling was
wrong and this commit reimplements them.

The other issue was that these copies are generally used to download
render targets from the GPU and yuzu was not downloading them from
host GPU memory unless the extreme GPU accuracy setting was selected.
This commit enables cached memory reads for all accuracy levels.

- Fixes level thumbnails in Super Mario Maker 2.
2020-08-10 20:45:03 -03:00
bunnei 2e4a5d2110 Merge pull request #4324 from ReinUsesLisp/formats
video_core: Fix, add and rename pixel formats
2020-07-21 00:13:04 -04:00
ReinUsesLisp a068ce4c32 video_core: Rearrange pixel format names
Normalizes pixel format names to match Vulkan names. Previous to this
commit pixel formats had no convention, leading to confusion and
potential bugs.
2020-07-13 01:44:23 -03:00
ReinUsesLisp 2691f5e6b9 video_core/textures: Add and use SwizzleSliceToVoxel, and minor style changes
Change GOB sizes from free-functions to constexpr constants.

Add SwizzleSliceToVoxel, a function that swizzles a 2D array of pixels
into a 3D texture and use it for 3D copies.
2020-07-10 04:09:32 -03:00
Fernando Sahmkow 4d2b23d3c5 MaxwellDMA: Optimize micro copies. 2020-04-28 13:44:14 -04:00
Lioncash eaeb4520f7 General: Resolve warnings related to missing declarations 2020-04-16 23:43:34 -04:00
Fernando Sahmkow f1adfe6591 MaxwellDMA: Fixes, corrections and relaxations.
This commit fixes offsets on Linear -> Tiled copies, corrects z pos
fortiled->linear copies, corrects bytes_per_pixel calculation in tiled
-> linear copies and relaxes some limitations set by latest dma fixes
refactors.
2019-07-25 20:41:42 -04:00
Fernando Sahmkow b62d22a77d texture_cache: Style and Corrections 2019-06-20 21:24:47 -04:00
Fernando Sahmkow 18322c1369 decoders: correct block calculation 2019-06-20 21:38:34 -03:00
ReinUsesLisp 1d10810d2b video_core: Use un-shifted block sizes to avoid integer divisions
Instead of storing all block width, height and depths in their shifted
form:

block_width = 1U << block_shift;

Store them like they are provided by the emulated hardware (their
block_shift form). This way we can avoid doing the costly
Common::AlignUp operation to align texture sizes and drop CPU integer
divisions with bitwise logic (defined in Common::AlignBits).
2019-06-20 21:36:12 -03:00
ReinUsesLisp f46a979f19 gl_texture_cache: Add fast copy path 2019-06-20 21:36:11 -03:00
ReinUsesLisp 3b430b5605 gl_texture_cache: Initial implementation 2019-06-20 21:36:11 -03:00
Fernando Sahmkow 56c2b0ea86 Apply Const correctness to SwizzleKepler and replace u32 for size_t on iterators. 2019-04-16 12:00:46 -04:00
Fernando Sahmkow 15368c6070 Implement Block Linear copies in Kepler Memory. 2019-04-15 21:22:16 -04:00
bunnei d3f26c1546 video_core: Refactor to use MemoryManager interface for all memory access.
# Conflicts:
#	src/video_core/engines/kepler_memory.cpp
#	src/video_core/engines/maxwell_3d.cpp
#	src/video_core/morton.cpp
#	src/video_core/morton.h
#	src/video_core/renderer_opengl/gl_global_cache.cpp
#	src/video_core/renderer_opengl/gl_global_cache.h
#	src/video_core/renderer_opengl/gl_rasterizer_cache.cpp
2019-03-16 00:38:48 -04:00
ReinUsesLisp 3989075e5f gl_rasterizer_cache: Move format conversion to its own file 2019-02-26 20:08:27 -03:00
ReinUsesLisp 64612bf940 decoders: Minor style changes 2019-02-26 20:08:27 -03:00
David Marcec 1dfb0a513a Fixed uninitialized memory due to missing returns in canary
Functions which are suppose to crash on non canary builds usually don't return anything which lead to uninitialized memory being used.
2018-12-19 12:52:32 +11:00
FernandoS27 b509890e4c Implemented Tile Width Spacing 2018-11-26 09:05:12 -04:00
bunnei e84cceb645 Merge pull request #1717 from FreddyFunk/swizzle-gob
textures/decoders: Replace magic numbers
2018-11-18 20:13:00 -08:00
Frederic L d2dd9cfc1d Eliminated unnessessary memory allocation and copy (#1702) 2018-11-18 19:53:03 -08:00
Frederic Laing 9d588877ef textures/decoders: Replace magic numbers 2018-11-17 01:55:28 +01:00
Frederic Laing 8dcfc75e4e textures/decoders: Minor cleanup 2018-11-15 21:04:17 +01:00
greggameplayer ec188ec832 Implement ASTC_2D_10X8 & ASTC_2D_10X8_SRGB (#1666)
* Implement ASTC_2D_10X8 & ASTC_2D_10X8_SRGB
( needed by Mario+Rabbids Kingdom Battle )

* Small placement correction
2018-11-12 18:34:54 -08:00
FernandoS27 fe596b4c6e Fix ASTC formats 2018-11-01 13:08:19 -04:00
bunnei 6bf7e0d83d Merge pull request #1524 from FernandoS27/layers-fix
rasterizer: Fix Layered Textures Loading and Cubemaps
2018-10-25 00:29:18 -04:00
Lioncash a3c2defab1 decoders: Remove unused variable within SwizzledData() 2018-10-23 23:51:13 -04:00
FernandoS27 88ff802e1a Fixed Layered Textures Loading and Cubemaps 2018-10-23 14:27:36 -04:00
bunnei fa24c17b95 decoders: Introduce functions for un/swizzling subrects. 2018-10-18 22:41:43 -04:00
bunnei 7ce1c004c4 Merge pull request #1488 from Hexagon12/astc-types
video_core: Added ASTC 5x4; 8x5 types
2018-10-14 14:44:24 -04:00
FernandoS27 39ba6950b8 Shorten the implementation of 3D swizzle to only 3 functions 2018-10-13 20:58:00 -04:00
FernandoS27 92e9faba25 Fix a Crash on Zelda BotW and Splatoon 2, and simplified LoadGLBuffer 2018-10-13 16:11:11 -04:00
FernandoS27 1a70753709 Propagate depth and depth_block on modules using decoders 2018-10-13 15:25:18 -04:00
FernandoS27 8b1e913058 Remove old Swizzle algorithms and use 3d Swizzle 2018-10-13 15:25:17 -04:00
FernandoS27 2650e33c48 Implement Precise 3D Swizzle 2018-10-13 15:25:16 -04:00
FernandoS27 8b32bd526b Implement Fast 3D Swizzle 2018-10-13 15:25:15 -04:00
Hexagon12 f50514b25b Added ASTC 5x4; 8x5 2018-10-13 17:10:26 +03:00
FernandoS27 eec2311ec1 Implemented helper function to correctly calculate a texture's size 2018-10-12 14:21:53 -04:00
FernandoS27 9acd217fed Reverse stride align restriction on FastSwizzle due to lost performance 2018-09-21 12:09:59 -04:00
FernandoS27 ec51077131 Join both Swizzle methods within one interface function 2018-09-21 11:42:34 -04:00
FernandoS27 dc98e08d51 Standarized Legacy Swizzle to look alike FastSwizzle and use a Swizzling Table instead 2018-09-21 11:34:54 -04:00
FernandoS27 55543db875 Remove same output bpp restriction on FastSwizzle 2018-09-21 11:10:44 -04:00
FernandoS27 0bd735d574 Improved Legacy Swizzler to be better documented and work better 2018-09-21 10:57:12 -04:00