First special instruction
Start Load/Store implementation
Start TextureSample
Sample progress
I/O Load/Store Progress
Rest of load/store
TODO: Currently, the generator still assumes the GLSL style of I/O attributres. On MSL, the vertex function should output a struct which contains a float4 with the required position attribute.
TextureSize and VectorExtract
Fix UserDefined IO Vars
Fix stage input struct names
Mostly copied from GLSL since in terms of syntax within blocks they’re pretty similar. Likely the result will need tweaking…
Isn’t that conveniant?
“Do the simd_shuffle”
atomics
Remaining instructions
Remove removed special instructions
Getting somewhere…
* Ensure descriptor sets are only re-used when all command buffers using it have completed
* Fix some SPIR-V capabilities
* Set update after bind flag if we exceed limits
* Simpler fix for Intel
* Format whitespace
* Make struct readonly
* Add barriers for extra set arrays too
* Key textures using set and binding (rather than just binding)
* Extend full bindless to cover cases with phi nodes
* Log error on bindless access failure
* Shader cache version bump
* Remove constant buffer match to reduce the chances of full bindless triggering
* Re-enable it for constant buffers, paper mario does actually need it
* Format whitespace
* Report base and extra sets from the backend
* Pass texture set index everywhere
* Key textures using set and binding (rather than just binding)
* Start using extra sets for array textures
* Shader cache version bump
* Separate new commands, some PR feedback
* Introduce new manual descriptor set reservation method that prevents it from being used by something else while owned by an array
* Move bind extra sets logic to new method
* Should only use separate array is MaximumExtraSets is not zero
* Format whitespace
* Add support for bindless textures from shader input (vertex buffer)
* Shader cache version bump
* Format whitespace
* Remove cache entries on pool removal, disable for OpenGL
* PR feedback
* Add support for large sampler arrays on Vulkan
* Shader cache version bump
* Format whitespace
* Move DescriptorSetManager to PipelineLayoutCacheEntry to allow different pool sizes per layout
* Handle array textures with different types on the same buffer
* Somewhat better caching system
* Avoid useless buffer data modification checks
* Move redundant bindings update checking to the backend
* Fix an issue where texture arrays would get the same bindings across stages on Vulkan
* Backport some fixes from part 2
* Fix typo
* PR feedback
* Format whitespace
* Add some missing XML docs
Previously we were only doing it for Vulkan, but it turns out that
not setting it when PROGRAM_POINT_SIZE is set is considered UB
on OpenGL Core.
Signed-off-by: Mary <mary@mary.zone>
* Change TargetFramework to net8.0
* Disable info messages
* Fix warings
* Disable additional analyzer messages
* Fix typo
* Add whitespace
* Fix ref vs in warnings
* Use explicit [In] on array parameters
* No need to guard Remove with Contains
* Use 'ArgumentOutOfRangeException.ThrowIf...' instead of explicitly throwing a new exception instance
* Bump .NET SDK version
* Enable JsonSerializerIsReflectionEnabledByDefault
* Use 8.0.100 GA release
* Bump System package versions
---------
Co-authored-by: Zoltan Csizmadia <Zoltan.Csizmadia@vericast.com>
* GPU: Add fallback when textureGatherOffsets is not supported.
This PR adds a fallback for GPUs or APIs that don't support an equivalent to the method `textureGatherOffsets`, where each of the 4 gathered texels has an individual offset. This is done by reusing the existing code to handle non-const offsets for texture instructions, though it has also been corrected as there were a few implementation issues.
MoltenVK reports support for this capability, and it didn't error when we initially released the MacOS build, but that has since changed. MVK still reports support, but spirv-cross has been fixed in a way that it _attempts_ to use this capability, but the metal compiler errors since it doesn't exist.
Some other fixes:
- textureGatherOffsets emulation has been changed significantly. It now uses 4 texture sample instructions (not gather), calculates a base texel (i=0 j=0) and adds the offsets onto it before converting into a tex coord. The final result is offset into a texel center, so it shouldn't be subject to interpolation, though this isn't perfect and could have some error with floating point formats with linear sampling. It is subject to texture wrap mode as it should be, which is why texelFetch was not used.
- Maybe gather should be used here with component `w` (i=0, j=0), though this multiplies number of texels fetched by 4... The way it was doing this before _was_ wrong_, but doing it right would avoid issues with texel center precision.
- textureGatherOffset (singular) now performs textureGather with the offset applied to the coords, rather than the slower fallback where each texel is fetched individually.
* Increment shader cache version, remove unused arg
* Use base texture size for gather coord offset.
Implicit LOD for gather is not supported.
* Use 4 texture gathers for offsets emulation
Avoids issues with interpolation at cost of performance
(not sure how bad this is)
* Address Feedback
* Strings should not be concatenated using '+' in a loop
* fix IDE0090
* undo GenerateLoadOrStore
* prefer string interpolation
* Update src/Ryujinx.Graphics.Shader/CodeGen/Glsl/Instructions/InstGen.cs
Co-authored-by: Mary <thog@protonmail.com>
---------
Co-authored-by: Mary <thog@protonmail.com>
* Implement vertex and geometry shader conversion to compute
* Call InitializeReservedCounts for compute too
* PR feedback
* Set clip distance mask for geometry and tessellation shaders too
* Transform feedback emulation only for vertex
#5576 changed where the position was declared, but forgot to add the Invariant declaration to position when the ReducedPrecision flag was enabled. This was causing weird graphical bugs in a bunch of games, mostly to do with mismatching depth between multiple draws of the same geometry.
Maybe the attempt to add it to Position in DeclareInputOrOutput can be removed now, assuming that path is never used.