deadcade/Ryujinx

Author	SHA1	Message	Date
merry	8d41402fa6	A32: Implement VCVTT, VCVTB (#3710 ) * A32: Implement VCVTT, VCVTB * A32: F16C implementation of VCVTT/VCVTB	2022-10-19 02:36:04 +02:00
LDj3SNuD	5af8ce7c38	A64: Add fast path for Fcvtas_Gp/S/V, Fcvtau_Gp/S/V and Frinta_S/V in… (#3712 ) * A64: Add fast path for Fcvtas_Gp/S/V, Fcvtau_Gp/S/V and Frinta_S/V instructions; they use "Round to Nearest with Ties to Away" rounding mode not supported in x86. All instructions involved have been tested locally in both release and debug modes, in both lowcq and highcq. The titles Mario Strikers and Super Smash Bros. U. use these instructions intensively. * Update Ptc.cs * A32: Add fast path for Vcvta_RM, Vrinta_RM and Vrinta_V instructions as well.	2022-10-19 00:21:33 +00:00
Luna	77c4291c34	Avalonia: Update Polish Translation (#3722 ) * Add new string You need the period there otherwise it could be read as "Głoś" -> "Preach" * Update MainWindow.axaml Updating to bring it in line with the other languages naming themselves in their respective languages * Update pl_PL.json realizing that period isn't necessary considering the string's usage (which to be fair, I should have checked when I added it) * Update pl_PL.json * Add Updater Message	2022-10-19 00:10:28 +00:00
riperiperi	6e92b7a378	Dispose Vulkan TextureStorage when views hit 0 instead of immediately (#3738 ) Due to the `using` statement being scoped to the `CreateTextureView` method, `TextureStorage` would be disposed as soon as the view was returned. This was largely fine as the TextureStorage resources were being kept alive by the views holding their own references to them, but it also meant that dispose is only called as soon as the texture is created. Aliased Storages are TextureStorages created with the same allocation as another TextureStorage, if they have to be aliased as another format. We keep track of a TextureStorage's `_aliasedStorages` as they are created, and dispose them when the TextureStorage is disposed... ...except it is disposed immediately, before any aliased storages are even created. The aliased storages added after this will never be disposed. This PR attempts to fix this by disposing TextureStorage when its view count reaches 0. The other use of texture storage - the D32S8 blit - still manually disposes the storage, but regular uses created via the GAL are now disposed by the view count. I think this makes the most sense, as otherwise in the future this behaviour might be forgotten and more things could be added to the Dispose() method that don't work due to it not actually being called at the right time. This should improve memory leaks in Super Mario Odyssey, most noticeable when resolution scaling. The memory usage of the game is still wildly unpredictable due to how it interacts with the texture cache, but now it shouldn't get considerably longer as you play... I hope. I've seen it typically recover back to the same level occasionally, though it can spike significantly. Please test a bunch of games on multiple GPUs to make sure this doesn't break anything.	2022-10-18 23:52:08 +00:00
Yohoki	9b852c7481	Fix: Arguments Break when Updating (#3744 ) * Wrap Args in quotes -Wrap args in quotes to allow for spaces in dir paths when restarting Ryujinx from Update. * Wrap second instance of GetCommandLineArgs() * Changed ryuArgs from string to string[] * Update Ryujinx.Ava/Modules/Updater/Updater.cs Co-authored-by: mageven <62494521+mageven@users.noreply.github.com> * Update UpdateDialog.cs Co-authored-by: mageven <62494521+mageven@users.noreply.github.com>	2022-10-18 23:41:16 +00:00
Berkan Diler	c40c3905e2	Avoid allocations in .Parse methods (#3760 ) * Avoid allocations in .Parse methods Use the Span overloads of the Parse methods when possible to avoid string allocations and remove one unnecessary array allocation * Avoid another string allocation	2022-10-18 23:31:34 +00:00
gdkchan	a6cd044f0f	Vulkan: Fix blit levels/layers parameters being inverted (#3768 )	2022-10-18 10:13:44 +02:00
gdkchan	f5a1de6ac5	Fix kernel VA allocation when random allocation fails (#3755 ) * Fix kernel VA allocation when random allocation fails * This was off by one	2022-10-17 22:12:49 +00:00
MetrosexualGarbodor	2aeb5b00e3	Update README.md (#3767 ) Update compatibility numbers	2022-10-17 23:58:11 +02:00
Emmanuel Hansen	60ba7b71f2	remove property changed call in time zone validation (#3752 )	2022-10-17 16:48:14 +00:00
gdkchan	7c1d2bbb98	Implement OpenDataStorageWithProgramIndex partially (#3765 ) * Implement OpenDataStorageWithProgramIndex partially * Was not supposed to change this	2022-10-17 13:37:05 +00:00
mageven	beacf8c1c8	TamperMachine: Fix input mask check (#3764 )	2022-10-16 19:51:52 -03:00
riperiperi	0dbe45ae37	Fix various issues caused by Vertex/Index buffer conversions (#3762 ) * Fix various issues caused by #3679 - The arguments for the 0th dummy vertex buffer were incorrect - it was given an offset of 16 rather than a size of 16. - The wrong size was used when doing `autoBuffer.Get` on a converted vertex buffer. - The possibility of a vertex buffer being disposed and then rebound can cause rebindings to find a different buffer where the current range is out of bounds. Avoid binding when out of range to prevent validation errors. - The above also affects generation of converted buffers, which was a bit more fatal. Conversion functions now attempt to bound input offset/size. * Fix offset for converted buffer	2022-10-16 19:38:58 -03:00
riperiperi	2b50e52e48	Fix primitive count calculation for topology conversion (#3763 ) Luigi's Mansion 3 performs a non-index quads draw with 6 vertices. It's meant to ignore the last two, but the index pattern's primitive count calculation was rounding up. No idea why the game does this but this should fix random triangles in the map.	2022-10-16 19:25:40 -03:00
mageven	49eadbc209	Fix phantom configured Controllers (#3720 ) Enable guest controller only when a valid host controller is mapped.	2022-10-16 20:34:42 +02:00
gdkchan	2df16ded9b	Improve shader BRX instruction code generation (#3759 ) * Improve shader BRX instruction code generation * Shader cache version bump, add some comments and asserts	2022-10-15 23:20:16 +00:00
TSRBerry	e43390c723	bsd: Check if socket is bound before calling RecvFrom() (#3761 )	2022-10-15 20:52:49 +00:00
gdkchan	5af1327068	Vulkan: Fix sampler custom border color (#3751 )	2022-10-10 08:35:44 +02:00
gdkchan	88a8d1e567	Fix disposed textures being updated on TextureBindingsManager (#3750 ) * Fix disposed textures being updated on TextureBindingsManager * PR feedback	2022-10-09 15:23:52 -03:00
riperiperi	bf77d1cab9	GPU: Pass SpanOrArray for Texture SetData to avoid copy (#3745 ) * GPU: Pass SpanOrArray for Texture SetData to avoid copy Texture data is often converted before upload, meaning that an array was allocated to perform the conversion into. However, the backend SetData methods were being passed a Span of that data, and the Multithreaded layer does `ToArray()` on it so that it can be stored for later! This method can't extract the original array, so it creates a copy. This PR changes the type passed for textures to a new ref struct called SpanOrArray, which is backed by either a ReadOnlySpan or an array. The benefit here is that we can have a ToArray method that doesn't copy if it is originally backed by an array. This will also avoid a copy when running the ASTC decoder. On NieR this was taking 38% of texture upload time, which it does a _lot_ of when you move between areas, so there should be a 1.6x performance boost when strictly uploading textures. No doubt this will also improve texture streaming performance in UE4 games, and maybe a small reduction with video playback. From the numbers, it's probably possible to improve the upload rate by a further 1.6x by performing layout conversion on GPU. I'm not sure if we could improve it further than that - multithreading conversion on CPU would probably result in memory bottleneck. This doesn't extend to buffers, since we don't convert their data on the GPU emulator side. * Remove implicit cast to array.	2022-10-08 12:04:47 -03:00
riperiperi	1ca0517c99	Vulkan: Fix some issues with CacheByRange (#3743 ) * Fix some issues with CacheByRange - Cache now clears under more circumstances, the most important being the fast path write. - Cache supports partial clear which should help when more buffers join. - Fixed an issue with I8->I16 conversion where it wouldn't register the buffer for use on dispose. Should hopefully fix issues with https://github.com/Ryujinx/Ryujinx-Games-List/issues/4010 and maybe others. * Fix collection modified exception * Fix accidental use of parameterless constructor * Replay DynamicState when restoring from helper shader	2022-10-08 11:28:27 -03:00
gdkchan	599d485bff	Change NvMap ID allocation to match nvservices (#3741 ) * Change NvMap ID allocation to match nvservices * Move NvMapIdDictionary to Types	2022-10-05 17:49:18 -03:00
gdkchan	60e16c15b6	Fix memory corruption in BCAT and FS Read methods when buffer is larger than needed (#3739 ) * Fix memory corruption in FS Read methods when buffer is larger than needed * PR feedback * nit: Don't move this around	2022-10-04 20:12:54 -03:00
gdkchan	2068445939	Fix shader SULD (bindless) instruction using wrong register as handle (#3732 ) * GLSL: Do not generate scale helpers if we have no textures * Fix shader SULD (bindless) instruction using wrong register as handle	2022-10-03 20:40:22 -03:00
gdkchan	a4fc9f8050	Support use of buffer ranges with size 0 (#3736 )	2022-10-03 20:08:38 -03:00
gdkchan	5437d6cb13	Vulkan: Fix buffer texture storage not being updated on buffer handle reuse (#3731 )	2022-10-03 19:45:33 -03:00
Emmanuel Hansen	7539e26144	Avalonia - Fixes updater (#3670 ) * update avalonia * fix updater * fix spacing * addressed review * convert permission value to octal * Add missing comma * revert package updates	2022-10-03 11:25:25 -03:00
Luna	1c3697b6a4	Update AboutWindow.axaml (#3724 )	2022-10-02 22:02:11 +00:00
gdkchan	81f848e54f	Allow Surface Flinger frame enqueue after process has exited (#3733 )	2022-10-02 21:50:03 +00:00
MutantAura	358a781639	Volume Hotkeys (#3500 ) * Initial GTK implementation * Less messy and Avalonia imp * Move clamping to HLE and streamline imps * Make viewmodel update consistent * Fix rebase and add an english locale. Co-authored-by: Mary-nyan <mary@mary.zone>	2022-10-02 09:38:37 +00:00
Wunk	45ce540b9b	ARMeilleure: Add `gfni` acceleration (#3669 ) * ARMeilleure: Add `GFNI` detection This is intended for utilizing the `gf2p8affineqb` instruction * ARMeilleure: Add `gf2p8affineqb` Not using the VEX or EVEX-form of this instruction is intentional. There are `GFNI`-chips that do not support AVX(so no VEX encoding) such as Tremont(Lakefield) chips as well as Jasper Lake. `13df339fe7/GenuineIntel/GenuineIntel00806A1_Lakefield_LC_InstLatX64.txt (L1297-L1299)` `13df339fe7/GenuineIntel/GenuineIntel00906C0_JasperLake_InstLatX64.txt (L1252-L1254)` * ARMeilleure: Add `gfni` acceleration of `Rbit_V` Passes all `Rbit_V` unit tests on my `i9-11900k` ARMeilleure: Add `gfni` acceleration of `S{l,r}i_V` Also added a fast-path for when the shift amount is greater than the size of the element. * ARMeilleure: Add `gfni` acceleration of `Shl_V` and `Sshr_V` * ARMeilleure: Increment InternalVersion * ARMeilleure: Fix Intrinsic and Assembler Table alignment `gf2p8affineqb` is the longest instruction name I know of. It shouldn't get any wider than this. * ARMeilleure: Remove SSE2+SHA requirement for GFNI * ARMeilleure Add `X86GetGf2p8LogicalShiftLeft` Used to generate GF(2^8) 8x8 bit-matrices for bit-shifting for the `gf2p8affineqb` instruction. * ARMeilleure: Append `FeatureInfo7Ecx` to `FeatureInfo`	2022-10-02 11:17:19 +02:00
mageven	96bf7f8522	Avoid allocating unmanaged string per shader (#3730 ) * Avoid reallocating same unmanaged string per shader * Address PR feedback * Rename to _disposed	2022-10-02 10:59:34 +02:00
Ac_K	33e673ceb8	fatal: Implement Service (#3573 ) * fatal: Implement Service This PR adds a basic implementation of the fatal service; guest processes call it when something goes wrong. Since we can already get all the information by debugging, it's not really useful, but it avoids an unimplemented service exception. Structs/Enum are based on Atmosphère source code. After logging the error report, I call SvcBreak. Feedback is welcome on this, since some guests call it right after the fatal service, so I can remove it if needed. * Addresses gdkchan feedback	2022-10-02 10:30:46 +02:00
gdkchan	9c2500de5f	Fix incorrect tessellation inputs/outputs (#3728 ) * Fix incorrect tessellation inputs/outputs * Shader cache version bump	2022-10-01 02:35:52 -03:00
gdkchan	dbe43c1719	Fix SSL GetCertificates with certificate ID set to All (#3727 ) * Fix SSL GetCertificates with certificate ID set to All * Fix last entry status value	2022-09-29 12:45:25 -03:00
riperiperi	f502cfaf62	Vulkan: Zero blend state when disabled or write mask is 0 (#3719 ) * Zero blend state when disabled or write mask is 0 Any difference in the blend state when blend is disabled is meaningless, but Ryujinx would compare different disabled blends and compile them as separate pipelines. This change ensures that all pipelines where blend state is meaningless record it as such, which avoids compiling a bunch of pipelines that are essentially identical. The NVIDIA driver is pretty forgiving when it comes to silly pipeline misses like this, but other drivers don't offer the same level of kindness. This should reduce stuttering on those drivers, and might improve overall performance very slightly due to less pipeline variants being in the hash table. * Fix blend possibly being wrong when an attachment is unmasked	2022-09-29 12:32:49 -03:00
gdkchan	1fd5cf2b4a	Fix ListOpenContextStoredUsers and stub LoadOpenContext (#3718 ) * Fix ListOpenContextStoredUsers and stub LoadOpenContext * Remove nonsensical comment	2022-09-27 21:24:52 -03:00
LDj3SNuD	814f75142e	Fpsr and Fpcr freed. (#3701 ) * Implemented in IR the managed methods of the Saturating region ... ... of the SoftFallback class (the SatQ ones). The need to natively manage the Fpcr and Fpsr system registers is still a fact. Contributes to https://github.com/Ryujinx/Ryujinx/issues/2917 ; I will open another PR to implement in Intrinsics-branchless the methods of the Saturation region as well (the SatXXXToXXX ones). All instructions involved have been tested locally in both release and debug modes, in both lowcq and highcq. * Ptc.InternalVersion = 3665 * Addressed PR feedback. * Implemented in IR the managed methods of the ShlReg region of the SoftFallback class. It also includes the last two SatQ ones (following up on https://github.com/Ryujinx/Ryujinx/pull/3665). All instructions involved have been tested locally in both release and debug modes, in both lowcq and highcq. * Fpsr and Fpcr freed. Handling/isolation of Fpsr and Fpcr via register for IR and via memory for Tests and Threads, with synchronization to context exchanges (explicit for SoftFloat); without having to call managed methods. Thanks to the inlining work of the previous two PRs and others in this. Tests performed locally in both release and debug modes, in both lowcq and highcq, with FastFP to true and false (explicit FP tests included). Tested with the title Tony Hawk's PS. Depends on shlreg. * Update InstEmitSimdHelper.cs * De-magic Masks. Remove the Stride and Len flags; Fpsr.NZCV are A32 only, then moved to Fpscr: this leads to emitting less IR in reference to Get/Set Fpsr/Fpcr/Fpscr methods in reference to Mrs/Msr (A64) and Vmrs/Vmsr (A32) instructions. * Addressed PR feedback.	2022-09-20 18:55:13 -03:00
riperiperi	4c0eb91d7e	Convert Quads to Triangles in Vulkan (#3715 ) * Add Index Buffer conversion for quads to Vulkan Also adds a reusable repeating pattern index buffer to use for non-indexed draws, and generalizes the conversion cache for buffers. * Fix some issues * End render pass before conversion * Resume transform feedback after we ensure we're in a pass. * Always generate UInt32 type indices for topology conversion * No it's not. * Remove unused code * Rely on TopologyRemap to convert quads to tris. * Remove double newline * Ensure render pass ends before stride or I8 conversion	2022-09-20 18:38:48 -03:00
gdkchan	da75a9a6ea	OpenGL: Fix blit from non-multisample to multisample texture (#3596 ) * OpenGL: Fix blit from non-multisample to multisample texture * New approach for multisample copy using compute shaders	2022-09-19 16:12:56 -03:00
MutantAura	41790aa743	Avalonia - Misc changes to UX (#3643 ) * Change navbar from compact to default and force text overflow globally * Fix settings window * Fix right stick control alignment * Initialize value and add logging for SDL IDs * Fix alignment of setting text and improve borders * Clean up padding and size of buttons on controller settings * Fix right side trigger alignment and correct styling * Revert axaml alignment * Fix alignment of volume widget * Fix timezone autocompletebox dropdown height * MainWindow: Line up volume status bar item * Remove margins and add padding to volume widget * Make volume text localizable. Co-authored-by: merry <git@mary.rs>	2022-09-19 16:04:22 -03:00
gdkchan	0cb1e926b5	Allow bindless textures with handles from unbound constant buffer (#3706 )	2022-09-19 15:35:47 -03:00
Emmanuel Hansen	6f0395538b	Avalonia - Use embedded window for avalonia (#3674 ) * wip * use embedded window * fix race condition on opengl Windows * fix glx issues on prime nvidia * fix mouse support win32 * clean up * addressed review * addressed review * fix warnings * fix software keyboard dialog * Update Ryujinx.Ava/Ui/Applet/SwkbdAppletDialog.axaml.cs Co-authored-by: gdkchan <gab.dark.100@gmail.com> * remove double semi Co-authored-by: gdkchan <gab.dark.100@gmail.com>	2022-09-19 15:05:26 -03:00
LDj3SNuD	b9f1ff3c77	Implemented in IR the managed methods of the ShlReg region of the SoftFallback class. (#3700 ) * Implemented in IR the managed methods of the Saturating region ... ... of the SoftFallback class (the SatQ ones). The need to natively manage the Fpcr and Fpsr system registers is still a fact. Contributes to https://github.com/Ryujinx/Ryujinx/issues/2917 ; I will open another PR to implement in Intrinsics-branchless the methods of the Saturation region as well (the SatXXXToXXX ones). All instructions involved have been tested locally in both release and debug modes, in both lowcq and highcq. * Ptc.InternalVersion = 3665 * Addressed PR feedback. * Implemented in IR the managed methods of the ShlReg region of the SoftFallback class. It also includes the last two SatQ ones (following up on https://github.com/Ryujinx/Ryujinx/pull/3665). All instructions involved have been tested locally in both release and debug modes, in both lowcq and highcq. * Update InstEmitSimdHelper.cs	2022-09-19 14:49:10 -03:00
TSRBerry	a77af4c5e9	Readme: Fix broken shell image (#3708 )	2022-09-19 14:06:00 +02:00
merry	fbcf802fbc	A32/T32/A64: Implement Hint instructions (CSDB, SEV, SEVL, WFE, WFI, YIELD) (#3694 ) * OpCodeTable: Implement Hint instructions (CSDB, SEV, SEVL, WFE, WFI, YIELD) * A64: Remove catch-all Hint instruction * T16: Handle unallocated hint instructions Some thumb tests execute these assuming that they're nops. * T32: Fill out other Hint instructions * A32: Fill out other hint instructions	2022-09-14 18:18:15 -03:00
riperiperi	c3c41fa4bb	Periodically Flush Commands for Vulkan (#3689 ) * Periodically Flush Commands for Vulkan NVIDIA's OpenGL driver has a built-in mechanism to automatically flush commands to GPU when a lot have been queued. It's also pretty inconsistent, but we'll ignore that for now. Our Vulkan implementation only submits a command buffer (flush equivalent) when it needs to. This is typically when another command buffer needs to be sequenced after it, presenting a frame, or an edge case where we flush around GPU queries to get results sooner. This difference in flush behaviour causes a notable difference between Vulkan and OpenGL when we have to wait for commands. In the worst case, we will wait for a sync point that has just been created. In Vulkan, this sync point is created by flushing the command buffer, and storing a waitable fence that signals its completion. Our command buffer contains _every command that we queued since the last submit_, which could be an entire frame's worth of draws. This has a huge effect on CPU <-> GPU latency. The more commands in a command buffer, the longer we have to wait for it to complete, which results in wasted time. Because we don't know when the guest will force us to wait, we always want the smallest possible latency. By periodically flushing, we ensure that each command buffer takes a more consistent, smaller amount of time to execute, and that the back of the GPU queue isn't as far away when we need to wait for something to happen. This also might reduce time that the GPU is left inactive while commands are being built. The main affected game is Pokemon Sword, which got significantly faster in overworld areas due to reduced waiting time when it flushes a shadow map from the main GPU thread. Another affected game is BOTW, which gets faster depending on the area. This game flushes textures/buffers from its game thread, which is the bottleneck. 
Flush latency and throughput may be improved on other games that are inexplicably slower than OpenGL. It's possible that certain games could have their performance _decreased_ slightly due to flushes not being free, but it is unlikely. Also, flushing to get query results sooner has been tweaked to improve the number of full draw skips that can be done. (tested in SMO) * Remove unused variable * Fix possible issue with early query flush	2022-09-14 13:48:31 -03:00
gdkchan	356e480bf5	Fix partial unmap reprotection on Windows (#3702 )	2022-09-14 17:46:37 +02:00
gdkchan	8e119a1e96	Implement PLD and SUB (imm16) on T32, plus UADD8, SADD8, USUB8 and SSUB8 on both A32 and T32 (#3693 )	2022-09-13 19:51:40 -03:00
merry	e05bf90af6	T32: Implement Asimd instructions (#3692 )	2022-09-13 18:25:37 -03:00


2696 commits