summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2023-08-28Merge tag 'v5.15.89' into 5.15-2.2.x-imxEmanuele Ghidoli
This is the 5.15.89 stable release Signed-off-by: Emanuele Ghidoli <emanuele.ghidoli@toradex.com>
2023-08-28Merge tag 'v5.15.88' into 5.15-2.2.x-imxEmanuele Ghidoli
This is the 5.15.88 stable release Conflicts: drivers/tty/serial/fsl_lpuart.c Signed-off-by: Emanuele Ghidoli <emanuele.ghidoli@toradex.com>
2023-02-13Merge remote-tracking branch 'nxp-imx/lf-5.15.y' into 5.15-2.2.x-imxOtavio Salvador
Signed-off-by: Otavio Salvador <otavio@ossystems.com.br>
2023-02-08Merge pull request #624 from MrCry0/5.15-2.2.x-imx-fix-lpuartOtavio Salvador
5.15 2.2.x-imx: refix lpuart
2023-02-08serial: fsl_lpuart: fix driver stuckOleksandr Suvorov
The calls reorder done in commit [1] works well in the upstream kernel. The NXP version of the kernel needs a port to be attached to the driver earlier, otherwise, the driver is stuck. Move back calls order as a workaround. It may break working rs485. The only board that uses this driver for the RS-485 port is "vf610-zii-scu4-aib" which is quite legacy. [1] commit b079d3775237 ("serial: Deassert Transmit Enable on probe in driver-specific way") Fixes: b9ae52e89c61 ("serial: fsl_lpuart: fix merging issue") Signed-off-by: Oleksandr Suvorov <oleksandr.suvorov@foundries.io>
2023-02-07LF-8313-2: Update License to GPL-2.0Kshitiz Varshney
The license was wrong by mistake, this commit updates the license to GPL-2.0 and removes the author name from file for maintaining confidentiality as this code is NXP owned. Signed-off-by: Kshitiz Varshney <kshitiz.varshney@nxp.com>
2023-02-07LF-8313-1: Update License to GPL-2.0Kshitiz Varshney
The license was wrong by mistake, this commit updates the license to GPL-2.0 and removes the author name from file for maintaining confidentiality as this code is NXP owned. Signed-off-by: Kshitiz Varshney <kshitiz.varshney@nxp.com>
2023-02-03Merge pull request #621 from MrCry0/5.15-2.2.x-imx-lpuartOtavio Salvador
serial: fsl_lpuart: fix merging issue
2023-02-03serial: fsl_lpuart: fix merging issueOleksandr Suvorov
The commit [1] moves adding uart port after getting all options for rs485. Merge commit mistakenly leaves both uart_add_one_port() calls in place. Fix it. [1] commit b079d3775237 ("serial: Deassert Transmit Enable on probe in driver-specific way") Fixes: 3ff5eb3ff57e ("Merge pull request #620 from angolini/5.15-2.2.x-imx-bump") Signed-off-by: Oleksandr Suvorov <oleksandr.suvorov@foundries.io>
2023-01-18Linux 5.15.89Greg Kroah-Hartman
Link: https://lore.kernel.org/r/20230116154747.036911298@linuxfoundation.org Tested-by: Shuah Khan <skhan@linuxfoundation.org> Tested-by: Bagas Sanjaya <bagasdotme@gmail.com> Tested-by: Jon Hunter <jonathanh@nvidia.com> Tested-by: Linux Kernel Functional Testing <lkft@linaro.org> Tested-by: Sudip Mukherjee <sudip.mukherjee@codethink.co.uk> Tested-by: Allen Pais <apais@linux.microsoft.com> Tested-by: Guenter Roeck <linux@roeck-us.net> Tested-by: Kelsey Steele <kelseysteele@linux.microsoft.com> Tested-by: Ron Economos <re@w6rz.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-01-18pinctrl: amd: Add dynamic debugging for active GPIOsMario Limonciello
commit 1d66e379731f79ae5039a869c0fde22a4f6a6a91 upstream. Some laptops have been reported to wake up from s2idle when plugging in the AC adapter or by closing the lid. This is a surprising behavior that is further clarified by commit cb3e7d624c3ff ("PM: wakeup: Add extra debugging statement for multiple active IRQs"). With that commit in place the following interaction can be seen when the lid is closed: [ 28.946038] PM: suspend-to-idle [ 28.946083] ACPI: EC: ACPI EC GPE status set [ 28.946101] ACPI: PM: Rearming ACPI SCI for wakeup [ 28.950152] Timekeeping suspended for 3.320 seconds [ 28.950152] PM: Triggering wakeup from IRQ 9 [ 28.950152] ACPI: EC: ACPI EC GPE status set [ 28.950152] ACPI: EC: ACPI EC GPE dispatched [ 28.995057] ACPI: EC: ACPI EC work flushed [ 28.995075] ACPI: PM: Rearming ACPI SCI for wakeup [ 28.995131] PM: Triggering wakeup from IRQ 9 [ 28.995271] ACPI: EC: ACPI EC GPE status set [ 28.995291] ACPI: EC: ACPI EC GPE dispatched [ 29.098556] ACPI: EC: ACPI EC work flushed [ 29.207020] ACPI: EC: ACPI EC work flushed [ 29.207037] ACPI: PM: Rearming ACPI SCI for wakeup [ 29.211095] Timekeeping suspended for 0.739 seconds [ 29.211095] PM: Triggering wakeup from IRQ 9 [ 29.211079] PM: Triggering wakeup from IRQ 7 [ 29.211095] ACPI: PM: ACPI non-EC GPE wakeup [ 29.211095] PM: resume from suspend-to-idle * IRQ9 on this laptop is used for the ACPI SCI. * IRQ7 on this laptop is used for the GPIO controller. What has occurred is when the lid was closed the EC woke up the SoC from it's deepest sleep state and the kernel's s2idle loop processed all EC events. When it was finished processing EC events, it checked for any other reasons to wake (break the s2idle loop). The IRQ for the GPIO controller was active so the loop broke, and then this IRQ was processed. This is not a kernel bug but it is certainly a surprising behavior, and to better debug it we should have a dynamic debugging message that we can enact to catch it. Acked-by: Basavaraj Natikar <Basavaraj.Natikar@amd.com> Acked-by: Kai-Heng Feng <kai.heng.feng@canonical.com> Acked-by: Mark Pearson <markpearson@lenovo.com> Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Link: https://lore.kernel.org/r/20221013134729.5592-2-mario.limonciello@amd.com Signed-off-by: Linus Walleij <linus.walleij@linaro.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-01-18Revert "usb: ulpi: defer ulpi_register on ulpi_read_id timeout"Ferry Toth
commit b659b613cea2ae39746ca8bd2b69d1985dd9d770 upstream. This reverts commit 8a7b31d545d3a15f0e6f5984ae16f0ca4fd76aac. This patch results in some qemu test failures, specifically xilinx-zynq-a9 machine and zynq-zc702 as well as zynq-zed devicetree files, when trying to boot from USB drive. Link: https://lore.kernel.org/lkml/20221220194334.GA942039@roeck-us.net/ Fixes: 8a7b31d545d3 ("usb: ulpi: defer ulpi_register on ulpi_read_id timeout") Cc: stable@vger.kernel.org Reported-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: Ferry Toth <ftoth@exalondelft.nl> Link: https://lore.kernel.org/r/20221222205302.45761-1-ftoth@exalondelft.nl Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-01-18block: handle bio_split_to_limits() NULL returnJens Axboe
commit 613b14884b8595e20b9fac4126bf627313827fbe upstream. This can't happen right now, but in preparation for allowing bio_split_to_limits() returning NULL if it ended the bio, check for it in all the callers. Signed-off-by: Jens Axboe <axboe@kernel.dk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-01-18io_uring/io-wq: only free worker if it was allocated for creationJens Axboe
commit e6db6f9398dadcbc06318a133d4c44a2d3844e61 upstream. We have two types of task_work based creation, one is using an existing worker to setup a new one (eg when going to sleep and we have no free workers), and the other is allocating a new worker. Only the latter should be freed when we cancel task_work creation for a new worker. Fixes: af82425c6a2d ("io_uring/io-wq: free worker if task_work creation is canceled") Reported-by: syzbot+d56ec896af3637bdb7e4@syzkaller.appspotmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-01-18io_uring/io-wq: free worker if task_work creation is canceledJens Axboe
commit af82425c6a2d2f347c79b63ce74fca6dc6be157f upstream. If we cancel the task_work, the worker will never come into existance. As this is the last reference to it, ensure that we get it freed appropriately. Cc: stable@vger.kernel.org Reported-by: 진호 <wnwlsgh98@gmail.com> Signed-off-by: Jens Axboe <axboe@kernel.dk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-01-18scsi: mpt3sas: Remove scsi_dma_map() error messagesSreekanth Reddy
commit 0c25422d34b4726b2707d5f38560943155a91b80 upstream. When scsi_dma_map() fails by returning a sges_left value less than zero, the amount of logging produced can be extremely high. In a recent end-user environment, 1200 messages per second were being sent to the log buffer. This eventually overwhelmed the system and it stalled. These error messages are not needed. Remove them. Link: https://lore.kernel.org/r/20220303140203.12642-1-sreekanth.reddy@broadcom.com Suggested-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Sreekanth Reddy <sreekanth.reddy@broadcom.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Cc: Paul Menzel <pmenzel@molgen.mpg.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-01-18efi: fix NULL-deref in init error pathJohan Hovold
[ Upstream commit 703c13fe3c9af557d312f5895ed6a5fda2711104 ] In cases where runtime services are not supported or have been disabled, the runtime services workqueue will never have been allocated. Do not try to destroy the workqueue unconditionally in the unlikely event that EFI initialisation fails to avoid dereferencing a NULL pointer. Fixes: 98086df8b70c ("efi: add missed destroy_workqueue when efisubsys_init fails") Cc: stable@vger.kernel.org Cc: Li Heng <liheng40@huawei.com> Signed-off-by: Johan Hovold <johan+linaro@kernel.org> Signed-off-by: Ard Biesheuvel <ardb@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-01-18arm64: cmpxchg_double*: hazard against entire exchange variableMark Rutland
[ Upstream commit 031af50045ea97ed4386eb3751ca2c134d0fc911 ] The inline assembly for arm64's cmpxchg_double*() implementations use a +Q constraint to hazard against other accesses to the memory location being exchanged. However, the pointer passed to the constraint is a pointer to unsigned long, and thus the hazard only applies to the first 8 bytes of the location. GCC can take advantage of this, assuming that other portions of the location are unchanged, leading to a number of potential problems. This is similar to what we fixed back in commit: fee960bed5e857eb ("arm64: xchg: hazard against entire exchange variable") ... but we forgot to adjust cmpxchg_double*() similarly at the same time. The same problem applies, as demonstrated with the following test: | struct big { | u64 lo, hi; | } __aligned(128); | | unsigned long foo(struct big *b) | { | u64 hi_old, hi_new; | | hi_old = b->hi; | cmpxchg_double_local(&b->lo, &b->hi, 0x12, 0x34, 0x56, 0x78); | hi_new = b->hi; | | return hi_old ^ hi_new; | } ... which GCC 12.1.0 compiles as: | 0000000000000000 <foo>: | 0: d503233f paciasp | 4: aa0003e4 mov x4, x0 | 8: 1400000e b 40 <foo+0x40> | c: d2800240 mov x0, #0x12 // #18 | 10: d2800681 mov x1, #0x34 // #52 | 14: aa0003e5 mov x5, x0 | 18: aa0103e6 mov x6, x1 | 1c: d2800ac2 mov x2, #0x56 // #86 | 20: d2800f03 mov x3, #0x78 // #120 | 24: 48207c82 casp x0, x1, x2, x3, [x4] | 28: ca050000 eor x0, x0, x5 | 2c: ca060021 eor x1, x1, x6 | 30: aa010000 orr x0, x0, x1 | 34: d2800000 mov x0, #0x0 // #0 <--- BANG | 38: d50323bf autiasp | 3c: d65f03c0 ret | 40: d2800240 mov x0, #0x12 // #18 | 44: d2800681 mov x1, #0x34 // #52 | 48: d2800ac2 mov x2, #0x56 // #86 | 4c: d2800f03 mov x3, #0x78 // #120 | 50: f9800091 prfm pstl1strm, [x4] | 54: c87f1885 ldxp x5, x6, [x4] | 58: ca0000a5 eor x5, x5, x0 | 5c: ca0100c6 eor x6, x6, x1 | 60: aa0600a6 orr x6, x5, x6 | 64: b5000066 cbnz x6, 70 <foo+0x70> | 68: c8250c82 stxp w5, x2, x3, [x4] | 6c: 35ffff45 cbnz w5, 54 <foo+0x54> | 70: d2800000 mov x0, #0x0 // #0 <--- BANG | 74: d50323bf autiasp | 78: d65f03c0 ret Notice that at the lines with "BANG" comments, GCC has assumed that the higher 8 bytes are unchanged by the cmpxchg_double() call, and that `hi_old ^ hi_new` can be reduced to a constant zero, for both LSE and LL/SC versions of cmpxchg_double(). This patch fixes the issue by passing a pointer to __uint128_t into the +Q constraint, ensuring that the compiler hazards against the entire 16 bytes being modified. With this change, GCC 12.1.0 compiles the above test as: | 0000000000000000 <foo>: | 0: f9400407 ldr x7, [x0, #8] | 4: d503233f paciasp | 8: aa0003e4 mov x4, x0 | c: 1400000f b 48 <foo+0x48> | 10: d2800240 mov x0, #0x12 // #18 | 14: d2800681 mov x1, #0x34 // #52 | 18: aa0003e5 mov x5, x0 | 1c: aa0103e6 mov x6, x1 | 20: d2800ac2 mov x2, #0x56 // #86 | 24: d2800f03 mov x3, #0x78 // #120 | 28: 48207c82 casp x0, x1, x2, x3, [x4] | 2c: ca050000 eor x0, x0, x5 | 30: ca060021 eor x1, x1, x6 | 34: aa010000 orr x0, x0, x1 | 38: f9400480 ldr x0, [x4, #8] | 3c: d50323bf autiasp | 40: ca0000e0 eor x0, x7, x0 | 44: d65f03c0 ret | 48: d2800240 mov x0, #0x12 // #18 | 4c: d2800681 mov x1, #0x34 // #52 | 50: d2800ac2 mov x2, #0x56 // #86 | 54: d2800f03 mov x3, #0x78 // #120 | 58: f9800091 prfm pstl1strm, [x4] | 5c: c87f1885 ldxp x5, x6, [x4] | 60: ca0000a5 eor x5, x5, x0 | 64: ca0100c6 eor x6, x6, x1 | 68: aa0600a6 orr x6, x5, x6 | 6c: b5000066 cbnz x6, 78 <foo+0x78> | 70: c8250c82 stxp w5, x2, x3, [x4] | 74: 35ffff45 cbnz w5, 5c <foo+0x5c> | 78: f9400480 ldr x0, [x4, #8] | 7c: d50323bf autiasp | 80: ca0000e0 eor x0, x7, x0 | 84: d65f03c0 ret ... sampling the high 8 bytes before and after the cmpxchg, and performing an EOR, as we'd expect. For backporting, I've tested this atop linux-4.9.y with GCC 5.5.0. Note that linux-4.9.y is oldest currently supported stable release, and mandates GCC 5.1+. Unfortunately I couldn't get a GCC 5.1 binary to run on my machines due to library incompatibilities. I've also used a standalone test to check that we can use a __uint128_t pointer in a +Q constraint at least as far back as GCC 4.8.5 and LLVM 3.9.1. Fixes: 5284e1b4bc8a ("arm64: xchg: Implement cmpxchg_double") Fixes: e9a4b795652f ("arm64: cmpxchg_dbl: patch in lse instructions when supported by the CPU") Reported-by: Boqun Feng <boqun.feng@gmail.com> Link: https://lore.kernel.org/lkml/Y6DEfQXymYVgL3oJ@boqun-archlinux/ Reported-by: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/lkml/Y6GXoO4qmH9OIZ5Q@hirez.programming.kicks-ass.net/ Signed-off-by: Mark Rutland <mark.rutland@arm.com> Cc: stable@vger.kernel.org Cc: Arnd Bergmann <arnd@arndb.de> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Steve Capper <steve.capper@arm.com> Cc: Will Deacon <will@kernel.org> Link: https://lore.kernel.org/r/20230104151626.3262137-1-mark.rutland@arm.com Signed-off-by: Will Deacon <will@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-01-18arm64: atomics: remove LL/SC trampolinesMark Rutland
[ Upstream commit b2c3ccbd0011bb3b51d0fec24cb3a5812b1ec8ea ] When CONFIG_ARM64_LSE_ATOMICS=y, each use of an LL/SC atomic results in a fragment of code being generated in a subsection without a clear association with its caller. A trampoline in the caller branches to the LL/SC atomic with with a direct branch, and the atomic directly branches back into its trampoline. This breaks backtracing, as any PC within the out-of-line fragment will be symbolized as an offset from the nearest prior symbol (which may not be the function using the atomic), and since the atomic returns with a direct branch, the caller's PC may be missing from the backtrace. For example, with secondary_start_kernel() hacked to contain atomic_inc(NULL), the resulting exception can be reported as being taken from cpus_are_stuck_in_kernel(): | Unable to handle kernel NULL pointer dereference at virtual address 0000000000000000 | Mem abort info: | ESR = 0x0000000096000004 | EC = 0x25: DABT (current EL), IL = 32 bits | SET = 0, FnV = 0 | EA = 0, S1PTW = 0 | FSC = 0x04: level 0 translation fault | Data abort info: | ISV = 0, ISS = 0x00000004 | CM = 0, WnR = 0 | [0000000000000000] user address but active_mm is swapper | Internal error: Oops: 96000004 [#1] PREEMPT SMP | Modules linked in: | CPU: 1 PID: 0 Comm: swapper/1 Not tainted 5.19.0-11219-geb555cb5b794-dirty #3 | Hardware name: linux,dummy-virt (DT) | pstate: 60000005 (nZCv daif -PAN -UAO -TCO -DIT -SSBS BTYPE=--) | pc : cpus_are_stuck_in_kernel+0xa4/0x120 | lr : secondary_start_kernel+0x164/0x170 | sp : ffff80000a4cbe90 | x29: ffff80000a4cbe90 x28: 0000000000000000 x27: 0000000000000000 | x26: 0000000000000000 x25: 0000000000000000 x24: 0000000000000000 | x23: 0000000000000000 x22: 0000000000000000 x21: 0000000000000000 | x20: 0000000000000001 x19: 0000000000000001 x18: 0000000000000008 | x17: 3030383832343030 x16: 3030303030307830 x15: ffff80000a4cbab0 | x14: 0000000000000001 x13: 5d31666130663133 x12: 3478305b20313030 | x11: 3030303030303078 x10: 3020726f73736563 x9 : 726f737365636f72 | x8 : ffff800009ff2ef0 x7 : 0000000000000003 x6 : 0000000000000000 | x5 : 0000000000000000 x4 : 0000000000000000 x3 : 0000000000000100 | x2 : 0000000000000000 x1 : ffff0000029bd880 x0 : 0000000000000000 | Call trace: | cpus_are_stuck_in_kernel+0xa4/0x120 | __secondary_switched+0xb0/0xb4 | Code: 35ffffa3 17fffc6c d53cd040 f9800011 (885f7c01) | ---[ end trace 0000000000000000 ]--- This is confusing and hinders debugging, and will be problematic for CONFIG_LIVEPATCH as these cases cannot be unwound reliably. This is very similar to recent issues with out-of-line exception fixups, which were removed in commits: 35d67794b8828333 ("arm64: lib: __arch_clear_user(): fold fixups into body") 4012e0e22739eef9 ("arm64: lib: __arch_copy_from_user(): fold fixups into body") 139f9ab73d60cf76 ("arm64: lib: __arch_copy_to_user(): fold fixups into body") When the trampolines were introduced in commit: addfc38672c73efd ("arm64: atomics: avoid out-of-line ll/sc atomics") The rationale was to improve icache performance by grouping the LL/SC atomics together. This has never been measured, and this theoretical benefit is outweighed by other factors: * As the subsections are collapsed into sections at object file granularity, these are spread out throughout the kernel and can share cachelines with unrelated code regardless. * GCC 12.1.0 has been observed to place the trampoline out-of-line in specialised __ll_sc_*() functions, introducing more branching than was intended. * Removing the trampolines has been observed to shrink a defconfig kernel Image by 64KiB when building with GCC 12.1.0. This patch removes the LL/SC trampolines, meaning that the LL/SC atomics will be inlined into their callers (or placed in out-of line functions using regular BL/RET pairs). When CONFIG_ARM64_LSE_ATOMICS=y, the LL/SC atomics are always called in an unlikely branch, and will be placed in a cold portion of the function, so this should have minimal impact to the hot paths. Other than the improved backtracing, there should be no functional change as a result of this patch. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Cc: Will Deacon <will@kernel.org> Link: https://lore.kernel.org/r/20220817155914.3975112-2-mark.rutland@arm.com Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Stable-dep-of: 031af50045ea ("arm64: cmpxchg_double*: hazard against entire exchange variable") Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-01-18arm64: atomics: format whitespace consistentlyMark Rutland
[ Upstream commit 8e6082e94aac6d0338883b5953631b662a5a9188 ] The code for the atomic ops is formatted inconsistently, and while this is not a functional problem it is rather distracting when working on them. Some have ops have consistent indentation, e.g. | #define ATOMIC_OP_ADD_RETURN(name, mb, cl...) \ | static inline int __lse_atomic_add_return##name(int i, atomic_t *v) \ | { \ | u32 tmp; \ | \ | asm volatile( \ | __LSE_PREAMBLE \ | " ldadd" #mb " %w[i], %w[tmp], %[v]\n" \ | " add %w[i], %w[i], %w[tmp]" \ | : [i] "+r" (i), [v] "+Q" (v->counter), [tmp] "=&r" (tmp) \ | : "r" (v) \ | : cl); \ | \ | return i; \ | } While others have negative indentation for some lines, and/or have misaligned trailing backslashes, e.g. | static inline void __lse_atomic_##op(int i, atomic_t *v) \ | { \ | asm volatile( \ | __LSE_PREAMBLE \ | " " #asm_op " %w[i], %[v]\n" \ | : [i] "+r" (i), [v] "+Q" (v->counter) \ | : "r" (v)); \ | } This patch makes the indentation consistent and also aligns the trailing backslashes. This makes the code easier to read for those (like myself) who are easily distracted by these inconsistencies. This is intended as a cleanup. There should be no functional change as a result of this patch. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Cc: Boqun Feng <boqun.feng@gmail.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Will Deacon <will@kernel.org> Acked-by: Will Deacon <will@kernel.org> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lore.kernel.org/r/20211210151410.2782645-2-mark.rutland@arm.com Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Stable-dep-of: 031af50045ea ("arm64: cmpxchg_double*: hazard against entire exchange variable") Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-01-18io_uring: lock overflowing for IOPOLLPavel Begunkov
commit 544d163d659d45a206d8929370d5a2984e546cb7 upstream. syzbot reports an issue with overflow filling for IOPOLL: WARNING: CPU: 0 PID: 28 at io_uring/io_uring.c:734 io_cqring_event_overflow+0x1c0/0x230 io_uring/io_uring.c:734 CPU: 0 PID: 28 Comm: kworker/u4:1 Not tainted 6.2.0-rc3-syzkaller-16369-g358a161a6a9e #0 Workqueue: events_unbound io_ring_exit_work Call trace:  io_cqring_event_overflow+0x1c0/0x230 io_uring/io_uring.c:734  io_req_cqe_overflow+0x5c/0x70 io_uring/io_uring.c:773  io_fill_cqe_req io_uring/io_uring.h:168 [inline]  io_do_iopoll+0x474/0x62c io_uring/rw.c:1065  io_iopoll_try_reap_events+0x6c/0x108 io_uring/io_uring.c:1513  io_uring_try_cancel_requests+0x13c/0x258 io_uring/io_uring.c:3056  io_ring_exit_work+0xec/0x390 io_uring/io_uring.c:2869  process_one_work+0x2d8/0x504 kernel/workqueue.c:2289  worker_thread+0x340/0x610 kernel/workqueue.c:2436  kthread+0x12c/0x158 kernel/kthread.c:376  ret_from_fork+0x10/0x20 arch/arm64/kernel/entry.S:863 There is no real problem for normal IOPOLL as flush is also called with uring_lock taken, but it's getting more complicated for IOPOLL|SQPOLL, for which __io_cqring_overflow_flush() happens from the CQ waiting path. Reported-and-tested-by: syzbot+6805087452d72929404e@syzkaller.appspotmail.com Cc: stable@vger.kernel.org # 5.10+ Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Signed-off-by: Jens Axboe <axboe@kernel.dk> Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-01-18KVM: x86: Do not return host topology information from KVM_GET_SUPPORTED_CPUIDPaolo Bonzini
[ Upstream commit 45e966fcca03ecdcccac7cb236e16eea38cc18af ] Passing the host topology to the guest is almost certainly wrong and will confuse the scheduler. In addition, several fields of these CPUID leaves vary on each processor; it is simply impossible to return the right values from KVM_GET_SUPPORTED_CPUID in such a way that they can be passed to KVM_SET_CPUID2. The values that will most likely prevent confusion are all zeroes. Userspace will have to override it anyway if it wishes to present a specific topology to the guest. Cc: stable@vger.kernel.org Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-01-18Documentation: KVM: add API issues sectionPaolo Bonzini
[ Upstream commit cde363ab7ca7aea7a853851cd6a6745a9e1aaf5e ] Add a section to document all the different ways in which the KVM API sucks. I am sure there are way more, give people a place to vent so that userspace authors are aware. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Message-Id: <20220322110712.222449-4-pbonzini@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-01-18mm: Always release pages to the buddy allocator in memblock_free_late().Aaron Thompson
[ Upstream commit 115d9d77bb0f9152c60b6e8646369fa7f6167593 ] If CONFIG_DEFERRED_STRUCT_PAGE_INIT is enabled, memblock_free_pages() only releases pages to the buddy allocator if they are not in the deferred range. This is correct for free pages (as defined by for_each_free_mem_pfn_range_in_zone()) because free pages in the deferred range will be initialized and released as part of the deferred init process. memblock_free_pages() is called by memblock_free_late(), which is used to free reserved ranges after memblock_free_all() has run. All pages in reserved ranges have been initialized at that point, and accordingly, those pages are not touched by the deferred init process. This means that currently, if the pages that memblock_free_late() intends to release are in the deferred range, they will never be released to the buddy allocator. They will forever be reserved. In addition, memblock_free_pages() calls kmsan_memblock_free_pages(), which is also correct for free pages but is not correct for reserved pages. KMSAN metadata for reserved pages is initialized by kmsan_init_shadow(), which runs shortly before memblock_free_all(). For both of these reasons, memblock_free_pages() should only be called for free pages, and memblock_free_late() should call __free_pages_core() directly instead. One case where this issue can occur in the wild is EFI boot on x86_64. The x86 EFI code reserves all EFI boot services memory ranges via memblock_reserve() and frees them later via memblock_free_late() (efi_reserve_boot_services() and efi_free_boot_services(), respectively). If any of those ranges happens to fall within the deferred init range, the pages will not be released and that memory will be unavailable. For example, on an Amazon EC2 t3.micro VM (1 GB) booting via EFI: v6.2-rc2: # grep -E 'Node|spanned|present|managed' /proc/zoneinfo Node 0, zone DMA spanned 4095 present 3999 managed 3840 Node 0, zone DMA32 spanned 246652 present 245868 managed 178867 v6.2-rc2 + patch: # grep -E 'Node|spanned|present|managed' /proc/zoneinfo Node 0, zone DMA spanned 4095 present 3999 managed 3840 Node 0, zone DMA32 spanned 246652 present 245868 managed 222816 # +43,949 pages Fixes: 3a80a7fa7989 ("mm: meminit: initialise a subset of struct pages if CONFIG_DEFERRED_STRUCT_PAGE_INIT is set") Signed-off-by: Aaron Thompson <dev@aaront.org> Link: https://lore.kernel.org/r/01010185892de53e-e379acfb-7044-4b24-b30a-e2657c1ba989-000000@us-west-2.amazonses.com Signed-off-by: Mike Rapoport (IBM) <rppt@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-01-18platform/surface: aggregator: Add missing call to ssam_request_sync_free()Maximilian Luz
[ Upstream commit c965daac370f08a9b71d573a71d13cda76f2a884 ] Although rare, ssam_request_sync_init() can fail. In that case, the request should be freed via ssam_request_sync_free(). Currently it is leaked instead. Fix this. Fixes: c167b9c7e3d6 ("platform/surface: Add Surface Aggregator subsystem") Signed-off-by: Maximilian Luz <luzmaximilian@gmail.com> Link: https://lore.kernel.org/r/20221220175608.1436273-1-luzmaximilian@gmail.com Reviewed-by: Hans de Goede <hdegoede@redhat.com> Signed-off-by: Hans de Goede <hdegoede@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-01-18igc: Fix PPS delta between two synchronized end-pointsChristopher S Hall
[ Upstream commit 5e91c72e560cc85f7163bbe3d14197268de31383 ] This patch fix the pulse per second output delta between two synchronized end-points. Based on Intel Discrete I225 Software User Manual Section 4.2.15 TimeSync Auxiliary Control Register, ST0[Bit 4] and ST1[Bit 7] must be set to ensure that clock output will be toggles based on frequency value defined. This is to ensure that output of the PPS is aligned with the clock. How to test: 1) Running time synchronization on both end points. Ex: ptp4l --step_threshold=1 -m -f gPTP.cfg -i <interface name> 2) Configure PPS output using below command for both end-points Ex: SDP0 on I225 REV4 SKU variant ./testptp -d /dev/ptp0 -L 0,2 ./testptp -d /dev/ptp0 -p 1000000000 3) Measure the output using analyzer for both end-points Fixes: 87938851b6ef ("igc: enable auxiliary PHC functions for the i225") Signed-off-by: Christopher S Hall <christopher.s.hall@intel.com> Signed-off-by: Muhammad Husaini Zulkifli <muhammad.husaini.zulkifli@intel.com> Acked-by: Sasha Neftin <sasha.neftin@intel.com> Tested-by: Naama Meir <naamax.meir@linux.intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-01-18perf build: Properly guard libbpf includesIan Rogers
[ Upstream commit d891f2b724b39a2a41e3ad7b57110193993242ff ] Including libbpf header files should be guarded by HAVE_LIBBPF_SUPPORT. In bpf_counter.h, move the skeleton utilities under HAVE_BPF_SKEL. Fixes: d6a735ef3277c45f ("perf bpf_counter: Move common functions to bpf_counter.h") Reported-by: Mike Leach <mike.leach@linaro.org> Signed-off-by: Ian Rogers <irogers@google.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Tested-by: Jiri Olsa <jolsa@kernel.org> Tested-by: Mike Leach <mike.leach@linaro.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lore.kernel.org/lkml/20230105172243.7238-1-mike.leach@linaro.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-01-18net/mlx5e: Don't support encap rules with gbp optionGavin Li
[ Upstream commit d515d63cae2cd186acf40deaa8ef33067bb7f637 ] Previously, encap rules with gbp option would be offloaded by mistake but driver does not support gbp option offload. To fix this issue, check if the encap rule has gbp option and don't offload the rule Fixes: d8f9dfae49ce ("net: sched: allow flower to match vxlan options") Signed-off-by: Gavin Li <gavinl@nvidia.com> Reviewed-by: Maor Dickman <maord@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-01-18net/mlx5: Fix ptp max frequency adjustment rangeRahul Rameshbabu
[ Upstream commit fe91d57277eef8bb4aca05acfa337b4a51d0bba4 ] .max_adj of ptp_clock_info acts as an absolute value for the amount in ppb that can be set for a single call of .adjfine. This means that a single call to .getfine cannot be greater than .max_adj or less than -(.max_adj). Provides correct value for max frequency adjustment value supported by devices. Fixes: 3d8c38af1493 ("net/mlx5e: Add PTP Hardware Clock (PHC) support") Signed-off-by: Rahul Rameshbabu <rrameshbabu@nvidia.com> Reviewed-by: Gal Pressman <gal@nvidia.com> Reviewed-by: Tariq Toukan <tariqt@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-01-18net/sched: act_mpls: Fix warning during failed attribute validationIdo Schimmel
[ Upstream commit 9e17f99220d111ea031b44153fdfe364b0024ff2 ] The 'TCA_MPLS_LABEL' attribute is of 'NLA_U32' type, but has a validation type of 'NLA_VALIDATE_FUNCTION'. This is an invalid combination according to the comment above 'struct nla_policy': " Meaning of `validate' field, use via NLA_POLICY_VALIDATE_FN: NLA_BINARY Validation function called for the attribute. All other Unused - but note that it's a union " This can trigger the warning [1] in nla_get_range_unsigned() when validation of the attribute fails. Despite being of 'NLA_U32' type, the associated 'min'/'max' fields in the policy are negative as they are aliased by the 'validate' field. Fix by changing the attribute type to 'NLA_BINARY' which is consistent with the above comment and all other users of NLA_POLICY_VALIDATE_FN(). As a result, move the length validation to the validation function. No regressions in MPLS tests: # ./tdc.py -f tc-tests/actions/mpls.json [...] # echo $? 0 [1] WARNING: CPU: 0 PID: 17743 at lib/nlattr.c:118 nla_get_range_unsigned+0x1d8/0x1e0 lib/nlattr.c:117 Modules linked in: CPU: 0 PID: 17743 Comm: syz-executor.0 Not tainted 6.1.0-rc8 #3 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.13.0-48-gd9c812dda519-prebuilt.qemu.org 04/01/2014 RIP: 0010:nla_get_range_unsigned+0x1d8/0x1e0 lib/nlattr.c:117 [...] Call Trace: <TASK> __netlink_policy_dump_write_attr+0x23d/0x990 net/netlink/policy.c:310 netlink_policy_dump_write_attr+0x22/0x30 net/netlink/policy.c:411 netlink_ack_tlv_fill net/netlink/af_netlink.c:2454 [inline] netlink_ack+0x546/0x760 net/netlink/af_netlink.c:2506 netlink_rcv_skb+0x1b7/0x240 net/netlink/af_netlink.c:2546 rtnetlink_rcv+0x18/0x20 net/core/rtnetlink.c:6109 netlink_unicast_kernel net/netlink/af_netlink.c:1319 [inline] netlink_unicast+0x5e9/0x6b0 net/netlink/af_netlink.c:1345 netlink_sendmsg+0x739/0x860 net/netlink/af_netlink.c:1921 sock_sendmsg_nosec net/socket.c:714 [inline] sock_sendmsg net/socket.c:734 [inline] ____sys_sendmsg+0x38f/0x500 net/socket.c:2482 ___sys_sendmsg net/socket.c:2536 [inline] __sys_sendmsg+0x197/0x230 net/socket.c:2565 __do_sys_sendmsg net/socket.c:2574 [inline] __se_sys_sendmsg net/socket.c:2572 [inline] __x64_sys_sendmsg+0x42/0x50 net/socket.c:2572 do_syscall_x64 arch/x86/entry/common.c:50 [inline] do_syscall_64+0x2b/0x70 arch/x86/entry/common.c:80 entry_SYSCALL_64_after_hwframe+0x63/0xcd Link: https://lore.kernel.org/netdev/CAO4mrfdmjvRUNbDyP0R03_DrD_eFCLCguz6OxZ2TYRSv0K9gxA@mail.gmail.com/ Fixes: 2a2ea50870ba ("net: sched: add mpls manipulation actions to TC") Reported-by: Wei Chen <harperchen1110@gmail.com> Tested-by: Wei Chen <harperchen1110@gmail.com> Signed-off-by: Ido Schimmel <idosch@nvidia.com> Reviewed-by: Alexander Duyck <alexanderduyck@fb.com> Link: https://lore.kernel.org/r/20230107171004.608436-1-idosch@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-01-18tools/nolibc: fix the O_* fcntl/open macro definitions for riscvWilly Tarreau
[ Upstream commit 00b18da4089330196906b9fe075c581c17eb726c ] When RISCV port was imported in 5.2, the O_* macros were taken with their octal value and written as-is in hex, resulting in the getdents64() to fail in nolibc-test. Fixes: 582e84f7b779 ("tool headers nolibc: add RISCV support") #5.2 Signed-off-by: Willy Tarreau <w@1wt.eu> Signed-off-by: Paul E. McKenney <paulmck@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-01-18tools/nolibc: restore mips branch ordering in the _start blockWilly Tarreau
[ Upstream commit 184177c3d6e023da934761e198c281344d7dd65b ] Depending on the compiler used and the optimization options, the sbrk() test was crashing, both on real hardware (mips-24kc) and in qemu. One such example is kernel.org toolchain in version 11.3 optimizing at -Os. Inspecting the sys_brk() call shows the following code: 0040047c <sys_brk>: 40047c: 24020fcd li v0,4045 400480: 27bdffe0 addiu sp,sp,-32 400484: 0000000c syscall 400488: 27bd0020 addiu sp,sp,32 40048c: 10e00001 beqz a3,400494 <sys_brk+0x18> 400490: 00021023 negu v0,v0 400494: 03e00008 jr ra It is obviously wrong, the "negu" instruction is placed in beqz's delayed slot, and worse, there's no nop nor instruction after the return, so the next function's first instruction (addiu sip,sip,-32) will also be executed as part of the delayed slot that follows the return. This is caused by the ".set noreorder" directive in the _start block, that applies to the whole program. The compiler emits code without the delayed slots and relies on the compiler to swap instructions when this option is not set. Removing the option would require to change the startup code in a way that wouldn't make it look like the resulting code, which would not be easy to debug. Instead let's just save the default ordering before changing it, and restore it at the end of the _start block. Now the code is correct: 0040047c <sys_brk>: 40047c: 24020fcd li v0,4045 400480: 27bdffe0 addiu sp,sp,-32 400484: 0000000c syscall 400488: 10e00002 beqz a3,400494 <sys_brk+0x18> 40048c: 27bd0020 addiu sp,sp,32 400490: 00021023 negu v0,v0 400494: 03e00008 jr ra 400498: 00000000 nop Fixes: 66b6f755ad45 ("rcutorture: Import a copy of nolibc") #5.0 Signed-off-by: Willy Tarreau <w@1wt.eu> Signed-off-by: Paul E. McKenney <paulmck@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-01-18tools/nolibc: Remove .global _start from the entry point codeAmmar Faizi
[ Upstream commit 1590c59836dace3a20945bad049fe8802c4e6f3f ] Building with clang yields the following error: ``` <inline asm>:3:1: error: _start changed binding to STB_GLOBAL .global _start ^ 1 error generated. ``` Make sure only specify one between `.global _start` and `.weak _start`. Remove `.global _start`. Cc: llvm@lists.linux.dev Reviewed-by: Nick Desaulniers <ndesaulniers@google.com> Acked-by: Willy Tarreau <w@1wt.eu> Signed-off-by: Ammar Faizi <ammarfaizi2@gnuweeb.org> Signed-off-by: Paul E. McKenney <paulmck@kernel.org> Stable-dep-of: 184177c3d6e0 ("tools/nolibc: restore mips branch ordering in the _start block") Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-01-18tools/nolibc/arch: mark the _start symbol as weakWilly Tarreau
[ Upstream commit dffeb81af5fe5eedccf5ea4a8a120d8c3accd26e ] By doing so we can link together multiple C files that have been compiled with nolibc and which each have a _start symbol. Signed-off-by: Willy Tarreau <w@1wt.eu> Signed-off-by: Paul E. McKenney <paulmck@kernel.org> Stable-dep-of: 184177c3d6e0 ("tools/nolibc: restore mips branch ordering in the _start block") Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-01-18tools/nolibc/arch: split arch-specific code into individual filesWilly Tarreau
[ Upstream commit 271661c1cde5ff47eb7af9946866cd66b70dc328 ] In order to ease maintenance, this splits the arch-specific code into one file per architecture. A common file "arch.h" is used to include the right file among arch-* based on the detected architecture. Projects which are already split per architecture could simply rename these files to $arch/arch.h and get rid of the common arch.h. For this reason, include guards were placed into each arch-specific file. Signed-off-by: Willy Tarreau <w@1wt.eu> Signed-off-by: Paul E. McKenney <paulmck@kernel.org> Stable-dep-of: 184177c3d6e0 ("tools/nolibc: restore mips branch ordering in the _start block") Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-01-18tools/nolibc/types: split syscall-specific definitions into their own filesWilly Tarreau
[ Upstream commit cc7a492ad0a076dff5cb4281b1516676d7924fcf ] The macros and type definitions used by a number of syscalls were moved to types.h where they will be easier to maintain. A few of them are arch-specific and must not be moved there (e.g. O_*, sys_stat_struct). A warning about them was placed at the top of the file. Signed-off-by: Willy Tarreau <w@1wt.eu> Signed-off-by: Paul E. McKenney <paulmck@kernel.org> Stable-dep-of: 184177c3d6e0 ("tools/nolibc: restore mips branch ordering in the _start block") Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-01-18tools/nolibc/std: move the standard type definitions to std.hWilly Tarreau
[ Upstream commit 967cce191f50090d5cbd3841ee2bbb7835afeae2 ] The ordering of includes and definitions for now is a bit of a mess, as for example asm/signal.h is included after int definitions, but plenty of structures are defined later as they rely on other includes. Let's move the standard type definitions to a dedicated file that is included first. We also move NULL there. This way all other includes are aware of it, and we can bring asm/signal.h back to the top of the file. Signed-off-by: Willy Tarreau <w@1wt.eu> Signed-off-by: Paul E. McKenney <paulmck@kernel.org> Stable-dep-of: 184177c3d6e0 ("tools/nolibc: restore mips branch ordering in the _start block") Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-01-18tools/nolibc: use pselect6 on RISCVWilly Tarreau
[ Upstream commit 9c2970fbb425cca0256ecf0f96490e4f253fda24 ] This arch doesn't provide the old-style select() syscall, we have to use pselect6(). Signed-off-by: Willy Tarreau <w@1wt.eu> Signed-off-by: Paul E. McKenney <paulmck@kernel.org> Stable-dep-of: 184177c3d6e0 ("tools/nolibc: restore mips branch ordering in the _start block") Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-01-18tools/nolibc: x86-64: Use `mov $60,%eax` instead of `mov $60,%rax`Ammar Faizi
[ Upstream commit 7bdc0e7a390511cd3df8194003b908f15a6170a5 ] Note that mov to 32-bit register will zero extend to 64-bit register. Thus `mov $60,%eax` has the same effect with `mov $60,%rax`. Use the shorter opcode to achieve the same thing. ``` b8 3c 00 00 00 mov $60,%eax (5 bytes) [1] 48 c7 c0 3c 00 00 00 mov $60,%rax (7 bytes) [2] ``` Currently, we use [2]. Change it to [1] for shorter code. Signed-off-by: Ammar Faizi <ammar.faizi@students.amikom.ac.id> Signed-off-by: Willy Tarreau <w@1wt.eu> Signed-off-by: Paul E. McKenney <paulmck@kernel.org> Stable-dep-of: 184177c3d6e0 ("tools/nolibc: restore mips branch ordering in the _start block") Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-01-18tools/nolibc: x86: Remove `r8`, `r9` and `r10` from the clobber listAmmar Faizi
[ Upstream commit bf91666959eeac44fb686e9359e37830944beef2 ] Linux x86-64 syscall only clobbers rax, rcx and r11 (and "memory"). - rax for the return value. - rcx to save the return address. - r11 to save the rflags. Other registers are preserved. Having r8, r9 and r10 in the syscall clobber list is harmless, but this results in a missed-optimization. As the syscall doesn't clobber r8-r10, GCC should be allowed to reuse their value after the syscall returns to userspace. But since they are in the clobber list, GCC will always miss this opportunity. Remove them from the x86-64 syscall clobber list to help GCC generate better code and fix the comment. See also the x86-64 ABI, section A.2 AMD64 Linux Kernel Conventions, A.2.1 Calling Conventions [1]. Extra note: Some people may think it does not really give a benefit to remove r8, r9 and r10 from the syscall clobber list because the impression of syscall is a C function call, and function call always clobbers those 3. However, that is not the case for nolibc.h, because we have a potential to inline the "syscall" instruction (which its opcode is "0f 05") to the user functions. All syscalls in the nolibc.h are written as a static function with inline ASM and are likely always inline if we use optimization flag, so this is a profit not to have r8, r9 and r10 in the clobber list. Here is the example where this matters. Consider the following C code: ``` #include "tools/include/nolibc/nolibc.h" #define read_abc(a, b, c) __asm__ volatile("nop"::"r"(a),"r"(b),"r"(c)) int main(void) { int a = 0xaa; int b = 0xbb; int c = 0xcc; read_abc(a, b, c); write(1, "test\n", 5); read_abc(a, b, c); return 0; } ``` Compile with: gcc -Os test.c -o test -nostdlib With r8, r9, r10 in the clobber list, GCC generates this: 0000000000001000 <main>: 1000: f3 0f 1e fa endbr64 1004: 41 54 push %r12 1006: 41 bc cc 00 00 00 mov $0xcc,%r12d 100c: 55 push %rbp 100d: bd bb 00 00 00 mov $0xbb,%ebp 1012: 53 push %rbx 1013: bb aa 00 00 00 mov $0xaa,%ebx 1018: 90 nop 1019: b8 01 00 00 00 mov $0x1,%eax 101e: bf 01 00 00 00 mov $0x1,%edi 1023: ba 05 00 00 00 mov $0x5,%edx 1028: 48 8d 35 d1 0f 00 00 lea 0xfd1(%rip),%rsi 102f: 0f 05 syscall 1031: 90 nop 1032: 31 c0 xor %eax,%eax 1034: 5b pop %rbx 1035: 5d pop %rbp 1036: 41 5c pop %r12 1038: c3 ret GCC thinks that syscall will clobber r8, r9, r10. So it spills 0xaa, 0xbb and 0xcc to callee saved registers (r12, rbp and rbx). This is clearly extra memory access and extra stack size for preserving them. But syscall does not actually clobber them, so this is a missed optimization. Now without r8, r9, r10 in the clobber list, GCC generates better code: 0000000000001000 <main>: 1000: f3 0f 1e fa endbr64 1004: 41 b8 aa 00 00 00 mov $0xaa,%r8d 100a: 41 b9 bb 00 00 00 mov $0xbb,%r9d 1010: 41 ba cc 00 00 00 mov $0xcc,%r10d 1016: 90 nop 1017: b8 01 00 00 00 mov $0x1,%eax 101c: bf 01 00 00 00 mov $0x1,%edi 1021: ba 05 00 00 00 mov $0x5,%edx 1026: 48 8d 35 d3 0f 00 00 lea 0xfd3(%rip),%rsi 102d: 0f 05 syscall 102f: 90 nop 1030: 31 c0 xor %eax,%eax 1032: c3 ret Cc: Andy Lutomirski <luto@kernel.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@redhat.com> Cc: Borislav Petkov <bp@alien8.de> Cc: x86@kernel.org Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: David Laight <David.Laight@ACULAB.COM> Acked-by: Andy Lutomirski <luto@kernel.org> Signed-off-by: Ammar Faizi <ammar.faizi@students.amikom.ac.id> Link: https://gitlab.com/x86-psABIs/x86-64-ABI/-/wikis/x86-64-psABI [1] Link: https://lore.kernel.org/lkml/20211011040344.437264-1-ammar.faizi@students.amikom.ac.id/ Signed-off-by: Willy Tarreau <w@1wt.eu> Signed-off-by: Paul E. McKenney <paulmck@kernel.org> Stable-dep-of: 184177c3d6e0 ("tools/nolibc: restore mips branch ordering in the _start block") Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-01-18af_unix: selftest: Fix the size of the parameter to connect()Mirsad Goran Todorovac
[ Upstream commit 7d6ceeb1875cc08dc3d1e558e191434d94840cd5 ] Adjust size parameter in connect() to match the type of the parameter, to fix "No such file or directory" error in selftests/net/af_unix/ test_oob_unix.c:127. The existing code happens to work provided that the autogenerated pathname is shorter than sizeof (struct sockaddr), which is why it hasn't been noticed earlier. Visible from the trace excerpt: bind(3, {sa_family=AF_UNIX, sun_path="unix_oob_453059"}, 110) = 0 clone(child_stack=NULL, flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLD, child_tidptr=0x7fa6a6577a10) = 453060 [pid <child>] connect(6, {sa_family=AF_UNIX, sun_path="unix_oob_45305"}, 16) = -1 ENOENT (No such file or directory) BUG: The filename is trimmed to sizeof (struct sockaddr). Cc: "David S. Miller" <davem@davemloft.net> Cc: Eric Dumazet <edumazet@google.com> Cc: Jakub Kicinski <kuba@kernel.org> Cc: Paolo Abeni <pabeni@redhat.com> Cc: Shuah Khan <shuah@kernel.org> Cc: Kuniyuki Iwashima <kuniyu@amazon.co.jp> Cc: Florian Westphal <fw@strlen.de> Reviewed-by: Florian Westphal <fw@strlen.de> Fixes: 314001f0bf92 ("af_unix: Add OOB support") Signed-off-by: Mirsad Goran Todorovac <mirsad.todorovac@alu.unizg.hr> Reviewed-by: Kuniyuki Iwashima <kuniyu@amazon.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-01-18nfc: pn533: Wait for out_urb's completion in pn533_usb_send_frame()Minsuk Kang
[ Upstream commit 9dab880d675b9d0dd56c6428e4e8352a3339371d ] Fix a use-after-free that occurs in hcd when in_urb sent from pn533_usb_send_frame() is completed earlier than out_urb. Its callback frees the skb data in pn533_send_async_complete() that is used as a transfer buffer of out_urb. Wait before sending in_urb until the callback of out_urb is called. To modify the callback of out_urb alone, separate the complete function of out_urb and ack_urb. Found by a modified version of syzkaller. BUG: KASAN: use-after-free in dummy_timer Call Trace: memcpy (mm/kasan/shadow.c:65) dummy_perform_transfer (drivers/usb/gadget/udc/dummy_hcd.c:1352) transfer (drivers/usb/gadget/udc/dummy_hcd.c:1453) dummy_timer (drivers/usb/gadget/udc/dummy_hcd.c:1972) arch_static_branch (arch/x86/include/asm/jump_label.h:27) static_key_false (include/linux/jump_label.h:207) timer_expire_exit (include/trace/events/timer.h:127) call_timer_fn (kernel/time/timer.c:1475) expire_timers (kernel/time/timer.c:1519) __run_timers (kernel/time/timer.c:1790) run_timer_softirq (kernel/time/timer.c:1803) Fixes: c46ee38620a2 ("NFC: pn533: add NXP pn533 nfc device driver") Signed-off-by: Minsuk Kang <linuxlovemin@yonsei.ac.kr> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-01-18hvc/xen: lock console list traversalRoger Pau Monne
[ Upstream commit c0dccad87cf68fc6012aec7567e354353097ec1a ] The currently lockless access to the xen console list in vtermno_to_xencons() is incorrect, as additions and removals from the list can happen anytime, and as such the traversal of the list to get the private console data for a given termno needs to happen with the lock held. Note users that modify the list already do so with the lock taken. Adjust current lock takers to use the _irq{save,restore} helpers, since the context in which vtermno_to_xencons() is called can have interrupts disabled. Use the _irq{save,restore} set of helpers to switch the current callers to disable interrupts in the locked region. I haven't checked if existing users could instead use the _irq variant, as I think it's safer to use _irq{save,restore} upfront. While there switch from using list_for_each_entry_safe to list_for_each_entry: the current entry cursor won't be removed as part of the code in the loop body, so using the _safe variant is pointless. Fixes: 02e19f9c7cac ('hvc_xen: implement multiconsole support') Signed-off-by: Roger Pau Monné <roger.pau@citrix.com> Reviewed-by: Stefano Stabellini <sstabellini@kernel.org> Link: https://lore.kernel.org/r/20221130163611.14686-1-roger.pau@citrix.com Signed-off-by: Juergen Gross <jgross@suse.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-01-18octeontx2-af: Fix LMAC config in cgx_lmac_rx_tx_enableAngela Czubak
[ Upstream commit b4e9b8763e417db31c7088103cc557d55cb7a8f5 ] PF netdev can request AF to enable or disable reception and transmission on assigned CGX::LMAC. The current code instead of disabling or enabling 'reception and transmission' also disables/enable the LMAC. This patch fixes this issue. Fixes: 1435f66a28b4 ("octeontx2-af: CGX Rx/Tx enable/disable mbox handlers") Signed-off-by: Angela Czubak <aczubak@marvell.com> Signed-off-by: Hariprasad Kelam <hkelam@marvell.com> Reviewed-by: Leon Romanovsky <leonro@nvidia.com> Link: https://lore.kernel.org/r/20230105160107.17638-1-hkelam@marvell.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-01-18tipc: fix unexpected link reset due to discovery messagesTung Nguyen
[ Upstream commit c244c092f1ed2acfb5af3d3da81e22367d3dd733 ] This unexpected behavior is observed: node 1 | node 2 ------ | ------ link is established | link is established reboot | link is reset up | send discovery message receive discovery message | link is established | link is established send discovery message | | receive discovery message | link is reset (unexpected) | send reset message link is reset | It is due to delayed re-discovery as described in function tipc_node_check_dest(): "this link endpoint has already reset and re-established contact with the peer, before receiving a discovery message from that node." However, commit 598411d70f85 has changed the condition for calling tipc_node_link_down() which was the acceptance of new media address. This commit fixes this by restoring the old and correct behavior. Fixes: 598411d70f85 ("tipc: make resetting of links non-atomic") Acked-by: Jon Maloy <jmaloy@redhat.com> Signed-off-by: Tung Nguyen <tung.q.nguyen@dektech.com.au> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-01-18ALSA: usb-audio: Relax hw constraints for implicit fb syncTakashi Iwai
[ Upstream commit d463ac1acb454fafed58f695cb3067fbf489f3a0 ] The fix commit the commit e4ea77f8e53f ("ALSA: usb-audio: Always apply the hw constraints for implicit fb sync") tried to address the bug where an incorrect PCM parameter is chosen when two (implicit fb) streams are set up at the same time. This change had, however, some side effect: once when the sync endpoint is chosen and set up, this restriction is applied at the next hw params unless it's freed via hw free explicitly. This patch is a workaround for the problem by relaxing the hw constraints a bit for the implicit fb sync. We still keep applying the hw constraints for implicit fb sync, but only when the matching sync EP is being used by other streams. Fixes: e4ea77f8e53f ("ALSA: usb-audio: Always apply the hw constraints for implicit fb sync") Reported-by: Ruud van Asseldonk <ruud@veniogames.com> Link: https://lore.kernel.org/r/4e509aea-e563-e592-e652-ba44af6733fe@veniogames.com Link: https://lore.kernel.org/r/20230102170759.29610-3-tiwai@suse.de Signed-off-by: Takashi Iwai <tiwai@suse.de> Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-01-18ALSA: usb-audio: Make sure to stop endpoints before closing EPsTakashi Iwai
[ Upstream commit 0599313e26666e79f6e7fe1450588431b8cb25d5 ] At the PCM hw params, we may re-configure the endpoints and it's done by a temporary EP close followed by re-open. A potential problem there is that the EP might be already running internally at the PCM prepare stage; it's seen typically in the playback stream with the implicit feedback sync. As this stream start isn't tracked by the core PCM layer, we'd need to stop it explicitly, and that's the missing piece. This patch adds the stop_endpoints() call at snd_usb_hw_params() to assure the stream stop before closing the EPs. Fixes: bf6313a0ff76 ("ALSA: usb-audio: Refactor endpoint management") Link: https://lore.kernel.org/r/4e509aea-e563-e592-e652-ba44af6733fe@veniogames.com Link: https://lore.kernel.org/r/20230102170759.29610-2-tiwai@suse.de Signed-off-by: Takashi Iwai <tiwai@suse.de> Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-01-18ASoC: wm8904: fix wrong outputs volume after power reactivationEmanuele Ghidoli
[ Upstream commit 472a6309c6467af89dbf660a8310369cc9cb041f ] Restore volume after charge pump and PGA activation to ensure that volume settings are correctly applied when re-enabling codec from SND_SOC_BIAS_OFF state. CLASS_W, CHARGE_PUMP and POWER_MANAGEMENT_2 register configuration affect how the volume register are applied and must be configured first. Fixes: a91eb199e4dc ("ASoC: Initial WM8904 CODEC driver") Link: https://lore.kernel.org/all/c7864c35-738c-a867-a6a6-ddf9f98df7e7@gmail.com/ Signed-off-by: Emanuele Ghidoli <emanuele.ghidoli@toradex.com> Signed-off-by: Francesco Dolcini <francesco.dolcini@toradex.com> Acked-by: Charles Keepax <ckeepax@opensource.cirrus.com> Link: https://lore.kernel.org/r/20221223080247.7258-1-francesco@dolcini.it Signed-off-by: Mark Brown <broonie@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-01-18scsi: ufs: core: WLUN suspend SSU/enter hibern8 fail recoveryPeter Wang
[ Upstream commit 1a5665fc8d7a000671ebd3fe69c6f9acf1e0dcd9 ] When SSU/enter hibern8 fail in WLUN suspend flow, trigger the error handler and return busy to break the suspend. Otherwise the consumer will get stuck in runtime suspend status. Fixes: b294ff3e3449 ("scsi: ufs: core: Enable power management for wlun") Signed-off-by: Peter Wang <peter.wang@mediatek.com> Link: https://lore.kernel.org/r/20221208072520.26210-1-peter.wang@mediatek.com Reviewed-by: Stanley Chu <stanley.chu@mediatek.com> Reviewed-by: Bart Van Assche <bvanassche@acm.org> Reviewed-by: Adrian Hunter <adrian.hunter@intel.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-01-18scsi: ufs: Stop using the clock scaling lock in the error handlerBart Van Assche
[ Upstream commit 5675c381ea51360b4968b78f23aefda73e3de90d ] Instead of locking and unlocking the clock scaling lock, surround the command queueing code with an RCU reader lock and call synchronize_rcu(). This patch prepares for removal of the clock scaling lock. Link: https://lore.kernel.org/r/20211203231950.193369-16-bvanassche@acm.org Tested-by: Bean Huo <beanhuo@micron.com> Reviewed-by: Adrian Hunter <adrian.hunter@intel.com> Reviewed-by: Bean Huo <beanhuo@micron.com> Signed-off-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Stable-dep-of: 1a5665fc8d7a ("scsi: ufs: core: WLUN suspend SSU/enter hibern8 fail recovery") Signed-off-by: Sasha Levin <sashal@kernel.org>