path: root/drivers/net
Age | Commit message | Author | Files | Lines
18 hours | wifi: ath12k: correct monitor destination ring size [for-next] [ath-next] | Aaradhana Sahu | 1 | -1/+1
The default memory profile configures rxdma_monitor_dst_ring_size as 8092, which is a typo. The intended value is 8192, consistent with all other ring sizes in the table being powers of two. Correct the monitor destination ring size to 8192. Tested-on: QCN9274 hw2.0 PCI WLAN.WBE.1.6-01243-QCAHKSWPL_SILICONZ-1 Fixes: defae535dd63 ("wifi: ath12k: Add a table of parameters entries impacting memory consumption") Signed-off-by: Aaradhana Sahu <aaradhana.sahu@oss.qualcomm.com> Reviewed-by: Rameshkumar Sundaram <rameshkumar.sundaram@oss.qualcomm.com> Reviewed-by: Baochen Qiang <baochen.qiang@oss.qualcomm.com> Link: https://patch.msgid.link/20260616062342.4079796-1-aaradhana.sahu@oss.qualcomm.com Signed-off-by: Jeff Johnson <jeff.johnson@oss.qualcomm.com>
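The power-of-two argument in the message above can be checked with a one-line bit trick. This is only an illustrative sketch; the helper name `is_power_of_two` is not taken from the driver.

```python
def is_power_of_two(n: int) -> bool:
    """Return True when n is a positive power of two.

    A power of two has exactly one bit set, so clearing the lowest
    set bit with n & (n - 1) must leave zero.
    """
    return n > 0 and (n & (n - 1)) == 0


# The typo value and the corrected value from the commit message.
for size in (8092, 8192):
    print(size, is_power_of_two(size))
```

Running the same check over every entry of a ring-size table is a cheap way to catch this class of typo.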
18 hours | wifi: ath12k: change MAC buffer ring size to 4096 | Yingying Tang | 1 | -1/+1
For WCN7850, MAC buffer ring size is updated to 2048 in 955df16f2a4c3 ("wifi: ath12k: change MAC buffer ring size to 2048") to increase peak throughput. But during the RX process, a phenomenon can still be observed where the throughput drops by about 30% from its peak value and then recovers, and this behavior repeats during RX. After increasing MAC buffer ring size to 4096, the data rate drop has gone. Tested-on: WCN7850 hw2.0 PCI WLAN.HMT.1.1.c5-00302-QCAHMTSWPL_V1.0_V2.0_SILICONZ-1.115823.3 Signed-off-by: Yingying Tang <yingying.tang@oss.qualcomm.com> Reviewed-by: Rameshkumar Sundaram <rameshkumar.sundaram@oss.qualcomm.com> Reviewed-by: Baochen Qiang <baochen.qiang@oss.qualcomm.com> Link: https://patch.msgid.link/20260610031358.2043716-1-yingying.tang@oss.qualcomm.com Signed-off-by: Jeff Johnson <jeff.johnson@oss.qualcomm.com>
18 hours | wifi: ath12k: Skip peer link info update in rx_status for monitor MSDUs | Sushant Butta | 1 | -53/+1
Do not populate peer and link_id in ieee80211_rx_status for monitor MSDUs. The monitor RX path is handled differently in mac80211 when RX_FLAG_ONLY_MONITOR is set, and does not consume peer/link metadata. As such, looking up the peer and updating link_id here is unnecessary. Additionally, this metadata is not required for monitor mode delivery, and performing the lookup/update introduces redundant work and the potential for inconsistent rx_status state if multiple paths modify it. Hence, remove the peer lookup and link_id update from the monitor MSDU delivery path. This also removes the per-MSDU debug logging in the monitor path, slightly reducing debuggability, but avoids unnecessary overhead in the monitor RX path. Tested-on: QCN9274 hw2.0 PCI WLAN.WBE.1.6-01243-QCAHKSWPL_SILICONZ-1 Signed-off-by: Sushant Butta <sushant.butta@oss.qualcomm.com> Reviewed-by: Rameshkumar Sundaram <rameshkumar.sundaram@oss.qualcomm.com> Reviewed-by: Baochen Qiang <baochen.qiang@oss.qualcomm.com> Link: https://patch.msgid.link/20260609064856.547032-3-sushant.butta@oss.qualcomm.com Signed-off-by: Jeff Johnson <jeff.johnson@oss.qualcomm.com>
18 hours | wifi: ath12k: Skip setting RX_FLAG_8023 for Ethernet-II (DIX) frames in monitor mode | Sushant Butta | 3 | -24/+3
Monitor mode delivers raw 802.11 frames, not 802.3/Ethernet frames. Setting RX_FLAG_8023 for monitor RX is incorrect and can break userspace capture and analysis. Do not update this flag in the monitor path to ensure correct handling of captured frames.

In the monitor path, RX_FLAG_ONLY_MONITOR is always set before decap is evaluated, which forces decap to remain DP_RX_DECAP_TYPE_RAW. As a result, the condition to set RX_FLAG_8023 can never be satisfied. Hence, drop this unreachable code.

Also remove the unused hal_rx_mon_ppdu_info parameter from ath12k_dp_mon_rx_deliver_msdu(), as it was passed but never used.

Tested-on: QCN9274 hw2.0 PCI WLAN.WBE.1.6-01243-QCAHKSWPL_SILICONZ-1
Signed-off-by: Sushant Butta <sushant.butta@oss.qualcomm.com>
Reviewed-by: Rameshkumar Sundaram <rameshkumar.sundaram@oss.qualcomm.com>
Reviewed-by: Baochen Qiang <baochen.qiang@oss.qualcomm.com>
Link: https://patch.msgid.link/20260609064856.547032-2-sushant.butta@oss.qualcomm.com
Signed-off-by: Jeff Johnson <jeff.johnson@oss.qualcomm.com>
18 hours | wifi: ath12k: Show per-radio center freq in dp stats | Sreeramya Soratkal | 1 | -0/+7
Currently, the frequency on which each radio is operating is not available in device_dp_stats. This information is helpful in debugging the channel-specific throughput and is available with iw/nl80211 dump. Extend the device_dp_stats dump to display the center frequency in the existing per-radio loop. Tested-on: QCN9274 hw2.0 PCI WLAN.WBE.1.6-01243-QCAHKSWPL_SILICONZ-1 Signed-off-by: Sreeramya Soratkal <sreeramya.soratkal@oss.qualcomm.com> Reviewed-by: Baochen Qiang <baochen.qiang@oss.qualcomm.com> Reviewed-by: Aishwarya R <aishwarya.r@oss.qualcomm.com> Reviewed-by: Rameshkumar Sundaram <rameshkumar.sundaram@oss.qualcomm.com> Link: https://patch.msgid.link/20260626085253.3927269-4-sreeramya.soratkal@oss.qualcomm.com Signed-off-by: Jeff Johnson <jeff.johnson@oss.qualcomm.com>
18 hours | wifi: ath12k: Add timestamp to dp stats display | Sreeramya Soratkal | 1 | -0/+3
In MLO configurations the device_dp_stats debugfs file is read separately for each ath12k device. Without a timestamp it is impossible to know whether two snapshots were taken at the same moment, making counter comparisons across devices unreliable. Prepend a ktime-based millisecond timestamp to the output header so the reader can confirm when the snapshot was taken. Tested-on: QCN9274 hw2.0 PCI WLAN.WBE.1.6-01243-QCAHKSWPL_SILICONZ-1 Signed-off-by: Sreeramya Soratkal <sreeramya.soratkal@oss.qualcomm.com> Reviewed-by: Baochen Qiang <baochen.qiang@oss.qualcomm.com> Reviewed-by: Aishwarya R <aishwarya.r@oss.qualcomm.com> Reviewed-by: Rameshkumar Sundaram <rameshkumar.sundaram@oss.qualcomm.com> Link: https://patch.msgid.link/20260626085253.3927269-3-sreeramya.soratkal@oss.qualcomm.com Signed-off-by: Jeff Johnson <jeff.johnson@oss.qualcomm.com>
18 hours | wifi: ath12k: Use runtime device count in dp stats display | Sreeramya Soratkal | 1 | -2/+2
The REO Rx Received and Rx WBM REL SRC Errors display loops in ath12k_debugfs_dump_device_dp_stats() iterate up to the compile-time constant ATH12K_MAX_DEVICES. This unconditionally prints zeros in columns with no hardware behind it, making the output misleading. Replace the compile-time bound with the runtime ab->ag->num_devices so only live device slots appear in the output. Tested-on: QCN9274 hw2.0 PCI WLAN.WBE.1.6-01243-QCAHKSWPL_SILICONZ-1 Signed-off-by: Sreeramya Soratkal <sreeramya.soratkal@oss.qualcomm.com> Reviewed-by: Baochen Qiang <baochen.qiang@oss.qualcomm.com> Reviewed-by: Aishwarya R <aishwarya.r@oss.qualcomm.com> Reviewed-by: Rameshkumar Sundaram <rameshkumar.sundaram@oss.qualcomm.com> Link: https://patch.msgid.link/20260626085253.3927269-2-sreeramya.soratkal@oss.qualcomm.com Signed-off-by: Jeff Johnson <jeff.johnson@oss.qualcomm.com>
18 hours | wifi: ath12k: Advertise multicast Ethernet encapsulation offload support | Tamizh Chelvam Raja | 2 | -14/+53
Advertise IEEE80211_OFFLOAD_ENCAP_MCAST to inform mac80211 that multicast frame encapsulation is handled in hardware. This allows mac80211 to pass Ethernet-formatted multicast frames directly to the driver. In ath12k_wifi7_mac_op_tx(), refine the logic that selects the MLO multicast replication path. Add a sta pointer check so that only unicast Hardware-encap frames use the direct transmit path, while multicast Hardware-encap frames fall through to the MLO replication loop and are transmitted on each active link. In the MLO replication loop, use skb_clone() for Hardware-encap frames. These frames are already in Ethernet format and do not require 802.11 link address rewriting by ath12k_mlo_mcast_update_tx_link_address(). Tested-on: QCN9274 hw2.0 PCI WLAN.WBE.1.6-01243-QCAHKSWPL_SILICONZ-1 Signed-off-by: Tamizh Chelvam Raja <tamizh.raja@oss.qualcomm.com> Reviewed-by: Rameshkumar Sundaram <rameshkumar.sundaram@oss.qualcomm.com> Reviewed-by: Baochen Qiang <baochen.qiang@oss.qualcomm.com> Link: https://patch.msgid.link/20260623100501.2100119-1-tamizh.raja@oss.qualcomm.com Signed-off-by: Jeff Johnson <jeff.johnson@oss.qualcomm.com>
18 hours | wifi: ath12k: advertise ieee_link_id in vdev start MLO params | Manish Dharanenthiran | 3 | -11/+29
Firmware builds the AP MLD partner profile from the hw_link_id passed in the vdev start parameters. However, hw_link_id is not always the same as the logical per-MLD ieee_link_id, since ieee_link_id is assigned per MLD and not per pdev.

This matters in mixed MLO and SLO setups. For example:

  MLD 1 - 5 GHz + 6 GHz (2-link MLO): ieee_link_id 0 and 1
  MLD 2 - 6 GHz only (1-link SLO): ieee_link_id 0
  MLD 3 - 5 GHz only (1-link SLO): ieee_link_id 0

The same physical 6 GHz radio can use ieee_link_id 1 for one MLD and ieee_link_id 0 for another. Pass the correct ieee_link_id to firmware so it can build accurate per-STA profile elements.

Add ieee_link_id to wmi_vdev_start_mlo_params for the self link and to wmi_partner_link_info for each partner link. Populate these fields in ath12k_mac_mlo_get_vdev_args() from the corresponding vdev link_id before encoding the WMI command.

Introduce two new flags in ML params to indicate to firmware when the new fields are valid:

  ATH12K_WMI_FLAG_MLO_IEEE_LINK_IDX_VALID BIT(18) for the self link
  ATH12K_WMI_FLAG_MLO_IEEE_LINK_IDX_VALID_PARTNER BIT(19) for partner links

Firmware parses ieee_link_id only when the matching flag is set.

Also fix the debug message by using correct format specifiers and host-endian values instead of __le32 values.

Tested-on: QCN9274 hw2.0 PCI WLAN.WBE.1.6-01243-QCAHKSWPL_SILICONZ-1
Co-developed-by: Hari Naraayana Desikan Kannan <hari.kannan@oss.qualcomm.com>
Signed-off-by: Hari Naraayana Desikan Kannan <hari.kannan@oss.qualcomm.com>
Co-developed-by: Karthik M <karthik.m@oss.qualcomm.com>
Signed-off-by: Karthik M <karthik.m@oss.qualcomm.com>
Signed-off-by: Manish Dharanenthiran <manish.dharanenthiran@oss.qualcomm.com>
Reviewed-by: Baochen Qiang <baochen.qiang@oss.qualcomm.com>
Reviewed-by: Rameshkumar Sundaram <rameshkumar.sundaram@oss.qualcomm.com>
Link: https://patch.msgid.link/20260623-ieee_link_id-v2-1-8a89d71baf58@oss.qualcomm.com
Signed-off-by: Jeff Johnson <jeff.johnson@oss.qualcomm.com>
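The MLD example in the message above can be modelled as a small lookup to show why the link id has to be resolved per MLD rather than per radio. This is a rough sketch, not the driver's code: the dictionary layout and the function `vdev_start_ml_params` are hypothetical, and setting the partner flag only when partner links exist is an assumption. Only the two bit positions and the example table come from the commit message.

```python
# Bit positions quoted in the commit message.
MLO_IEEE_LINK_IDX_VALID = 1 << 18          # self link
MLO_IEEE_LINK_IDX_VALID_PARTNER = 1 << 19  # partner links

# (MLD, band) -> ieee_link_id, copied from the example in the message.
IEEE_LINK_ID = {
    ("MLD 1", "5 GHz"): 0,
    ("MLD 1", "6 GHz"): 1,
    ("MLD 2", "6 GHz"): 0,
    ("MLD 3", "5 GHz"): 0,
}


def vdev_start_ml_params(mld: str, band: str) -> dict:
    """Build hypothetical ML params for one link of one MLD."""
    partners = [
        {"band": b, "ieee_link_id": link_id}
        for (m, b), link_id in IEEE_LINK_ID.items()
        if m == mld and b != band
    ]
    flags = MLO_IEEE_LINK_IDX_VALID
    if partners:  # assumption: partner flag only when partner links exist
        flags |= MLO_IEEE_LINK_IDX_VALID_PARTNER
    return {
        "flags": flags,
        "ieee_link_id": IEEE_LINK_ID[(mld, band)],
        "partners": partners,
    }


# The same 6 GHz radio carries a different ieee_link_id in each MLD.
print(vdev_start_ml_params("MLD 1", "6 GHz"))
print(vdev_start_ml_params("MLD 2", "6 GHz"))
```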
18 hours | wifi: ath12k: reset REOQ LUT addresses before firmware stop | Aishwarya R | 3 | -3/+17
During module removal, REOQ LUT cleanup writes 0 to the REOQ/ML-REOQ LUT address registers. That cleanup runs from ath12k_core_stop(), after ath12k_qmi_firmware_stop() has already stopped the firmware (mode OFF), so the register writes can hit an invalid target access. Move the REOQ LUT register reset before ath12k_qmi_firmware_stop(), so the registers are cleared before stopping the firmware, while register access is still valid. Additionally, handle the error path where firmware-ready setup fails after LUT programming but before core_stop() is reached, ensuring the registers are properly reset in that case as well. On the crash-recovery path, ath12k_core_reconfigure_on_crash() calls ath12k_core_qmi_firmware_ready(), which re-enters ath12k_dp_setup() and ath12k_dp_reoq_lut_setup(), so the LUT registers are reprogrammed before use and stale values do not persist across recovery. There is a brief window between the crash and when the LUT registers are reprogrammed during recovery, during which the registers still hold the freed DMA memory addresses. This is safe because the device is non-functional in that window and will not initiate any DMA access until firmware is restarted and the registers are reprogrammed. No functional issue has been observed so far due to this sequence. However, this change proactively avoids potential issues such as invalid register accesses after firmware stop during module removal and error handling. Tested-on: QCN9274 hw2.0 PCI WLAN.WBE.1.6-01243-QCAHKSWPL_SILICONZ-1 Co-developed-by: P Praneesh <praneesh.p@oss.qualcomm.com> Signed-off-by: P Praneesh <praneesh.p@oss.qualcomm.com> Signed-off-by: Aishwarya R <aishwarya.r@oss.qualcomm.com> Reviewed-by: Baochen Qiang <baochen.qiang@oss.qualcomm.com> Reviewed-by: Tamizh Chelvam Raja <tamizh.raja@oss.qualcomm.com> Link: https://patch.msgid.link/20260619120751.363340-1-aishwarya.r@oss.qualcomm.com Signed-off-by: Jeff Johnson <jeff.johnson@oss.qualcomm.com>
18 hours | wifi: ath12k: expand UserPD ID mask to support up to 8 PDs | Aaradhana Sahu | 2 | -2/+1
Currently ATH12K_USERPD_ID_MASK uses GENMASK(9, 8), which defines a 2-bit field and limits supported UserPD IDs to values 0-3. Future IPQ5332 multi-PD platform variants support more than three UserPDs. Expand ATH12K_USERPD_ID_MASK to GENMASK(10, 8), increasing the field width to 3 bits and allowing UserPD IDs from 0-7. ATH12K_USERPD_ID_MASK is currently used only while constructing the ath12k AHB PAS ID, so this change does not affect existing platforms. Also remove the unused ATH12K_MAX_UPDS definition. Tested-on: IPQ5332 hw1.0 AHB WLAN.WBE.1.6-01275-QCAHKSWPL_SILICONZ-1 Signed-off-by: Aaradhana Sahu <aaradhana.sahu@oss.qualcomm.com> Reviewed-by: Baochen Qiang <baochen.qiang@oss.qualcomm.com> Reviewed-by: Rameshkumar Sundaram <rameshkumar.sundaram@oss.qualcomm.com> Link: https://patch.msgid.link/20260604031551.4178754-1-aaradhana.sahu@oss.qualcomm.com Signed-off-by: Jeff Johnson <jeff.johnson@oss.qualcomm.com>
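The field-width arithmetic above can be reproduced with a small model of the kernel's GENMASK() macro. The helpers `genmask` and `field_max` are illustrative Python stand-ins, not kernel APIs.

```python
def genmask(high: int, low: int) -> int:
    """Mask with bits low..high (inclusive) set, modelled on GENMASK()."""
    return ((1 << (high - low + 1)) - 1) << low


def field_max(mask: int) -> int:
    """Largest value that fits in the contiguous field described by mask."""
    shift = (mask & -mask).bit_length() - 1  # index of the lowest set bit
    return mask >> shift


old_mask = genmask(9, 8)    # 2-bit field
new_mask = genmask(10, 8)   # 3-bit field
print(hex(old_mask), field_max(old_mask))
print(hex(new_mask), field_max(new_mask))
```

A 2-bit field tops out at 3 and a 3-bit field at 7, which matches the 0-3 and 0-7 ID ranges stated in the message.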
18 hours | wifi: ath12k: tighten RX monitor TLV bounds check | Miaoqing Pan | 1 | -2/+2
Validate the pointer to the next RX monitor TLV more strictly by ensuring that at least a full TLV header is available within the status buffer before continuing TLV parsing. Prevent potential out-of-bounds access when handling malformed or truncated RX monitor status data. Tested-on: QCC2072 hw1.0 PCI WLAN.COL.1.0.c2-00068-QCACOLSWPL_V1_TO_SILICONZ-1 Tested-on: WCN7850 hw2.0 PCI WLAN.HMT.1.1.c5-00302-QCAHMTSWPL_V1.0_V2.0_SILICONZ-1.115823.3 Signed-off-by: Miaoqing Pan <miaoqing.pan@oss.qualcomm.com> Reviewed-by: Vasanthakumar Thiagarajan <vasanthakumar.thiagarajan@oss.qualcomm.com> Reviewed-by: Baochen Qiang <baochen.qiang@oss.qualcomm.com> Link: https://patch.msgid.link/20260509025819.1641630-6-miaoqing.pan@oss.qualcomm.com Signed-off-by: Jeff Johnson <jeff.johnson@oss.qualcomm.com>
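The idea of requiring a full header before parsing the next TLV can be sketched as follows. The 4-byte header used here (16-bit tag plus 16-bit length, little-endian) is a deliberately simplified, hypothetical layout and not the hardware format; only the bounds-check pattern is the point.

```python
import struct

HDR_SIZE = 4  # hypothetical header: u16 tag + u16 length, little-endian


def walk_tlvs(buf: bytes) -> list:
    """Collect (tag, value) pairs, stopping at the first truncated TLV."""
    tlvs = []
    offset = 0
    # Continue only while a complete header still fits in the buffer.
    while offset + HDR_SIZE <= len(buf):
        tag, length = struct.unpack_from("<HH", buf, offset)
        value_start = offset + HDR_SIZE
        value_end = value_start + length
        if value_end > len(buf):
            break  # payload would run past the end of the buffer
        tlvs.append((tag, buf[value_start:value_end]))
        offset = value_end
    return tlvs


good = struct.pack("<HH", 1, 2) + b"ab" + struct.pack("<HH", 2, 0)
truncated = good + b"\x03\x00"  # only half a header remains
print(walk_tlvs(good))
print(walk_tlvs(truncated))
```

A looser check such as `offset < len(buf)` would let the loop read a header that is only partly inside the buffer, which is the kind of out-of-bounds access the commit guards against.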
18 hours | wifi: ath12k: add dp_mon support 32-bit TLV headers | Miaoqing Pan | 1 | -28/+29
Wi-Fi 7 monitor status parsing in dp_mon currently assumes a 64-bit TLV header and directly decodes tag/len/userid from struct hal_tlv_64_hdr. On chips using a 32-bit TLV header (e.g. QCC2072), this causes monitor RX status packets to be dropped during TLV parsing.

Introduce HAL helpers to decode TLV header fields (tag/len/userid/value) for both 32-bit and 64-bit header layouts, without changing the actual TLV parsing logic.

Tested-on: QCC2072 hw1.0 PCI WLAN.COL.1.0.c2-00068-QCACOLSWPL_V1_TO_SILICONZ-1
Tested-on: WCN7850 hw2.0 PCI WLAN.HMT.1.1.c5-00302-QCAHMTSWPL_V1.0_V2.0_SILICONZ-1.115823.3
Signed-off-by: Miaoqing Pan <miaoqing.pan@oss.qualcomm.com>
Reviewed-by: Vasanthakumar Thiagarajan <vasanthakumar.thiagarajan@oss.qualcomm.com>
Reviewed-by: Baochen Qiang <baochen.qiang@oss.qualcomm.com>
Link: https://patch.msgid.link/20260509025819.1641630-5-miaoqing.pan@oss.qualcomm.com
Signed-off-by: Jeff Johnson <jeff.johnson@oss.qualcomm.com>
18 hours | wifi: ath12k: add HAL ops for monitor TLV header decode and alignment | Miaoqing Pan | 5 | -0/+22
Wi-Fi 7 monitor RX status TLV parsing needs to decode TLV headers and advance the pointer with the correct header alignment. Different targets use different TLV header layouts (32-bit vs 64-bit), but the HAL ops for dp_mon RX status header decode and header alignment were not populated for all wifi7 targets. Add dp_mon RX status TLV header decode callbacks and TLV header alignment helpers to the wifi7 HAL ops for QCC2072, QCN9274 and WCN7850. Export helpers to query the required TLV header alignment for 32-bit and 64-bit TLV headers so the caller can align the TLV walk correctly across targets. Tested-on: QCC2072 hw1.0 PCI WLAN.COL.1.0.c2-00068-QCACOLSWPL_V1_TO_SILICONZ-1 Tested-on: WCN7850 hw2.0 PCI WLAN.HMT.1.1.c5-00302-QCAHMTSWPL_V1.0_V2.0_SILICONZ-1.115823.3 Signed-off-by: Miaoqing Pan <miaoqing.pan@oss.qualcomm.com> Reviewed-by: Vasanthakumar Thiagarajan <vasanthakumar.thiagarajan@oss.qualcomm.com> Reviewed-by: Baochen Qiang <baochen.qiang@oss.qualcomm.com> Link: https://patch.msgid.link/20260509025819.1641630-4-miaoqing.pan@oss.qualcomm.com Signed-off-by: Jeff Johnson <jeff.johnson@oss.qualcomm.com>
18 hours | wifi: ath12k: refactor HAL TLV32/64 decode helpers | Miaoqing Pan | 5 | -15/+39
Change TLV decode helpers to return the TLV value pointer and optionally decode tag/len/usrid via out parameters. This allows reusing the helpers for DP monitor RX status header TLV parsing and avoids duplicated header decoding in callers. No functional change intended. Tested-on: QCC2072 hw1.0 PCI WLAN.COL.1.0.c2-00068-QCACOLSWPL_V1_TO_SILICONZ-1 Tested-on: WCN7850 hw2.0 PCI WLAN.HMT.1.1.c5-00302-QCAHMTSWPL_V1.0_V2.0_SILICONZ-1.115823.3 Signed-off-by: Miaoqing Pan <miaoqing.pan@oss.qualcomm.com> Reviewed-by: Vasanthakumar Thiagarajan <vasanthakumar.thiagarajan@oss.qualcomm.com> Reviewed-by: Baochen Qiang <baochen.qiang@oss.qualcomm.com> Link: https://patch.msgid.link/20260509025819.1641630-3-miaoqing.pan@oss.qualcomm.com Signed-off-by: Jeff Johnson <jeff.johnson@oss.qualcomm.com>
18 hours | wifi: ath12k: fix TLV32 length mask | Miaoqing Pan | 2 | -8/+5
HAL_TLV_HDR_LEN was using the wrong bitmask; fix it to cover bits [21:10]. Also drop HAL_SRNG_TLV_HDR_{TAG,LEN} and use the generic TLV header bit definitions for TLV32/TLV64 encode/decode to avoid redundant macros. Tested-on: QCC2072 hw1.0 PCI WLAN.COL.1.0.c2-00068-QCACOLSWPL_V1_TO_SILICONZ-1 Tested-on: WCN7850 hw2.0 PCI WLAN.HMT.1.1.c5-00302-QCAHMTSWPL_V1.0_V2.0_SILICONZ-1.115823.3 Fixes: d889913205cf ("wifi: ath12k: driver for Qualcomm Wi-Fi 7 devices") Signed-off-by: Miaoqing Pan <miaoqing.pan@oss.qualcomm.com> Reviewed-by: Vasanthakumar Thiagarajan <vasanthakumar.thiagarajan@oss.qualcomm.com> Reviewed-by: Baochen Qiang <baochen.qiang@oss.qualcomm.com> Link: https://patch.msgid.link/20260509025819.1641630-2-miaoqing.pan@oss.qualcomm.com Signed-off-by: Jeff Johnson <jeff.johnson@oss.qualcomm.com>
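To see why the exact bit range of a length mask matters, the sketch below extracts a length from bits [21:10] of a 32-bit header word and compares it with a deliberately too-wide mask. The wide mask (bits [25:10]) and the sample header value are assumptions chosen for illustration; the message above only states that the corrected mask covers bits [21:10].

```python
TLV_HDR_LEN_SHIFT = 10
TLV_HDR_LEN_MASK = ((1 << 12) - 1) << TLV_HDR_LEN_SHIFT  # bits [21:10]

# Hypothetical too-wide mask, used only to show the failure mode.
WIDE_LEN_MASK = ((1 << 16) - 1) << TLV_HDR_LEN_SHIFT     # bits [25:10]


def tlv_len(hdr: int, mask: int = TLV_HDR_LEN_MASK) -> int:
    """Extract the length field selected by mask from a 32-bit header."""
    return (hdr & mask) >> TLV_HDR_LEN_SHIFT


# Sample header: length 100 in bits [21:10], unrelated bits set above
# bit 21 and below bit 10.
hdr = (0x5 << 22) | (100 << 10) | 0x3

print(tlv_len(hdr))                 # length from the 12-bit field
print(tlv_len(hdr, WIDE_LEN_MASK))  # polluted by the bits above bit 21
```

With the wider mask, bits that belong to a neighbouring field leak into the decoded length, so a parser would step by the wrong amount.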
18 hours | wifi: ath12k: avoid setting 320MHz support on non 6GHz band | Nicolas Escande | 1 | -1/+16
On a split phy qcn9274 (2.4GHz + 5GHz low), "iw phy" reports 320MHz related features on the 5GHz band while it should not:

  Wiphy phy1
  [...]
    Band 2:
    [...]
      EHT Iftypes: managed
      [...]
        EHT PHY Capabilities: (0xe2ffdbe018778000):
          320MHz in 6GHz Supported
          [...]
          Beamformee SS (320MHz): 7
          [...]
          Number Of Sounding Dimensions (320MHz): 3
          [...]
        EHT MCS/NSS: (0x22222222222222222200000000):

This is also reflected in the beacons sent by a mesh interface started on that band. They erroneously advertise 320MHz support too.

This should not happen as IEEE Std 802.11-2024, subclause 9.4.2.323.3 says we should not set the 320MHz related fields when not operating on a 6GHz band. For example it says about Bit 0 "Support For 320 MHz In 6 GHz":

  "Reserved if the EHT Capabilities element is indicating capabilities for the 2.4 GHz or 5 GHz bands."

Fix this by clearing the related bits when converting from WMI eht phy capabilities to mac80211 phy capabilities, for bands other than 6GHz.

Tested-on: QCN9274 hw2.0 PCI WLAN.WBE.1.3.1-00218-QCAHKSWPL_SILICONZ-1
Signed-off-by: Nicolas Escande <nico.escande@gmail.com>
Reviewed-by: Baochen Qiang <baochen.qiang@oss.qualcomm.com>
Reviewed-by: Rameshkumar Sundaram <rameshkumar.sundaram@oss.qualcomm.com>
Link: https://patch.msgid.link/20260623151613.72113-1-nico.escande@gmail.com
Signed-off-by: Jeff Johnson <jeff.johnson@oss.qualcomm.com>
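The band-dependent clearing described above can be sketched for the one bit the message quotes, Bit 0 "Support For 320 MHz In 6 GHz". Treating the capability field as a plain integer, the band strings, and the function name `eht_phy_caps_for_band` are assumptions; the other 320MHz-related fields the commit clears are not modelled because their bit positions are not given here.

```python
# Bit 0 of the EHT PHY capabilities, as quoted in the commit message.
EHT_PHY_CAP_320MHZ_IN_6GHZ = 1 << 0


def eht_phy_caps_for_band(wmi_caps: int, band: str) -> int:
    """Return capabilities to advertise for band (hypothetical helper).

    Outside 6 GHz the 320 MHz support bit is reserved, so clear it.
    """
    if band != "6GHz":
        return wmi_caps & ~EHT_PHY_CAP_320MHZ_IN_6GHZ
    return wmi_caps


firmware_caps = 0b1011  # made-up value with bit 0 set
print(bin(eht_phy_caps_for_band(firmware_caps, "5GHz")))
print(bin(eht_phy_caps_for_band(firmware_caps, "6GHz")))
```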
18 hours | wifi: ath12k: remove unused QMI definitions | Aaradhana Sahu | 1 | -26/+0
The driver contains several unused QMI definitions such as response length macros, message IDs, firmware segment length definitions, and CALDB address size definitions. Remove these unused definitions as they are not referenced anywhere in the driver. No functional change intended. Tested-on: QCN9274 hw2.0 PCI WLAN.WBE.1.6-01243-QCAHKSWPL_SILICONZ-1 Signed-off-by: Aaradhana Sahu <aaradhana.sahu@oss.qualcomm.com> Reviewed-by: Baochen Qiang <baochen.qiang@oss.qualcomm.com> Link: https://patch.msgid.link/20260623035104.3765404-1-aaradhana.sahu@oss.qualcomm.com Signed-off-by: Jeff Johnson <jeff.johnson@oss.qualcomm.com>
18 hours | wifi: ath12k: use %u for unsigned variables in QMI debug logs | Raj Kumar Bhagat | 1 | -35/+35
Replace incorrect %d format specifiers with %u for unsigned variables in qmi.c debug messages. Also add missing trailing '\n' in log messages to ensure proper termination. No functional change intended. Tested-on: Compile tested only. Signed-off-by: Raj Kumar Bhagat <raj.bhagat@oss.qualcomm.com> Reviewed-by: Baochen Qiang <baochen.qiang@oss.qualcomm.com> Reviewed-by: Rameshkumar Sundaram <rameshkumar.sundaram@oss.qualcomm.com> Link: https://patch.msgid.link/20260623-qmi-debug-log-v1-1-79471aa8b898@oss.qualcomm.com Signed-off-by: Jeff Johnson <jeff.johnson@oss.qualcomm.com>
18 hours | wifi: ath12k: Fix inconsistencies in struct qmi_elem_info initializers | Raj Kumar Bhagat | 1 | -69/+75
Currently, the struct qmi_elem_info initializers in qmi.c are inconsistent in how they align the assignments, with tabs being used in the majority of places but spaces being used in some places. In those places replace the spaces with tabs for consistency.

Also fix incorrect and missing terminating records in the following qmi_elem_info initializers:
- qmi_wlanfw_shadow_reg_cfg_s_v01_ei[]
- qmi_wlanfw_mem_ready_ind_msg_v01_ei[]
- qmi_wlanfw_fw_ready_ind_msg_v01_ei[]

Tested-on: Compile tested only.
Signed-off-by: Raj Kumar Bhagat <raj.bhagat@oss.qualcomm.com>
Reviewed-by: Baochen Qiang <baochen.qiang@oss.qualcomm.com>
Reviewed-by: Rameshkumar Sundaram <rameshkumar.sundaram@oss.qualcomm.com>
Link: https://patch.msgid.link/20260623-qmi-inconsistencies-v1-1-0fc17f2b8338@oss.qualcomm.com
Signed-off-by: Jeff Johnson <jeff.johnson@oss.qualcomm.com>
18 hours | wifi: ath12k: enable threaded NAPI when DP IRQ affinity is unavailable | Hangtian Zhu | 1 | -1/+11
Determine threaded NAPI policy from runtime IRQ capability of the DP MSI IRQ. If irq_can_set_affinity() reports that affinity cannot be set, enable threaded NAPI for DP interrupt groups so datapath processing is not constrained by a single-CPU softirq context. On RB3Gen2, where IRQ affinity is unavailable in the effective IRQ path, EHT160 UDP downlink throughput improved from 802 Mbps to 2.58 Gbps after enabling threaded NAPI. Tested-on: QCC2072 hw1.0 PCI WLAN.COL.1.0.c2-00074-QCACOLSWPL_V1_TO_SILICONZ-1 Signed-off-by: Hangtian Zhu <hangtian.zhu@oss.qualcomm.com> Reviewed-by: Baochen Qiang <baochen.qiang@oss.qualcomm.com> Link: https://patch.msgid.link/20260519011627.713068-3-hangtian.zhu@oss.qualcomm.com [Fixed checkpatch "Missing a blank line after declarations"] Signed-off-by: Jeff Johnson <jeff.johnson@oss.qualcomm.com>
6 days | Merge tag 'net-7.2-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net | Linus Torvalds | 82 | -391/+848
Pull networking fixes from Jakub Kicinski:
 "Including fixes from netfilter and IPsec.

  Current release - regressions:

   - do not acquire dev->tx_global_lock in netdev_watchdog_up()

   - ethtool: keep rtnl_lock for ops using ethtool_op_get_link()

   - fix deadlock in nested UP notifier events

  Current release - new code bugs:

   - eth:
      - cn20k: fix subbank free list indexing for search order
      - airoha: fix BQL underflow in shared QDMA TX ring

  Previous releases - regressions:

   - netfilter:
      - flowtable: fix offloaded ct timeout never being extended
      - nf_conncount: prevent connlimit drops for early confirmed ct

  Previous releases - always broken:

   - require CAP_NET_ADMIN in the originating netns when modifying cross-netns devices

   - report NAPI thread PID in the caller's pid namespace

   - mac802154: fix dirty frag in in-place crypto for IOT radios

   - sctp: hold socket lock when dumping endpoints in sctp_diag, avoid an overflow

   - eth: gve: fix header buffer corruption with header-split and HW-GRO

   - af_key: initialize alg_key_len for IPComp states, prevent OOB read"

* tag 'net-7.2-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (213 commits)
  selftests: bonding: add a test for VLAN propagation over a bonded real device
  vlan: defer real device state propagation to netdev_work
  net: add the driver-facing netdev_work scheduling API
  net: turn the rx_mode work into a generic netdev_work facility
  net: ethtool: keep rtnl_lock for ops using ethtool_op_get_link()
  rxrpc: Fix rxrpc_rotate_tx_rotate() to check there's something to rotate
  rxrpc: Fix leak of released call in recvmsg(MSG_PEEK)
  rxrpc: Fix socket notification race
  rxrpc: Fix potential infinite loop in rxrpc_recvmsg()
  rxrpc: Fix oob challenge leak in cleanup after notification failure
  rxrpc: Fix the reception of a reply packet before data transmission
  afs: Fix uncancelled rxrpc OOB message handler
  afs: Fix further netns teardown to cancel the preallocation charger
  rxrpc: Fix double unlock in rxrpc_recvmsg()
  rxrpc: Fix leak of connection from OOB challenge
  rxrpc: Fix ACKALL packet handling
  net: hns3: differentiate autoneg default values between copper and fiber
  net: hns3: fix permanent link down deadlock after reset
  net: hns3: refactor MAC autoneg and speed configuration
  net: hns3: unify copper port ksettings configuration path
  ...
6 days | net: ethtool: keep rtnl_lock for ops using ethtool_op_get_link() | Jakub Kicinski | 7 | -6/+14
Breno reports following splats on mlx5:

  RTNL: assertion failed at net/core/dev.c (2241)
  WARNING: net/core/dev.c:2241 at netif_state_change+0xed/0x130, CPU#5: ethtool/1335
  RIP: 0010:netif_state_change+0xf9/0x130
  Call Trace:
   <TASK>
   __linkwatch_sync_dev+0xea/0x120
   ethtool_op_get_link+0xe/0x20
   __ethtool_get_link+0x26/0x40
   linkstate_prepare_data+0x51/0x200
   ethnl_default_doit+0x213/0x470
   genl_family_rcv_msg_doit+0xdd/0x110

Looks like I missed ethtool_op_get_link() trying to sync linkwatch, which needs rtnl_lock. Not all drivers do this - bnxt doesn't, it just returns the link state, so add an opt-in bit.

Reported-by: Breno Leitao <leitao@debian.org>
Fixes: 45079e00133e ("net: ethtool: optionally skip rtnl_lock on Netlink path for GET ops")
Acked-by: Stanislav Fomichev <sdf@fomichev.me>
Reviewed-by: Breno Leitao <leitao@debian.org>
Acked-by: Harshitha Ramamurthy <hramamurthy@google.com>
Link: https://patch.msgid.link/20260624190439.2521219-1-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
6 days | net: hns3: differentiate autoneg default values between copper and fiber | Shuaisong Yang | 1 | -0/+7
Fix a link loss issue during driver initialization on optical ports connected to forced-mode (non-autoneg) remote switches. Previously, during driver probe or initialization, hclge_configure() blindly hardcoded hdev->hw.mac.req_autoneg to AUTONEG_ENABLE for all media types. While this is necessary for copper (BASE-T) ports to establish a link, many high-speed optical (fiber) ports in data centers are connected to switches running in forced mode (fixed speed, autoneg disabled). Forcing autoneg on these optical ports during initialization causes a permanent link failure since the remote end refuses to respond to autoneg pulses. Fix this by implementing media-type differentiated initialization in hclge_init_ae_dev(). Copper ports continue to default to AUTONEG_ENABLE, while optical ports strictly inherit the preset autoneg status pre-configured by the firmware (hdev->hw.mac.autoneg), preserving native compatibility with forced-mode network environments. Fixes: 05eb60e9648c ("net: hns3: using user configure after hardware reset") Signed-off-by: Shuaisong Yang <yangshuaisong@h-partners.com> Signed-off-by: Jijie Shao <shaojijie@huawei.com> Link: https://patch.msgid.link/20260624141319.271439-5-shaojijie@huawei.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
6 days | net: hns3: fix permanent link down deadlock after reset | Shuaisong Yang | 1 | -7/+15
Fix a critical race condition deadlock where the network interface remains permanently Link Down after a hardware reset under specific ethtool sequences.

This issue exclusively manifests in firmware-controlled PHY topologies where the driver relies on the IMP firmware to arbitrate link parameters. Standard devices driven by the kernel's native PHY_LIB are unaffected.

The deadlock occurs via the following path:
1. User disables autoneg and forces an unmatched speed, forcing link down: `ethtool -s ethx autoneg off speed 10 duplex full`
2. User re-enables autoneg: `ethtool -s ethx autoneg on`. The netdev stack passes cmd->base.speed as SPEED_UNKNOWN (0xffffffff).
3. Driver saves req_autoneg=1, but before the interface can link up, a hardware reset is triggered.
4. During reset recovery, MAC init reads the un-synchronized runtime state mac.autoneg (which is still 0/OFF), misinterprets it as forced mode, and pushes the cached SPEED_UNKNOWN into the hardware registers, causing the MAC firmware state machine to freeze. Meanwhile, PHY init reads req_autoneg=1 and enables PHY autoneg. Since the MAC is frozen with 0xffffffff and PHY is running autoneg, they mismatch permanently.

Fix this by:
1. Intercepting SPEED_UNKNOWN/DUPLEX_UNKNOWN in hclge_set_phy_link_ksettings() and hclge_cfg_mac_speed_dup_h() to prevent it from corrupting the driver's cached valid configuration.
2. Save req_autoneg in hclge_set_autoneg().
3. Aligning the state judgment in hclge_set_autoneg_speed_dup() to use req_autoneg instead of the un-synchronized runtime mac.autoneg, ensuring both MAC and PHY consistently enter the autoneg branch to eliminate configuration discrepancies during reset recovery.

Fixes: 05eb60e9648c ("net: hns3: using user configure after hardware reset")
Signed-off-by: Shuaisong Yang <yangshuaisong@h-partners.com>
Signed-off-by: Jijie Shao <shaojijie@huawei.com>
Link: https://patch.msgid.link/20260624141319.271439-4-shaojijie@huawei.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
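The failure sequence and the fix above can be modelled with a tiny state object. This is a simplified sketch and not the hns3 code: the class `MacConfig`, its methods, and its default values are hypothetical. The constants are the ethtool sentinel values (SPEED_UNKNOWN as 0xffffffff per the message, DUPLEX_UNKNOWN as 0xff).

```python
from dataclasses import dataclass

SPEED_UNKNOWN = 0xFFFFFFFF
DUPLEX_UNKNOWN = 0xFF


@dataclass
class MacConfig:
    """Hypothetical cache of the user's requested link settings."""
    req_autoneg: int = 1
    req_speed: int = 1000
    req_duplex: int = 1

    def set_link(self, autoneg: int, speed: int, duplex: int) -> None:
        self.req_autoneg = autoneg
        # Never let the "unknown" sentinels overwrite cached valid values.
        if speed != SPEED_UNKNOWN:
            self.req_speed = speed
        if duplex != DUPLEX_UNKNOWN:
            self.req_duplex = duplex

    def restore_after_reset(self) -> tuple:
        # Decide from the requested state, not a stale runtime flag.
        if self.req_autoneg:
            return ("autoneg",)
        return ("forced", self.req_speed, self.req_duplex)


cfg = MacConfig()
cfg.set_link(autoneg=0, speed=10, duplex=1)                          # step 1
cfg.set_link(autoneg=1, speed=SPEED_UNKNOWN, duplex=DUPLEX_UNKNOWN)  # step 2
print(cfg.req_speed, cfg.restore_after_reset())                      # reset
```

Because the sentinel is filtered at the point of caching, a later reset can never push 0xffffffff to hardware, and using the requested autoneg state keeps the MAC and PHY on the same branch.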
6 days | net: hns3: refactor MAC autoneg and speed configuration | Shuaisong Yang | 2 | -14/+42
Extract the MAC autoneg and speed/duplex/lane configuration logic out of hclge_mac_init() and encapsulate it into a new dedicated helper function hclge_set_autoneg_speed_dup(). In the init path (hclge_init_ae_dev), this helper is now called after hclge_update_port_info() so that firmware-reported autoneg values are already populated before applying the link configuration. Introduce a separate req_lane_num field in struct hclge_mac to isolate the user-requested lane count from mac.lane_num, which firmware may overwrite via hclge_get_sfp_info() with stale values from a prior link lifecycle (e.g., lane_num=4 from 100G). During probe, req_lane_num is initialized to 0, which instructs firmware to auto-select the correct lane count for the current speed, rather than reusing the firmware- reported mac.lane_num that may be inconsistent with the target speed. This prevents probe failures from mismatched (speed, lane_num) pairs. In the reset path (hclge_reset_ae_dev), it runs immediately after hclge_mac_init(), using the previously cached req_* values to restore the link without re-querying firmware. Signed-off-by: Shuaisong Yang <yangshuaisong@h-partners.com> Signed-off-by: Jijie Shao <shaojijie@huawei.com> Link: https://patch.msgid.link/20260624141319.271439-3-shaojijie@huawei.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
6 days | net: hns3: unify copper port ksettings configuration path | Shuaisong Yang | 2 | -19/+40
Refactor hns3_set_link_ksettings() and hclge_set_phy_link_ksettings() to unify the configuration path for copper ports. Previously, netdevs with a native kernel phy attached bypassed the main MAC parameter caching logic and returned early via phy_ethtool_ksettings_set(). This prevented the driver from updating hdev->hw.mac.req_xxx variables for kernel PHY setups, leaving them out-of-sync during reset recovery. Clean this up by routing all copper port configurations through ops->set_phy_link_ksettings(), and perform driver-level or kernel-level PHY arbitration inside hclge_set_phy_link_ksettings() via hnae3_dev_phy_imp_supported(). This ensures that the user's intended link profiles (req_speed, req_duplex, req_autoneg) are uniformly recorded across all copper and fiber deployment topologies, laying the groundwork for stable reset recovery. For copper ports where neither IMP firmware nor a kernel PHY is available (e.g. PHY_INEXISTENT), hclge_set_phy_link_ksettings() returns -ENODEV. In hns3_set_link_ksettings(), this is caught so the configuration falls through to the existing MAC-level path (check_ksettings_param -> cfg_mac_speed_dup_h), preserving compatibility with PHY-less copper deployments. Signed-off-by: Shuaisong Yang <yangshuaisong@h-partners.com> Signed-off-by: Jijie Shao <shaojijie@huawei.com> Link: https://patch.msgid.link/20260624141319.271439-2-shaojijie@huawei.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
6 daysnet: mana: Optimize irq affinity for low vcpu configsShradha Gupta1-14/+64
Before commit 755391121038 ("net: mana: Allocate MSI-X vectors dynamically"), all the MANA IRQs were assigned statically and together during early driver load. After this commit, IRQ allocation for MANA is done in two phases: the HWC IRQ is allocated early, and the queue IRQs are added dynamically at a later point. By that time, the IRQ weights on the vCPUs can become imbalanced, and if the IRQ count is greater than the vCPU count, the topology-aware IRQ distribution logic in MANA can cause multiple MANA IRQs to land on the same vCPUs while other sibling vCPUs have none (case 1). On SMP-enabled, low-vCPU systems, this becomes a bigger problem, because the softIRQ handling overhead of two IRQs on the same vCPU is much higher than if they were spread across sibling vCPUs. In such cases, when many parallel TCP connections are tested, the throughput drops significantly. Fix the affinity assignment logic, in cases where the IRQ count is greater than the vCPU count and when IRQs are added dynamically, by utilizing all the vCPUs irrespective of their NUMA/core bindings (case 2). The results of setting the affinity and hint to NULL were also studied: if there are pre-existing IRQs allocated on the VM (apart from MANA) during MANA IRQ allocation, this approach again leads to clustering of the MANA queue IRQs (case 3).
=======================================================
Case 1: without this patch
=======================================================
4 vcpu(2 cores), 5 MANA IRQs (1 HWC + 4 Queue)

        TYPE       effective vCPU aff
=======================================================
IRQ0:   HWC        0
IRQ1:   mana_q1    0
IRQ2:   mana_q2    2
IRQ3:   mana_q3    0
IRQ4:   mana_q4    3

%soft on each vCPU(mpstat -P ALL 1) on receiver
vCPU       0       1       2       3
=======================================================
pass 1:    38.85   0.03    24.89   24.65
pass 2:    39.15   0.03    24.57   25.28
pass 3:    40.36   0.03    23.20   23.17

=======================================================
Case 2: with this patch
=======================================================
4 vcpu(2 cores), 5 MANA IRQs (1 HWC + 4 Queue)

        TYPE       effective vCPU aff
=======================================================
IRQ0:   HWC        0
IRQ1:   mana_q1    0
IRQ2:   mana_q2    1
IRQ3:   mana_q3    2
IRQ4:   mana_q4    3

%soft on each vCPU(mpstat -P ALL 1) on receiver
vCPU       0       1       2       3
=======================================================
pass 1:    15.42   15.85   14.99   14.51
pass 2:    15.53   15.94   15.81   15.93
pass 3:    16.41   16.35   16.40   16.36

=======================================================
Case 3: with affinity set to NULL
=======================================================
4 vCPU(2 cores), 5 MANA IRQs (1 HWC + 4 Queue)

        TYPE       effective vCPU aff
=======================================================
IRQ0:   HWC        0
IRQ1:   mana_q1    2
IRQ2:   mana_q2    3
IRQ3:   mana_q3    2
IRQ4:   mana_q4    3

=======================================================
Throughput Impact(in Gbps, same env)
=======================================================
TCP conn   with patch   w/o patch   aff NULL
20480      15.65        7.73        5.25
10240      15.63        8.93        5.77
8192       15.64        9.69        7.16
6144       15.64        13.16       9.33
4096       15.69        15.75       13.50
2048       15.69        15.83       13.61
1024       15.71        15.28       13.60

Fixes: 755391121038 ("net: mana: Allocate MSI-X vectors dynamically") Cc: stable@vger.kernel.org Co-developed-by: Erni Sri Satya Vennela <ernis@linux.microsoft.com> Signed-off-by: Erni Sri Satya Vennela <ernis@linux.microsoft.com> Signed-off-by: Shradha Gupta <shradhagupta@linux.microsoft.com> Reviewed-by: Haiyang Zhang <haiyangz@microsoft.com> Reviewed-by: Simon Horman <horms@kernel.org> Reviewed-by: Yury Norov <ynorov@nvidia.com> Link: https://patch.msgid.link/20260624072138.1632849-1-shradhagupta@linux.microsoft.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
6 daysnet: sparx5: unregister blocking notifier on init failureHaoxiang Li1-1/+3
sparx5_register_notifier_blocks() registers the switchdev blocking notifier before allocating the ordered workqueue. If the workqueue allocation fails, the error path unregisters the switchdev and netdevice notifiers, but leaves the blocking notifier registered. Add a separate error label for the workqueue allocation failure path and unregister the switchdev blocking notifier there. Fixes: d6fce5141929 ("net: sparx5: add switching support") Cc: stable@vger.kernel.org Signed-off-by: Haoxiang Li <haoxiang_li2024@163.com> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/20260623115714.2192074-1-haoxiang_li2024@163.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
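The error-label ordering described above can be sketched in userspace as follows. This is only an illustration of the unwind pattern: reg_stub(), unreg_stub(), register_all() and the registered counter are hypothetical stand-ins, not sparx5 or notifier-core functions, and the workqueue failure is simulated with a flag.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical bookkeeping: number of notifiers currently registered. */
static int registered;

/* Hypothetical stand-ins for notifier register/unregister calls. */
static int reg_stub(void) { registered++; return 0; }
static void unreg_stub(void) { registered--; }

/*
 * Register three notifiers, then "allocate" a workqueue. Each failure
 * jumps to the label that undoes everything registered so far, so the
 * workqueue failure path also drops the blocking notifier.
 */
static int register_all(bool fail_workqueue)
{
	int err;

	err = reg_stub();		/* netdevice notifier */
	if (err)
		return err;

	err = reg_stub();		/* switchdev notifier */
	if (err)
		goto err_netdev;

	err = reg_stub();		/* switchdev blocking notifier */
	if (err)
		goto err_switchdev;

	if (fail_workqueue) {		/* simulated allocation failure */
		err = -ENOMEM;
		goto err_blocking;
	}
	return 0;

err_blocking:
	unreg_stub();			/* the step the old error path skipped */
err_switchdev:
	unreg_stub();
err_netdev:
	unreg_stub();
	return err;
}
```

With the separate err_blocking label, a failed workqueue allocation leaves the counter at zero, which is the property the fix restores.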
6 daysocteontx2-af: Free BPID bitmap on setup failureHaoxiang Li1-3/+8
nix_setup_bpids() allocates bp->bpids with rvu_alloc_bitmap(), which uses a plain kcalloc(). If any of the following devm_kcalloc() allocations for the BPID mapping arrays fails, the function returns without freeing the bitmap. Free the BPID bitmap before returning from those error paths. Fixes: d6212d2e41a0 ("octeontx2-af: Create BPIDs free pool") Cc: stable@vger.kernel.org Signed-off-by: Haoxiang Li <haoxiang_li2024@163.com> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/20260623114316.2182271-1-haoxiang_li2024@163.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
6 daysnet: enetc: fix potential divide-by-zero when num_vsi is zeroWei Fang1-0/+3
For i.MX94 series, all the standalone ENETCs do not support SR-IOV, so pf->caps.num_vsi is zero. This leads to a divide-by-zero in enetc4_default_rings_allocation() when distributing rings among PF and VFs. Division by zero is undefined behavior in C. On ARM64, the UDIV/SDIV instructions silently return zero rather than raising an exception, so the issue does not cause a visible crash. However, relying on this behavior is incorrect and poses a cross-platform compatibility risk. Add an explicit check for num_vsi == 0 and return early after the PF's rings have been configured. Fixes: 2d673b0e2f8d ("net: enetc: add standalone ENETC support for i.MX94") Signed-off-by: Wei Fang <wei.fang@nxp.com> Reviewed-by: Maxime Chevallier <maxime.chevallier@bootlin.com> Link: https://patch.msgid.link/20260624072726.1238903-1-wei.fang@oss.nxp.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
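A minimal userspace sketch of the early-return guard, assuming a made-up split policy. The struct, the function name split_rings() and the "PF keeps half" rule are hypothetical and do not reflect how enetc4_default_rings_allocation() actually divides rings; only the num_vsi == 0 check mirrors the commit message.

```c
#include <stdint.h>

/* Hypothetical result type; not an enetc structure. */
struct ring_split {
	uint32_t pf_rings;	/* rings kept by the PF */
	uint32_t per_vsi_rings;	/* rings given to each VSI, 0 if none */
};

static struct ring_split split_rings(uint32_t total_rings, uint32_t num_vsi)
{
	struct ring_split s = { .pf_rings = total_rings, .per_vsi_rings = 0 };

	/* No VSIs: the PF keeps everything; return before dividing. */
	if (num_vsi == 0)
		return s;

	/* Assumed policy for illustration: PF keeps half, VSIs share the rest. */
	s.pf_rings = total_rings / 2;
	s.per_vsi_rings = (total_rings - s.pf_rings) / num_vsi;
	return s;
}
```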
7 daysMerge branch '100GbE' of ↵Jakub Kicinski9-27/+48
git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue Tony Nguyen says: ==================== Intel Wired LAN Driver Updates 2026-06-22 (ice, i40e, e1000e) For ice: Dawid changes call to release control VSI during reset to prevent leaking it. Lukasz fixes flow control error check to check value rather than treat it as bitmap values. Paul makes link related errors non-fatal to probe to allow for recovery in certain NVM update situations. Marcin moves netif_keep_dst() to only be called once when entering switchdev mode. ZhaoJinming adds a cleanup path for ice_dpll_init_info() to prevent memory leaks on error path. For i40e: Mohamed Khalfella corrects argument passed in macro to match the one provided to the macro. For e1000e: Dima resolves power state issues by adjusting value of PLL clock gate and re-enabling K1; a quirk table is added to keep it off for known bad systems. * '100GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue: e1000e: Reconfigure PLL clock gate timeout and re-enable K1 on Meteor Lake i40e: Fix i40e_debug() to use struct i40e_hw argument ice: dpll: fix memory leak in ice_dpll_init_info error paths ice: dpll: set pointers to NULL after kfree in ice_dpll_deinit_info ice: call netif_keep_dst() once when entering switchdev mode ice: fix ice_init_link() error return preventing probe ice: fix AQ error code comparison in ice_set_pauseparam() ice: fix FDIR CTRL VSI resource leak in ice_reset_all_vfs() ==================== Link: https://patch.msgid.link/20260622220059.2471844-1-anthony.l.nguyen@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
7 daysnet: stmmac: dwmac-spacemit: Fix wrong irq definitionInochi Amaoto1-2/+2
The current irq definition of the wake irq and the lpi irq is wrong, replace them with the right number and name. Fixes: 30f0ba420ed3 ("net: stmmac: Add glue layer for Spacemit K3 SoC") Signed-off-by: Inochi Amaoto <inochiama@gmail.com> Reviewed-by: Maxime Chevallier <maxime.chevallier@bootlin.com> Link: https://patch.msgid.link/20260623074637.503864-3-inochiama@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
7 daysnet: stmmac: dwmac-spacemit: Fix wrong phy interface definitionInochi Amaoto1-3/+6
The current MII interface register definition from the vendor is wrong, use the right number for the macro. Also, correct the interface mask in spacemit_set_phy_intf_sel() so it can update the register with the right number. Fixes: 30f0ba420ed3 ("net: stmmac: Add glue layer for Spacemit K3 SoC") Signed-off-by: Inochi Amaoto <inochiama@gmail.com> Reviewed-by: Maxime Chevallier <maxime.chevallier@bootlin.com> Link: https://patch.msgid.link/20260623074637.503864-2-inochiama@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
7 daysnet: ethernet: sunplus: spl2sw: fix phy_node refcount leak in removeShitalkumar Gandhi1-2/+4
mac->phy_node is acquired via of_parse_phandle() in spl2sw_probe() and stored in the mac private data, transferring ownership of the device_node reference to mac. On driver removal, spl2sw_phy_remove() disconnects the PHY but never drops that reference, so each probe-then-remove cycle leaks one of_node refcount per port permanently. Drop the reference after phy_disconnect(). While at it, remove the redundant inner "if (ndev)" check; comm->ndev[i] was just verified non-NULL on the line above. Compile-tested only; no SP7021 hardware available. Fixes: fd3040b9394c ("net: ethernet: Add driver for Sunplus SP7021") Signed-off-by: Shitalkumar Gandhi <shitalkumar.gandhi@cambiumnetworks.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Link: https://patch.msgid.link/f3bdd4c91f3e2269b4e256075f9dc70808b1b8e9.1782195965.git.shitalkumar.gandhi@cambiumnetworks.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
7 daysnet: sungem: fix probe error cleanupRuoyu Wang1-4/+9
gem_init_one() calls gem_remove_one() when register_netdev() fails. gem_remove_one() unregisters and frees resources owned by the net_device, including the DMA block, MMIO mapping, PCI regions, and the net_device itself. gem_init_one() then falls through to its own cleanup labels and frees the same resources again. Keep the register_netdev() error path in gem_init_one(): clear drvdata so PM/remove paths do not see a half-registered device, remove the NAPI instance added during probe, and let the existing cleanup labels release the resources once. The issue was found by a local static-analysis checker for probe error paths. The reported path was manually inspected before sending this fix. Compile-tested with CONFIG_SUNGEM=y. Runtime testing was not performed because no sungem hardware is available. Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Signed-off-by: Ruoyu Wang <ruoyuw560@gmail.com> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/20260623025759.3468566-1-ruoyuw560@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
7 dayseth: mlx5: fix macsec dependencyArnd Bergmann1-1/+1
Configurations with mlx5 built-in but macsec=m fail to link: x86_64-linux-ld: drivers/infiniband/hw/mlx5/macsec.o: in function `mlx5r_add_gid_macsec_operations': macsec.c:(.text+0x77d): undefined reference to `macsec_netdev_is_offloaded' x86_64-linux-ld: drivers/infiniband/hw/mlx5/macsec.o: in function `mlx5r_del_gid_macsec_operations': macsec.c:(.text+0xe81): undefined reference to `macsec_netdev_is_offloaded' Fix the dependency so this configuration cannot happen. Signed-off-by: Arnd Bergmann <arnd@arndb.de> Reviewed-by: Sabrina Dubroca <sd@queasysnail.net> Link: https://patch.msgid.link/20260622124229.2444502-1-arnd@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
7 daysnet: usb: kalmia: bound RX frame length in kalmia_rx_fixup()Maoyi Xie1-0/+8
kalmia_rx_fixup() computes usb_packet_length = skb->len - (2 * KALMIA_HEADER_LENGTH) as a u16, guarded only by a pre-loop check that skb->len is at least KALMIA_HEADER_LENGTH, which is 6. A device can deliver a short bulk-IN frame with skb->len in the 6 to 11 range, or leave a short trailing remainder on a later loop iteration. Either case underflows usb_packet_length to about 65530. That bypasses the usb_packet_length < ether_packet_length truncation path. The device-supplied ether_packet_length, a le16 up to 65535 read from header_start[2], then drives a memcmp() and the following skb_trim() and skb_pull() past the end of the rx buffer. The rx buffer is hard_mtu * 10, which is 14000 bytes. That is an out of bounds read. Require both the start and end framing headers to be present before subtracting them, on every loop iteration. Fixes: d40261236e8e ("net/usb: Add Samsung Kalmia driver for Samsung GT-B3730") Cc: stable@vger.kernel.org Signed-off-by: Maoyi Xie <maoyixie.tju@gmail.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Link: https://patch.msgid.link/178211531778.2216480.12637613349790980750@maoyixie.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
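The underflow and the added bound can be reproduced with plain integer arithmetic. The sketch below is a userspace illustration only: the helper names are hypothetical, and KALMIA_HEADER_LENGTH is set to 6 as stated in the commit message.

```c
#include <stdbool.h>
#include <stdint.h>

#define KALMIA_HEADER_LENGTH 6	/* value stated in the commit message */

/* Unchecked subtraction: wraps around when skb_len is below 12. */
static uint16_t usb_len_unchecked(uint32_t skb_len)
{
	return (uint16_t)(skb_len - 2 * KALMIA_HEADER_LENGTH);
}

/*
 * Checked variant: require both framing headers to be present before
 * subtracting them. Returns false for frames that are too short.
 */
static bool usb_len_checked(uint32_t skb_len, uint16_t *out)
{
	if (skb_len < 2 * KALMIA_HEADER_LENGTH)
		return false;

	*out = (uint16_t)(skb_len - 2 * KALMIA_HEADER_LENGTH);
	return true;
}
```

For skb_len = 6, the unchecked helper yields 65530, matching the figure quoted above, while the checked helper rejects the frame.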
7 daysgeneve: validate inner network offset in geneve_gro_complete()Xiang Mei1-0/+14
Even with both paths gated on gs->gro_hint, geneve_gro_complete() re-derives the inner dispatch type and length from the packet and the current gs->gro_hint, independently of geneve_gro_receive(). The two can disagree if gs->gro_hint flips under a concurrent geneve_quiesce()/ geneve_unquiesce() (sk_user_data is NULL across a synchronize_net()), or if the re-read option bytes differ from the ones receive parsed. geneve_gro_receive() already records the inner network header position in NAPI_GRO_CB()->inner_network_offset. Have geneve_gro_complete() compute the offset it is about to dispatch at, adding ETH_HLEN in the ETH_P_TEB case where eth_gro_complete() steps over the inner MAC header, and bail out if it lands past inner_network_offset. Use a lower bound rather than exact equality: between gh_len and the inner L3 header, geneve_gro_receive() may also have pulled an inner VLAN tag (vlan_gro_receive() advances the recorded offset past it), which only moves inner_network_offset further out. A valid frame therefore always satisfies inner_nh <= inner_network_offset, while a gh_len inflated by a hint gro_receive() did not honour dispatches past the validated inner header, i.e. the out-of-bounds completion. Only the latter is rejected. Fixes: fd0dd796576e ("geneve: use GRO hint option in the RX path") Suggested-by: Paolo Abeni <pabeni@redhat.com> Co-developed-by: Weiming Shi <bestswngs@gmail.com> Signed-off-by: Xiang Mei <xmei5@asu.edu> Link: https://patch.msgid.link/20260618032622.484720-2-xmei5@asu.edu Signed-off-by: Jakub Kicinski <kuba@kernel.org>
7 daysgeneve: gate GRO hint in geneve_gro_complete() on gs->gro_hintXiang Mei1-2/+2
geneve_gro_receive() reads the GRO hint through geneve_sk_gro_hint_off(), which honours it only when the socket enabled IFLA_GENEVE_GRO_HINT (gs->gro_hint). geneve_gro_complete() instead calls the low-level geneve_opt_gro_hint_off() and acts on the hint unconditionally. On a tunnel without the hint, receive aggregates the frames as plain ETH_P_TEB while complete still honours an attacker-supplied hint option: it inflates gh_len by gro_hint->nested_hdr_len (u8) and redirects the dispatch type, so the inner gro_complete handler runs at nhoff + gh_len, an offset receive never pulled nor validated, reading out of bounds of the skb head: BUG: KASAN: slab-out-of-bounds in ipv6_gro_complete (net/ipv6/ip6_offload.c:196) Read of size 1 at addr ffff88800fe91980 by task exploit/153 ipv6_gro_complete (net/ipv6/ip6_offload.c:196) geneve_gro_complete (drivers/net/geneve.c:965) udp_gro_complete (net/ipv4/udp_offload.c:940) inet_gro_complete (net/ipv4/af_inet.c:1621) __gro_flush (net/core/gro.c:306) Gate the complete path on gs->gro_hint too via geneve_sk_gro_hint_off(), so both paths agree. Tunnels that enable the hint are unaffected. Fixes: fd0dd796576e ("geneve: use GRO hint option in the RX path") Reported-by: Weiming Shi <bestswngs@gmail.com> Reported-by: Kyle Zeng <kylebot@openai.com> Signed-off-by: Xiang Mei <xmei5@asu.edu> Link: https://patch.msgid.link/20260618032622.484720-1-xmei5@asu.edu Signed-off-by: Jakub Kicinski <kuba@kernel.org>
7 daysnet: mvneta: re-enable percpu interrupt on resumeYun Zhou1-0/+3
On Marvell MPIC platforms (Armada 370/XP/38x), mvneta uses a percpu IRQ disable/enable scheme for NAPI: the ISR (mvneta_percpu_isr) calls disable_percpu_irq() to mask the MPIC per-CPU interrupt and schedules NAPI poll, which calls enable_percpu_irq() on completion to unmask. If suspend occurs while NAPI poll is pending (between disable_percpu_irq in the ISR and enable_percpu_irq in poll completion), the interrupt is never re-enabled:
1. mvneta_percpu_isr: disable_percpu_irq() + napi_schedule() => MPIC masked, percpu_enabled cpumask bit cleared
2. NAPI poll does not complete before suspend proceeds (on PREEMPT_RT this is highly likely since softirqs run in ksoftirqd which gets frozen; on non-RT it can happen when softirq processing is deferred to ksoftirqd)
3. mvneta_stop_dev => napi_disable(): cancels the pending poll without executing the completion path
4. suspend_device_irqs => IRQCHIP_MASK_ON_SUSPEND: masks MPIC (already masked, but records IRQS_SUSPENDED)
5. Resume: mpic_resume checks irq_percpu_is_enabled() => false (bit was cleared in step 1) => skips unmask
6. mvneta_start_dev only restores device-level INTR_NEW_MASK, does not touch the MPIC per-CPU mask
Result: MPIC per-CPU interrupt stays masked permanently. The NIC generates interrupts (INTR_NEW_CAUSE != 0) but the CPU never receives them, causing complete loss of network connectivity. Fix by calling on_each_cpu(mvneta_percpu_enable) in the resume path to unconditionally unmask the MPIC per-CPU interrupt regardless of pre-suspend state. Fixes: 12bb03b436da ("net: mvneta: Handle per-cpu interrupts") Signed-off-by: Yun Zhou <yun.zhou@windriver.com> Reviewed-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Link: https://patch.msgid.link/20260622074350.1666290-1-yun.zhou@windriver.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
7 daysocteontx2-af: fix CGX debugfs RVU AF PCI reference leaksRatheesh Kannoth1-26/+30
CGX per-lmac debugfs seq readers obtained struct rvu via pci_get_drvdata(pci_get_device(..., PCI_DEVID_OCTEONTX2_RVU_AF, ...)), which leaks a PCI device reference on every read. Store rvu and the CGX handle in debugfs inode private data when creating stats, mac_filter, and fwdata files (one context per CGX), and use debugfs aux numbers for fwdata so lmac_id matches the other CGX debugfs entries. Fixes: f967488d095e ("octeontx2-af: Add per CGX port level NIX Rx/Tx counters") Fixes: dbc52debf95f ("octeontx2-af: Debugfs support for DMAC filters") Fixes: 49f02e6877d1 ("Octeontx2-af: Debugfs support for firmware data") Cc: Linu Cherian <lcherian@marvell.com> Reported-by: Yuho Choi <dbgh9129@gmail.com> Reviewed-by: Simon Horman <horms@kernel.org> Signed-off-by: Ratheesh Kannoth <rkannoth@marvell.com> Link: https://patch.msgid.link/20260622034229.2254145-1-rkannoth@marvell.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
7 daysocteontx2-af: Validate NIX maximum LFs correctlySubbaraya Sundeep1-8/+19
The maximum number of NIX LFs can be set via a devlink command, but only before any LFs are assigned to a PF/VF. The condition used to check whether any LFs are assigned is incorrect. This patch fixes that condition. Fixes: dd7842878633 ("octeontx2-af: Add new devlink param to configure maximum usable NIX block LFs") Signed-off-by: Subbaraya Sundeep <sbhatta@marvell.com> Link: https://patch.msgid.link/1782082853-6941-1-git-send-email-sbhatta@marvell.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
7 daysnet: wwan: t7xx: destroy DMA pool on CLDMA late init failureHaoxiang Li1-0/+3
t7xx_cldma_late_init() creates md_ctrl->gpd_dmapool before initializing the TX and RX rings. If any ring initialization fails, the error path frees the already initialized rings but leaves the DMA pool allocated. Destroy md_ctrl->gpd_dmapool on the late-init failure path to avoid leaking the DMA pool. Fixes: 39d439047f1d ("net: wwan: t7xx: Add control DMA interface") Cc: stable@vger.kernel.org Signed-off-by: Haoxiang Li <haoxiang_li2024@163.com> Reviewed-by: Loic Poulain <loic.poulain@oss.qualcomm.com> Link: https://patch.msgid.link/20260621031714.3605022-1-haoxiang_li2024@163.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
7 daysnet: airoha: fix BQL underflow in shared QDMA TX ringLorenzo Bianconi2-76/+93
When multiple netdevs share a QDMA TX ring and one device is stopped, netdev_tx_reset_subqueue() zeroes that device's BQL counters while its pending skbs remain in the shared HW TX ring. When NAPI later completes those skbs via netdev_tx_completed_queue(), the already-zeroed dql->num_queued counter underflows. Fix the issue:
- Remove netdev_tx_reset_subqueue() from airoha_dev_stop() so pending skbs are completed naturally by NAPI with proper BQL accounting.
- Rework airoha_qdma_tx_cleanup() to disable TX DMA, flush BQL counters, DMA-unmap and free all pending skbs while skb->dev references are still valid. Use a per-queue flushing flag checked under q->lock in airoha_dev_xmit() to prevent races between teardown and transmit. Call airoha_qdma_stop_napi() before airoha_qdma_tx_cleanup() at the call sites.
- Move DMA engine start into probe. Split DMA teardown so TX DMA is disabled in airoha_qdma_tx_cleanup() and RX DMA in airoha_qdma_cleanup().
- Remove qdma->users counter since DMA lifetime is now tied to probe/cleanup rather than per-netdev open/stop.
Fixes: a9c2ca61fec7 ("net: airoha: Support multiple net_devices for a single FE GDM port") Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/20260620-airoha-bql-fixes-v3-1-76b95374e63e@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
7 daysnet: phy: realtek: Clear MDIO_AN_10GBT_CTRL_ADV10G bitJan Klos1-1/+2
On an RTL8127A connected to a link partner that advertises 10000baseT, the speed cannot be changed to anything other than 10000baseT, because 10GbE is always advertised regardless of any setting. Fix this by clearing the MDIO_AN_10GBT_CTRL_ADV10G bit in rtl822x_config_aneg()'s call to phy_modify_mmd_changed(). Fixes: 83d962316128 ("net: phy: realtek: add RTL8127-internal PHY") Reviewed-by: Maxime Chevallier <maxime.chevallier@bootlin.com> Signed-off-by: Jan Klos <honza.klos@gmail.com> Link: https://patch.msgid.link/20260620011956.37181-1-honza.klos@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
7 daysocteontx2-af: npc: cn20k: Fix subbank free list indexing for search orderRatheesh Kannoth1-12/+39
subbank_srch_order[i] is the physical subbank at search-order slot i, so each subbank's arr_idx must be i (its slot), not subbank_srch_order[sb->idx]. The old logic mis-keyed xa_sb_free and broke allocation traversal order. Populate arr_idx and xa_sb_free in a single pass over the search order after subbank structs are initialized. Fixes: 7ac9d4c4075c ("octeontx2-af: npc: cn20k: add subbank search order control") Signed-off-by: Ratheesh Kannoth <rkannoth@marvell.com> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/20260619095100.1864440-1-rkannoth@marvell.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
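The indexing rule above amounts to inverting the search-order array in a single pass. The following sketch shows that pass in isolation; struct subbank and assign_arr_idx() are simplified, hypothetical stand-ins and do not reproduce the driver's structures or its xa_sb_free handling.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, reduced subbank descriptor. */
struct subbank {
	uint32_t idx;		/* physical subbank number */
	uint32_t arr_idx;	/* slot in the search order */
};

/*
 * srch_order[i] names the physical subbank searched at slot i, so the
 * subbank it names gets arr_idx = i. The old keying, arr_idx =
 * srch_order[sb->idx], is only right when the order is its own inverse.
 */
static void assign_arr_idx(struct subbank *sb, const uint32_t *srch_order,
			   size_t n)
{
	for (size_t i = 0; i < n; i++)
		sb[srch_order[i]].arr_idx = (uint32_t)i;
}
```

For the search order {2, 0, 1}, subbank 2 is searched first and gets arr_idx 0, whereas the old keying would have given it srch_order[2] = 1.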
7 daysnet: mana: Fall back to standard MTU when PF reports adapter_mtu of 0Erni Sri Satya Vennela2-3/+16
Commit d7709812e13d ("net: mana: hardening: Validate adapter_mtu from MANA_QUERY_DEV_CONFIG") rejected any adapter_mtu value smaller than ETH_MIN_MTU + ETH_HLEN, including 0, returning -EPROTO and failing mana_probe(). Some older PF firmware versions still in the field report adapter_mtu as 0 in the MANA_QUERY_DEV_CONFIG response. With the hardening check in place, the MANA VF driver now fails to load on those hosts, breaking networking entirely for guests. MANA hardware always supports the standard Ethernet MTU. Treat a reported adapter_mtu of 0 as "the PF did not advertise a value" and fall back to ETH_FRAME_LEN, the same value used for the pre-V2 message version path. Only jumbo frames remain unavailable until the PF reports a valid MTU. Other small-but-nonzero bogus values are still rejected, preserving the original protection against the unsigned-subtraction wrap that would otherwise let ndev->max_mtu underflow to a huge value. Fixes: d7709812e13d ("net: mana: hardening: Validate adapter_mtu from MANA_QUERY_DEV_CONFIG") Signed-off-by: Erni Sri Satya Vennela <ernis@linux.microsoft.com> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/20260619055348.467224-1-ernis@linux.microsoft.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
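The resulting three-way decision can be sketched as below. resolve_adapter_mtu() is a hypothetical helper written for illustration, not a function from the MANA driver; the ETH_* values are the ones defined in the kernel's if_ether.h.

```c
#include <errno.h>
#include <stdint.h>

/* Values as defined in include/uapi/linux/if_ether.h */
#define ETH_HLEN	14	/* Ethernet header length */
#define ETH_MIN_MTU	68	/* minimum IPv4 MTU */
#define ETH_FRAME_LEN	1514	/* max frame length without FCS */

/*
 * Hypothetical helper: map the PF-reported adapter_mtu to the value the
 * driver would use, or return -EPROTO for small nonzero bogus values.
 */
static int resolve_adapter_mtu(uint32_t reported, uint32_t *adapter_mtu)
{
	/* 0 means "not advertised": fall back to the standard frame size. */
	if (reported == 0) {
		*adapter_mtu = ETH_FRAME_LEN;
		return 0;
	}

	/* Still reject values that would make max_mtu wrap around. */
	if (reported < ETH_MIN_MTU + ETH_HLEN)
		return -EPROTO;

	*adapter_mtu = reported;
	return 0;
}
```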
7 daysnet: dsa: mxl862xx: fix use-after-free of DSA ports in crc_err_workDaniel Golle1-5/+6
Upon an MDIO CRC error mxl862xx_crc_err_work_fn() walks the DSA ports and closes the CPU port conduits: dsa_switch_for_each_cpu_port(dp, priv->ds) dev_close(dp->conduit); mxl862xx_remove() unregisters the switch before cancelling this work: set_bit(MXL862XX_FLAG_WORK_STOPPED, &priv->flags); cancel_delayed_work_sync(&priv->stats_work); dsa_unregister_switch(ds); mxl862xx_host_shutdown(priv); dsa_unregister_switch() frees the dsa_port objects. If a CRC error schedules the work during teardown it can run after the ports have been freed and dereference freed memory. Guard the port walk with MXL862XX_FLAG_WORK_STOPPED, which is already set before dsa_unregister_switch(). DSA tears the ports down under rtnl_lock(), so checking the flag under rtnl_lock() means the work either runs before teardown and sees valid ports, or runs afterwards, observes the flag and skips the walk. This mirrors the host_flood_work handler, which skips torn-down ports under rtnl_lock(). Link: https://sashiko.dev/#/patchset/cover.1780968180.git.daniel%40makrotopia.org?part=2 Fixes: a319d0c8c8ce ("net: dsa: mxl862xx: add CRC for MDIO communication") Signed-off-by: Daniel Golle <daniel@makrotopia.org> Link: https://patch.msgid.link/5e55169926c02f2b914e5ada529d7453b943cda4.1781702256.git.daniel@makrotopia.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
7 daysnet: dsa: mxl862xx: avoid unaligned 16-bit access in api_wrapDaniel Golle1-3/+4
The MXL862XX_API_* macros pass the address of a stack-allocated, __packed firmware-ABI struct to mxl862xx_api_wrap() as a void *. The struct has an alignment of 1, so the compiler is free to place it at an odd address. mxl862xx_api_wrap() reinterprets that buffer as a __le16 * and accesses it with data[i], for which the compiler assumes the natural 2-byte alignment of __le16 and emits aligned 16-bit loads/stores (e.g. lhu/sh on MIPS). When the buffer lands on an odd address these fault on architectures that do not support unaligned access, such as MIPS32. -Waddress-of-packed-member does not catch this: the packed origin is laundered through the void * parameter, so the cast inside api_wrap looks alignment-safe to the compiler and no warning is emitted. Use get_unaligned_le16()/put_unaligned_le16() for the three 16-bit word accesses. The byte accesses (*(u8 *)&data[i], crc16()) are already safe and are left unchanged. Link: https://sashiko.dev/#/patchset/cover.1781319534.git.daniel%40makrotopia.org?part=4 Fixes: 23794bec1cb6 ("net: dsa: add basic initial driver for MxL862xx switches") Signed-off-by: Daniel Golle <daniel@makrotopia.org> Link: https://patch.msgid.link/599327521db465a534d277de53ab9b6cac01928b.1781702256.git.daniel@makrotopia.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
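For readers outside the kernel tree, the effect of get_unaligned_le16()/put_unaligned_le16() can be approximated in userspace with byte-wise accesses. load_le16() and store_le16() below are hypothetical names for this sketch; they are not the kernel helpers themselves, only a portable equivalent of what those helpers guarantee.

```c
#include <stdint.h>

/* Read a little-endian 16-bit value from any address, aligned or not. */
static uint16_t load_le16(const void *p)
{
	const uint8_t *b = p;

	return (uint16_t)(b[0] | (b[1] << 8));
}

/* Write a little-endian 16-bit value to any address, aligned or not. */
static void store_le16(void *p, uint16_t v)
{
	uint8_t *b = p;

	b[0] = (uint8_t)(v & 0xff);
	b[1] = (uint8_t)(v >> 8);
}
```

Because only single bytes are touched, these accessors work at odd addresses on strict-alignment architectures, unlike a dereference through a 16-bit pointer.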
7 daysnet: dsa: realtek: fix memory leak in rtl8366rb_setup_led()David Yang1-4/+4
led_classdev_register_ext() only reads init_data.devicename - it never stores the pointer. However, the caller allocated devicename with kasprintf() but never freed it, leaking the string memory. Fix it with a stack buffer to avoid dynamic buffers completely. Fixes: 32d617005475 ("net: dsa: realtek: add LED drivers for rtl8366rb") Signed-off-by: David Yang <mmyangfl@gmail.com> Reviewed-by: Linus Walleij <linusw@kernel.org> Link: https://patch.msgid.link/20260618140200.1888707-1-mmyangfl@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
7 daysnet: ixp4xx_hss: fix duplicate HDLC netdev allocationHaoxiang Li1-2/+2
ixp4xx_hss_probe() allocates two HDLC netdevs. The first one is stored in ndev, initialized, and registered with register_hdlc_device(). The second one is stored in port->netdev and later used by the remove path for unregister_hdlc_device() and free_netdev(). This means that the registered netdev is not the same object that is unregistered and freed on remove. It also leaks the first allocation if the second alloc_hdlcdev() call fails, and the first allocation is not checked before ndev is used. Older code allocated the HDLC netdev only once and stored the same object in both the local variable and port->netdev. The buggy conversion split this into two alloc_hdlcdev() calls. A later rename changed the local variable name to ndev, but the underlying mismatch remained. Fix this by allocating the HDLC netdev only once and assigning the same object to port->netdev. Fixes: 99ebe65eb9c0 ("net: ixp4xx_hss: move out assignment in if condition") Cc: stable@vger.kernel.org Signed-off-by: Haoxiang Li <haoxiang_li2024@163.com> Reviewed-by: Linus Walleij <linusw@kernel.org> Link: https://patch.msgid.link/20260622043015.643637-1-haoxiang_li2024@163.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
7 daysnet: airoha: fix netif_set_real_num_tx_queues for sparse QoS channelsLorenzo Bianconi1-9/+16
airoha_tc_htb_alloc_leaf_queue() assigns queue IDs based on the channel index (opt->qid = AIROHA_NUM_TX_RING + channel), but updates real_num_tx_queues with a simple increment (num_tx_queues + 1). When QoS channels are allocated sparsely (e.g., channels 0 and 3 without 1 and 2), the returned qid can exceed real_num_tx_queues, causing out-of-bounds accesses in the networking stack. For example, allocating channel 0 then channel 3 results in real_num_tx_queues = 34 but qid = 35, which is out of range [0, 34). Fix this by computing real_num_tx_queues based on the highest active channel index rather than using a simple counter, in both the allocation and deletion paths. Fixes: ef1ca9271313b ("net: airoha: Add sched HTB offload support") Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/20260619-airoha-qos-fixes-v2-2-5c43485038f9@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
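The difference between counting channels and tracking the highest channel can be checked with a small bitmap sketch. The helper names are hypothetical, and NUM_TX_RING = 32 is inferred from the example in the commit message (two channels giving 34 queues), not taken from the driver source.

```c
#include <stdint.h>

#define NUM_TX_RING 32	/* assumption inferred from the example above */

/* Counter-style result: base rings plus the number of active channels. */
static unsigned int real_txq_by_count(uint32_t chan_bitmap)
{
	unsigned int n = 0;

	for (; chan_bitmap; chan_bitmap &= chan_bitmap - 1)
		n++;		/* clear the lowest set bit each round */
	return NUM_TX_RING + n;
}

/* Fixed-style result: base rings plus (highest active channel + 1). */
static unsigned int real_txq_by_highest(uint32_t chan_bitmap)
{
	unsigned int top = 0;

	while (chan_bitmap) {
		top++;
		chan_bitmap >>= 1;
	}
	return NUM_TX_RING + top;
}
```

With channels 0 and 3 active (bitmap 0x9), the counter-style result is 34, which does not cover qid 35, while the highest-channel result is 36.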
7 daysnet: airoha: Fix off-by-one in airoha_tc_remove_htb_queue()Lorenzo Bianconi1-1/+1
airoha_tc_htb_alloc_leaf_queue() computes the HTB QoS channel index as opt->classid % AIROHA_NUM_QOS_CHANNELS and stores it in qos_sq_bmap. However, airoha_tc_remove_htb_queue() clears the HTB configuration using queue + 1 as the channel index, causing an off-by-one error. Use queue directly as the QoS channel index to match the allocation logic. Fixes: ef1ca9271313b ("net: airoha: Add sched HTB offload support") Reviewed-by: Simon Horman <horms@kernel.org> Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org> Link: https://patch.msgid.link/20260619-airoha-qos-fixes-v2-1-5c43485038f9@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
8 dayseth: fbnic: fix ordering of heartbeat vs ownershipJakub Kicinski1-5/+4
When requesting ownership of the NIC (MAC/PHY control), we set up the heartbeat to look stale: /* Initialize heartbeat, set last response to 1 second in the past * so that we will trigger a timeout if the firmware doesn't respond */ fbd->last_heartbeat_response = req_time - HZ; fbd->last_heartbeat_request = req_time; The response handler then sets: fbd->last_heartbeat_response = jiffies; for which we wait via: fbnic_fw_init_heartbeat() -> fbnic_fw_heartbeat_current() The scheme is a bit odd, but it should work in principle. Fix the ordering of operations. We have to set up the stale heartbeat before we send the message. Otherwise if the response is very fast we will override it. This triggers on QEMU if we run on the core that handles the IRQ, and results in ndo_open failing with ETIMEDOUT. The change in ordering doesn't impact releasing the ownership. Both ndo_stop and heartbeat check are under rtnl_lock. Fixes: 20d2e88cc746 ("eth: fbnic: Add initial messaging to notify FW of our presence") Reviewed-by: Alexander Duyck <alexanderduyck@fb.com> Reviewed-by: Pavan Chebbi <pavan.chebbi@broadcom.com> Link: https://patch.msgid.link/20260622154753.827506-1-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
8 daysnet: au1000: move free_irq out of the close-time spinlocked sectionRunyu Xiao1-1/+2
au1000_close() calls free_irq() while aup->lock is still held with spin_lock_irqsave(). free_irq() can sleep because it takes the IRQ descriptor request mutex, so it does not belong inside the close-time spinlocked section. This was found by our static analysis tool and then confirmed by manual review of the in-tree au1000_close() .ndo_stop path. The reviewed path keeps aup->lock held across the MAC reset, queue stop and free_irq(dev->irq, dev). A directed runtime validation kept that ndo_stop carrier and the same free_irq(dev->irq, dev) operation under the driver lock. Lockdep reported "BUG: sleeping function called from invalid context" and "Invalid wait context" while free_irq() was taking desc->request_mutex, with au1000_close() and free_irq() on the stack. Drop aup->lock before freeing the IRQ. The protected close-time work still stops the device and queue before IRQ teardown, but the sleepable IRQ core path now runs outside the spinlocked section. Signed-off-by: Runyu Xiao <runyu.xiao@seu.edu.cn> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/20260619151816.1144289-1-runyu.xiao@seu.edu.cn Signed-off-by: Jakub Kicinski <kuba@kernel.org>
8 daysnet: usb: lan78xx: restore VLAN and hash filters after link upNicolai Buchwitz1-6/+31
Configured VLANs intermittently stop receiving traffic after a link down/up cycle, e.g. when the network cable is unplugged and plugged back in. VLAN filtering stays enabled but all VLAN-tagged frames are dropped until a VLAN is added or removed again. The LAN7801 datasheet (DS00002123E) states: "A portion of the MAC operates on clocks generated by the Ethernet PHY. During a PHY reset event, this portion of the MAC is designed to not be taken out of reset until the PHY clocks are operational" (section 8.10, MAC Reset Watchdog Timer) "After a reset event, the RFE will automatically initialize the contents of the VHF to 0h." (section 7.1.4, VHF Organization) Thus a link down/up cycle stops and restarts the PHY clock, resets the PHY-clocked portion of the MAC, and the RFE clears its VLAN/DA hash filter (VHF) memory. The VHF holds both the VLAN filter table and the multicast hash table, but the driver never reprograms either from its shadow copy once the link is back, so both stay empty. Reprogram the VLAN filter and multicast hash tables on link up. Reported-by: Sven Schuchmann <schuchmann@schleissheimer.de> Closes: https://lore.kernel.org/netdev/BEZP281MB224501E38B30BFDC4BD3D364D9E32@BEZP281MB2245.DEUP281.PROD.OUTLOOK.COM/T/#u Tested-by: Sven Schuchmann <schuchmann@schleissheimer.de> Fixes: 55d7de9de6c3 ("Microchip's LAN7800 family USB 2/3 to 10/100/1000 Ethernet device driver") Signed-off-by: Nicolai Buchwitz <nb@tipi-net.de> Link: https://patch.msgid.link/20260622102911.484045-1-nb@tipi-net.de Signed-off-by: Jakub Kicinski <kuba@kernel.org>
8 daysveth: fix NAPI leak in XDP enable error pathEric Dumazet1-0/+2
During XDP enablement in veth, if xdp_rxq_info_reg() or xdp_rxq_info_reg_mem_model() fails, the driver rolls back the changes. However, the rollback loop: for (i--; i >= start; i--) { decrements the loop index 'i' before the first iteration. This correctly skips unregistering the rxq for the failed index 'i' (as registration failed or was already cleaned up), but it also erroneously skips calling netif_napi_del() for rq[i].xdp_napi. Since netif_napi_add() was already called for index 'i', this leaves a dangling napi_struct in the device's napi_list. When the veth device is later destroyed, the freed queue memory (which contains the leaked NAPI structure) can be reused. The subsequent device teardown iterates the NAPI list and corrupts the reallocated memory, leading to UAF. Fix this by explicitly deleting the NAPI association for the failed index 'i' before rolling back the successfully configured queues. Fixes: b02e5a0ebb17 ("xsk: Propagate napi_id to XDP socket Rx path") Reported-by: Guenter Roeck <groeck@google.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Björn Töpel <bjorn.topel@intel.com> Cc: Daniel Borkmann <daniel@iogearbox.net> Cc: Ilias Apalodimas <ilias.apalodimas@linaro.org> Cc: Michael S. Tsirkin <mst@redhat.com> Cc: Tariq Toukan <tariqt@nvidia.com> Reviewed-by: Pavan Chebbi <pavan.chebbi@broadcom.com> Link: https://patch.msgid.link/20260622111825.88337-1-edumazet@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
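The rollback pattern described above can be sketched in plain userspace C. This is only an illustration under stated assumptions, not the veth code: `struct queue`, `enable_range()`, `leaked_napi()` and the `fail_at`/`fixed` parameters are hypothetical names, and two booleans stand in for the NAPI association and the rxq registration.

```c
#include <assert.h>
#include <stdbool.h>

#define NQ 4

/* Hypothetical per-queue state; not the driver's real structures. */
struct queue {
	bool napi_added;     /* stands in for netif_napi_add() having run */
	bool rxq_registered; /* stands in for xdp_rxq_info_reg() having succeeded */
};

/*
 * Enable queues [start, end). Registration fails at index fail_at
 * (pass -1 for no failure). With fixed == false the rollback mirrors
 * the loop quoted in the commit message and skips index i entirely.
 */
static int enable_range(struct queue *q, int start, int end, int fail_at,
			bool fixed)
{
	int i;

	assert(start >= 0 && end <= NQ);

	for (i = start; i < end; i++) {
		q[i].napi_added = true;
		if (i == fail_at)
			goto err;
		q[i].rxq_registered = true;
	}
	return 0;

err:
	if (fixed)
		q[i].napi_added = false; /* undo the half-configured index */
	for (i--; i >= start; i--) {
		q[i].rxq_registered = false;
		q[i].napi_added = false;
	}
	return -1;
}

/* Count queues that still carry a NAPI association. */
static int leaked_napi(const struct queue *q)
{
	int i, n = 0;

	for (i = 0; i < NQ; i++)
		n += q[i].napi_added;
	return n;
}
```

In this model a failure at index 2 leaves exactly one association behind without the extra cleanup step, and none with it.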
8 daysnet: ti: icssg: Fix XSK zero copy TX during application wakeupMeghana Malladi1-12/+11
emac_xsk_xmit_zc() handles TX for zero copy and is called in napi context. The user application wakes up the kernel when initiating a transmit, which triggers napi to start processing the tx packets. The num_tx check inside emac_tx_complete_packets() returns early if no packet transfer has happened, which prevents the call to emac_xsk_xmit_zc(). Remove this check to let application wakeup initiate zero copy xmit traffic. Add __netif_tx_lock() to ensure that the TX queue is protected from concurrent access during the transmission of XDP frames. This fixes netdev watchdog timeout for long runs. Fixes: e2dc7bfd677f ("net: ti: icssg-prueth: Move common functions into a separate file") Signed-off-by: Meghana Malladi <m-malladi@ti.com> Link: https://patch.msgid.link/20260618100348.2209907-1-m-malladi@ti.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
8 daysnet: dsa: sja1105: round up PTP perout pin durationAleksandrova Alyona1-1/+1
pin_duration is converted from the user-provided period to SJA1105 clock ticks and is later passed as the cycle_time argument to future_base_time(). Very small period values may become zero after the conversion, which can lead to a division by zero in future_base_time(). Round zero pin_duration up to 1 tick so that the smallest unsupported periods use the minimum non-zero hardware duration instead of passing zero to future_base_time(). Fixes: 747e5eb31d59 ("net: dsa: sja1105: configure the PTP_CLK pin as EXT_TS or PER_OUT") Signed-off-by: Aleksandrova Alyona <aga@itb.spb.ru> Link: https://patch.msgid.link/20260618110508.53094-1-aga@itb.spb.ru Signed-off-by: Jakub Kicinski <kuba@kernel.org>
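A small sketch of why the clamp matters, assuming a hypothetical 8 ns tick (`TICK_NS`) and hypothetical helpers `pin_duration_ticks()` and `next_cycle_start()`; the latter only imitates the kind of division by cycle time that the commit message attributes to future_base_time() and is not the driver's implementation.

```c
#include <assert.h>
#include <stdint.h>

#define TICK_NS 8 /* assumed tick length, for illustration only */

/* Convert a period in ns to ticks, rounding a zero result up to 1 tick. */
static uint64_t pin_duration_ticks(uint64_t period_ns)
{
	uint64_t ticks = period_ns / TICK_NS;

	return ticks ? ticks : 1;
}

/* First value base_time + n * cycle_time that is not before now. */
static uint64_t next_cycle_start(uint64_t base_time, uint64_t cycle_time,
				 uint64_t now)
{
	uint64_t n;

	assert(cycle_time != 0); /* a zero cycle would divide by zero below */

	if (base_time >= now)
		return base_time;

	n = (now - base_time + cycle_time - 1) / cycle_time;
	return base_time + n * cycle_time;
}
```

With the clamp in place, a period shorter than one tick yields a cycle of 1 tick, so the division in `next_cycle_start()` stays well defined.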
9 daysnet: airoha: Fix TX scheduler queue mask loop upper boundWayen Yan1-1/+1
In airoha_qdma_set_chan_tx_sched(), the loop clearing queue mask was using AIROHA_NUM_TX_RING (32) instead of AIROHA_NUM_QOS_QUEUES (8). Each channel has 8 queues, and TXQ_DISABLE_CHAN_QUEUE_MASK(channel, i) computes BIT(i + (channel * 8)). With i ranging 0..31, this causes: - channel 0: clears bit 0..31 (all 4 channels) instead of 0..7 - channel 1: clears bit 8..31 (channels 1-3) instead of 8..15 - channel 2: clears bit 16..31 (channels 2-3) instead of 16..23 - channel 3: clears bit 24..31 (channel 3 only) - correct by accident While BIT(32+) on arm64 produces 64-bit values truncated to 0 in u32 mask parameter, the loop still incorrectly clears queues within the same channel beyond queue 7. Even though this is functionally harmless (the register resets to 0 and is only ever cleared, never set — so clearing extra bits is a no-op), the loop bound is semantically wrong and should be fixed for correctness and clarity. Fix by using AIROHA_NUM_QOS_QUEUES (8) as the loop upper bound. Fixes: ef1ca9271313 ("net: airoha: Add sched HTB offload support") Acked-by: Lorenzo Bianconi <lorenzo@kernel.org> Signed-off-by: Wayen Yan <win847@gmail.com> Link: https://patch.msgid.link/178187479434.2400840.1312143943526335838@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
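The bit ranges listed above follow directly from BIT(i + (channel * 8)). The sketch below recomputes them in userspace; `cleared_bits()` and the two constants are hypothetical stand-ins, and the 64-bit accumulator truncated to 32 bits models the u32 mask parameter mentioned in the message.

```c
#include <assert.h>
#include <stdint.h>

#define NUM_QOS_QUEUES 8  /* queues per channel, as stated in the message */
#define NUM_TX_RING    32 /* the old, too-large loop bound */

/*
 * Bits touched for one channel when the loop runs i = 0..bound-1.
 * Bits at position 32 and above are dropped by the final truncation.
 */
static uint32_t cleared_bits(unsigned int channel, unsigned int bound)
{
	uint64_t mask = 0;
	unsigned int i;

	assert(channel < 4 && bound <= NUM_TX_RING);

	for (i = 0; i < bound; i++)
		mask |= 1ULL << (i + channel * NUM_QOS_QUEUES);

	return (uint32_t)mask;
}
```

For channel 1, `cleared_bits(1, NUM_TX_RING)` covers bits 8..31 while `cleared_bits(1, NUM_QOS_QUEUES)` covers only bits 8..15, matching the table in the commit message.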
9 daysbnx2x: fix potential memory leak in bnx2x_alloc_mem_bp()Abdun Nihaal1-2/+1
If the allocation of fp[i].tpa_info fails, the error path will not free the struct bnx2x_fastpath allocated earlier, as it is not linked to the bp structure yet. Fix that by linking it immediately after allocation. Cc: stable@vger.kernel.org Fixes: 15192a8cf8a8 ("bnx2x: Split the FP structure") Signed-off-by: Abdun Nihaal <nihaal@cse.iitm.ac.in> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/20260620062402.89549-1-nihaal@cse.iitm.ac.in Signed-off-by: Jakub Kicinski <kuba@kernel.org>
9 dayseth: bnxt: improve the timing of statsJakub Kicinski3-1/+53
Kernel selftests wait 1.25x of the promised stats refresh time (as read from ethtool -c). bnxt reports 1sec by default, but the stats update process has two steps. First device DMAs the new values, then the service task performs update in full-width SW counters. So the worst case delay is actually 2x. Note that the behavior is different for ring stats and port stats. Port stats are fetched synchronously by the service worker, so there's no risk of doubling up the delay there. The problem of stale stats impacts not only tests but real workloads which monitor egress bandwidth of a NIC. The inaccuracy causes double counting in the next cycle and spurious overload alarms. Try to read from the DMA buffer more aggressively, to mitigate timing issues between DMA and service task. The SW update should be cheap. Fixes: 51f307856b60 ("bnxt_en: Allow statistics DMA to be configurable using ethtool -C.") Reviewed-by: Michael Chan <michael.chan@broadcom.com> Link: https://patch.msgid.link/20260619191538.104165-1-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
9 dayshdlc_ppp: sync per-proto timers before freeing hdlc stateFan Wu1-2/+13
Each PPP control protocol (LCP/IPCP/IPV6CP) embedded in struct ppp registers a timer via timer_setup(). That struct ppp is the hdlc->state allocation, which detach_hdlc_protocol() frees with kfree() in both teardown paths: unregister_hdlc_device() and the re-attach inside attach_hdlc_protocol(). The ppp proto never registered a .detach callback, so detach_hdlc_protocol() performs no timer synchronization before the kfree(). The only cancel, timer_delete(&proto->timer) in ppp_cp_event(), is partial (it does not wait for a running callback) and only runs on the ->CLOSED transition; ppp_stop()/ppp_close() do not sync either. A ppp_timer callback already executing (blocked on ppp->lock) survives the kfree and then dereferences proto->state / ppp->lock in freed memory, leading to a use-after-free. Fix this by adding a .detach helper that calls timer_shutdown_sync() on every per-proto timer. detach_hdlc_protocol() invokes proto->detach(dev) before kfree(hdlc->state), so timer_shutdown_sync() now runs on both free paths. timer_shutdown_sync() is used instead of timer_delete_sync() because the keepalive path re-arms the timer through add_timer()/mod_timer() and shutdown blocks any re-activation during teardown. Initialize the per-protocol timers in ppp_ioctl() when the protocol is attached, and remove the now-redundant timer_setup() from ppp_start(), so that the timers are initialized exactly once at attach time and ppp_timer_release() never operates on uninitialized timer_list structures. attach_hdlc_protocol() uses kmalloc() (not kzalloc), so struct ppp's protos[i].timer is uninitialized garbage until the first timer_setup(); without this init-at-attach, attaching the PPP protocol without ever bringing the device up would leave timer_shutdown_sync() operating on uninitialized memory in .detach. Moving the init out of ppp_start() (which only runs on NETDEV_UP) into the attach path makes the initialization unconditional and avoids initializing the same timer_list twice. 
This bug was found by static analysis. Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Cc: stable@vger.kernel.org Signed-off-by: Fan Wu <fanwu01@zju.edu.cn> Link: https://patch.msgid.link/20260617020518.116319-1-fanwu01@zju.edu.cn Signed-off-by: Jakub Kicinski <kuba@kernel.org>
9 daysnet: ethernet: ti: icssg: guard PA stat lookupsPhilippe Schenker1-21/+28
icssg_ndo_get_stats64() unconditionally calls emac_get_stat_by_name() with FW PA stat names regardless of whether the PA stats block is present on the hardware. emac_get_stat_by_name() already guards the PA stats lookup with `if (emac->prueth->pa_stats)`; when that pointer is NULL the lookup falls through to netdev_err() and returns -EINVAL. Because ndo_get_stats64 is polled regularly by the networking stack this produces thousands of log entries of the form: icssg-prueth icssg1-eth end0: Invalid stats FW_RX_ERROR A secondary consequence is that the int(-EINVAL) return value is implicitly widened to a near-ULLONG_MAX unsigned value when accumulated into the __u64 fields of rtnl_link_stats64, silently corrupting the rx_errors, rx_dropped and tx_dropped counters reported by `ip -s link`. Every other PA-aware code path in the driver is already guarded with the same `if (emac->prueth->pa_stats)` check. Apply the same guard here. Fixes: 0d15a26b247d ("net: ti: icssg-prueth: Add ICSSG FW Stats") Signed-off-by: Philippe Schenker <philippe.schenker@impulsing.ch> Reviewed-by: Simon Horman <horms@kernel.org> Cc: danishanwar@ti.com Cc: rogerq@kernel.org Cc: linux-arm-kernel@lists.infradead.org Cc: stable@vger.kernel.org Link: https://patch.msgid.link/20260618093037.3448858-1-dev@pschenker.ch Signed-off-by: Jakub Kicinski <kuba@kernel.org>
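The counter corruption comes from ordinary C integer conversion: adding a negative int to a 64-bit unsigned accumulator wraps around. A minimal sketch, with hypothetical helpers `stat_by_name()`, `rx_errors_unguarded()` and `rx_errors_guarded()` that only imitate the shape of the lookup:

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Hypothetical lookup: a counter value, or -EINVAL when the block is absent. */
static int stat_by_name(int have_pa_stats, int value)
{
	assert(value >= 0);
	return have_pa_stats ? value : -EINVAL;
}

/* Accumulates the return value blindly; -EINVAL is converted to uint64_t. */
static uint64_t rx_errors_unguarded(int have_pa_stats, int value)
{
	uint64_t total = 0;

	total += stat_by_name(have_pa_stats, value);
	return total;
}

/* Only consults the lookup when the stats block exists. */
static uint64_t rx_errors_guarded(int have_pa_stats, int value)
{
	uint64_t total = 0;

	if (have_pa_stats)
		total += stat_by_name(have_pa_stats, value);
	return total;
}
```

Without the guard, the accumulator ends up at 2^64 minus EINVAL, which is the near-ULLONG_MAX value the message describes.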
9 daysrocker: Fix memory leak in ofdpa_port_fdb()Ziran Zhang1-0/+3
In ofdpa_port_fdb(), the hash_del() only unlinks the node from hash table, but does not free it. Fix this by adding kfree(found) after the !found == removing check, where the pointer value is no longer needed. Found by Coccinelle kfree script. Cc: <stable+noautosel@kernel.org> # rocker is a test harness, it's never loaded on production systems Signed-off-by: Ziran Zhang <zhangcoder@yeah.net> Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Link: https://patch.msgid.link/20260616013245.7098-1-zhangcoder@yeah.net Signed-off-by: Jakub Kicinski <kuba@kernel.org>
9 dayse1000e: Reconfigure PLL clock gate timeout and re-enable K1 on Meteor LakeDima Ruinskiy2-1/+17
Commit 3c7bf5af21960 ("e1000e: Introduce private flag to disable K1") disabled K1 by default on Meteor Lake and newer systems due to packet loss observed on various platforms. However, disabling K1 caused an increase in power consumption. To mitigate this, reconfigure the PLL clock gate value so that K1 can remain enabled without incurring the additional power consumption. Re-enable K1 by default, but keep the private flag to support disabling it via ethtool. Additionally, introduce a DMI quirk table, so that K1 may be disabled by default on known problematic systems. Currently, this includes the Dell Pro 16 Plus, where the issue has been reported to persist despite the changes to the PLL lock timeout. Link: https://bugzilla.kernel.org/show_bug.cgi?id=220954 Link: https://lists.osuosl.org/pipermail/intel-wired-lan/Week-of-Mon-20250623/048860.html Link: https://lists.osuosl.org/pipermail/intel-wired-lan/Week-of-Mon-20260330/054059.html Signed-off-by: Dima Ruinskiy <dima.ruinskiy@intel.com> Co-developed-by: Vitaly Lifshits <vitaly.lifshits@intel.com> Signed-off-by: Vitaly Lifshits <vitaly.lifshits@intel.com> Fixes: 3c7bf5af21960 ("e1000e: Introduce private flag to disable K1") Tested-by: Moriya Kadosh <moriyax.kadosh@intel.com> Tested-by: Todd Brandt <todd.e.brandt@linux.intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
9 daysi40e: Fix i40e_debug() to use struct i40e_hw argumentMohamed Khalfella1-1/+1
i40e_debug() macro takes struct i40e_hw *h as first argument. But the macro body uses hw instead of h. This has been working so far because hw happens to be the name of the variable in the context where the macro is expanded. Fix the macro to use the passed argument. Fixes: 5dfd37c37a44 ("i40e: Split i40e_osdep.h") Signed-off-by: Mohamed Khalfella <mkhalfella@purestorage.com> Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com> Reviewed-by: Paul Menzel <pmenzel@molgen.mpg.de> Tested-by: Alexander Nowlin <alexander.nowlin@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
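The same class of bug can be reproduced with a toy macro. Everything here (`struct hw_ctx`, `DEBUG_ID_BUGGY`, `DEBUG_ID_FIXED`) is hypothetical and much simpler than the real i40e_debug(); it only shows how a macro body that names `hw` instead of its parameter silently binds to the caller's local variable.

```c
#include <assert.h>

struct hw_ctx {
	int id;
};

/* Buggy: ignores parameter h and captures whatever `hw` is in scope. */
#define DEBUG_ID_BUGGY(h) (hw->id)
/* Fixed: uses the argument that was actually passed. */
#define DEBUG_ID_FIXED(h) ((h)->id)

static int id_buggy(struct hw_ctx *hw, struct hw_ctx *other)
{
	assert(hw && other);
	return DEBUG_ID_BUGGY(other); /* reads hw->id, not other->id */
}

static int id_fixed(struct hw_ctx *hw, struct hw_ctx *other)
{
	assert(hw && other);
	return DEBUG_ID_FIXED(other);
}
```

The buggy form compiles only because a variable named `hw` happens to exist at the expansion site, which is why the problem stayed hidden.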
9 daysice: dpll: fix memory leak in ice_dpll_init_info error pathsZhaoJinming1-6/+10
Several error return paths in ice_dpll_init_info() directly return without freeing previously allocated resources, causing memory leaks: - When de->input_prio allocation fails, d->inputs is leaked - When dp->input_prio allocation fails, d->inputs and de->input_prio are leaked - When ice_get_cgu_rclk_pin_info() fails, all previously allocated inputs/outputs/input_prio are leaked - When ice_dpll_init_pins_info(RCLK_INPUT) fails, same resources are leaked Fix this by jumping to the deinit_info label which properly calls ice_dpll_deinit_info() to free all allocated resources. Fixes: d7999f5ea64b ("ice: implement dpll interface to control cgu") Signed-off-by: ZhaoJinming <zhaojinming@uniontech.com> Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com> Tested-by: Rinitha S <sx.rinitha@intel.com> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
9 daysice: dpll: set pointers to NULL after kfree in ice_dpll_deinit_infoZhaoJinming1-0/+4
ice_dpll_deinit_info() calls kfree() on several pf->dplls fields (inputs, outputs, eec.input_prio, pps.input_prio) but does not set the pointers to NULL afterward. This leaves dangling pointers in the pf->dplls structure. While not currently exploitable through existing code paths, this is unsafe because: 1. If ice_dpll_init_info() is called again after a deinit (e.g. during driver recovery), and a subsequent allocation within init fails, the error path will jump to deinit_info and call ice_dpll_deinit_info() again. Since some pointers still hold the old freed addresses, this would result in a double-free. 2. Any future code that checks these pointers before use or after free would be unprotected against use-after-free. Follow the common kernel convention of setting pointers to NULL after kfree() so that: - kfree(NULL) is a safe no-op, preventing double-free - NULL checks on these pointers become meaningful This is a preparatory fix for a subsequent patch that routes additional error paths in ice_dpll_init_info() to the deinit_info label. Fixes: d7999f5ea64b ("ice: implement dpll interface to control cgu") Signed-off-by: ZhaoJinming <zhaojinming@uniontech.com> Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com> Tested-by: Rinitha S <sx.rinitha@intel.com> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
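The re-init scenario in point 1 can be modelled in userspace with free() in place of kfree(). The names below (`struct dplls`, `init_info()`, `deinit_info()`, `fail_step`) are hypothetical and the structure is reduced to two pointers; this is a sketch of the convention, not the ice code.

```c
#include <assert.h>
#include <stdlib.h>

struct dplls {
	int *inputs;
	int *outputs;
};

/* Free everything and reset the pointers so a second call is harmless. */
static void deinit_info(struct dplls *d)
{
	assert(d);
	free(d->inputs);
	free(d->outputs);
	d->inputs = NULL;
	d->outputs = NULL;
}

/*
 * fail_step simulates an allocation failure: 1 fails the first
 * allocation, 2 fails the second, 0 fails nothing.
 */
static int init_info(struct dplls *d, int fail_step)
{
	assert(d);

	d->inputs = fail_step == 1 ? NULL : calloc(4, sizeof(*d->inputs));
	if (!d->inputs)
		goto deinit;

	d->outputs = fail_step == 2 ? NULL : calloc(4, sizeof(*d->outputs));
	if (!d->outputs)
		goto deinit;

	return 0;

deinit:
	deinit_info(d);
	return -1;
}
```

If `deinit_info()` did not reset the pointers, a failed re-init at step 1 would pass the stale `outputs` address to free() a second time; with the reset, that call receives NULL and does nothing.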
9 daysice: call netif_keep_dst() once when entering switchdev modeMarcin Szycik1-2/+2
netif_keep_dst() only needs to be called once for the uplink VSI, not once for each port representor. Move it from ice_eswitch_setup_repr() to ice_eswitch_enable_switchdev(). Fixes: defd52455aee ("ice: do Tx through PF netdev in slow-path") Signed-off-by: Marcin Szycik <marcin.szycik@intel.com> Signed-off-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com> Reviewed-by: Paul Menzel <pmenzel@molgen.mpg.de> Tested-by: Patryk Holda <patryk.holda@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
9 daysice: fix ice_init_link() error return preventing probePaul Greenwalt1-11/+5
ice_init_link() can return an error status from ice_update_link_info() or ice_init_phy_user_cfg(), causing probe to fail. An incorrect NVM update procedure can result in link/PHY errors, and the recommended resolution is to update the NVM using the correct procedure. If the driver fails probe due to link errors, the user cannot update the NVM to recover. The link/PHY errors logged are non-fatal: they are already annotated as 'not a fatal error if this fails'. Since none of the errors inside ice_init_link() should prevent probe from completing, convert it to void and remove the error check in the caller. All failures are already logged; callers have no meaningful recovery path for link init errors. Fixes: 5b246e533d01 ("ice: split probe into smaller functions") Cc: stable@vger.kernel.org Signed-off-by: Paul Greenwalt <paul.greenwalt@intel.com> Signed-off-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com> Reviewed-by: Simon Horman <horms@kernel.org> Tested-by: Alexander Nowlin <alexander.nowlin@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
9 daysice: fix AQ error code comparison in ice_set_pauseparam()Lukasz Czapnik2-5/+8
Fix unreachable code: the conditionals in ice_set_pauseparam() used the bitwise-AND operator suggesting aq_failures is a bitmap, but it is actually an enum, making the third condition logically unreachable. Replace the if-else ladder with a switch statement. Also move the aq_failures initialization to the variable declaration and remove the redundant zeroing from ice_set_fc(). Fixes: fcea6f3da546 ("ice: Add stats and ethtool support") Signed-off-by: Lukasz Czapnik <lukasz.czapnik@intel.com> Signed-off-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com> Reviewed-by: Simon Horman <horms@kernel.org> Tested-by: Rinitha S <sx.rinitha@intel.com> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
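Why a bitwise-AND ladder over a sequential enum hides a branch can be shown with a toy enum. The enumerator names and values below are assumptions chosen for illustration (sequential values 0..3), not copied from the driver.

```c
#include <assert.h>

/* Hypothetical failure codes with sequential values 0, 1, 2, 3. */
enum fc_fail {
	FC_FAIL_NONE,
	FC_FAIL_GET,
	FC_FAIL_SET,
	FC_FAIL_UPDATE,
};

/* Treats the enum as a bitmap: UPDATE (3) already matches "& GET" (1). */
static int classify_bitwise(enum fc_fail f)
{
	assert((int)f <= FC_FAIL_UPDATE);

	if (f & FC_FAIL_GET)
		return 1;
	else if (f & FC_FAIL_SET)
		return 2;
	else if (f & FC_FAIL_UPDATE)
		return 3; /* never reached for any of the four values */
	return 0;
}

/* Compares whole values, so every failure code gets its own branch. */
static int classify_switch(enum fc_fail f)
{
	switch (f) {
	case FC_FAIL_GET:
		return 1;
	case FC_FAIL_SET:
		return 2;
	case FC_FAIL_UPDATE:
		return 3;
	default:
		return 0;
	}
}
```

Under these assumed values, the ladder reports an UPDATE failure as a GET failure, while the switch distinguishes all three.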
9 daysice: fix FDIR CTRL VSI resource leak in ice_reset_all_vfs()Dawid Osuchowski1-1/+1
Resetting all VFs causes resource leak on VFs with FDIR filters enabled as CTRL VSIs are only invalidated and not freed. Fix by using ice_vf_ctrl_vsi_release() instead of ice_vf_ctrl_invalidate_vsi() which aligns behavior with the ice_reset_vf() function. Reproduction: echo 1 > /sys/class/net/$pf/device/sriov_numvfs ethtool -N $vf flow-type ether proto 0x9000 action 0 echo 1 > /sys/class/net/$pf/device/reset Fixes: da62c5ff9dcd ("ice: Add support for per VF ctrl VSI enabling") Signed-off-by: Dawid Osuchowski <dawid.osuchowski@linux.intel.com> Signed-off-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com> Reviewed-by: Simon Horman <horms@kernel.org> Tested-by: Rafal Romanowski <rafal.romanowski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
9 daysMerge tag 'usb-7.2-rc1' of ↵Linus Torvalds1-10/+10
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb Pull USB and Thunderbolt driver updates from Greg KH: "Here is the big set of USB and Thunderbolt driver changes for 7.2-rc1. Lots of little stuff in here, major highlights include: - USB4STREAM support for Thunderbolt devices. A new way to send "raw" data very quickly over a USB4 connection to another system directly - Other thunderbolt updates and changes to make the stream code work - xhci driver updates and additions - typec driver updates and additions - usb gadget driver updates and fixes for reported issues - zh_CN documentation translation of the USB documentation - usb-serial driver updates - dts cleanups for some USB platforms - other minor USB driver updates and tweaks All of these have been in linux-next for over a week with no reported issues, most of them for many many weeks" * tag 'usb-7.2-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb: (131 commits) usb: ucsi: huawei_gaokun: support mode switching thunderbolt: debugfs: Fix sideband write size check thunderbolt: debugfs: Fix margining error counter buffer leak usb: host: xhci-rcar: Split R-Car Gen2 and Gen3 .plat_start() handling usb: host: xhci-rcar: Remove SET_XHCI_PLAT_PRIV_FOR_RCAR() macro usb: xhci: allocate internal DCBAA mirror dynamically usb: xhci: allocate DCBAA based on host controller max slots usb: xhci: refactor DCBAA struct xhci: Prevent queuing new commands if xhci is inaccessible xhci: dbc: detect and recover hung DbC during enumeraton xhci: dbc: add timestamps to DbC state changes in a new helper. 
xhci: dbc: add helper to set and clear DbC DCE enable bit xhci: dbc: serialize enabling and disabling dbc xhci: dbc: Fix sysfs ABI Documentation for xhci dbc states usb: xhci: Improve Soft Retries after short transfers usb: xhci: Remove isochronous URB_SHORT_NOT_OK handling usb: xhci: Remove skip_isoc_td() usb: xhci: Simplify xhci_quiesce() usb: xhci: remove legacy 'num_trbs_free' tracking usb: xhci: fix typo in xhci_set_port_power() comment ...
9 daysMerge tag 'tty-7.2-rc1' of ↵Linus Torvalds1-2/+2
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty Pull tty / serial driver updates from Greg KH: "Here is the big set of TTY and Serial driver updates for 7.2-rc1. Overall we end up removing more code than added, due to an obsolete synclink_gt driver being removed from the tree, always a nice thing to see happen. Other than that driver removal, major things included in here are: - max310x serial driver updates and fixes - 8250 driver updates and rework in places to make it more "modern" - dts file updates - serial driver core tweaks and updates - vt code cleanups - vc_screen crash fixes - other minor driver updates and cleanups All of these have been in linux-next for well over a week with no reported issues" * tag 'tty-7.2-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty: (49 commits) serial: 8250_pci: Don't specify conflicting values to pci_device_id members vc_screen: fix null-ptr-deref in vcs_notifier() during concurrent vcs_write serial: qcom_geni: Fix RX DMA stall when SE_DMA_RX_LEN_IN is zero vt: merge ucs_is_zero_width()/ucs_is_double_width() into ucs_get_width() serial: 8250: fix possible ISR soft lockup dt-bindings: serial: rs485: remove deprecated .txt binding stub serial: qcom-geni: trace: Add tracepoint support for Qualcomm GENI serial tty: serial: Use named initializers for arrays of i2c_device_data serial: 8250_dw: remove clock-notifier infrastructure serial: 8250_dw: unregister 8250 port if clk_notifier_register() fails amba/serial: amba-pl011: Bring back zx29 UART support serial: 8250: Add support for console flow control serial: 8250: Check LSR timeout on console flow control serial: 8250: Set cons_flow on port registration tty: serial: 8250: protect against NULL uart->port.dev in register arm64: dts: add support for A9 based Amlogic BY401 dt-bindings: arm: amlogic: add A311Y3 support serial: max310x: fix compile errors if CONFIG_SPI_MASTER is disabled serial: qcom-geni: Avoid probing debug console UART without console 
support serial: max310x: add comments for PLL limits ...
10 daysdpaa2-switch: do not accept VLAN uppers while bridgedIoana Ciornei1-0/+8
The dpaa2-switch driver does not support VLAN uppers while its ports are bridged. This scenario was meant to be prevented by rejecting a bridge join while VLAN uppers exist, but the reverse order was still possible. This patch adds a check so that the dpaa2-switch also does not accept VLAN uppers while bridged. Fixes: f48298d3fbfa ("staging: dpaa2-switch: move the driver out of staging") Signed-off-by: Ioana Ciornei <ioana.ciornei@nxp.com> Link: https://patch.msgid.link/20260618092813.432535-2-ioana.ciornei@nxp.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
10 daysnet: airoha: Fix skb->priority underflow in airoha_dev_select_queue()Wayen Yan1-1/+1
In airoha_dev_select_queue(), the expression: queue = (skb->priority - 1) % AIROHA_NUM_QOS_QUEUES; implicitly converts to unsigned arithmetic: when skb->priority is 0 (the default for unclassified traffic), (0u - 1u) wraps to UINT_MAX, and UINT_MAX % 8 = 7, routing default best-effort packets to the highest-priority QoS queue. This causes QoS inversion where the majority of traffic on a PON gateway starves actual high-priority flows (VoIP, gaming, etc.). The "- 1" offset was a leftover from the ETS offload implementation that has since been removed. The correct mapping is a direct modulo: queue = skb->priority % AIROHA_NUM_QOS_QUEUES; This maps priority 0 → queue 0 (lowest), priority 7 → queue 7 (highest), with higher priorities wrapping around. This is the standard Linux sk_prio → HW queue mapping used by other drivers. Fixes: 2b288b81560b ("net: airoha: Introduce ndo_select_queue callback") Link: https://lore.kernel.org/netdev/178185573207.2378135.3729126358670287878@gmail.com/ Acked-by: Lorenzo Bianconi <lorenzo@kernel.org> Reviewed-by: Joe Damato <joe@dama.to> Signed-off-by: Wayen Yan <win847@gmail.com> Link: https://patch.msgid.link/178194366700.2485734.5368768965976693502@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
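The wraparound is easy to confirm in userspace. In this sketch `queue_old()` and `queue_new()` are hypothetical helpers that take the queue count as a parameter; they reproduce only the two arithmetic expressions quoted above.

```c
#include <assert.h>
#include <stdint.h>

#define NUM_QOS_QUEUES 8

/* Old mapping: the unsigned subtraction wraps when priority is 0. */
static uint32_t queue_old(uint32_t priority, uint32_t nq)
{
	assert(nq != 0);
	return (priority - 1) % nq;
}

/* New mapping: direct modulo, priority 0 lands on queue 0. */
static uint32_t queue_new(uint32_t priority, uint32_t nq)
{
	assert(nq != 0);
	return priority % nq;
}
```

`queue_old(0, NUM_QOS_QUEUES)` evaluates to 7 because UINT32_MAX % 8 is 7, whereas `queue_new(0, NUM_QOS_QUEUES)` is 0.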
10 daysnet: emac: Fix NULL pointer dereference in emac_probeRosen Penev1-7/+6
Move devm_request_irq() after devm_platform_ioremap_resource() so that dev->emacp is mapped before the interrupt handler can fire. An early interrupt hitting emac_irq() would dereference the NULL dev->emacp and crash. Also remove redundant error message. devm_platform_ioremap_resource() already returns an error message with dev_err_probe(). Fixes: dcc34ef7c834 ("net: ibm: emac: manage emac_irq with devm") Signed-off-by: Rosen Penev <rosenp@gmail.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Link: https://patch.msgid.link/20260618023405.415644-1-rosenp@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
10 daysMerge tag 'ieee802154-for-net-next-2026-06-20' of ↵Jakub Kicinski1-3/+6
git://git.kernel.org/pub/scm/linux/kernel/git/wpan/wpan-next Stefan Schmidt says: ==================== pull-request: ieee802154-next 2026-06-20 An overdue pull request for ieee802154, catching up on all the AI found issues at last. Shitalkumar Gandhi fixed problems in the ca8210 driver for cases where we could have a leak or a pointer truncation. Robertus Diawan Chris made sure we do not overwrite the return code when associating. Michael Bommarito worked on properly gating our netlink API use in the llsec security context. Ivan Abramov cleaned up the netns cases as he did in other subsystems. Doruk Tan Ozturk ensures we have the correct skb ready in crypto operations (to avoid a silent overwrite). Aleksandr Nogikh fixed a kernel-infoleak detected by syzbot. * tag 'ieee802154-for-net-next-2026-06-20' of git://git.kernel.org/pub/scm/linux/kernel/git/wpan/wpan-next: ieee802154: allow legacy LLSEC ADD/DEL ops to pass strict validation ieee802154: admin-gate legacy LLSEC dump operations mac802154: Prevent overwrite return code in mac802154_perform_association() ieee802154: fix kernel-infoleak in dgram_recvmsg() mac802154: llsec: add skb_cow_data() before in-place crypto ieee802154: ca8210: fix pointer truncation in kfifo on 64-bit ieee802154: ca8210: fix cas_ctl leak on spi_async failure ieee802154: Remove WARN_ON() in cfg802154_pernet_exit() ieee802154: Avoid calling WARN_ON() on -ENOMEM in cfg802154_switch_netns() ieee802154: Restore initial state on failed device_rename() in cfg802154_switch_netns() ==================== Link: https://patch.msgid.link/20260620174903.1010671-1-stefan@datenfreihafen.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
10 daysocteontx2-pf: mcs: Fix mcs resources free on PF shutdownGeetha sowjanya1-2/+7
On PF shutdown, the current driver frees mcs hardware resources even though no mcs resources are allocated to it. This patch checks the mcs resource status and sends the mailbox message to free them only if resources are allocated. Fixes: c54ffc73601c ("octeontx2-pf: mcs: Introduce MACSEC hardware offloading") Signed-off-by: Geetha sowjanya <gakula@marvell.com> Signed-off-by: Subbaraya Sundeep <sbhatta@marvell.com> Link: https://patch.msgid.link/1781636420-19816-3-git-send-email-sbhatta@marvell.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
10 daysocteontx2-pf: Clear stats of all resources when freeing resourcesSubbaraya Sundeep1-0/+1
When all MCS resources mapped to a PF are being freed then clear stats of all those resources too. Fixes: 815debbbf7b5 ("octeontx2-pf: mcs: Clear stats before freeing resource") Signed-off-by: Subbaraya Sundeep <sbhatta@marvell.com> Link: https://patch.msgid.link/1781636420-19816-2-git-send-email-sbhatta@marvell.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
10 daysocteontx2-af: mcs: Fix unsupported secy stats readGeetha sowjanya2-4/+5
The secy control stats counter doesn't exist on the CNF10KB platform. Skip reading this register on CNF10KB silicon while fetching secy stats. Fixes: 9312150af8da ("octeontx2-af: cn10k: mcs: Support for stats collection") Signed-off-by: Geetha sowjanya <gakula@marvell.com> Signed-off-by: Subbaraya Sundeep <sbhatta@marvell.com> Link: https://patch.msgid.link/1781636420-19816-1-git-send-email-sbhatta@marvell.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
10 daysocteontx2-af: npc: cn20k: fix NPC defragRatheesh Kannoth1-3/+6
npc_defrag_alloc_free_slots() always passed NPC_MCAM_KEY_X2 into __npc_subbank_alloc(), which must match sb->key_type, so defrag never allocated replacement slots on X4 banks. Pass the subbank key type for bank 0, and only extend the search into bank 1 for X2 (X4 MCAM indices are confined to b0b..b0t). Fixes: 645c6e3c1999 ("octeontx2-af: npc: cn20k: virtual index support") Signed-off-by: Ratheesh Kannoth <rkannoth@marvell.com> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/20260617102149.1309913-1-rkannoth@marvell.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
10 daysnet: ethernet: mtk_ppe: Fix rhashtable leak in mtk_ppe_init error pathsWayen Yan1-2/+2
In mtk_ppe_init(), when accounting is enabled, the error paths for dmam_alloc_coherent(mib) and devm_kzalloc(acct) failures return NULL directly, bypassing the err_free_l2_flows label that destroys the rhashtable initialized earlier. While this leak only occurs during probe (not runtime) and the leaked memory is minimal (an empty rhash table), fixing it ensures proper error path cleanup consistency. Fix by changing the two return NULL statements to goto err_free_l2_flows. Fixes: 603ea5e7ffa7 ("net: ethernet: mtk_eth_soc: fix memory leak in error path") Signed-off-by: Wayen Yan <win847@gmail.com> Acked-by: Lorenzo Bianconi <lorenzo@kernel.org> Link: https://patch.msgid.link/178167550101.2217645.14579307712717502425@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
10 daysnet: marvell: prestera: initialize err in prestera_port_sfp_bindRuoyu Wang1-1/+1
prestera_port_sfp_bind() returns err after walking the ports node. If no child node matches the port's front-panel id, err is never assigned. Initialize err to 0 because absence of a matching optional port device tree node is not an error. In that case no phylink is created and port creation should continue with port->phy_link left NULL. Errors from malformed matched nodes and phylink_create() still propagate. Fixes: 52323ef75414 ("net: marvell: prestera: add phylink support") Signed-off-by: Ruoyu Wang <ruoyuw560@gmail.com> Reviewed-by: Maxime Chevallier <maxime.chevallier@bootlin.com> Acked-by: Elad Nachman <enachman@marvell.com> Link: https://patch.msgid.link/20260617193228.1653582-1-ruoyuw560@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
12 daysieee802154: ca8210: fix pointer truncation in kfifo on 64-bitShitalkumar Gandhi1-2/+4
ca8210_test_int_driver_write() and ca8210_test_int_user_read() exchange a kmalloc'd buffer pointer through a struct kfifo, but pass a literal '4' as the byte count to kfifo_in()/kfifo_out(). This is correct on 32-bit (pointer = 4 bytes), but on 64-bit only the low 4 bytes of the 8-byte pointer are written into the FIFO. The reader then reads back 4 bytes into an 8-byte local pointer variable, leaving the upper 4 bytes uninitialized stack data. The first dereference of the reconstructed pointer (fifo_buffer[1]) accesses an arbitrary kernel address and generally results in an oops. Use sizeof(fifo_buffer) so the byte count matches pointer width on every architecture. The driver has no architecture restriction in Kconfig, so any 64-bit build with CONFIG_IEEE802154_CA8210_DEBUGFS=y is exposed. Issue has been latent since the driver was added in 2017 because it is most commonly deployed on 32-bit MCUs. Found via a custom Coccinelle semantic patch hunting for short-byte kfifo I/O on byte-mode kfifos used to shuttle pointers. Fixes: ded845a781a5 ("ieee802154: Add CA8210 IEEE 802.15.4 device driver") Cc: stable@vger.kernel.org Signed-off-by: Shitalkumar Gandhi <shitalkumar.gandhi@cambiumnetworks.com> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://lore.kernel.org/20260520105750.30144-1-shitalkumar.gandhi@cambiumnetworks.com Signed-off-by: Stefan Schmidt <stefan@datenfreihafen.org>
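The truncation described above can be reproduced outside the kernel. The following is a minimal userspace sketch, not the driver's code: `struct byte_fifo`, `fifo_in()` and `fifo_out()` are hypothetical stand-ins for a byte-mode kfifo and its `kfifo_in()`/`kfifo_out()` calls. It shows a pointer surviving the round trip when the byte count is `sizeof(fifo_buffer)` rather than a literal 4.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a byte-mode kfifo: a flat byte array. */
struct byte_fifo {
	unsigned char data[64];
	size_t len;
};

/* Append n bytes, loosely mirroring kfifo_in(); returns bytes stored. */
static size_t fifo_in(struct byte_fifo *f, const void *src, size_t n)
{
	assert(f->len + n <= sizeof(f->data));
	memcpy(f->data + f->len, src, n);
	f->len += n;
	return n;
}

/* Remove n bytes from the front, loosely mirroring kfifo_out(). */
static size_t fifo_out(struct byte_fifo *f, void *dst, size_t n)
{
	assert(n <= f->len);
	memcpy(dst, f->data, n);
	memmove(f->data, f->data + n, f->len - n);
	f->len -= n;
	return n;
}

/* Round-trip a pointer through the FIFO using the full pointer width. */
static char *roundtrip_pointer(char *payload)
{
	struct byte_fifo f = { .len = 0 };
	char *fifo_buffer = payload;
	char *out = NULL;

	/* sizeof(fifo_buffer) is 4 on 32-bit and 8 on 64-bit targets. */
	fifo_in(&f, &fifo_buffer, sizeof(fifo_buffer));
	fifo_out(&f, &out, sizeof(out));
	return out;
}
```

Passing a literal 4 to both helpers on a 64-bit build would copy only half of the pointer, which is the failure mode the commit message describes.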
12 daysieee802154: ca8210: fix cas_ctl leak on spi_async failureShitalkumar Gandhi1-1/+2
ca8210_spi_transfer() allocates cas_ctl with kzalloc_obj(GFP_ATOMIC) and relies entirely on the SPI completion callback ca8210_spi_transfer_complete() to free it. The spi_async() API only invokes the completion callback on successful submission. On failure it returns a negative error code without ever queuing the callback, which leaves cas_ctl and its embedded spi_message and spi_transfer orphaned. Every kfree(cas_ctl) in the driver is inside the completion callback, so there is no other reclamation path. ca8210_spi_transfer() is called from ca8210_spi_exchange(), the interrupt handler ca8210_interrupt_handler(), and from the retry path inside the completion callback itself. The exchange and interrupt handler paths loop on -EBUSY, so under sustained SPI bus contention every retry iteration leaks a fresh cas_ctl (~600 bytes per occurrence). Fix it by freeing cas_ctl on the spi_async() error path. While here, correct the misleading error string: the function calls spi_async(), not spi_sync(). Fixes: ded845a781a5 ("ieee802154: Add CA8210 IEEE 802.15.4 device driver") Cc: stable@vger.kernel.org Signed-off-by: Shitalkumar Gandhi <shitalkumar.gandhi@cambiumnetworks.com> Reviewed-by: Miquel Raynal <miquel.raynal@bootlin.com> Link: https://lore.kernel.org/20260421073259.2259783-1-shitalkumar.gandhi@cambiumnetworks.com Signed-off-by: Stefan Schmidt <stefan@datenfreihafen.org>
13 dayseth: fbnic: take netif_addr_lock_bh() around rx mode address programmingDaniel Zahka3-1/+12
When __fbnic_set_rx_mode() is called from contexts other than .ndo_set_rx_mode_async(), the uc and mc addr lists are accessed without the addr lock that __hw_addr_sync_dev() and __hw_addr_unsync_dev() require. Wrap these unprotected accesses with netif_addr_lock_bh(). fbnic_clear_rx_mode() has similar issues. Fixes: eb690ef8d1c2 ("eth: fbnic: Add L2 address programming") Signed-off-by: Daniel Zahka <daniel.zahka@gmail.com> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/20260617-linux-fbnic-hwaddr-v1-1-3f9f5dee7f99@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
13 daysnet: wangxun: don't advertise IFF_SUPP_NOFCSRongguang Wei2-2/+0
Like commit a24162f18825 ("i40e: don't advertise IFF_SUPP_NOFCS"), ngbe and txgbe also advertise IFF_SUPP_NOFCS, allowing users to use the SO_NOFCS socket option. But the drivers do not check skb->no_fcs, so this option is silently ignored. With this change, send() fails with -EPROTONOSUPPORT when the SO_NOFCS option is set on an AF_PACKET socket. Signed-off-by: Rongguang Wei <weirongguang@kylinos.cn> Link: https://patch.msgid.link/20260617092854.133992-1-clementwei90@163.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
13 daysnet: ena: clean up XDP TX queues when regular TX setup failsDawei Feng1-2/+21
create_queues_with_size_backoff() creates XDP TX queues before setting up the regular TX path. If the subsequent allocation or creation of regular TX queues fails, the error handling paths omit the teardown of the XDP TX queues, leading to a resource leak. Fix this by explicitly destroying the XDP TX queue subset at the two missing failure points. The bug was first flagged by an experimental analysis tool we are developing for kernel memory-management bugs while analyzing v6.13-rc1. The tool is still under development and is not yet publicly available. Manual inspection confirms that the bug is still present in v7.1-rc7. An x86_64 allyesconfig build showed no new warnings. As we do not have an ENA device to test with, no runtime testing could be performed. Fixes: 548c4940b9f1 ("net: ena: Implement XDP_TX action") Cc: stable@vger.kernel.org Signed-off-by: Dawei Feng <dawei.feng@seu.edu.cn> Reviewed-by: Arthur Kiyanovski <akiyano@amazon.com> Tested-by: Arthur Kiyanovski <akiyano@amazon.com> Link: https://patch.msgid.link/20260616142424.4005130-1-dawei.feng@seu.edu.cn Signed-off-by: Jakub Kicinski <kuba@kernel.org>
13 daysgve: fix header buffer corruption with header-split and HW-GROAnkit Garg1-10/+18
The DQO RX datapath programs a per-buffer-queue-descriptor header_buf_addr at post time and reads the split header back at completion time. Both the post and the read currently index the header buffer by queue position rather than by the buffer's identity: - post (gve_rx_post_buffers_dqo): header_buf_addr is computed from bufq->tail - read (gve_rx_dqo): the header is read from desc_idx (the completion queue head index) This relies on the buffer-queue index and the completion-queue index being equal for the start of every packet, i.e. on the device consuming posted buffers and returning completions in the exact same order. That assumption does not hold once HW-GRO is enabled with multiple flows: coalesced segments are accepted and completed in an order that may differ from the order buffers were posted, and segments from different flows may interleave. That results in two problems: 1. Wrong header slot on read. Because the read offset is derived from the completion index (desc_idx) while the device wrote the header to the address programmed for the buffer's buf_id, the driver can copy a header belonging to a different packet. This shows up as throughput drop (about 30% drop and large numbers of TCP retransmissions) with header-split and HW-GRO both enabled and many streams. 2. Header buffer reused while still owned by the device. The driver advances bufq->head by one per completion and re-posts buffers based on that. Arrival of N RX completions only guarantees that at least N RX buffer descriptors have been read by the device. It does not guarantee that the device has relinquished the ownership of all the buffers corresponding to those N descriptors. With out-of-order completions (e.g. 
the completion for a packet copied into buffer N arrives before the completion for a packet copied into buffer N-1), the driver can re-post and overwrite a header buffer that the device is still going to write into, corrupting the header of a packet whose completion has not yet been processed. Fix both issues by indexing the header buffer by buf_id on both the post and read paths. Reading from buf_id's slot is therefore always correct regardless of completion ordering (fixes problem 1). Indexing by buf_id also ties each header slot to the lifetime of its buffer state. A buffer state is only returned to the free/recycle lists when its own completion (buf_id) is processed, so its header slot can only be re-posted after the device is done with it. This makes header slot reuse safe under out-of-order completions (fixes problem 2). Allocate (gve_rx_alloc_hdr_bufs) and free (gve_rx_free_hdr_bufs) the header buffers based on num_buf_states to match the buf_id indexing. Cc: stable@vger.kernel.org Fixes: 5e37d8254e7f ("gve: Add header split data path") Signed-off-by: Ankit Garg <nktgrg@google.com> Reviewed-by: Praveen Kaligineedi <pkaligineedi@google.com> Reviewed-by: Jordan Rhee <jordanrhee@google.com> Reviewed-by: Harshitha Ramamurthy <hramamurthy@google.com> Signed-off-by: Joshua Washington <joshwash@google.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Link: https://patch.msgid.link/20260617013208.3781453-1-joshwash@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
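The difference between indexing by queue position and indexing by buf_id can be illustrated with a small userspace model. This is a sketch under stated assumptions, not the gve data path: `struct hdr_bufs`, `struct rx_compl`, the helper names, and the slot sizes are all hypothetical and only mimic the idea that the device writes each header into the slot tied to the buffer's buf_id.

```c
#include <assert.h>
#include <string.h>

#define NUM_BUF_STATES 4   /* hypothetical; stands in for num_buf_states */
#define HDR_SLOT_SIZE  8   /* hypothetical header slot size */

struct hdr_bufs {
	char slot[NUM_BUF_STATES][HDR_SLOT_SIZE];
};

/* Hypothetical completion entry: only the buffer identity matters here. */
struct rx_compl {
	unsigned int buf_id;
};

/* Device side: the header lands in the slot programmed for buf_id. */
static void device_write_hdr(struct hdr_bufs *h, unsigned int buf_id,
			     const char *hdr)
{
	assert(buf_id < NUM_BUF_STATES);
	strncpy(h->slot[buf_id], hdr, HDR_SLOT_SIZE - 1);
	h->slot[buf_id][HDR_SLOT_SIZE - 1] = '\0';
}

/* Old scheme: read the slot at the completion queue position. */
static const char *hdr_by_desc_idx(const struct hdr_bufs *h,
				   const struct rx_compl *cq,
				   unsigned int desc_idx)
{
	(void)cq; /* the completion's buf_id is ignored, which is the bug */
	assert(desc_idx < NUM_BUF_STATES);
	return h->slot[desc_idx];
}

/* Fixed scheme: read the slot named by the completion's buf_id. */
static const char *hdr_by_buf_id(const struct hdr_bufs *h,
				 const struct rx_compl *cq,
				 unsigned int desc_idx)
{
	assert(cq[desc_idx].buf_id < NUM_BUF_STATES);
	return h->slot[cq[desc_idx].buf_id];
}
```

With two buffers posted and completions arriving in the order buf_id 1, then buf_id 0, the position-based lookup for the first completion returns the header written for buffer 0, while the buf_id-based lookup returns the header that belongs to the completed packet.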
13 daysnet: thunderbolt: Fix frags[] overflow by bounding frame_countMaoyi Xie1-2/+6
tbnet_poll() assembles a multi-frame ThunderboltIP packet into one skb. The first frame goes into the skb linear area and every further frame is added as a page fragment. skb_add_rx_frag(skb, skb_shinfo(skb)->nr_frags, page, hdr_size, frame_size, TBNET_RX_PAGE_SIZE - hdr_size); A packet of frame_count frames therefore ends up with frame_count - 1 fragments. tbnet_check_frame() only bounds the peer supplied frame_count to TBNET_RING_SIZE / 4 (64), which is far above MAX_SKB_FRAGS (17 by default). A peer that sends a packet of 19 or more small frames pushes nr_frags past MAX_SKB_FRAGS, so skb_add_rx_frag() writes past skb_shinfo()->frags[] and corrupts memory after the shared info. Tighten the start of packet bound to MAX_SKB_FRAGS + 1 so a packet can never produce more fragments than frags[] can hold. This matches the recent skb frags overflow fixes in other receive paths, for example f0813bcd2d9d ("net: wwan: t7xx: fix potential skb->frags overflow in RX path") and 600dc40554dc ("net: usb: cdc-phonet: fix skb frags[] overflow in rx_complete()"). Fixes: e69b6c02b4c3 ("net: Add support for networking over Thunderbolt cable") Cc: stable@vger.kernel.org Signed-off-by: Maoyi Xie <maoyixie.tju@gmail.com> Acked-by: Mika Westerberg <mika.westerberg@linux.intel.com> Link: https://patch.msgid.link/178163152194.2486768.14724194232649760778@maoyixie.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
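The arithmetic behind the tightened bound can be checked in isolation. The sketch below is illustrative only: the `SKETCH_` constants mirror the values quoted in the commit message (MAX_SKB_FRAGS of 17 by default, TBNET_RING_SIZE / 4 equal to 64, hence an assumed ring size of 256), and the helper names and the non-zero check are hypothetical rather than taken from the driver.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_MAX_SKB_FRAGS 17u   /* default cited in the commit message */
#define SKETCH_RING_SIZE     256u  /* assumed, so that RING_SIZE / 4 == 64 */

/* Old bound: accepts up to 64 frames per packet. */
static bool frame_count_ok_old(uint32_t frame_count)
{
	return frame_count != 0 && frame_count <= SKETCH_RING_SIZE / 4;
}

/* New bound: one linear frame plus at most MAX_SKB_FRAGS fragments. */
static bool frame_count_ok_new(uint32_t frame_count)
{
	return frame_count != 0 && frame_count <= SKETCH_MAX_SKB_FRAGS + 1;
}

/* The first frame is linear; every further frame becomes a fragment. */
static uint32_t frags_needed(uint32_t frame_count)
{
	assert(frame_count > 0);
	return frame_count - 1;
}
```

A packet of 19 frames passes the old check yet needs 18 fragments, one more than the 17 slots assumed here, whereas the new check rejects it and still admits the 18-frame maximum.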
13 daysnetconsole: don't drop the last byte of a full-sized messageBreno Leitao1-5/+7
nt->buf is exactly MAX_PRINT_CHUNK bytes, but scnprintf() reserves one byte for its NUL terminator, so a non-fragmented payload of exactly MAX_PRINT_CHUNK loses its last byte (emitted as a stray NUL in the release path). Grow nt->buf to MAX_PRINT_CHUNK + 1 and bound the scnprintf() calls with sizeof(nt->buf); the transmitted length stays capped at MAX_PRINT_CHUNK. Alternatively, nt->buf could be left at MAX_PRINT_CHUNK and the NUL byte reserved by routing exactly-MAX_PRINT_CHUNK payloads to fragmentation ('len < MAX_PRINT_CHUNK'), at the cost of fragmenting those messages. But it would look less sane, thus the current approach. Fixes: c62c0a17f9b7 ("netconsole: Append kernel version to message") Signed-off-by: Breno Leitao <leitao@debian.org> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/20260616-max_print_chunk-v1-1-8dc125d67083@debian.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
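The one-byte loss is easy to reproduce in userspace with `snprintf()`, which, like the kernel's `scnprintf()`, always reserves room for the terminating NUL. The sketch below is an approximation: `CHUNK` is a small hypothetical stand-in for MAX_PRINT_CHUNK, and `format_payload()` clamps the return value to imitate `scnprintf()` reporting the bytes actually stored.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define CHUNK 8  /* hypothetical stand-in for MAX_PRINT_CHUNK */

/* Format msg into buf and return the number of payload bytes stored. */
static size_t format_payload(char *buf, size_t buf_size, const char *msg)
{
	int n;

	assert(buf_size > 0);
	n = snprintf(buf, buf_size, "%s", msg);
	assert(n >= 0);

	/*
	 * snprintf() returns the untruncated length; clamp it to what fits,
	 * which is what scnprintf() would report.
	 */
	if ((size_t)n >= buf_size)
		return buf_size - 1;
	return (size_t)n;
}
```

With a buffer of exactly `CHUNK` bytes, a `CHUNK`-byte message is stored as `CHUNK - 1` bytes; growing the buffer to `CHUNK + 1` keeps the whole payload, matching the approach the commit takes.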
13 daysnet: macb: add TX stall timeout callback to recover from lost TSTART writeLukasz Raczylo1-0/+8
The MACB found in the Raspberry Pi RP1 suffers from sporadic stalls on the TX queue. While the exact root cause is not yet fully understood, it is likely related to a hardware issue where a TSTART write to the NCR register is missed, preventing the transmission from being kicked off. Implement a timeout callback to handle TX queue stalls, triggering the existing restart mechanism to recover. Link: https://lore.kernel.org/all/20260514215459.36109-1-lukasz@raczylo.com/ Fixes: dc110d1b23564 ("net: cadence: macb: Add support for Raspberry Pi RP1 ethernet controller") Signed-off-by: Lukasz Raczylo <lukasz@raczylo.com> Co-developed-by: Steffen Jaeckel <sjaeckel@suse.de> Signed-off-by: Steffen Jaeckel <sjaeckel@suse.de> Co-developed-by: Andrea della Porta <andrea.porta@suse.com> Signed-off-by: Andrea della Porta <andrea.porta@suse.com> Reviewed-by: Nicolai Buchwitz <nb@tipi-net.de> Reviewed-by: Théo Lebrun <theo.lebrun@bootlin.com> Link: https://patch.msgid.link/468f480454a314303bac6a54780b153f689f2267.1781598350.git.andrea.porta@suse.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
13 daysnet: airoha: fix foe_check_time allocation sizeWayen Yan1-1/+2
foe_check_time is declared as u16 pointer but was allocated with only ppe_num_entries bytes instead of ppe_num_entries * sizeof(u16). When airoha_ppe_foe_verify_entry() is called with hash >= ppe_num_entries/2, it writes beyond the allocated buffer, causing heap buffer overflow and potential kernel crash. Fixes: 6d5b601d52a2 ("net: airoha: ppe: Dynamically allocate foe_check_time array in airoha_ppe struct") Signed-off-by: Wayen Yan <win847@gmail.com> Acked-by: Lorenzo Bianconi <lorenzo@kernel.org> Link: https://patch.msgid.link/178161119471.2163752.14373384830691569758@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
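The sizing mistake comes down to counting elements where bytes are required. As a rough userspace illustration, with hypothetical helper names and `calloc()` standing in for the kernel allocator, the required size and the first out-of-bounds index of an undersized buffer can be computed as follows.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Bytes needed to hold num_entries u16 timestamps. */
static size_t check_time_bytes(size_t num_entries)
{
	return num_entries * sizeof(uint16_t);
}

/* First u16 index that falls outside a buffer of only num_entries bytes. */
static size_t first_oob_index(size_t num_entries)
{
	return num_entries / sizeof(uint16_t);
}

/* Allocate by element count and element size, not by raw byte count. */
static uint16_t *alloc_check_time(size_t num_entries)
{
	assert(num_entries > 0);
	return calloc(num_entries, sizeof(uint16_t));
}
```

For 16 entries the array needs 32 bytes, and a 16-byte allocation would already be overrun at index 8, which is the `hash >= ppe_num_entries/2` condition named in the commit message.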
13 daysocteontx2-af: cn10k: restrict VF LMTLINE sharing to its own PFJunrui Luo1-0/+9
rvu_mbox_handler_lmtst_tbl_setup() uses req->base_pcifunc as a direct index into the LMT map table to read another function's LMTLINE physical base address and copy it into the caller's own LMT map table entry. The mailbox dispatcher authenticates req->hdr.pcifunc from the IRQ source, but req->base_pcifunc is a separate payload field and is not sanitized. Reject the request with -EPERM when a VF caller's base_pcifunc is not a valid function under its own PF. is_pf_func_valid() bounds the FUNC field to the PF's configured VF count, keeping the computed index inside the caller's own slot block. Fixes: 893ae97214c3 ("octeontx2-af: cn10k: Support configurable LMTST regions") Reported-by: Yuhao Jiang <danisjiang@gmail.com> Cc: stable@vger.kernel.org Signed-off-by: Junrui Luo <moonafterrain@outlook.com> Link: https://patch.msgid.link/SYBPR01MB78811656934E713B77DA6CEDAFE62@SYBPR01MB7881.ausprd01.prod.outlook.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
13 daysnet: pch_gbe: handle TX skb allocation failureRuoyu Wang1-9/+29
pch_gbe_alloc_tx_buffers() allocates an skb for each TX descriptor and then passes the returned pointer to skb_reserve(). If netdev_alloc_skb() fails, skb_reserve() dereferences NULL. Make pch_gbe_alloc_tx_buffers() return an error when an skb allocation fails. On failure, let pch_gbe_alloc_tx_buffers() clean the partially allocated TX ring before returning the error. While bringing the device up, release the RX buffer pool through a shared cleanup helper before unwinding the IRQ setup. Cc: stable+noautosel@kernel.org # untested fix to unlikely error path Fixes: 77555ee72282 ("net: Add Gigabit Ethernet driver of Topcliff PCH") Signed-off-by: Ruoyu Wang <ruoyuw560@gmail.com> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/20260615125043.3537046-1-ruoyuw560@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
13 daysdpaa2-switch: fix VLAN upper check not rejecting bridge joinIoana Ciornei1-1/+1
The blamed commit refactored the prechangeupper event handling but failed to actually return an error in case dpaa2_switch_prevent_bridging_with_8021q_upper() detected a 802.1q upper on a port which tries to join a bridge. Fix this by returning err instead of 0. Fixes: 45035febc495 ("net: dpaa2-switch: refactor prechangeupper sanity checks") Signed-off-by: Ioana Ciornei <ioana.ciornei@nxp.com> Link: https://patch.msgid.link/20260616105430.3725910-1-ioana.ciornei@nxp.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
13 daysvirtio-net: fix len check in receive_big()Xiang Mei1-3/+6
receive_big() bounds the device-announced length by (big_packets_num_skbfrags + 1) * PAGE_SIZE. That is still too loose: add_recvbuf_big() sets sg[1] to start at offset sizeof(struct padded_vnet_hdr) into the first page, so the chain actually carries hdr_len + (PAGE_SIZE - sizeof(padded_vnet_hdr)) + big_packets_num_skbfrags * PAGE_SIZE bytes -- 20 bytes less than the check allows for the common hdr_len == 12 case. A malicious virtio backend can announce a len in that gap. page_to_skb() then walks one frag past the page chain, storing a NULL page->private into skb_shinfo()->frags[MAX_SKB_FRAGS], which is both an out-of-bounds write past the static frag array and a NULL frag handed up the rx path. Bound len by the size add_recvbuf_big() actually advertised. Fixes: 0c716703965f ("virtio-net: fix received length check in big packets") Reported-by: Weiming Shi <bestswngs@gmail.com> Signed-off-by: Xiang Mei <xmei5@asu.edu> Reviewed-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com> Acked-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Bui Quang Minh <minhquangbui99@gmail.com> Link: https://patch.msgid.link/20260616042837.2249468-1-xmei5@asu.edu Signed-off-by: Jakub Kicinski <kuba@kernel.org>
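The 20-byte gap can be verified with plain arithmetic. The sketch below assumes a 4096-byte page and a 32-byte `struct padded_vnet_hdr`; the latter is inferred from the 20-byte figure quoted for hdr_len == 12 and is not read from the driver source. The function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_PAGE_SIZE  4096u  /* assumed page size */
#define SKETCH_PADDED_HDR 32u    /* assumed sizeof(struct padded_vnet_hdr) */

/* Old check: a whole page for the first buffer plus the frag pages. */
static size_t old_bound(size_t num_skbfrags)
{
	return (num_skbfrags + 1) * SKETCH_PAGE_SIZE;
}

/* New check: the bytes the receive chain actually advertises. */
static size_t new_bound(size_t hdr_len, size_t num_skbfrags)
{
	return hdr_len + (SKETCH_PAGE_SIZE - SKETCH_PADDED_HDR) +
	       num_skbfrags * SKETCH_PAGE_SIZE;
}
```

The difference between the two bounds is `SKETCH_PADDED_HDR - hdr_len`, independent of the fragment count, so with hdr_len == 12 any announced length in the top 20 bytes of the old range passed the check while exceeding the chain.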
14 daysnet/mlx5: Remove broken and unused mlx5_query_mtppse()Li RongQing2-20/+0
mlx5_query_mtppse() reads the Event Trigger Pin (MTPPSE) register but reads the returned arm and mode values from the input buffer 'in' instead of the output buffer 'out', so it always returns the values that were written rather than the actual hardware state, making the query useless. The function has no in-tree callers. Remove it rather than fix it. Signed-off-by: Li RongQing <lirongqing@baidu.com> Reviewed-by: Gal Pressman <gal@nvidia.com> Link: https://patch.msgid.link/20260615140406.1828-1-lirongqing@baidu.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
14 daysnet: ehea: unwind probe_port sysfs file on failurePengpeng Hou1-0/+2
ehea_create_device_sysfs() creates probe_port and then remove_port. If the second device_create_file() fails, the helper returns the error but leaves probe_port installed even though probe treats the sysfs setup as failed. Remove probe_port on the remove_port creation failure path so the helper leaves no partial sysfs state behind. Signed-off-by: Pengpeng Hou <pengpeng@iscas.ac.cn> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Link: https://patch.msgid.link/20260615070033.43461-1-pengpeng@iscas.ac.cn Signed-off-by: Jakub Kicinski <kuba@kernel.org>
14 daysocteontx2-af: npc: Log successful MCAM drop-on-non-hit install at debug levelRatheesh Kannoth1-1/+1
npc_install_mcam_drop_rule() used dev_err() after a successful rvu_mbox_handler_npc_mcam_write_entry() call, so normal installs appeared as errors in dmesg. Use dev_dbg() for the success path and keep dev_err() for real failures. Fixes: 3571fe07a090 ("octeontx2-af: Drop rules for NPC MCAM") Signed-off-by: Ratheesh Kannoth <rkannoth@marvell.com> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/20260615033157.535237-1-rkannoth@marvell.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
14 daysocteontx2-pf: Fix leak of SQ timestamp buffer on teardownRatheesh Kannoth1-0/+1
The send-queue timestamp ring is allocated with qmem_alloc() when timestamping is used, but otx2_free_sq_res() never freed sq->timestamps, leaking that memory across ifdown and device removal. Add the missing qmem_free() alongside the other SQ companion buffers. Fixes: c9c12d339d93 ("octeontx2-pf: Add support for PTP clock") Cc: Aleksey Makarov <amakarov@marvell.com> Signed-off-by: Ratheesh Kannoth <rkannoth@marvell.com> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/20260615030704.504536-1-rkannoth@marvell.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
14 daysnet: ethernet: mtk_eth_soc: fix supported_interface set after phylink_createChristian Marangi1-5/+5
Everything configured in phylink_config is assumed to be set before calling phylink_create() to permit correct parsing of all the different modes and capabilities. Commit 51cf06ddafc9 ("net: ethernet: mtk_eth_soc: add support for MT7988 internal 2.5G PHY"), while introducing support for the 2.5G PHY for MT7988, probably due to an auto-rebase, placed the configuration of the INTERNAL interface mode in supported_interfaces for phylink_config right after phylink_create(), introducing a possible problem with supported-interface parsing. While this doesn't currently cause any bug, move setting this bit before phylink_create() to prevent any possible regression from future code changes in the phylink core. Fixes: 51cf06ddafc9 ("net: ethernet: mtk_eth_soc: add support for MT7988 internal 2.5G PHY") Signed-off-by: Christian Marangi <ansuelsmth@gmail.com> Reviewed-by: Maxime Chevallier <maxime.chevallier@bootlin.com> Reviewed-by: Daniel Golle <daniel@makrotopia.org> Link: https://patch.msgid.link/20260615151106.15438-1-ansuelsmth@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
14 daysnet: pse-pd: set user byte command SUB2 fieldRobert Marko1-1/+1
The Set User Byte to Save command has three subject bytes. The PD692x0 protocol guide defines SUB2 with value 0x4e, while SUB1 carries the NVM user byte. The template only initialized SUB and SUB1. Fill SUB2 explicitly so the command matches the documented layout. Signed-off-by: Robert Marko <robert.marko@sartura.hr> Acked-by: Kory Maincent <kory.maincent@bootlin.com> Link: https://patch.msgid.link/20260611102517.445549-1-robert.marko@sartura.hr Signed-off-by: Jakub Kicinski <kuba@kernel.org>
14 daysMerge tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhostLinus Torvalds3-0/+1035
Pull virtio updates from Michael Tsirkin: - new virtio CAN driver - support for LoongArch architecture in fw_cfg - support for firmware notifications in vdpa/octeon_ep - support for VFs in virtio core - fixes, cleanups all over the place, notably: - vhost: fix vhost_get_avail_idx for a non-empty ring, fixing a significant old perf regression - READ_ONCE() annotations mean virtio ring is now free of KCSAN warnings * tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost: (37 commits) can: virtio: Fix comment in UAPI header can: virtio: Add virtio CAN driver virtio: add num_vf callback to virtio_bus fw_cfg: Add support for LoongArch architecture vdpa/octeon_ep: fix IRQ-to-ring mapping in interrupt handler vdpa/octeon_ep: Add vDPA device event handling for firmware notifications vdpa/octeon_ep: Use 4 bytes for mailbox signature vdpa/octeon_ep: Fix PF->VF mailbox data address calculation vhost_task_create: kill unnecessary .exit_signal initialization vhost: remove unnecessary module_init/exit functions vdpa/mlx5: Use kvzalloc_flex() for MTT command memory vdpa_sim_net: switch to dynamic root device vdpa_sim_blk: switch to dynamic root device virtio-mem: Destroy mutex before freeing virtio_mem virtio-balloon: Destroy mutex before freeing virtio_balloon tools/virtio: fix build for kmalloc_obj API and missing stubs virtio_ring: Add READ_ONCE annotations for device-writable fields vduse: fix compat handling for VDUSE_IOTLB_GET_FD/VDUSE_VQ_GET_INFO tools/virtio: check mmap return value in vringh_test vhost/net: complete zerocopy ubufs only once ...
2026-06-17Merge tag 'wq-for-7.2' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wqLinus Torvalds1-1/+1
Pull workqueue updates from Tejun Heo: - Continued progress toward making alloc_workqueue() unbound by default: more callers converted to WQ_PERCPU / system_percpu_wq / system_dfl_wq, and new warnings for queues that use neither WQ_PERCPU nor WQ_UNBOUND or the legacy system_wq / system_unbound_wq. - Misc: drop the now-trivial apply_wqattrs_lock()/unlock() wrappers, forbid the TEST_WORKQUEUE benchmark from being built-in, and fix a spurious pointer level in the worker debug-dump path. * tag 'wq-for-7.2' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq: drm/bridge: anx7625: Add WQ_PERCPU add to alloc_workqueue wifi: ath6kl: fix invalid workqueue flags in ath6kl_usb_create() btrfs: Drop WQ_PERCPU from ordered_flags in btrfs_init_workqueues() workqueue: Add warnings and ensure one among WQ_PERCPU or WQ_UNBOUND is present workqueue: Add warnings and fallback if system_{unbound}_wq is used workqueue: drop spurious '*' from print_worker_info() fn declaration workqueue: forbid TEST_WORKQUEUE from being built-in workqueue: drop apply_wqattrs_lock()/unlock() wrappers umh: replace use of system_unbound_wq with system_dfl_wq rapidio: rio: add WQ_PERCPU to alloc_workqueue users media: ddbridge: add WQ_PERCPU to alloc_workqueue users platform: cznic: turris-omnia-mcu: replace use of system_wq with system_percpu_wq media: synopsys: hdmirx: replace use of system_unbound_wq with system_dfl_wq virt: acrn: Add WQ_PERCPU to alloc_workqueue users
2026-06-17Merge tag 'bitmap-for-7.2' of https://github.com/norov/linuxLinus Torvalds4-8/+8
Pull bitmap updates from Yury Norov: "This includes the new FIELD_GET_SIGNED() helper, bitmap_print_to_pagebuf() removal, RISCV/bitrev support, and a couple cleanups. - new handy helper FIELD_GET_SIGNED() (Yury) - arch test_and_set_bit_lock() and clear_bit_unlock() cleanup (Randy) - __bf_shf() simplification (Yury) - bitmap_print_to_pagebuf() removal (Yury) - RISCV/bitrev conditional support (Jindie, Yury)" * tag 'bitmap-for-7.2' of https://github.com/norov/linux: MAINTAINERS: BITOPS: include bitrev.[ch] arch/riscv: Add bitrev.h file to support rev8 and brev8 bitops: Define generic___bitrev8/16/32 for reuse lib/bitrev: Introduce GENERIC_BITREVERSE arch: select HAVE_ARCH_BITREVERSE conditionally on BITREVERSE bitmap: fix find helper documentation bitmap: drop bitmap_print_to_pagebuf() cpumask: switch cpumap_print_to_pagebuf() to using scnprintf() bitfield: wire __bf_shf to __builtin_ctzll bitops: use common function parameter names ptp: switch to using FIELD_GET_SIGNED() rtc: rv3032: switch to using FIELD_GET_SIGNED() wifi: rtw89: switch to using FIELD_GET_SIGNED() iio: mcp9600: switch to using FIELD_GET_SIGNED() iio: pressure: bmp280: switch to using FIELD_GET_SIGNED() iio: magnetometer: yas530: switch to using FIELD_GET_SIGNED() iio: intel_dc_ti_adc: switch to using FIELD_GET_SIGNED() x86/extable: switch to using FIELD_GET_SIGNED() bitfield: add FIELD_GET_SIGNED()
2026-06-17Merge tag 'bpf-next-7.2' of ↵Linus Torvalds1-4/+4
git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next Pull bpf updates from Alexei Starovoitov: "Major changes: - Recover from BPF arena page faults using a scratch page and add ptep_try_set() for lockless empty-slot installs on x86 and arm64. This allows BPF kfuncs to access arena pointers directly. The 'arena_direct_access' stable branch was created for this work and was pulled into sched-ext and bpf-next trees (Tejun Heo, Kumar Kartikeya Dwivedi) - Lift old restriction and support 6+ arguments in BPF programs and kfuncs on x86 and arm64 (Yonghong Song, Puranjay Mohan) Other features and fixes: - Add 24-bit BTF vlen and reclaim unused bits in the BTF UAPI to ease addition of new BTF kinds (Alan Maguire) - Raise the maximum BPF call chain depth from 8 to 16 frames (Alexei Starovoitov) - Refactor object relationship tracking in the verifier and fix a dynptr use-after-free bug (Amery Hung) - Harden the signed program loader and reject exclusive maps as inner maps (Daniel Borkmann) - Replace the verifier min/max bounds fields with a circular number (cnum) representation and improve 32->64 bit range refinements (Eduard Zingerman) - Introduce the arena library and runtime (libarena) with a buddy allocator, rbtree and SPMC queue data structures, ASAN support and a parallel test harness. Allow subprograms to return arena pointers and switch to a BTF type-tag based __arena annotation (Emil Tsalapatis) - Cache build IDs in the sleepable stackmap path and avoid faultable build ID reads under mm locks (Ihor Solodrai) - Introduce the tracing_multi link to attach a single BPF program to many kernel functions at once. 
Allow specifying the uprobe_multi target via FD (Jiri Olsa) - Extend the bpf_list family of kfuncs with bpf_list_add/del(), and bpf_list_is_first/is_last/empty() (Kaitao Cheng) - Extend the BPF syscall with common attributes support for prog_load, btf_load and map_create (Leon Hwang) - Wrap rhashtable as BPF map (Mykyta Yatsenko, Herbert Xu) - Add sleepable support for tracepoint programs and fix deadlocks in LRU map due to NMI reentry (Mykyta Yatsenko) - Fix OOB access in bpf_flow_keys, fix nullness analysis of inner arrays, enforce write checks for global subprograms (Nuoqi Gui) - Report the maximum combined stack depth and print a breakdown of instructions processed per subprogram (Paul Chaignon) - Add an XDP load-balancer benchmark and arm64 JIT support for stack arguments (Puranjay Mohan) - Add kfuncs to traverse over wakeup_sources (Samuel Wu) - Allow sleepable BPF programs to use LPM trie maps directly (Vlad Poenaru) - Many more fixes and cleanups across the verifier, BTF, sockmap, devmap, bpffs, security hooks, s390/riscv/loongarch JITs, rqspinlock, libbpf, bpftool, selftests" * tag 'bpf-next-7.2' of git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next: (336 commits) selftests/bpf: Work around llvm stack overflow in crypto progs selftests/bpf: add test for bpf_msg_pop_data() overflow bpf, sockmap: fix integer overflow in bpf_msg_pop_data() bounds check sockmap: Fix use-after-free in udp_bpf_recvmsg() bpf, sockmap: keep sk_msg copy state in sync bpf, sockmap: Fix wrong rsge offset in bpf_msg_push_data() bpf, sockmap: reject overflowing copy + len in bpf_msg_push_data() selftsets/bpf: Retry map update on helper_fill_hashmap() selftests/bpf: Add test for sleepable lsm_cgroup rejection selftests/bpf: Add test to verify the fix for bpf_setsockopt() helper bpf: Fix bpf_get/setsockopt to tos for ipv4-mapped ipv6 socket selftests/bpf: Avoid static LLVM linking for cross builds selftests/bpf: Use common CFLAGS for urandom_read selftests/bpf: Initialize 
operation name before use tools/bpf: build: Append extra cflags libbpf: Initialize CFLAGS before including Makefile.include bpftool: Append extra host flags bpftool: Avoid adding EXTRA_CFLAGS to HOST_CFLAGS bpftool: Pass host flags to bootstrap libbpf selftests/bpf: correct CONFIG_PPC64 macro name in comment ...
2026-06-17Merge tag 'net-next-7.2' of ↵Linus Torvalds932-25504/+58716
git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next Pull networking updates from Jakub Kicinski: "Core & protocols: - Work on removing rtnl_lock protection throughout the stack continues. In this chapter: - don't use rtnl_lock for IPv6 multicast routing configuration - don't take rtnl_lock in ethtool for modern drivers - prepare Qdisc dump callbacks for rtnl_lock removal - Support dumping just ifindex + name of all interfaces, under RCU. It's a common operation for Netlink CLI tools (when translating names to ifindexes) and previously required full rtnl_lock. - Support dumping qdiscs and page pools for a specific netdev. Even tho user space wants a dump of all netdevs, most of the time, the OOO programming model results in repeating the dump for each netdev. Which, in absence of a cache, leads to a O(n^2) behavior. - Flush nexthops once on multi-nexthop removal (e.g. when device goes down), another O(n^2) -> O(n) improvement. - Rehash locally generated traffic to a different nexthop on retransmit timeout. - Honor oif when choosing nexthop for locally generated IPv6 traffic. - Convert TCP Auth Option to crypto library, and drop non-RFC algos. - Increase subflow limits in MPTCP to 64 and endpoint limit to 256. - Support MPTCP signaling of IPv6 address + port (ADD_ADDR). We need to selectively skip reporting of the standard TCP Timestamp option, because they won't fit into the header space together (12 + 30 > 40). - Support using bridge neighbor suppression, Duplicate Address Detection, Gratuitous ARP and unsolicited NA forwarding - in EVPN deployments, e.g. VXLAN fabrics (IPv4 and IPv6). - Improve link state reporting for upper netdevs (e.g. macvlan) over tunnel devices (again, mostly for EVPN deployments). - Support binding GENEVE tunnels to a local address. - Speed up UDP tunnel destruction (remove one synchronize_rcu()). - Support exponential field encoding in multicast (IGMPv3 and MLDv2). 
- Support attaching PSP crypto offload to containers (veth, netkit). - Add a new IPSec Netlink message XFRM_MSG_MIGRATE_STATE that allows migrating individual IPsec SAs independently of their policies. The existing XFRM_MSG_MIGRATE is tightly coupled to policy+SA migration, lacks SPI for unique SA identification, and cannot express reqid changes or migrate Transport mode selectors. The new interface identifies the SA via SPI and mark, supports reqid changes, address family changes, encap removal, and uses an atomic create+install flow under x->lock to prevent SN/IV reuse during AEAD SA migration. - Implement GRO/GSO support for PPPoE. - Convert sockopt callbacks in a number of protocols to iov_iter. Cross-tree stuff: - Remove support for Crypto TFM cloning (unblocked after the TCP Auth Option rework). This feature regressed performance for all crypto API users, since it changed crypto transformation objects into reference-counted objects. - Add FCrypt-PCBC implementation to rxrpc and remove it from the global crypto API as obsolete and insecure. Wireless: - Major rework of station bandwidth handling, fixing issues with lower capability than AP. - Cleanups for EMLSR spec issues (drafts differed). - More Neighbor Awareness Networking (Wi-Fi Aware) work (multicast, schedule improvements, multi-station etc.) - Some Ultra High Reliability (UHR) / IEEE 802.11bn (D1.4) work (e.g. non-primary channel access, UHR DBE support). - Fine Timing Measurement ranging (i.e. distance measurement) APIs. Netfilter: - Use per-rule hash initval in nf_conncount. This avoids unnecessary lock contention with short keys (e.g. conntrack zones) in different namespaces. - Various safety improvements, both in packet parsing and object lifetimes. Notably add refcounts to conntrack timeout policy. Deletions: - Remove TLS + sockmap integration. TLS wants to pin user pages to avoid a copy, and sockmap wants to write to the input stream. 
More work on this integration is clearly needed, and we can't find any users (original author admitted that they never deployed it). - Remove support for TLS offload with TCP Offload Engine (the far more common opportunistic offload is retained). The locking looks unfixable (driver sleeps under TCP spin locks) and people from the vendor that added this are AWOL. - Remove more ATM code, trying to leave behind only what PPPoATM needs, AAL5 and br2684 with permanent circuits. - Remove AppleTalk. Let it join hamradio in our out of tree protocol graveyard, I mean, repository. - Disable 32-bit x_tables compatibility (32bit binaries on 64bit kernel) interface in user namespaces. To be deleted completely, soon. - Remove 5/10 MHz support from cfg80211/mac80211. Drivers: - Software: - Support DEVMEM/DMABUF Tx over NETMEM_TX_NO_DMA devices (netkit) - bonding: add knob to strictly follow 802.3ad for link state - New drivers: - Alibaba Elastic Ethernet Adaptor (cloud vNIC). - NXP NETC switch within i.MX94. - DPLL: - Add operational state to pins (implement in zl3073x). - Add generic DPLL type, for daisy-chaining DPLLs (implement in ice). 
- Ethernet high-speed NICs: - Huawei (hinic3): - enhance tc flow offload support with queue selection, tunnels - nVidia/Mellanox: - avoid over-copying payload to the skb's linear part (up to 60% win for LRO on slow CPUs like ARM64 V2) - expose more per-queue stats over the standard API - support additional, unprivileged PFs in the DPU configuration - support Socket Direct (multi-PF) with switchdev offloads - add a pool / frag allocator for DMA mapped buffers for control objects, save memory on systems with 64kB page size - take advantage of the ability to dynamically change RSS table size, even when table is configured by the user - increase the max RSS table size for even traffic distribution - Ethernet NICs: - Marvell/Aquantia: - AQC113 PTP support - Realtek USB (r8152): - support 10Gbit Link Speeds and Energy-Efficient Ethernet (EEE) - support firmware loaded (for RTL8157/RTL8159) - support for the RTL8159 - Intel (ixgbe): - support Energy-Efficient Ethernet (EEE) on E610 devices - Ethernet switches: - Airoha: - support multiple netdevs on a single GDM block / port - Marvell (mv88e6xxx): - support SERDES of mv88e6321 - Microchip (ksz8/9): - rework the driver callbacks to remove one indirection layer - Motorcomm (yt921x): - support port rate policing - support TBF qdisc offload - support ACL/flower offload - nVidia/Mellanox: - expose per-PG rx_discards - Realtek: - rtl8365mb: bridge offloading and VLAN support - Ethernet PHYs: - Airoha: - support Airoha AN8801R Gigabit PHYs. 
- Micrel: - implement 3 low-loss cable tunables - Realtek: - support MDI swapping for RTL8226-CG - support MDIO for RTL931x - Qualcomm: - at803x: Rx and Tx clock management for IPQ5018 PHY - Motorcomm: - support YT8522 100M RMII PHY - set drive strength in YT8531s RGMII - TI: - dp83822: add optional external PHY clock - Bluetooth: - hci_sync: add support for HCI_LE_Set_Host_Feature [v2] - SMP: use AES-CMAC library API - Intel: - support Product level reset - support smart trigger dump - Mediatek: - add event filter to filter specific event - Realtek: - fix RTL8761B/BU broken LE extended scan - WiFi: - Broadcom (b43): - new support for a 11n device - MediaTek (mt76): - support mt7927 - mt792x: broken usb transport detection - mt7921: regulatory improvements - Qualcomm (ath9k): - GPIO interface improvements - Qualcomm (ath12k): - WDS support - replace dynamic memory allocation in WMI Rx path - thermal throttling/cooling device support - 6 GHz incumbent interference detection - channel 177 in 5 GHz - Realtek (rt89): - RTL8922AU support - USB 3 mode switch for performance - better monitor radiotap support - RTL8922DE preparations" * tag 'net-next-7.2' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next: (1778 commits) ipv4: fib_rule: Move fib4_rules_exit() to ->exit(). 
net: serialize netif_running() check in enqueue_to_backlog() net: skmsg: preserve sg.copy across SG transforms appletalk: move the protocol out of tree appletalk: stop storing per-interface state in struct net_device selftests/bpf: test that TLS crypto is rejected on a sockmap socket selftests/bpf: drop the unused kTLS program from test_sockmap selftests/bpf: remove sockmap + ktls tests tls: remove dead sockmap (psock) handling from the SW path tls: reject the combination of TLS and sockmap atm: remove orphaned uAPI for deleted drivers, protocols and SVCs atm: remove unused ATM PHY operations atm: remove the unused pre_send and send_bh device operations atm: remove the unused change_qos device operation atm: remove SVC socket support and the signaling daemon interface atm: remove the local ATM (NSAP) address registry atm: remove dead SONET PHY ioctls atm: remove the unused send_oam / push_oam callbacks atm: remove AAL3/4 transport support net: dsa: sja1105: fix lastused timestamp in flower stats ...
2026-06-16Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netJakub Kicinski24-170/+247
Merge in late fixes in preparation for the net-next PR. Conflicts: net/tls/tls_sw.c 406e8a651a7b ("net: skmsg: preserve sg.copy across SG transforms") 79511603a65b ("tls: remove dead sockmap (psock) handling from the SW path") drivers/net/ethernet/microsoft/mana/mana_en.c f8fd56977eeea ("net: mana: guard TX wq object destroy with INVALID_MANA_HANDLE check") d07efe5a6e641 ("net: mana: Use per-queue allocation for tx_qp to reduce allocation size") https://lore.kernel.org/ajAPXu-C_PuTgV-a@sirena.org.uk No adjacent changes. Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-06-15net: dsa: sja1105: fix lastused timestamp in flower statsDavid Yang1-2/+1
flow_stats_update() takes an absolute timestamp for lastused, not delta. Fix that. Signed-off-by: David Yang <mmyangfl@gmail.com> Link: https://patch.msgid.link/20260614141320.1133321-1-mmyangfl@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-06-15ionic: Get "link_down_count" ext link stat from firmwareEric Joyner6-5/+39
The number of times that link has gone down at the port level is tracked by the firmware and sent to the driver via regular DMA writes to an instance of struct ionic_port_status in the driver's memory. This statistic was never reported in favor of a driver-derived stat, but doing it in the driver was never necessary since firmware had been reporting it the whole time. Since it would be more accurate and true to the description of the statistic to get this count at the PHY level, replace the driver-calculated statistic with one derived from the firmware one and remove the driver-calculated one entirely. The stat reported by the ethtool .get_link_ext_stats() handler is normalized to 0 on driver load and any device resets that require the driver to rebuild state while also handling overflows. Signed-off-by: Eric Joyner <eric.joyner@amd.com> Link: https://patch.msgid.link/20260614205303.48088-5-eric.joyner@amd.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
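The normalization described above (report 0 at driver load or after a reset, and tolerate counter overflow) can be sketched with plain unsigned arithmetic. This is a minimal illustration and not the ionic code: the struct, the function names and the 16-bit counter width are all assumptions.

```c
#include <stdint.h>

/* Hypothetical driver-side state; not taken from the ionic driver. */
struct link_down_tracker {
	uint16_t base;	/* firmware counter value at the last normalization */
};

/* Remember the current firmware value so later reads start from 0. */
static void link_down_rebase(struct link_down_tracker *t, uint16_t fw_count)
{
	t->base = fw_count;
}

/*
 * Subtraction followed by truncation to 16 bits is modulo 2^16, so a
 * single wrap of the firmware counter still yields the right delta.
 */
static uint16_t link_down_normalized(const struct link_down_tracker *t,
				     uint16_t fw_count)
{
	return (uint16_t)(fw_count - t->base);
}
```

Rebasing at load and at every state rebuild keeps the reported value relative to the driver's lifetime rather than the firmware's.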
2026-06-15ionic: Report "rx_bits_phy" stat to ethtoolEric Joyner4-1/+73
This stat contains the number of total bits that the PHY has received; it's useful for BER calculations. Add it to the ethtool stats output. However, since this is one of the new "extra port stats", it's reported in a different manner than the existing port stats and only conditionally added to the ethtool stats output list: both the DEV_CAP_EXTRA_STATS capability must be supported by the firmware, and the firmware must set the value of the statistic to something other than IONIC_STAT_INVALID. To help support this scheme, the extra port stats region is initialized to 0xff's/IONIC_STAT_INVALID by the driver, to ensure the statistics that the driver knows about but the firmware does not are still invalid to the driver. Signed-off-by: Eric Joyner <eric.joyner@amd.com> Link: https://patch.msgid.link/20260614205303.48088-4-eric.joyner@amd.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
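A rough sketch of the "invalid unless the firmware wrote it" scheme follows. The struct layout, the 64-bit field width and the helper names are hypothetical; only the all-ones fill and the two reporting conditions come from the commit message.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Stand-in for IONIC_STAT_INVALID, assumed here to be an all-ones u64. */
#define STAT_INVALID UINT64_MAX

/* Hypothetical layout of the extra port stats region. */
struct extra_port_stats {
	uint64_t rx_bits_phy;
	uint64_t future_stat;	/* known to the driver, not to older firmware */
};

/* Fill the region with 0xff so untouched fields read back as invalid. */
static void extra_stats_init(struct extra_port_stats *s)
{
	memset(s, 0xff, sizeof(*s));
}

/* Report only if the capability is present and the firmware wrote a value. */
static bool stat_reportable(bool fw_has_extra_stats, uint64_t value)
{
	return fw_has_extra_stats && value != STAT_INVALID;
}
```

Pre-filling with the sentinel means the driver needs no per-statistic version table: a field the firmware never writes simply stays invalid.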
2026-06-15ionic: Update ionic_if.h with new extra port statsEric Joyner1-53/+11
Add a new structure to report additional statistics from the firmware to struct ionic_port_info. This new struct currently only contains FEC-related statistics, but any new port-level statistics collected by the firmware would go into it. The new structure is located in the same area as the unused ionic_port_pb_stats structure, so this patch also removes that and its supporting enumerations since they were never used in this driver. Finally, to indicate firmware support for the new structure, introduce a new device capability that the driver can use to see if the attached device supports reporting these extra stats. Signed-off-by: Eric Joyner <eric.joyner@amd.com> Link: https://patch.msgid.link/20260614205303.48088-3-eric.joyner@amd.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-06-15ionic: Fix check in ionic_get_link_ext_statsBrett Creeley1-2/+9
The current check will fail if SR-IOV is not initialized for the physical function; this is because is_physfn is 0 if sriov_init() isn't run or fails. Change the check that prevents getting the link down count to use is_virtfn instead so that VFs don't get this functionality, which was the original intent. Fixes: 132b4ebfa090 ("ionic: add support for ethtool extended stat link_down_count") Signed-off-by: Brett Creeley <brett.creeley@amd.com> Signed-off-by: Eric Joyner <eric.joyner@amd.com> Link: https://patch.msgid.link/20260614205303.48088-2-eric.joyner@amd.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-06-15net: dsa: mxl862xx: add support for SerDes portsDaniel Golle6-3/+668
The MxL862xx has two XPCS/SerDes interfaces (XPCS0 for ports 9-12, XPCS1 for ports 13-16). Each can operate in various single-lane modes (SGMII, 1000Base-X, 2500Base-X, 10GBase-R, 10GBase-KR, USXGMII) or as QSGMII or 10G_QXGMII providing four sub-ports per interface. Implement phylink PCS operations using the firmware's XPCS API: - pcs_enable/pcs_disable: refcount the sub-ports sharing an XPCS and power it down once the last sub-port is released. - pcs_config: configure negotiation mode and CL37/SGMII advertising. - pcs_get_state: read link state and the link-partner ability word from firmware and decode using phylink's standard CL37, SGMII, and USXGMII decoders. - pcs_an_restart: restart CL37 or CL73 auto-negotiation. - pcs_link_up: force speed/duplex for SGMII. - pcs_inband_caps: report per-mode in-band status capabilities. Register a PCS instance for each SerDes interface and QSGMII/10G_QXGMII sub-ports during setup. Advertise the supported interface modes in phylink_get_caps based on port number. Firmware older than 1.0.84 lacks the XPCS API and instead configures the SerDes itself, using defaults stored in flash. mac_select_pcs() returns NULL in that case while the single-lane interface modes stay advertised, so a CPU port keeps working in the firmware-configured mode. Lacking support for expressing PHY-side role modes in Linux only the MAC-side of SGMII, QSGMII, USXGMII and 10G_QXGMII are implemented for now. Signed-off-by: Daniel Golle <daniel@makrotopia.org> Link: https://patch.msgid.link/736e4df02e4cb8c530c1670cbe7efac20b5d696d.1781319534.git.daniel@makrotopia.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
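The pcs_enable/pcs_disable refcounting can be pictured as below. This is a simplified user-space model under stated assumptions: the struct and function names are invented, and the real driver talks to firmware rather than flipping a flag.

```c
#include <stdbool.h>

/* Hypothetical model of one XPCS shared by up to four sub-ports. */
struct xpcs_model {
	unsigned int users;	/* sub-ports currently enabled */
	bool powered;		/* whether the shared XPCS is powered up */
};

/* First user powers the block up. */
static void xpcs_model_enable(struct xpcs_model *x)
{
	if (x->users++ == 0)
		x->powered = true;
}

/* Last user powers it down; unbalanced calls are ignored. */
static void xpcs_model_disable(struct xpcs_model *x)
{
	if (x->users && --x->users == 0)
		x->powered = false;
}
```

With QSGMII or 10G_QXGMII, four sub-ports share one SerDes, so powering down on the first disable would take the other three ports with it; the count avoids that.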
2026-06-15net: dsa: mxl862xx: move API macros to mxl862xx-host.hDaniel Golle2-7/+8
Move the MXL862XX_API_WRITE, MXL862XX_API_READ and MXL862XX_API_READ_QUIET convenience macros from mxl862xx.c to mxl862xx-host.h next to the mxl862xx_api_wrap() prototype they wrap. This makes them available to other compilation units that include mxl862xx-host.h, which is needed once the SerDes PCS code in mxl862xx-phylink.c also calls firmware commands. No functional change. Signed-off-by: Daniel Golle <daniel@makrotopia.org> Reviewed-by: Maxime Chevallier <maxime.chevallier@bootlin.com> Link: https://patch.msgid.link/914f57931e79cc3932a9f32813465c08d29cf4bf.1781319534.git.daniel@makrotopia.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-06-15net: dsa: mxl862xx: move phylink stubs to mxl862xx-phylink.cDaniel Golle4-38/+67
Move the phylink MAC operations and get_caps callback from mxl862xx.c into a dedicated mxl862xx-phylink.c file. This prepares for the SerDes PCS implementation which adds substantial phylink/PCS code -- keeping it in a separate file avoids function-position churn in the main driver file. No functional change. Signed-off-by: Daniel Golle <daniel@makrotopia.org> Reviewed-by: Maxime Chevallier <maxime.chevallier@bootlin.com> Link: https://patch.msgid.link/fb9336de94bef47a0834287cbca87954e5e4c795.1781319534.git.daniel@makrotopia.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-06-15net: dsa: mxl862xx: store firmware version for feature gatingDaniel Golle2-0/+26
Query the firmware version at init (already done in wait_ready), cache it in priv->fw_version, and provide MXL862XX_FW_VER_MIN() for version-gated code paths throughout the driver. MXL862XX_FW_VER() packs major/minor/revision into a u32 with bitwise shifts so that versions compare with natural ordering, independent of host endianness. Signed-off-by: Daniel Golle <daniel@makrotopia.org> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Link: https://patch.msgid.link/91a26a8ffeaa2ce1729f98347e93e779973976bb.1781319534.git.daniel@makrotopia.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
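As an illustration of why shift-based packing gives natural ordering, here is a small sketch. The macro and helper names and the 8/8/16-bit field split are assumptions, not the driver's actual MXL862XX_FW_VER() definition.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Pack major/minor/revision into one u32 with major in the top bits.
 * The 8/8/16 split is assumed for this sketch.
 */
#define FW_VER(maj, min, rev) \
	(((uint32_t)(maj) << 24) | ((uint32_t)(min) << 16) | \
	 ((uint32_t)(rev) & 0xffffu))

/* True when the running firmware is at least the required version. */
static bool fw_ver_at_least(uint32_t running, uint32_t required)
{
	return running >= required;
}
```

Because the value is built with shifts rather than by overlaying a byte array, the most significant field always lands in the most significant bits, so a plain integer comparison orders versions correctly on both little- and big-endian hosts.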
2026-06-15net: airoha: use int instead of atomic_t for qdma users counterLorenzo Bianconi2-3/+3
QDMA users counter is always accessed holding RTNL lock so we do not require atomic_t for it. Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org> Reviewed-by: Simon Horman <horms@kernel.org> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-06-15net: ethernet: oa_tc6: Remove FCS size in RX frameSelvamani Rajagopal1-0/+11
The OA TC6 MAC-PHY appends the FCS to each incoming frame. It must be removed from the frame before the frame is passed to the stack. With the FCS left in the frame, many applications, such as ping or anything else that uses the IP layer, may still work because the protocol carries the packet size information. An application like ptp4l, particularly when it uses layer 2 for its communication, fails with "bad message" because of the extra 4 bytes added by the FCS. Fixes: d70a0d8f2f2d ("net: ethernet: oa_tc6: implement receive path to receive rx ethernet frames") Signed-off-by: Selvamani Rajagopal <Selvamani.Rajagopal@onsemi.com> Link: https://patch.msgid.link/20260611-level-trigger-v5-3-4533a9e85ce2@onsemi.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
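The length adjustment itself is small. A hedged sketch, with a hypothetical helper name and a local constant standing in for the kernel's ETH_FCS_LEN:

```c
#include <stddef.h>

/* Ethernet frame check sequence size in bytes (ETH_FCS_LEN in the kernel). */
#define FCS_LEN 4

/*
 * Length to hand to the stack once the trailing FCS is dropped.
 * Frames too short to contain an FCS yield 0 instead of underflowing.
 */
static size_t rx_len_without_fcs(size_t frame_len)
{
	return frame_len > FCS_LEN ? frame_len - FCS_LEN : 0;
}
```

IP-based protocols read their own length field and ignore trailing bytes, which is why the extra 4 bytes went unnoticed until a layer 2 consumer parsed the full frame.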
2026-06-15net: ethernet: oa_tc6: mdiobus->parent initialized with NULLSelvamani Rajagopal1-2/+1
As the "dev" pointer in the oa_tc6 structure is never initialized, mdiobus->parent was initialized with NULL. Fix this by initializing it with the device pointer of the SPI device. Fixes: 8f9bf857e43b ("net: ethernet: oa_tc6: implement internal PHY initialization") Signed-off-by: Selvamani Rajagopal <Selvamani.Rajagopal@onsemi.com> Link: https://patch.msgid.link/20260611-level-trigger-v5-2-4533a9e85ce2@onsemi.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-06-15net: ethernet: oa_tc6: Interrupt is active low, level triggered.Selvamani Rajagopal1-50/+76
According to the OPEN Alliance 10BASE-T1x MAC-PHY Serial Interface specification, the interrupt is active low and level triggered. The code used an edge-triggered interrupt, which risks losing an interrupt, for example while the interrupt is disabled. A level-triggered interrupt is not deasserted until the handler runs and clears the interrupting conditions. The interrupt handling mechanism is changed from an interrupt handler plus a kernel thread waiting on a work queue to a threaded IRQ. A threaded IRQ is best suited to a level-triggered interrupt because it keeps the interrupt disabled until the handler has run in thread context, while still allowing an interrupt-context handler to signal the threaded handler. Also add logic to disable the device interrupt on error. The error could be in a data chunk's header or footer, or in the SPI interface itself. This avoids repeated interrupts when the driver cannot recover from the error condition with the available recovery mechanism. Fixes: 2c6ce5354453 ("net: ethernet: oa_tc6: implement mac-phy interrupt") Signed-off-by: Selvamani Rajagopal <Selvamani.Rajagopal@onsemi.com> Link: https://patch.msgid.link/20260611-level-trigger-v5-1-4533a9e85ce2@onsemi.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-06-15net: ti: icssg: Use undirected TX tag for XDP zero copy in HSR offload modeMeghana Malladi1-2/+11
emac_xsk_xmit_zc() has the same issue as the fixed emac_xmit_xdp_frame(): it always sets the CPPI5 descriptor destination tag to emac->port_id, which directs the PRU firmware to transmit on only one slave port in HSR mode, breaking redundancy. Apply the same fix: in HSR offload mode when NETIF_F_HW_HSR_DUP is set, use PRUETH_UNDIRECTED_PKT_DST_TAG (port 0) so the PRU duplicates frames to both ports. Also set PRUETH_UNDIRECTED_PKT_TAG_INS when NETIF_F_HW_HSR_TAG_INS is set so the PRU re-inserts the HSR sequence tag that was stripped by the PRU on RX before the XDP program saw the frame. This ensures XSK XDP_TX frames in HSR mode are treated identically to skb TX via hsr0. Fixes: 8756ef2eb078 ("net: ti: icssg-prueth: Add AF_XDP zero copy for TX") Signed-off-by: Meghana Malladi <m-malladi@ti.com> Link: https://patch.msgid.link/20260611185744.2498070-4-m-malladi@ti.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-06-15net: ti: icssg: Use undirected TX tag for native XDP in HSR offload modeMeghana Malladi1-2/+19
emac_xmit_xdp_frame() always sets the CPPI5 descriptor destination tag to emac->port_id, which directs the PRU firmware to transmit the frame on that specific slave port only. In HSR offload mode this bypasses the firmware's HSR duplication logic: the frame goes out on one ring leg and never appears on the other, breaking HSR redundancy for XDP_TX paths. icssg_ndo_start_xmit() already handles this correctly: when HSR offload mode is active and NETIF_F_HW_HSR_DUP is set it substitutes PRUETH_UNDIRECTED_PKT_DST_TAG (port 0) so the PRU duplicates the frame to both slave ports. It also sets PRUETH_UNDIRECTED_PKT_TAG_INS in epib[1] when NETIF_F_HW_HSR_TAG_INS is set so the PRU inserts the HSR sequence tag, which XDP_TX frames lack (the tag is stripped by the PRU on RX before the frame reaches the XDP program). Apply the same logic in emac_xmit_xdp_frame() so XDP_TX frames in HSR mode are treated identically to skb TX via hsr0. Fixes: 62aa3246f462 ("net: ti: icssg-prueth: Add XDP support") Signed-off-by: Meghana Malladi <m-malladi@ti.com> Link: https://patch.msgid.link/20260611185744.2498070-3-m-malladi@ti.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-06-15net: ti: icssg-prueth: Fix AF_XDP fill ring alloc and wakeup conditionMeghana Malladi1-8/+6
emac_rx_packet_zc() calls prueth_rx_alloc_zc() with count (frames received in the current NAPI poll) as the allocation budget. Two problems arise from this: 1. When the CPPI5 descriptor pool is exhausted (avail_desc == 0, FDQ already holds the maximum number of descriptors), count > 0 still triggers allocation attempts that all fail, spamming the kernel log with "rx push: failed to allocate descriptor" at high packet rates. 2. The XSK wakeup condition "ret < count" is wrong when avail_desc is zero: ret == 0 and count can be up to 64, so the condition is always true. This causes ~200 spurious ndo_xsk_wakeup() calls per second even when the FDQ is already full, wasting CPU cycles in repeated NAPI invocations that process zero frames. Fix both by introducing alloc_budget = min(budget, avail_desc): - When avail_desc == 0 no allocation is attempted, avoiding pool exhaustion errors. The wakeup condition "ret < alloc_budget" evaluates to 0 < 0 == false, correctly clearing the wakeup flag so the hardware IRQ re-arms NAPI without spurious kicks. - In steady state avail_desc == count <= budget, so alloc_budget == count and behaviour is unchanged. - After a dry-ring stall (count == 0, avail_desc > 0), alloc_budget > 0 causes new descriptors to be posted to the FDQ so the hardware can resume receiving immediately. Fixes: 7a64bb388df3 ("net: ti: icssg-prueth: Add AF_XDP zero copy for RX") Signed-off-by: Meghana Malladi <m-malladi@ti.com> Link: https://patch.msgid.link/20260611185744.2498070-2-m-malladi@ti.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
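The budget clamp and the corrected wakeup test can be modelled in a few lines. The helper names below are invented for this sketch; only the min(budget, avail_desc) rule and the "ret < alloc_budget" comparison come from the commit message.

```c
#include <stdbool.h>

/* Never try to allocate more buffers than there are free descriptors. */
static unsigned int rx_alloc_budget(unsigned int budget,
				    unsigned int avail_desc)
{
	return budget < avail_desc ? budget : avail_desc;
}

/*
 * Ask user space for a wakeup only if fewer buffers were posted than
 * could have been, i.e. the fill ring ran short.
 */
static bool xsk_need_wakeup(unsigned int allocated,
			    unsigned int alloc_budget)
{
	return allocated < alloc_budget;
}
```

With avail_desc == 0 the clamp yields 0, so the comparison becomes 0 < 0 and no wakeup is requested, which is the spurious-kick case the fix removes.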
2026-06-15net: airoha: Fix MODULE_LICENSE to match SPDX GPL-2.0-only identifierWayen.Yan2-2/+2
Both airoha_eth.c and airoha_npu.c declare SPDX-License-Identifier: GPL-2.0-only but use MODULE_LICENSE("GPL"), which the kernel module loader interprets as GPL-2.0+ (any GPL version). This mismatch causes license compliance tools (FOSSology, ScanCode, etc.) to misidentify the effective license as more permissive than intended. Replace MODULE_LICENSE("GPL") with MODULE_LICENSE("GPL v2") to align with the GPL-2.0-only SPDX identifier. Per include/linux/module.h, "GPL v2" maps to GPL-2.0-only, matching the source files' declared license. Signed-off-by: Wayen <win847@gmail.com> Link: https://patch.msgid.link/6a2ded59.63d39acb.391892.7632@mx.google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-06-15net/mlx5: HWS: correct CONFIG_MLX5_HW_STEERING macro name in commentEthan Nelson-Moore1-1/+1
A comment in drivers/net/ethernet/mellanox/mlx5/core/steering/hws/fs_hws.h incorrectly refers to CONFIG_MLX5_HWS_STEERING instead of CONFIG_MLX5_HW_STEERING. Correct it. Discovered while searching for CONFIG_* symbols referenced in code but not defined in any Kconfig file. Signed-off-by: Ethan Nelson-Moore <enelsonmoore@gmail.com> Reviewed-by: Gal Pressman <gal@nvidia.com> Link: https://patch.msgid.link/20260613225904.140791-1-enelsonmoore@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-06-15net: airoha: Fix typos in comments and KconfigWayen.Yan2-6/+6
Fix several typos found during code review: - Kconfig: "Aiorha" -> "Airoha" in NET_AIROHA_FLOW_STATS help text - Comment: "CMD1" -> "CDM1" (Central DMA, not Command) - Comments: "GMD1/2/3/4" -> "GDM1/2/3/4" (Gigabit DMA, not GMD) These are pure comment and documentation fixes with no functional impact. Signed-off-by: Wayen.Yan <win847@gmail.com> Acked-by: Lorenzo Bianconi <lorenzo@kernel.org> Link: https://patch.msgid.link/6a2ca74a.c5b1db4e.21a698.01e7@mx.google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-06-15net: airoha: Fix always-true condition in PPE1 queue reservation loopWayen.Yan1-1/+1
In airoha_fe_pse_ports_init(), the inner condition for PPE1 queue reservation is identical to the for-loop bound, making it always true and the else branch dead code: for (q = 0; q < pse_port_num_queues[FE_PSE_PORT_PPE1]; q++) { if (q < pse_port_num_queues[FE_PSE_PORT_PPE1]) /* always true */ set RSV_PAGES; else set 0; /* unreachable */ } The intended behavior is to reserve pages only for the first half of the queues, matching the PPE2 implementation on line 334 which correctly uses the /2 divisor. Fix the PPE1 condition accordingly. Fixes: 23020f049327 ("net: airoha: Introduce ethernet support for EN7581 SoC") Signed-off-by: Wayen.Yan <win847@gmail.com> Acked-by: Lorenzo Bianconi <lorenzo@kernel.org> Link: https://patch.msgid.link/6a2ca3de.ad59c0a6.147df9.2ac1@mx.google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
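The intended per-queue decision can be written as a standalone helper to make the half-split explicit. The function name and parameters are hypothetical, and the real driver writes registers rather than returning a value.

```c
/*
 * Pages to reserve for queue q: only the first half of the queues get
 * a reservation, mirroring the "/2" bound described for PPE2.
 */
static unsigned int ppe_queue_rsv_pages(unsigned int q,
					unsigned int num_queues,
					unsigned int rsv_pages)
{
	return q < num_queues / 2 ? rsv_pages : 0;
}
```

Comparing against num_queues / 2 rather than num_queues is what makes the else branch reachable again.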
2026-06-15net: airoha: Fix non-standard return value in airoha_ppe_get_wdma_info()Wayen.Yan1-1/+1
airoha_ppe_get_wdma_info() returns -1 when the last path in the forwarding path stack is not of type DEV_PATH_MTK_WDMA. This is not a standard kernel error code. Replace it with -EINVAL since the input path type is invalid from the caller's perspective. Signed-off-by: Wayen.Yan <win847@gmail.com> Acked-by: Lorenzo Bianconi <lorenzo@kernel.org> Link: https://patch.msgid.link/6a2ca3d9.ad59c0a6.147df9.2a62@mx.google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-06-15net: atlantic: add AQC113 PTP support in aq_ptp and driver coreSukhdeep Singh5-138/+472
aq_ptp.c / aq_ptp.h: - Add aq_ptp_state enum (AQ_PTP_FIRST_INIT, AQ_PTP_LINK_UP, AQ_PTP_NO_LINK) to distinguish first init from link-change events; on AQC113 only reset the TSG clock on first init to avoid disrupting ongoing synchronization. - Add aq_ptp_dpath_enable() for comprehensive L3/L4 PTP filter setup/teardown, replacing the previous single-filter approach with an array of 4 slots for IPv4 and IPv6 PTP multicast addresses (224.0.1.129, 224.0.0.107, ff0e::181, ff02::6b). - Add aq_ptp_parse_rx_filters() to map hwtstamp_rx_filters to L2/L4 enable flags and call aq_ptp_dpath_enable(). - Re-apply RX filters on link change (hardware state lost after reset). - Extend PTP ring alloc/init/start/stop to handle AQC113 PTP ring ops. - Add per-instance PTP offset table for AQC113 with empirically measured values at 100M/1G/2.5G/5G/10G link speeds. - Export aq_ptp_dpath_enable() and updated ring helpers in aq_ptp.h. aq_hw.h: - Include hw_atl2/hw_atl2.h for AQC113 PTP type definitions. aq_nic.c: - Account for PTP IRQ vector (AQ_HW_PTP_IRQS) in vector count math. - Call hw_atl2 PTP re-enable hook after hardware reset in aq_nic_update_link_status(). aq_pci_func.c: - Pass PTP IRQ index to aq_ptp_irq_alloc() in probe path. Signed-off-by: Sukhdeep Singh <sukhdeeps@marvell.com> Link: https://patch.msgid.link/20260610115448.272-13-sukhdeeps@marvell.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-06-15net: atlantic: add AQC113 TX timestamp polling and PTP TX classificationSukhdeep Singh6-15/+64
aq_ring.h / aq_ring.c: - Add ptp_ts_deadline field to aq_ring_s to track TX timestamp timeout. - In aq_ring_tx_clean(): when hw_ring_tx_ptp_get_ts() returns 0 (HW not yet written back the timestamp), clear buff->is_mapped and buff->pa before breaking to prevent double dma_unmap on retry. When ptp_ts_deadline expires, dequeue and drop the head of skb_ring to keep it in lockstep with buff_ring, then clear request_ts and free the skb via dev_kfree_skb_any() to unblock the ring. aq_main.c: - Add IPv6 PTP packet detection in aq_ndev_start_xmit() using ipv6_hdr()->nexthdr for ETH_P_IPV6 frames, steering them through aq_ptp_xmit() alongside the existing IPv4 path. - Use PTP_EV_PORT/PTP_GEN_PORT constants instead of magic numbers 319/320. - Remove duplicate aq_reapply_rxnfc_all_rules() and aq_filters_vlans_update() calls from aq_ndev_open() - now covered by aq_nic_start(), which also ensures filters are restored correctly after PM resume. aq_nic.c: - Move aq_reapply_rxnfc_all_rules() and aq_filters_vlans_update() into aq_nic_start() after hardware init, replacing the duplicate calls that were removed from aq_ndev_open(). Signed-off-by: Sukhdeep Singh <sukhdeeps@marvell.com> Link: https://patch.msgid.link/20260610115448.272-12-sukhdeeps@marvell.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
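The port check that replaces the magic numbers 319/320 can be sketched as follows. The helper name is hypothetical and the constants are redefined locally so the example stands alone; in the kernel they come from linux/ptp_classify.h.

```c
#include <stdbool.h>
#include <stdint.h>

/* Local copies of the well-known PTP UDP ports. */
#define PTP_EV_PORT	319	/* event messages */
#define PTP_GEN_PORT	320	/* general messages */
#define L4_PROTO_UDP	17	/* IPPROTO_UDP; also the IPv6 nexthdr value */

/*
 * Classify a packet as PTP-over-UDP from its L4 protocol (IPv4 protocol
 * or IPv6 nexthdr) and its destination port in host byte order.
 */
static bool is_ptp_udp(uint8_t l4_proto, uint16_t dport)
{
	return l4_proto == L4_PROTO_UDP &&
	       (dport == PTP_EV_PORT || dport == PTP_GEN_PORT);
}
```

This sketch checks only the first next-header value, as the commit message describes; IPv6 packets carrying extension headers before UDP would not match.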
2026-06-15net: atlantic: add AQC113 PTP hardware ops in hw_atl2Sukhdeep Singh5-3/+223
Add the hardware-layer PTP implementation for AQC113 (Antigua): - hw_atl2.h/hw_atl2_utils.h/hw_atl2_internal.h: add PTP offset constants, RX timestamp size (HW_ATL2_RX_TS_SIZE=8), and reduced HW_ATL2_RXBUF_MAX=172 (AQC113 on-chip RX packet buffer hardware limit for data TCs). - hw_atl2.c: implement hw_atl2_enable_ptp() to reset and enable TSG clocks and set PTP TC scheduling priority after hardware reset. - hw_atl2.c: implement hw_atl2_adj_sys_clock(), hw_atl2_adj_clock_freq(), and aq_get_ptp_ts() for TSG clock read/adjust/increment operations. - hw_atl2.c: implement hw_atl2_gpio_pulse() for PPS output generation via TSG pulse generator. - hw_atl2.c: implement hw_atl2_hw_tx_ptp_ring_init() and hw_atl2_hw_rx_ptp_ring_init() for PTP ring setup. - hw_atl2.c: implement hw_atl2_hw_ring_tx_ptp_get_ts() to read TX timestamp from descriptor writeback, and hw_atl2_hw_rx_extract_ts() to extract RX timestamp from the 8-byte packet trailer. - hw_atl2.c: add hw_atl2_hw_get_clk_sel() helper. - Wire all new ops into hw_atl2_ops. Signed-off-by: Sukhdeep Singh <sukhdeeps@marvell.com> Link: https://patch.msgid.link/20260610115448.272-11-sukhdeeps@marvell.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-06-15net: atlantic: extend hw_ops and TX descriptor for AQC113 PTPSukhdeep Singh4-9/+48
Extend the aq_hw_ops interface with new function pointers required for PTP support on AQC113: - enable_ptp: enable/disable PTP counter with clock selection - hw_ring_tx_ptp_get_ts: read TX timestamp from descriptor writeback - hw_tx_ptp_ring_init/hw_rx_ptp_ring_init: per-ring PTP initialization - hw_get_clk_sel: query active TSG clock selection Update existing hw_ops signatures to support AQC113 dual-clock architecture: - hw_gpio_pulse: add clk_sel and hightime parameters - hw_extts_gpio_enable: add channel parameter Add PTP-related hardware defines: - AQ_HW_TXD_CTL_TS_EN/TS_TSG0 for TX descriptor timestamp control - AQ2_HW_PTP_COUNTER_HZ for AQC113 TSG clock frequency - AQ_HW_PTP_IRQS for PTP interrupt vector accounting - PTP enable flags (L2/L4) and TSG clock selection constants Add request_ts and clk_sel bitfields to aq_ring_buff_s for per-packet TX timestamp request tracking. Update hw_atl_b0.c (AQC107) implementations: - Adapt gpio_pulse and extts_gpio_enable to new signatures - Add TX descriptor timestamp bits for AQC113 when ANTIGUA chip feature is detected Signed-off-by: Sukhdeep Singh <sukhdeeps@marvell.com> Link: https://patch.msgid.link/20260610115448.272-10-sukhdeeps@marvell.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-06-15net: atlantic: add AQC113 PTP traffic class and TX path setupSukhdeep Singh2-7/+46
Add PTP traffic class (TC) buffer reservation and TX path improvements for AQC113: - Reserve dedicated TX and RX buffer space for PTP TC when PTP is enabled, reducing user TC buffers accordingly (TX: 8KB, RX: 16KB). - Configure PTP TC with no flow control and highest priority scheduling to ensure timely PTP packet transmission. TX path improvements: - Increase TX data and descriptor read-request limits when firmware has already enabled extended PCIe tag mode. Also simplify RSS queue calculation in hw_atl2_hw_rss_set() by extracting to a local variable and use unsigned types for loop variables to match their usage. Signed-off-by: Sukhdeep Singh <sukhdeeps@marvell.com> Link: https://patch.msgid.link/20260610115448.272-9-sukhdeeps@marvell.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-06-15net: atlantic: implement AQC113 L2/L3/L4 RX filter opsSukhdeep Singh1-0/+518
Implement complete RX filter management for AQC113 hardware: - Add tag-based ethertype filter policy (hw_atl2_filter_tag_get/put) that allocates and releases ART tags for L2 ethertype filters. - Add L3/L4 filter sharing via serialized usage counters in hw_atl2_l3_filter/hw_atl2_l4_filter, managed through hw_atl2_rxf_l3_get/put and hw_atl2_rxf_l4_get/put. - Implement L3 (IPv4/IPv6 source/destination address and protocol) filter find, get (program HW and increment refcount), and put (decrement refcount and clear HW when last user releases). - Implement L4 (TCP/UDP/SCTP source/destination port) filter management with the same find/get/put pattern. - Add combined L3L4 filter configuration (hw_atl2_new_fl3l4_configure) that translates legacy aq_rx_filter_l3l4 commands into AQC113 separate L3+L4 filter programming with Action Resolver Table (ART) entries. - Add L2 ethertype filter set/clear (hw_atl2_hw_fl2_set/clear) with tag-based ART integration. - Wire .hw_filter_l2_set, .hw_filter_l2_clear, .hw_filter_l3l4_set into hw_atl2_ops. Signed-off-by: Sukhdeep Singh <sukhdeeps@marvell.com> Link: https://patch.msgid.link/20260610115448.272-8-sukhdeeps@marvell.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
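The find/get/put sharing pattern described above can be pictured as a small refcounted slot table. This is a minimal sketch, not the driver's code: the class name, the method names and the use of a Python tuple as the match key are all hypothetical, and the "hardware write" is only a list assignment.

```python
class SharedFilterTable:
    """Toy model of a filter table whose slots are shared via usage counters."""

    def __init__(self, size):
        self.keys = [None] * size   # match key programmed into each slot
        self.refs = [0] * size      # number of users sharing each slot

    def find(self, key):
        """Return the index of an in-use slot holding this key, or -1."""
        for idx, k in enumerate(self.keys):
            if k == key and self.refs[idx] > 0:
                return idx
        return -1

    def get(self, key):
        """Reuse a matching slot or program a free one; return index or -1."""
        idx = self.find(key)
        if idx < 0:
            for i, r in enumerate(self.refs):
                if r == 0:
                    idx = i
                    break
            else:
                return -1           # table is full
            self.keys[idx] = key    # stands in for the hardware write
        self.refs[idx] += 1
        return idx

    def put(self, idx):
        """Drop one user; clear the slot when the last user releases it."""
        if self.refs[idx] == 0:
            return
        self.refs[idx] -= 1
        if self.refs[idx] == 0:
            self.keys[idx] = None   # stands in for clearing the hardware slot
```

Two rules that match on the same addresses and protocol end up with the same index from `get()`, and the slot is only cleared once both have called `put()`.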
2026-06-15net: atlantic: fix AQC113 HW init: ART, L2 filter slot, MAC addressSukhdeep Singh2-11/+72
Fix initialization issues in hw_atl2 to correctly support AQC113: - hw_atl2_hw_reset: replace unconditional priv memset with selective field clears so that l3l4_filters[].l3_index and l4_index can be initialized to -1 (not allocated) rather than 0; 0 is a valid filter index and would incorrectly appear as an occupied slot after a reset. - hw_atl2_hw_init_new_rx_filters: use firmware-reported ART section base and count (clamped to 16) instead of hardcoded 0xFFFF mask; enable simultaneous IPv4/IPv6 L3 filter mode (rpf_l3_v6_v4_select); tag the UC MAC slot using firmware-supplied l2_filters_base_index instead of hardcoded HW_ATL2_MAC_UC. - hw_atl2_hw_init_rx_path: enable only the firmware-assigned MAC slot (priv->l2_filters_base_index) instead of always slot 0. - Add hw_atl2_hw_mac_addr_set() that programs the MAC address into the firmware-assigned L2 filter slot. Wire into hw_atl2_ops replacing the A1 hw_atl_b0_hw_mac_addr_set; call it from hw_init. - Wire .hw_get_regs into hw_atl2_ops. Signed-off-by: Sukhdeep Singh <sukhdeeps@marvell.com> Link: https://patch.msgid.link/20260610115448.272-7-sukhdeeps@marvell.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-06-15net: atlantic: add AQC113 filter data structures, firmware query and register dumpSukhdeep Singh5-15/+169
Add filter infrastructure for AQC113 hardware: - Define L3 (IPv4/IPv6), L4 (TCP/UDP/SCTP), and combined L3L4 filter structures with serialized usage counter for filter sharing. - Define tag policy structure for ethertype filter management. - Add RPF L3/L4 command bit definitions for filter programming. - Add filter count constants for L3L4, L3V4, L4, VLAN, and ethertype. - Extend hw_atl2_priv with filter arrays, base indices, and counts discovered from firmware. Query filter capabilities from firmware shared memory at init time to discover available L2/L3/L4/VLAN/ethertype filter resources and ART (Action Resolver Table) configuration. Add hardware register dump utility for AQC113 debug support. Signed-off-by: Sukhdeep Singh <sukhdeeps@marvell.com> Link: https://patch.msgid.link/20260610115448.272-6-sukhdeeps@marvell.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-06-15net: atlantic: add AQC113 hardware register definitions and accessorsSukhdeep Singh3-4/+615
Add low-level hardware register definitions and accessor functions for AQC113 (Antigua) chip features: - L3/L4 filter command, tag, and address registers for IPv4/IPv6 - Ethertype filter tag registers - TSG (Time Stamp Generator) clock control, modification, and GPIO event generation/input timestamp registers - TX descriptor timestamp writeback, timestamp enable, and AVB enable registers - TX data/descriptor read request limit registers - TPB highest priority TC registers - PCIe extended tag enable register - RX descriptor timestamp request register - Action resolver section enable getter - GPIO special mode and TSG external GPIO TS input select Signed-off-by: Sukhdeep Singh <sukhdeeps@marvell.com> Link: https://patch.msgid.link/20260610115448.272-5-sukhdeeps@marvell.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-06-15net: atlantic: decouple aq_set_data_fl3l4() from driver internalsSukhdeep Singh2-8/+26
Refactor aq_set_data_fl3l4() to take an ethtool_rx_flow_spec pointer and an explicit HW register location instead of driver-internal structures (aq_nic_s, aq_rx_filter). This makes the function reusable for PTP filter setup which constructs flow specs independently. Key changes: - Add aq_is_ipv6_flow_type() helper to derive IPv6 status from the flow_type field, replacing the dependency on rx_fltrs->fl3l4.is_ipv6 shared state. - Change aq_set_data_fl3l4() signature to accept (fsp, data, location, add) and export it via aq_filters.h. - Update aq_add_del_fl3l4() to compute the HW register location and pass it explicitly. Signed-off-by: Sukhdeep Singh <sukhdeeps@marvell.com> Link: https://patch.msgid.link/20260610115448.272-4-sukhdeeps@marvell.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-06-15net: atlantic: move active_ipv4/ipv6 bitmap updates after HW writeSukhdeep Singh1-13/+23
Move active_ipv4/active_ipv6 bitmap updates from aq_set_data_fl3l4() into aq_add_del_fl3l4() after the hardware write succeeds. The bitmaps track which filter slots are actively programmed in hardware and must only be updated once the HW write is confirmed. The bitmap updates in aq_nic_reserve_filter() and aq_nic_release_filter() are intentionally retained: they guard the aq_check_approve_fl3l4() IPv4/IPv6 mixing validation for callers such as the AQC113 PTP path that program filters directly via hw_atl2_new_fl3l4_configure() without going through aq_add_del_fl3l4(). Signed-off-by: Sukhdeep Singh <sukhdeeps@marvell.com> Link: https://patch.msgid.link/20260610115448.272-3-sukhdeeps@marvell.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-06-15net: atlantic: correct L3L4 filter flow_type masking and IPv6 handlingSukhdeep Singh1-16/+17
Correct three issues in aq_set_data_fl3l4() required for the AQC113 PTP filter path introduced later in this series: 1. Mask FLOW_EXT from flow_type before the protocol switch statement. Flow types with FLOW_EXT set (e.g. TCP_V4_FLOW | FLOW_EXT) fall through to the default case and skip protocol comparison flags. 2. Extend the L3 address comparison check to cover all four IPv6 words. The original code only checked ip_src[0]/ip_dst[0] and required !is_ipv6, so CMP_SRC_ADDR_L3/CMP_DEST_ADDR_L3 were never set for IPv6 filters. 3. Use explicit flow type checks for port extraction instead of negating IP_USER_FLOW/IPV6_USER_FLOW. The old check did not mask FLOW_EXT, so IP_USER_FLOW | FLOW_EXT would incorrectly attempt port extraction. Use the actual flow type to pick the correct union member directly. Signed-off-by: Sukhdeep Singh <sukhdeeps@marvell.com> Link: https://patch.msgid.link/20260610115448.272-2-sukhdeeps@marvell.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
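Issues 1 and 3 above both come down to comparing a flow type without first masking FLOW_EXT. The sketch below illustrates that with the flow type values from the Linux uapi header include/uapi/linux/ethtool.h; the function names and the string labels standing in for the driver's comparison flags are hypothetical, and only the IPv4 protocol cases are modelled.

```python
# Flow type values as defined in include/uapi/linux/ethtool.h.
TCP_V4_FLOW = 0x01
UDP_V4_FLOW = 0x02
SCTP_V4_FLOW = 0x03
IP_USER_FLOW = 0x0D
IPV6_USER_FLOW = 0x0E
FLOW_EXT = 0x80000000

# Hypothetical labels standing in for the driver's protocol comparison flags.
PROTO_BY_FLOW = {TCP_V4_FLOW: "tcp", UDP_V4_FLOW: "udp", SCTP_V4_FLOW: "sctp"}


def protocol_of(flow_type):
    """Protocol selected once FLOW_EXT is masked off; None models 'default'."""
    return PROTO_BY_FLOW.get(flow_type & ~FLOW_EXT)


def wants_ports_negated(flow_type):
    """Old-style test: anything that is not a USER flow is assumed to have ports."""
    return flow_type not in (IP_USER_FLOW, IPV6_USER_FLOW)


def wants_ports_explicit(flow_type):
    """New-style test: only known port-carrying flow types qualify."""
    return (flow_type & ~FLOW_EXT) in PROTO_BY_FLOW
```

With `IP_USER_FLOW | FLOW_EXT` the negated test answers "has ports" because the unmasked value no longer equals IP_USER_FLOW, while the explicit test rejects it.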
2026-06-15net: dsa: netc: implement dynamic FDB entry ageingWei Fang2-0/+74
The NETC switch does not age out dynamic FDB entries automatically. Without software management, stale entries persist after topology changes and cause incorrect forwarding. Add a delayed work that periodically removes entries that have not been refreshed within the specified cycles. The effective ageing time is: ageing_time = fdbt_ageing_delay * 100 Default values are 3s interval and 100 cycles (300s total), matching the IEEE 802.1Q default ageing time. The work starts when the first port joins a bridge (tracked via br_cnt) and is cancelled when the last port leaves. All FDB operations are serialized under fdbt_lock. Implement .set_ageing_time() to allow the bridge layer to reconfigure ageing parameters on demand. Signed-off-by: Wei Fang <wei.fang@nxp.com> Link: https://patch.msgid.link/20260611021458.2629145-10-wei.fang@oss.nxp.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
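The default figures quoted above (a 3 s interval and 100 cycles, 300 s in total) can be checked with a few lines of arithmetic. The inverse mapping in `delay_for_ageing_ms()` is an assumption about how a requested ageing time might be turned back into an interval; the commit message does not spell that out, and both function names are hypothetical.

```python
AGEING_CYCLES = 100        # inactive cycles before removal (from the commit text)
DEFAULT_DELAY_MS = 3000    # default interval between ageing passes: 3 s


def effective_ageing_ms(delay_ms, cycles=AGEING_CYCLES):
    """Total time an idle entry survives: interval times number of cycles."""
    return delay_ms * cycles


def delay_for_ageing_ms(ageing_ms, cycles=AGEING_CYCLES):
    """Assumed inverse: keep the cycle count fixed and scale the interval.

    The result is floored at 1 ms so the periodic work never gets a zero delay.
    """
    return max(1, ageing_ms // cycles)
```

For example, `effective_ageing_ms(DEFAULT_DELAY_MS)` gives 300000 ms, i.e. the 300 s default.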
2026-06-15net: dsa: netc: add bridge mode supportWei Fang2-16/+363
Wire up the port_bridge_join, port_bridge_leave and port_vlan_filtering DSA callbacks to support both VLAN-unaware and VLAN-aware bridge modes. For VLAN-unaware bridges, each bridge instance is assigned a dedicated internal PVID via NETC_VLAN_UNAWARE_PVID(bridge.num), counting down from VID 4095. A VFT entry is created for this PVID with hardware MAC learning and flood-on-miss forwarding enabled. The CPU port is included as a VFT member so frames can reach the host. The reserved VID range is blocked in port_vlan_add to prevent user-space conflicts. Only one VLAN-aware bridge is supported at a time; this constraint is enforced in port_bridge_join and port_vlan_filtering. The per-port PVID is tracked in software and written to the BPDVR register whenever VLAN filtering is active. When a port leaves the bridge, its dynamic FDB entries are flushed right away in port_bridge_leave(), without waiting for the ageing cycle. When a link down event occurs on a port, netc_mac_link_down() will also clear the port's dynamic FDB entries via netc_port_remove_dynamic_entries(). Non-bridge ports have no dynamic FDB entries, so this call is always safe. Additionally, .port_fast_age() callback is added to flush the dynamic FDB entries associated to a port. Host flood rules are removed from the ingress port filter table when a port joins a bridge to avoid bypassing FDB lookup and MAC learning. Signed-off-by: Wei Fang <wei.fang@nxp.com> Link: https://patch.msgid.link/20260611021458.2629145-9-wei.fang@oss.nxp.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-06-15net: dsa: netc: add VLAN filter table and egress treatment managementWei Fang2-0/+463
Implement the DSA .port_vlan_add and .port_vlan_del operations to enable VLAN-aware bridge offloading on the NETC switch. VLAN membership is maintained in the VLAN Filter Table (VFT). Adding the first port to a VLAN creates a new VFT entry with hardware MAC learning and flood-on-miss forwarding; subsequent ports update the existing entry's membership bitmap. Removing the last port deletes the entry. Egress tagging is handled through the Egress Treatment Table (ETT). Each VLAN is allocated a group of ETT entries, one per available port. Ports are assigned a sequential ett_offset during initialisation, used to address each port's entry within the group. Untagged ports configure the ETT to strip the outer VLAN tag; tagged ports pass frames through unmodified. Each ETT group is optionally paired with an Egress Counter Table (ECT) group for per-port frame counting, allocated on a best-effort basis. When the egress rule of an ETT entry changes, the counter of the corresponding ECT entry will be recounted to track the number of frames that match the new egress rule. A software shadow list serialised by vft_lock tracks active VLAN state across both port membership and egress tagging. VID 0 is used for single port mode and is ignored by both callbacks. Signed-off-by: Wei Fang <wei.fang@nxp.com> Link: https://patch.msgid.link/20260611021458.2629145-8-wei.fang@oss.nxp.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-06-15net: enetc: add helpers to set/clear table bitmapWei Fang1-0/+24
NTMP index tables require software to allocate and manage entry IDs. Add two bitmap helper functions to facilitate this management: ntmp_lookup_free_eid(): finds the first zero bit in the given bitmap, sets it to mark the entry as in-use, and returns the corresponding entry ID. Returns NTMP_NULL_ENTRY_ID if no free entry is available. ntmp_clear_eid_bitmap(): clears the bit associated with the given entry ID in the bitmap to mark the entry as free. It is a no-op if the entry ID is NTMP_NULL_ENTRY_ID. Both functions are exported for use by other modules, such as the NETC switch driver which needs to manage group index bitmaps for the Egress Treatment Table (ETT) and Egress Count Table (ECT). Signed-off-by: Wei Fang <wei.fang@nxp.com> Link: https://patch.msgid.link/20260611021458.2629145-7-wei.fang@oss.nxp.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
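The behaviour of the two helpers can be modelled with an integer used as a bitmap. This is only a sketch of the semantics described above: the class and method names are hypothetical, and the value used for NTMP_NULL_ENTRY_ID is an assumption.

```python
NTMP_NULL_ENTRY_ID = 0xFFFFFFFF   # assumed sentinel value for "no entry"


class EidBitmap:
    """Toy entry-ID allocator: a set bit means the entry ID is in use."""

    def __init__(self, size):
        self.size = size
        self.bits = 0

    def lookup_free_eid(self):
        """Find the first zero bit, set it, and return its index."""
        for eid in range(self.size):
            if not (self.bits >> eid) & 1:
                self.bits |= 1 << eid
                return eid
        return NTMP_NULL_ENTRY_ID

    def clear_eid(self, eid):
        """Mark an entry ID as free; the null ID is silently ignored."""
        if eid == NTMP_NULL_ENTRY_ID:
            return
        self.bits &= ~(1 << eid)
```

Treating the null ID as a no-op in `clear_eid()` lets error paths release "whatever was allocated" without checking whether the allocation succeeded.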
2026-06-15net: dsa: netc: initialize the group bitmap of ETT and ECTWei Fang2-1/+95
The Egress Treatment Table (ETT) and Egress Count Table (ECT) are both index tables whose entry IDs are allocated by software. Every num_ports entries form a group, where each entry in the group corresponds to one port. To facilitate group allocation and management, initialize the group index bitmaps for both tables based on hardware capabilities reported by ETTCAPR and ECTCAPR registers. The bitmap size per table is calculated as the total number of hardware entries divided by the number of available ports, which gives the number of groups available for software allocation. A set bit in the bitmap represents a group index that has been allocated. These bitmaps will be used by subsequent patches that add VLAN support. Signed-off-by: Wei Fang <wei.fang@nxp.com> Link: https://patch.msgid.link/20260611021458.2629145-6-wei.fang@oss.nxp.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
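The group sizing described above is a plain integer division. The sketch below also shows one plausible way to address a port's entry inside a group; that contiguous layout is an assumption, not something the commit message states, and both function names are hypothetical.

```python
def num_groups(total_entries, num_ports):
    """Number of whole groups that fit in the table (bitmap size)."""
    return total_entries // num_ports


def entry_id(group_index, num_ports, port_offset):
    """Assumed layout: each group is a contiguous run of num_ports entries."""
    if not 0 <= port_offset < num_ports:
        raise ValueError("port offset outside the group")
    return group_index * num_ports + port_offset
```

With 64 hardware entries and 5 ports, for instance, 12 groups are available and the last 4 entries are left unused.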
2026-06-15net: enetc: add "Update" operation to the egress count tableWei Fang1-0/+45
The egress count table is a static bounded index table that maintains egress-related statistics. The table is implemented as a linear array of entries accessed using an index (0, 1, 2, ..., n) that uniquely identifies an entry within the array. Egress Counter Entry ID (EC_EID) is used as an index to an entry in this table. The EC_EID is specified in the egress treatment table. Egress count table entries are always present and enabled. The table only supports access via entry ID, which is assigned by the software. It supports Update, Query and Query followed by Update operations. Currently, only the Update operation is implemented. Signed-off-by: Wei Fang <wei.fang@nxp.com> Link: https://patch.msgid.link/20260611021458.2629145-5-wei.fang@oss.nxp.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-06-15net: enetc: add interfaces to manage egress treatment tableWei Fang2-0/+114
Each entry in the egress treatment table contains the egress packet processing actions to be applied to a grouping or scope of packets exiting on a particular egress port of the switch. A scope of packets, for example, could be the packets exiting a particular VLAN, matching a particular 802.1Q bridge forwarding entry or belonging to a stream identified at ingress. The egress treatment table is implemented as a linear array of entries accessed using an index (0, 1, 2, ..., n) that uniquely identifies an entry within the array. The egress treatment table only supports access via entry ID, which is assigned by the software. It supports Add, Update, Delete and Query operations. Note that the Query operation is not supported yet. Signed-off-by: Wei Fang <wei.fang@nxp.com> Link: https://patch.msgid.link/20260611021458.2629145-4-wei.fang@oss.nxp.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-06-15net: enetc: add "Update" and "Delete" operations to VLAN filter tableWei Fang2-7/+102
Add two interfaces to manage entries in the VLAN filter table: ntmp_vft_update_entry(): Update the configuration element data of the specified VLAN filter entry based on the given VLAN ID. It uses the exact key access method to locate the entry. ntmp_vft_delete_entry(): Delete the VLAN filter entry corresponding to the specified VLAN ID. It also uses the exact key access method to identify the target entry. In addition, introduce struct vft_req_qd to describe the request data buffer format for Query and Delete actions of the VLAN filter table, which contains a common request data header and a VLAN access key. Signed-off-by: Wei Fang <wei.fang@nxp.com> Link: https://patch.msgid.link/20260611021458.2629145-3-wei.fang@oss.nxp.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-06-15net: enetc: add interfaces to manage dynamic FDB entriesWei Fang2-2/+164
Add three interfaces to manage dynamic entries in the FDB table: ntmp_fdbt_update_activity_element(): Update the activity element of all dynamic FDB entries. For each entry, if its activity flag is not set, which means no packet has matched this entry since the last update, the activity counter is incremented. Otherwise, both the activity flag and activity counter are reset. The activity counter is used to track how long an FDB entry has been inactive, which is useful for implementing an ageing mechanism. ntmp_fdbt_delete_ageing_entries(): Delete all dynamic FDB entries whose activity flag is not set and whose activity counter is greater than or equal to the specified threshold. This is used to remove stale entries that have been inactive for too long. ntmp_fdbt_delete_port_dynamic_entries(): Delete all dynamic FDB entries associated with the specified switch port. This is typically called when a port goes down or is removed from a bridge. Signed-off-by: Wei Fang <wei.fang@nxp.com> Link: https://patch.msgid.link/20260611021458.2629145-2-wei.fang@oss.nxp.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
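The three operations above can be modelled on a list of entries to show how the activity flag and counter interact. This is a software-only sketch under assumed names: the `FdbEntry` fields and the function names are hypothetical, and the real interfaces operate on the hardware table rather than on a Python list.

```python
from dataclasses import dataclass


@dataclass
class FdbEntry:
    mac: str
    port: int
    dynamic: bool = True
    act_flag: bool = False   # set when a frame matched since the last pass
    act_cnt: int = 0         # consecutive passes without a match


def update_activity(entries):
    """One ageing pass: count idle entries, reset the ones that saw traffic."""
    for e in entries:
        if not e.dynamic:
            continue
        if e.act_flag:
            e.act_flag = False
            e.act_cnt = 0
        else:
            e.act_cnt += 1


def delete_ageing(entries, threshold):
    """Drop dynamic entries that are idle and have reached the threshold."""
    return [e for e in entries
            if not (e.dynamic and not e.act_flag and e.act_cnt >= threshold)]


def delete_port_dynamic(entries, port):
    """Drop every dynamic entry learned on the given port."""
    return [e for e in entries if not (e.dynamic and e.port == port)]
```

Calling `update_activity()` once per interval and `delete_ageing()` with the cycle count as threshold gives the ageing behaviour the bridge expects; static entries are never touched.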
2026-06-15net: sfp: extend SMBus supportJonas Jelonek1-27/+110
Commit 7662abf4db94 ("net: phy: sfp: Add support for SMBus module access") added SMBus access for SFP modules, but limited it to single-byte transfers. As a side effect, hwmon is disabled (16-bit reads cannot be guaranteed atomic) and a warning is printed. Many SMBus-only I2C controllers in the wild support more than just byte access, and SFP cages are often wired to such controllers rather than to a full-featured I2C controller -- e.g. the SMBus controllers in the Realtek longan and mango SoCs, which advertise word access and I2C block reads. Today, they cannot drive an SFP at all without falling back to the byte-only path. Extend sfp_smbus_read()/sfp_smbus_write() so that, in addition to the existing byte access, they also use SMBus word access and SMBus I2C block access whenever the adapter advertises them. Both directions are handled in a single read and a single write helper that pick the largest supported transfer per chunk and fall back as needed. I2C-block is preferred unconditionally when available: the protocol carries any length 1..32, so it can serve every chunk -- including the 1- and 2-byte tails -- without help from word or byte access. Note that this requires I2C_FUNC_SMBUS_I2C_BLOCK, which reads a caller-specified number of bytes. This deviates from the official SMBus Block Read (length is supplied by the slave) but is widely supported by Linux I2C controllers/drivers. Capability matrix this implementation supports: - BYTE only: works (unchanged behaviour); 1-byte xfers, hwmon disabled. - BYTE + WORD: word for >=2-byte chunks, byte for trailing odd byte. - I2C_BLOCK present (with or without BYTE/WORD): block as the universal transport for every chunk. - WORD only (no BYTE/BLOCK): accepted with WARN_ONCE. Even-length transfers work; odd-length transfers (e.g. the 3-byte cotsworks fixup write) hit the BYTE branch which the adapter does not implement, so the xfer returns an error and the operation is aborted. 
No mainline I2C driver was found to advertise WORD without BYTE; the warning lets us learn about it if it ever shows up. Adapters with asymmetric R/W capabilities (e.g. only READ_I2C_BLOCK but not WRITE_I2C_BLOCK) remain functionally correct -- the per-iteration fallback uses the direction-specific bits -- but the shared i2c_max_block_size is sized by the all-bits-set check, so a transfer in the better-supported direction is not upgraded. None of the mainline I2C bus drivers surveyed during review advertise such asymmetry; promoting i2c_max_block_size to per-direction sizes can be revisited if needed. Signed-off-by: Jonas Jelonek <jelonek.jonas@gmail.com> Reviewed-by: Maxime Chevallier <maxime.chevallier@bootlin.com> Link: https://patch.msgid.link/20260614133418.2068201-3-jelonek.jonas@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
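The capability matrix above can be turned into a small chunk planner. The sketch below is a model of the selection logic only: the capability bits, the function name and the `max_block` parameter are hypothetical, and it treats one direction at a time rather than reproducing the shared i2c_max_block_size sizing.

```python
BYTE, WORD, I2C_BLOCK = 1, 2, 4   # hypothetical capability bits for one direction
SMBUS_BLOCK_MAX = 32              # an I2C block transfer carries 1..32 bytes


def plan_transfer(length, caps, max_block=SMBUS_BLOCK_MAX):
    """Return a list of (kind, nbytes) chunks covering `length` bytes.

    Raises OSError when the adapter cannot serve a remaining chunk.
    """
    plan = []
    remaining = length
    while remaining > 0:
        if caps & I2C_BLOCK:
            n = min(remaining, max_block)   # block serves every chunk size
            plan.append(("block", n))
        elif caps & WORD and remaining >= 2:
            n = 2
            plan.append(("word", n))
        elif caps & BYTE:
            n = 1
            plan.append(("byte", n))
        else:
            raise OSError("adapter cannot transfer the remaining bytes")
        remaining -= n
    return plan
```

A WORD-only adapter handles a 4-byte transfer as two words, but a 3-byte transfer fails on the trailing byte, which is the case the WARN_ONCE is meant to surface.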
2026-06-15net: sfp: apply I2C adapter quirks to limit block sizeJonas Jelonek1-2/+10
The SFP driver assumes all I2C adapters support reading and writing the pre-defined block size SFP_EEPROM_BLOCK_SIZE of 16 bytes. This constant was probably chosen based on good guesses and known limitations of a range of I2C adapters and SFP modules. However, I2C adapters may even support less and usually need to specify this via I2C quirks. Theoretically, such an adapter may provide full functionality but only support a read and write length of e.g. 8 bytes. Currently, the SFP driver doesn't account for that. Add handling for I2C quirks in SFP I2C configuration taking the fields max_read_len and max_write_len in struct i2c_adapter_quirks into account to further limit the maximum block size if needed. Signed-off-by: Jonas Jelonek <jelonek.jonas@gmail.com> Reviewed-by: Maxime Chevallier <maxime.chevallier@bootlin.com> Link: https://patch.msgid.link/20260614133418.2068201-2-jelonek.jonas@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
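The clamping described above amounts to taking the minimum of the default block size and any non-zero adapter limit. A minimal sketch, assuming a zero quirk field means "no limit" as it does in the I2C core's quirk checks; the function name is hypothetical.

```python
SFP_EEPROM_BLOCK_SIZE = 16   # default block size named in the commit text


def max_block_size(max_read_len=0, max_write_len=0,
                   default=SFP_EEPROM_BLOCK_SIZE):
    """Clamp the block size by the adapter limits; 0 means "no limit"."""
    size = default
    for limit in (max_read_len, max_write_len):
        if limit:
            size = min(size, limit)
    return size
```

An adapter that reports `max_read_len = 8` would therefore be driven with 8-byte blocks, while one reporting larger limits keeps the 16-byte default.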
2026-06-15octeontx2-vf: clear stale mailbox IRQ state before request_irq()Runyu Xiao1-12/+10
otx2vf_register_mbox_intr() currently installs the VF mailbox IRQ handler before clearing stale mailbox interrupt state. The code then says that local interrupt bits should be cleared first to avoid spurious interrupts, but that clear still happens only after request_irq() has already made the handler reachable. A running system can reach this during VF mailbox interrupt registration while stale or latched RVU_VF_INT state is still present. If delivery happens in the request_irq()-to-clear window, otx2vf_vfaf_mbox_intr_handler() can run before local quiesce and touch the same vf->mbox and vf->mbox_wq carrier that probe and teardown later reuse or destroy. Move the stale mailbox interrupt clear ahead of request_irq(), but keep interrupt enabling after the handler is installed. This closes the pre-clear early-IRQ window without creating a new enable-before-handler window. Fixes: 3184fb5ba96e ("octeontx2-vf: Virtual function driver support") Cc: stable@vger.kernel.org Signed-off-by: Runyu Xiao <runyu.xiao@seu.edu.cn> Reviewed-by: Simon Horman <horms@kernel.org> Reviewed-by: Ratheesh Kannoth <rkannoth@marvell.com> Link: https://patch.msgid.link/20260611160014.3202224-3-runyu.xiao@seu.edu.cn Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-06-15octeontx2-pf: clear stale mailbox IRQ state before request_irq()Runyu Xiao1-11/+9
otx2_register_mbox_intr() currently installs the PF mailbox IRQ handler before clearing stale mailbox interrupt state. The function itself then comments that the local interrupt bits must be cleared first to avoid spurious interrupts, but that clear happens only after request_irq() has already exposed the handler to irq delivery. A running system can reach this during PF mailbox interrupt registration while stale or latched RVU_PF_INT state is still present. If delivery happens in the request_irq()-to-clear window, otx2_pfaf_mbox_intr_handler() can run before local quiesce and touch the same pf->mbox and pf->mbox_wq carrier that probe and teardown later reuse or destroy. Move the stale mailbox interrupt clear ahead of request_irq(), but keep interrupt enabling after the handler is installed. This closes the pre-clear early-IRQ window without creating a new enable-before-handler window. Fixes: 5a6d7c9daef3 ("octeontx2-pf: Mailbox communication with AF") Cc: stable@vger.kernel.org Signed-off-by: Runyu Xiao <runyu.xiao@seu.edu.cn> Reviewed-by: Simon Horman <horms@kernel.org> Reviewed-by: Ratheesh Kannoth <rkannoth@marvell.com> Link: https://patch.msgid.link/20260611160014.3202224-2-runyu.xiao@seu.edu.cn Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-06-15net: phy: sfp: detect presence via I2C when no MOD_DEF0 GPIOGreg Patrick1-3/+80
An SFP cage (compatible "sff,sfp") whose MOD_DEF0 signal is not wired to a GPIO currently falls back to sff_gpio_get_state(), which unconditionally reports the module as present. An empty cage therefore fails its probe and is parked in SFP_MOD_ERROR forever; because SFP_F_PRESENT never deasserts there is no REMOVE event to recover the state machine, so a module inserted after boot is never detected, and empty cages spam -EIO at boot. This affects boards that do not route the cage presence signal to a software-readable input. On the NicGiga S100-0800S-M (RTL9303, 8x SFP+) the cage I2C bus is the switch's SMBus master; TX_DISABLE is driven via a PCA9534 I/O expander, but no MOD_ABS/MOD_DEF0 line reaches a readable GPIO (the RTL9303 gpio0 lines read stuck-low, the single PCA9534 is fully consumed by TX_DISABLE, and there is no RTL8231). The Horaco ZX-SW82TS-L2P (RTL9302D, 2x SFP+) is independently affected in the same way. For such an SFP cage, derive presence from a throttled single-byte I2C read of the module EEPROM instead: a successful read asserts SFP_F_PRESENT, and R_PROBE_ABSENT consecutive failures clear it (to ride out a transient error on a live module). The existing poll then emits SFP_E_INSERT / SFP_E_REMOVE normally, giving working hot-plug and silencing the boot-time -EIO spam on empty cages. Presence is re-probed every T_PROBE_PRESENT, so insertion is detected within that interval and removal within T_PROBE_PRESENT * R_PROBE_ABSENT. A soldered-down module (compatible "sff,sff") has no presence signal and is genuinely always present, so it continues to use sff_gpio_get_state(); the new path is gated on the cage type advertising SFP_F_PRESENT. 
Signed-off-by: Greg Patrick <gregspatrick@hotmail.com> Tested-by: Manuel Stocker <mensi@mensi.ch> Reviewed-by: Maxime Chevallier <maxime.chevallier@bootlin.com> Tested-by: Maxime Chevallier <maxime.chevallier@bootlin.com> Link: https://patch.msgid.link/20260611175341.2223184-1-gregspatrick@hotmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
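The presence logic in the commit above (one successful read asserts presence, a run of consecutive failures clears it) can be modelled as a small debounce state machine. The class and method names are hypothetical, and the retry count is a constructor argument because the value of R_PROBE_ABSENT is not given in the message.

```python
class I2cPresence:
    """Toy model of I2C-derived module presence with failure debouncing."""

    def __init__(self, absent_retries):
        self.absent_retries = absent_retries   # plays the role of R_PROBE_ABSENT
        self.failures = 0
        self.present = False

    def probe(self, read_ok):
        """Feed one probe result; return "insert", "remove" or None."""
        was_present = self.present
        if read_ok:
            self.failures = 0
            self.present = True
        else:
            self.failures += 1
            if self.failures >= self.absent_retries:
                self.present = False
        if self.present and not was_present:
            return "insert"
        if was_present and not self.present:
            return "remove"
        return None
```

Because a single good read resets the failure count, an isolated bus error on a live module never produces a spurious removal event.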
2026-06-15net: dsa: mv88e6xxx: Avoid devlink resource IDs collision with PARENT_TOPDavid Yang1-4/+5
The devlink resource ID for ATU collides with the sentinel DEVLINK_RESOURCE_ID_PARENT_TOP (0). As a result, ATU_bin_* are in fact registered as top-level siblings, not as children of ATU. Whether intentional or unintentional, clarify it by keeping the real resource IDs starting at 1. Unfortunately ATU_bin_* are already registered at top-level, so keep their parent as PARENT_TOP. Signed-off-by: David Yang <mmyangfl@gmail.com> Link: https://patch.msgid.link/20260611070856.889700-5-mmyangfl@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
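Why an ID of 0 turns intended children into top-level siblings can be shown with a toy registry. This is not the devlink API: the `register()` helper and the dictionary layout are hypothetical, and only the sentinel value 0 is taken from the commit text.

```python
DEVLINK_RESOURCE_ID_PARENT_TOP = 0   # sentinel meaning "no parent" (from the commit text)


def register(tree, res_id, parent_id):
    """Toy registry: file each resource under its parent, top level under None."""
    if parent_id == DEVLINK_RESOURCE_ID_PARENT_TOP:
        parent = None
    else:
        parent = parent_id
    tree.setdefault(parent, []).append(res_id)
```

If the parent resource itself has ID 0, passing that ID as `parent_id` is indistinguishable from asking for a top-level registration; starting the real IDs at 1 removes the ambiguity.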
2026-06-15net: dsa: hellcreek: avoid devlink resource IDs collision with PARENT_TOPDavid Yang1-0/+1
This might not cause real problems, but the hellcreek devlink resource ID collides with the sentinel DEVLINK_RESOURCE_ID_PARENT_TOP (0). Avoid it by keeping the real resource IDs starting at 1. Signed-off-by: David Yang <mmyangfl@gmail.com> Acked-by: Kurt Kanzenbach <kurt@linutronix.de> Link: https://patch.msgid.link/20260611070856.889700-4-mmyangfl@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-06-15net: dsa: b53: avoid devlink resource IDs collision with PARENT_TOPDavid Yang1-0/+1
This might not cause real problems, but the b53 devlink resource ID collides with the sentinel DEVLINK_RESOURCE_ID_PARENT_TOP (0). Avoid it by keeping the real resource IDs starting at 1. Signed-off-by: David Yang <mmyangfl@gmail.com> Link: https://patch.msgid.link/20260611070856.889700-3-mmyangfl@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-06-15net: dsa: dsa_loop: avoid devlink resource IDs collision with PARENT_TOPDavid Yang1-0/+1
This might not cause real problems, but the dsa_loop devlink resource ID collides with the sentinel DEVLINK_RESOURCE_ID_PARENT_TOP (0). Avoid it by keeping the real resource IDs starting at 1. Signed-off-by: David Yang <mmyangfl@gmail.com> Link: https://patch.msgid.link/20260611070856.889700-2-mmyangfl@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-06-15net: dsa: hellcreek: replace kcalloc with struct_sizeRosen Penev2-10/+6
One fewer allocation for the priv struct. Signed-off-by: Rosen Penev <rosenp@gmail.com> Acked-by: Kurt Kanzenbach <kurt@linutronix.de> Link: https://patch.msgid.link/20260608045640.5172-1-rosenp@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-06-15net/mlx5: SD, enable SD over ECPF and allow switchdev transitionShay Drory2-14/+0
Remove the restriction blocking SD on embedded CPU PFs (ECPF), enabling SD functionality on BlueField DPUs. Remove the blocker preventing SD devices from transitioning to switchdev mode. The infrastructure added in earlier patches properly handles this case. Signed-off-by: Shay Drory <shayd@nvidia.com> Reviewed-by: Mark Bloch <mbloch@nvidia.com> Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Link: https://patch.msgid.link/20260612113904.537595-16-tariqt@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-06-15net/mlx5: SD, defer vport metadata init until SD is readyShay Drory3-3/+93
Allow SD devices to transition to switchdev before the SD group is fully up. Metadata allocation requires the SD group to be ready, so defer it from esw_offloads_enable() until SD shared-FDB activation. Add mlx5_esw_offloads_init_deferred_metadata() which allocates per-vport metadata and refreshes the ingress ACLs that were previously programmed with metadata=0. The helper is idempotent and can be called multiple times. Signed-off-by: Shay Drory <shayd@nvidia.com> Reviewed-by: Mark Bloch <mbloch@nvidia.com> Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Link: https://patch.msgid.link/20260612113904.537595-15-tariqt@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-06-15net/mlx5: E-Switch, Tie rep load/unload to SD LAG stateShay Drory5-0/+58
On an SD device, vport representors are not functional until the SD group is combined and shared FDB is active. Skip the initial load and the reload paths in that window; reps are loaded as part of the SD LAG activation flow once it becomes active. In addition, explicitly unload representors when SD LAG is destroyed. Signed-off-by: Shay Drory <shayd@nvidia.com> Reviewed-by: Mark Bloch <mbloch@nvidia.com> Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Link: https://patch.msgid.link/20260612113904.537595-14-tariqt@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-06-15net/mlx5: LAG, add MPESW over SD LAG supportShay Drory4-8/+105
Enable MPESW LAG creation over SD LAG members, forming a composite LAG hierarchy. This allows bonding multiple SD groups together under a single MPESW configuration with shared FDB. When enabling composite MPESW, the individual SD LAG shared FDB configurations are temporarily torn down and recreated when the composite LAG is disabled. Signed-off-by: Shay Drory <shayd@nvidia.com> Reviewed-by: Mark Bloch <mbloch@nvidia.com> Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Link: https://patch.msgid.link/20260612113904.537595-13-tariqt@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-06-15net/mlx5: LAG, introduce software vport LAG implementationShay Drory5-3/+280
SD LAG is a virtual LAG without hardware LAG support, so it cannot use the firmware vport LAG commands. Implement a software-based vport LAG using egress ACL bounce rules. Add esw_set_slave_egress_rule() to create an egress ACL rule on the slave's manager vport that bounces traffic to the master's manager vport. This achieves the same traffic steering as hardware vport LAG. Redirect mlx5_cmd_create_vport_lag() and mlx5_cmd_destroy_vport_lag() to the software implementation when operating in SD LAG mode. In addition, adjust lag_demux creation to check SD LAG mode as well. Signed-off-by: Shay Drory <shayd@nvidia.com> Reviewed-by: Mark Bloch <mbloch@nvidia.com> Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Link: https://patch.msgid.link/20260612113904.537595-12-tariqt@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-06-15net/mlx5: LAG, disable both regular and SD LAG on lag_disable_changeShay Drory1-2/+27
Extend mlx5_lag_disable_change() to properly disable both regular LAG and SD LAG when requested. Each LAG type uses its own devcom component for locking. Use mlx5_sd_get_devcom() helper to retrieve the SD devcom component, needed for proper locking when disabling SD LAG. Signed-off-by: Shay Drory <shayd@nvidia.com> Reviewed-by: Mark Bloch <mbloch@nvidia.com> Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Link: https://patch.msgid.link/20260612113904.537595-11-tariqt@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-06-15net/mlx5: LAG, store demux resources per master lag_funcShay Drory2-34/+68
The lag demux resources (flow table, flow group, and rules xarray) are stored on the shared ldev. With Socket Direct, multiple SD groups each create their own demux FT/FG during their master's IB device initialization. Since they all write to the same ldev fields, the second group's init overwrites the first group's pointers, leaking the first group's FT/FG. During teardown, the cleanup uses the overwritten pointers, destroying the wrong group's resources and leaving leaked flow tables in the LAG namespace. These leaked tables can interfere with subsequently created demux tables. Move the demux resources from the shared ldev to per-master lag_func instances. Each master device now owns its own independent demux state. The rule_add and rule_del helpers look up the appropriate master's lag_func via the existing filter/group infrastructure. Signed-off-by: Shay Drory <shayd@nvidia.com> Reviewed-by: Mark Bloch <mbloch@nvidia.com> Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Link: https://patch.msgid.link/20260612113904.537595-10-tariqt@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-06-15net/mlx5: E-Switch, notify SD on eswitch disableShay Drory1-0/+1
When eswitch is disabled, notify the SD layer so it can clean up SD-specific resources such as the TX flow table root configuration on secondary devices. Signed-off-by: Shay Drory <shayd@nvidia.com> Reviewed-by: Mark Bloch <mbloch@nvidia.com> Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Link: https://patch.msgid.link/20260612113904.537595-9-tariqt@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-06-15net/mlx5: SD, support switchdev mode transition with shared FDBShay Drory3-3/+154
When the eswitch transitions, propagate the change to SD: secondaries get their TX flow table root reconfigured for the new mode, and when all group devices move to switchdev, the per-group shared FDB is activated. Shared FDB activation is best-effort - failure does not block the eswitch transition; the next transition retries. Note: the existing mlx5_get_sd() guard that blocks switchdev for SD devices is intentionally retained. It will be removed once all supporting patches are in place. Signed-off-by: Shay Drory <shayd@nvidia.com> Reviewed-by: Mark Bloch <mbloch@nvidia.com> Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Link: https://patch.msgid.link/20260612113904.537595-8-tariqt@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-06-15net/mlx5: SD, expend vport metadata for SD secondary devicesShay Drory3-3/+25
In Socket Direct configurations the primary and secondary PFs share the same native_port_num. The eswitch vport metadata encodes pf_num in its upper bits to distinguish vports across PFs. Without SD-awareness, both PFs generate identical metadata, causing FDB rules to steer traffic to the wrong representor. Add mlx5_sd_pf_num_get() which remaps the pf_num for SD devices. Use it so each PF in an SD group produces unique vport metadata. Signed-off-by: Shay Drory <shayd@nvidia.com> Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Link: https://patch.msgid.link/20260612113904.537595-7-tariqt@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
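The metadata collision can be sketched as follows. The bit widths, the remapping formula, and every `ex_` name are assumptions chosen for illustration; the commit does not state how mlx5_sd_pf_num_get() derives its value.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical layout: pf_num in the upper bits, vport index in the lower bits. */
#define EX_VPORT_BITS 12u
#define EX_VPORT_MASK ((1u << EX_VPORT_BITS) - 1u)

/* Combine a PF number and a vport index into one metadata value. */
static uint32_t ex_vport_metadata(uint32_t pf_num, uint32_t vport)
{
	return (pf_num << EX_VPORT_BITS) | (vport & EX_VPORT_MASK);
}

/*
 * Hypothetical remap: give each device in an SD group a distinct pf_num by
 * offsetting the shared native port number with the device's group index.
 */
static uint32_t ex_sd_pf_num(uint32_t native_port_num, uint32_t sd_index,
			     uint32_t num_native_ports)
{
	return native_port_num + sd_index * num_native_ports;
}
```

Without the remap, a primary and a secondary that share native port 0 would both produce the same metadata for the same vport index; with it, the upper bits differ and the values no longer collide.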
2026-06-15net/mlx5: SD, add L2 table silent mode query supportShay Drory3-14/+114
Add mlx5_fs_cmd_query_l2table_silent() to query the current silent mode state from firmware. This allows detecting if firmware has already put secondary devices into silent mode. During SD group registration, query the silent mode of each device. If a device is already in silent mode (set by firmware), record this in the fw_silents_secondaries flag and use it to help determine the primary/secondary roles. When fw_silents_secondaries is set, skip the driver-initiated silent mode set/unset operations since firmware manages this state. This handles configurations where firmware persistently silences secondary devices. Signed-off-by: Shay Drory <shayd@nvidia.com> Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Link: https://patch.msgid.link/20260612113904.537595-6-tariqt@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-06-15net/mlx5: SD, make primary/secondary role determination more robustShay Drory1-34/+104
Refactor SD group registration to use devcom event-driven role determination to ensure SD is marked as ready only after roles are fully assigned and the group state is consistent, making outside accessors, which will be added in downstream patches, safe to use without races. The devcom events: - SD_PRIMARY_SET event: each device compares bus numbers with peers to determine which should be primary - SD_SECONDARIES_SET event: secondaries register themselves with the elected primary device Signed-off-by: Shay Drory <shayd@nvidia.com> Reviewed-by: Mark Bloch <mbloch@nvidia.com> Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Link: https://patch.msgid.link/20260612113904.537595-5-tariqt@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-06-15net/mlx5: devcom, add DEVCOM_CANT_FAIL for non-rollback eventsShay Drory2-1/+8
Some devcom events are not expected to fail. Rather than attempting a rollback that may not be meaningful, allow callers to pass DEVCOM_CANT_FAIL as the rollback_event to indicate that the event handler should not fail. If it does, emit a warning and stop propagating to further peers, but skip the rollback path. Signed-off-by: Shay Drory <shayd@nvidia.com> Reviewed-by: Mark Bloch <mbloch@nvidia.com> Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Link: https://patch.msgid.link/20260612113904.537595-4-tariqt@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
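The difference between the rollback path and the can't-fail path might look like the sketch below. It is a simplified userspace model: the sentinel value, the peer bookkeeping, and all `ex_` names are hypothetical, and it assumes rollback is sent only to peers that already handled the event.

```c
#include <assert.h>
#include <stdio.h>

#define EX_CANT_FAIL  (-1)  /* hypothetical sentinel passed as rollback_event */
#define EX_EVENT_DO   1
#define EX_EVENT_UNDO 2
#define EX_MAX_PEERS  8

/* Per-peer call log so the behaviour can be inspected. */
static int ex_calls[EX_MAX_PEERS];
static int ex_rollbacks[EX_MAX_PEERS];
static int ex_failing_peer = -1;

/* Clear the log and choose which peer (if any) fails the forward event. */
static void ex_reset(int failing_peer)
{
	for (int i = 0; i < EX_MAX_PEERS; i++) {
		ex_calls[i] = 0;
		ex_rollbacks[i] = 0;
	}
	ex_failing_peer = failing_peer;
}

/* Stand-in for a peer's event handler. */
static int ex_handle(int peer, int event)
{
	if (event == EX_EVENT_UNDO) {
		ex_rollbacks[peer]++;
		return 0;
	}
	ex_calls[peer]++;
	return peer == ex_failing_peer ? -1 : 0;
}

/* Dispatch to peers in order; on failure either roll back or just warn. */
static int ex_send_event(int npeers, int event, int rollback_event)
{
	for (int i = 0; i < npeers; i++) {
		int err = ex_handle(i, event);

		if (!err)
			continue;
		if (rollback_event == EX_CANT_FAIL) {
			/* Warn and stop propagating, but skip the rollback path. */
			fprintf(stderr, "event %d failed on peer %d\n", event, i);
			return err;
		}
		/* Undo the event on the peers that already handled it. */
		while (i-- > 0)
			ex_handle(i, rollback_event);
		return err;
	}
	return 0;
}
```

In both modes the peers after the failing one are never notified; the only difference is whether the earlier peers receive the rollback event.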
2026-06-15net/mlx5: devcom, expose locked variant of send_eventShay Drory2-7/+25
Factor mlx5_devcom_send_event() into two functions: - mlx5_devcom_locked_send_event(): performs the dispatch (and rollback) with comp->sem already held by the caller. - mlx5_devcom_send_event(): unchanged wrapper that takes comp->sem, calls the locked variant, and releases it. This lets callers bracket multiple event broadcasts under a single held write lock, eliminating the gap between consecutive dispatches where peer state could change. Will be used by a downstream patch. Signed-off-by: Shay Drory <shayd@nvidia.com> Reviewed-by: Mark Bloch <mbloch@nvidia.com> Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Link: https://patch.msgid.link/20260612113904.537595-3-tariqt@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-06-15net/mlx5: E-Switch, skip uplink IB rep load for SD secondary devicesShay Drory1-4/+21
SD secondary devices share the primary's uplink and do not have their own uplink representor. When reloading IB reps on secondary devices, skip the uplink and only load VF/SF vport IB reps. Signed-off-by: Shay Drory <shayd@nvidia.com> Reviewed-by: Mark Bloch <mbloch@nvidia.com> Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Link: https://patch.msgid.link/20260612113904.537595-2-tariqt@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-06-15Merge tag 'x86-cpu-2026-06-14' of ↵Linus Torvalds1-0/+1
gitolite.kernel.org:pub/scm/linux/kernel/git/tip/tip Pull x86 cpuid updates from Ingo Molnar: - CPUID API updates (Ahmed S. Darwish): - Introduce a centralized CPUID parser - Introduce a centralized CPUID data model - Introduce <asm/cpuid/leaf_types.h> - Rename cpuid_leaf()/cpuid_subleaf() APIs - treewide: Explicitly include the x86 CPUID headers - Update to x86-cpuid-db v3.1 (Maciej Wieczor-Retman) - Continued removal of pre-i586 support and related simplifications (Ingo Molnar) - Add Intel CPU model number for rugged Panther Lake (Tony Luck) - Misc fixes, updates and cleanups by Arnd Bergmann, Chao Gao, Lukas Bulwahn, Sohil Mehta, Maciej Wieczor-Retman. * tag 'x86-cpu-2026-06-14' of gitolite.kernel.org:pub/scm/linux/kernel/git/tip/tip: (25 commits) x86/cpu: Make CONFIG_X86_CX8 unconditional x86/cpu: Remove unused !CONFIG_X86_TSC code x86/cpuid: Update bitfields to x86-cpuid-db v3.1 tools/x86/kcpuid: Update bitfields to x86-cpuid-db v3.1 x86/cpu: Make CONFIG_X86_TSC unconditional MAINTAINERS: Drop obsolete FPU EMULATOR section x86/cpu: Fix a F00F bug warning and clean up surrounding code x86/cpu: Add Intel CPU model number for rugged Panther Lake x86/cpuid: Introduce a centralized CPUID parser x86/cpu: Introduce a centralized CPUID data model x86/cpuid: Introduce <asm/cpuid/leaf_types.h> x86/cpuid: Rename cpuid_leaf()/cpuid_subleaf() APIs x86/cpu: Do not include the CPUID API header in asm/processor.h Documentation: core-api/cpu_hotplug: Remove stale cpu0_hotplug docs x86/cpu, cpufreq: Remove AMD ELAN support x86/fpu: Remove the math-emu/ FPU emulation library x86/fpu: Remove the 'no387' boot option x86/fpu: Remove MATH_EMULATION and related glue code treewide: Explicitly include the x86 CPUID headers x86/cpu: Remove the CONFIG_X86_INVD_BUG quirk ...
2026-06-15Merge tag 'timers-ptp-2026-06-13' of ↵Linus Torvalds7-13/+21
gitolite.kernel.org:pub/scm/linux/kernel/git/tip/tip Pull timekeeping updates from Thomas Gleixner: "Updates for NTP/timekeeping and PTP: - Expand timekeeping snapshot mechanisms The various snapshot functions are mostly used for PTP to collect "atomic" snapshots of various involved clocks. They lack support for the recently introduced AUX clocks and do not provide the underlying counter value (e.g. TSC) to user space. Exposing the counter value snapshot allows for better control and steering. Convert the hard wired ktime_get_snapshot() to take a clock ID, which allows the caller to select the clock ID to be captured along with CLOCK_MONOTONIC_RAW. Additionally capture the underlying hardware counter value and the clock source ID of the counter. Expand the hardware based snapshot capture where devices provide a mechanism to snapshot the hardware PTP clock and the system counter (usually via PCI/PTM) to support AUX clocks and also provide the captured counter value back to the caller and not only the clock timestamps derived from it. - Add a new optional read_snapshot() callback to clocksources That is required to capture atomic snapshots from clocksources which are derived from TSC with a scaling mechanism (e.g. Hyper-V, KVMclock). The value pair is handed back in the snapshot structure to the callers, so they can do the necessary correlations in a more precise way. This touches usage sites of the affected functions and data structure all over the tree, but stays fully backwards compatible for the existing user space exposed interfaces.
New PTP IOCTLs will provide access to the extended functionality in later kernel versions" * tag 'timers-ptp-2026-06-13' of gitolite.kernel.org:pub/scm/linux/kernel/git/tip/tip: (28 commits) ptp: vmclock: Use hw_cycles from snapshot for precise TSC pairing x86/kvmclock: Implement read_snapshot() for kvmclock clocksource clocksource/hyperv: Implement read_snapshot() for TSC page clocksource timekeeping: Add clocksource read_snapshot() method and hw_cycles to snapshot ptp: Switch to ktime_get_snapshot_id() for pre/post timestamps timekeeping: Add support for AUX clock cross timestamping timekeeping: Remove system_device_crosststamp::sys_realtime ALSA: hda/common: Use system_device_crosststamp::sys_systime wifi: iwlwifi: Use system_device_crosststamp::sys_systime ptp: Use system_device_crosststamp::sys_systime timekeeping: Prepare for cross timestamps on arbitrary clock IDs timekeeping: Remove ktime_get_snapshot() virtio_rtc: Use provided clock ID for history snapshot net/mlx5: Use provided clock ID for history snapshot igc: Use provided clock ID for history snapshot ice/ptp: Use provided clock ID for history snapshot wifi: iwlwifi: Adopt PTP cross timestamps to core changes timekeeping: Add CLOCK ID to system_device_crosststamp timekeeping: Add system_counterval_t to struct system_device_crosststamp timekeeping: Add CLOCK_AUX support for ktime_get_snapshot_id() ...
2026-06-15Merge tag 'driver-core-7.2-rc1' of ↵Linus Torvalds1-1/+1
gitolite.kernel.org:pub/scm/linux/kernel/git/driver-core/driver-core Pull driver core updates from Danilo Krummrich: "Deferred probe: - Fix race where deferred probe timeout work could be permanently canceled by using mod_delayed_work() - Fix missing jiffies conversion in deferred_probe_extend_timeout() - Guard timeout extension with delayed_work_pending() to prevent premature firing - Use system_percpu_wq instead of the deprecated system_wq - Update deferred_probe_timeout documentation device: - Replace direct struct device bitfield access (can_match, dma_iommu, dma_skip_sync, dma_ops_bypass, state_synced, dma_coherent, of_node_reused, offline, offline_disabled) with flag-based accessors using bit operations - Reject devices with unregistered buses - Delete unused DEVICE_ATTR_PREALLOC() - Add low-level device attribute macros with const show/store callbacks, allowing device attributes to reside in read-only memory - Move core device attributes to read-only memory - Constify group array pointers in driver_add_groups() / driver_remove_groups(), struct bus_type, and struct device_driver device property: - Fix fwnode reference leak in fwnode_graph_get_endpoint_by_id() - Initialize all fields of fwnode_handle in fwnode_init() - Provide swnode_get()/swnode_put() wrappers around kobject_get/put() - Allow passing struct software_node_ref_args pointers directly to PROPERTY_ENTRY_REF() driver_override: - Migrate amba, cdx, vmbus, and rpmsg to the generic driver_override infrastructure, fixing a UAF from unsynchronized access to driver_override in bus match() callbacks - Remove the now-unused driver_set_override() firmware loader: - Fix recursive lock deadlock in device_cache_fw_images() when async work falls back to synchronous execution - Fix device reference leak in firmware_upload_register() platform: - Pass KBUILD_MODNAME through the platform driver registration macro to create module symlinks in sysfs for built-in drivers; move module_kset initialization to a 
pure_initcall and tegra cbb registration to core_initcall to ensure correct ordering - Pass THIS_MODULE implicitly through a coresight_init_driver() macro sysfs: - Upgrade OOB write detection in sysfs_kf_seq_show() from printk to WARN - Add return value clamping to sysfs_kf_read() Rust: - ACPI: Fix missing match data for PRP0001 by exporting acpi_of_match_device() - Auxiliary: Replace drvdata() with dedicated registration data on auxiliary_device. drvdata() exposed the driver's bus device private data beyond the driver's own scope, creating ordering constraints and forcing the data to outlive all registrations that access it. Registration data is instead scoped structurally to the Registration object, making lifecycle ordering enforced by construction rather than convention. - Rust-native device driver lifetimes (HRT): Allow Rust device drivers to carry a lifetime parameter on their bus device private data, tied to the device binding scope -- the interval during which a bus device is bound to a driver. Device resources like pci::Bar<'a> and IoMem<'a> can be stored directly in the driver's bus device private data with a lifetime bounded by the binding scope, so the compiler enforces at build time that they do not outlive the binding. This removes Devres indirection from every access site and eliminates try_access() failure paths in destructors. Bus driver traits use a Generic Associated Type (GAT) Data<'bound> to introduce the lifetime on the private data, rather than parameterizing the Driver trait itself. Auxiliary registration data, where the lifetime is not introduced by a trait callback but must be threaded through Registration, uses the ForLt trait (a type-level abstraction for types generic over a lifetime). 
Misc: - Fix DT overlayed devices not probing by reverting the broken treewide overlay fix and re-running fw_devlink consumer pickup when an overlay is applied to a bound device - Use root_device_register() for faux bus root device; add sanity check for failed bus init - Fix dev_has_sync_state() data race with READ_ONCE() and move it to base.h - Avoid spurious device_links warning when removing a device while its supplier is unbinding - Switch ISA bus to dynamic root device - Fix suspicious RCU usage in kernfs_put() - Remove devcoredump exit callback - Constify devfreq_event_class" * tag 'driver-core-7.2-rc1' of gitolite.kernel.org:pub/scm/linux/kernel/git/driver-core/driver-core: (81 commits) software node: allow passing reference args to PROPERTY_ENTRY_REF() driver core: platform: set mod_name in driver registration coresight: pass THIS_MODULE implicitly through a macro kernel: param: initialize module_kset in a pure_initcall soc/tegra: cbb: Move driver registration from pure_initcall to core_initcall firmware_loader: Fix recursive lock in device_cache_fw_images() driver core: Use system_percpu_wq instead of system_wq driver core: remove driver_set_override() rpmsg: use generic driver_override infrastructure Drivers: hv: vmbus: use generic driver_override infrastructure cdx: use generic driver_override infrastructure amba: use generic driver_override infrastructure rust: devres: add 'static bound to Devres<T> samples: rust: rust_driver_auxiliary: showcase lifetime-bound registration data rust: auxiliary: generalize Registration over ForLt rust: types: add `ForLt` trait for higher-ranked lifetime support gpu: nova-core: separate driver type from driver data samples: rust: rust_driver_pci: use HRT lifetime for Bar rust: io: make IoMem and ExclusiveIoMem lifetime-parameterized rust: pci: make Bar lifetime-parameterized ...
2026-06-14wifi: ath6kl: fix invalid workqueue flags in ath6kl_usb_create()wuyankun1-1/+1
ath6kl_usb_create() currently creates ath6kl_wq with flags set to 0: alloc_workqueue("ath6kl_wq", 0, 0) This triggers a runtime warning in __alloc_workqueue() because the queue is created with neither WQ_PERCPU nor WQ_UNBOUND set: workqueue: ath6kl_wq is using neither WQ_PERCPU or WQ_UNBOUND. Setting WQ_PERCPU. Set WQ_PERCPU explicitly to match the actual execution model and remove the warning during device probe. No functional change intended. Fixes: 21c05ca88a54 ("workqueue: Add warnings and ensure one among WQ_PERCPU or WQ_UNBOUND is present") Reported-by: syzbot+f80c62f371ba6a1e7d79@syzkaller.appspotmail.com Link: https://lore.kernel.org/all/6a289c01.39669fcc.33b062.00aa.GAE@google.com/T/ Signed-off-by: wuyankun <wuyankun@uniontech.com> Signed-off-by: Tejun Heo <tj@kernel.org>
2026-06-14geneve: Fix off-by-one comparing with GRO_LEGACY_MAX_SIZEAlice Mikityanska1-1/+1
GRO_LEGACY_MAX_SIZE = 65536; total_len being 65536 is too big to fit into a u16. As can be seen in skb_gro_receive, packets bigger than or equal to gro_max_size (or GRO_LEGACY_MAX_SIZE) are dropped with -E2BIG. Apply the same boundary to geneve_post_decap_hint to avoid an overflow when writing 65536 to the 16-bit iph->tot_len field. Fixes: fd0dd796576e ("geneve: use GRO hint option in the RX path") Signed-off-by: Alice Mikityanska <alice@isovalent.com> Reviewed-by: Willem de Bruijn <willemb@google.com> Link: https://patch.msgid.link/20260611192955.604661-3-alice.kernel@fastmail.im Signed-off-by: Paolo Abeni <pabeni@redhat.com>
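The boundary problem can be reproduced in isolation. The sketch below assumes the check has the shape of a simple comparison against the limit; the exact expression in geneve_post_decap_hint is not quoted in the commit text, and the `ex_` helpers are illustrative only.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Value stated in the commit message. */
#define EX_GRO_LEGACY_MAX_SIZE 65536u

/* Off-by-one variant: lets a total length of exactly 65536 through. */
static bool ex_len_ok_buggy(uint32_t total_len)
{
	return total_len <= EX_GRO_LEGACY_MAX_SIZE;
}

/* Fixed variant: anything greater than or equal to the limit is rejected. */
static bool ex_len_ok_fixed(uint32_t total_len)
{
	return total_len < EX_GRO_LEGACY_MAX_SIZE;
}

/* Storing into a 16-bit field keeps only the low 16 bits of the length. */
static uint16_t ex_store_tot_len(uint32_t total_len)
{
	return (uint16_t)total_len;
}
```

A length of 65536 passes the buggy check and is then stored as 0 in the 16-bit field, while 65535 is the largest value that survives the store unchanged.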
2026-06-13net: hns3: move fd code to a separate fileJijie Shao5-2610/+2650
The hclge_main.c file has become very large, so the fd code has been moved to a separate hclge_fd.c file. This patch only moves the code and does not modify any functionality. Signed-off-by: Jijie Shao <shaojijie@huawei.com> Link: https://patch.msgid.link/20260610060618.834987-7-shaojijie@huawei.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-06-13net: hns3: debugfs support for dumping fd rulesJijie Shao3-0/+162
Currently, the tc tool only supports adding and deleting rules from the driver but does not support querying rules from the driver. This patch adds a rule dump file in debugfs to check whether the driver's configuration matches the configuration issued by tc flow. Signed-off-by: Jijie Shao <shaojijie@huawei.com> Link: https://patch.msgid.link/20260610060618.834987-6-shaojijie@huawei.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-06-13net: hns3: support IP and tunnel VNI dissectors for tc flowJijie Shao2-7/+88
Currently, the driver does not support FLOW_DISSECTOR_KEY_IP and FLOW_DISSECTOR_KEY_ENC_KEYID. But the hardware supports ip_tos (FLOW_DISSECTOR_KEY_IP) and outer_tun_vni (FLOW_DISSECTOR_KEY_ENC_KEYID). This patch adds support for FLOW_DISSECTOR_KEY_IP and FLOW_DISSECTOR_KEY_ENC_KEYID. Additionally, since tc flow cannot effectively support l2_user_def, l3_user_def, and l4_user_def, this patch explicitly sets them to not be used. Signed-off-by: Jijie Shao <shaojijie@huawei.com> Link: https://patch.msgid.link/20260610060618.834987-5-shaojijie@huawei.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-06-13net: hns3: support two more actions for tc flowJijie Shao1-7/+56
Currently, the driver supports only one action: HCLGE_FD_ACTION_SELECT_TC. This patch adds support for HCLGE_FD_ACTION_SELECT_QUEUE and HCLGE_FD_ACTION_DROP_PACKET. A rule can have only one action. Therefore, the driver rejects rules that have multiple actions or no action. Note: The driver considers cls_flower->classid as an action: HCLGE_FD_ACTION_SELECT_TC. Signed-off-by: Jijie Shao <shaojijie@huawei.com> Link: https://patch.msgid.link/20260610060618.834987-4-shaojijie@huawei.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-06-13net: hns3: improve the unused_tuple parameter settingJijie Shao1-0/+12
Currently, when the tc tool is used to set flow table rules, the IP address and MAC address can be configured separately, for example, src_xx or dst_xx can be configured separately. Therefore, the driver needs to check whether the mask is all zero in keys, such as FLOW_DISSECTOR_KEY_IPV4_ADDRS, FLOW_DISSECTOR_KEY_IPV6_ADDRS, and FLOW_DISSECTOR_KEY_ETH_ADDRS. If the mask is all zero, the tuple is not configured. In this case, the driver adds the tuple to unused_tuple. Signed-off-by: Jijie Shao <shaojijie@huawei.com> Link: https://patch.msgid.link/20260610060618.834987-3-shaojijie@huawei.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
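A minimal sketch of the all-zero mask check follows. The tuple bit names and the two-field rule are hypothetical simplifications; the real driver handles more keys and defines its own tuple bits.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical tuple bits; the real driver defines its own. */
#define EX_TUPLE_SRC_IP (1u << 0)
#define EX_TUPLE_DST_IP (1u << 1)

/* True if every byte of the mask is zero, i.e. the field was not configured. */
static bool ex_mask_is_zero(const uint8_t *mask, size_t len)
{
	for (size_t i = 0; i < len; i++)
		if (mask[i])
			return false;
	return true;
}

/* Return the set of tuples that the rule leaves unconfigured. */
static uint32_t ex_unused_tuples(const uint8_t src_mask[4],
				 const uint8_t dst_mask[4])
{
	uint32_t unused = 0;

	if (ex_mask_is_zero(src_mask, 4))
		unused |= EX_TUPLE_SRC_IP;
	if (ex_mask_is_zero(dst_mask, 4))
		unused |= EX_TUPLE_DST_IP;
	return unused;
}
```

A rule that sets only the source address therefore reports the destination tuple as unused, which matches the case the commit describes where src_xx and dst_xx are configured separately.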
2026-06-13net: hns3: refactor add_cls_flower to prepare for multiple actionsJijie Shao3-38/+60
Remove the tc parameter from the add_cls_flower() ops callback and refactor action parsing to support future extensions for SELECT_QUEUE and DROP_PACKET actions. Changes: * Remove the tc parameter from the add_cls_flower() callback signature. * Extract TC-based action parsing into hclge_get_tc_flower_action(). * Move the dissector->used_keys check from hclge_parse_cls_flower() to hclge_check_cls_flower(), and restrict ETH_ADDRS to HCLGE_FD_MODE_DEPTH_2K_WIDTH_400B_STAGE_1 mode since hardware only supports MAC matching there. * Migrate error reporting from dev_err() to netlink extended ACK (extack). Signed-off-by: Jijie Shao <shaojijie@huawei.com> Link: https://patch.msgid.link/20260610060618.834987-2-shaojijie@huawei.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-06-13dpaa2-switch: unify the FDB update logic in dpaa2_switch_port_set_fdb()Ioana Ciornei1-31/+19
For both the join and leave paths, the logic goes through the following steps: determine which FDB should be used on a port after the current changeupper event, populate the private port structures with the new FDB and, if necessary, mark the old FDB as unused. Instead of having two distinct paths inside dpaa2_switch_port_set_fdb() for linking=true and linking=false, unify them. This will hopefully help in making this function easier to read. No behavior changes are expected. Signed-off-by: Ioana Ciornei <ioana.ciornei@nxp.com> Link: https://patch.msgid.link/20260610150912.1788482-6-ioana.ciornei@nxp.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-06-13dpaa2-switch: move FDB selection for leave path into a helperIoana Ciornei1-18/+30
Move the FDB selection for when a port leaves a bridge into a new helper - dpaa2_switch_fdb_for_leave(). This will hopefully make the dpaa2_switch_port_set_fdb() function easier to read and follow. The new helper only determines the FDB to be used; any updates to the private port structure still get done in the set_fdb() function. No changes in the actual behavior are intended. Signed-off-by: Ioana Ciornei <ioana.ciornei@nxp.com> Link: https://patch.msgid.link/20260610150912.1788482-5-ioana.ciornei@nxp.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-06-13dpaa2-switch: move FDB selection for join path into a helperIoana Ciornei1-25/+35
The dpaa2_switch_port_set_fdb() function handles the setup of the FDB for both changeupper cases: join and leave. Move the code block which handles the join path into a new helper - dpaa2_switch_fdb_for_join() - with the hope that the entire function will become easier to read and extend with other use cases in the future. This new helper just determines and returns what FDB should be used for a specific port; the cleanup of the old FDB and the actual setup in the per-port structure remain in the dpaa2_switch_port_set_fdb() function. No changes in the actual behavior are intended. Signed-off-by: Ioana Ciornei <ioana.ciornei@nxp.com> Link: https://patch.msgid.link/20260610150912.1788482-4-ioana.ciornei@nxp.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-06-13dpaa2-switch: factor out the FDB in-use check into a helperIoana Ciornei1-13/+18
The dpaa2_switch_port_set_fdb() function is hard to follow and open-coding the in-use check into it makes it even harder to read. Factor out that code block into a new helper - dpaa2_switch_fdb_in_use_by_others(). Signed-off-by: Ioana Ciornei <ioana.ciornei@nxp.com> Link: https://patch.msgid.link/20260610150912.1788482-3-ioana.ciornei@nxp.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-06-13dpaa2-switch: change dpaa2_switch_port_set_fdb() function prototypeIoana Ciornei1-16/+13
Since dpaa2_switch_port_set_fdb() never fails and its return value was never checked, change its prototype to return void. Also, instead of determining if the DPAA2 port is joining or leaving an upper based on the value of the 'bridge_dev' parameter, add the 'linking' parameter to explicitly specify the action. Signed-off-by: Ioana Ciornei <ioana.ciornei@nxp.com> Link: https://patch.msgid.link/20260610150912.1788482-2-ioana.ciornei@nxp.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-06-13igc: fix typos in commentsMaximilian Pezzullo2-3/+3
Fix spelling errors in code comments: - igc_diag.c: 'autonegotioation' -> 'autonegotiation' - igc_main.c: 'revisons' -> 'revisions' (two occurrences) Signed-off-by: Maximilian Pezzullo <maximilianpezzullo@gmail.com> Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com> Reviewed-by: Joe Damato <joe@dama.to> Tested-by: Avigail Dahan <avigailx.dahan@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Link: https://patch.msgid.link/20260609213559.178657-16-anthony.l.nguyen@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-06-13igb: fix typos in commentsMaximilian Pezzullo4-4/+4
Fix spelling errors in code comments: - e1000_nvm.c: 'likley' -> 'likely' - e1000_mac.c: 'auto-negotitation' -> 'auto-negotiation' - e1000_mbx.h: 'exra' -> 'extra' - e1000_defines.h: 'Aserted' -> 'Asserted' Signed-off-by: Maximilian Pezzullo <maximilianpezzullo@gmail.com> Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com> Reviewed-by: Joe Damato <joe@dama.to> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Link: https://patch.msgid.link/20260609213559.178657-15-anthony.l.nguyen@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-06-13e1000e: limit endianness conversion to boundary wordsAgalakov Daniil1-7/+12
[Why] In e1000_set_eeprom(), the eeprom_buff is allocated to hold a range of words. However, only the boundary words (the first and the last) are populated from the EEPROM if the write request is not word-aligned. The words in the middle of the buffer remain uninitialized because they are intended to be completely overwritten by the new data via memcpy(). The previous implementation had a loop that performed le16_to_cpus() on the entire buffer. This resulted in endianness conversion being performed on uninitialized memory for all interior words. Fix this by converting the endianness only for the boundary words immediately after they are successfully read from the EEPROM. Found by Linux Verification Center (linuxtesting.org) with SVACE. Co-developed-by: Iskhakov Daniil <dish@amicon.ru> Signed-off-by: Iskhakov Daniil <dish@amicon.ru> Signed-off-by: Agalakov Daniil <ade@amicon.ru> Tested-by: Avigail Dahan <avigailx.dahan@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Link: https://patch.msgid.link/20260609213559.178657-14-anthony.l.nguyen@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
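The boundary-word handling can be modelled in userspace as below. This is a sketch under stated assumptions: `rom` stands in for the EEPROM contents indexed from the first touched word, `ex_le16_to_cpu()` is a portable stand-in for the kernel helper, and the function names are not taken from the driver.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Portable stand-in for le16_to_cpu(): decode a raw little-endian word. */
static uint16_t ex_le16_to_cpu(uint16_t raw)
{
	uint8_t b[2];

	memcpy(b, &raw, sizeof(b));
	return (uint16_t)(b[0] | (b[1] << 8));
}

/*
 * Prepare buf for a byte write of len bytes (len >= 1) at byte offset off.
 * rom holds raw little-endian words, indexed from the first touched word.
 * Only partially overwritten boundary words are read and converted; the
 * interior words are left alone because the new data replaces them fully.
 * Returns the number of words that were read and converted.
 */
static int ex_fill_boundaries(uint16_t *buf, const uint16_t *rom,
			      size_t off, size_t len)
{
	size_t nwords = ((off + len - 1) >> 1) - (off >> 1) + 1;
	int converted = 0;

	if (off & 1) {			/* write starts in the middle of a word */
		buf[0] = ex_le16_to_cpu(rom[0]);
		converted++;
	}
	if ((off + len) & 1) {		/* write ends in the middle of a word */
		buf[nwords - 1] = ex_le16_to_cpu(rom[nwords - 1]);
		converted++;
	}
	return converted;
}
```

For an unaligned write covering four words, only the first and last entries of the buffer are touched here, so no conversion ever runs on uninitialized interior words.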
2026-06-13e1000: limit endianness conversion to boundary wordsAgalakov Daniil1-4/+6
[Why] In e1000_set_eeprom(), the eeprom_buff is allocated to hold a range of words. However, only the boundary words (the first and the last) are populated from the EEPROM if the write request is not word-aligned. The words in the middle of the buffer remain uninitialized because they are intended to be completely overwritten by the new data via memcpy(). The previous implementation had a loop that performed le16_to_cpus() on the entire buffer. This resulted in endianness conversion being performed on uninitialized memory for all interior words. Fix this by converting the endianness only for the boundary words immediately after they are successfully read from the EEPROM. Found by Linux Verification Center (linuxtesting.org) with SVACE. Co-developed-by: Iskhakov Daniil <dish@amicon.ru> Signed-off-by: Iskhakov Daniil <dish@amicon.ru> Signed-off-by: Agalakov Daniil <ade@amicon.ru> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Link: https://patch.msgid.link/20260609213559.178657-13-anthony.l.nguyen@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-06-13e1000e: Use __napi_schedule_irqoff()Matt Vollrath1-3/+3
The __napi_schedule_irqoff() macro is intended to bypass saving and restoring IRQ state when scheduling is requested from an IRQ handler, where hard interrupts are already disabled. Use this macro in all three interrupt handlers. This was tested on a system with an I218-V and MSI interrupts. Because this is an optimization, I was interested in measuring the impact, so I added ktime_get() time measurement to e1000_intr_msi and a print of the last sample in the watchdog task. For each test case I ran a bi-directional iperf3 to saturate the line. With some help from awk, here are the statistics. 49 samples each, all units ns previous: min 678 max 1265 mean 879.429 median 806 stddev 137.188 noirq: min 707 max 1165 mean 811.857 median 790 stddev 89.486 According to this informal comparison, the mean time to handle an interrupt from start to finish is improved by about 8% under load. Signed-off-by: Matt Vollrath <tactii@gmail.com> Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com> Tested-by: Michal Cohen <michalx.cohen@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Link: https://patch.msgid.link/20260609213559.178657-12-anthony.l.nguyen@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
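The "about 8%" figure follows directly from the two reported means. The helper below is a hypothetical illustration of that arithmetic, not part of the patch.

```c
#include <assert.h>

/* Relative reduction of a mean, in percent of the original value. */
static double ex_percent_reduction(double before, double after)
{
	return (before - after) / before * 100.0;
}
```

Calling `ex_percent_reduction(879.429, 811.857)` gives roughly 7.7, which the commit message rounds to about 8%.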