diff options
| author | Jakub Kicinski <kuba@kernel.org> | 2026-06-24 11:20:17 -0700 |
|---|---|---|
| committer | Jakub Kicinski <kuba@kernel.org> | 2026-06-25 10:18:40 -0700 |
| commit | cd1c188db1091991fc1d7f565824d077d659425b (patch) | |
| tree | 7e851c2ced991bade8899e3b8d07aee78d14092a /Documentation | |
| parent | 129cdce9da9e44c52d38889e0411be9817bca114 (diff) | |
| download | ath-cd1c188db1091991fc1d7f565824d077d659425b.tar.gz | |
vlan: defer real device state propagation to netdev_work
vlan_device_event() generates nested UP/DOWN, MTU and feature
change events. It executes an event for the VLAN device directly
from the notifier - while the locks of the lower device are held.
This causes deadlocks, for example:
bond (3) bond_update_speed_duplex(vlan)
| ^ v
vlan (2) UP(vlan) (4) vlan_ethtool_get_link_ksettings()
| ^ v
dummy (1) UP(dummy) (5) __ethtool_get_link_ksettings()
The dummy device is ops locked, vlan creates a nested event (2),
then bond wants to ask vlan for link state (3). bond uses the
"I'm already holding the instance lock" flavor of API. But in
this case the lock held refers to vlan itself. We hit vlan's
link settings trampoline (4) and call __ethtool_get_link_ksettings()
which tries to lock dummy. Deadlock. There's no clean way for us
to tell the vlan_ethtool_get_link_ksettings() that the caller
is already in lower device's critical section.
Defer the propagation to the per-netdev work facility instead:
the notifier only schedules netdev_work_sched(vlandev, VLAN_WORK_*),
and ndo_work (vlan_dev_work) applies the change later. Hopefully
nobody expects the VLAN state changes to be instantaneous.
If someone does expect the changes to be instantaneous we will
have to do the same thing Stan did for rx_mode and "strategically"
place sync calls, to make sure such delayed works are executed
after we drop the ops lock but before we drop rtnl_lock.
Stan suggests that if we need that down the line we may
consider reshaping the mechanism into "async notifications".
AFAICT only vlan does this sort of netdev open chaining,
so as a first try I think that sticking the complexity into
the vlan code makes sense.
One corner case is that we need to cancel the event if user
explicitly changes the state before work could run. Consider
the following operations with vlan0 on top of dummy0:
ip link set dev dummy0 up # queues work to up vlan0
ip link set dev vlan0 down # user explicitly downs the vlan
ndo_work # acts on the stale event
Reported-by: syzbot+09da62a8b78959ceb8bb@syzkaller.appspotmail.com
Reported-by: syzbot+cb67c392b0b8f0fd0fc1@syzkaller.appspotmail.com
Reported-by: syzbot+9bb8bd77f3966641f298@syzkaller.appspotmail.com
Fixes: 9f275c2e9020 ("net: ethtool: make sure __ethtool_get_link_ksettings() is ops-locked")
Reviewed-by: Kuniyuki Iwashima <kuniyu@google.com>
Reviewed-by: Nicolai Buchwitz <nb@tipi-net.de>
Acked-by: Stanislav Fomichev <sdf@fomichev.me>
Link: https://patch.msgid.link/20260624182018.2445732-4-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Diffstat (limited to 'Documentation')
| -rw-r--r-- | Documentation/networking/netdevices.rst | 2 |
1 files changed, 2 insertions, 0 deletions
diff --git a/Documentation/networking/netdevices.rst b/Documentation/networking/netdevices.rst index fde601acd1d25..d2a238f8cc8b9 100644 --- a/Documentation/networking/netdevices.rst +++ b/Documentation/networking/netdevices.rst @@ -433,6 +433,8 @@ exceptions) notifiers run under the instance lock. Please extend this documentation whenever you make explicit assumption about lock being held from a notifier. +Drivers **must not** generate nested notifications of the ops-locked types. + NETDEV_INTERNAL symbol namespace ================================ |
