System freeze after suspending two instances of Neovim with io_uring enabled #1113

Description

@pajlada

Hi!
On Arch Linux and Fedora Rawhide, suspending two instances of Neovim (which uses libuv, which in turn uses io_uring) freezes the system. I can no longer type in any shell or spawn a new one, but I can still run some simple commands over ssh (e.g. ssh myserver ls -la).
As far as I could tell, dmesg doesn't report anything interesting other than hung-task warnings for some of the apps that were running:

Mar 31 11:53:31 billy kernel: INFO: task st:29757 blocked for more than 122 seconds.
Mar 31 11:53:31 billy kernel:       Tainted: P           OE      6.8.2-arch2-1 #1
Mar 31 11:53:31 billy kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Mar 31 11:53:31 billy kernel: task:st              state:D stack:0     pid:29757 tgid:29757 ppid:29751  flags:0x00004006
Mar 31 11:53:31 billy kernel: Call Trace:
Mar 31 11:53:31 billy kernel:  <TASK>
Mar 31 11:53:31 billy kernel:  __schedule+0x3e6/0x1520
Mar 31 11:53:31 billy kernel:  schedule+0x32/0xd0
Mar 31 11:53:31 billy kernel:  schedule_timeout+0x151/0x160
Mar 31 11:53:31 billy kernel:  wait_for_completion+0x86/0x170
Mar 31 11:53:31 billy kernel:  __flush_work.isra.0+0x173/0x280
Mar 31 11:53:31 billy kernel:  ? __pfx_wq_barrier_func+0x10/0x10
Mar 31 11:53:31 billy kernel:  n_tty_poll+0x134/0x1e0
Mar 31 11:53:31 billy kernel:  tty_poll+0x57/0xc0
Mar 31 11:53:31 billy kernel:  do_select+0x362/0x880
Mar 31 11:53:31 billy kernel:  ? pollwake+0x50/0xa0
Mar 31 11:53:31 billy kernel:  ? __pfx_pollwake+0x10/0x10
Mar 31 11:53:31 billy kernel:  ? __pfx_pollwake+0x10/0x10
Mar 31 11:53:31 billy kernel:  ? __pfx_pollwake+0x10/0x10
Mar 31 11:53:31 billy kernel:  core_sys_select+0x36b/0x530
Mar 31 11:53:31 billy kernel:  do_pselect.constprop.0+0xe9/0x180
Mar 31 11:53:31 billy kernel:  __x64_sys_pselect6+0x3d/0x70
Mar 31 11:53:31 billy kernel:  do_syscall_64+0x86/0x170
Mar 31 11:53:31 billy kernel:  ? do_syscall_64+0x96/0x170
Mar 31 11:53:31 billy kernel:  ? do_syscall_64+0x96/0x170
Mar 31 11:53:31 billy kernel:  ? exc_page_fault+0x7f/0x180
Mar 31 11:53:31 billy kernel:  entry_SYSCALL_64_after_hwframe+0x6e/0x76
Mar 31 11:53:31 billy kernel: RIP: 0033:0x7541e536b640
Mar 31 11:53:31 billy kernel: RSP: 002b:00007ffec990be70 EFLAGS: 00000202 ORIG_RAX: 000000000000010e
Mar 31 11:53:31 billy kernel: RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007541e536b640
Mar 31 11:53:31 billy kernel: RDX: 0000000000000000 RSI: 00007ffec990bf50 RDI: 0000000000000005
Mar 31 11:53:31 billy kernel: RBP: 0000000000000000 R08: 0000000000000000 R09: 00007ffec990beb0
Mar 31 11:53:31 billy kernel: R10: 0000000000000000 R11: 0000000000000202 R12: bff0000000000000
Mar 31 11:53:31 billy kernel: R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000010
Mar 31 11:53:31 billy kernel:  </TASK>
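
To make the trace above easier to read, here is a minimal sketch that keeps only the reliable stack frames (lines without the leading "?", which marks addresses the unwinder could not verify). The function name reliable_frames and the SAMPLE excerpt are illustrative choices of mine, not part of any existing tool. My reading of the result is that st is blocked in a pselect call that polls the tty and waits for a work item to be flushed.

```python
import re

# Matches journal lines such as "... kernel:  tty_poll+0x57/0xc0" and
# "... kernel:  ? pollwake+0x50/0xa0" (group 1 captures the "?" marker).
FRAME = re.compile(
    r"kernel:\s+(\?\s+)?([A-Za-z_][\w.]*)\+0x[0-9a-f]+/0x[0-9a-f]+\s*$"
)


def reliable_frames(log_text):
    """Return the function names of the reliable frames, innermost first."""
    frames = []
    for line in log_text.splitlines():
        match = FRAME.search(line)
        if match and not match.group(1):  # skip frames marked with "?"
            frames.append(match.group(2))
    return frames


# Excerpt copied from the log above.
SAMPLE = """\
Mar 31 11:53:31 billy kernel:  wait_for_completion+0x86/0x170
Mar 31 11:53:31 billy kernel:  __flush_work.isra.0+0x173/0x280
Mar 31 11:53:31 billy kernel:  ? __pfx_wq_barrier_func+0x10/0x10
Mar 31 11:53:31 billy kernel:  n_tty_poll+0x134/0x1e0
Mar 31 11:53:31 billy kernel:  tty_poll+0x57/0xc0
Mar 31 11:53:31 billy kernel:  do_select+0x362/0x880
"""

if __name__ == "__main__":
    # Each name is called by the one to its right.
    print(" <- ".join(reliable_frames(SAMPLE)))
```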

The freeze doesn't occur after disabling io_uring, either in libuv with UV_USE_IO_URING=0 or in the kernel with sysctl kernel.io_uring_disabled=1.
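
For anyone trying to confirm which of these two workarounds is in effect on a test machine, a small sketch follows. It assumes the sysctl is exposed at /proc/sys/kernel/io_uring_disabled and treats any value of 1 or higher as "disabled"; the helper names are mine. It only inspects the environment it is given, so it says nothing about how libuv itself interprets the variable.

```python
import os

# Assumed procfs location of the kernel.io_uring_disabled sysctl.
SYSCTL_PATH = "/proc/sys/kernel/io_uring_disabled"


def read_sysctl(path=SYSCTL_PATH):
    """Return the raw sysctl contents, or None if the file cannot be read."""
    try:
        with open(path) as handle:
            return handle.read()
    except OSError:
        return None  # e.g. a kernel that predates this sysctl


def active_workarounds(environ, sysctl_raw):
    """List which of the two workarounds from this report appear to be set."""
    found = []
    if environ.get("UV_USE_IO_URING") == "0":
        found.append("UV_USE_IO_URING=0 (libuv)")
    if sysctl_raw is not None:
        value = int(sysctl_raw.strip())
        if value >= 1:
            found.append("kernel.io_uring_disabled=%d (kernel)" % value)
    return found


if __name__ == "__main__":
    result = active_workarounds(os.environ, read_sysctl())
    print(result if result else "no workaround active")
```

Note that UV_USE_IO_URING has to be set in the environment Neovim is started from, so run the check from the same shell.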

uname -a from the tested systems

  • Linux billy 6.6.23-1-lts #1 SMP PREEMPT_DYNAMIC Wed, 27 Mar 2024 07:47:20 +0000 x86_64 GNU/Linux running Arch Linux
  • Linux yolen 6.8.2-arch2-1 #1 SMP PREEMPT_DYNAMIC Thu, 28 Mar 2024 17:06:35 +0000 x86_64 GNU/Linux running Arch Linux
  • Linux localhost 6.9.0-0.rc1.20240329git317c7bc0ef03.20.fc41.x86_64 #1 SMP PREEMPT_DYNAMIC Fri Mar 29 14:04:53 UTC 2024 x86_64 GNU/Linux running Fedora Rawhide

Reproduction steps

  • Be on Arch Linux (bug reproducible on netcup & hetzner servers) or Fedora Rawhide
  • Install Neovim (pacman -S neovim)
  • Run Neovim
  • Suspend Neovim (by pressing CTRL+Z)
  • Run Neovim again
  • Suspend Neovim (by pressing CTRL+Z)
  • Your system is now most likely unresponsive

The suspension can be done in separate shells, or as different users with the same results.

Video showing off the freeze

nvim-suspend-freeze.mp4

I'm still able to run certain apps on the system, but I can't open a shell.

If the io-uring@vger.kernel.org mailing list is a better place for this report, let me know and I'll report it there instead.
Originally reported in libuv/libuv#4377.
