MAINTAINERS: update by ojeda · Pull Request #125 · Rust-for-Linux/linux

@ojeda

  - Add Alex and Miguel as maintainers.
  - Sort the keys properly.
  - Remove the unexisting `F:` key.
  - Add `B:` and `T:` keys.

Fixes #123.

Reported-by: Josh Abraham <sinisterpatrician@gmail.com>
Signed-off-by: Miguel Ojeda <ojeda@kernel.org>

alex

approved these changes Mar 18, 2021

@ojeda ojeda deleted the maintainers branch

March 18, 2021 18:23

ojeda pushed a commit that referenced this pull request

Jan 22, 2024
Like commit 1cf3bfc ("bpf: Support 64-bit pointers to kfuncs")
for s390x, add support for 64-bit pointers to kfuncs for LoongArch.
Since the infrastructure is already implemented in BPF core, the only
thing need to be done is to override bpf_jit_supports_far_kfunc_call().

Before this change, several test_verifier tests failed:

  # ./test_verifier | grep # | grep FAIL
  #119/p calls: invalid kfunc call: ptr_to_mem to struct with non-scalar FAIL
  #120/p calls: invalid kfunc call: ptr_to_mem to struct with nesting depth > 4 FAIL
  #121/p calls: invalid kfunc call: ptr_to_mem to struct with FAM FAIL
  #122/p calls: invalid kfunc call: reg->type != PTR_TO_CTX FAIL
  #123/p calls: invalid kfunc call: void * not allowed in func proto without mem size arg FAIL
  #124/p calls: trigger reg2btf_ids[reg->type] for reg->type > __BPF_REG_TYPE_MAX FAIL
  #125/p calls: invalid kfunc call: reg->off must be zero when passed to release kfunc FAIL
  #126/p calls: invalid kfunc call: don't match first member type when passed to release kfunc FAIL
  #127/p calls: invalid kfunc call: PTR_TO_BTF_ID with negative offset FAIL
  #128/p calls: invalid kfunc call: PTR_TO_BTF_ID with variable offset FAIL
  #129/p calls: invalid kfunc call: referenced arg needs refcounted PTR_TO_BTF_ID FAIL
  #130/p calls: valid kfunc call: referenced arg needs refcounted PTR_TO_BTF_ID FAIL
  #486/p map_kptr: ref: reference state created and released on xchg FAIL

This is because the kfuncs in the loaded module are far away from
__bpf_call_base:

  ffff800002009440 t bpf_kfunc_call_test_fail1    [bpf_testmod]
  9000000002e128d8 T __bpf_call_base

The offset relative to __bpf_call_base does NOT fit in s32, which breaks
the assumption in BPF core. Enable bpf_jit_supports_far_kfunc_call() lifts
this limit.

Note that to reproduce the above result, tools/testing/selftests/bpf/config
should be applied, and run the test with JIT enabled, unpriv BPF enabled.

With this change, the test_verifier tests now all passed:

  # ./test_verifier
  ...
  Summary: 777 PASSED, 0 SKIPPED, 0 FAILED

Tested-by: Tiezhu Yang <yangtiezhu@loongson.cn>
Signed-off-by: Hengqi Chen <hengqi.chen@gmail.com>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>

Darksonn pushed a commit to Darksonn/linux that referenced this pull request

Sep 23, 2024
The start_kthread() and stop_thread() code was not always called with the
interface_lock held. This means that the kthread variable could be
unexpectedly changed causing the kthread_stop() to be called on it when it
should not have been, leading to:

 while true; do
   rtla timerlat top -u -q & PID=$!;
   sleep 5;
   kill -INT $PID;
   sleep 0.001;
   kill -TERM $PID;
   wait $PID;
  done

Causing the following OOPS:

 Oops: general protection fault, probably for non-canonical address 0xdffffc0000000002: 0000 [#1] PREEMPT SMP KASAN PTI
 KASAN: null-ptr-deref in range [0x0000000000000010-0x0000000000000017]
 CPU: 5 UID: 0 PID: 885 Comm: timerlatu/5 Not tainted 6.11.0-rc4-test-00002-gbc754cc76d1b-dirty Rust-for-Linux#125 a533010b71dab205ad2f507188ce8c82203b0254
 Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.3-debian-1.16.3-2 04/01/2014
 RIP: 0010:hrtimer_active+0x58/0x300
 Code: 48 c1 ee 03 41 54 48 01 d1 48 01 d6 55 53 48 83 ec 20 80 39 00 0f 85 30 02 00 00 49 8b 6f 30 4c 8d 75 10 4c 89 f0 48 c1 e8 03 <0f> b6 3c 10 4c 89 f0 83 e0 07 83 c0 03 40 38 f8 7c 09 40 84 ff 0f
 RSP: 0018:ffff88811d97f940 EFLAGS: 00010202
 RAX: 0000000000000002 RBX: ffff88823c6b5b28 RCX: ffffed10478d6b6b
 RDX: dffffc0000000000 RSI: ffffed10478d6b6c RDI: ffff88823c6b5b28
 RBP: 0000000000000000 R08: ffff88823c6b5b58 R09: ffff88823c6b5b60
 R10: ffff88811d97f957 R11: 0000000000000010 R12: 00000000000a801d
 R13: ffff88810d8b35d8 R14: 0000000000000010 R15: ffff88823c6b5b28
 FS:  0000000000000000(0000) GS:ffff88823c680000(0000) knlGS:0000000000000000
 CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
 CR2: 0000561858ad7258 CR3: 000000007729e001 CR4: 0000000000170ef0
 Call Trace:
  <TASK>
  ? die_addr+0x40/0xa0
  ? exc_general_protection+0x154/0x230
  ? asm_exc_general_protection+0x26/0x30
  ? hrtimer_active+0x58/0x300
  ? __pfx_mutex_lock+0x10/0x10
  ? __pfx_locks_remove_file+0x10/0x10
  hrtimer_cancel+0x15/0x40
  timerlat_fd_release+0x8e/0x1f0
  ? security_file_release+0x43/0x80
  __fput+0x372/0xb10
  task_work_run+0x11e/0x1f0
  ? _raw_spin_lock+0x85/0xe0
  ? __pfx_task_work_run+0x10/0x10
  ? poison_slab_object+0x109/0x170
  ? do_exit+0x7a0/0x24b0
  do_exit+0x7bd/0x24b0
  ? __pfx_migrate_enable+0x10/0x10
  ? __pfx_do_exit+0x10/0x10
  ? __pfx_read_tsc+0x10/0x10
  ? ktime_get+0x64/0x140
  ? _raw_spin_lock_irq+0x86/0xe0
  do_group_exit+0xb0/0x220
  get_signal+0x17ba/0x1b50
  ? vfs_read+0x179/0xa40
  ? timerlat_fd_read+0x30b/0x9d0
  ? __pfx_get_signal+0x10/0x10
  ? __pfx_timerlat_fd_read+0x10/0x10
  arch_do_signal_or_restart+0x8c/0x570
  ? __pfx_arch_do_signal_or_restart+0x10/0x10
  ? vfs_read+0x179/0xa40
  ? ksys_read+0xfe/0x1d0
  ? __pfx_ksys_read+0x10/0x10
  syscall_exit_to_user_mode+0xbc/0x130
  do_syscall_64+0x74/0x110
  ? __pfx___rseq_handle_notify_resume+0x10/0x10
  ? __pfx_ksys_read+0x10/0x10
  ? fpregs_restore_userregs+0xdb/0x1e0
  ? fpregs_restore_userregs+0xdb/0x1e0
  ? syscall_exit_to_user_mode+0x116/0x130
  ? do_syscall_64+0x74/0x110
  ? do_syscall_64+0x74/0x110
  ? do_syscall_64+0x74/0x110
  entry_SYSCALL_64_after_hwframe+0x71/0x79
 RIP: 0033:0x7ff0070eca9c
 Code: Unable to access opcode bytes at 0x7ff0070eca72.
 RSP: 002b:00007ff006dff8c0 EFLAGS: 00000246 ORIG_RAX: 0000000000000000
 RAX: 0000000000000000 RBX: 0000000000000005 RCX: 00007ff0070eca9c
 RDX: 0000000000000400 RSI: 00007ff006dff9a0 RDI: 0000000000000003
 RBP: 00007ff006dffde0 R08: 0000000000000000 R09: 00007ff000000ba0
 R10: 00007ff007004b08 R11: 0000000000000246 R12: 0000000000000003
 R13: 00007ff006dff9a0 R14: 0000000000000007 R15: 0000000000000008
  </TASK>
 Modules linked in: snd_hda_intel snd_intel_dspcfg snd_intel_sdw_acpi snd_hda_codec snd_hwdep snd_hda_core
 ---[ end trace 0000000000000000 ]---

This is because it would mistakenly call kthread_stop() on a user space
thread making it "exit" before it actually exits.

Since kthreads are created based on global behavior, use a cpumask to know
when kthreads are running and that they need to be shutdown before
proceeding to do new work.

Link: https://lore.kernel.org/all/20240820130001.124768-1-tglozar@redhat.com/

This was debugged by using the persistent ring buffer:

Link: https://lore.kernel.org/all/20240823013902.135036960@goodmis.org/

Note, locking was originally used to fix this, but that proved to cause too
many deadlocks to work around:

  https://lore.kernel.org/linux-trace-kernel/20240823102816.5e55753b@gandalf.local.home/

Cc: stable@vger.kernel.org
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Cc: "Luis Claudio R. Goncalves" <lgoncalv@redhat.com>
Link: https://lore.kernel.org/20240904103428.08efdf4c@gandalf.local.home
Fixes: e88ed22 ("tracing/timerlat: Add user-space interface")
Reported-by: Tomas Glozar <tglozar@redhat.com>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>

ojeda pushed a commit that referenced this pull request

Dec 16, 2024
Element replace (with a socket different from the one stored) may race
with socket's close() link popping & unlinking. __sock_map_delete()
unconditionally unrefs the (wrong) element:

// set map[0] = s0
map_update_elem(map, 0, s0)

// drop fd of s0
close(s0)
  sock_map_close()
    lock_sock(sk)               (s0!)
    sock_map_remove_links(sk)
      link = sk_psock_link_pop()
      sock_map_unlink(sk, link)
        sock_map_delete_from_link
                                        // replace map[0] with s1
                                        map_update_elem(map, 0, s1)
                                          sock_map_update_elem
                                (s1!)       lock_sock(sk)
                                            sock_map_update_common
                                              psock = sk_psock(sk)
                                              spin_lock(&stab->lock)
                                              osk = stab->sks[idx]
                                              sock_map_add_link(..., &stab->sks[idx])
                                              sock_map_unref(osk, &stab->sks[idx])
                                                psock = sk_psock(osk)
                                                sk_psock_put(sk, psock)
                                                  if (refcount_dec_and_test(&psock))
                                                    sk_psock_drop(sk, psock)
                                              spin_unlock(&stab->lock)
                                            unlock_sock(sk)
          __sock_map_delete
            spin_lock(&stab->lock)
            sk = *psk                        // s1 replaced s0; sk == s1
            if (!sk_test || sk_test == sk)   // sk_test (s0) != sk (s1); no branch
              sk = xchg(psk, NULL)
            if (sk)
              sock_map_unref(sk, psk)        // unref s1; sks[idx] will dangle
                psock = sk_psock(sk)
                sk_psock_put(sk, psock)
                  if (refcount_dec_and_test())
                    sk_psock_drop(sk, psock)
            spin_unlock(&stab->lock)
    release_sock(sk)

Then close(map) enqueues bpf_map_free_deferred, which finally calls
sock_map_free(). This results in some refcount_t warnings along with
a KASAN splat [1].

Fix __sock_map_delete(), do not allow sock_map_unref() on elements that
may have been replaced.

[1]:
BUG: KASAN: slab-use-after-free in sock_map_free+0x10e/0x330
Write of size 4 at addr ffff88811f5b9100 by task kworker/u64:12/1063

CPU: 14 UID: 0 PID: 1063 Comm: kworker/u64:12 Not tainted 6.12.0+ #125
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Arch Linux 1.16.3-1-1 04/01/2014
Workqueue: events_unbound bpf_map_free_deferred
Call Trace:
 <TASK>
 dump_stack_lvl+0x68/0x90
 print_report+0x174/0x4f6
 kasan_report+0xb9/0x190
 kasan_check_range+0x10f/0x1e0
 sock_map_free+0x10e/0x330
 bpf_map_free_deferred+0x173/0x320
 process_one_work+0x846/0x1420
 worker_thread+0x5b3/0xf80
 kthread+0x29e/0x360
 ret_from_fork+0x2d/0x70
 ret_from_fork_asm+0x1a/0x30
 </TASK>

Allocated by task 1202:
 kasan_save_stack+0x1e/0x40
 kasan_save_track+0x10/0x30
 __kasan_slab_alloc+0x85/0x90
 kmem_cache_alloc_noprof+0x131/0x450
 sk_prot_alloc+0x5b/0x220
 sk_alloc+0x2c/0x870
 unix_create1+0x88/0x8a0
 unix_create+0xc5/0x180
 __sock_create+0x241/0x650
 __sys_socketpair+0x1ce/0x420
 __x64_sys_socketpair+0x92/0x100
 do_syscall_64+0x93/0x180
 entry_SYSCALL_64_after_hwframe+0x76/0x7e

Freed by task 46:
 kasan_save_stack+0x1e/0x40
 kasan_save_track+0x10/0x30
 kasan_save_free_info+0x37/0x60
 __kasan_slab_free+0x4b/0x70
 kmem_cache_free+0x1a1/0x590
 __sk_destruct+0x388/0x5a0
 sk_psock_destroy+0x73e/0xa50
 process_one_work+0x846/0x1420
 worker_thread+0x5b3/0xf80
 kthread+0x29e/0x360
 ret_from_fork+0x2d/0x70
 ret_from_fork_asm+0x1a/0x30

The buggy address belongs to the object at ffff88811f5b9080
 which belongs to the cache UNIX-STREAM of size 1984
The buggy address is located 128 bytes inside of
 freed 1984-byte region [ffff88811f5b9080, ffff88811f5b9840)

The buggy address belongs to the physical page:
page: refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x11f5b8
head: order:3 mapcount:0 entire_mapcount:0 nr_pages_mapped:0 pincount:0
memcg:ffff888127d49401
flags: 0x17ffffc0000040(head|node=0|zone=2|lastcpupid=0x1fffff)
page_type: f5(slab)
raw: 0017ffffc0000040 ffff8881042e4500 dead000000000122 0000000000000000
raw: 0000000000000000 00000000800f000f 00000001f5000000 ffff888127d49401
head: 0017ffffc0000040 ffff8881042e4500 dead000000000122 0000000000000000
head: 0000000000000000 00000000800f000f 00000001f5000000 ffff888127d49401
head: 0017ffffc0000003 ffffea00047d6e01 ffffffffffffffff 0000000000000000
head: 0000000000000008 0000000000000000 00000000ffffffff 0000000000000000
page dumped because: kasan: bad access detected

Memory state around the buggy address:
 ffff88811f5b9000: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
 ffff88811f5b9080: fa fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
                   ^
 ffff88811f5b9180: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
 ffff88811f5b9200: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
Disabling lock debugging due to kernel taint

refcount_t: addition on 0; use-after-free.
WARNING: CPU: 14 PID: 1063 at lib/refcount.c:25 refcount_warn_saturate+0xce/0x150
CPU: 14 UID: 0 PID: 1063 Comm: kworker/u64:12 Tainted: G    B              6.12.0+ #125
Tainted: [B]=BAD_PAGE
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Arch Linux 1.16.3-1-1 04/01/2014
Workqueue: events_unbound bpf_map_free_deferred
RIP: 0010:refcount_warn_saturate+0xce/0x150
Code: 34 73 eb 03 01 e8 82 53 ad fe 0f 0b eb b1 80 3d 27 73 eb 03 00 75 a8 48 c7 c7 80 bd 95 84 c6 05 17 73 eb 03 01 e8 62 53 ad fe <0f> 0b eb 91 80 3d 06 73 eb 03 00 75 88 48 c7 c7 e0 bd 95 84 c6 05
RSP: 0018:ffff88815c49fc70 EFLAGS: 00010282
RAX: 0000000000000000 RBX: ffff88811f5b9100 RCX: 0000000000000000
RDX: 0000000000000000 RSI: 0000000000000004 RDI: 0000000000000001
RBP: 0000000000000002 R08: 0000000000000001 R09: ffffed10bcde6349
R10: ffff8885e6f31a4b R11: 0000000000000000 R12: ffff88813be0b000
R13: ffff88811f5b9100 R14: ffff88811f5b9080 R15: ffff88813be0b024
FS:  0000000000000000(0000) GS:ffff8885e6f00000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 000055dda99b0250 CR3: 000000015dbac000 CR4: 0000000000752ef0
PKRU: 55555554
Call Trace:
 <TASK>
 ? __warn.cold+0x5f/0x1ff
 ? refcount_warn_saturate+0xce/0x150
 ? report_bug+0x1ec/0x390
 ? handle_bug+0x58/0x90
 ? exc_invalid_op+0x13/0x40
 ? asm_exc_invalid_op+0x16/0x20
 ? refcount_warn_saturate+0xce/0x150
 sock_map_free+0x2e5/0x330
 bpf_map_free_deferred+0x173/0x320
 process_one_work+0x846/0x1420
 worker_thread+0x5b3/0xf80
 kthread+0x29e/0x360
 ret_from_fork+0x2d/0x70
 ret_from_fork_asm+0x1a/0x30
 </TASK>
irq event stamp: 10741
hardirqs last  enabled at (10741): [<ffffffff84400ec6>] asm_sysvec_apic_timer_interrupt+0x16/0x20
hardirqs last disabled at (10740): [<ffffffff811e532d>] handle_softirqs+0x60d/0x770
softirqs last  enabled at (10506): [<ffffffff811e55a9>] __irq_exit_rcu+0x109/0x210
softirqs last disabled at (10301): [<ffffffff811e55a9>] __irq_exit_rcu+0x109/0x210

refcount_t: underflow; use-after-free.
WARNING: CPU: 14 PID: 1063 at lib/refcount.c:28 refcount_warn_saturate+0xee/0x150
CPU: 14 UID: 0 PID: 1063 Comm: kworker/u64:12 Tainted: G    B   W          6.12.0+ #125
Tainted: [B]=BAD_PAGE, [W]=WARN
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Arch Linux 1.16.3-1-1 04/01/2014
Workqueue: events_unbound bpf_map_free_deferred
RIP: 0010:refcount_warn_saturate+0xee/0x150
Code: 17 73 eb 03 01 e8 62 53 ad fe 0f 0b eb 91 80 3d 06 73 eb 03 00 75 88 48 c7 c7 e0 bd 95 84 c6 05 f6 72 eb 03 01 e8 42 53 ad fe <0f> 0b e9 6e ff ff ff 80 3d e6 72 eb 03 00 0f 85 61 ff ff ff 48 c7
RSP: 0018:ffff88815c49fc70 EFLAGS: 00010282
RAX: 0000000000000000 RBX: ffff88811f5b9100 RCX: 0000000000000000
RDX: 0000000000000000 RSI: 0000000000000004 RDI: 0000000000000001
RBP: 0000000000000003 R08: 0000000000000001 R09: ffffed10bcde6349
R10: ffff8885e6f31a4b R11: 0000000000000000 R12: ffff88813be0b000
R13: ffff88811f5b9100 R14: ffff88811f5b9080 R15: ffff88813be0b024
FS:  0000000000000000(0000) GS:ffff8885e6f00000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 000055dda99b0250 CR3: 000000015dbac000 CR4: 0000000000752ef0
PKRU: 55555554
Call Trace:
 <TASK>
 ? __warn.cold+0x5f/0x1ff
 ? refcount_warn_saturate+0xee/0x150
 ? report_bug+0x1ec/0x390
 ? handle_bug+0x58/0x90
 ? exc_invalid_op+0x13/0x40
 ? asm_exc_invalid_op+0x16/0x20
 ? refcount_warn_saturate+0xee/0x150
 sock_map_free+0x2d3/0x330
 bpf_map_free_deferred+0x173/0x320
 process_one_work+0x846/0x1420
 worker_thread+0x5b3/0xf80
 kthread+0x29e/0x360
 ret_from_fork+0x2d/0x70
 ret_from_fork_asm+0x1a/0x30
 </TASK>
irq event stamp: 10741
hardirqs last  enabled at (10741): [<ffffffff84400ec6>] asm_sysvec_apic_timer_interrupt+0x16/0x20
hardirqs last disabled at (10740): [<ffffffff811e532d>] handle_softirqs+0x60d/0x770
softirqs last  enabled at (10506): [<ffffffff811e55a9>] __irq_exit_rcu+0x109/0x210
softirqs last disabled at (10301): [<ffffffff811e55a9>] __irq_exit_rcu+0x109/0x210

Fixes: 604326b ("bpf, sockmap: convert to generic sk_msg interface")
Signed-off-by: Michal Luczaj <mhal@rbox.co>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Reviewed-by: John Fastabend <john.fastabend@gmail.com>
Link: https://lore.kernel.org/bpf/20241202-sockmap-replace-v1-3-1e88579e7bd5@rbox.co