aboutsummaryrefslogtreecommitdiffstats
path: root/fs/smb
AgeCommit message (Collapse)AuthorFilesLines
5 daysMerge tag 'v7.2-rc-part2-smb3-server-fixes' of git://git.samba.org/ksmbdLinus Torvalds11-445/+1317
Pull smb server updates from Steve French: "This is mostly a correctness and compatibility update for ksmbd's SMB2/3 lease, oplock, durable handle, compound request, CREATE, rename, stream and share-mode handling. A large part of the series fixes cases found by smbtorture where ksmbd diverged from the SMB2/3 protocol requirements. The main changes are: - Rework SMB2 lease state handling so lease state is shared per ClientGuid/LeaseKey across opens, with better validation of lease create contexts, ACK handling, epochs, break-in-progress reporting, v2 lease notification routing, and chained lease breaks - Fix several oplock break corner cases, including ACK validation, timeout downgrade behavior, level-II break handling on unlink, share-conflict lease breaks, and read-control/stat-open behavior - Fix durable handle behavior around delete-on-close, stale reconnects, reconnect context parsing, oplock/lease break invalidation, and durable v2 AppInstanceId replacement - Fix compound request handling so related commands propagate failed statuses correctly, preserve response framing across chained errors, keep compound FIDs across READ/WRITE/FLUSH, and send interim STATUS_PENDING where clients expect cancellable compound I/O - Tighten CREATE and stream semantics, including create attribute validation, allocation size reporting, explicit create security descriptors, unnamed DATA stream handling, stream directory validation, and stream delete sharing against the base file - Fix rename and metadata behavior, including parent directory sharing checks, denying directory rename with open children, and preserving SMB ChangeTime across rename for open handles - Fix two important safety issues: a multichannel byte-range lock list owner race that could lead to use-after-free, and an NTLMv2 session key update before authentication proof validation - Fix a concurrent SMB2 NEGOTIATE preauth use-after-free, a UBSAN warning in compression capability parsing, a false hung-task warning in the durable handle scavenger, endian debug logging, Smatch indentation warnings, and kernel-doc warnings - Increase the default SMB3 transaction size from 1MB to 4MB to better match modern read/write negotiation and improve sequential I/O behavior" * tag 'v7.2-rc-part2-smb3-server-fixes' of git://git.samba.org/ksmbd: (50 commits) ksmbd: fix kernel-doc warnings in smb2_lease_break_noti() ksmbd: fix inconsistent indenting warnings ksmbd: validate NTLMv2 response before updating session key ksmbd: increase SMB3_DEFAULT_TRANS_SIZE from 1MB to 4MB ksmbd: fix UBSAN array-index-out-of-bounds in decode_compress_ctxt() ksmbd: sleep interruptibly in the durable handle scavenger ksmbd: start file id allocation at 1 ksmbd: treat read-control opens as stat opens only for leases ksmbd: validate :: stream type against directory create ksmbd: break conflicting-open leases only as far as needed ksmbd: break handle caching for share conflicts ksmbd: normalize ungrantable lease states ksmbd: return oplock protocol error for level II ack ksmbd: avoid level II oplock break notification on unlink ksmbd: downgrade oplock after break timeout ksmbd: apply create security descriptor first ksmbd: return requested create allocation size ksmbd: tighten create file attribute validation ksmbd: reject empty-attribute synchronize-only create ksmbd: honor stream delete sharing for base file ...
7 dayscifs: define variable sized buffer for querydir responsesShyam Prasad N4-3/+17
QueryDirectory responses today are stored in one of two fixed sized buffers: smallbuf (448 bytes) or bigbuf (16KB). These are borrowed from server struct and are not sufficient for large-sized query dir operations. With this change we will now define a new buffer type specifically for cifs_search_info to hold variable sized responses. These will be allocated by kmalloc and freed by kfree. Signed-off-by: Shyam Prasad N <sprasad@microsoft.com> Signed-off-by: Steve French <stfrench@microsoft.com>
7 dayssmb/client: do not account EOF extension as allocationHuiwen He1-3/+10
cifs_setsize() updates the local inode size after SetEOF succeeds. It also used the new EOF as a local i_blocks estimate, but extending EOF does not prove that the intervening range was allocated. For example, after writing 1 MiB and then extending EOF to 10 MiB, the client can report the file as fully allocated even though the server still reports a much smaller AllocationSize: $ dd if=/dev/zero of=test bs=1M count=1 $ truncate -s 10M test && stat -c 'size=%s blocks=%b' test $ stat --cached=never -c 'size=%s blocks=%b' test client stat: size=10485760 blocks=20480 server stat: size=10485760 blocks=2056 client stat after revalidation: size=10485760 blocks=2056 A later attribute revalidation may correct i_blocks, but callers such as xfstests generic/495 invoke swapon immediately after truncate. The swapfile hole check can therefore observe the inflated local i_blocks value and accept a sparse file. Do not grow i_blocks from cifs_setsize() on EOF extension. Only clamp it on shrink; allocation growth must come from write completion or from server-reported AllocationSize. With this change, EOF extension no longer makes a sparse file appear fully allocated before the next attribute revalidation, and xfstests generic/495 no longer accepts it through the inflated local i_blocks value. Signed-off-by: Huiwen He <hehuiwen@kylinos.cn> Reviewed-by: ChenXiaoSong <chenxiaosong@kylinos.cn> Signed-off-by: Steve French <stfrench@microsoft.com>
7 dayssmb/client: preserve errors from smb2_set_sparse()Huiwen He1-15/+15
smb2_set_sparse() converts every FSCTL_SET_SPARSE failure to false and marks sparse support as broken for the share. Callers therefore report EOPNOTSUPP even for errors such as ENOSPC or EACCES, and later sparse operations remain disabled until the share is unmounted. Return the SMB2 ioctl error directly. Set broken_sparse_sup only for EOPNOTSUPP, which indicates that the server does not support the request. Update smb3_punch_hole() to propagate the error returned by smb2_set_sparse(). Fixes: 3d1a3745d8ca ("Add sparse file support to SMB2/SMB3 mounts") Signed-off-by: Huiwen He <hehuiwen@kylinos.cn> Reviewed-by: ChenXiaoSong <chenxiaosong@kylinos.cn> Signed-off-by: Steve French <stfrench@microsoft.com>
7 dayssmb: client: Fix next buffer leak in receive_encrypted_standard()Haoxiang Li1-4/+6
receive_encrypted_standard() allocates next_buffer before checking whether the number of compound PDUs already reached MAX_COMPOUND. If the limit check fails, the function returns immediately and the newly allocated next_buffer is not assigned to server->smallbuf/server->bigbuf, making it leaked. Move the MAX_COMPOUND check before allocating next_buffer. Fixes: b24df3e30cbf ("cifs: update receive_encrypted_standard to handle compounded responses") Cc: stable@vger.kernel.org Signed-off-by: Haoxiang Li <haoxiang_li2024@163.com> Signed-off-by: Steve French <stfrench@microsoft.com>
7 dayssmb/client: use %pe to print error pointerFredric Cover1-2/+2
Currently, __reconnect_target_locked() prints the error pointer 'hostname' using PTR_ERR() and %ld, printing only the error code. Fix this by using by using %pe to print the error symbolic code instead, and remove PTR_ERR(). Signed-off-by: Fredric Cover <fredric.cover.lkernel@gmail.com> Signed-off-by: Steve French <stfrench@microsoft.com>
7 dayssmb/client: name the default fallocate modeHuiwen He1-1/+1
FALLOC_FL_ALLOCATE_RANGE identifies the default fallocate allocation mode and is defined as zero. Use the symbolic name instead of a literal zero in smb3_fallocate() to make the mode dispatch clearer. This does not change behavior. Signed-off-by: Huiwen He <hehuiwen@kylinos.cn> Reviewed-by: ChenXiaoSong <chenxiaosong@kylinos.cn> Signed-off-by: Steve French <stfrench@microsoft.com>
7 dayssmb common: add missing AAPL definesSteve French1-0/+24
Add various defines for AAPL open context, e.g. for queries of server capabilities. Signed-off-by: Steve French <stfrench@microsoft.com>
7 dayssmb/client: fix chown/chgrp with SMB3 POSIX ExtensionsRalph Boehme1-1/+2
Ownership (chown) and group (chgrp) modifications were being ignored when mounting with SMB3 POSIX Extensions unless CIFS_MOUNT_CIFS_ACL or CIFS_MOUNT_MODE_FROM_SID were also explicitly set. Fix this by checking for posix_extensions in cifs_setattr_nounix() when updating UID and GID, ensuring that id_mode_to_cifs_acl() is called to map and set the ownership/group information on the server. Cc: stable@vger.kernel.org Signed-off-by: Ralph Boehme <slow@samba.org> Signed-off-by: Steve French <stfrench@microsoft.com>
7 dayssmb/client: fix security flag calculation when setting security descriptorsRalph Boehme1-6/+16
In id_mode_to_cifs_acl(), aclflag was initialized to CIFS_ACL_DACL by default. This forced the client to request setting the DACL even when only an ownership (chown) or group (chgrp) change was being performed. Let build_sec_desc() do the proper flag calculation by initializing aclflag to 0. build_sec_desc() sets the appropriate bits (CIFS_ACL_OWNER, CIFS_ACL_GROUP, or CIFS_ACL_DACL) depending on what actually changed. During ownership transfer, CIFS_ACL_DACL is only set if replace_sids_and_copy_aces() actually replaces the SIDs inside any of the DACL's ACEs. If build_sec_desc() results in aclflag being 0 (meaning no changes were mapped), exit early to avoid sending an empty security descriptor update to the server. Signed-off-by: Ralph Boehme <slow@samba.org> Signed-off-by: Steve French <stfrench@microsoft.com>
8 daysksmbd: fix kernel-doc warnings in smb2_lease_break_noti()Namjae Jeon1-0/+2
kernel test robot report missing kernel-doc descriptions for the 'wait_ack' and 'inc_epoch' parameters of smb2_lease_break_noti(): Warning: fs/smb/server/oplock.c:937 function parameter 'wait_ack' not described in 'smb2_lease_break_noti' Warning: fs/smb/server/oplock.c:937 function parameter 'inc_epoch' not described in 'smb2_lease_break_noti' Document both parameters to silence the warnings. Reported-by: kernel test robot <lkp@intel.com> Signed-off-by: Namjae Jeon <linkinjeon@kernel.org> Signed-off-by: Steve French <stfrench@microsoft.com>
8 daysksmbd: fix inconsistent indenting warningsNamjae Jeon1-6/+6
Detected by Smatch. fs/smb/server/oplock.c:1446 smb_grant_oplock() warn: inconsistent indenting Reported-by: Dan Carpenter <error27@gmail.com> Signed-off-by: Namjae Jeon <linkinjeon@kernel.org> Signed-off-by: Steve French <stfrench@microsoft.com>
8 daysksmbd: validate NTLMv2 response before updating session keyHaofeng Li1-4/+16
ksmbd_auth_ntlmv2() derives the NTLMv2 session key into sess->sess_key before it verifies the NTLMv2 response. ksmbd_decode_ntlmssp_auth_blob() then continues into KEY_XCH even when ksmbd_auth_ntlmv2() failed. With SMB3 multichannel binding, the failed authentication operates on an existing session and the session setup error path does not expire binding sessions. A client can send a binding session setup with a bad NT proof and KEY_XCH and still modify sess->sess_key before STATUS_LOGON_FAILURE is returned. Relevant path: smb2_sess_setup() -> conn->binding = true -> ntlm_authenticate() -> session_user() -> ksmbd_decode_ntlmssp_auth_blob() -> ksmbd_auth_ntlmv2() -> calc_ntlmv2_hash() -> hmac_md5_usingrawkey(..., sess->sess_key) -> crypto_memneq() returns mismatch -> KEY_XCH arc4_crypt(..., sess->sess_key, ...) -> out_err without expiring the binding session Derive the base session key into a local buffer and copy it to sess->sess_key only after the proof matches. Return immediately on authentication failure so KEY_XCH is only processed after successful authentication. Fixes: e2f34481b24d ("cifsd: add server-side procedures for SMB3") Fixes: f9929ef6a2a5 ("ksmbd: add support for key exchange") Cc: stable@vger.kernel.org Signed-off-by: Haofeng Li <lihaofeng@kylinos.cn> Reviewed-by: ChenXiaoSong <chenxiaosong@kylinos.cn> Acked-by: Namjae Jeon <linkinjeon@kernel.org> Signed-off-by: Steve French <stfrench@microsoft.com>
9 daysksmbd: increase SMB3_DEFAULT_TRANS_SIZE from 1MB to 4MBNamjae Jeon1-1/+1
This patch raises `SMB3_DEFAULT_TRANS_SIZE` to 4MB to align it with `smb2 max read/write`. This allows better I/O negotiation with modern clients and improves sequential read/write performance on high-speed networks. Signed-off-by: Namjae Jeon <linkinjeon@kernel.org> Signed-off-by: Steve French <stfrench@microsoft.com>
9 daysksmbd: fix UBSAN array-index-out-of-bounds in decode_compress_ctxt()Namjae Jeon1-1/+9
decode_compress_ctxt() walks CompressionAlgorithms[] using the client supplied CompressionAlgorithmCount. That field is declared in struct smb2_compression_capabilities_context as a fixed 4-element array, but the number of algorithms is actually variable and clients such as Windows advertise more than four (e.g. LZ77, LZ77+Huffman, LZNT1, Pattern_V1 and LZ4). The on-wire context length is already validated, so the access is within the received buffer, but indexing the statically sized [4] array makes UBSAN report an out-of-bounds access: UBSAN: array-index-out-of-bounds in smb2pdu.c:1122:48 index 4 is out of range for type '__le16 [4]' Call Trace: smb2_handle_negotiate+0xda7/0xde0 [ksmbd] ksmbd_smb_negotiate_common+0x27b/0x3e0 [ksmbd] smb2_negotiate_request+0x14/0x20 [ksmbd] handle_ksmbd_work+0x181/0x500 [ksmbd] Walk the algorithms through a pointer so the fixed-array bounds check is not applied, while keeping the existing length validation that bounds the loop to the data actually received. Signed-off-by: Namjae Jeon <linkinjeon@kernel.org> Signed-off-by: Steve French <stfrench@microsoft.com>
9 daysksmbd: sleep interruptibly in the durable handle scavengerNamjae Jeon1-2/+2
The durable handle scavenger kthread waits up to DURABLE_HANDLE_MAX_TIMEOUT (300 seconds) between scans using wait_event_timeout(), which sleeps in TASK_UNINTERRUPTIBLE. When there are no durable handles pending expiry the task stays in D state far longer than 120 seconds, so the hung task detector prints a bogus "task ksmbd-durable-s blocked for more than 120 seconds" warning with a backtrace, even though the thread is only idle. Use wait_event_interruptible_timeout() so the thread sleeps in TASK_INTERRUPTIBLE, which the hung task detector ignores. This also suits the already-freezable kthread. Treat a negative return (e.g. -ERESTARTSYS) like a timeout when recomputing the next wake interval. Signed-off-by: Namjae Jeon <linkinjeon@kernel.org> Signed-off-by: Steve French <stfrench@microsoft.com>
9 daysksmbd: start file id allocation at 1Namjae Jeon2-2/+8
ksmbd allocates both the volatile id (per-session file table) and the persistent id (global file table) with idr_alloc_cyclic() starting at 0. The first open after the module loads therefore gets volatile id 0 and persistent id 0, and ksmbd returns an SMB2 FileId of {0, 0} in the create response. Clients treat an all-zero FileId as a null handle. smbtorture's smb2_util_handle_empty() considers {0, 0} empty, so tests that guard the close with it (e.g. smb2.oplock.statopen1, smb2.lease.statopen*) never close that first handle. The leaked open keeps the inode's oplock count non-zero, so a later batch oplock request on the same file is downgraded to level II and the test fails. Start the id allocation at 1 (KSMBD_START_FID) so no handle is ever assigned a {0, 0} FileId, matching the behaviour of other SMB servers. Signed-off-by: Namjae Jeon <linkinjeon@kernel.org> Signed-off-by: Steve French <stfrench@microsoft.com>
9 daysksmbd: treat read-control opens as stat opens only for leasesNamjae Jeon1-3/+27
A second open that requests only metadata-level access must not break the existing caching state. ksmbd already skips the break for such opens via fp->attrib_only (FILE_READ_ATTRIBUTES, FILE_WRITE_ATTRIBUTES and FILE_SYNCHRONIZE). An open requesting only READ_CONTROL (reading the security descriptor) must be treated differently depending on the existing caching state. smbtorture smb2.lease.statopen4 expects a read-control open NOT to break a caching lease, while smb2.oplock.statopen1 expects the same open to break a batch oplock. So READ_CONTROL is a stat open for leases but not for oplocks. Extend the stat-open break-skip in smb_grant_oplock() to also cover a read-control-only open, but only when the existing holder is a lease. The global fp->attrib_only flag (used for share-mode, rename and truncate decisions) is left unchanged so oplock behaviour is preserved. Signed-off-by: Namjae Jeon <linkinjeon@kernel.org> Signed-off-by: Steve French <stfrench@microsoft.com>
9 daysksmbd: validate :: stream type against directory createNamjae Jeon1-1/+8
smb2.streams.dir opens <dir>::$DATA with FILE_DIRECTORY_FILE and expects STATUS_NOT_A_DIRECTORY, then opens <dir>::$DATA without it and expects STATUS_FILE_IS_A_DIRECTORY. Commit "treat unnamed DATA stream as base file" canonicalizes the ::$DATA suffix to a NULL stream name so the open continues through the base-file path. That skipped the stream/directory type validation, which was guarded by "if (stream_name)", so opening a directory's ::$DATA stream with FILE_DIRECTORY_FILE incorrectly returned STATUS_OK and a plain open of it no longer reported STATUS_FILE_IS_A_DIRECTORY. parse_stream_name() still records the explicit $DATA type in s_type even when it clears stream_name. Run the data-stream vs directory validation whenever s_type is DATA_STREAM, not only when stream_name is set, so the canonicalized ::$DATA open is rejected with the correct status. Signed-off-by: Namjae Jeon <linkinjeon@kernel.org> Signed-off-by: Steve French <stfrench@microsoft.com>
9 daysksmbd: break conflicting-open leases only as far as neededNamjae Jeon1-9/+14
smb2.lease.oplock and smb2.lease.breaking1 hold a lease and then issue a single conflicting open on the same file. The held lease must break one step to drop write caching (RWH->RH, RW->R) and then stop, so lease_break_info.count is 1 and the lease keeps its read/handle caching. ksmbd instead cascaded the break all the way down to none (e.g. RWH->RH->R->none), so the break count was 2 or 3 and the reported lease state ended at 0. Commit "chain pending lease breaks before waking waiters" forces break_level to SMB2_OPLOCK_LEVEL_NONE for any non-lease open against a handle-caching lease, which drives oplock_break()'s retry loop down to none even when only one open is contending. Drop that break_level override so a conflicting open breaks a lease only to its own compatible level (level II, i.e. RH/R). A deeper break is still required when a truncating open is also waiting behind the same lease break. smb2.lease.breaking3 keeps a normal open pending through RWH->RH and an overwrite open pending behind it, and expects the lease to continue RH->R->none before either open completes. The overwrite waiter sets open_trunc on the lease while it blocks on the pending break, so extend the retry loop to chain another break while that truncating waiter still needs the lease at none. The per-break open_trunc snapshot stays cleared, so the cascade steps down (RH->R->none) instead of collapsing straight to none, and the normal open stays pending until the lease is fully broken. Signed-off-by: Namjae Jeon <linkinjeon@kernel.org> Signed-off-by: Steve French <stfrench@microsoft.com>
9 daysksmbd: break handle caching for share conflictsNamjae Jeon1-7/+19
smb2.lease.break_twice first opens a file with an RHW lease and then tries a second open with restrictive sharing. That open must fail with a sharing violation, but the existing lease should be broken from RHW to RW because only handle caching conflicts with the requested sharing. ksmbd used the normal write-cache break calculation for this path, so RHW was broken to RH. The following successful open then did not generate the expected second break from RW to R. Pass share-conflict context into the lease break helper and, for lease breaks caused by sharing, drop only SMB2_LEASE_HANDLE_CACHING from the current lease state. Other break paths keep the existing write/truncate break behavior. A share-conflict break must also remain a single break. The triggering open fails with a sharing violation and is never granted, so there is no target oplock level to converge on. The lease break retry loop, however, keeps breaking while the lease level is still above req_op_level, which broke RHW all the way down to R in one open (lease_break_info.count became 2 instead of 1). Skip the again loop for share-conflict breaks so the sharing open produces exactly one RHW->RW break and the later successful open produces the separate RW->R break the test expects. Signed-off-by: Namjae Jeon <linkinjeon@kernel.org> Signed-off-by: Steve French <stfrench@microsoft.com>
9 daysksmbd: normalize ungrantable lease statesNamjae Jeon1-0/+13
smb2.lease.request verifies which SMB2 lease state combinations are granted by the server. Requests for H-only, W-only, and HW leases are valid lease state bitmasks, but they are not grantable combinations and should be returned as lease state none. ksmbd only checked that the requested bits were inside the SMB2 lease state mask. As a result it could grant H-only, W-only, or HW requests and return non-zero lease states where the client expects no lease. Keep the bitmask validation, but normalize ungrantable combinations to zero before allocating or looking up the lease. The grantable combinations remain unchanged: R, RH, RW, and RHW. Signed-off-by: Namjae Jeon <linkinjeon@kernel.org> Signed-off-by: Steve French <stfrench@microsoft.com>
9 daysksmbd: return oplock protocol error for level II ackNamjae Jeon1-1/+4
SMB2 level II to none oplock breaks do not require an acknowledgment from the client. smb2.oplock.levelii500 intentionally acknowledges such a break and expects the server to reject it with STATUS_INVALID_OPLOCK_PROTOCOL. ksmbd drops the local level II oplock to none immediately after sending the break notification because it does not wait for an ACK. When the client then sends the invalid ACK, smb20_oplock_break_ack() sees that the oplock is not in OPLOCK_ACK_WAIT state and returns STATUS_INVALID_DEVICE_STATE before checking the current oplock level. If the oplock is already none when an unexpected SMB2 oplock break ACK arrives, report STATUS_INVALID_OPLOCK_PROTOCOL. Keep the existing STATUS_INVALID_DEVICE_STATE response for other unexpected non-wait states. Signed-off-by: Namjae Jeon <linkinjeon@kernel.org> Signed-off-by: Steve French <stfrench@microsoft.com>
9 daysksmbd: avoid level II oplock break notification on unlinkNamjae Jeon3-9/+30
smb2_util_unlink() opens the target with FILE_DELETE_ON_CLOSE and then closes that handle. Other clients can also mark a file for delete with SMB2 SET_INFO FileDispositionInformation. When these unlink paths break existing SMB2 level II oplocks, ksmbd sends an unsolicited SMB2_OPLOCK_BREAK notification to none. This races with the synchronous CREATE or SET_INFO response expected by the client, and smbtorture reports NT_STATUS_INVALID_NETWORK_RESPONSE while running smb2.oplock.exclusive2. SMB2 level II oplock breaks do not require an acknowledgment in the delete path. Keep lease handling unchanged, but drop plain SMB2 level II oplocks locally for unlink requests without sending a break notification. Normal write/truncate paths still send the level II to none notification, preserving the behavior covered by smb2.oplock.levelII500. Signed-off-by: Namjae Jeon <linkinjeon@kernel.org> Signed-off-by: Steve French <stfrench@microsoft.com>
9 daysksmbd: downgrade oplock after break timeoutNamjae Jeon1-2/+13
smb2.oplock.batch22a opens a file with a batch oplock and then issues a second open that waits for the oplock break timeout. After the timeout the second open should succeed, but the granted oplock level must be level II. When the break times out, oplock_break() returns -ENOENT after invalidating the previous opener. smb_grant_oplock() went straight to set_lev with the original requested oplock level, so the second open could be granted a new batch oplock. Downgrade the requested oplock to level II on the -ENOENT break-timeout path before granting the oplock to the new open. A break that completes because the previous owner closed its handle from the oplock break handler must be distinguished from a real timeout. smb2.oplock.batch7 closes the first handle during the break wait, and the second open is then expected to be granted the originally requested batch oplock. Return -EAGAIN from the non-lease break path when the previous opener closed during the break wait, recheck sharing in smb_grant_oplock(), and grant the requested oplock if the close removed the conflict. Real break timeouts still return -ENOENT and keep the downgrade to level II. Signed-off-by: Namjae Jeon <linkinjeon@kernel.org> Signed-off-by: Steve French <stfrench@microsoft.com>
9 daysksmbd: apply create security descriptor firstNamjae Jeon1-7/+9
smb2.create.aclfile creates files with an SMB2_CREATE_SD_BUFFER create context and expects the resulting security descriptor to match the descriptor supplied by the client. ksmbd currently tries to inherit the parent DACL first and only parses the SMB2_CREATE_SD_BUFFER context when DACL inheritance fails. If inheritance succeeds, the explicit security descriptor supplied on create is ignored. This breaks create requests that include owner/group information in the security descriptor. Apply the create security descriptor first when the context is present. Fall back to the existing inherited/default ACL path only when no create security descriptor was supplied. Signed-off-by: Namjae Jeon <linkinjeon@kernel.org> Signed-off-by: Steve French <stfrench@microsoft.com>
9 daysksmbd: return requested create allocation sizeNamjae Jeon2-11/+24
smb2.create.blob sends an SMB2_CREATE_ALLOCATION_SIZE create context with a 1MiB allocation size and expects the create response AllocationSize field to match the requested size. smb2.create.open additionally compares the AllocationSize returned in the CREATE response with the AllocationSize returned by FILE_ALL_INFORMATION on the same handle. ksmbd applies the allocation with fallocate(), but then fills both the create response and handle-based information from stat.blocks << 9. On filesystems such as ext4 this can include filesystem allocation rounding and metadata effects, causing a response larger than the SMB2 allocation size context and a disagreement between the two queries. Remember the requested allocation size while processing the create context, store the reported allocation size in struct ksmbd_file, and use it for both the create response and handle-based allocation size responses. Update the stored value when FILE_ALLOCATION_INFORMATION changes it, and fall back to stat.blocks << 9 when no allocation size context was provided. Signed-off-by: Namjae Jeon <linkinjeon@kernel.org> Signed-off-by: Steve French <stfrench@microsoft.com>
9 daysksmbd: tighten create file attribute validationNamjae Jeon1-1/+6
smb2.create.gentest checks each create FileAttributes bit independently and expects FILE_ATTRIBUTE_INTEGRITY_STREAM and FILE_ATTRIBUTE_NO_SCRUB_DATA to be rejected with STATUS_INVALID_PARAMETER. ksmbd validates create FileAttributes against FILE_ATTRIBUTE_MASK, which includes those bits. It also rejects only requests that have no known attribute bit at all, so a request containing both known and unknown bits can pass validation. Use a create-specific attribute mask that excludes INTEGRITY_STREAM and NO_SCRUB_DATA, and reject any bit outside that mask. Signed-off-by: Namjae Jeon <linkinjeon@kernel.org> Signed-off-by: Steve French <stfrench@microsoft.com>
9 daysksmbd: reject empty-attribute synchronize-only createNamjae Jeon1-0/+7
smb2.create.gentest checks each desired access bit independently and expects an open that requests only SYNCHRONIZE with CreateDisposition OPEN_IF and FileAttributes 0 to fail with STATUS_ACCESS_DENIED. Rejecting all SYNCHRONIZE-only opens is too broad: SYNCHRONIZE does not imply read, write, or delete data access, and smb2.sharemode.sharemode-access expects a SYNCHRONIZE-only open to succeed when it does not conflict with the existing share mode. Limit the rejection to the gentest create shape: SYNCHRONIZE-only access, OPEN_IF disposition, and no file attributes. Other synchronize-only opens are handled by the normal permission and share-mode checks. Signed-off-by: Namjae Jeon <linkinjeon@kernel.org> Signed-off-by: Steve French <stfrench@microsoft.com>
9 daysksmbd: honor stream delete sharing for base fileNamjae Jeon3-0/+37
smb2.streams.delete opens an alternate data stream without FILE_SHARE_DELETE and then tries to delete the base file. Windows rejects the base-file delete with STATUS_SHARING_VIOLATION while the stream handle is open. ksmbd tracks stream opens on the same ksmbd_inode as the base file, but the delete-on-close path only checked delete access on the base handle before marking the inode delete-pending. As a result, deleting the base file succeeded even though an open stream handle denied delete sharing. Add a helper to detect open stream handles on the same inode that do not allow FILE_SHARE_DELETE, and reject base-file delete pending and DELETE opens with a sharing violation in that case. Signed-off-by: Namjae Jeon <linkinjeon@kernel.org> Signed-off-by: Steve French <stfrench@microsoft.com>
9 daysksmbd: send pending interim for last compound I/ONamjae Jeon1-1/+29
smb2.compound_async.write_write and smb2.compound_async.read_read expect the last I/O request in a compound request to become cancellable before its final response is received. smb clients mark a request cancellable after receiving an interim STATUS_PENDING response. ksmbd handled the last READ/WRITE synchronously and returned the final response directly, so the client never observed STATUS_PENDING and req->cancel.can_cancel remained false. For the last READ or WRITE in a compound request, register the work briefly as async and send a STATUS_PENDING interim response before continuing with the normal synchronous completion. The final READ/WRITE response remains unchanged. Signed-off-by: Namjae Jeon <linkinjeon@kernel.org> Signed-off-by: Steve French <stfrench@microsoft.com>
9 daysksmbd: return success for deferred final closeNamjae Jeon1-1/+3
ksmbd_close_fd() marks an open file as FP_CLOSED and drops the file table reference. If another in-flight request still holds a reference, the final close is deferred until that request drops its reference. The function currently returns -EINVAL in that deferred-final-close case because fp is cleared when the reference count does not reach zero. That turns a valid close into STATUS_FILE_CLOSED. smb2.compound_find.compound_find_close sends QUERY_DIRECTORY and then closes the same directory handle before receiving the find response. The query holds a reference while it builds the response, so close must mark the handle closed and return success even though final teardown is delayed. Track whether the handle was successfully transitioned to FP_CLOSED and return success when only the final close is deferred. Signed-off-by: Namjae Jeon <linkinjeon@kernel.org> Signed-off-by: Steve French <stfrench@microsoft.com>
9 daysksmbd: preserve compound responses for chained errorsNamjae Jeon1-0/+7
set_smb2_rsp_status() resets the response iov and compound offsets before building an error response. That is fine for a single request, but it corrupts a compound response when an error is detected after an earlier compound element has already been completed. smb2.compound.invalid4 sends a READ as the first compound element and a bogus command as the second one. The READ response must remain in the compound response with STATUS_END_OF_FILE, followed by the bogus command response with STATUS_INVALID_PARAMETER. Resetting the response state for the second command breaks the compound framing and the client reports NT_STATUS_INVALID_NETWORK_RESPONSE. When setting an error for a chained command, update and pin only the current compound response slot instead of resetting the whole response. Signed-off-by: Namjae Jeon <linkinjeon@kernel.org> Signed-off-by: Steve French <stfrench@microsoft.com>
9 daysksmbd: validate handle for create or get object idNamjae Jeon1-0/+9
FSCTL_CREATE_OR_GET_OBJECT_ID returned a dummy successful response without checking whether the request handle was valid. That let an invalid related compound handle succeed in smb2.compound.related5, although the client expected STATUS_FILE_CLOSED. Look up the file handle before building the object id response and fail with STATUS_FILE_CLOSED when the handle is invalid or already closed. Signed-off-by: Namjae Jeon <linkinjeon@kernel.org> Signed-off-by: Steve French <stfrench@microsoft.com>
9 daysksmbd: propagate failed command status in related compoundsNamjae Jeon2-4/+156
In a related compound request, later commands can refer to the file handle from an earlier command using the related FID value. If the earlier command fails without producing a valid compound FID, the later related commands must fail with the same status instead of operating on an invalid or stale handle. smb2.compound.related4 sends CREATE followed by IOCTL, CLOSE and SET_INFO. The CREATE is expected to fail with STATUS_ACCESS_DENIED, and the remaining related commands are expected to return STATUS_ACCESS_DENIED as well. ksmbd only stored the compound FID on successful CREATE and did not remember failed compound statuses. Store the failed status in the work item and make related handle-based requests fail immediately with that status only when the compound FID is invalid. Also preserve and consume the related FID across successful FLUSH, READ and WRITE requests whose responses do not carry a file id. Keep a valid compound FID across non-close failures so later related commands can continue to use the handle. When extracting the FID from a successful READ, WRITE or FLUSH request, use the request structure matching the SMB2 command: READ and WRITE place PersistentFileId and VolatileFileId at a different offset than FLUSH, so a single smb2_flush_req cast can save the wrong value as compound_fid and make the following related request fail with STATUS_FILE_CLOSED (smb2.compound_async.write_write after smb2.compound_async.flush_flush). Only update the saved compound FID when the request carries a valid volatile FID. otherwise an all-ones related FID would overwrite the CREATE FID and break smb2.compound.related6. Signed-off-by: Namjae Jeon <linkinjeon@kernel.org> Signed-off-by: Steve French <stfrench@microsoft.com>
9 daysksmbd: deny renaming directory with open childrenNamjae Jeon3-0/+32
Windows denies renaming a directory while a file below that directory is still open. smb2.rename.rename_dir_openfile checks this by keeping a file handle open under the directory and then attempting to rename the directory handle. ksmbd did not check open children before calling vfs_rename(), so the rename incorrectly succeeded. For non-POSIX clients, scan the global open file table for active handles whose dentries are below the directory being renamed. If any child is open, fail the rename with -EACCES so the client receives STATUS_ACCESS_DENIED. Signed-off-by: Namjae Jeon <linkinjeon@kernel.org> Signed-off-by: Steve French <stfrench@microsoft.com>
9 daysksmbd: check parent directory sharing conflicts on renameNamjae Jeon1-2/+4
When renaming a file, some existing opens on the parent directory must block the rename with STATUS_SHARING_VIOLATION. This includes parent directory handles opened with DELETE access and handles opened without FILE_SHARE_DELETE. ksmbd checked only the parent's desired access for FILE_DELETE. That handled smb2.rename.share_delete_and_delete_access, but missed the case where the parent directory was opened without delete access and without delete sharing, so smb2.rename.no_share_delete_no_delete_access incorrectly succeeded. Attribute-only parent opens, however, must not block the rename. smb2.rename.msword opens the parent directory with only SYNCHRONIZE and FILE_READ_ATTRIBUTES, no share access, and then renames an already-open child file. Windows allows this pattern. Reject parent directory handles that request DELETE access, and reject non-attribute-only parent opens that deny FILE_SHARE_DELETE, while allowing attribute-only parent opens to coexist with child rename. Signed-off-by: Namjae Jeon <linkinjeon@kernel.org> Signed-off-by: Steve French <stfrench@microsoft.com>
9 daysksmbd: preserve open change time across renameNamjae Jeon2-13/+11
inode ctime is updated when a file is renamed. ksmbd returned that ctime directly as SMB2 ChangeTime for handle-based query information. This makes ChangeTime change after a rename through an already-open handle, while Windows keeps the handle's ChangeTime stable for this case. Store the SMB ChangeTime in struct ksmbd_file when the handle is opened and use that value for create, close, and handle-based query information responses. If a client explicitly sets FILE_BASIC_INFORMATION ChangeTime, update the stored value as well. This fixes smbtorture smb2.rename.simple_modtime, which expects ChangeTime and LastWriteTime to remain unchanged after renaming an already-open file. Signed-off-by: Namjae Jeon <linkinjeon@kernel.org> Signed-off-by: Steve French <stfrench@microsoft.com>
9 daysksmbd: handle durable v2 app instance idNamjae Jeon3-0/+110
The SMB2_CREATE_APP_INSTANCE_ID create context is used with durable v2 opens to identify another open from the same application instance. When a new durable v2 open arrives with the same AppInstanceId as an existing open, the server should close the previous open without sending an oplock break notification. ksmbd ignored this create context. A second durable v2 batch oplock open with the same AppInstanceId therefore went through the normal competing open path and sent an oplock break to the first opener. smbtorture smb2.durable-v2-open.app-instance expects no oplock break and then expects the old handle to be closed. Parse and store AppInstanceId for durable v2 opens. Before creating the new open, find an existing file with the same AppInstanceId and close it through the normal close teardown path without issuing an oplock break. Signed-off-by: Namjae Jeon <linkinjeon@kernel.org> Signed-off-by: Steve French <stfrench@microsoft.com>
9 daysksmbd: fix durable reconnect context parsingNamjae Jeon1-9/+6
SMB2 create context DataLength describes only the create context data payload. It does not include the create context header, name field, or any local padding that exists in ksmbd's helper structures. ksmbd validated durable reconnect contexts by comparing DataOffset + DataLength against sizeof the whole helper structure. This rejects a valid durable v2 reconnect context because the wire DH2C data is 36 bytes while struct create_durable_handle_reconnect_v2 contains an extra four byte pad. Validate the durable context payload length against the corresponding payload member instead. Also keep the reconnect context authoritative when a later durable request context is present, matching the existing durable v1 reconnect behavior. This fixes smbtorture smb2.durable-v2-open.durable-v2-setinfo, where the durable v2 reconnect after SET_INFO was rejected with STATUS_INVALID_PARAMETER. Signed-off-by: Namjae Jeon <linkinjeon@kernel.org> Signed-off-by: Steve French <stfrench@microsoft.com>
9 daysksmbd: invalidate durable handles on oplock breakNamjae Jeon4-8/+100
When a durable handle is preserved after a disconnect, its oplock state can still block later opens. If another client opens the same file and the preserved oplock or lease has to be broken, the old durable handle must no longer be reconnectable after the break cannot be acknowledged. ksmbd was treating a missing connection, or an oplock break timeout, as a successful break only by downgrading the oplock state. The old durable handle remained reconnectable, so a later durable reconnect for that stale handle could succeed. The open path can also see a detached durable handle before the break notification helpers fully dispose of it. Invalidate such a preserved durable handle directly when a competing open has to break its batch or exclusive oplock, while leaving ordinary durable reconnects without a competing open untouched. If the old handle still reaches the reconnect path, reject it when the same inode already has another active open. This matches the smb2.durable-open.open2-lease/open2-oplock sequence where a later open replaces the disconnected durable owner and the stale first handle must not be reclaimed. Also, reconnect lookup used only the persistent id. A new durable open can get a persistent id that matches the stale reconnect request after the old durable state is invalidated. Preserve the disconnected handle's old volatile id and require durable reconnect contexts to match it, so a stale reconnect cannot attach to a different durable open. Windows allows the later open to proceed and rejects the old reconnect with STATUS_OBJECT_NAME_NOT_FOUND. The smbtorture smb2.durable-open.oplock test covers this case. Signed-off-by: Namjae Jeon <linkinjeon@kernel.org> Signed-off-by: Steve French <stfrench@microsoft.com>
9 daysksmbd: supersede disconnected delete-on-close durable handleNamjae Jeon3-0/+75
A durable handle opened with FILE_DELETE_ON_CLOSE is preserved across a disconnect so it can be reclaimed by a durable reconnect. smb2.durable-open.delete_on_close2 disconnects such a handle and then reconnects it, expecting the reconnect to succeed. When the client does not reconnect but instead opens the same name with a new delete-on-close create, the preserved handle keeps the file present with delete-on-close set. ksmbd then rejects the new open with STATUS_ACCESS_DENIED on the file_present + FILE_DELETE_ON_CLOSE + OPEN_IF/OVERWRITE_IF path. smb2.durable-open.delete_on_close1 expects this open to create a fresh, empty file instead, i.e. the disconnected handle's delete-on-close must take effect first. Add ksmbd_close_disconnected_durable_delete_on_close(), which closes disconnected (conn == NULL) durable handles that keep a delete-on-close file present. The final close promotes S_DEL_ON_CLS to S_DEL_PENDING and unlinks the file, so a re-resolved path is absent and the new open creates it fresh. Call it from smb2_open() before the delete-on-close conflict check, only for the conflicting open shapes. A live (connected) handle still keeps the file and blocks the open as before. Signed-off-by: Namjae Jeon <linkinjeon@kernel.org> Signed-off-by: Steve French <stfrench@microsoft.com>
9 daysksmbd: handle missing create contexts for lease opensNamjae Jeon1-0/+3
smb2_find_context_vals() assumes that callers only search create contexts when the SMB2 CREATE request contains a non-empty create context area. That is not always true. a client can send RequestedOplockLevel set to SMB2_OPLOCK_LEVEL_LEASE without a lease create context. In that case parse_lease_state() searches for a lease context and smb2_find_context_vals() starts parsing from offset 0 with length 0, returning -EINVAL. This makes the open fail with STATUS_INVALID_PARAMETER. The smbtorture smb2.lease.duplicate_open test hits this while creating a second file without a lease request. Return NULL when the request has no create context area so the missing context is treated the same as any other absent create context. The open then continues without granting a lease. Signed-off-by: Namjae Jeon <linkinjeon@kernel.org> Signed-off-by: Steve French <stfrench@microsoft.com>
9 daysksmbd: fix use-after-free of conn->preauth_info in concurrent SMB2 NEGOTIATEGil Portnoy1-4/+7
conn->preauth_info is shared connection state (struct preauth_integrity_info, kmalloc-96) that is allocated and freed by the SMB2 NEGOTIATE handler and read by the response send path. smb2_handle_negotiate() allocates conn->preauth_info, and on a deassemble_neg_contexts() failure kfrees it and sets it to NULL. Both the allocation and the free/NULL happen under ksmbd_conn_lock(conn) (the connection srv_mutex), which is held across the whole handler body. The response send path smb3_preauth_hash_rsp(), called from the send: block of __handle_ksmbd_work(), reads conn->preauth_info and dereferences conn->preauth_info->Preauth_HashValue (via ksmbd_gen_preauth_integrity_hash()) without taking conn_lock. When a client drives two SMB2 NEGOTIATE requests on the same connection, one worker can free conn->preauth_info on the failing-negotiate path while a concurrent send-path worker is reading it, producing a slab use-after-free read (KASAN-confirmed). The send-path read tested conn->preauth_info for NULL but raced with the free that occurs between the NULL check and the dereference, so the NULL guard alone does not close the window. Serialize the NEGOTIATE-branch read in smb3_preauth_hash_rsp() under ksmbd_conn_lock(conn) and re-check conn->preauth_info inside the lock. Because the negotiate handler holds conn_lock across its kfree + NULL assignment, a reader that also takes conn_lock either runs fully before the allocation or fully after the NULL store, and can never observe the freed-but-not-yet-NULLed pointer. ksmbd_gen_preauth_integrity_hash() takes no locks itself (it only computes a SHA-512 over the buffer), so no lock-ordering inversion is introduced, and conn_lock is a sleepable mutex which is safe on this send path (it already performs network I/O). Fixes: aa7253c2393f ("ksmbd: fix memory leak in smb2_handle_negotiate") Signed-off-by: Gil Portnoy <dddhkts1@gmail.com> Acked-by: Namjae Jeon <linkinjeon@kernel.org> Signed-off-by: Steve French <stfrench@microsoft.com>
9 daysksmbd: keep common response iovecs in the work itemNamjae Jeon2-21/+40
Most SMB responses need no more than four kvec entries, but every work item currently allocates a separate four-entry array and frees it after the response is sent. Embed the common array in struct ksmbd_work and allocate a larger array only when a response exceeds the inline capacity. This removes one allocation and one free from the common request path while preserving support for larger compound and read responses. Signed-off-by: Namjae Jeon <linkinjeon@kernel.org> Signed-off-by: Steve French <stfrench@microsoft.com>
9 daysksmbd: route v2 lease breaks on the client lease channelNamjae Jeon2-2/+16
v2 leases are scoped by ClientGuid. When the same client uses multiple connections, smbtorture expects lease break notifications to be sent on the connection associated with the client lease table, not necessarily on the connection that owns the individual open being broken. Keep a referenced connection in the lease table and use it for v2 lease break notifications while it is still active. Fall back to the open's connection if the table connection is being released. Signed-off-by: Namjae Jeon <linkinjeon@kernel.org> Signed-off-by: Steve French <stfrench@microsoft.com>
9 daysksmbd: break RH leases before delete-on-closeNamjae Jeon1-3/+7
The delete paths only marked the opened file delete pending or delete-on-close. When another client still held a read/handle lease, no lease break was sent before the delete state changed. smb2.lease.unlink uses a create request with FILE_DELETE_ON_CLOSE and expects the second client's unlink to break the first client's RH lease to R with ACK_REQUIRED set. SetInfo(FileDispositionInformation) has the same lease-breaking requirement. Break level-II/read-handle leases before setting delete pending or delete-on-close so clients are notified before the file is removed. Signed-off-by: Namjae Jeon <linkinjeon@kernel.org> Signed-off-by: Steve French <stfrench@microsoft.com>
9 daysksmbd: honor SMB2 v2 lease epochsNamjae Jeon2-10/+30
v2 lease responses should continue from the client supplied epoch. Initialize a new v2 lease from the requested epoch plus one so create responses match the epoch returned by Windows and expected by smbtorture. For a single chained break sequence, increment the epoch only for the first break notification. Follow-up breaks such as RH->R and R->NONE in smb2.lease.v2_breaking3 reuse the same epoch. Record when a waiter slept behind pending_break and let the later truncate/open overwrite break consume that marker to reuse the current epoch instead of assigning a new one. Do not increment the epoch when a same-client, same-key create asks for the already granted RH state. The epoch changes only when the granted lease state changes. Signed-off-by: Namjae Jeon <linkinjeon@kernel.org> Signed-off-by: Steve French <stfrench@microsoft.com>
9 daysksmbd: do not wait for RH lease break ack on overwriteNamjae Jeon1-5/+12
smb2.lease.breaking4 expects an overwrite against an RH lease to send RH->NONE lease break notification but complete the triggering create without waiting for the break ack. Keep the lease in break-in-progress state until the client eventually acknowledges the downgrade, but do not hold the overwrite request behind that ack. Signed-off-by: Namjae Jeon <linkinjeon@kernel.org> Signed-off-by: Steve French <stfrench@microsoft.com>
9 daysksmbd: chain pending lease breaks before waking waitersNamjae Jeon1-27/+81
A pending open can require more than one lease break before the existing lease becomes compatible with the operation that triggered the break. smb2.lease.breaking3 expects the server to hold the pending normal open through RWH->RH and RH->R, while a later overwrite waiter must not collapse that second break directly to RH->NONE. Keep pending_break held for lease breaks until the current triggering operation is compatible with the lease state. Snapshot the truncate request per oplock_break() call so another waiter cannot overwrite the state of the active break. Use the requested oplock level when deciding whether to chain another break. A second lease open only needs RWH->RH, while a normal none-oplock open can continue down to R and then NONE. For non-truncating metadata operations, break leases only down to read caching. Operations such as delete-on-close need to drop handle caching, but should not send a second R->NONE break after the client acknowledges RH->R. Also send STATUS_PENDING for levelII/read-lease break waiters. An async SMB2 create becomes cancelable only after the server sends an NT_STATUS_PENDING interim response. A waiter that blocks behind an already active lease break must receive the interim response before sleeping on pending_break, otherwise the client can process a later lease break while the create request is still not marked pending. Avoid duplicate interim responses when an overwrite first breaks a write oplock and then scans levelII/read leases. Signed-off-by: Namjae Jeon <linkinjeon@kernel.org> Signed-off-by: Steve French <stfrench@microsoft.com>
9 daysksmbd: compute lease break-in-progress flag on responseNamjae Jeon1-12/+29
SMB2_LEASE_FLAG_BREAK_IN_PROGRESS is a transient create response flag, not persistent lease state. Do not store the flag in lease->flags when a same-key open is granted during a pending break. Instead, derive it from lease opens that are still waiting for a break ACK while building the lease create response, and keep lease->flags for persistent lease flags such as the parent lease key. This clears the flag naturally after the break ACK completes and fixes reopen responses that report BREAK_IN_PROGRESS after the lease is no longer breaking. Signed-off-by: Namjae Jeon <linkinjeon@kernel.org> Signed-off-by: Steve French <stfrench@microsoft.com>
9 daysksmbd: treat unnamed DATA stream as base fileNamjae Jeon1-3/+11
The SMB path suffix :: names the unnamed data stream of the base file, not an alternate data stream backed by a DosStream xattr. Canonicalize an empty stream name with an explicit type to a NULL stream name after parsing. This keeps the base filename produced by strsep() and lets open continue through the normal base-file path instead of looking for a non-existent empty stream xattr. Signed-off-by: Namjae Jeon <linkinjeon@kernel.org> Signed-off-by: Steve French <stfrench@microsoft.com>
9 daysksmbd: align SMB2 oplock break ack handlingNamjae Jeon1-58/+46
Handle SMB2 oplock break acknowledgments according to the server-side validation rules in MS-SMB2. Return STATUS_INVALID_DEVICE_STATE when an ACK arrives while the open is not breaking, reject SMB2_OPLOCK_LEVEL_LEASE with STATUS_INVALID_PARAMETER, allow BATCH acknowledgments to EXCLUSIVE, and make invalid ACK levels fail with STATUS_INVALID_OPLOCK_PROTOCOL after lowering the oplock to NONE. Update the successful response from the final granted oplock level instead of relying on the oplock transition helpers, which could turn invalid ACKs into successful responses. Signed-off-by: Namjae Jeon <linkinjeon@kernel.org> Signed-off-by: Steve French <stfrench@microsoft.com>
9 daysksmbd: share SMB2 lease state across opensNamjae Jeon3-119/+151
Model SMB2 leases as per-client/per-key objects instead of keeping a separate lease copy in every oplock_info. The lease table now stores lease objects and each lease tracks the opens that reference it. This makes same ClientGuid/LeaseKey opens observe a single lease state, so lease upgrades, breaks, ACKs, and close teardown do not diverge across per-open copies. Keep one reference for the lease table entry and one reference for each open, and remove the table entry when the last open is detached. Update lease break ACK handling to refresh all open oplock levels from the shared lease state. Signed-off-by: Namjae Jeon <linkinjeon@kernel.org> Signed-off-by: Steve French <stfrench@microsoft.com>
9 daysksmbd: clean up lease response flags and directory leasesNamjae Jeon1-8/+15
Do not echo reserved v1 lease flags back to clients. For lease v2 responses, only return BREAK_IN_PROGRESS and PARENT_LEASE_KEY_SET when they are meaningful, and preserve the parent lease key in the response. Allow directory leases whenever the request is a valid lease v2 request, and initialize v2 lease epochs from the first server-granted state change. Signed-off-by: Namjae Jeon <linkinjeon@kernel.org> Signed-off-by: Steve French <stfrench@microsoft.com>
9 daysksmbd: fix lease break and ack state handlingNamjae Jeon2-96/+34
Do not skip valid lease states containing WRITE_CACHING when breaking level-II/read leases for writes and truncates. Handle lease break acknowledgments according to the SMB2 rule that the acknowledged state must be a subset of the server's break target. Apply the acknowledged state directly and keep the break pending on failed ACKs. Signed-off-by: Namjae Jeon <linkinjeon@kernel.org> Signed-off-by: Steve French <stfrench@microsoft.com>
9 daysksmbd: use connection ClientGUID for lease lookupNamjae Jeon3-9/+11
MS-SMB2 defines the lease table lookup key as Connection.ClientGuid. Use the connection ClientGUID consistently when checking for same-client leases and duplicate lease keys. Also preserve directory and parent lease metadata when copying an existing lease state to a new open. Signed-off-by: Namjae Jeon <linkinjeon@kernel.org> Signed-off-by: Steve French <stfrench@microsoft.com>
9 daysksmbd: validate SMB2 lease create contextsNamjae Jeon2-6/+51
Validate SMB2 lease context lengths, requested lease state bits, and v2 flags before using the context. Return errors via ERR_PTR so CREATE can distinguish a missing lease context from a malformed one. Also ignore lease v2 contexts for SMB 2.1, where they are not valid. Signed-off-by: Namjae Jeon <linkinjeon@kernel.org> Signed-off-by: Steve French <stfrench@microsoft.com>
9 dayssmb/server: fix debug log endianness in smb2_cancel()ChenXiaoSong1-1/+2
Convert to CPU byte order to avoid incorrect debug log on big-endian architectures. Signed-off-by: ChenXiaoSong <chenxiaosong@kylinos.cn> Acked-by: Namjae Jeon <linkinjeon@kernel.org> Signed-off-by: Steve French <stfrench@microsoft.com>
9 daysksmbd: track the connection owning a byte-range lockNamjae Jeon3-8/+26
SMB2_LOCK adds each granted byte-range lock to both the file lock list and the lock list of the connection which handled the request. The final close and durable handle paths, however, remove the connection list entry while holding fp->conn->llist_lock. With SMB3 multichannel, the connection handling the LOCK request can be different from the connection which opened the file. The entry can therefore be removed under a different spinlock from the one protecting the list it belongs to. A concurrent traversal can then access freed struct ksmbd_lock and struct file_lock objects. Record the connection owning each lock's clist entry and hold a reference to it while the entry is linked. Use that connection and its llist_lock for unlock, rollback, close, and durable preserve. Durable reconnect assigns the new connection as the owner when publishing the locks again. Fixes: f5a544e3bab7 ("ksmbd: add support for SMB3 multichannel") Cc: stable@vger.kernel.org Reported-by: Musaab Khan <musaab.khan@protonmail.com> Signed-off-by: Namjae Jeon <linkinjeon@kernel.org> Signed-off-by: Steve French <stfrench@microsoft.com>
11 dayssmb: client: refactor ACL setting control flow in id_mode_to_cifs_acl()Ralph Boehme1-6/+10
Refactor the control flow in id_mode_to_cifs_acl() to reduce nesting and prevent error code overwriting. Instead of wrapping the call to ops->set_acl() in a conditional block, introduce early exits (goto id_mode_to_cifs_acl_exit) when build_sec_desc() fails or ops->set_acl is NULL. This ensures that any actual error returned by build_sec_desc() is not overwritten with -EOPNOTSUPP. Signed-off-by: Ralph Boehme <slow@samba.org> Signed-off-by: Steve French <stfrench@microsoft.com>
13 dayssmb: client: fix query directory replay double-freeHenrique Carvalho1-0/+2
A response-bearing attempt can return a replayable error and free its response buffer. If SMB2_query_directory_init() fails before the next send, cleanup retains the previous buffer type and frees that response again. Reset response bookkeeping before each attempt to prevent the stale free. Fixes: 4f1fffa23769 ("cifs: commands that are retried should have replay flag set") Cc: stable@vger.kernel.org Signed-off-by: Henrique Carvalho <henrique.carvalho@suse.com> Signed-off-by: Steve French <stfrench@microsoft.com>
13 dayssmb: client: fix change notify replay double-freeHenrique Carvalho1-0/+2
A response-bearing attempt can return a replayable error and free its response buffer. If SMB2_notify_init() fails before the next send, cleanup retains the previous buffer type and frees that response again. Reset response bookkeeping before each attempt to prevent the stale free. Fixes: 4f1fffa23769 ("cifs: commands that are retried should have replay flag set") Cc: stable@vger.kernel.org Signed-off-by: Henrique Carvalho <henrique.carvalho@suse.com> Signed-off-by: Steve French <stfrench@microsoft.com>
13 dayssmb: client: fix query_info() replay double-freeHenrique Carvalho1-0/+2
A response-bearing attempt can return a replayable error and free its response buffer. If SMB2_query_info_init() fails before the next send, cleanup retains the previous buffer type and frees that response again. Reset response bookkeeping before each attempt to prevent the stale free. Fixes: 4f1fffa23769 ("cifs: commands that are retried should have replay flag set") Cc: stable@vger.kernel.org Signed-off-by: Henrique Carvalho <henrique.carvalho@suse.com> Signed-off-by: Steve French <stfrench@microsoft.com>
13 dayssmb: client: fix double-free in SMB2_close() replayHenrique Carvalho1-0/+2
A response-bearing attempt can return a replayable error and free its response buffer. If SMB2_close_init() fails before the next send, cleanup retains the previous buffer type and frees that response again. Reset response bookkeeping before each attempt to prevent the stale free. Fixes: 4f1fffa23769 ("cifs: commands that are retried should have replay flag set") Cc: stable@vger.kernel.org Signed-off-by: Henrique Carvalho <henrique.carvalho@suse.com> Signed-off-by: Steve French <stfrench@microsoft.com>
13 dayssmb: client: fix double-free in SMB2_ioctl() replayHenrique Carvalho1-0/+2
A response-bearing attempt can return a replayable error and free its response buffer. If SMB2_ioctl_init() fails before the next send, cleanup retains the previous buffer type and frees that response again. Reset response bookkeeping before each attempt to prevent the stale free. Fixes: 4f1fffa23769 ("cifs: commands that are retried should have replay flag set") Cc: stable@vger.kernel.org Signed-off-by: Henrique Carvalho <henrique.carvalho@suse.com> Signed-off-by: Steve French <stfrench@microsoft.com>
13 dayssmb: client: fix double-free in SMB2_open() replayHenrique Carvalho1-0/+2
A response-bearing attempt can return a replayable error and free its response buffer. If SMB2_open_init() fails before the next send, cleanup retains the previous buffer type and frees that response again. Reset response bookkeeping before each attempt to prevent the stale free. Fixes: 4f1fffa23769 ("cifs: commands that are retried should have replay flag set") Cc: stable@vger.kernel.org Signed-off-by: Henrique Carvalho <henrique.carvalho@suse.com> Signed-off-by: Steve French <stfrench@microsoft.com>
13 dayssmb: client: fix double-free in SMB2_flush() replayZhao Zhang1-0/+2
SMB2_flush() keeps its response buffer bookkeeping across replay attempts. If a replayable flush response is received and the retry then fails before cifs_send_recv() stores a replacement response, flush_exit will free the stale response pointer a second time. Reinitialize resp_buftype and rsp_iov at the top of the replay loop so cleanup only acts on response state produced by the current attempt. This fixes a double-free without changing replay handling for successful requests. Fixes: 4f1fffa23769 ("cifs: commands that are retried should have replay flag set") Cc: stable@vger.kernel.org Reported-by: Yuan Tan <yuantan098@gmail.com> Reported-by: Zhengchuan Liang <zcliangcn@gmail.com> Reported-by: Xin Liu <bird@lzu.edu.cn> Assisted-by: Codex:GPT-5.4 Acked-by: Henrique Carvalho <henrique.carvalho@suse.com> Signed-off-by: Zhao Zhang <zzhan461@ucr.edu> Signed-off-by: Ren Wei <n05ec@lzu.edu.cn> Signed-off-by: Steve French <stfrench@microsoft.com>
13 daysMerge tag 'v7.2-rc-part1-ksmbd-fixes' of git://git.samba.org/ksmbdLinus Torvalds35-273/+1344
Pull smb server updates from Steve French: - Use after free fixes - Out of bounds read fix - Add SMB compression support both at rest and over the wire: support decompression of compressed SMB2 requests, initially allow compressed SMB2 READ responses, and implement get/set compression operations for per-file compression state. - Credentials fixes: for various FSCTLs, setinfo, delete on close and for alternate data streams - Fix access checks and permission checks in DUPLICAT_EXTENTS and SET_ZERO_DATA fsctls, find_file_posix_info, FILE_LINK_INFORMATION and smb2_set_info_sec - Reject non valid session in compound request - Serialize QUERY_DIRECTORY - Prevent path traversal bypass by restricting caseless retry - Path lookup fix - Two minor cleanup fixes * tag 'v7.2-rc-part1-ksmbd-fixes' of git://git.samba.org/ksmbd: (31 commits) ksmbd: fix path resolution in ksmbd_vfs_kern_path_create ksmbd: use opener credentials for FSCTL mutations ksmbd: use opener credentials for ADS I/O ksmbd: require source read access for duplicate extents ksmbd: run set info with opener credentials ksmbd: use opener credentials for delete-on-close ksmbd: serialize QUERY_DIRECTORY requests per file ksmbd: add permission checks for FSCTL_DUPLICATE_EXTENTS_TO_FILE ksmbd: enforce FILE_READ_ATTRIBUTES on SMB_FIND_FILE_POSIX_INFORMATION ksmbd: reject non-VALID session in compound request branch ksmbd: compress SMB2 READ responses ksmbd: negotiate and decode SMB2 compression cifs: negotiate chained SMB2 compression capabilities smb: add common SMB2 compression transform helpers smb: move LZ77 compression into common code ksmbd: add per-handle permission check to FILE_LINK_INFORMATION ksmbd: add a permission check for FSCTL_SET_ZERO_DATA ksmbd: add a WRITE_DAC/WRITE_OWNER check to SMB2 SET_INFO SECURITY ksmbd: fix use-after-free of a deferred file_lock on SMB2_CLOSE then SMB2_CANCEL smb: server: remove code guarded by nonexistent config option ...
13 daysMerge tag 'v7.2-rc-part1-smb3-client-fixes' of ↵Linus Torvalds18-228/+721
git://git.samba.org/sfrench/cifs-2.6 Pull smb client updates from Steve French: - Three cleanup patches - Fix error return value in smb2_aead_req_alloc - Three compression fixes - Update i_blocks after write (fixes various xfstests) - Fix races in cifsd thread creation - Fix potential out of bounds read parsing security descriptors - Witness protocol fix - Fix umount bug - Mount fix - Fix cached directory entries on unlink/rmdir/rename * tag 'v7.2-rc-part1-smb3-client-fixes' of git://git.samba.org/sfrench/cifs-2.6: smb: client: Use more common code in SMB2_tcon() smb: client: Use more common error handling code in smb3_reconfigure() smb/client: Fix error code in smb2_aead_req_alloc() smb/client: clean up a type issue in cifs_xattr_get() smb/client: allow FS_IOC_SETFLAGS to clear compression smb/client: use writable handle for FS_IOC_SETFLAGS compression smb/client: always return a value for FS_IOC_GETFLAGS smb/client: update i_blocks after contiguous writes smb: client: fix races in cifsd thread creation cifs: validate full SID length in security descriptors smb: client: resolve SWN tcon from live registrations cifs: remove all cifs files before kill super smb: client: fix conflicting option validation for new mount API cifs: invalidate cfid on unlink/rename/rmdir
2026-06-16ksmbd: fix path resolution in ksmbd_vfs_kern_path_createDavide Ornaghi3-40/+21
The SMB2 open lookup is rooted at the share with LOOKUP_BENEATH, but the create/mkdir/hardlink sink is not: ksmbd_vfs_kern_path_create() builds an absolute path with convert_to_unix_name() and resolves it from AT_FDCWD via start_creating_path(), so a ".." component is walked from the real filesystem root and escapes the export. An authenticated client races a missing path component so the rooted open lookup returns -ENOENT (taking the create branch) while the same component is present (a directory) when the create walk runs; the create then resolves ".." out of the share. Root the create walk at the share like the lookup and rename paths already are: resolve the parent with vfs_path_parent_lookup(..., LOOKUP_BENEATH, &share_conf->vfs_path) and create the final component with start_creating_noperm(). convert_to_unix_name() then has no callers and is removed. Fixes: 265fd1991c1d ("ksmbd: use LOOKUP_BENEATH to prevent the out of share access") Cc: stable@vger.kernel.org Signed-off-by: Davide Ornaghi <d.ornaghi97@gmail.com> Acked-by: Namjae Jeon <linkinjeon@kernel.org> Signed-off-by: Steve French <stfrench@microsoft.com>
2026-06-16ksmbd: use opener credentials for FSCTL mutationsNamjae Jeon2-7/+20
SET_SPARSE, SET_ZERO_DATA and SET_COMPRESSION operate on an open SMB handle but call VFS xattr, fallocate or fileattr helpers with the current ksmbd worker credentials. Those helpers can revalidate inode permissions, ownership and LSM policy independently of the SMB handle access mask. Run each operation with the credentials captured in the target file when the handle was opened. Keep credential handling local to these single-file FSCTLs rather than applying session credentials to the complete IOCTL handler, which also contains handle-less and multi-handle operations. Cc: stable@vger.kernel.org Reported-by: Musaab Khan <musaab.khan@protonmail.com> Signed-off-by: Namjae Jeon <linkinjeon@kernel.org> Signed-off-by: Steve French <stfrench@microsoft.com>
2026-06-16ksmbd: use opener credentials for ADS I/ONamjae Jeon1-2/+9
Alternate data streams are stored as xattrs. Unlike regular file I/O, their read and write paths therefore call VFS xattr helpers which recheck inode permissions and LSM policy using the current task credentials. Run ADS I/O with the credentials captured when the SMB handle was opened. Cc: stable@vger.kernel.org Reported-by: Musaab Khan <musaab.khan@protonmail.com> Signed-off-by: Namjae Jeon <linkinjeon@kernel.org> Signed-off-by: Steve French <stfrench@microsoft.com>
2026-06-16ksmbd: require source read access for duplicate extentsNamjae Jeon1-0/+4
FSCTL_DUPLICATE_EXTENTS_TO_FILE passes the source file directly to vfs_clone_file_range() or vfs_copy_file_range() without checking the SMB access mask granted to the source handle. A handle opened with attribute access can consequently be used to copy file contents into an attacker-readable destination. Require FILE_READ_DATA on the source handle before either VFS operation, matching other ksmbd data-copy paths. Cc: stable@vger.kernel.org Reported-by: Musaab Khan <musaab.khan@protonmail.com> Signed-off-by: Namjae Jeon <linkinjeon@kernel.org> Signed-off-by: Steve French <stfrench@microsoft.com>
2026-06-16ksmbd: run set info with opener credentialsNamjae Jeon1-5/+3
SMB2 SET_INFO handlers call path-based VFS helpers after checking the access mask granted to the SMB handle. Those helpers perform their owner, inode permission and LSM checks using the current ksmbd worker credentials. Run the complete SET_INFO dispatch with the credentials captured when the handle was opened. This also removes the separate security information credential setup and keeps all SET_INFO classes under one credential scope. Direct override_creds() is used because it can nest with the request credential overrides already used by rename and link helpers. Cc: stable@vger.kernel.org Reported-by: Musaab Khan <musaab.khan@protonmail.com> Signed-off-by: Namjae Jeon <linkinjeon@kernel.org> Signed-off-by: Steve French <stfrench@microsoft.com>
2026-06-16ksmbd: use opener credentials for delete-on-closeNamjae Jeon2-2/+9
Delete-on-close can be completed by deferred or durable handle teardown, where no request work is available. Both the base-file unlink and the ADS xattr removal consequently run with the ksmbd worker credentials and can bypass filesystem permission checks. Run both operations with the credentials captured in struct file when the handle was opened. This preserves the authenticated user's fsuid, fsgid, supplementary groups and capability restrictions at final close. Cc: stable@vger.kernel.org Reported-by: Musaab Khan <musaab.khan@protonmail.com> Signed-off-by: Namjae Jeon <linkinjeon@kernel.org> Signed-off-by: Steve French <stfrench@microsoft.com>
2026-06-16ksmbd: serialize QUERY_DIRECTORY requests per fileNamjae Jeon3-0/+7
smb2_query_dir() stores a pointer to its stack-allocated private data in the ksmbd_file readdir_data. Concurrent QUERY_DIRECTORY requests using the same file handle can overwrite this pointer while an iterate_dir() callback is still using it, resulting in a stack use-after-free. Add a per-file mutex and hold it while accessing the shared directory enumeration state. The lock covers scan restart, dot entry state, readdir_data setup and iteration, and response construction. This prevents another request from replacing readdir_data.private before the current request has finished using it and also serializes the shared file position. Cc: stable@vger.kernel.org Reported-by: zdi-disclosures@trendmicro.com # ZDI-CAN-30527 Signed-off-by: Namjae Jeon <linkinjeon@kernel.org> Signed-off-by: Steve French <stfrench@microsoft.com>
2026-06-16ksmbd: add permission checks for FSCTL_DUPLICATE_EXTENTS_TO_FILEGil Portnoy1-0/+11
The FSCTL_DUPLICATE_EXTENTS_TO_FILE arm of smb2_ioctl() overwrites the destination file's data via vfs_clone_file_range() with neither the share-level KSMBD_TREE_CONN_FLAG_WRITABLE check nor a per-handle fp->daccess check that the other write-bearing arms carry. A client can overwrite destination data on a read-only share, or from a handle opened with only FILE_WRITE_ATTRIBUTES (which still yields an FMODE_WRITE filp). FILE_WRITE_ATTRIBUTES-only destination handle overwrote the file's data via the clone. Add both checks, matching the FSCTL_SET_SPARSE permission fix; require FILE_WRITE_DATA since this writes data. Cc: stable@vger.kernel.org Signed-off-by: Gil Portnoy <dddhkts1@gmail.com> Acked-by: Namjae Jeon <linkinjeon@kernel.org> Signed-off-by: Steve French <stfrench@microsoft.com>
2026-06-16ksmbd: enforce FILE_READ_ATTRIBUTES on SMB_FIND_FILE_POSIX_INFORMATIONGil Portnoy1-0/+6
find_file_posix_info() in smb2_query_info() returns file metadata (owner uid, group gid, mode, inode, size, allocation size, hard-link count and all four timestamps) but performs no per-handle access check. Every sibling query handler gates on the handle's granted access first -- get_file_basic_info(), get_file_all_info(), get_file_network_open_info() and get_file_attribute_tag_info() all reject a handle lacking FILE_READ_ATTRIBUTES_LE with -EACCES. The POSIX handler is gated only by the connection-scoped tcon->posix_extensions flag, which is not a per-handle authorization, so a handle opened with only FILE_WRITE_DATA is correctly denied FileBasicInformation yet is allowed the strict-superset POSIX info. Mirror the FILE_READ_ATTRIBUTES_LE gate the sibling info handlers already use. Fixes: e2f34481b24d ("cifsd: add server-side procedures for SMB3") Cc: stable@vger.kernel.org Signed-off-by: Gil Portnoy <dddhkts1@gmail.com> Acked-by: Namjae Jeon <linkinjeon@kernel.org> Signed-off-by: Steve French <stfrench@microsoft.com>
2026-06-16ksmbd: reject non-VALID session in compound request branchGil Portnoy1-0/+5
smb2_check_user_session() takes a shortcut for any operation that is not the first in a COMPOUND request: it reuses work->sess (the session bound by the first operation) and validates only the SessionId, then returns "valid". It never re-checks work->sess->state == SMB2_SESSION_VALID, and a SessionId of 0xFFFFFFFFFFFFFFFF (ULLONG_MAX, the MS-SMB2 related-operation value) skips even the id comparison. The standalone path (ksmbd_session_lookup_all() plus the SESSION_SETUP state machine) does enforce the VALID state; the compound branch bypasses all of it. A SESSION_SETUP carrying only an NTLM Type-1 (NtLmNegotiate) blob publishes a fresh SMB2_SESSION_IN_PROGRESS session whose sess->user is still NULL (->user is assigned later, by ntlm_authenticate()). Used as operation 1 of a COMPOUND with operation 2 = TREE_CONNECT (related, SessionId=ULLONG_MAX, \\host\IPC$), the tree-connect then runs on that IN_PROGRESS session and reaches ksmbd_ipc_tree_connect_request(), which dereferences user_name(sess->user) with sess->user == NULL (transport_ipc.c:687/701/704) -> remote NULL-pointer dereference and a kernel Oops that wedges the ksmbd worker for all clients. Reject any non-first compound operation that lands on a session which is not SMB2_SESSION_VALID, mirroring the validity the standalone lookup path enforces. SESSION_SETUP itself legitimately runs on an IN_PROGRESS session, but it is never carried as a non-first compound operation, so multi-leg authentication is unaffected by this check. Fixes: 5005bcb42191 ("ksmbd: validate session id and tree id in the compound request") Cc: stable@vger.kernel.org Signed-off-by: Gil Portnoy <dddhkts1@gmail.com> Acked-by: Namjae Jeon <linkinjeon@kernel.org> Signed-off-by: Steve French <stfrench@microsoft.com>
2026-06-16ksmbd: compress SMB2 READ responsesNamjae Jeon6-0/+157
Handle SMB2_READFLAG_REQUEST_COMPRESSED for non-RDMA reads. Flatten the response iov, emit chained or unchained LZ77 transforms when compression is beneficial, and retain the generated buffer until the work item is released. Signed-off-by: Namjae Jeon <linkinjeon@kernel.org> Signed-off-by: Steve French <stfrench@microsoft.com>
2026-06-16ksmbd: negotiate and decode SMB2 compressionNamjae Jeon7-13/+210
Parse the SMB 3.1.1 compression capabilities context and negotiate LZ77 with optional chained Pattern_V1 support. Advertise compression on tree connections and decode compressed requests before normal SMB dispatch. Signed-off-by: Namjae Jeon <linkinjeon@kernel.org> Signed-off-by: Steve French <stfrench@microsoft.com>
2026-06-16cifs: negotiate chained SMB2 compression capabilitiesNamjae Jeon2-9/+37
Advertise LZ77 and Pattern_V1 with chained transform support in the SMB 3.1.1 compression negotiate context. Validate the server's returned algorithm list and flags, then retain the negotiated capabilities for a future compressed transform receive implementation. This patch only negotiates capabilities. It does not request compressed READ responses or add a compressed transform receive path. Signed-off-by: Namjae Jeon <linkinjeon@kernel.org> Signed-off-by: Steve French <stfrench@microsoft.com>
2026-06-16smb: add common SMB2 compression transform helpersNamjae Jeon4-8/+410
Implement common validation, compression and decompression for SMB2 compression transforms. Support unchained LZ77 and chained NONE, LZ77 and Pattern_V1 payloads. Signed-off-by: Namjae Jeon <linkinjeon@kernel.org> Signed-off-by: Steve French <stfrench@microsoft.com>
2026-06-16smb: move LZ77 compression into common codeNamjae Jeon7-44/+176
Move the LZ77 codec in cifs.ko to smb/common/ so both the SMB client and ksmbd can use it. Signed-off-by: Namjae Jeon <linkinjeon@kernel.org> Signed-off-by: Steve French <stfrench@microsoft.com>
2026-06-16ksmbd: add per-handle permission check to FILE_LINK_INFORMATIONGil Portnoy1-0/+5
The FILE_LINK_INFORMATION arm of smb2_set_info_file() calls smb2_create_link() with no per-handle fp->daccess check. On the ReplaceIfExists path smb2_create_link() unlinks an existing file at the target name (ksmbd_vfs_remove_file) and creates a hardlink (ksmbd_vfs_link); neither helper checks daccess. A handle opened with FILE_READ_DATA only (no FILE_DELETE, no FILE_WRITE_DATA) can therefore delete an arbitrary file in the share and plant a hardlink over its name. The sibling delete/move arms in the same switch already gate: FILE_RENAME_INFORMATION and FILE_DISPOSITION_INFORMATION both require FILE_DELETE_LE; FILE_FULL_EA_INFORMATION requires FILE_WRITE_EA_LE. Gate the link arm the same way as its closest analogue (rename), since it mutates the namespace and, on replace, deletes an existing entry. This is a sibling of commit cc57232cae23 ("ksmbd: fix FSCTL permission bypass by adding a permission check for FSCTL_SET_SPARSE"). Cc: stable@vger.kernel.org Signed-off-by: Gil Portnoy <dddhkts1@gmail.com> Acked-by: Namjae Jeon <linkinjeon@kernel.org> Signed-off-by: Steve French <stfrench@microsoft.com>
2026-06-16ksmbd: add a permission check for FSCTL_SET_ZERO_DATAGil Portnoy1-0/+6
FSCTL_SET_ZERO_DATA in smb2_ioctl() destroys file data via ksmbd_vfs_zero_data() -> vfs_fallocate(PUNCH_HOLE/ZERO_RANGE) after checking only the share-level KSMBD_TREE_CONN_FLAG_WRITABLE, with no per-handle access check. A handle opened with only FILE_WRITE_ATTRIBUTES still yields an FMODE_WRITE filp (FILE_WRITE_ATTRIBUTES is part of FILE_WRITE_DESIRE_ACCESS_LE, so smb2_create_open_flags() opens it O_WRONLY), so the vfs_fallocate FMODE_WRITE check does not stop it; only the missing fp->daccess gate would. Reproduced on mainline 7.1-rc7 with KASAN by an authenticated SMB client: a FILE_WRITE_ATTRIBUTES-only handle zeroed 4096 bytes of file data it had no FILE_WRITE_DATA right to (6/6; a FILE_READ_DATA-only handle was correctly denied). This is the unfixed sibling of commit cc57232cae23 ("ksmbd: fix FSCTL permission bypass by adding a permission check for FSCTL_SET_SPARSE"). Because SET_ZERO_DATA writes data (not an attribute), require FILE_WRITE_DATA. Cc: stable@vger.kernel.org Signed-off-by: Gil Portnoy <dddhkts1@gmail.com> Acked-by: Namjae Jeon <linkinjeon@kernel.org> Signed-off-by: Steve French <stfrench@microsoft.com>
2026-06-16ksmbd: add a WRITE_DAC/WRITE_OWNER check to SMB2 SET_INFO SECURITYGil Portnoy1-0/+3
commit cc57232cae23 ("ksmbd: fix FSCTL permission bypass by adding a permission check for FSCTL_SET_SPARSE") added a fp->daccess gate to fsctl_set_sparse and noted that "similar handle-level checks exist in other functions but are missing here." The SMB2 SET_INFO SECURITY arm is one of the missing ones, and the most security-relevant: smb2_set_info_sec() calls set_info_sec() with no per-handle access check. set_info_sec() (fs/smb/server/smbacl.c) re-permissions the file: it rewrites owner/group/mode via notify_change(), rewrites the POSIX ACL via set_posix_acl(), and on KSMBD_SHARE_FLAG_ACL_XATTR shares removes and rewrites the Windows security descriptor via ksmbd_vfs_set_sd_xattr(). Every other persistent-mutation arm of the sibling handler smb2_set_info_file() checks fp->daccess first (FILE_WRITE_DATA / FILE_DELETE / FILE_WRITE_EA / FILE_WRITE_ATTRIBUTES); the SECURITY arm — which mutates the access control itself — is the only one with no gate. A client can therefore open a handle with FILE_WRITE_ATTRIBUTES only (no FILE_WRITE_DAC / FILE_WRITE_OWNER) and use SMB2_SET_INFO with InfoType SMB2_O_INFO_SECURITY to rewrite the file's DACL and owner, granting itself access the handle's daccess never carried. Unlike the FSCTL data arms this is a metadata/xattr operation, so there is no FMODE_WRITE VFS backstop — the missing fp->daccess check is the entire gate. Setting a security descriptor is the WRITE_DAC / WRITE_OWNER operation, so require at least one of those on the handle before re-permissioning the file. -EACCES is mapped to STATUS_ACCESS_DENIED by smb2_set_info(). Cc: stable@vger.kernel.org Signed-off-by: Gil Portnoy <dddhkts1@gmail.com> Acked-by: Namjae Jeon <linkinjeon@kernel.org> Signed-off-by: Steve French <stfrench@microsoft.com>
2026-06-16ksmbd: fix use-after-free of a deferred file_lock on SMB2_CLOSE then SMB2_CANCELGil Portnoy1-7/+7
Commit f580d27e8928 ("ksmbd: fix use-after-free of a deferred file_lock on double SMB2_CANCEL") made smb2_cancel() skip a work whose state is KSMBD_WORK_CANCELLED, so its cancel_fn cannot be fired a second time. But KSMBD_WORK has three states (ACTIVE, CANCELLED, CLOSED), and the same freeing producer path is reached for CLOSED too: SMB2_CLOSE on the locking handle -> set_close_state_blocked_works() sets the deferred work's state to KSMBD_WORK_CLOSED and wakes the smb2_lock() worker. The worker takes the non-ACTIVE early-exit, locks_free_lock()s the file_lock and, because the state is not KSMBD_WORK_CANCELLED, takes the STATUS_RANGE_NOT_LOCKED branch with "goto out2" -- which, like the cancelled branch, skips release_async_work(). The work stays on conn->async_requests with a live cancel_fn = smb2_remove_blocked_lock pointing at the freed file_lock. A subsequent SMB2_CANCEL for the same AsyncId then passes the KSMBD_WORK_CANCELLED-only guard (its state is KSMBD_WORK_CLOSED), so smb2_cancel() fires cancel_fn again over the freed file_lock -- the same use-after-free fixed, via SMB2_CLOSE instead of a first SMB2_CANCEL: BUG: KASAN: slab-use-after-free in __locks_delete_block __locks_delete_block locks_delete_block ksmbd_vfs_posix_lock_unblock smb2_remove_blocked_lock smb2_cancel <- 2nd SMB2_CANCEL fires cancel_fn handle_ksmbd_work Allocated by ...: locks_alloc_lock <- smb2_lock Freed by ...: locks_free_lock <- smb2_lock (non-ACTIVE early-exit) ... cache file_lock_cache of size 192 Reproduced on mainline 7.1-rc7 (which already contains f580d27e8928) with KASAN by an authenticated SMB client; the double-SMB2_CANCEL control is silent on that kernel, so the splat is attributable to the CLOSE trigger. Only an ACTIVE deferred work may have its cancel_fn fired: both terminal states (CANCELLED and CLOSED) reach the smb2_lock() early-exit that frees the file_lock and skips release_async_work(). Guard on KSMBD_WORK_ACTIVE so any non-active work is skipped. Fixes: f580d27e8928 ("ksmbd: fix use-after-free of a deferred file_lock on double SMB2_CANCEL") Cc: stable@vger.kernel.org Signed-off-by: Gil Portnoy <dddhkts1@gmail.com> Acked-by: Namjae Jeon <linkinjeon@kernel.org> Signed-off-by: Steve French <stfrench@microsoft.com>
2026-06-16smb: server: remove code guarded by nonexistent config optionEthan Nelson-Moore1-3/+0
A small piece of code in fs/smb/server/smb_common.c depends on CONFIG_SMB_INSECURE_SERVER, which has never been defined in the mainline kernel, but was present in old out-of-tree versions of ksmbd. Remove this dead code. Discovered while searching for CONFIG_* symbols referenced in code but not defined in any Kconfig file. Signed-off-by: Ethan Nelson-Moore <enelsonmoore@gmail.com> Acked-by: Namjae Jeon <linkinjeon@kernel.org> Signed-off-by: Steve French <stfrench@microsoft.com>
2026-06-16smb/server: fix incorrect file size in get_file_compression_info()ChenXiaoSong1-1/+1
Before this patch, we got the wrong file size: - client: touch /mnt/file - client: smbinfo setcompression default /mnt/file - client: dd if=/dev/zero of=/mnt/file bs=1 count=1000 - client: smbinfo filecompressioninfo /mnt/file Compressed File Size: 4096 Compression Format: 2 (LZNT1) After this patch, we get the correct file size: - client: smbinfo filecompressioninfo /mnt/file Compressed File Size: 1000 Compression Format: 2 (LZNT1) Note that the actual compressed file size must be got by other methods. For Btrfs, use the following command to get actual compressed file size: - server: compsize /export/file Processed 1 file, 0 regular extents (0 refs), 1 inline. Type Perc Disk Usage Uncompressed Referenced TOTAL 4% 47B 1000B 1000B zlib 4% 47B 1000B 1000B Signed-off-by: ChenXiaoSong <chenxiaosong@kylinos.cn> Acked-by: Namjae Jeon <linkinjeon@kernel.org> Signed-off-by: Steve French <stfrench@microsoft.com>
2026-06-16smb/server: get compression format in get_file_compression_info()ChenXiaoSong1-1/+6
I have added `filecompressioninfo` subcommand to `smbinfo` in cifs-utils.git. Example: 1. client: smbinfo setcompression lznt1 /mnt/file 2. client: smbinfo filecompressioninfo /mnt/file Compressed File Size: 104857600 Compression Format: 2 (LZNT1) Compression Unit Shift: 0 Chunk Shift: 0 Cluster Shift: 0 Signed-off-by: ChenXiaoSong <chenxiaosong@kylinos.cn> Acked-by: Namjae Jeon <linkinjeon@kernel.org> Signed-off-by: Steve French <stfrench@microsoft.com>
2026-06-16smb/server: implement FSCTL_SET_COMPRESSION ioctl handlerChenXiaoSong4-1/+100
Example: 1. client: smbinfo setcompression no /mnt/file 2. client: smbinfo getcompression /mnt/file Compression: 0 (NONE) 3. client: smbinfo setcompression lznt1 /mnt/file 4. client: smbinfo getcompression /mnt/file Compression: 2 (LZNT1) 5. client: smbinfo setcompression default /mnt/file 6. client: smbinfo getcompression /mnt/file Compression: 2 (LZNT1) Signed-off-by: ChenXiaoSong <chenxiaosong@kylinos.cn> Acked-by: Namjae Jeon <linkinjeon@kernel.org> Signed-off-by: Steve French <stfrench@microsoft.com>
2026-06-16smb/server: implement FSCTL_GET_COMPRESSION ioctl handlerChenXiaoSong4-1/+53
Example: 1. server: chattr +c /export/file 2. client: smbinfo getcompression /mnt/file Compression: 2 (LZNT1) Signed-off-by: ChenXiaoSong <chenxiaosong@kylinos.cn> Acked-by: Namjae Jeon <linkinjeon@kernel.org> Signed-off-by: Steve French <stfrench@microsoft.com>
2026-06-16smb/server: get compression file attribute on openChenXiaoSong3-1/+23
Example: 1. server: chattr +c /export/file 2. client: lsattr /mnt/file --------c------------- /mnt/file Signed-off-by: ChenXiaoSong <chenxiaosong@kylinos.cn> Acked-by: Namjae Jeon <linkinjeon@kernel.org> Signed-off-by: Steve French <stfrench@microsoft.com>
2026-06-16smb: move compression definitions into common/fscc.hChenXiaoSong4-12/+18
These definitions will also be used by ksmbd, move them into common header file. Signed-off-by: ChenXiaoSong <chenxiaosong@kylinos.cn> Acked-by: Namjae Jeon <linkinjeon@kernel.org> Signed-off-by: Steve French <stfrench@microsoft.com>
2026-06-16smb: remove duplicate server/smbfsctl.hChenXiaoSong3-97/+12
Rename the following places: - FSCTL_COPYCHUNK -> FSCTL_SRV_COPYCHUNK - FSCTL_COPYCHUNK_WRITE -> FSCTL_SRV_COPYCHUNK_WRITE - FSCTL_REQUEST_RESUME_KEY -> FSCTL_SRV_REQUEST_RESUME_KEY server/smbfsctl.h contains the following additional definitions compared to common/smbfsctl.h: - IO_REPARSE_TAG_LX_SYMLINK_LE - IO_REPARSE_TAG_AF_UNIX_LE - IO_REPARSE_TAG_LX_FIFO_LE - IO_REPARSE_TAG_LX_CHR_LE - IO_REPARSE_TAG_LX_BLK_LE Signed-off-by: ChenXiaoSong <chenxiaosong@kylinos.cn> Acked-by: Namjae Jeon <linkinjeon@kernel.org> Signed-off-by: Steve French <stfrench@microsoft.com>
2026-06-16ksmbd: prevent path traversal bypass by restricting caseless retryNamjae Jeon1-1/+1
ksmbd_vfs_path_lookup() enforces LOOKUP_BENEATH to restrict path resolution within the share root. When a crafted path attempts to escape the share boundary using parent-directory components ('..'), vfs_path_parent_lookup() detects this and immediately fails, returning -EXDEV. However, a bug exists in __ksmbd_vfs_kern_path() under caseless mode. The function fails to intercept the -EXDEV error and erroneously falls through to the caseless retry logic, which is intended only for genuinely missing files. During this retry process, the path is reconstructed, leading to an unintended LOOKUP_BENEATH bypass that allows write-capable users to create zero-length files or directories outside the exported share. Fix this by ensuring that the execution only proceeds to the caseless lookup retry when the error is specifically -ENOENT. Any other errors, such as -EXDEV from a path traversal attempt, must be returned immediately. Cc: stable@vger.kernel.org Reported-by: Y s65 <yu4ys@outlook.com> Signed-off-by: Namjae Jeon <linkinjeon@kernel.org> Signed-off-by: Steve French <stfrench@microsoft.com>
2026-06-16ksmbd: fix UAF of struct file_lock in SMB2_LOCK deferred-lock cancellationDavide Ornaghi1-18/+16
When a blocking byte-range lock request is deferred in the FILE_LOCK_DEFERRED path, ksmbd registers the asynchronous work into the connection's async_requests list via setup_async_work(). The cancel callback smb2_remove_blocked_lock() holds a reference to the flock. If the lock waiter is subsequently woken up but the work state is no longer KSMBD_WORK_ACTIVE (e.g., due to a concurrent cancellation), the cleanup path calls locks_free_lock(flock) without dequeuing the work from the async_requests list. Concurrently, smb2_cancel() walks the list under conn->request_lock and invokes the cancel callback, which then dereferences the already freed 'flock'. This leads to a slab-use-after-free inside __wake_up_common. Fix this by restructuring the cleanup logic after the worker returns from ksmbd_vfs_posix_lock_wait(). Move list_del(&smb_lock->llist) and release_async_work(work) to the top of the cleanup block. This guarantees that the async work is completely dequeued and serialized under conn->request_lock before locks_free_lock(flock) is called, rendering the flock unreachable for any concurrent smb2_cancel(). Cc: stable@vger.kernel.org Signed-off-by: Davide Ornaghi <d.ornaghi97@gmail.com> Signed-off-by: Namjae Jeon <linkinjeon@kernel.org> Signed-off-by: Steve French <stfrench@microsoft.com>
2026-06-16ksmbd: fix use-after-free in same_client_has_lease()Guangshuo Li1-0/+6
same_client_has_lease() returns an opinfo pointer from ci->m_op_list after dropping ci->m_lock without taking a reference. smb_grant_oplock() then dereferences that pointer in copy_lease() and when checking breaking_cnt. A concurrent close can remove the old lease from ci->m_op_list and drop the last reference before the caller uses the returned pointer, leading to a use-after-free. Take a reference when same_client_has_lease() selects an existing lease, drop any previous match while scanning, and release the returned reference in smb_grant_oplock() after copying the lease state. Fixes: e2f34481b24d ("cifsd: add server-side procedures for SMB3") Signed-off-by: Guangshuo Li <lgs201920130244@gmail.com> Acked-by: Namjae Jeon <linkinjeon@kernel.org> Signed-off-by: Steve French <stfrench@microsoft.com>
2026-06-16ksmbd: fix out-of-bounds read in smb_check_perm_dacl()Hem Parekh1-1/+3
The permission-check ACE walk in smb_check_perm_dacl() validates the ACE header size and caps sid.num_subauth at SID_MAX_SUB_AUTHORITIES, but it never checks that ace->size is actually large enough to contain num_subauth sub-authorities before compare_sids() dereferences them. CIFS_SID_BASE_SIZE covers the SID header up to but excluding the sub_auth[] array, and offsetof(struct smb_ace, sid) is the ACE header, so the existing guards only guarantee the 8-byte SID base, i.e. zero sub-authorities. compare_sids() then reads ace->sid.sub_auth[i] for i < min(local_sid->num_subauth, ace->sid.num_subauth). The local comparison SIDs (sid_everyone, sid_unix_NFS_mode, and the id_to_sid() result) always have at least one sub-authority, and an attacker controls the ACE revision and authority bytes (which lie within the in-bounds SID base), so they can match one of those SIDs and force the sub_auth read. A crafted ACE with size == 16 and num_subauth >= 1 placed at the tail of the security descriptor therefore causes a heap out-of-bounds read of up to SID_MAX_SUB_AUTHORITIES * sizeof(__le32) bytes past the pntsd allocation. The security descriptor is loaded by ksmbd_vfs_get_sd_xattr() into a buffer sized exactly to the on-disk data (kzalloc(sd_size) in ndr_decode_v4_ntacl()), so the read lands past the allocation. The malformed descriptor can be stored verbatim via SMB2_SET_INFO (the DACL is not normalised before being written to the security.NTACL xattr) and the read fires on a subsequent SMB2_CREATE access check, making this reachable by an authenticated client on a share that uses ACL xattrs. Add the missing num_subauth-versus-ace_size check, mirroring the identical guards already present in the sibling parsers parse_dacl() and smb_inherit_dacl(). Fixes: d07b26f39246 ("ksmbd: require minimum ACE size in smb_check_perm_dacl()") Cc: stable@vger.kernel.org Signed-off-by: Hem Parekh <hemparekh1596@gmail.com> Acked-by: Namjae Jeon <linkinjeon@kernel.org> Signed-off-by: Steve French <stfrench@microsoft.com>
2026-06-15Merge tag 'pull-dcache' of ↵Linus Torvalds1-2/+1
git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs Pull dcache updates from Al Viro: - d_alloc_parallel() API change (Neil's with my changes) - NORCU fixes - Reorganization and simplification of dentry eviction logic - Simplifying rcu_read_lock() scopes in fs/dcache.c - Secondary roots work - getting rid of NFS fake root dentries and dealing with remaining shrink_dcache_for_umount() and shrink_dentry_list() races - making cursors NORCU (surprisingly easy) * tag 'pull-dcache' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (22 commits) make cursors NORCU nfs: get rid of fake root dentries wind ->s_roots via ->d_sib instead of ->d_hash shrink_dentry_tree(): unify the calls of shrink_dentry_list() shrinking rcu_read_lock() scope in d_alloc_parallel() d_walk(): shrink rcu_read_lock() scope document dentry_kill() adjust calling conventions of lock_for_kill(), fold __dentry_kill() into dentry_kill() Document rcu_read_lock() use in select_collect2() Shift rcu_read_{,un}lock() inside fast_dput() simplify safety for lock_for_kill() slowpath fold lock_for_kill() and __dentry_kill() into common helper fold lock_for_kill() into shrink_kill() shrink_dentry_list(): start with removing from shrink list d_prune_aliases(): make sure to skip NORCU aliases kill d_dispose_if_unused() make to_shrink_list() return whether it has moved dentry to list select_collect(): ignore dentries on shrink lists if they have positive refcounts find_acceptable_alias(): skip NORCU aliases with zero refcount fix a race between d_find_any_alias() and final dput() of NORCU dentries ...
2026-06-15Merge tag 'vfs-7.2-rc1.super' of ↵Linus Torvalds5-22/+34
git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs Pull vfs superblock updates from Christian Brauner: "This retires sget(). CIFS plus the two ext4 KUnit tests (extents-test, mballoc-test) were the last in-tree callers, and all three convert cleanly to sget_fc(). That lets sget() and its prototype come out, taking ~60 lines that only existed to be kept in lockstep with sget_fc() on every publish-path change" * tag 'vfs-7.2-rc1.super' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs: fs: retire sget() smb: client: convert cifs_smb3_do_mount() to sget_fc() ext4: convert mballoc KUnit test to sget_fc() ext4: convert extents KUnit test to sget_fc()
2026-06-15Merge tag 'vfs-7.2-rc1.openat2' of ↵Linus Torvalds2-15/+18
git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs Pull openat2 updates from Christian Brauner: "Features: - Add O_EMPTYPATH to openat(2)/openat2(2). To get an operable file descriptor from an O_PATH file descriptor it is possible to use openat(fd, ".", O_DIRECTORY) for directories, but other file types require going through open("/proc/<pid>/fd/<nr>") and thus depend on a functioning procfs. With O_EMPTYPATH an empty path string is accepted and LOOKUP_EMPTY is set at path resolution time, allowing to reopen the file behind the file descriptor directly. Selftests are included. - Add an OPENAT2_REGULAR flag for openat2(2) which refuses to open anything but regular files with the new EFTYPE error code. This implements the "ability to only open regular files" feature requested by userspace via uapi-group.org and protects services from being redirected to fifos, device nodes, and friends. All atomic_open implementations were audited for OPENAT2_REGULAR handling. Explicit checks were added to ceph, gfs2, nfs (v4), and cifs/smb - these are the filesystems whose atomic_open can encounter an existing non-regular file and would otherwise call finish_open() on it or return a misleading error code. The remaining implementations (9p, fuse, vboxsf, nfs v2/v3) only call finish_open() on freshly created files and use finish_no_open() for lookup hits, letting the VFS catch non-regular files via the do_open() safety net. Cleanups: - Migrate the openat2 selftests to the kselftest harness and move them under selftests/filesystems/. The tests were written in the early days of selftests' TAP support and the modern kselftest harness is much easier to follow and maintain. The contents of the tests are unchanged and the new emptypath tests are ported on top. - Make the LAST_XXX last-type constants private to fs/namei.c. The only user outside of fs/namei.c was ksmbd which only needs to know whether the last component is a regular one, so vfs_path_parent_lookup() now performs the LAST_NORM check internally. The ints are replaced with a dedicated enum last_type" * tag 'vfs-7.2-rc1.openat2' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs: vfs: replace ints with enum last_type for LAST_XXX vfs: make LAST_XXX private to fs/namei.c selftests: openat2: port emptypath_test to kselftest harness kselftest/openat2: test for OPENAT2_REGULAR flag openat2: new OPENAT2_REGULAR flag support openat2: introduce EFTYPE error code selftest: add tests for O_EMPTYPATH vfs: add O_EMPTYPATH to openat(2)/openat2(2) selftests: openat2: migrate to kselftest harness selftests: openat2: switch from custom ARRAY_LEN to ARRAY_SIZE selftests: openat2: move helpers to header selftests: move openat2 tests to selftests/filesystems/
2026-06-15Merge tag 'vfs-7.2-rc1.casefold' of ↵Linus Torvalds4-6/+81
git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs Pull vfs casefolding updates from Christian Brauner: "This exposes the case folding behavior of local filesystems so that file servers - nfsd, ksmbd, and user space file servers - can report the actual behavior to clients instead of guessing. Filesystems report case-insensitive and case-nonpreserving behavior via new file_kattr flags in their fileattr_get implementations. fat, exfat, ntfs3, hfs, hfsplus, xfs, cifs, nfs, vboxsf, and isofs are wired up. Local filesystems that are not explicitly handled default to the usual POSIX behavior of case-sensitive and case-preserving. nfsd uses this to report case folding via NFSv3 PATHCONF and to implement the NFSv4 FATTR4_CASE_INSENSITIVE and FATTR4_CASE_PRESERVING attributes - both have been part of the NFS protocols for decades to support clients on non-POSIX systems - and ksmbd reports it via FS_ATTRIBUTE_INFORMATION. Exposing the information through the fileattr uapi covers user space file servers. The immediate motivation is interoperability: Windows NFS clients hard-require servers to report case-insensitivity for Win32 applications to work correctly, and a client that knows the server is case-insensitive can avoid issuing multiple LOOKUP/READDIR requests searching for case variants. The Linux NFS client already grew support for case-insensitive shares years ago in support of the Hammerspace NFS server - negative dentry caching must be disabled (a lookup for "FILE.TXT" failing must not cache a negative entry when "file.txt" exists) and directory change invalidation must drop cached case-folded name variants. Such servers often operate in multi-protocol environments where a single file service instance caters to both NFS and SMB clients, and nfsd needs to report case folding properly to participate as a first-class citizen there. A follow-up series brings fixes for the initial work: the nfsd case-info probe now uses kernel credentials, maps -ESTALE to NFS3ERR_STALE, and has its cost capped across READDIR entries; the nfs client avoids transiently zeroed case capability bits during the probe and skips the pathconf probe when neither field is consumed; the FS_CASEFOLD_FL semantics are clarified in the UAPI header; and the tools UAPI headers are synced" * tag 'vfs-7.2-rc1.casefold' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs: (22 commits) nfsd: Cap case-folding probe cost across READDIR entries nfsd: Map -ESTALE from case probe to NFS3ERR_STALE nfsd: Use kernel credentials for case-info probe fs: Clarify FS_CASEFOLD_FL semantics in UAPI header nfs: Skip pathconf probe when neither field is consumed nfs: Avoid transient zeroed case capability bits during probe tools headers UAPI: Sync case-sensitivity flags from linux/fs.h ksmbd: Report filesystem case sensitivity via FS_ATTRIBUTE_INFORMATION nfsd: Implement NFSv4 FATTR4_CASE_INSENSITIVE and FATTR4_CASE_PRESERVING nfsd: Report export case-folding via NFSv3 PATHCONF isofs: Implement fileattr_get for case sensitivity vboxsf: Implement fileattr_get for case sensitivity nfs: Implement fileattr_get for case sensitivity cifs: Implement fileattr_get for case sensitivity xfs: Report case sensitivity in fileattr_get hfsplus: Report case sensitivity in fileattr_get hfs: Implement fileattr_get for case sensitivity ntfs3: Implement fileattr_get for case sensitivity exfat: Implement fileattr_get for case sensitivity fat: Implement fileattr_get for case sensitivity ...
2026-06-15Merge tag 'vfs-7.2-rc1.inode' of ↵Linus Torvalds1-1/+1
git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs Pull vfs inode updates from Christian Brauner: "This extends the lockless ->i_count handling. iput() could already decrement any value greater than one locklessly but acquiring a reference always required taking inode->i_lock. Now acquiring a reference is lockless as long as the count was already at least 1, i.e., only the 0->1 and 1->0 transitions take the lock. This avoids the lock for the common cases of nfs calling into the inode hash and btrfs using igrab(). Cleanup-wise icount_read_once() is added to line up with inode_state_read_once() and the open-coded ->i_count loads across the tree are converted, and ihold() is relocated and tidied up. On top of that some stale lock ordering annotations are retired from the inode hash code: iunique() no longer takes the hash lock since the inode hash became RCU-searchable and s_inode_list_lock is no longer taken under the hash lock either" * tag 'vfs-7.2-rc1.inode' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs: fs: retire stale lock ordering annotations from inode hash fs: allow lockless ->i_count bumps as long as it does not transition 0->1 fs: relocate and tidy up ihold() fs: add icount_read_once() and stop open-coding ->i_count loads
2026-06-14smb: client: Use more common code in SMB2_tcon()Markus Elfring1-6/+5
Use an additional label so that a bit of common code can be better reused at the end of this function implementation. Signed-off-by: Markus Elfring <elfring@users.sourceforge.net> Signed-off-by: Steve French <stfrench@microsoft.com>
2026-06-14smb: client: Use more common error handling code in smb3_reconfigure()Markus Elfring1-4/+3
Use an additional label so that a bit of exception handling can be better reused at the end of this function implementation. This issue was detected by using the Coccinelle software. Signed-off-by: Markus Elfring <elfring@users.sourceforge.net> Signed-off-by: Steve French <stfrench@microsoft.com>
2026-06-14smb/client: Fix error code in smb2_aead_req_alloc()Dan Carpenter1-3/+5
The "*num_sgs" variable is a u32 so "ERR_PTR(*num_sgs)" doesn't work. We would have to do something similar to the previous line where it's cast to int and then long. However, it's simpler to store the return in an int ret variable. This bug would eventually result in a crash when dereference the invalid error pointer. Fixes: d08089f649a0 ("cifs: Change the I/O paths to use an iterator rather than a page list") Cc: stable@kernel.org Signed-off-by: Dan Carpenter <error27@gmail.com> Signed-off-by: Steve French <stfrench@microsoft.com>
2026-06-14smb/client: clean up a type issue in cifs_xattr_get()Dan Carpenter1-2/+2
The cifs_xattr_get() function returns type int, not ssize_t so declare "rc" as int as well. This has no effect on runtime. Signed-off-by: Dan Carpenter <error27@gmail.com> Signed-off-by: Steve French <stfrench@microsoft.com>
2026-06-14smb/client: allow FS_IOC_SETFLAGS to clear compressionHuiwen He8-24/+45
The CIFS FS_IOC_SETFLAGS path can set FS_COMPR_FL now, but it cannot clear it again. This can be reproduced on a share backed by a filesystem that supports compression, for example btrfs exported by Samba: [compress_share] vfs objects = btrfs $ touch test.bin $ chattr +c test.bin $ lsattr test.bin $ chattr -c test.bin The final chattr -c fails with EOPNOTSUPP, and leaves the remote object with the compressed attribute still set, because the client always sends FSCTL_SET_COMPRESSION with COMPRESSION_FORMAT_DEFAULT. That is correct for setting FS_COMPR_FL, but clearing FS_COMPR_FL requires sending COMPRESSION_FORMAT_NONE. Fix this by passing the requested compression state through the set_compression operation. The SMB1 and SMB2 helpers no longer hard-code COMPRESSION_FORMAT_DEFAULT. When FS_COMPR_FL is set, send COMPRESSION_FORMAT_DEFAULT. When it is cleared, send COMPRESSION_FORMAT_NONE. If the server accepts the request, update the cached FILE_ATTRIBUTE_COMPRESSED bit under i_lock so FS_IOC_GETFLAGS reports the new state. Signed-off-by: Huiwen He <hehuiwen@kylinos.cn> Reviewed-by: ChenXiaoSong <chenxiaosong@kylinos.cn> Signed-off-by: Steve French <stfrench@microsoft.com>
2026-06-14smb/client: use writable handle for FS_IOC_SETFLAGS compressionHuiwen He1-4/+112
Setting the compressed flag on a CIFS mount can fail with -EACCES: [compress_share] vfs objects = btrfs $ touch test.bin $ chattr +c test.bin chattr: Permission denied while setting flags on test.bin This can be reproduced against a Samba share backed by a filesystem that supports compression, such as btrfs. FS_IOC_SETFLAGS is issued on the file handle opened by userspace. chattr opens the target read-only before setting FS_COMPR_FL, so the SMB client currently sends FSCTL_SET_COMPRESSION on a handle that may not have FILE_WRITE_DATA access. Samba requires FILE_WRITE_DATA for FSCTL_SET_COMPRESSION and rejects the request. Use the current handle only if it already has FILE_WRITE_DATA. Otherwise try an existing writable handle for the inode. If none is available, open a temporary FILE_WRITE_DATA handle for the compression request. After FSCTL_SET_COMPRESSION succeeds, update the cached compressed attribute immediately, matching how smb2_set_sparse() updates FILE_ATTRIBUTE_SPARSE_FILE after a successful FSCTL_SET_SPARSE. Signed-off-by: Huiwen He <hehuiwen@kylinos.cn> Reviewed-by: ChenXiaoSong <chenxiaosong@kylinos.cn> Signed-off-by: Steve French <stfrench@microsoft.com>
2026-06-14smb/client: always return a value for FS_IOC_GETFLAGSHuiwen He1-7/+5
Currently, repeated lsattr calls on a regular CIFS file without the compressed attribute may show random flags: $ touch test.bin $ lsattr test.bin s-S-ia-A-EjI---------m test.bin $ lsattr test.bin ------d-cEjI---------m test.bin The lsattr reproducer depends on the previous contents of its userspace buffer, so it may not reproduce on every setup. A deterministic reproducer is to initialize the ioctl argument before FS_IOC_GETFLAGS on a file without the compressed attribute: int flags = 0x7fffffff; ioctl(fd, FS_IOC_GETFLAGS, &flags); On an affected kernel, flags remains 0x7fffffff. With the fix, it is set to 0. This happens because when the cached inode does not have the compressed bit set, the CIFS fallback path in FS_IOC_GETFLAGS returns success without calling put_user() to write the zero flags value into the user buffer. As a result, the caller observes stale contents from its own buffer. Fix this by always writing the visible flags value back to the user buffer before returning success, even when the value is zero. Fixes: 64a5cfa6db94 ("Allow setting per-file compression via SMB2/3") Signed-off-by: Huiwen He <hehuiwen@kylinos.cn> Reviewed-by: ChenXiaoSong <chenxiaosong@kylinos.cn> Signed-off-by: Steve French <stfrench@microsoft.com>
2026-06-14smb/client: update i_blocks after contiguous writesHuiwen He4-5/+61
When a lease allows CIFS to use cached inode attributes, getattr may return the locally cached attributes instead of revalidating them from the server. After local writes extend a file, the write path updates the file size, but i_blocks can remain based on the old allocation size. For example, while the file is still open after two contiguous writes, the local block count can remain smaller than the written range: after first write: st_size = 4096, st_blocks = 7 after second write: st_size = 12288, st_blocks = 21 after close: st_size = 12288, st_blocks = 24 This can make a fully written file look sparse: i_blocks * 512 < i_size and can cause swap activation to reject a valid write-created swapfile as having holes. This results in xfstests skipping swap-related tests on CIFS mounts: generic/472 [not run] swapfiles are not supported generic/494 [not run] swapfiles are not supported generic/497 [not run] swapfiles are not supported generic/569 [not run] swapfiles are not supported generic/636 [not run] swapfiles are not supported generic/643 [not run] swapfiles are not supported Update the local i_blocks estimate after successful writes, but only when the write starts at or before the currently known allocated range. This lets sequential writes grow i_blocks while avoiding treating write-past-EOF holes as allocated. Skip the local estimate for files that are already marked sparse, since their allocation needs to come from the server rather than from a contiguous-write estimate. Signed-off-by: Huiwen He <hehuiwen@kylinos.cn> Reviewed-by: ChenXiaoSong <chenxiaosong@kylinos.cn> Signed-off-by: Steve French <stfrench@microsoft.com>
2026-06-14smb: client: fix races in cifsd thread creationFredric Cover1-11/+16
The cifsd demultiplex thread can run and access tcp_ses before the parent thread has finished populating tcp_ses, which the worker thread accesses locklessly. Also, the kthread_run macro may start the thread before returning the thread pointer. Because the pointer is part of the structure that the thread can access, if the kernel is preempted after the thread is spawned, but before the thread pointer is populated and the thread attempts to exit, it will sleep, waiting for a SIGKILL signal. Fix this by moving creation of the thread to after all of tcp_ses'es fields are populated, and spawning the thread last, using a split kthread_create/wake_up_process logic. Signed-off-by: Fredric Cover <fredric.cover.lkernel@gmail.com> Signed-off-by: Steve French <stfrench@microsoft.com>
2026-06-14cifs: validate full SID length in security descriptorsQihang1-67/+129
parse_sid() only verified that the fixed SID header fit in the returned security descriptor, but did not verify that the full SID body described by num_subauth was present. A malicious server can return a truncated owner or group SID whose header lies within the descriptor buffer while sub_auth[] extends past the end of the allocation, leading to an out-of-bounds read when the client later parses or copies that SID. Validate the full SID body in parse_sid(), centralize owner/group SID lookup and bounds checking in sid_from_sd(), and use that validation in parse_sec_desc(), build_sec_desc(), and copy_sec_desc() before sub_auth[] is accessed. Signed-off-by: Qihang <q.h.hack.winter@gmail.com> Signed-off-by: Steve French <stfrench@microsoft.com>
2026-06-14smb: client: resolve SWN tcon from live registrationsMichael Bommarito2-54/+262
cifs_swn_notify() looks up a witness registration by id under cifs_swnreg_idr_mutex, drops the mutex, and then uses the registration's cached tcon pointer. That pointer is not a lifetime reference, and it is not a stable representative once cifs_get_swn_reg() lets multiple tcons for the same net/share name share one registration id. A same-share second mount can keep the cifs_swn_reg alive after the first tcon unregisters and is freed. The registration then still points at the freed first tcon, so taking tc_lock or incrementing tc_count through swnreg->tcon only moves the use-after-free earlier. Taking tc_lock while holding cifs_swnreg_idr_mutex also violates the documented CIFS lock order. Fix this by making the registration store only the stable witness identity: id, net name, share name, and notify flags. When a notify arrives, copy that identity under cifs_swnreg_idr_mutex, drop the mutex, then find and pin a live witness tcon that currently matches the net/share pair under the normal cifs_tcp_ses_lock -> tc_lock order. The notification path uses that pinned tcon directly and drops the reference when done. Registration and unregister messages now use the live tcon passed by the caller instead of a cached tcon in the registration. The final unregister send is folded into cifs_swn_unregister() while the registration is still protected by cifs_swnreg_idr_mutex. This removes the previous find/drop/reacquire raw-pointer window. The release path only removes the idr entry and frees the stable identity strings. This preserves the intended one-registration/many-tcon behavior: a registration id represents a net/share pair, and notify handling acts on a live representative selected at use time. It also preserves CLIENT_MOVE ordering for the representative tcon because the old-IP unregister is sent before cifs_swn_register() sends the new-IP register. Fixes: fed979a7e082 ("cifs: Set witness notification handler for messages from userspace daemon") Cc: stable@vger.kernel.org Signed-off-by: Michael Bommarito <michael.bommarito@gmail.com> Assisted-by: Claude:claude-opus-4-7 Signed-off-by: Steve French <stfrench@microsoft.com>
2026-06-14cifs: remove all cifs files before kill superJian Zhang1-0/+3
Cifs files may be put into fileinfo_put_wq during umounting cifs. After umount done, cifsFileInfo_put_final is called, which cause following BUG: BUG: kernel NULL pointer dereference, address: 0000000000000000 ... [ 134.222152] list_lru_add+0x64/0x1a0 [ 134.222399] ? cifs_put_tcon+0x171/0x340 [cifs] [ 134.222772] d_lru_add+0x44/0x60 [ 134.222997] dput+0x1fc/0x210 [ 134.223213] cifsFileInfo_put_final+0x11a/0x140 [cifs] [ 134.223576] process_one_work+0x17c/0x320 [ 134.223843] worker_thread+0x188/0x280 [ 134.224084] ? __pfx_worker_thread+0x10/0x10 [ 134.224366] kthread+0xcc/0x100 [ 134.224576] ? __pfx_kthread+0x10/0x10 [ 134.224827] ret_from_fork+0x30/0x50 [ 134.225063] ? __pfx_kthread+0x10/0x10 [ 134.225328] ret_from_fork_asm+0x1b/0x30 This can be reproduce by following: unshare -n bash -c " mkdir -p ${CIFS_MNT} ip netns attach root 1 ip link add eth0 type veth peer veth0 netns root ip link set eth0 up ip -n root link set veth0 up ip addr add 192.168.0.2/24 dev eth0 ip -n root addr add 192.168.0.1/24 dev veth0 ip route add default via 192.168.0.1 dev eth0 ip netns exec root sysctl net.ipv4.ip_forward=1 ip netns exec root iptables -t nat -A POSTROUTING -s 192.168.0.2 -o ${DEV} -j MASQUERADE mount -t cifs ${CIFS_PATH} ${CIFS_MNT} -o vers=3.0,sec=ntlmssp,credentials=${CIFS_CRED},rsize=65536,wsize=65536,cache=none,echo_interval=1 touch ${CIFS_MNT}/a.txt ip netns exec root iptables -t nat -D POSTROUTING -s 192.168.0.2 -o ${DEV} -j MASQUERADE " umount ${CIFS_MNT} Fixes: 340cea84f691 ("cifs: open files should not hold ref on superblock") Signed-off-by: Jian Zhang <zhangjian496@huawei.com> Signed-off-by: Steve French <stfrench@microsoft.com>
2026-06-14smb: client: fix conflicting option validation for new mount APIHenrique Carvalho1-49/+53
Apply conflicting option validation consistently across all the new mount API paths, for both mount and remount. Some checks were only applied during initial mount validation, while others were handled during option parsing, causing mount and remount/reconfigure to behave differently. Move the conflicting option checks into smb3_handle_conflicting_options() and call it from the common validation paths, including for multichannel/max_channels handling. Fixes: 24e0a1eff9e2 ("cifs: switch to new mount api") Signed-off-by: Henrique Carvalho <henrique.carvalho@suse.com> Signed-off-by: Steve French <stfrench@microsoft.com>
2026-06-14cifs: invalidate cfid on unlink/rename/rmdirShyam Prasad N1-1/+29
Today we do not invalidate the cached_dirent or the entire parent cfid when a dentry in a dir has been removed/moved. This change invalidates the parent cfid so that we don't serve directory contents from the cache. Cc: <stable@vger.kernel.org> Signed-off-by: Shyam Prasad N <sprasad@microsoft.com> Signed-off-by: Steve French <stfrench@microsoft.com>
2026-06-05VFS: use wait_var_event for waiting in d_alloc_parallel()NeilBrown1-2/+1
Parallel lookup starts with a call of d_alloc_parallel(). That primitive either returns a matching hashed dentry or allocates a new one in the in-lookup state and returns it to the caller. Once the caller is done with lookup, it indicates so either by call of d_{splice_alias,add}() or by call of d_done_lookup(); at that point dentry leaves the in-lookup state. If d_alloc_parallel() finds a matching in-lookup dentry, it must wait for that dentry to leave the in-lookup state, one way or another. Currently by supplying wait_queue_head to d_alloc_parallel(). If d_alloc_parallel() creates a new in-lookup dentry, the address of that wait_queue_head is stored in ->d_wait of new dentry and stays there while it's in the in-lookup; subsequent d_alloc_parallel() will wait on the queue found in the matching in-lookup dentry. Transition out of in-lookup state wakes waiters on that queue (if any). That works, but the calling conventions are inconvenient - the caller must supply wait_queue_head and make sure that it survives at least until the new in-lookup dentry leaves the in-lookup state. That amounts to boilerplate in the d_alloc_parallel() callers that are followed by a call of d_lookup_done() in the same function; in cases like nfs asynchronous unlink it gets worse than that. This patch changes d_alloc_parallel() to use wake_up_var_locked() to wake up waiters, and wait_var_event_spinlock() to wait. dentry->d_lock is used for synchronisation as it is already held and the relevant times. That eliminates the need of caller-supplied wait_queue_head, simplifying the calling conventions. Better yet, we only need one bit of information stored in dentry itself: whether there are any waiters to be woken up, and that can be easily stored in ->d_flags; ->d_wait goes away. The reason we need that bit (DCACHE_LOOKUP_WAITERS) is that with wait_var machinery the queues are shared with all kinds of stuff and there's no way tell if any of the waiters have anything to do with our dentry; most of the time none of them will be relevant, so we need to avoid the pointless wakeups. Another benefit of the new scheme comes from the fact that wakeups have to be done outside of write-side critical areas of ->i_dir_seq; with the old scheme we need to carry the value picked from ->d_wait from __d_lookup_unhash() to the place where we actually wake the waiters up. Now we can just leave DCACHE_LOOKUP_WAITERS in ->d_flags until we get to doing wakeups - that's done within the same ->d_lock scope, so we are fine; new bit is accessed only under ->d_lock and it's seen only on dentries with DCACHE_PAR_LOOKUP in ->d_flags. __d_lookup_unhash() no longer needs to re-init ->d_lru. That was previously shared (in a union) with ->d_wait but ->d_wait is now gone so it no longer corrupts ->d_lru. Co-developed-by: Al Viro <viro@zeniv.linux.org.uk> # saner handling of flags Signed-off-by: NeilBrown <neil@brown.name> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2026-06-03smb: client: convert cifs_smb3_do_mount() to sget_fc()Christian Brauner5-22/+34
The CIFS mount path already runs through fs_context: smb3_get_tree() calls smb3_get_tree_common() with a struct fs_context * in hand. But the fc is dropped on the way to sget(). Plumb it through to sget_fc() so the legacy sget() interface can go. cifs_smb3_do_mount() now takes (struct fs_context *, struct smb3_fs_context *). The old (fs_type, flags) pair is reconstructed from fc->fs_type and fc->sb_flags. The flags argument was always passed as 0 by the sole caller anyway. The cifs_dbg diagnostic now prints fc->sb_flags directly. cifs_match_super() and cifs_set_super() were the two void-data callbacks for sget(). The match callback now takes (struct super_block *, struct fs_context *) and reads struct cifs_mnt_data out of fc->sget_key. The set callback is gone entirely: sget_fc() pre-populates sb->s_fs_info from fc->s_fs_info before invoking set() so set_anon_super_fc() (which just allocates an anon bdev) is sufficient. Before sget_fc() we stash cifs_sb in fc->s_fs_info, the per-mount data in fc->sget_key and force fc->sb_flags to SB_NODIRATIME | SB_NOATIME to reproduce the previous hard-coded behaviour (alloc_super() reads fc->sb_flags). The original sb_flags is saved and restored around the call so the rest of the mount path sees the same fc semantics as before. mnt_data.flags keeps its historical value of 0 so the CIFS_MS_MASK comparison in compare_mount_options() returns the same (always-equal) result. No functional change. With this in place sget() has no remaining CIFS caller. Link: https://patch.msgid.link/20260529-work-sget-v2-3-57bbe08604e4@kernel.org Signed-off-by: Christian Brauner (Amutable) <brauner@kernel.org>
2026-05-31ksmbd: fix use-after-free of a deferred file_lock on double SMB2_CANCELGil Portnoy1-0/+11
A deferred byte-range lock (an SMB2_LOCK that blocks) registers an async work on conn->async_requests via setup_async_work(), with cancel_fn = smb2_remove_blocked_lock and cancel_argv[0] pointing at the struct file_lock. When the request is cancelled, the worker frees the file_lock with locks_free_lock() and takes the cancelled early-exit, which "goto out"s and never reaches release_async_work() -- the only site that unlinks the work from conn->async_requests and clears cancel_fn/cancel_argv. The work therefore stays matchable on async_requests with a live cancel_fn pointing at the freed file_lock, until connection teardown finally runs release_async_work(). smb2_cancel() fires cancel_fn unconditionally with no state guard, so a second SMB2_CANCEL for the same AsyncId, arriving in that window, re-runs smb2_remove_blocked_lock() on the freed file_lock -- a slab use-after-free: BUG: KASAN: slab-use-after-free in __locks_delete_block __locks_delete_block locks_delete_block ksmbd_vfs_posix_lock_unblock smb2_remove_blocked_lock smb2_cancel <- 2nd SMB2_CANCEL fires cancel_fn handle_ksmbd_work Allocated by ...: locks_alloc_lock <- smb2_lock Freed by ...: locks_free_lock <- smb2_lock (cancelled branch) ... cache file_lock_cache of size 192 Reproduced on mainline with KASAN by an authenticated SMB client. Skip a work whose state is already KSMBD_WORK_CANCELLED so its cancel callback cannot be fired a second time. Cc: stable@vger.kernel.org Signed-off-by: Gil Portnoy <dddhkts1@gmail.com> Acked-by: Namjae Jeon <linkinjeon@kernel.org> Signed-off-by: Steve French <stfrench@microsoft.com>
2026-05-31ksmbd: fix durable reconnect double-bind race in ksmbd_reopen_durable_fdGil Portnoy1-3/+9
Two concurrent same-user DHnC reconnects can both observe fp->conn == NULL before either sets it. ksmbd_reopen_durable_fd() checks fp->conn to guard against a handle already being reconnected, but the check and the binding assignment are not atomic: both threads pass the guard, both call ksmbd_conn_get() on the same fp, and both eventually reach kfree(fp->owner.name) -- a double-free of the owner.name slab object. The double-bound ksmbd_file also causes a write-UAF on the 344-byte ksmbd_file_cache object when a concurrent smb2_close() spins on fp->f_lock after the object has been freed by the losing reconnect path. KASAN on 7.1-rc5 (48-thread concurrent reconnect, 3000 cycles): BUG: KASAN: double-free in ksmbd_reopen_durable_fd+0x268/0x308 BUG: KASAN: slab-use-after-free in _raw_spin_lock+0xac/0x150 Write of size 4 at offset 24 into freed ksmbd_file_cache object Five double-bind windows observed; 63 total KASAN reports triggered. Fix: validate and claim fp->conn under write_lock(&global_ft.lock) so the check-and-claim is atomic. ksmbd_lookup_durable_fd() already treats fp->conn != NULL as "in use" and skips such an fp; setting fp->conn before dropping the lock closes the race. ksmbd_conn_get() is a non-sleeping refcount increment, safe under the rwlock. The rollback path on __open_id() failure also clears fp->conn/tcon under the lock so concurrent readers see a consistent state. Fixes: b1f1e80620de ("ksmbd: centralize ksmbd_conn final release to plug transport leak") Assisted-by: Henry (Claude):claude-opus-4 Signed-off-by: Gil Portnoy <dddhkts1@gmail.com> Acked-by: Namjae Jeon <linkinjeon@kernel.org> Signed-off-by: Steve French <stfrench@microsoft.com>
2026-05-31ksmbd: fix NULL-deref of opinfo->conn in oplock/lease break notifiersGil Portnoy1-3/+12
smb2_oplock_break_noti() and smb2_lease_break_noti() read opinfo->conn into a local with neither READ_ONCE() nor a NULL check. Both run from oplock_break() after opinfo_get_list() has dropped ci->m_lock, so a concurrent SMB2 LOGOFF (session_fd_check()) can set op->conn = NULL under ci->m_lock within that window. ksmbd_conn_r_count_inc(conn) then writes through NULL at offset 0xc4 -- a remotely triggerable oops. Guard both reads the way compare_guid_key() already does: read opinfo->conn with READ_ONCE() and return early if it is NULL, before allocating the work struct so nothing leaks. A NULL conn means the client is gone and the break is moot, so return 0; oplock_break() treats that as success and runs the normal teardown. Fixes: c8efcc786146 ("ksmbd: add support for durable handles v1/v2") Assisted-by: Henry (Claude):claude-opus-4 Signed-off-by: Gil Portnoy <dddhkts1@gmail.com> Acked-by: Namjae Jeon <linkinjeon@kernel.org> Signed-off-by: Steve French <stfrench@microsoft.com>
2026-05-30Merge tag 'v7.1-rc6-smb3-client-fixes' of git://git.samba.org/sfrench/cifs-2.6Linus Torvalds2-4/+16
Pull smb client fixes from Steve French: - fix uninitialized variable in smb2_writev_callback() - detect short folioq copy in cifs_copy_folioq_to_iter() * tag 'v7.1-rc6-smb3-client-fixes' of git://git.samba.org/sfrench/cifs-2.6: smb: client: fix uninitialized variable in smb2_writev_callback smb: client: detect short folioq copy in cifs_copy_folioq_to_iter()
2026-05-29vfs: make LAST_XXX private to fs/namei.cJori Koolstra1-12/+3
The only user of LAST_XXX outside of fs/namei.c is fs/smb/server/vfs.c; ksmbd_vfs_path_lookup() calls vfs_path_parent_lookup() and expects a LAST_NORM last type (or it will be ENOENT). ksmbd_vfs_rename() also calls vfs_path_parent_lookup() but forgets the LAST_NORM check. It does not really make sense to have vfs_path_parent_lookup() expose the last_type because it is only needed to ensure it is LAST_NORM. So let's do this check in vfs_path_parent_lookup() instead and keep the LAST_XXX internal to fs/namei.c. This changes the ENOENT errno in ksmbd_vfs_path_lookup() to EINVAL, which matches better with how this is handled by callers of filename_parentat(). Signed-off-by: Jori Koolstra <jkoolstra@xs4all.nl> Link: https://patch.msgid.link/20260528175854.57626-1-jkoolstra@xs4all.nl Reviewed-by: Amir Goldstein <amir73il@gmail.com> Reviewed-by: NeilBrown <neil@brown.name> Reviewed-by: Namjae Jeon <linkinjeon@kernel.org> Signed-off-by: Christian Brauner (Amutable) <brauner@kernel.org>
2026-05-27smb: client: fix uninitialized variable in smb2_writev_callbackSteve French1-1/+1
compiling with W=2 pointed out that "written may be used uninitialized" Fixes: 20d72b00ca81 ("netfs: Fix the request's work item to not require a ref") Cc: stable@vger.kernel.org Reviewed-by: David Howells <dhowells@redhat.com> Signed-off-by: Steve French <stfrench@microsoft.com>
2026-05-27smb: client: detect short folioq copy in cifs_copy_folioq_to_iter()Jeremy Erazo1-3/+15
cifs_copy_folioq_to_iter() copies a requested number of bytes from a folio queue into the destination iterator. Since the encrypted SMB2 READ path was changed to pass the server-declared payload length (data_len) instead of the larger folioq buffer length, the caller can ask for fewer bytes than the folio queue holds. In that case the helper continues walking the remaining folios after data_size has reached zero and calls copy_folio_to_iter() with len = 0, which is unnecessary work. The helper also returns 0 (success) when the folio queue is exhausted before data_size bytes have been copied. The caller has no way to distinguish that from a full copy and the reported transfer count ends up larger than the amount of data placed in the iterator. Add an early exit when data_size reaches zero, and return an error when the folio queue is exhausted before all requested bytes have been copied. Signed-off-by: Jeremy Erazo <mendozayt13@gmail.com> Reviewed-by: David Howells <dhowells@redhat.com> Signed-off-by: Steve French <stfrench@microsoft.com>
2026-05-26ksmbd: fix FSCTL permission bypass by adding a permission check for ↵Sean Shen1-0/+11
FSCTL_SET_SPARSE FSCTL_SET_SPARSE in fsctl_set_sparse() modifies the file's sparse attribute and saves it through xattr without any permission checks. This exposes two issues: 1) A client on a read-only share can change the sparse attribute on files it opened, even though the share is read-only. Other FSCTL write operations already check test_tree_conn_flag(work->tcon, KSMBD_TREE_CONN_FLAG_WRITABLE), but FSCTL_SET_SPARSE does not. 2) Even on writable shares, clients without FILE_WRITE_DATA or FILE_WRITE_ATTRIBUTES access should not modify the sparse attribute. Similar handle-level checks exist in other functions but are missing here. Add both share-level writable check and per-handle access check. Use goto out on error to avoid leaking file references. Fixes: e2f34481b24d ("cifsd: add server-side procedures for SMB3") Cc: Namjae Jeon <linkinjeon@kernel.org> Cc: Sergey Senozhatsky <sergey.senozhatsky@gmail.com> Cc: Steve French <smfrench@gmail.com> Signed-off-by: Sean Shen <grayhat@foxmail.com> Acked-by: Namjae Jeon <linkinjeon@kernel.org> Signed-off-by: Steve French <stfrench@microsoft.com>
2026-05-26ksmbd: release ksmbd_inode ref via ksmbd_inode_put on lookup pathsAleksandr Golovnya1-3/+3
ksmbd_query_inode_status() and ksmbd_lookup_fd_inode() both take a reference on a ksmbd_inode via __ksmbd_inode_lookup() (which performs atomic_inc_not_zero()) and later release it using a bare atomic_dec(&ci->m_count). Unlike ksmbd_inode_put(), a bare atomic_dec() does not check whether the reference count has reached zero, so if the caller happens to drop the last reference, the ksmbd_inode is leaked: it stays in the global inode hash table with m_count == 0, future __ksmbd_inode_lookup() calls reject it via atomic_inc_not_zero(), and ksmbd_inode_free() is never invoked. The race is: T1: __ksmbd_inode_lookup() -> atomic_inc_not_zero(): m_count = 2 T2: ksmbd_inode_put() -> atomic_dec_and_test(): m_count = 1 (not freed) T1: atomic_dec(&ci->m_count) -> m_count = 0 return (LEAK) In ksmbd_lookup_fd_inode() the matched-fp path (which now also uses ksmbd_inode_put()) cannot currently reach m_count == 0 because the matched ksmbd_file holds its own reference on ci, but converting it to the proper API keeps the three call sites consistent and avoids future regressions if the locking changes. Because ksmbd_inode_put() may free the ksmbd_inode if this drops the last reference, the call must happen after up_read(&ci->m_lock) on the two affected paths in ksmbd_lookup_fd_inode(). On the no-match path this is a pure reordering; on the matched path ksmbd_fp_get() is moved above the unlock so that the returned ksmbd_file is pinned before the inode reference is released. Signed-off-by: Aleksandr Golovnya <cofedish@gmail.com> Acked-by: Namjae Jeon <linkinjeon@kernel.org> Signed-off-by: Steve French <stfrench@microsoft.com>
2026-05-26ksmbd: OOB read regression in smb_check_perm_dacl() ACE-walk loopsAli Ganiyev1-4/+4
Commit d07b26f39246 ("ksmbd: require minimum ACE size in smb_check_perm_dacl()") introduced a transposed bounds check: if (offsetof(struct smb_ace, sid) + aces_size < CIFS_SID_BASE_SIZE) Since offsetof(..sid) is 8 and CIFS_SID_BASE_SIZE is 8, this evaluates to `aces_size < 0`. Because `aces_size` is always non-negative, this check becomes dead code and never breaks the loop. Worse, that commit removed the old 4-byte guard, meaning the loop now reads `ace->size` (offset 2) even when `aces_size` is 0-3 bytes. This re-opens a 2-byte heap out-of-bounds (OOB) read past the pntsd allocation during subsequent SMB2_CREATE operations. Fix this by properly transposing the comparison to require at least 16 bytes (8-byte offset + 8-byte SID base), matching the correct form used in smb_inherit_dacl(). Fixes: d07b26f39246 ("ksmbd: require minimum ACE size in smb_check_perm_dacl()") Cc: stable@vger.kernel.org Signed-off-by: Ali Ganiyev <ali.qaniyev@gmail.com> Acked-by: Namjae Jeon <linkinjeon@kernel.org> Signed-off-by: Steve French <stfrench@microsoft.com>
2026-05-23Merge tag 'v7.1-rc5-ksmbd-server-fixes' of git://git.samba.org/ksmbdLinus Torvalds3-22/+75
Pull smb server fixes from Steve French: - fix for creating tmpfiles - fix durable reconnect error path - validate SID in security descriptor when inheriting DACL * tag 'v7.1-rc5-ksmbd-server-fixes' of git://git.samba.org/ksmbd: smb/server: promote S_DEL_ON_CLS to S_DEL_PENDING when close ksmbd: validate SID in parent security descriptor during ACL inheritance ksmbd: fix durable reconnect error path file lifetime
2026-05-22Merge tag 'v7.1-rc5-smb3-client-fixes' of git://git.samba.org/sfrench/cifs-2.6Linus Torvalds7-5/+29
Pull smb client fixes from Steve French: - Fix missing lock - Fix dentry in use after unmounting - cifs.upcall security fix - require CAP_NET_ADMIN for swn netlink - change allocation in DUP_CTX_STR to GFP_KERNEL - minor smbdirect debug fix - handle_read_data() folio fix * tag 'v7.1-rc5-smb3-client-fixes' of git://git.samba.org/sfrench/cifs-2.6: smb: client: change allocation requirements in DUP_CTX_STR macro smb: client: require net admin for CIFS SWN netlink smb: smbdirect: divide, not multiply, milliseconds by 1000 cifs: Fix busy dentry used after unmounting smb: client: use data_len for SMB2 READ encrypted folioq copy smb: client: reject userspace cifs.spnego descriptions smb: client: protect tc_count increment in smb2_find_smb_sess_tcon_unlocked()
2026-05-21smb/server: promote S_DEL_ON_CLS to S_DEL_PENDING when closeChenXiaoSong1-4/+12
Reproducer: 1. server: systemctl start ksmbd 2. client: mount -t cifs //${server_ip}/export /mnt 3. client: C program: openat(AT_FDCWD, "/mnt", O_RDWR | O_TMPFILE, 0600) Do not treat `FILE_DELETE_ON_CLOSE_LE` as delete pending while files remain open. This patch fixes xfstests generic/004. Cc: stable@vger.kernel.org Link: https://chenxiaosong.com/en/smb-xfstests-generic-004.html Co-developed-by: Huiwen He <hehuiwen@kylinos.cn> Signed-off-by: Huiwen He <hehuiwen@kylinos.cn> Signed-off-by: ChenXiaoSong <chenxiaosong@kylinos.cn> Tested-by: Steve French <stfrench@microsoft.com> Acked-by: Namjae Jeon <linkinjeon@kernel.org> Signed-off-by: Steve French <stfrench@microsoft.com>
2026-05-21ksmbd: validate SID in parent security descriptor during ACL inheritanceJunyi Liu1-16/+50
Introduce smb_validate_ntsd_sid() helper to safely validate Owner SID and Group SID inside the NT Security Descriptor (smb_ntsd) retrieved from the parent directory. Cc: stable@vger.kernel.org Signed-off-by: Junyi Liu <moss80199@gmail.com> Signed-off-by: Namjae Jeon <linkinjeon@kernel.org> Signed-off-by: Steve French <stfrench@microsoft.com>
2026-05-21ksmbd: fix durable reconnect error path file lifetimeJunyi Liu1-2/+13
After a durable reconnect succeeds, ksmbd_reopen_durable_fd() republishes the same ksmbd_file into the session volatile-id table. If smb2_open() then takes a later error path, cleanup first calls ksmbd_fd_put(work, fp) and then unconditionally calls ksmbd_put_durable_fd(dh_info.fp). In this case fp and dh_info.fp are the same object. The first put drops the reconnect lookup reference, but the final durable put can run __ksmbd_close_fd(NULL, fp). Because the final close is not session-aware, it can free the file object without removing the volatile-id entry that was just published into the session table. Use the session-aware put for the final reconnect drop when the reconnect had already succeeded and the error path is cleaning up the republished file. Earlier reconnect failures, before fp is assigned to dh_info.fp, keep using the durable-only put path. Fixes: 1baff47b81f9 ("ksmbd: fix use-after-free in smb2_open during durable reconnect") Signed-off-by: Junyi Liu <moss80199@gmail.com> Acked-by: Namjae Jeon <linkinjeon@kernel.org> Signed-off-by: Steve French <stfrench@microsoft.com>
2026-05-21smb: client: change allocation requirements in DUP_CTX_STR macroFredric Cover1-1/+1
Currently, the macro DUP_CTX_STR allocates new_ctx->field using GFP_ATOMIC. DUP_CTX_STR is only used in smb3_fs_context_dup(), which is never called in an atomic context. Using GFP_ATOMIC puts unnecessary pressure on emergency memory pools. Change GFP_ATOMIC to GFP_KERNEL. Signed-off-by: Fredric Cover <fredric.cover.lkernel@gmail.com> Signed-off-by: Steve French <stfrench@microsoft.com>
2026-05-21smb: client: require net admin for CIFS SWN netlinkMichael Bommarito1-1/+5
CIFS_GENL_CMD_SWN_NOTIFY is the userspace witness-notify command. The intended sender is the cifs.witness helper, but the generic-netlink operation currently has no capability flag, so any local process can send RESOURCE_CHANGE or CLIENT_MOVE notifications to the in-kernel witness handler. The same family exposes CIFS_GENL_MCGRP_SWN without multicast-group capability flags. Register messages sent to that group include the witness registration id and, for NTLM-authenticated mounts, the username, domain, and password attributes copied from the CIFS session. An unprivileged local process should not be able to join that group and receive those messages. Require CAP_NET_ADMIN for incoming SWN_NOTIFY commands with GENL_ADMIN_PERM, and require CAP_NET_ADMIN over the network namespace for joining the SWN multicast group with GENL_MCAST_CAP_NET_ADMIN. The cifs.witness service runs with the privileges needed for both operations. Fixes: fed979a7e082 ("cifs: Set witness notification handler for messages from userspace daemon") Cc: stable@vger.kernel.org Signed-off-by: Michael Bommarito <michael.bommarito@gmail.com> Assisted-by: Claude:claude-opus-4-7 Signed-off-by: Steve French <stfrench@microsoft.com>
2026-05-21smb: smbdirect: divide, not multiply, milliseconds by 1000Alexander A. Klimov1-1/+1
Unless smbdirect_connection_legacy_debug_proc_show() wants to debug-log keep_alive_interval as microseconds, a magnitude higher precision than available by the way, keepalive_interval_msec should not be multiplied by 1000. Fixes: cc55f65dd352 ("smb: client: make use of common smbdirect_socket_parameters") Reviewed-by: Stefan Metzmacher <metze@samba.org> Signed-off-by: Alexander A. Klimov <grandmaster@al2klimov.de> Signed-off-by: Steve French <stfrench@microsoft.com>
2026-05-21cifs: Fix busy dentry used after unmountingZhihao Cheng1-0/+2
Since commit 340cea84f691c ("cifs: open files should not hold ref on superblock"), cifs file only holds the dentry ref_cnt, the cifs file close work(cfile->deferred) could be executed after unmounting, which will trigger a warning in generic_shutdown_super: BUG: Dentry 00000000a14a6845{i=c,n=file} still in use (1) [unmount of cifs cifs] The detailed processs is: process A process B kworker fd = open(PATH) vfs_open file->__f_path = *path // dentry->d_lockref.count = 1 cifs_open cifs_new_fileinfo cfile->dentry = dget(dentry) // dentry->d_lockref.count = 2 close(fd) __fput cifs_close queue_delayed_work(deferredclose_wq, cfile->deferred) dput(dentry) // dentry->d_lockref.count = 1 smb2_deferred_work_close _cifsFileInfo_put list_del(&cifs_file->flist) umount cleanup_mnt deactivate_super cifs_kill_sb cifs_close_all_deferred_files_sb cifs_close_all_deferred_files // cannot find cfile, skip _cifsFileInfo_put kill_anon_super generic_shutdown_super shrink_dcache_for_umount umount_check WARN ! // dentry->d_lockref.count = 1 cifsFileInfo_put_final dput(cifs_file->dentry) // dentry->d_lockref.count = 0 Fix it by flushing 'deferredclose_wq' before calling kill_anon_super. Fetch a reproducer in https://bugzilla.kernel.org/show_bug.cgi?id=221548. Fixes: 340cea84f691c ("cifs: open files should not hold ref on superblock") Cc: stable@vger.kernel.org Reviewed-by: Shyam Prasad N <sprasad@microsoft.com> Signed-off-by: Zhihao Cheng <chengzhihao1@huawei.com> Signed-off-by: Steve French <stfrench@microsoft.com>
2026-05-21openat2: new OPENAT2_REGULAR flag supportDorjoy Chowdhury1-3/+15
This flag indicates the path should be opened if it's a regular file. This is useful to write secure programs that want to avoid being tricked into opening device nodes with special semantics while thinking they operate on regular files. This is a requested feature from the uapi-group[1]. The previously introduced EFTYPE error code is returned when the path doesn't refer to a regular file. For example, if openat2 is called on path /dev/null with OPENAT2_REGULAR in the flag param, it will return -EFTYPE. When used in combination with O_CREAT, either the regular file is created, or if the path already exists, it is opened if it's a regular file. Otherwise, -EFTYPE is returned. When OPENAT2_REGULAR is combined with O_DIRECTORY, -EINVAL is returned as it doesn't make sense to open a path that is both a directory and a regular file. The UAPI bit lives in the upper 32 bits of open_how::flags (((__u64)1 << 32)) so that open(2) and openat(2) -- whose @flags argument is a C int -- cannot physically express it. This is a structural guarantee, not a runtime mask: the bit is unrepresentable in 32 bits. Because the rest of the VFS open path narrows to 32 bits in several places (op->open_flag, f->f_flags, the unsigned open_flag argument of i_op->atomic_open()), build_open_flags() translates OPENAT2_REGULAR into a kernel-internal lower-32-bit carrier __O_REGULAR (bit 4, unused as an O_* on every architecture) before the assignment to op->open_flag. __O_REGULAR then rides through the existing channels exactly like __FMODE_EXEC. do_dentry_open() strips it so it cannot leak back to userspace via fcntl(F_GETFL). Four BUILD_BUG_ON_MSG() invariants in build_open_flags() prevent any future bit collision or accidental low-32 redefinition: - VALID_OPEN_FLAGS fits in 32 bits. - OPENAT2_REGULAR lives in the upper 32 bits. - OPENAT2_REGULAR does not alias any open()/openat() flag. - __O_REGULAR does not alias any user-visible flag. [1]: https://uapi-group.org/kernel-features/#ability-to-only-open-regular-files Christian Brauner <brauner@kernel.org> says: Move OPENAT2_REGULAR to the upper 32 bits of open_how::flags with a kernel-internal __O_REGULAR carrier so that open(2)/openat(2) cannot encode the flag; add BUILD_BUG_ON_MSG() invariants and register __O_REGULAR in the fcntl_init() allocation-uniqueness BUILD_BUG_ON() (bit count 21 -> 22). Signed-off-by: Dorjoy Chowdhury <dorjoychy111@gmail.com> Link: https://patch.msgid.link/20260328172314.45807-2-dorjoychy111@gmail.com Reviewed-by: Jeff Layton <jlayton@kernel.org> Reviewed-by: Aleksa Sarai <aleksa@amutable.com> Signed-off-by: Christian Brauner (Amutable) <brauner@kernel.org>
2026-05-19Merge tag 'v7.1-rc4-ksmbd-server-fixes' of git://git.samba.org/ksmbdLinus Torvalds3-5/+15
Pull smb server fixes from Steve French: - Fix two null pointer dereferences and a memory leak * tag 'v7.1-rc4-ksmbd-server-fixes' of git://git.samba.org/ksmbd: ksmbd: fix null pointer dereference in compare_guid_key() ksmbd: fix null pointer dereference in proc_show_files() ksmbd: fix SID memory leak in set_posix_acl_entries_dacl() on overflow
2026-05-19smb: client: use data_len for SMB2 READ encrypted folioq copyJeremy Erazo1-2/+2
In handle_read_data() the encrypted/folioq branch (buf_len <= data_offset, reached via receive_encrypted_read for transform PDUs > CIFSMaxBufSize + MAX_HEADER_SIZE) copies the READ payload using buffer_len rather than data_len: rdata->result = cifs_copy_folioq_to_iter(buffer, buffer_len, cur_off, &rdata->subreq.io_iter); ... rdata->got_bytes = buffer_len; buffer_len comes from the SMB3 transform header OriginalMessageSize field (OriginalMessageSize - read_rsp_size); it represents the size of the decrypted message after the SMB2 header. data_len comes from the SMB2 READ response DataLength field; it represents the actual READ payload size and may be smaller than buffer_len when the decrypted message contains padding or other trailing bytes after the READ payload. The existing check `data_len > buffer_len - pad_len` only enforces an upper bound, so a server that emits OriginalMessageSize larger than read_rsp_size + pad_len + data_len passes the check and the kernel copies buffer_len bytes per response, ignoring the server-asserted DataLength. Two observable failures with a crafted server (DataLength=4, buffer_len=20000): - the kernel returns 20000 bytes per sub-request to userspace and sets got_bytes = buffer_len, even though the response claimed only 4 bytes of payload; - on a partial netfs sub-request whose iterator is sized to data_len, the over-large copy_folio_to_iter() short-reads, cifs_copy_folioq_to_iter() returns -EIO via the n != len path, and the entire netfs read collapses to -EIO even though the leading sub-requests succeeded. Use data_len for the copy length and for got_bytes so the kernel honours the server-asserted READ payload size. For well-formed servers (where buffer_len == pad_len + data_len) the change is behaviour-equivalent. Cc: stable@vger.kernel.org Signed-off-by: Jeremy Erazo <mendozayt13@gmail.com> Acked-by: David Howells <dhowells@redhat.com> Signed-off-by: Steve French <stfrench@microsoft.com>
2026-05-19smb: client: reject userspace cifs.spnego descriptionsAsim Viladi Oglu Manizada1-0/+16
cifs.spnego key descriptions contain authority-bearing fields such as pid, uid, creduid, and upcall_target that cifs.upcall treats as kernel-originating inputs. However, userspace can also create keys of this type through request_key(2) or add_key(2), allowing those fields to be supplied without CIFS origin. Only accept cifs.spnego descriptions while CIFS is using its private spnego_cred to request the key. Fixes: f1d662a7d5e5 ("[CIFS] Add upcall files for cifs to use spnego/kerberos") Assisted-by: avom-custom-harness:gpt-5.5-qwen3.6-mod-mix Reviewed-by: David Howells <dhowells@redhat.com> Signed-off-by: Asim Viladi Oglu Manizada <manizada@pm.me> Signed-off-by: Steve French <stfrench@microsoft.com>
2026-05-19smb: client: protect tc_count increment in smb2_find_smb_sess_tcon_unlocked()Henrique Carvalho1-0/+2
Commit 96c4af418586 ("cifs: Fix locking usage for tcon fields") refactored cifs code to change cifs_tcp_ses_lock for tc_lock around tc_count changes. There was missing lock around tc_count increment inside smb2_find_smb_sess_tcon_unlocked(). Cc: stable@vger.kernel.org Fixes: 96c4af418586 ("cifs: Fix locking usage for tcon fields") Reviewed-by: Shyam Prasad N <sprasad@microsoft.com> Signed-off-by: Henrique Carvalho <henrique.carvalho@suse.com> Signed-off-by: Steve French <stfrench@microsoft.com>
2026-05-18cifs: Fix undefined variablesDavid Howells1-3/+3
Fix a couple of undefined variables introduced by the patch to fix tearing on ->remote_i_size and ->zero_point. For some reason, make W=1 with gcc doesn't give undefined variable warnings (but clang does). Fixes: 2c8f4742bb76 ("netfs: Fix potential for tearing in ->remote_i_size and ->zero_point") Reported-by: kernel test robot <lkp@intel.com> Closes: https://lore.kernel.org/oe-kbuild-all/202605031459.eX5UbO3K-lkp@intel.com/ Closes: https://lore.kernel.org/oe-kbuild-all/202605021450.ca5QGqLH-lkp@intel.com/ cc: Steve French <sfrench@samba.org> cc: Paulo Alcantara <pc@manguebit.org> cc: Matthew Wilcox <willy@infradead.org> cc: Christian Brauner <brauner@kernel.org> cc: linux-cifs@vger.kernel.org cc: netfs@lists.linux.dev cc: linux-fsdevel@vger.kernel.org Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2026-05-18Merge tag 'vfs-7.1-rc5.fixes' of ↵Linus Torvalds7-42/+74
git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs Pull vfs fixes from Christian Brauner: "This contains a fixes for the current development cycle. Note that AI related review sometimes delays fixes a bit because we find more fixes for the fixes. I might try and send smaller but more fixes PRs if this trend keeps up. - Fix various netfslib bugs - Fix an out-of-bounds write when listing idmappings - Fix the return values in jfs_mkdir() and orangefs_mkdir() - Fix a writeback writeback array overflow in fuse - Fix a forced iversion increment on lazytime timestamp updates - Reject a negative timeval component in kern_select() - Fix error return when vfs_mkdir() fails in the cachefiles code - Fix wrong error code returned for pidns ioctls" * tag 'vfs-7.1-rc5.fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs: (31 commits) cachefiles: Fix error return when vfs_mkdir() fails afs: Fix the locking used by afs_get_link() netfs, afs: Fix write skipping in dir/link writepages netfs: Fix netfs_read_folio() to wait on writeback netfs: Fix folio->private handling in netfs_perform_write() netfs: Fix partial invalidation of streaming-write folio netfs: Fix potential UAF in netfs_unlock_abandoned_read_pages() netfs: Fix leak of request in netfs_write_begin() error handling netfs: Fix early put of sink folio in netfs_read_gaps() netfs: Fix write streaming disablement if fd open O_RDWR netfs: Fix read-gaps to remove netfs_folio from filled folio netfs: Fix potential deadlock in write-through mode netfs: Fix streaming write being overwritten netfs: Defer the emission of trace_netfs_folio() netfs: Fix netfs_invalidate_folio() to clear dirty bit if all changes gone netfs: Fix overrun check in netfs_extract_user_iter() netfs: fix error handling in netfs_extract_user_iter() netfs: Fix potential uninitialised var in netfs_extract_user_iter() netfs: fix VM_BUG_ON_FOLIO() issue in netfs_write_begin() call netfs: Fix zeropoint update where i_size > remote_i_size ...
2026-05-14smbdirect: Fix error cleanup in smbdirect_map_sges_from_iter()David Howells1-1/+1
Fix smbdirect_map_sges_from_iter() to use pre-decrement, not post-decrement so that it cleans up the correct slots. Fixes: e5fbdde43017 ("cifs: Add a function to build an RDMA SGE list from an iterator") Closes: https://sashiko.dev/#/patchset/20260326104544.509518-1-dhowells%40redhat.com Signed-off-by: David Howells <dhowells@redhat.com> Reviewed-by: Stefan Metzmacher <metze@samba.org> cc: Paulo Alcantara <pc@manguebit.org> cc: Tom Talpey <tom@talpey.com> cc: linux-cifs@vger.kernel.org cc: linux-fsdevel@vger.kernel.org Signed-off-by: Steve French <stfrench@microsoft.com>
2026-05-14smb: client: avoid integer overflow in SMB2 READ length checkJeremy Erazo2-7/+12
SMB2 READ response validation in cifs_readv_receive() and handle_read_data() checks data_offset + data_len against the received buffer length. Both values are attacker-controlled fields from the server response and are stored as unsigned int, so the addition can wrap before the bounds check: fs/smb/client/transport.c:1259 if (!use_rdma_mr && (data_offset + data_len > buflen)) fs/smb/client/smb2ops.c:4839 else if (buf_len >= data_offset + data_len) A malicious SMB server can use this to bypass validation. In the non-encrypted receive path the client attempts an oversized socket read and stalls for the SMB response timeout (180 seconds) before reconnecting. In the SMB3 encrypted path, runtime testing shows the malformed length can reach copy_to_iter() in handle_read_data() with attacker-controlled size, where usercopy hardening stops the oversized copy before bytes reach userspace. Guard both call sites with check_add_overflow(), which is already used elsewhere in this subsystem (smb2pdu.c). On overflow, treat the response as malformed and reject with -EIO. Signed-off-by: Jeremy Erazo <mendozayt13@gmail.com> Signed-off-by: Steve French <stfrench@microsoft.com>
2026-05-14cifs: client: stage smb3_reconfigure() updates and restore ctx on failureDaeMyung Kang1-53/+108
smb3_reconfigure() moves strings out of cifs_sb->ctx before the multichannel update, so a later failure can leave the live context with NULL strings or options that do not match the session. Stage the new ctx separately, commit it only on success, and restore the snapshot on failure. Also make smb3_sync_session_ctx_passwords() all-or-nothing. Commit session passwords before channel updates so newly added channels authenticate with the staged credentials. Fixes: ef529f655a2c ("cifs: client: allow changing multichannel mount options on remount") Reported-by: RAJASI MANDAL <rajasimandalos@gmail.com> Closes: https://lore.kernel.org/lkml/CAEY6_V1+dzW3OD5zqXhsWyXwrDTrg5tAMGZ1AJ7_GAuRE+aevA@mail.gmail.com/ Link: https://lore.kernel.org/lkml/xkr2dlvgibq5j6gkcxd3yhhnj4atgxw2uy4eug2pxm7wy7nbms@iq6cf5taa65v/ Reviewed-by: Henrique Carvalho <henrique.carvalho@suse.com> Signed-off-by: DaeMyung Kang <charsyam@gmail.com> Signed-off-by: Steve French <stfrench@microsoft.com>
2026-05-14smb/client: fix possible infinite loop and oob read in symlink_data()Ye Bin1-0/+3
On 32-bit architectures, the infinite loop is as follows: len = p->ErrorDataLength == 0xfffffff8 u8 *next = p->ErrorContextData + len next == p On 32-bit architectures, the out-of-bounds read is as follows: len = p->ErrorDataLength == 0xfffffff0 u8 *next = p->ErrorContextData + len next == (u8 *)p - 8 Reported-by: ChenXiaoSong <chenxiaosong@kylinos.cn> Fixes: 76894f3e2f71 ("cifs: improve symlink handling for smb2+") Cc: stable@vger.kernel.org Signed-off-by: Ye Bin <yebin10@huawei.com> Reviewed-by: ChenXiaoSong <chenxiaosong@kylinos.cn> Signed-off-by: Steve French <stfrench@microsoft.com>
2026-05-13ksmbd: fix null pointer dereference in compare_guid_key()Jeremy Laratro1-1/+5
session_fd_check() walks the per-inode m_op_list during durable-handle session teardown and sets op->conn = NULL for every opinfo whose conn matched the closing session's connection. The matching opinfo, however, stays linked in its per-ClientGuid lease_table_list entry's lb->lease_list because destroy_lease_table() only runs on full TCP-connection teardown, not on SESSION_LOGOFF. If the same TCP connection then negotiates a fresh session with the same ClientGuid (ClientGuid is bound to NEGOTIATE, not the session, and is unchanged across LOGOFF + SETUP) and issues a SMB2 CREATE with a lease context on a different inode, find_same_lease_key() walks lb->lease_list, reaches the stale opinfo, and calls compare_guid_key(), which unconditionally dereferences opinfo->conn->ClientGUID. The conn pointer is NULL and the kernel panics. Reproducer requires only a successful SMB2 SESSION_SETUP and a share configured with 'durable handles = yes'. KASAN report on mainline 70390501d194: general protection fault, probably for non-canonical address 0xdffffc0000000069: 0000 [#1] SMP KASAN PTI KASAN: null-ptr-deref in range [0x0000000000000348-0x000000000000034f] Workqueue: ksmbd-io handle_ksmbd_work RIP: 0010:bcmp+0x5b/0x230 Call Trace: compare_guid_key+0x4b/0xd0 find_same_lease_key+0x324/0x690 smb2_open+0x6aea/0x8e60 handle_ksmbd_work+0x796/0xee0 ... Faulting address 0x348 is the offset of ClientGUID within struct ksmbd_conn, confirming opinfo->conn was NULL. Read opinfo->conn once and bail out if it has been cleared by a concurrent session_fd_check(). A half-detached opinfo cannot be the owner of an active lease, so returning 0 is the correct match result. Fixes: c8efcc786146 ("ksmbd: add support for durable handles v1/v2") Cc: stable@vger.kernel.org Signed-off-by: Jeremy Laratro <research@aradex.io> Acked-by: Namjae Jeon <linkinjeon@kernel.org> Signed-off-by: Steve French <stfrench@microsoft.com>
2026-05-13ksmbd: fix null pointer dereference in proc_show_files()Jeremy Laratro1-1/+1
When a SMB2 client opens a file with a durable v2 handle and then issues SMB2 SESSION_LOGOFF, session_fd_check() clears fp->tcon = NULL on the reconnectable file pointer but leaves the fp registered in global_ft.idr until the durable scavenger fires (up to fp->durable_timeout seconds later). During that window any read of /proc/fs/ksmbd/files (mode 0400) panics the kernel because proc_show_files() walks global_ft.idr and unconditionally dereferences fp->tcon->id with no NULL guard. Reproducer requires only a successful SMB2 SESSION_SETUP and a share configured with 'durable handles = yes'. KASAN report on mainline 70390501d194: general protection fault, probably for non-canonical address 0xdffffc0000000000: 0000 [#1] SMP KASAN PTI KASAN: null-ptr-deref in range [0x0000000000000000-0x0000000000000007] RIP: 0010:proc_show_files+0x118/0x740 Call Trace: proc_show_files+0x118/0x740 seq_read_iter+0x4ef/0xe10 proc_reg_read_iter+0x1b7/0x280 ... Guard the dereference. A durable-disconnected fp legitimately has no tcon; report its tree id as 0 rather than oopsing. Fixes: b38f99c1217a ("ksmbd: add procfs interface for runtime monitoring and statistics") Cc: stable@vger.kernel.org Signed-off-by: Jeremy Laratro <research@aradex.io> Acked-by: Namjae Jeon <linkinjeon@kernel.org> Signed-off-by: Steve French <stfrench@microsoft.com>
2026-05-13ksmbd: fix SID memory leak in set_posix_acl_entries_dacl() on overflowFerry Meng1-3/+9
Commit 299f962c0b02 ("ksmbd: use check_add_overflow() to prevent u16 DACL size overflow") added check_add_overflow() guards that break out of the ACE-building loops in set_posix_acl_entries_dacl() when the accumulated DACL size would wrap past 65535. However, each iteration allocates a struct smb_sid via kmalloc_obj() at the top of the loop and relies on the kfree(sid) call at the end of the loop body (the 'pass_same_sid' label in the first loop, and the explicit kfree at the tail of the second loop) to release it. The newly introduced 'break' statements bypass those kfree() calls, leaking the sid buffer every time an overflow is detected. A malicious or malformed file with enough POSIX ACL entries to trip the overflow check will leak one or more struct smb_sid allocations on every request that touches the file's DACL, providing a trivial kernel memory exhaustion vector. Free sid before breaking out of the loops to plug the leak. Fixes: 299f962c0b02 ("ksmbd: use check_add_overflow() to prevent u16 DACL size overflow") Cc: stable@vger.kernel.org Signed-off-by: Ferry Meng <mengferry@linux.alibaba.com> Acked-by: Namjae Jeon <linkinjeon@kernel.org> Signed-off-by: Steve French <stfrench@microsoft.com>
2026-05-12SMB3.1.1: add missing QUERY_DIR info levelsSteve French2-2/+6
New Infolevels for QUERY_DIR (and QUERY_INFO) levels 78 through 81 are now being used by Windows clients and were added to the documentation. Add defines for them (and correct some typos in documentation). See MS-SMB2 2.2.33 and MS-FSCC 2.4 Signed-off-by: Steve French <stfrench@microsoft.com>
2026-05-12netfs: Fix potential for tearing in ->remote_i_size and ->zero_pointDavid Howells7-42/+74
Fix potential tearing in using ->remote_i_size and ->zero_point by copying i_size_read() and i_size_write() and using the same seqcount as for i_size. We need to make sure that netfslib and the filesystems that use it always hold i_lock whilst updating any of the sizes to prevent i_size_seqcount from getting corrupted. Fixes: 4058f742105e ("netfs: Keep track of the actual remote file size") Fixes: 100ccd18bb41 ("netfs: Optimise away reads above the point at which there can be no data") Closes: https://sashiko.dev/#/patchset/20260414082004.3756080-1-dhowells%40redhat.com Signed-off-by: David Howells <dhowells@redhat.com> Link: https://patch.msgid.link/20260512123404.719402-6-dhowells@redhat.com cc: Paulo Alcantara <pc@manguebit.org> cc: Matthew Wilcox <willy@infradead.org> cc: netfs@lists.linux.dev cc: linux-fsdevel@vger.kernel.org Signed-off-by: Christian Brauner <brauner@kernel.org>
2026-05-11fs: add icount_read_once() and stop open-coding ->i_count loadsMateusz Guzik1-1/+1
Similarly to inode_state_read_once(), it makes the caller spell out they acknowledge instability of the returned value. Signed-off-by: Mateusz Guzik <mjguzik@gmail.com> Link: https://patch.msgid.link/20260421182538.1215894-2-mjguzik@gmail.com Reviewed-by: Jan Kara <jack@suse.cz> Signed-off-by: Christian Brauner <brauner@kernel.org>
2026-05-11ksmbd: Report filesystem case sensitivity via FS_ATTRIBUTE_INFORMATIONChuck Lever1-6/+24
FS_ATTRIBUTE_INFORMATION responses have always reported FILE_CASE_SENSITIVE_SEARCH and FILE_CASE_PRESERVED_NAMES unconditionally. Case-insensitive filesystems like exFAT, and casefolded directories on ext4 or f2fs, have no way to signal their actual semantics to SMB clients. Now that filesystems expose case behavior through ->fileattr_get, query it via vfs_fileattr_get() and translate the FS_XFLAG_CASEFOLD and FS_XFLAG_CASENONPRESERVING flags into the corresponding SMB attributes. Filesystems without ->fileattr_get continue reporting default POSIX behavior (case-sensitive, case-preserving). SMB's FS_ATTRIBUTE_INFORMATION reports per-share attributes from the share root, not per-file. Shares mixing casefold and non-casefold directories report the root directory's behavior. Acked-by: Namjae Jeon <linkinjeon@kernel.org> Reviewed-by: Roland Mainz <roland.mainz@nrubsig.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Link: https://patch.msgid.link/20260507-case-sensitivity-v14-15-e62cc8200435@oracle.com Signed-off-by: Christian Brauner <brauner@kernel.org>
2026-05-11cifs: Implement fileattr_get for case sensitivityChuck Lever3-0/+57
Upper layers such as NFSD need a way to query whether a filesystem handles filenames in a case-sensitive manner. Report CIFS/SMB case handling behavior via FS_XFLAG_CASEFOLD and FS_XFLAG_CASENONPRESERVING. The authoritative source is the server itself: at mount time CIFS issues QueryFSInfo(FS_ATTRIBUTE_INFORMATION) and caches the reply on the tcon. That reply carries FILE_CASE_SENSITIVE_SEARCH and FILE_CASE_PRESERVED_NAMES, which reflect whatever case handling the share actually implements after SMB3.1.1 POSIX extensions negotiation. Translating those two bits into the VFS flags lets cifs_fileattr_get report what the server advertises rather than what the client was asked to pretend. QueryFSInfo is best-effort; the mount completes even if the server does not answer. MaxPathNameComponentLength is zero in that case and is used as the "no reply received" sentinel. When no reply is available, fall back to the nocase mount option so that the reported behavior agrees with the dentry comparison operations installed on the superblock. The callback is registered on cifs_dir_inode_ops so that NFSD, ksmbd, and other consumers querying case handling against a directory get a definitive answer, and on cifs_file_inode_ops to preserve FS_COMPR_FL reporting on regular files. cifs_set_ops() also installs cifs_namespace_inode_operations on DFS referral directories that carry IS_AUTOMOUNT; register the same callback there so the answer does not depend on whether the directory is a referral point. Registering fileattr_get routes FS_IOC_GETFLAGS through vfs_fileattr_get() and short-circuits the syscall's fallback to cifs_ioctl(). That fallback invoked CIFSGetExtAttr() under CONFIG_CIFS_POSIX and CONFIG_CIFS_ALLOW_INSECURE_LEGACY on servers advertising CIFS_UNIX_EXTATTR_CAP, surfacing the SMB1 Unix-extension immutable, append, and nodump bits. cifs_fileattr_get carries over only FS_COMPR_FL from cached cifsAttrs; the SMB1 extattr fetch is not reproduced. SMB1 is deprecated, and acquiring a netfid from within a dentry-only callback is not worth preserving a path tied to an insecure legacy dialect. Acked-by: Steve French <stfrench@microsoft.com> Reviewed-by: Roland Mainz <roland.mainz@nrubsig.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Link: https://patch.msgid.link/20260507-case-sensitivity-v14-9-e62cc8200435@oracle.com Signed-off-by: Christian Brauner <brauner@kernel.org>
2026-05-08Merge tag 'v7.1-rc3-smb3-client-fixes' of git://git.samba.org/sfrench/cifs-2.6Linus Torvalds7-24/+96
Pull smb client fixes from Steve French: - Fix for two ACL issues (security fix to validate dacloffset better and chmod fix) - Fix out of bounds reads (in check_wsl_eas and smb2_check_msg for symlinks) - Two Kerberos fixes including an important one when AES-256 encryption chosen - Fix open_cached_dir problem when directory leases disabled * tag 'v7.1-rc3-smb3-client-fixes' of git://git.samba.org/sfrench/cifs-2.6: smb: client: validate dacloffset before building DACL pointers smb/client: fix out-of-bounds read in smb2_compound_op() smb/client: fix out-of-bounds read in symlink_data() smb: client: Zero-pad short GSS session keys per MS-SMB2 smb: client: Use FullSessionKey for AES-256 encryption key derivation smb: client: use kzalloc to zero-initialize security descriptor buffer cifs: abort open_cached_dir if we don't request leases
2026-05-07smb: client: validate dacloffset before building DACL pointersMichael Bommarito1-3/+32
parse_sec_desc(), build_sec_desc(), and the chown path in id_mode_to_cifs_acl() all add the server-supplied dacloffset to pntsd before proving a DACL header fits inside the returned security descriptor. On 32-bit builds a malicious server can return dacloffset near U32_MAX, wrap the derived DACL pointer below end_of_acl, and then slip past the later pointer-based bounds checks. build_sec_desc() and id_mode_to_cifs_acl() can then dereference DACL fields from the wrapped pointer in the chmod/chown rewrite paths. Validate dacloffset numerically before building any DACL pointer and reuse the same helper at the three DACL entry points. Fixes: bc3e9dd9d104 ("cifs: Change SIDs in ACEs while transferring file ownership.") Cc: stable@vger.kernel.org Assisted-by: Claude:claude-opus-4-6 Signed-off-by: Michael Bommarito <michael.bommarito@gmail.com> Signed-off-by: Steve French <stfrench@microsoft.com>
2026-05-07smb/client: fix out-of-bounds read in smb2_compound_op()Zisen Ye1-4/+8
If a server sends a truncated response but a large OutputBufferLength, and terminates the EA list early, check_wsl_eas() returns success without validating that the entire OutputBufferLength fits within iov_len. Then smb2_compound_op() does: memcpy(idata->wsl.eas, data[0], size[0]); Where size[0] is OutputBufferLength. If iov_len is smaller than size[0], memcpy can read beyond the end of the rsp_iov allocation and leak adjacent kernel heap memory. Link: https://lore.kernel.org/linux-cifs/d998240c-aca9-420d-9dbd-f5ba24af19e0@chenxiaosong.com/ Fixes: ea41367b2a60 ("smb: client: introduce SMB2_OP_QUERY_WSL_EA") Cc: stable@vger.kernel.org Signed-off-by: Zisen Ye <zisenye@stu.xidian.edu.cn> Reviewed-by: ChenXiaoSong <chenxiaosong@kylinos.cn> Signed-off-by: Steve French <stfrench@microsoft.com>
2026-05-07smb/client: fix out-of-bounds read in symlink_data()Zisen Ye1-1/+2
Since smb2_check_message() returns success without length validation for the symlink error response, in symlink_data() it is possible for iov->iov_len to be smaller than sizeof(struct smb2_err_rsp). If the buffer only contains the base SMB2 header (64 bytes), accessing err->ErrorContextCount (at offset 66) or err->ByteCount later in symlink_data() will cause an out-of-bounds read. Link: https://lore.kernel.org/linux-cifs/297d8d9b-adf7-42fd-a1c2-5b1f230032bc@chenxiaosong.com/ Fixes: 76894f3e2f71 ("cifs: improve symlink handling for smb2+") Cc: Stable@vger.kernel.org Signed-off-by: Zisen Ye <zisenye@stu.xidian.edu.cn> Reviewed-by: ChenXiaoSong <chenxiaosong@kylinos.cn> Signed-off-by: Steve French <stfrench@microsoft.com>
2026-05-07smb: client: Zero-pad short GSS session keys per MS-SMB2Piyush Sachdeva1-5/+18
Per MS-SMB2 section 3.2.5.3, Session.SessionKey is the first 16 bytes of the GSS cryptographic key, right-padded with zero bytes if the key is shorter than 16 bytes. SMB2_auth_kerberos() copies the GSS session key from the cifs.upcall response using kmemdup(msg->data, msg->sesskey_len, ...) and stores the GSS-reported length verbatim in ses->auth_key.len. generate_key() reads SMB2_NTLMV2_SESSKEY_SIZE bytes from this buffer when feeding the HMAC-SHA256 KDF for signing key derivation. If a GSS mechanism returns a session key shorter than 16 bytes (e.g. a deprecated single-DES Kerberos enctype with an 8-byte session key), the KDF call performs an out-of-bounds slab read and derives keys that do not match the server, which pads per the spec. Modern KDCs disable short-key enctypes by default, so this is latent rather than reachable in production, but it is still a kernel heap over-read. Allocate auth_key.response with kzalloc() at a length of max(msg->sesskey_len, SMB2_NTLMV2_SESSKEY_SIZE), copy the GSS key in, and rely on kzalloc()'s zero initialization for the spec-mandated padding. Set ses->auth_key.len to the padded length. Larger GSS keys (e.g. the 32-byte aes256-cts-hmac-sha1-96 session key) continue to be stored at their natural length, preserving the FullSessionKey path. Emit a cifs_dbg(VFS, ...) message when a short key is encountered to surface deprecated-enctype usage. NTLMv2 and NTLMSSP code paths produce a 16-byte session key by construction and are unaffected. Signed-off-by: Piyush Sachdeva <psachdeva@microsoft.com> Signed-off-by: Piyush Sachdeva <s.piyush1024@gmail.com> Signed-off-by: Steve French <stfrench@microsoft.com>
2026-05-07smb: client: Use FullSessionKey for AES-256 encryption key derivationPiyush Sachdeva2-10/+27
When Kerberos authentication is used with AES-256 encryption (AES-256-CCM or AES-256-GCM), the SMB3 encryption and decryption keys must be derived using the full session key (Session.FullSessionKey) rather than just the first 16 bytes (Session.SessionKey). Per MS-SMB2 section 3.2.5.3.1, when Connection.Dialect is "3.1.1" and Connection.CipherId is AES-256-CCM or AES-256-GCM, Session.FullSessionKey must be set to the full cryptographic key from the GSS authentication context. The encryption and decryption key derivation (SMBC2SCipherKey, SMBS2CCipherKey) must use this FullSessionKey as the KDF input. The signing key derivation continues to use Session.SessionKey (first 16 bytes) in all cases. Previously, generate_key() hardcoded SMB2_NTLMV2_SESSKEY_SIZE (16) as the HMAC-SHA256 key input length for all derivations. When Kerberos with AES-256 provides a 32-byte session key, the KDF for encryption/decryption was using only the first 16 bytes, producing keys that did not match the server's, causing mount failures with sec=krb5 and require_gcm_256=1. Add a full_key_size parameter to generate_key() and pass the appropriate size from generate_smb3signingkey(): - Signing: always SMB2_NTLMV2_SESSKEY_SIZE (16 bytes) - Encryption/Decryption: ses->auth_key.len when AES-256, otherwise 16 Also fix cifs_dump_full_key() to report the actual session key length for AES-256 instead of hardcoded CIFS_SESS_KEY_SIZE, so that userspace tools like Wireshark receive the correct key for decryption. Cc: <stable@vger.kernel.org> Reviewed-by: Bharath SM <bharathsm@microsoft.com> Signed-off-by: Piyush Sachdeva <psachdeva@microsoft.com> Signed-off-by: Piyush Sachdeva <s.piyush1024@gmail.com> Signed-off-by: Steve French <stfrench@microsoft.com>
2026-05-03smb: client: use kzalloc to zero-initialize security descriptor bufferBjoern Doebel1-1/+1
Commit 62e7dd0a39c2d ("smb: common: change the data type of num_aces to le16") split struct smb_acl's __le32 num_aces field into __le16 num_aces and __le16 reserved. The reserved field corresponds to Sbz2 in the MS-DTYP ACL wire format, which must be zero [1]. When building an ACL descriptor in build_sec_desc(), we are using a kmalloc()'ed descriptor buffer and writing the fields explicitly using le16() writes now. This never writes to the 2 byte reserved field, leaving it as uninitialized heap data. When the reserved field happens to contain non-zero slab garbage, Samba rejects the security descriptor with "ndr_pull_security_descriptor failed: Range Error", causing chmod to fail with EINVAL. Change kmalloc() to kzalloc() to ensure the entire buffer is zero-initialized. Fixes: 62e7dd0a39c2d ("smb: common: change the data type of num_aces to le16") Cc: stable@vger.kernel.org Signed-off-by: Bjoern Doebel <doebel@amazon.de> Assisted-by: Kiro:claude-opus-4.6 [1] https://learn.microsoft.com/en-us/openspecs/windows_protocols/ms-dtyp/20233ed8-a6c6-4097-aafa-dd545ed24428 Signed-off-by: Steve French <stfrench@microsoft.com>
2026-05-03cifs: abort open_cached_dir if we don't request leasesShyam Prasad N1-0/+8
It is possible that SMB2_open_init may not set lease context based on the requested oplock level. This can happen when leases have been temporarily or permanently disabled. When this happens, we will have open_cached_dir making an open without lease context and the response will anyway be rejected by open_cached_dir (thereby forcing a close to discard this open). That's unnecessary two round-trips to the server. This change adds a check before making the open request to the server to make sure that SMB2_open_init did add the expected lease context to the open in open_cached_dir. Cc: <stable@vger.kernel.org> Reviewed-by: Bharath SM <bharathsm@microsoft.com> Signed-off-by: Shyam Prasad N <sprasad@microsoft.com> Signed-off-by: Steve French <stfrench@microsoft.com>
2026-05-01ksmbd: validate inherited ACE SID lengthShota Zaizen1-14/+52
smb_inherit_dacl() walks the parent directory DACL loaded from the security descriptor xattr. It verifies that each ACE contains the fixed SID header before using it, but does not verify that the variable-length SID described by sid.num_subauth is fully contained in the ACE. A malformed inheritable ACE can advertise more subauthorities than are present in the ACE. compare_sids() may then read past the ACE. smb_set_ace() also clamps the copied destination SID, but used the unchecked source SID count to compute the inherited ACE size. That could advance the temporary inherited ACE buffer pointer and nt_size accounting past the allocated buffer. Fix this by validating the parent ACE SID count and SID length before using the SID during inheritance. Compute the inherited ACE size from the copied SID so the size matches the bounded destination SID. Reject the inherited DACL if size accumulation would overflow smb_acl.size or the security descriptor allocation size. Fixes: e2f34481b24d ("cifsd: add server-side procedures for SMB3") Signed-off-by: Shota Zaizen <s@zaizen.me> Acked-by: Namjae Jeon <linkinjeon@kernel.org> Signed-off-by: Steve French <stfrench@microsoft.com>
2026-05-01ksmbd: fix kernel-doc warnings from ksmbd_conn_get/put()Namjae Jeon1-0/+4
The kernel test robot reported W=1 build warnings for ksmbd_conn_get() and ksmbd_conn_put() due to missing parameter descriptions. Add the @conn description to fix these warnings. Reported-by: kernel test robot <lkp@intel.com> Signed-off-by: Namjae Jeon <linkinjeon@kernel.org> Signed-off-by: Steve French <stfrench@microsoft.com>
2026-05-01ksmbd: fail share config requests when path allocation failsShuhao Fu1-4/+8
Non-pipe shares must have a duplicated backing path before they can be published. share_config_request() currently calls kstrndup() for that path, but if the allocation fails it leaves ret unchanged. If veto list parsing succeeds and share->name exists, the partially built share is still inserted into the global share table with share->path left NULL. A later share-root SMB2 create uses tree_conn->share_conf->path as the lookup root. If the share was published with path == NULL, that request passes a NULL pathname into do_getname_kernel()/strlen() and can crash the ksmbd worker. Set ret = -ENOMEM when path duplication fails so the incomplete share is destroyed before publication. Fixes: e2f34481b24d ("cifsd: add server-side procedures for SMB3") Signed-off-by: Shuhao Fu <sfual@cse.ust.hk> Acked-by: Namjae Jeon <linkinjeon@kernel.org> Signed-off-by: Steve French <stfrench@microsoft.com>
2026-05-01ksmbd: close durable scavenger races against m_fp_list lookupsDaeMyung Kang1-26/+76
ksmbd_durable_scavenger() has two related races against any walker that iterates f_ci->m_fp_list, including ksmbd_lookup_fd_inode() (used by ksmbd_vfs_rename) and the share-mode checks in fs/smb/server/smb_common.c. (1) fp->node list-head reuse. Durable-preserved handles can remain linked on f_ci->m_fp_list after session teardown so share-mode checks still see them while the handle is reconnectable. The scavenger collected expired handles by adding fp->node to a local scavenger_list after removing them from the global durable idr. Because fp->node is the same list_head used by m_fp_list, list_add(&fp->node, &scavenger_list) overwrites the m_fp_list links and corrupts both lists. CONFIG_DEBUG_LIST can report this on the share-mode walk path. (2) Refcount race against m_fp_list walkers. The scavenger qualifies an expired durable handle with atomic_read(&fp->refcount) > 1 and fp->conn under global_ft.lock, removes fp from global_ft, then drops global_ft.lock before unlinking fp from m_fp_list and freeing it. During that gap fp is still linked on m_fp_list with f_state == FP_INITED. ksmbd_lookup_fd_inode() under m_lock read calls ksmbd_fp_get() (atomic_inc_not_zero on refcount that is still 1) and takes a live reference; the scavenger then unlinks and frees fp while the holder owns a reference, leading to UAF on the holder's subsequent ksmbd_fd_put() and on any field reads performed by a concurrent share-mode walker that iterates m_fp_list without taking ksmbd_fp_get() (smb_check_perm_dleases-like paths). Fix both: * Stop reusing fp->node as a scavenger-private list node. Remove one expired handle from global_ft under global_ft.lock, take an explicit transient reference, drop the lock, unlink fp->node from m_fp_list under f_ci->m_lock, then drop both the durable lifetime and transient references with atomic_sub_and_test(2, &fp->refcount). If the scavenger is the last putter the close runs there; otherwise an in-flight holder that already raced through the m_fp_list lookup owns the final close via its ksmbd_fd_put() path. The one-at-a-time disposal can rescan the durable idr when multiple handles expire in the same pass, but durable scavenging is a background expiration path and the final full scan recomputes min_timeout before the next wait. * Clear fp->persistent_id inside __ksmbd_remove_durable_fd() right after idr_remove(), so a delayed final close from a holder that snatched fp does not re-issue idr_remove() on a persistent id that idr_alloc_cyclic() in ksmbd_open_durable_fd() may have already handed out to a brand-new durable handle. * Bypass the per-conn open_files_count decrement in __put_fd_final() when fp is detached from any session table (fp->conn cleared by session_fd_check() at durable preserve -- paired with the volatile_id clear at unpublish, so checking fp->conn alone is sufficient). The walker that owns the final close runs from an unrelated work->conn whose stats.open_files_count never tracked this durable fp; without this guard the holder would underflow that unrelated counter. The two races are folded into one patch because patch (1) alone cleans up the corrupted list but leaves a deterministic UAF window for m_fp_list walkers that the transient-reference and persistent_id discipline in (2) close; bisecting onto an intermediate state would land on a UAF that pre-patch chaos merely made less reproducible. Validation: * CONFIG_DEBUG_LIST coverage for the list_head reuse path. * KASAN-enabled direct SMB2 durable-handle coverage that exercised ksmbd_durable_scavenger() and non-NULL ksmbd_lookup_fd_inode() returns while durable handles expired under concurrent rename lookups, with no KASAN, UAF, list-corruption, ODEBUG, or WARNING reports. * checkpatch --strict * make -j$(nproc) M=fs/smb/server Fixes: d484d621d40f ("ksmbd: add durable scavenger timer") Signed-off-by: DaeMyung Kang <charsyam@gmail.com> Acked-by: Namjae Jeon <linkinjeon@kernel.org> Signed-off-by: Steve French <stfrench@microsoft.com>
2026-05-01ksmbd: harden file lifetime during session teardownDaeMyung Kang3-25/+164
__close_file_table_ids() is the per-session teardown that closes every fp belonging to a session (or to one tree connect on that session) by walking the session's volatile-id idr. The current loop has three related problems on busy or racing workloads: * Sleeping under ft->lock. The session-teardown skip callback, session_fd_check(), already sleeps in ksmbd_vfs_copy_durable_owner() -> kstrdup(GFP_KERNEL) and down_write(&fp->f_ci->m_lock) (a rw_semaphore). Running the callback inside write_lock(&ft->lock) trips CONFIG_DEBUG_ATOMIC_SLEEP / CONFIG_PROVE_LOCKING on a durable-fd workload. * Refcount accounting blind to f_state. The unconditional atomic_dec_and_test(&fp->refcount) does not distinguish FP_INITED (idr-owned reference still intact) from FP_CLOSED (an earlier ksmbd_close_fd() already consumed the idr-owned reference while leaving fp in the idr because a holder kept refcount non-zero). When the latter races with teardown the same path over-decrements into a holder reference and ksmbd_fd_put() later UAFs that holder. * FP_NEW window. Between __open_id() publishing fp into the session idr and ksmbd_update_fstate(..., FP_INITED) committing the transition at the end of smb2_open(), an fp is in FP_NEW and an intervening teardown that takes a transient reference and unpublishes the volatile id leaves the original idr-owned reference orphaned -- the opener is unaware that fp has been unpublished, returns success to the client, and the fp leaks at refcount = 1. Refactor __close_file_table_ids() to take a transient reference on fp and unpublish fp from the session idr *under ft->lock* before calling skip() outside the lock. A transient ref protects lifetime but not concurrent field mutation, so the idr_remove() is what keeps __ksmbd_lookup_fd() through this session's idr from granting a new ksmbd_fp_get() reference to an fp whose fp->conn / fp->tcon / fp->volatile_id / op->conn / lock_list links are about to be rewritten by session_fd_check(). Durable reconnect is unaffected because it reaches fp through the global durable table (ksmbd_lookup_durable_fd -> global_ft). Decide n_to_drop together with any FP_INITED -> FP_CLOSED transition under ft->lock so teardown and ksmbd_close_fd() never both consume the idr-owned reference. See ksmbd_mark_fp_closed() for the per-state accounting. For the FP_NEW path to be safe, the opener has to learn that fp was unpublished: ksmbd_update_fstate() now returns -ENOENT when an FP_NEW -> FP_INITED transition finds f_state already advanced or the volatile id cleared (both committed by teardown under ft->lock); smb2_open() propagates that as STATUS_OBJECT_NAME_INVALID and drops the original reference via ksmbd_fd_put(). The list removal cannot be left for a deferred final putter because fp->volatile_id has already been cleared and __ksmbd_remove_fd() will intentionally skip both idr_remove() and list_del_init(). Move the m_fp_list unlink in __ksmbd_remove_fd() above the volatile-id check so that an FP_NEW fp that happened to be added to m_fp_list (smb2_open() adds fp->node before ksmbd_update_fstate() runs) is still cleaned up on the deferred putter path; list_del_init() on an empty node is a no-op and remains safe for fps that were never added. Add a defensive guard in session_fd_check() that refuses non-FP_INITED fps so that even if a teardown reaches an FP_NEW fp it falls into the close branch (where the n_to_drop = 1 accounting keeps the opener's reference alive) instead of the durable-preserve branch (which mutates fp->conn / fp->tcon). Validation on a debug kernel additionally built with CONFIG_DEBUG_LIST and CONFIG_DEBUG_OBJECTS_WORK used a same-session two-tcon workload (open/write storm on one tcon, 50 tree disconnects on the other) and reported no list-corruption, work_struct ODEBUG, sleep-in-atomic, lockdep or kmemleak reports. Reverting only the __close_file_table_ids() hunk while keeping a forced-is_reconnectable() harness produced the expected sleep-in-atomic at vfs_cache.c:1095, confirming the ft->lock-out-of-sleepable-skip discipline. KASAN-enabled direct SMB2 coverage with durable handles enabled exercised ksmbd_close_tree_conn_fds(), ksmbd_close_session_fds(), the FP_NEW failure path, tree_conn_fd_check(), and a non-zero session_fd_check() durable-preserve return. This produced no KASAN, DEBUG_LIST, ODEBUG, or WARNING reports. Fixes: f44158485826 ("cifsd: add file operations") Signed-off-by: DaeMyung Kang <charsyam@gmail.com> Acked-by: Namjae Jeon <linkinjeon@kernel.org> Signed-off-by: Steve French <stfrench@microsoft.com>
2026-05-01ksmbd: centralize ksmbd_conn final release to plug transport leakDaeMyung Kang5-30/+156
ksmbd_conn_free() is one of four sites that can observe the last refcount drop of a struct ksmbd_conn. The other three fs/smb/server/connection.c ksmbd_conn_r_count_dec() fs/smb/server/oplock.c __free_opinfo() fs/smb/server/vfs_cache.c session_fd_check() end the conn with a bare kfree(), skipping ida_destroy(&conn->async_ida) and conn->transport->ops->free_transport(conn->transport). Whenever one of them is the last putter, the embedded async_ida and the entire transport struct leak -- for TCP, that is also the struct socket and the kvec iov. __free_opinfo() being a final putter is not theoretical. opinfo_put() queues the callback via call_rcu(&opinfo->rcu, free_opinfo_rcu), so ksmbd_server_terminate_conn() can deposit N opinfo releases in RCU and have ksmbd_conn_free() run in the handler thread before any of them fire. ksmbd_conn_free() then observes refcnt > 0 and short-circuits; the last RCU-delivered __free_opinfo() falls onto its bare kfree(conn) branch and the transport is lost. A/B validation in a QEMU/virtme guest, mounting //127.0.0.1/testshare: each iteration holds 8 files open via sleep processes, force-closes TCP with "ss -K sport = :445", kills the holders, lazy-umounts; repeated 10 times, then ksmbd shutdown and kmemleak scan. state conn_alloc conn_free tcp_free opi_rcu kmemleak ---------- ---------- --------- -------- ------- -------- pre-patch 20 20 10 160 7 with patch 20 20 20 160 0 Pre-patch conn_free=20 with tcp_free=10 directly demonstrates the bare-kfree paths skipping transport cleanup; kmemleak backtraces point into struct tcp_transport / iov. With this patch tcp_free matches conn_free at 20/20 and kmemleak is clean. Move the per-struct final release into __ksmbd_conn_release_work() and route the three bare-kfree final-put sites through a new ksmbd_conn_put(). Those sites now pair ida_destroy() and free_transport() with kfree(conn) regardless of which holder happens to release the last reference. stop_sessions() only triggers the transport shutdown and does not itself drop the last conn reference, so it is unaffected. The centralized release reaches sock_release() -> tcp_close() -> lock_sock_nested() (might_sleep) from every final putter, including __free_opinfo() invoked from an RCU softirq callback, which trips CONFIG_DEBUG_ATOMIC_SLEEP. Defer the release to a dedicated ksmbd_conn_wq workqueue so ksmbd_conn_put() is safe from any non-sleeping context. Make ksmbd_file own a strong connection reference while fp->conn is non-NULL so durable-preserve and final-close paths cannot dereference a stale connection. ksmbd_open_fd() and ksmbd_reopen_durable_fd() take the reference via ksmbd_conn_get() (the latter also reorders the fp->conn / fp->tcon assignments before __open_id() so the published fp is never observed with fp->conn == NULL); session_fd_check() and __ksmbd_close_fd() drop it via ksmbd_conn_put(). With that invariant, session_fd_check() can take a local conn pointer once and use it across the m_op_list and lock_list iterations even though op->conn puts may otherwise drop the last reference. At module exit the workqueue is flushed and destroyed after rcu_barrier(), so any release queued by a trailing RCU callback is drained before the inode hash and module text go away. Fixes: ee426bfb9d09 ("ksmbd: add refcnt to ksmbd_conn struct") Signed-off-by: DaeMyung Kang <charsyam@gmail.com> Acked-by: Namjae Jeon <linkinjeon@kernel.org> Signed-off-by: Steve French <stfrench@microsoft.com>
2026-05-01smb: smbdirect: fix MR registration for coalesced SG listsYi Kuo1-9/+12
ib_dma_map_sg() modifies the provided scatterlist and returns the number of mapped entries, which can be fewer than the requested mr->sgt.nents if the DMA controller coalesces contiguous memory segments. Passing the original, uncoalesced count to ib_map_mr_sg() causes memory registration failures if coalescing actually occurs. Capture the actual mapped count returned by ib_dma_map_sg() and pass it to ib_map_mr_sg() to ensure correct MR registration. Also update the ib_dma_map_sg() error logging to drop the error pointer formatting, since the return value is an integer count rather than an error code. Ensure a proper error code (-EIO) is assigned when DMA mapping or MR registration fails. Fixes: de5ef8ec3c46 ("smb: smbdirect: introduce smbdirect_mr.c with client mr code") Closes: https://bugzilla.kernel.org/show_bug.cgi?id=221408 Reviewed-by: Stefan Metzmacher <metze@samba.org> Acked-by: Namjae Jeon <linkinjeon@kernel.org> Signed-off-by: Yi Kuo <yi@yikuo.dev> Signed-off-by: Steve French <stfrench@microsoft.com>
2026-05-01smb: smbdirect: introduce and use include/linux/smbdirect.hStefan Metzmacher7-204/+3
This makes it easier to rebuild cifs.ko and ksmbd.ko against a running kernel. Suggested-by: Christoph Hellwig <hch@infradead.org> Link: https://lore.kernel.org/linux-cifs/aehrPuY60VMcYGU8@infradead.org/ Cc: Steve French <smfrench@gmail.com> Cc: Tom Talpey <tom@talpey.com> Cc: Long Li <longli@microsoft.com> Cc: Namjae Jeon <linkinjeon@kernel.org> Cc: Christoph Hellwig <hch@infradead.org> Cc: linux-cifs@vger.kernel.org Cc: samba-technical@lists.samba.org Signed-off-by: Stefan Metzmacher <metze@samba.org> Signed-off-by: Steve French <stfrench@microsoft.com>
2026-05-01smb: smbdirect: make use of DEFAULT_SYMBOL_NAMESPACE and EXPORT_SYMBOL_GPLStefan Metzmacher13-30/+33
This is a better solution than EXPORT_SYMBOL_FOR_MODULES(__sym, "cifs,ksmbd") as it makes it possible to rebuild smbdirect.ko against a running kernel and then load the existing cifs.ko and ksmbd.ko from the running kernel. Suggested-by: Christoph Hellwig <hch@infradead.org> Link: https://lore.kernel.org/linux-cifs/aehrPuY60VMcYGU8@infradead.org/ Cc: Steve French <smfrench@gmail.com> Cc: Tom Talpey <tom@talpey.com> Cc: Long Li <longli@microsoft.com> Cc: Namjae Jeon <linkinjeon@kernel.org> Cc: Christoph Hellwig <hch@infradead.org> Cc: linux-cifs@vger.kernel.org Cc: samba-technical@lists.samba.org Signed-off-by: Stefan Metzmacher <metze@samba.org> Signed-off-by: Steve French <stfrench@microsoft.com>
2026-05-01Merge tag 'v7.1-rc2-ksmbd-server-fixes' of git://git.samba.org/ksmbdLinus Torvalds3-9/+46
Pull smb server fixes from Steve French: - Fix shutdown (stop sessions) - Fix readdir unsupported info level * tag 'v7.1-rc2-ksmbd-server-fixes' of git://git.samba.org/ksmbd: ksmbd: rewrite stop_sessions() with restartable iteration smb: server: handle readdir_info_level_struct_sz() error
2026-04-29cifs: change_conf needs to be called for session setupShyam Prasad N1-0/+11
Today we skip calling change_conf for negotiates and session setup requests. This can be a problem for mchan as the immediate next call after session setup could be due to an I/O that is made on the mount point. For single channel, this is not a problem as there will be several calls after setting up session. This change enforces calling change_conf when the total credits contain enough for reservations for echoes and oplocks. We expect this to happen during the last session setup response. This way, echoes and oplocks are not disabled before the first request to the server. So if that first request is an open, it does not need to disable requesting leases. Cc: <stable@vger.kernel.org> Reviewed-by: Bharath SM <bharathsm@microsoft.com> Signed-off-by: Shyam Prasad N <sprasad@microsoft.com> Signed-off-by: Steve French <stfrench@microsoft.com>
2026-04-29smb: client: change allocation requirements in smb2_compound_opFredric Cover1-1/+1
Currently, smb2_compound_op() allocates struct smb2_compound_vars *vars using GFP_ATOMIC, although smb2_compound_op() can sleep when it calls compound_send_recv() before vars is freed. Allocate vars using GFP_KERNEL. Signed-off-by: Fredric Cover <fredric.cover.lkernel@gmail.com> Signed-off-by: Steve French <stfrench@microsoft.com>
2026-04-29ksmbd: rewrite stop_sessions() with restartable iterationDaeMyung Kang2-8/+39
stop_sessions() walks conn_list with hash_for_each() and, for every entry, drops conn_list_lock across the transport ->shutdown() call before re-acquiring the read lock to continue the loop. The hash walk relies on cross-iteration state (the current bucket and the hlist position), which is not preserved across unlock/relock: if another thread performs a list mutation during the unlocked window, the ongoing iteration becomes unreliable and can re-visit connections that have already been handled or skip connections that have not. The outer `if (!hash_empty(conn_list)) goto again;` retry masks the symptom in the common case but does not address the unsafe iteration itself. Reframe the loop so it never relies on iterator state across unlock/relock. Under conn_list_lock held for read, pick the first connection whose ->shutdown() has not yet been issued by this path, pin it by taking an extra reference, record that fact on the connection and mark it EXITING while still inside the locked walk, then drop the lock. Then call ->shutdown() outside the lock, drop the pin (freeing the connection if the handler already released its reference), and restart from the top. Use a new per-connection flag, conn->stop_called, as the "shutdown issued from stop_sessions()" marker rather than reusing the status state. ksmbd_conn_set_exiting() is also invoked by ksmbd_sessions_deregister() on sibling channels of a multichannel session without issuing a transport shutdown, so treating KSMBD_SESS_EXITING as "already handled here" would skip connections that still need shutdown() to wake their handler out of recv(), leaving the outer retry waiting indefinitely for the hash to drain. stop_sessions() is serialised by init_lock in ksmbd_conn_transport_destroy(), so writing stop_called under the read lock has no other writer. Set EXITING inside the locked walk so the selection, the stop_called marker, and the status transition all happen together, and guard against regressing a connection that has already advanced to KSMBD_SESS_RELEASING on its own (for example, if the handler exited its receive loop for an unrelated reason between teardown steps). When the pin drop is the last put, release the transport and pair ida_destroy(&target->async_ida) with the ida_init() done in ksmbd_conn_alloc(), so stop_sessions() retiring a connection on its own does not leak the xarray backing of the embedded async_ida. The outer retry with msleep() is kept to wait for handler threads to reach ksmbd_conn_free() and drain the hash. Observed with an instrumented build that logs one line per visit and widens the unlocked window before ->shutdown() by 200 ms, under five concurrent cifs mounts (nosharesock, one connection each): * Current code: the same connection address is revisited many times during a single stop_sessions() call and ->shutdown() is invoked well beyond the number of live connections before the hash finally drains. * Rewritten code: each live connection produces exactly one ->shutdown() call; the function returns as soon as the hash is empty. Functional teardown via `ksmbd.control --shutdown` with the same five mounts completes cleanly on the rewritten path. Performance is observably unchanged. Tearing down N concurrent nosharesock cifs connections with `ksmbd.control --shutdown` + `rmmod ksmbd` takes essentially the same wall time before and after the rewrite: N before after 10 4.93s 5.34s 30 7.34s 7.03s 50 7.31s 7.01s (3-run avg: 7.04s vs 7.25s) 100 6.98s 6.78s 200 6.77s 6.89s and the number of ->shutdown() calls equals the number of live connections on both paths when the race is not widened. The teardown is dominated by the msleep(100)-based outer retry waiting for handler threads to run ksmbd_conn_free(), not by the iteration itself; the restartable loop's worst-case O(N^2) visit cost is in the microseconds even at N=200 and sits far below the msleep(100) granularity. Applied alone on top of ksmbd-for-next-next, this patch does not introduce a new leak site. Under the same reproducer (10x concurrent-holders + ss -K + ksmbd.control --shutdown + rmmod), the tree still shows the pre-existing per-connection transport leak count that arises when the last refcount drop lands in one of ksmbd_conn_r_count_dec(), __free_opinfo() or session_fd_check() - all of which end with a bare kfree() today. kmemleak backtraces for the unreferenced objects point into the TCP accept path (sk_clone -> inet_csk_clone_lock, sock_alloc_inode) and none involve stop_sessions(). Plugging those bare-kfree sites is the responsibility of the follow-up patch. Fixes: e2f34481b24d ("cifsd: add server-side procedures for SMB3") Cc: stable@vger.kernel.org Signed-off-by: DaeMyung Kang <charsyam@gmail.com> Acked-by: Namjae Jeon <linkinjeon@kernel.org> Signed-off-by: Steve French <stfrench@microsoft.com>
2026-04-29smb: server: handle readdir_info_level_struct_sz() errorMarios Makassikis1-1/+7
early exit in smb2_populate_readdir_entry() if the requested info_level is unknown. Signed-off-by: Marios Makassikis <mmakassikis@freebox.fr> Acked-by: Namjae Jeon <linkinjeon@kernel.org> Signed-off-by: Steve French <stfrench@microsoft.com>
2026-04-26smb/client: remove unused smb3_parse_opt()David Disseldorp2-32/+0
Commit abdb1742a3123 ("cifs: get rid of mount options string parsing") removed the last caller. Signed-off-by: David Disseldorp <ddiss@suse.de> Signed-off-by: Steve French <stfrench@microsoft.com>
2026-04-23Merge tag 'v7.1-rc-part2-ksmbd-fixes' of git://git.samba.org/ksmbdLinus Torvalds39-214/+164
Pull more smb server updates from Steve French: - move fs/smb/common/smbdirect to fs/smb/smbdirect - change signature calc to use AES-CMAC library, simpler and faster - invalid signature fix - multichannel fix - open create options fix - fix durable handle leak - cap maximum lock count to avoid potential denial of service - four connection fixes: connection free and session destroy IDA fixes, refcount fix, connection leak fix, max_connections off by one fix - IPC validation fix - fix out of bounds write in getting xattrs - fix use after free in durable handle reconnect - three ACL fixes: fix potential ACL overflow, harden num_aces check, and fix minimum ACE size check * tag 'v7.1-rc-part2-ksmbd-fixes' of git://git.samba.org/ksmbd: smb: smbdirect: move fs/smb/common/smbdirect/ to fs/smb/smbdirect/ smb: server: stop sending fake security descriptors ksmbd: scope conn->binding slowpath to bound sessions only ksmbd: fix CreateOptions sanitization clobbering the whole field ksmbd: fix durable fd leak on ClientGUID mismatch in durable v2 open ksmbd: fix O(N^2) DoS in smb2_lock via unbounded LockCount ksmbd: destroy async_ida in ksmbd_conn_free() ksmbd: destroy tree_conn_ida in ksmbd_session_destroy() ksmbd: Use AES-CMAC library for SMB3 signature calculation ksmbd: reset rcount per connection in ksmbd_conn_wait_idle_sess_id() ksmbd: fix out-of-bounds write in smb2_get_ea() EA alignment ksmbd: use check_add_overflow() to prevent u16 DACL size overflow ksmbd: fix use-after-free in smb2_open during durable reconnect ksmbd: validate num_aces and harden ACE walk in smb_inherit_dacl() smb: server: fix max_connections off-by-one in tcp accept path ksmbd: require minimum ACE size in smb_check_perm_dacl() ksmbd: validate response sizes in ipc_validate_msg() smb: server: fix active_num_conn leak on transport allocation failure
2026-04-23Merge tag 'v7.1-rc1-part3-smb3-client-fixes' of ↵Linus Torvalds24-387/+433
git://git.samba.org/sfrench/cifs-2.6 Pull smb client fixes from Steve French: - Four bug fixes: OOB read in ioctl query info, 3 ACL fixes - SMB1 Unix extensions mount fix - Four crypto improvements: move to AES-CMAC library, simpler and faster - Remove drop_dir_cache to avoid potential crash, and move to /procfs - Seven SMB3.1.1 compression fixes * tag 'v7.1-rc1-part3-smb3-client-fixes' of git://git.samba.org/sfrench/cifs-2.6: smb: client: Drop 'allocate_crypto' arg from smb*_calc_signature() smb: client: Make generate_key() return void smb: client: Remove obsolete cmac(aes) allocation smb: client: Use AES-CMAC library for SMB3 signature calculation smb: common: add SMB3_COMPRESS_MAX_ALGS smb: client: compress: add code docs to lz77.c smb: client: compress: LZ77 optimizations smb: client: compress: increase LZ77_MATCH_MAX_DIST smb: client: compress: fix counting in LZ77 match finding smb: client: compress: fix buffer overrun in lz77_compress() smb: client: scope end_of_dacl to CIFS_DEBUG2 use in parse_dacl smb: client: fix (remove) drop_dir_cache module parameter smb: client: require a full NFS mode SID before reading mode bits smb: client: validate the whole DACL before rewriting it in cifsacl smb: client: fix OOB read in smb2_ioctl_query_info QUERY_INFO path cifs: update internal module version number smb: client: compress: fix bad encoding on last LZ77 flag smb: client: fix dir separator in SMB1 UNIX mounts
2026-04-22smb: smbdirect: move fs/smb/common/smbdirect/ to fs/smb/smbdirect/Stefan Metzmacher27-42/+42
This also removes the smbdirect_ prefix from the files. Suggested-by: Linus Torvalds <torvalds@linux-foundation.org> Link: https://lore.kernel.org/linux-cifs/CAHk-=whmue3PVi88K0UZLZO0at22QhQZ-yu+qO2TOKyZpGqecw@mail.gmail.com/ Cc: Steve French <smfrench@gmail.com> Cc: Tom Talpey <tom@talpey.com> Cc: Long Li <longli@microsoft.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: linux-cifs@vger.kernel.org Cc: samba-technical@lists.samba.org Acked-by: Namjae Jeon <linkinjeon@kernel.org> Signed-off-by: Stefan Metzmacher <metze@samba.org> Signed-off-by: Steve French <stfrench@microsoft.com>
2026-04-22smb: client: Drop 'allocate_crypto' arg from smb*_calc_signature()Eric Biggers1-7/+5
Since the crypto library API is now being used instead of crypto_shash, all structs for MAC computation are now just fixed-size structs allocated on the stack; no dynamic allocations are ever required. Besides being much more efficient, this also means that the 'allocate_crypto' argument to smb2_calc_signature() and smb3_calc_signature() is no longer used. Remove this unused argument. Acked-by: Steve French <stfrench@microsoft.com> Reviewed-by: Ard Biesheuvel <ardb@kernel.org> Signed-off-by: Eric Biggers <ebiggers@kernel.org> Signed-off-by: Steve French <stfrench@microsoft.com>
2026-04-22smb: client: Make generate_key() return voidEric Biggers1-29/+16
Since the crypto library API is now being used instead of crypto_shash, generate_key() can no longer fail. Make it return void and simplify the callers accordingly. Reviewed-by: Ard Biesheuvel <ardb@kernel.org> Signed-off-by: Eric Biggers <ebiggers@kernel.org> Signed-off-by: Steve French <stfrench@microsoft.com>
2026-04-22smb: client: Remove obsolete cmac(aes) allocationEric Biggers10-95/+2
Since the crypto library API is now being used instead of crypto_shash, the "cmac(aes)" crypto_shash that is being allocated and stored in 'struct cifs_secmech' is no longer used. Remove it. That makes the kconfig selection of CRYPTO_CMAC and the module softdep on "cmac" unnecessary. So remove those too. Finally, since this removes the last use of crypto_shash from the smb client, also remove the remaining crypto_shash-related helper functions. Note: cifs_unicode.c was relying on <linux/unaligned.h> being included transitively via <crypto/internal/hash.h>. Since the latter include is removed, make cifs_unicode.c include <linux/unaligned.h> explicitly. Reviewed-by: Ard Biesheuvel <ardb@kernel.org> Signed-off-by: Eric Biggers <ebiggers@kernel.org> Signed-off-by: Steve French <stfrench@microsoft.com>
2026-04-22smb: client: Use AES-CMAC library for SMB3 signature calculationEric Biggers4-74/+30
Convert smb3_calc_signature() to use the AES-CMAC library instead of a "cmac(aes)" crypto_shash. The result is simpler and faster code. With the library there's no need to allocate memory, no need to handle errors except for key preparation, and the AES-CMAC code is accessed directly without inefficient indirect calls and other unnecessary API overhead. For now a "cmac(aes)" crypto_shash is still being allocated in 'struct cifs_secmech'. Later commits will remove that, simplifying the code even further. Reviewed-by: Ard Biesheuvel <ardb@kernel.org> Signed-off-by: Eric Biggers <ebiggers@kernel.org> Signed-off-by: Steve French <stfrench@microsoft.com>
2026-04-22smb: common: add SMB3_COMPRESS_MAX_ALGSEnzo Matsumiya1-0/+2
Set it to number of currently defined algorithms (6 as of now). Signed-off-by: Enzo Matsumiya <ematsumiya@suse.de> Signed-off-by: Steve French <stfrench@microsoft.com>
2026-04-22smb: client: compress: add code docs to lz77.cEnzo Matsumiya1-8/+73
Document parts of the code, especially the apparently non-sense parts. Other: - change pointer increment constants to sizeof() values Signed-off-by: Enzo Matsumiya <ematsumiya@suse.de> Signed-off-by: Steve French <stfrench@microsoft.com>
2026-04-22smb: client: compress: LZ77 optimizationsEnzo Matsumiya2-65/+104
This patch implements several micro-optimizations on lz77_compress() with the goal of reducing the number of instructions per [input] byte (a.k.a. IPB). Changes: - change hashtable to be u32 (instead of u64) -- change the hash function to reflect that (adds lz77_hash() and lz77_read32() helpers) - batch-write literals instead of 1 by 1 -- now that we have a well defined hot path (match finding) and a cold path (encode literals + match), batch writing makes a significant difference - implement adaptive skipping of input bytes -- skip input bytes more aggressively if too few matches are being found - name some constants for more meaningful context Signed-off-by: Enzo Matsumiya <ematsumiya@suse.de> Signed-off-by: Steve French <stfrench@microsoft.com>
2026-04-22smb: client: compress: increase LZ77_MATCH_MAX_DISTEnzo Matsumiya1-2/+1
Increase max distance (i.e. window size) from 1k to 8k. This allows better compression and is just as fast. Other: - drop LZ77_MATCH_MIN_DIST as it's nused -- main loop already checks if dist > 0 Signed-off-by: Enzo Matsumiya <ematsumiya@suse.de> Signed-off-by: Steve French <stfrench@microsoft.com>
2026-04-22smb: client: compress: fix counting in LZ77 match findingEnzo Matsumiya1-7/+10
- lz77_match_len() increments @cur before checking for equality, leading to off-by-one match len in some cases. Fix by moving pointers increment to inside the loop. Also rename @wnd arg to @match (more accurate name). - both lz77_match_len() and lz77_compress() checked for "buf + step < end" when the correct is "<=" for such cases. Signed-off-by: Enzo Matsumiya <ematsumiya@suse.de> Signed-off-by: Steve French <stfrench@microsoft.com>
2026-04-22smb: client: compress: fix buffer overrun in lz77_compress()Enzo Matsumiya3-15/+33
@dst buffer is allocated with same size as @src, which, for good compression cases, works fine. However, when compression goes bad (e.g. random bytes payloads), the compressed size can increase significantly, and even by stopping the main loop at 7/8 of @slen, writing leftover literals could write past the end of @dst because of LZ77 metadata. To fix this, add lz77_compressed_alloc_size() helper to compute the correct allocation size for @dst, accounting for metadata and worst cast scenario (all literals). While this is overprovisioning memory, it's not only correct, but also allows lz77_compress() main loop to run without ever checking @dst limits (i.e. a perf improvement). Signed-off-by: Enzo Matsumiya <ematsumiya@suse.de> Signed-off-by: Steve French <stfrench@microsoft.com>
2026-04-22smb: client: scope end_of_dacl to CIFS_DEBUG2 use in parse_daclMichael Bommarito1-3/+3
After validate_dacl() was factored out in commit 149822e5541c, the local end_of_dacl in parse_dacl() is only read by the dump_ace() call under #ifdef CONFIG_CIFS_DEBUG2. With CIFS_DEBUG2 off the variable is assigned but never used, which gcc -W=1 flags as -Wunused-but-set-variable. Remove the local and compute the end-of-dacl pointer inline at the single call site inside the existing CIFS_DEBUG2 guard. No functional change: when CIFS_DEBUG2 is enabled the argument value is identical to what the removed local carried; when CIFS_DEBUG2 is disabled the code was already dead. Reported-by: kernel test robot <lkp@intel.com> Closes: https://lore.kernel.org/oe-kbuild-all/202604220046.tGkRxVtS-lkp@intel.com/ Fixes: 149822e5541c ("smb: client: validate the whole DACL before rewriting it in cifsacl") Signed-off-by: Michael Bommarito <michael.bommarito@gmail.com> Assisted-by: Claude:claude-opus-4-7 Signed-off-by: Steve French <stfrench@microsoft.com>
2026-04-22smb: client: fix (remove) drop_dir_cache module parameterEnzo Matsumiya6-45/+59
Being a module parameter, it's possible to do: # modprobe cifs drop_dir_cache=1 Which will lead to a crash, because cifs_tcp_ses_list hasn't been initialized yet: [ 168.242624] BUG: kernel NULL pointer dereference, address: 0000000000000010 [ 168.242952] #PF: supervisor read access in kernel mode [ 168.243175] #PF: error_code(0x0000) - not-present page [ 168.243394] PGD 0 P4D 0 [ 168.243524] Oops: Oops: 0000 [#1] SMP NOPTI [ 168.243703] CPU: 2 UID: 0 PID: 1105 Comm: modprobe Not tainted 7.0.0-lku #5 PREEMPT(lazy) [ 168.244054] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.17.0-2-g4f253b9b-prebuilt.qemu.org 04/01/2014 [ 168.244557] RIP: 0010:cifs_param_set_drop_dir_cache+0x7c/0x100 [cifs] ... [ 168.248785] Call Trace: [ 168.248915] <TASK> [ 168.249023] parse_args+0x285/0x3a0 [ 168.249204] ? __pfx_unknown_module_param_cb+0x10/0x10 [ 168.249448] load_module+0x192b/0x1bb0 [ 168.249637] ? __pfx_unknown_module_param_cb+0x10/0x10 [ 168.249882] ? kernel_read_file+0x27d/0x2b0 [ 168.250088] init_module_from_file+0xce/0xf0 [ 168.250291] idempotent_init_module+0xfb/0x2f0 [ 168.250496] __x64_sys_finit_module+0x5a/0xa0 [ 168.250694] do_syscall_64+0xe0/0x5a0 [ 168.250863] ? exc_page_fault+0x65/0x160 [ 168.251050] entry_SYSCALL_64_after_hwframe+0x77/0x7f [ 168.251284] RIP: 0033:0x7fcaa12b774d Instead of fixing this with some kind of "is module initialized" approach, this patch instead moves that functionality to procfs, setting a write op for the existing open_dirs entry, where writing a 0 to it will drop the cached directory entries. Also make it available only when CONFIG_CIFS_DEBUG=y. A small change needed now is to not call flush_delayed_work() on invalidate_all_cached_dirs() when called from procfs (can't sleep in that context). So add a @sync arg to invalidate_all_cached_dirs() to control when to flush the delayed works. Fixes: dde6667fa3c8 ("smb: client: add drop_dir_cache module parameter to invalidate cached dirents") Signed-off-by: Enzo Matsumiya <ematsumiya@suse.de> Signed-off-by: Steve French <stfrench@microsoft.com>
2026-04-22smb: client: require a full NFS mode SID before reading mode bitsMichael Bommarito1-0/+1
parse_dacl() treats an ACE SID matching sid_unix_NFS_mode as an NFS mode SID and reads sid.sub_auth[2] to recover the mode bits. That assumes the ACE carries three subauthorities, but compare_sids() only compares min(a, b) subauthorities. A malicious server can return an ACE with num_subauth = 2 and sub_auth[] = {88, 3}, which still matches sid_unix_NFS_mode and then drives the sub_auth[2] read four bytes past the end of the ACE. Require num_subauth >= 3 before treating the ACE as an NFS mode SID. This keeps the fix local to the special-SID mode path without changing compare_sids() semantics for the rest of cifsacl. Fixes: e2f8fbfb8d09 ("cifs: get mode bits from special sid on stat") Cc: stable@vger.kernel.org Assisted-by: Claude:claude-opus-4-6 Signed-off-by: Michael Bommarito <michael.bommarito@gmail.com> Signed-off-by: Steve French <stfrench@microsoft.com>
2026-04-22smb: client: validate the whole DACL before rewriting it in cifsaclMichael Bommarito1-31/+85
build_sec_desc() and id_mode_to_cifs_acl() derive a DACL pointer from a server-supplied dacloffset and then use the incoming ACL to rebuild the chmod/chown security descriptor. The original fix only checked that the struct smb_acl header fits before reading dacl_ptr->size or dacl_ptr->num_aces. That avoids the immediate header-field OOB read, but the rewrite helpers still walk ACEs based on pdacl->num_aces with no structural validation of the incoming DACL body. A malicious server can return a truncated DACL that still contains a header, claims one or more ACEs, and then drive replace_sids_and_copy_aces() or set_chmod_dacl() past the validated extent while they compare or copy attacker-controlled ACEs. Factor the DACL structural checks into validate_dacl(), extend them to validate each ACE against the DACL bounds, and use the shared validator before the chmod/chown rebuild paths. parse_dacl() reuses the same validator so the read-side parser and write-side rewrite paths agree on what constitutes a well-formed incoming DACL. Fixes: bc3e9dd9d104 ("cifs: Change SIDs in ACEs while transferring file ownership.") Cc: stable@vger.kernel.org Assisted-by: Claude:claude-opus-4-6 Assisted-by: Codex:gpt-5-4 Signed-off-by: Michael Bommarito <michael.bommarito@gmail.com> Signed-off-by: Steve French <stfrench@microsoft.com>