| Age | Commit message (Collapse) | Author | Files | Lines |
|
Pull smb server updates from Steve French:
"This is mostly a correctness and compatibility update for ksmbd's
SMB2/3 lease, oplock, durable handle, compound request, CREATE,
rename, stream and share-mode handling.
A large part of the series fixes cases found by smbtorture where ksmbd
diverged from the SMB2/3 protocol requirements.
The main changes are:
- Rework SMB2 lease state handling so lease state is shared per
ClientGuid/LeaseKey across opens, with better validation of lease
create contexts, ACK handling, epochs, break-in-progress reporting,
v2 lease notification routing, and chained lease breaks
- Fix several oplock break corner cases, including ACK validation,
timeout downgrade behavior, level-II break handling on unlink,
share-conflict lease breaks, and read-control/stat-open behavior
- Fix durable handle behavior around delete-on-close, stale
reconnects, reconnect context parsing, oplock/lease break
invalidation, and durable v2 AppInstanceId replacement
- Fix compound request handling so related commands propagate failed
statuses correctly, preserve response framing across chained
errors, keep compound FIDs across READ/WRITE/FLUSH, and send
interim STATUS_PENDING where clients expect cancellable compound
I/O
- Tighten CREATE and stream semantics, including create attribute
validation, allocation size reporting, explicit create security
descriptors, unnamed DATA stream handling, stream directory
validation, and stream delete sharing against the base file
- Fix rename and metadata behavior, including parent directory
sharing checks, denying directory rename with open children, and
preserving SMB ChangeTime across rename for open handles
- Fix two important safety issues: a multichannel byte-range lock
list owner race that could lead to use-after-free, and an NTLMv2
session key update before authentication proof validation
- Fix a concurrent SMB2 NEGOTIATE preauth use-after-free, a UBSAN
warning in compression capability parsing, a false hung-task
warning in the durable handle scavenger, endian debug logging,
Smatch indentation warnings, and kernel-doc warnings
- Increase the default SMB3 transaction size from 1MB to 4MB to
better match modern read/write negotiation and improve sequential
I/O behavior"
* tag 'v7.2-rc-part2-smb3-server-fixes' of git://git.samba.org/ksmbd: (50 commits)
ksmbd: fix kernel-doc warnings in smb2_lease_break_noti()
ksmbd: fix inconsistent indenting warnings
ksmbd: validate NTLMv2 response before updating session key
ksmbd: increase SMB3_DEFAULT_TRANS_SIZE from 1MB to 4MB
ksmbd: fix UBSAN array-index-out-of-bounds in decode_compress_ctxt()
ksmbd: sleep interruptibly in the durable handle scavenger
ksmbd: start file id allocation at 1
ksmbd: treat read-control opens as stat opens only for leases
ksmbd: validate :: stream type against directory create
ksmbd: break conflicting-open leases only as far as needed
ksmbd: break handle caching for share conflicts
ksmbd: normalize ungrantable lease states
ksmbd: return oplock protocol error for level II ack
ksmbd: avoid level II oplock break notification on unlink
ksmbd: downgrade oplock after break timeout
ksmbd: apply create security descriptor first
ksmbd: return requested create allocation size
ksmbd: tighten create file attribute validation
ksmbd: reject empty-attribute synchronize-only create
ksmbd: honor stream delete sharing for base file
...
|
|
QueryDirectory responses today are stored in one of two fixed
sized buffers: smallbuf (448 bytes) or bigbuf (16KB). These are
borrowed from server struct and are not sufficient for large-sized
query dir operations.
With this change we will now define a new buffer type specifically
for cifs_search_info to hold variable sized responses. These will
be allocated by kmalloc and freed by kfree.
Signed-off-by: Shyam Prasad N <sprasad@microsoft.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
|
|
cifs_setsize() updates the local inode size after SetEOF succeeds. It also
used the new EOF as a local i_blocks estimate, but extending EOF does not
prove that the intervening range was allocated.
For example, after writing 1 MiB and then extending EOF to 10 MiB, the
client can report the file as fully allocated even though the server still
reports a much smaller AllocationSize:
$ dd if=/dev/zero of=test bs=1M count=1
$ truncate -s 10M test && stat -c 'size=%s blocks=%b' test
$ stat --cached=never -c 'size=%s blocks=%b' test
client stat: size=10485760 blocks=20480
server stat: size=10485760 blocks=2056
client stat after revalidation: size=10485760 blocks=2056
A later attribute revalidation may correct i_blocks, but callers such as
xfstests generic/495 invoke swapon immediately after truncate. The swapfile
hole check can therefore observe the inflated local i_blocks value and
accept a sparse file.
Do not grow i_blocks from cifs_setsize() on EOF extension. Only clamp it
on shrink; allocation growth must come from write completion or from
server-reported AllocationSize.
With this change, EOF extension no longer makes a sparse file appear
fully allocated before the next attribute revalidation, and xfstests
generic/495 no longer accepts it through the inflated local i_blocks value.
Signed-off-by: Huiwen He <hehuiwen@kylinos.cn>
Reviewed-by: ChenXiaoSong <chenxiaosong@kylinos.cn>
Signed-off-by: Steve French <stfrench@microsoft.com>
|
|
smb2_set_sparse() converts every FSCTL_SET_SPARSE failure to false and
marks sparse support as broken for the share. Callers therefore report
EOPNOTSUPP even for errors such as ENOSPC or EACCES, and later sparse
operations remain disabled until the share is unmounted.
Return the SMB2 ioctl error directly. Set broken_sparse_sup only for
EOPNOTSUPP, which indicates that the server does not support the request.
Update smb3_punch_hole() to propagate the error returned by
smb2_set_sparse().
Fixes: 3d1a3745d8ca ("Add sparse file support to SMB2/SMB3 mounts")
Signed-off-by: Huiwen He <hehuiwen@kylinos.cn>
Reviewed-by: ChenXiaoSong <chenxiaosong@kylinos.cn>
Signed-off-by: Steve French <stfrench@microsoft.com>
|
|
receive_encrypted_standard() allocates next_buffer before checking
whether the number of compound PDUs already reached MAX_COMPOUND. If
the limit check fails, the function returns immediately and the newly
allocated next_buffer is not assigned to server->smallbuf/server->bigbuf,
making it leaked.
Move the MAX_COMPOUND check before allocating next_buffer.
Fixes: b24df3e30cbf ("cifs: update receive_encrypted_standard to handle compounded responses")
Cc: stable@vger.kernel.org
Signed-off-by: Haoxiang Li <haoxiang_li2024@163.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
|
|
Currently, __reconnect_target_locked() prints the error pointer
'hostname' using PTR_ERR() and %ld, printing only the error code.
Fix this by using by using %pe to print the error symbolic code instead,
and remove PTR_ERR().
Signed-off-by: Fredric Cover <fredric.cover.lkernel@gmail.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
|
|
FALLOC_FL_ALLOCATE_RANGE identifies the default fallocate
allocation mode and is defined as zero.
Use the symbolic name instead of a literal zero in
smb3_fallocate() to make the mode dispatch clearer. This
does not change behavior.
Signed-off-by: Huiwen He <hehuiwen@kylinos.cn>
Reviewed-by: ChenXiaoSong <chenxiaosong@kylinos.cn>
Signed-off-by: Steve French <stfrench@microsoft.com>
|
|
Add various defines for AAPL open context, e.g. for queries of
server capabilities.
Signed-off-by: Steve French <stfrench@microsoft.com>
|
|
Ownership (chown) and group (chgrp) modifications were being ignored when
mounting with SMB3 POSIX Extensions unless CIFS_MOUNT_CIFS_ACL or
CIFS_MOUNT_MODE_FROM_SID were also explicitly set.
Fix this by checking for posix_extensions in cifs_setattr_nounix() when
updating UID and GID, ensuring that id_mode_to_cifs_acl() is called to map
and set the ownership/group information on the server.
Cc: stable@vger.kernel.org
Signed-off-by: Ralph Boehme <slow@samba.org>
Signed-off-by: Steve French <stfrench@microsoft.com>
|
|
In id_mode_to_cifs_acl(), aclflag was initialized to CIFS_ACL_DACL by default.
This forced the client to request setting the DACL even when only an ownership
(chown) or group (chgrp) change was being performed.
Let build_sec_desc() do the proper flag calculation by initializing aclflag
to 0. build_sec_desc() sets the appropriate bits (CIFS_ACL_OWNER, CIFS_ACL_GROUP,
or CIFS_ACL_DACL) depending on what actually changed. During ownership transfer,
CIFS_ACL_DACL is only set if replace_sids_and_copy_aces() actually replaces the
SIDs inside any of the DACL's ACEs.
If build_sec_desc() results in aclflag being 0 (meaning no changes were mapped),
exit early to avoid sending an empty security descriptor update to the server.
Signed-off-by: Ralph Boehme <slow@samba.org>
Signed-off-by: Steve French <stfrench@microsoft.com>
|
|
kernel test robot report missing kernel-doc descriptions for the 'wait_ack'
and 'inc_epoch' parameters of smb2_lease_break_noti():
Warning: fs/smb/server/oplock.c:937 function parameter 'wait_ack' not
described in 'smb2_lease_break_noti'
Warning: fs/smb/server/oplock.c:937 function parameter 'inc_epoch' not
described in 'smb2_lease_break_noti'
Document both parameters to silence the warnings.
Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Namjae Jeon <linkinjeon@kernel.org>
Signed-off-by: Steve French <stfrench@microsoft.com>
|
|
Detected by Smatch.
fs/smb/server/oplock.c:1446 smb_grant_oplock()
warn: inconsistent indenting
Reported-by: Dan Carpenter <error27@gmail.com>
Signed-off-by: Namjae Jeon <linkinjeon@kernel.org>
Signed-off-by: Steve French <stfrench@microsoft.com>
|
|
ksmbd_auth_ntlmv2() derives the NTLMv2 session key into
sess->sess_key before it verifies the NTLMv2 response.
ksmbd_decode_ntlmssp_auth_blob() then continues into KEY_XCH even
when ksmbd_auth_ntlmv2() failed.
With SMB3 multichannel binding, the failed authentication operates on
an existing session and the session setup error path does not expire
binding sessions. A client can send a binding session setup with a
bad NT proof and KEY_XCH and still modify sess->sess_key before
STATUS_LOGON_FAILURE is returned.
Relevant path:
smb2_sess_setup()
-> conn->binding = true
-> ntlm_authenticate()
-> session_user()
-> ksmbd_decode_ntlmssp_auth_blob()
-> ksmbd_auth_ntlmv2()
-> calc_ntlmv2_hash()
-> hmac_md5_usingrawkey(..., sess->sess_key)
-> crypto_memneq() returns mismatch
-> KEY_XCH arc4_crypt(..., sess->sess_key, ...)
-> out_err without expiring the binding session
Derive the base session key into a local buffer and copy it to
sess->sess_key only after the proof matches. Return immediately on
authentication failure so KEY_XCH is only processed after successful
authentication.
Fixes: e2f34481b24d ("cifsd: add server-side procedures for SMB3")
Fixes: f9929ef6a2a5 ("ksmbd: add support for key exchange")
Cc: stable@vger.kernel.org
Signed-off-by: Haofeng Li <lihaofeng@kylinos.cn>
Reviewed-by: ChenXiaoSong <chenxiaosong@kylinos.cn>
Acked-by: Namjae Jeon <linkinjeon@kernel.org>
Signed-off-by: Steve French <stfrench@microsoft.com>
|
|
This patch raises `SMB3_DEFAULT_TRANS_SIZE` to 4MB to align it with
`smb2 max read/write`. This allows better I/O negotiation with modern
clients and improves sequential read/write performance on high-speed
networks.
Signed-off-by: Namjae Jeon <linkinjeon@kernel.org>
Signed-off-by: Steve French <stfrench@microsoft.com>
|
|
decode_compress_ctxt() walks CompressionAlgorithms[] using the client
supplied CompressionAlgorithmCount. That field is declared in
struct smb2_compression_capabilities_context as a fixed 4-element array,
but the number of algorithms is actually variable and clients such as
Windows advertise more than four (e.g. LZ77, LZ77+Huffman, LZNT1,
Pattern_V1 and LZ4).
The on-wire context length is already validated, so the access is within
the received buffer, but indexing the statically sized [4] array makes
UBSAN report an out-of-bounds access:
UBSAN: array-index-out-of-bounds in smb2pdu.c:1122:48
index 4 is out of range for type '__le16 [4]'
Call Trace:
smb2_handle_negotiate+0xda7/0xde0 [ksmbd]
ksmbd_smb_negotiate_common+0x27b/0x3e0 [ksmbd]
smb2_negotiate_request+0x14/0x20 [ksmbd]
handle_ksmbd_work+0x181/0x500 [ksmbd]
Walk the algorithms through a pointer so the fixed-array bounds check is
not applied, while keeping the existing length validation that bounds the
loop to the data actually received.
Signed-off-by: Namjae Jeon <linkinjeon@kernel.org>
Signed-off-by: Steve French <stfrench@microsoft.com>
|
|
The durable handle scavenger kthread waits up to DURABLE_HANDLE_MAX_TIMEOUT
(300 seconds) between scans using wait_event_timeout(), which sleeps in
TASK_UNINTERRUPTIBLE. When there are no durable handles pending expiry the
task stays in D state far longer than 120 seconds, so the hung task
detector prints a bogus "task ksmbd-durable-s blocked for more than 120
seconds" warning with a backtrace, even though the thread is only idle.
Use wait_event_interruptible_timeout() so the thread sleeps in
TASK_INTERRUPTIBLE, which the hung task detector ignores. This also suits
the already-freezable kthread. Treat a negative return (e.g. -ERESTARTSYS)
like a timeout when recomputing the next wake interval.
Signed-off-by: Namjae Jeon <linkinjeon@kernel.org>
Signed-off-by: Steve French <stfrench@microsoft.com>
|
|
ksmbd allocates both the volatile id (per-session file table) and the
persistent id (global file table) with idr_alloc_cyclic() starting at 0.
The first open after the module loads therefore gets volatile id 0 and
persistent id 0, and ksmbd returns an SMB2 FileId of {0, 0} in the create
response.
Clients treat an all-zero FileId as a null handle. smbtorture's
smb2_util_handle_empty() considers {0, 0} empty, so tests that guard the
close with it (e.g. smb2.oplock.statopen1, smb2.lease.statopen*) never
close that first handle. The leaked open keeps the inode's oplock count
non-zero, so a later batch oplock request on the same file is downgraded
to level II and the test fails.
Start the id allocation at 1 (KSMBD_START_FID) so no handle is ever
assigned a {0, 0} FileId, matching the behaviour of other SMB servers.
Signed-off-by: Namjae Jeon <linkinjeon@kernel.org>
Signed-off-by: Steve French <stfrench@microsoft.com>
|
|
A second open that requests only metadata-level access must not break
the existing caching state. ksmbd already skips the break for such opens
via fp->attrib_only (FILE_READ_ATTRIBUTES,
FILE_WRITE_ATTRIBUTES and FILE_SYNCHRONIZE).
An open requesting only READ_CONTROL (reading the security descriptor)
must be treated differently depending on the existing caching state.
smbtorture smb2.lease.statopen4 expects a read-control open NOT to break
a caching lease, while smb2.oplock.statopen1 expects the same open to
break a batch oplock. So READ_CONTROL is a stat open for leases but not
for oplocks.
Extend the stat-open break-skip in smb_grant_oplock() to also cover a
read-control-only open, but only when the existing holder is a lease.
The global fp->attrib_only flag (used for share-mode, rename and truncate
decisions) is left unchanged so oplock behaviour is preserved.
Signed-off-by: Namjae Jeon <linkinjeon@kernel.org>
Signed-off-by: Steve French <stfrench@microsoft.com>
|
|
smb2.streams.dir opens <dir>::$DATA with FILE_DIRECTORY_FILE and expects
STATUS_NOT_A_DIRECTORY, then opens <dir>::$DATA without it and expects
STATUS_FILE_IS_A_DIRECTORY.
Commit "treat unnamed DATA stream as base file" canonicalizes the ::$DATA
suffix to a NULL stream name so the open continues through the base-file
path. That skipped the stream/directory type validation, which was
guarded by "if (stream_name)", so opening a directory's ::$DATA stream
with FILE_DIRECTORY_FILE incorrectly returned STATUS_OK and a plain open
of it no longer reported STATUS_FILE_IS_A_DIRECTORY.
parse_stream_name() still records the explicit $DATA type in s_type even
when it clears stream_name. Run the data-stream vs directory validation
whenever s_type is DATA_STREAM, not only when stream_name is set, so the
canonicalized ::$DATA open is rejected with the correct status.
Signed-off-by: Namjae Jeon <linkinjeon@kernel.org>
Signed-off-by: Steve French <stfrench@microsoft.com>
|
|
smb2.lease.oplock and smb2.lease.breaking1 hold a lease and then issue a
single conflicting open on the same file. The held lease must break one
step to drop write caching (RWH->RH, RW->R) and then stop, so
lease_break_info.count is 1 and the lease keeps its read/handle caching.
ksmbd instead cascaded the break all the way down to none
(e.g. RWH->RH->R->none), so the break count was 2 or 3 and the reported
lease state ended at 0. Commit "chain pending lease breaks before waking
waiters" forces break_level to SMB2_OPLOCK_LEVEL_NONE for any non-lease
open against a handle-caching lease, which drives oplock_break()'s retry
loop down to none even when only one open is contending.
Drop that break_level override so a conflicting open breaks a lease only
to its own compatible level (level II, i.e. RH/R).
A deeper break is still required when a truncating open is also waiting
behind the same lease break. smb2.lease.breaking3 keeps a normal open
pending through RWH->RH and an overwrite open pending behind it, and
expects the lease to continue RH->R->none before either open completes.
The overwrite waiter sets open_trunc on the lease while it blocks on the
pending break, so extend the retry loop to chain another break while that
truncating waiter still needs the lease at none. The per-break open_trunc
snapshot stays cleared, so the cascade steps down (RH->R->none) instead of
collapsing straight to none, and the normal open stays pending until the
lease is fully broken.
Signed-off-by: Namjae Jeon <linkinjeon@kernel.org>
Signed-off-by: Steve French <stfrench@microsoft.com>
|
|
smb2.lease.break_twice first opens a file with an RHW lease and then tries
a second open with restrictive sharing. That open must fail with a sharing
violation, but the existing lease should be broken from RHW to RW because
only handle caching conflicts with the requested sharing.
ksmbd used the normal write-cache break calculation for this path, so RHW
was broken to RH. The following successful open then did not generate the
expected second break from RW to R.
Pass share-conflict context into the lease break helper and, for lease
breaks caused by sharing, drop only SMB2_LEASE_HANDLE_CACHING from the
current lease state. Other break paths keep the existing write/truncate
break behavior.
A share-conflict break must also remain a single break. The triggering
open fails with a sharing violation and is never granted, so there is no
target oplock level to converge on. The lease break retry loop, however,
keeps breaking while the lease level is still above req_op_level, which
broke RHW all the way down to R in one open (lease_break_info.count became
2 instead of 1). Skip the again loop for share-conflict breaks so the
sharing open produces exactly one RHW->RW break and the later successful
open produces the separate RW->R break the test expects.
Signed-off-by: Namjae Jeon <linkinjeon@kernel.org>
Signed-off-by: Steve French <stfrench@microsoft.com>
|
|
smb2.lease.request verifies which SMB2 lease state combinations are granted
by the server. Requests for H-only, W-only, and HW leases are valid lease
state bitmasks, but they are not grantable combinations and should be
returned as lease state none.
ksmbd only checked that the requested bits were inside the SMB2 lease state
mask. As a result it could grant H-only, W-only, or HW requests and return
non-zero lease states where the client expects no lease.
Keep the bitmask validation, but normalize ungrantable combinations to zero
before allocating or looking up the lease. The grantable combinations
remain unchanged: R, RH, RW, and RHW.
Signed-off-by: Namjae Jeon <linkinjeon@kernel.org>
Signed-off-by: Steve French <stfrench@microsoft.com>
|
|
SMB2 level II to none oplock breaks do not require an acknowledgment from
the client. smb2.oplock.levelii500 intentionally acknowledges such a break
and expects the server to reject it with STATUS_INVALID_OPLOCK_PROTOCOL.
ksmbd drops the local level II oplock to none immediately after sending the
break notification because it does not wait for an ACK. When the client
then sends the invalid ACK, smb20_oplock_break_ack() sees that the oplock
is not in OPLOCK_ACK_WAIT state and returns STATUS_INVALID_DEVICE_STATE
before checking the current oplock level.
If the oplock is already none when an unexpected SMB2 oplock break ACK
arrives, report STATUS_INVALID_OPLOCK_PROTOCOL. Keep the existing
STATUS_INVALID_DEVICE_STATE response for other unexpected non-wait states.
Signed-off-by: Namjae Jeon <linkinjeon@kernel.org>
Signed-off-by: Steve French <stfrench@microsoft.com>
|
|
smb2_util_unlink() opens the target with FILE_DELETE_ON_CLOSE and then
closes that handle. Other clients can also mark a file for delete with
SMB2 SET_INFO FileDispositionInformation.
When these unlink paths break existing SMB2 level II oplocks, ksmbd sends
an unsolicited SMB2_OPLOCK_BREAK notification to none. This races with the
synchronous CREATE or SET_INFO response expected by the client, and
smbtorture reports NT_STATUS_INVALID_NETWORK_RESPONSE while running
smb2.oplock.exclusive2.
SMB2 level II oplock breaks do not require an acknowledgment in the delete
path. Keep lease handling unchanged, but drop plain SMB2 level II oplocks
locally for unlink requests without sending a break notification. Normal
write/truncate paths still send the level II to none notification,
preserving the behavior covered by smb2.oplock.levelII500.
Signed-off-by: Namjae Jeon <linkinjeon@kernel.org>
Signed-off-by: Steve French <stfrench@microsoft.com>
|
|
smb2.oplock.batch22a opens a file with a batch oplock and then issues a
second open that waits for the oplock break timeout. After the timeout the
second open should succeed, but the granted oplock level must be level II.
When the break times out, oplock_break() returns -ENOENT after invalidating
the previous opener. smb_grant_oplock() went straight to set_lev with the
original requested oplock level, so the second open could be granted a new
batch oplock. Downgrade the requested oplock to level II on the -ENOENT
break-timeout path before granting the oplock to the new open.
A break that completes because the previous owner closed its handle from
the oplock break handler must be distinguished from a real timeout.
smb2.oplock.batch7 closes the first handle during the break wait, and the
second open is then expected to be granted the originally requested batch
oplock. Return -EAGAIN from the non-lease break path when the previous
opener closed during the break wait, recheck sharing in smb_grant_oplock(),
and grant the requested oplock if the close removed the conflict. Real
break timeouts still return -ENOENT and keep the downgrade to level II.
Signed-off-by: Namjae Jeon <linkinjeon@kernel.org>
Signed-off-by: Steve French <stfrench@microsoft.com>
|
|
smb2.create.aclfile creates files with an SMB2_CREATE_SD_BUFFER create
context and expects the resulting security descriptor to match
the descriptor supplied by the client.
ksmbd currently tries to inherit the parent DACL first and only parses
the SMB2_CREATE_SD_BUFFER context when DACL inheritance fails.
If inheritance succeeds, the explicit security descriptor supplied on
create is ignored. This breaks create requests that include owner/group
information in the security descriptor.
Apply the create security descriptor first when the context is present.
Fall back to the existing inherited/default ACL path only when no create
security descriptor was supplied.
Signed-off-by: Namjae Jeon <linkinjeon@kernel.org>
Signed-off-by: Steve French <stfrench@microsoft.com>
|
|
smb2.create.blob sends an SMB2_CREATE_ALLOCATION_SIZE create context with
a 1MiB allocation size and expects the create response AllocationSize field
to match the requested size. smb2.create.open additionally compares the
AllocationSize returned in the CREATE response with the AllocationSize
returned by FILE_ALL_INFORMATION on the same handle.
ksmbd applies the allocation with fallocate(), but then fills both the
create response and handle-based information from stat.blocks << 9. On
filesystems such as ext4 this can include filesystem allocation rounding
and metadata effects, causing a response larger than the SMB2 allocation
size context and a disagreement between the two queries.
Remember the requested allocation size while processing the create context,
store the reported allocation size in struct ksmbd_file, and use it for
both the create response and handle-based allocation size responses. Update
the stored value when FILE_ALLOCATION_INFORMATION changes it, and fall back
to stat.blocks << 9 when no allocation size context was provided.
Signed-off-by: Namjae Jeon <linkinjeon@kernel.org>
Signed-off-by: Steve French <stfrench@microsoft.com>
|
|
smb2.create.gentest checks each create FileAttributes bit independently and
expects FILE_ATTRIBUTE_INTEGRITY_STREAM and FILE_ATTRIBUTE_NO_SCRUB_DATA to
be rejected with STATUS_INVALID_PARAMETER.
ksmbd validates create FileAttributes against FILE_ATTRIBUTE_MASK, which
includes those bits. It also rejects only requests that have no known
attribute bit at all, so a request containing both known and unknown bits
can pass validation.
Use a create-specific attribute mask that excludes INTEGRITY_STREAM and
NO_SCRUB_DATA, and reject any bit outside that mask.
Signed-off-by: Namjae Jeon <linkinjeon@kernel.org>
Signed-off-by: Steve French <stfrench@microsoft.com>
|
|
smb2.create.gentest checks each desired access bit independently and
expects an open that requests only SYNCHRONIZE with CreateDisposition
OPEN_IF and FileAttributes 0 to fail with STATUS_ACCESS_DENIED.
Rejecting all SYNCHRONIZE-only opens is too broad: SYNCHRONIZE does not
imply read, write, or delete data access, and
smb2.sharemode.sharemode-access expects a SYNCHRONIZE-only open to succeed
when it does not conflict with the existing share mode.
Limit the rejection to the gentest create shape: SYNCHRONIZE-only access,
OPEN_IF disposition, and no file attributes. Other synchronize-only opens
are handled by the normal permission and share-mode checks.
Signed-off-by: Namjae Jeon <linkinjeon@kernel.org>
Signed-off-by: Steve French <stfrench@microsoft.com>
|
|
smb2.streams.delete opens an alternate data stream without
FILE_SHARE_DELETE and then tries to delete the base file. Windows rejects
the base-file delete with STATUS_SHARING_VIOLATION while the stream handle
is open. ksmbd tracks stream opens on the same ksmbd_inode as the base
file, but the delete-on-close path only checked delete access on the base
handle before marking the inode delete-pending. As a result, deleting
the base file succeeded even though an open stream handle denied delete
sharing. Add a helper to detect open stream handles on the same inode that
do not allow FILE_SHARE_DELETE, and reject base-file delete pending and
DELETE opens with a sharing violation in that case.
Signed-off-by: Namjae Jeon <linkinjeon@kernel.org>
Signed-off-by: Steve French <stfrench@microsoft.com>
|
|
smb2.compound_async.write_write and smb2.compound_async.read_read expect
the last I/O request in a compound request to become cancellable before
its final response is received. smb clients mark a request cancellable
after receiving an interim STATUS_PENDING response.
ksmbd handled the last READ/WRITE synchronously and returned the final
response directly, so the client never observed STATUS_PENDING and
req->cancel.can_cancel remained false.
For the last READ or WRITE in a compound request, register the work briefly
as async and send a STATUS_PENDING interim response before continuing with
the normal synchronous completion. The final READ/WRITE response remains
unchanged.
Signed-off-by: Namjae Jeon <linkinjeon@kernel.org>
Signed-off-by: Steve French <stfrench@microsoft.com>
|
|
ksmbd_close_fd() marks an open file as FP_CLOSED and drops the file table
reference. If another in-flight request still holds a reference, the final
close is deferred until that request drops its reference.
The function currently returns -EINVAL in that deferred-final-close case
because fp is cleared when the reference count does not reach zero. That
turns a valid close into STATUS_FILE_CLOSED.
smb2.compound_find.compound_find_close sends QUERY_DIRECTORY and then
closes the same directory handle before receiving the find response.
The query holds a reference while it builds the response, so close must
mark the handle closed and return success even though final teardown is
delayed. Track whether the handle was successfully transitioned to
FP_CLOSED and return success when only the final close is deferred.
Signed-off-by: Namjae Jeon <linkinjeon@kernel.org>
Signed-off-by: Steve French <stfrench@microsoft.com>
|
|
set_smb2_rsp_status() resets the response iov and compound offsets before
building an error response. That is fine for a single request, but it
corrupts a compound response when an error is detected after an earlier
compound element has already been completed.
smb2.compound.invalid4 sends a READ as the first compound element and a
bogus command as the second one. The READ response must remain in
the compound response with STATUS_END_OF_FILE, followed by the bogus
command response with STATUS_INVALID_PARAMETER. Resetting the response
state for the second command breaks the compound framing and the client
reports NT_STATUS_INVALID_NETWORK_RESPONSE.
When setting an error for a chained command, update and pin only
the current compound response slot instead of resetting the whole response.
Signed-off-by: Namjae Jeon <linkinjeon@kernel.org>
Signed-off-by: Steve French <stfrench@microsoft.com>
|
|
FSCTL_CREATE_OR_GET_OBJECT_ID returned a dummy successful response without
checking whether the request handle was valid. That let an invalid related
compound handle succeed in smb2.compound.related5, although the client
expected STATUS_FILE_CLOSED.
Look up the file handle before building the object id response and fail
with STATUS_FILE_CLOSED when the handle is invalid or already closed.
Signed-off-by: Namjae Jeon <linkinjeon@kernel.org>
Signed-off-by: Steve French <stfrench@microsoft.com>
|
|
In a related compound request, later commands can refer to the file handle
from an earlier command using the related FID value. If the earlier
command fails without producing a valid compound FID, the later related
commands must fail with the same status instead of operating on an invalid
or stale handle.
smb2.compound.related4 sends CREATE followed by IOCTL, CLOSE and SET_INFO.
The CREATE is expected to fail with STATUS_ACCESS_DENIED, and the remaining
related commands are expected to return STATUS_ACCESS_DENIED as well. ksmbd
only stored the compound FID on successful CREATE and did not remember
failed compound statuses.
Store the failed status in the work item and make related handle-based
requests fail immediately with that status only when the compound FID is
invalid. Also preserve and consume the related FID across successful
FLUSH, READ and WRITE requests whose responses do not carry a file id. Keep
a valid compound FID across non-close failures so later related commands
can continue to use the handle.
When extracting the FID from a successful READ, WRITE or FLUSH request, use
the request structure matching the SMB2 command: READ and WRITE place
PersistentFileId and VolatileFileId at a different offset than FLUSH, so a
single smb2_flush_req cast can save the wrong value as compound_fid and
make the following related request fail with STATUS_FILE_CLOSED
(smb2.compound_async.write_write after smb2.compound_async.flush_flush).
Only update the saved compound FID when the request carries a valid
volatile FID. otherwise an all-ones related FID would overwrite the CREATE
FID and break smb2.compound.related6.
Signed-off-by: Namjae Jeon <linkinjeon@kernel.org>
Signed-off-by: Steve French <stfrench@microsoft.com>
|
|
Windows denies renaming a directory while a file below that directory is
still open. smb2.rename.rename_dir_openfile checks this by keeping a file
handle open under the directory and then attempting to rename the directory
handle. ksmbd did not check open children before calling vfs_rename(), so
the rename incorrectly succeeded.
For non-POSIX clients, scan the global open file table for active handles
whose dentries are below the directory being renamed. If any child is
open, fail the rename with -EACCES so the client receives
STATUS_ACCESS_DENIED.
Signed-off-by: Namjae Jeon <linkinjeon@kernel.org>
Signed-off-by: Steve French <stfrench@microsoft.com>
|
|
When renaming a file, some existing opens on the parent directory must
block the rename with STATUS_SHARING_VIOLATION. This includes parent
directory handles opened with DELETE access and handles opened without
FILE_SHARE_DELETE.
ksmbd checked only the parent's desired access for FILE_DELETE. That
handled smb2.rename.share_delete_and_delete_access, but missed the case
where the parent directory was opened without delete access and without
delete sharing, so smb2.rename.no_share_delete_no_delete_access incorrectly
succeeded.
Attribute-only parent opens, however, must not block the rename.
smb2.rename.msword opens the parent directory with only SYNCHRONIZE and
FILE_READ_ATTRIBUTES, no share access, and then renames an already-open
child file. Windows allows this pattern.
Reject parent directory handles that request DELETE access, and reject
non-attribute-only parent opens that deny FILE_SHARE_DELETE, while allowing
attribute-only parent opens to coexist with child rename.
Signed-off-by: Namjae Jeon <linkinjeon@kernel.org>
Signed-off-by: Steve French <stfrench@microsoft.com>
|
|
inode ctime is updated when a file is renamed. ksmbd returned that
ctime directly as SMB2 ChangeTime for handle-based query information.
This makes ChangeTime change after a rename through an already-open
handle, while Windows keeps the handle's ChangeTime stable for this
case.
Store the SMB ChangeTime in struct ksmbd_file when the handle is opened
and use that value for create, close, and handle-based query information
responses. If a client explicitly sets FILE_BASIC_INFORMATION
ChangeTime, update the stored value as well.
This fixes smbtorture smb2.rename.simple_modtime, which expects
ChangeTime and LastWriteTime to remain unchanged after renaming an
already-open file.
Signed-off-by: Namjae Jeon <linkinjeon@kernel.org>
Signed-off-by: Steve French <stfrench@microsoft.com>
|
|
The SMB2_CREATE_APP_INSTANCE_ID create context is used with durable v2
opens to identify another open from the same application instance. When
a new durable v2 open arrives with the same AppInstanceId as an existing
open, the server should close the previous open without sending an
oplock break notification.
ksmbd ignored this create context. A second durable v2 batch oplock open
with the same AppInstanceId therefore went through the normal competing
open path and sent an oplock break to the first opener. smbtorture
smb2.durable-v2-open.app-instance expects no oplock break and then
expects the old handle to be closed.
Parse and store AppInstanceId for durable v2 opens. Before creating the
new open, find an existing file with the same AppInstanceId and close it
through the normal close teardown path without issuing an oplock break.
Signed-off-by: Namjae Jeon <linkinjeon@kernel.org>
Signed-off-by: Steve French <stfrench@microsoft.com>
|
|
SMB2 create context DataLength describes only the create context data
payload. It does not include the create context header, name field, or
any local padding that exists in ksmbd's helper structures.
ksmbd validated durable reconnect contexts by comparing
DataOffset + DataLength against sizeof the whole helper structure. This
rejects a valid durable v2 reconnect context because the wire DH2C data
is 36 bytes while struct create_durable_handle_reconnect_v2 contains an
extra four byte pad.
Validate the durable context payload length against the corresponding
payload member instead. Also keep the reconnect context authoritative
when a later durable request context is present, matching the existing
durable v1 reconnect behavior.
This fixes smbtorture smb2.durable-v2-open.durable-v2-setinfo, where
the durable v2 reconnect after SET_INFO was rejected with
STATUS_INVALID_PARAMETER.
Signed-off-by: Namjae Jeon <linkinjeon@kernel.org>
Signed-off-by: Steve French <stfrench@microsoft.com>
|
|
When a durable handle is preserved after a disconnect, its oplock state
can still block later opens. If another client opens the same file and
the preserved oplock or lease has to be broken, the old durable handle
must no longer be reconnectable after the break cannot be acknowledged.
ksmbd was treating a missing connection, or an oplock break timeout, as a
successful break only by downgrading the oplock state. The old durable
handle remained reconnectable, so a later durable reconnect for that
stale handle could succeed.
The open path can also see a detached durable handle before the break
notification helpers fully dispose of it. Invalidate such a preserved
durable handle directly when a competing open has to break its batch or
exclusive oplock, while leaving ordinary durable reconnects without a
competing open untouched.
If the old handle still reaches the reconnect path, reject it when the
same inode already has another active open. This matches the
smb2.durable-open.open2-lease/open2-oplock sequence where a later open
replaces the disconnected durable owner and the stale first handle must
not be reclaimed.
Also, reconnect lookup used only the persistent id. A new durable open
can get a persistent id that matches the stale reconnect request after
the old durable state is invalidated. Preserve the disconnected
handle's old volatile id and require durable reconnect contexts to match
it, so a stale reconnect cannot attach to a different durable open.
Windows allows the later open to proceed and rejects the old reconnect
with STATUS_OBJECT_NAME_NOT_FOUND. The smbtorture
smb2.durable-open.oplock test covers this case.
Signed-off-by: Namjae Jeon <linkinjeon@kernel.org>
Signed-off-by: Steve French <stfrench@microsoft.com>
|
|
A durable handle opened with FILE_DELETE_ON_CLOSE is preserved across a
disconnect so it can be reclaimed by a durable reconnect.
smb2.durable-open.delete_on_close2 disconnects such a handle and then
reconnects it, expecting the reconnect to succeed.
When the client does not reconnect but instead opens the same name with a
new delete-on-close create, the preserved handle keeps the file present
with delete-on-close set. ksmbd then rejects the new open with
STATUS_ACCESS_DENIED on the file_present + FILE_DELETE_ON_CLOSE +
OPEN_IF/OVERWRITE_IF path. smb2.durable-open.delete_on_close1 expects this
open to create a fresh, empty file instead, i.e. the disconnected handle's
delete-on-close must take effect first.
Add ksmbd_close_disconnected_durable_delete_on_close(), which closes
disconnected (conn == NULL) durable handles that keep a delete-on-close
file present. The final close promotes S_DEL_ON_CLS to S_DEL_PENDING and
unlinks the file, so a re-resolved path is absent and the new open creates
it fresh. Call it from smb2_open() before the delete-on-close conflict
check, only for the conflicting open shapes. A live (connected) handle
still keeps the file and blocks the open as before.
Signed-off-by: Namjae Jeon <linkinjeon@kernel.org>
Signed-off-by: Steve French <stfrench@microsoft.com>
|
|
smb2_find_context_vals() assumes that callers only search create
contexts when the SMB2 CREATE request contains a non-empty create context
area. That is not always true. a client can send RequestedOplockLevel set
to SMB2_OPLOCK_LEVEL_LEASE without a lease create context.
In that case parse_lease_state() searches for a lease context and
smb2_find_context_vals() starts parsing from offset 0 with length 0,
returning -EINVAL. This makes the open fail with STATUS_INVALID_PARAMETER.
The smbtorture smb2.lease.duplicate_open test hits this while creating
a second file without a lease request.
Return NULL when the request has no create context area so the missing
context is treated the same as any other absent create context. The open
then continues without granting a lease.
Signed-off-by: Namjae Jeon <linkinjeon@kernel.org>
Signed-off-by: Steve French <stfrench@microsoft.com>
|
|
conn->preauth_info is shared connection state (struct
preauth_integrity_info, kmalloc-96) that is allocated and freed by the
SMB2 NEGOTIATE handler and read by the response send path.
smb2_handle_negotiate() allocates conn->preauth_info, and on a
deassemble_neg_contexts() failure kfrees it and sets it to NULL. Both the
allocation and the free/NULL happen under ksmbd_conn_lock(conn) (the
connection srv_mutex), which is held across the whole handler body.
The response send path smb3_preauth_hash_rsp(), called from the send:
block of __handle_ksmbd_work(), reads conn->preauth_info and dereferences
conn->preauth_info->Preauth_HashValue (via
ksmbd_gen_preauth_integrity_hash()) without taking conn_lock. When a
client drives two SMB2 NEGOTIATE requests on the same connection, one
worker can free conn->preauth_info on the failing-negotiate path while a
concurrent send-path worker is reading it, producing a slab
use-after-free read (KASAN-confirmed).
The send-path read tested conn->preauth_info for NULL but raced with the
free that occurs between the NULL check and the dereference, so the NULL
guard alone does not close the window.
Serialize the NEGOTIATE-branch read in smb3_preauth_hash_rsp() under
ksmbd_conn_lock(conn) and re-check conn->preauth_info inside the lock.
Because the negotiate handler holds conn_lock across its kfree + NULL
assignment, a reader that also takes conn_lock either runs fully before
the allocation or fully after the NULL store, and can never observe the
freed-but-not-yet-NULLed pointer. ksmbd_gen_preauth_integrity_hash()
takes no locks itself (it only computes a SHA-512 over the buffer), so
no lock-ordering inversion is introduced, and conn_lock is a sleepable
mutex which is safe on this send path (it already performs network I/O).
Fixes: aa7253c2393f ("ksmbd: fix memory leak in smb2_handle_negotiate")
Signed-off-by: Gil Portnoy <dddhkts1@gmail.com>
Acked-by: Namjae Jeon <linkinjeon@kernel.org>
Signed-off-by: Steve French <stfrench@microsoft.com>
|
|
Most SMB responses need no more than four kvec entries, but every work
item currently allocates a separate four-entry array and frees it after
the response is sent.
Embed the common array in struct ksmbd_work and allocate a larger array
only when a response exceeds the inline capacity. This removes one
allocation and one free from the common request path while preserving
support for larger compound and read responses.
Signed-off-by: Namjae Jeon <linkinjeon@kernel.org>
Signed-off-by: Steve French <stfrench@microsoft.com>
|
|
v2 leases are scoped by ClientGuid. When the same client uses multiple
connections, smbtorture expects lease break notifications to be sent on
the connection associated with the client lease table, not necessarily
on the connection that owns the individual open being broken.
Keep a referenced connection in the lease table and use it for v2 lease
break notifications while it is still active. Fall back to the open's
connection if the table connection is being released.
Signed-off-by: Namjae Jeon <linkinjeon@kernel.org>
Signed-off-by: Steve French <stfrench@microsoft.com>
|
|
The delete paths only marked the opened file delete pending or
delete-on-close. When another client still held a read/handle lease, no
lease break was sent before the delete state changed.
smb2.lease.unlink uses a create request with FILE_DELETE_ON_CLOSE and
expects the second client's unlink to break the first client's RH lease to
R with ACK_REQUIRED set. SetInfo(FileDispositionInformation) has the same
lease-breaking requirement.
Break level-II/read-handle leases before setting delete pending or
delete-on-close so clients are notified before the file is removed.
Signed-off-by: Namjae Jeon <linkinjeon@kernel.org>
Signed-off-by: Steve French <stfrench@microsoft.com>
|
|
v2 lease responses should continue from the client supplied epoch.
Initialize a new v2 lease from the requested epoch plus one so create
responses match the epoch returned by Windows and expected by smbtorture.
For a single chained break sequence, increment the epoch only for the first
break notification. Follow-up breaks such as RH->R and R->NONE in
smb2.lease.v2_breaking3 reuse the same epoch.
Record when a waiter slept behind pending_break and let the later
truncate/open overwrite break consume that marker to reuse the current
epoch instead of assigning a new one.
Do not increment the epoch when a same-client, same-key create asks for
the already granted RH state. The epoch changes only when the granted lease
state changes.
Signed-off-by: Namjae Jeon <linkinjeon@kernel.org>
Signed-off-by: Steve French <stfrench@microsoft.com>
|
|
smb2.lease.breaking4 expects an overwrite against an RH lease to send
RH->NONE lease break notification but complete the triggering create
without waiting for the break ack.
Keep the lease in break-in-progress state until the client eventually
acknowledges the downgrade, but do not hold the overwrite request behind
that ack.
Signed-off-by: Namjae Jeon <linkinjeon@kernel.org>
Signed-off-by: Steve French <stfrench@microsoft.com>
|
|
A pending open can require more than one lease break before the existing
lease becomes compatible with the operation that triggered the break.
smb2.lease.breaking3 expects the server to hold the pending normal open
through RWH->RH and RH->R, while a later overwrite waiter must not
collapse that second break directly to RH->NONE.
Keep pending_break held for lease breaks until the current triggering
operation is compatible with the lease state. Snapshot the truncate request
per oplock_break() call so another waiter cannot overwrite the state of
the active break.
Use the requested oplock level when deciding whether to chain another
break. A second lease open only needs RWH->RH, while a normal none-oplock
open can continue down to R and then NONE.
For non-truncating metadata operations, break leases only down to read
caching. Operations such as delete-on-close need to drop handle caching,
but should not send a second R->NONE break after the client acknowledges
RH->R.
Also send STATUS_PENDING for levelII/read-lease break waiters. An async
SMB2 create becomes cancelable only after the server sends
an NT_STATUS_PENDING interim response. A waiter that blocks behind an
already active lease break must receive the interim response before
sleeping on pending_break, otherwise the client can process a later lease
break while the create request is still not marked pending.
Avoid duplicate interim responses when an overwrite first breaks a write
oplock and then scans levelII/read leases.
Signed-off-by: Namjae Jeon <linkinjeon@kernel.org>
Signed-off-by: Steve French <stfrench@microsoft.com>
|
|
SMB2_LEASE_FLAG_BREAK_IN_PROGRESS is a transient create response flag,
not persistent lease state.
Do not store the flag in lease->flags when a same-key open is granted
during a pending break. Instead, derive it from lease opens that are still
waiting for a break ACK while building the lease create response, and keep
lease->flags for persistent lease flags such as the parent lease key.
This clears the flag naturally after the break ACK completes and fixes
reopen responses that report BREAK_IN_PROGRESS after the lease is no
longer breaking.
Signed-off-by: Namjae Jeon <linkinjeon@kernel.org>
Signed-off-by: Steve French <stfrench@microsoft.com>
|
|
The SMB path suffix :: names the unnamed data stream of the base
file, not an alternate data stream backed by a DosStream xattr.
Canonicalize an empty stream name with an explicit type to a NULL
stream name after parsing. This keeps the base filename produced by
strsep() and lets open continue through the normal base-file path instead
of looking for a non-existent empty stream xattr.
Signed-off-by: Namjae Jeon <linkinjeon@kernel.org>
Signed-off-by: Steve French <stfrench@microsoft.com>
|
|
Handle SMB2 oplock break acknowledgments according to the server-side
validation rules in MS-SMB2.
Return STATUS_INVALID_DEVICE_STATE when an ACK arrives while the open is
not breaking, reject SMB2_OPLOCK_LEVEL_LEASE with
STATUS_INVALID_PARAMETER, allow BATCH acknowledgments to EXCLUSIVE, and
make invalid ACK levels fail with STATUS_INVALID_OPLOCK_PROTOCOL after
lowering the oplock to NONE.
Update the successful response from the final granted oplock level instead
of relying on the oplock transition helpers, which could turn invalid ACKs
into successful responses.
Signed-off-by: Namjae Jeon <linkinjeon@kernel.org>
Signed-off-by: Steve French <stfrench@microsoft.com>
|
|
Model SMB2 leases as per-client/per-key objects instead of keeping a
separate lease copy in every oplock_info. The lease table now stores
lease objects and each lease tracks the opens that reference it.
This makes same ClientGuid/LeaseKey opens observe a single lease state,
so lease upgrades, breaks, ACKs, and close teardown do not diverge across
per-open copies. Keep one reference for the lease table entry and one
reference for each open, and remove the table entry when the last open is
detached.
Update lease break ACK handling to refresh all open oplock levels from
the shared lease state.
Signed-off-by: Namjae Jeon <linkinjeon@kernel.org>
Signed-off-by: Steve French <stfrench@microsoft.com>
|
|
Do not echo reserved v1 lease flags back to clients. For lease v2
responses, only return BREAK_IN_PROGRESS and PARENT_LEASE_KEY_SET when
they are meaningful, and preserve the parent lease key in the response.
Allow directory leases whenever the request is a valid lease v2 request,
and initialize v2 lease epochs from the first server-granted state change.
Signed-off-by: Namjae Jeon <linkinjeon@kernel.org>
Signed-off-by: Steve French <stfrench@microsoft.com>
|
|
Do not skip valid lease states containing WRITE_CACHING when breaking
level-II/read leases for writes and truncates.
Handle lease break acknowledgments according to the SMB2 rule that the
acknowledged state must be a subset of the server's break target. Apply
the acknowledged state directly and keep the break pending on failed ACKs.
Signed-off-by: Namjae Jeon <linkinjeon@kernel.org>
Signed-off-by: Steve French <stfrench@microsoft.com>
|
|
MS-SMB2 defines the lease table lookup key as Connection.ClientGuid.
Use the connection ClientGUID consistently when checking for same-client
leases and duplicate lease keys.
Also preserve directory and parent lease metadata when copying an existing
lease state to a new open.
Signed-off-by: Namjae Jeon <linkinjeon@kernel.org>
Signed-off-by: Steve French <stfrench@microsoft.com>
|
|
Validate SMB2 lease context lengths, requested lease state bits, and v2
flags before using the context. Return errors via ERR_PTR so CREATE can
distinguish a missing lease context from a malformed one.
Also ignore lease v2 contexts for SMB 2.1, where they are not valid.
Signed-off-by: Namjae Jeon <linkinjeon@kernel.org>
Signed-off-by: Steve French <stfrench@microsoft.com>
|
|
Convert to CPU byte order to avoid incorrect debug log
on big-endian architectures.
Signed-off-by: ChenXiaoSong <chenxiaosong@kylinos.cn>
Acked-by: Namjae Jeon <linkinjeon@kernel.org>
Signed-off-by: Steve French <stfrench@microsoft.com>
|
|
SMB2_LOCK adds each granted byte-range lock to both the file lock list
and the lock list of the connection which handled the request. The
final close and durable handle paths, however, remove the connection
list entry while holding fp->conn->llist_lock.
With SMB3 multichannel, the connection handling the LOCK request can be
different from the connection which opened the file. The entry can
therefore be removed under a different spinlock from the one protecting
the list it belongs to. A concurrent traversal can then access freed
struct ksmbd_lock and struct file_lock objects.
Record the connection owning each lock's clist entry and hold a
reference to it while the entry is linked. Use that connection and its
llist_lock for unlock, rollback, close, and durable preserve. Durable
reconnect assigns the new connection as the owner when publishing the
locks again.
Fixes: f5a544e3bab7 ("ksmbd: add support for SMB3 multichannel")
Cc: stable@vger.kernel.org
Reported-by: Musaab Khan <musaab.khan@protonmail.com>
Signed-off-by: Namjae Jeon <linkinjeon@kernel.org>
Signed-off-by: Steve French <stfrench@microsoft.com>
|
|
Refactor the control flow in id_mode_to_cifs_acl() to reduce nesting and
prevent error code overwriting.
Instead of wrapping the call to ops->set_acl() in a conditional block,
introduce early exits (goto id_mode_to_cifs_acl_exit) when build_sec_desc()
fails or ops->set_acl is NULL. This ensures that any actual error returned
by build_sec_desc() is not overwritten with -EOPNOTSUPP.
Signed-off-by: Ralph Boehme <slow@samba.org>
Signed-off-by: Steve French <stfrench@microsoft.com>
|
|
A response-bearing attempt can return a replayable error and free its
response buffer. If SMB2_query_directory_init() fails before the next send,
cleanup retains the previous buffer type and frees that response again.
Reset response bookkeeping before each attempt to prevent the stale free.
Fixes: 4f1fffa23769 ("cifs: commands that are retried should have replay flag set")
Cc: stable@vger.kernel.org
Signed-off-by: Henrique Carvalho <henrique.carvalho@suse.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
|
|
A response-bearing attempt can return a replayable error and free its
response buffer. If SMB2_notify_init() fails before the next send, cleanup
retains the previous buffer type and frees that response again.
Reset response bookkeeping before each attempt to prevent the stale free.
Fixes: 4f1fffa23769 ("cifs: commands that are retried should have replay flag set")
Cc: stable@vger.kernel.org
Signed-off-by: Henrique Carvalho <henrique.carvalho@suse.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
|
|
A response-bearing attempt can return a replayable error and free its
response buffer. If SMB2_query_info_init() fails before the next send,
cleanup retains the previous buffer type and frees that response again.
Reset response bookkeeping before each attempt to prevent the stale free.
Fixes: 4f1fffa23769 ("cifs: commands that are retried should have replay flag set")
Cc: stable@vger.kernel.org
Signed-off-by: Henrique Carvalho <henrique.carvalho@suse.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
|
|
A response-bearing attempt can return a replayable error and free its
response buffer. If SMB2_close_init() fails before the next send, cleanup
retains the previous buffer type and frees that response again.
Reset response bookkeeping before each attempt to prevent the stale free.
Fixes: 4f1fffa23769 ("cifs: commands that are retried should have replay flag set")
Cc: stable@vger.kernel.org
Signed-off-by: Henrique Carvalho <henrique.carvalho@suse.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
|
|
A response-bearing attempt can return a replayable error and free its
response buffer. If SMB2_ioctl_init() fails before the next send, cleanup
retains the previous buffer type and frees that response again.
Reset response bookkeeping before each attempt to prevent the stale free.
Fixes: 4f1fffa23769 ("cifs: commands that are retried should have replay flag set")
Cc: stable@vger.kernel.org
Signed-off-by: Henrique Carvalho <henrique.carvalho@suse.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
|
|
A response-bearing attempt can return a replayable error and free its
response buffer. If SMB2_open_init() fails before the next send, cleanup
retains the previous buffer type and frees that response again.
Reset response bookkeeping before each attempt to prevent the stale free.
Fixes: 4f1fffa23769 ("cifs: commands that are retried should have replay flag set")
Cc: stable@vger.kernel.org
Signed-off-by: Henrique Carvalho <henrique.carvalho@suse.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
|
|
SMB2_flush() keeps its response buffer bookkeeping across replay
attempts. If a replayable flush response is received and the retry then
fails before cifs_send_recv() stores a replacement response, flush_exit
will free the stale response pointer a second time.
Reinitialize resp_buftype and rsp_iov at the top of the replay loop so
cleanup only acts on response state produced by the current attempt.
This fixes a double-free without changing replay handling for successful
requests.
Fixes: 4f1fffa23769 ("cifs: commands that are retried should have replay flag set")
Cc: stable@vger.kernel.org
Reported-by: Yuan Tan <yuantan098@gmail.com>
Reported-by: Zhengchuan Liang <zcliangcn@gmail.com>
Reported-by: Xin Liu <bird@lzu.edu.cn>
Assisted-by: Codex:GPT-5.4
Acked-by: Henrique Carvalho <henrique.carvalho@suse.com>
Signed-off-by: Zhao Zhang <zzhan461@ucr.edu>
Signed-off-by: Ren Wei <n05ec@lzu.edu.cn>
Signed-off-by: Steve French <stfrench@microsoft.com>
|
|
Pull smb server updates from Steve French:
- Use after free fixes
- Out of bounds read fix
- Add SMB compression support both at rest and over the wire: support
decompression of compressed SMB2 requests, initially allow compressed
SMB2 READ responses, and implement get/set compression operations for
per-file compression state.
- Credentials fixes: for various FSCTLs, setinfo, delete on close and
for alternate data streams
- Fix access checks and permission checks in DUPLICAT_EXTENTS and
SET_ZERO_DATA fsctls, find_file_posix_info, FILE_LINK_INFORMATION and
smb2_set_info_sec
- Reject non valid session in compound request
- Serialize QUERY_DIRECTORY
- Prevent path traversal bypass by restricting caseless retry
- Path lookup fix
- Two minor cleanup fixes
* tag 'v7.2-rc-part1-ksmbd-fixes' of git://git.samba.org/ksmbd: (31 commits)
ksmbd: fix path resolution in ksmbd_vfs_kern_path_create
ksmbd: use opener credentials for FSCTL mutations
ksmbd: use opener credentials for ADS I/O
ksmbd: require source read access for duplicate extents
ksmbd: run set info with opener credentials
ksmbd: use opener credentials for delete-on-close
ksmbd: serialize QUERY_DIRECTORY requests per file
ksmbd: add permission checks for FSCTL_DUPLICATE_EXTENTS_TO_FILE
ksmbd: enforce FILE_READ_ATTRIBUTES on SMB_FIND_FILE_POSIX_INFORMATION
ksmbd: reject non-VALID session in compound request branch
ksmbd: compress SMB2 READ responses
ksmbd: negotiate and decode SMB2 compression
cifs: negotiate chained SMB2 compression capabilities
smb: add common SMB2 compression transform helpers
smb: move LZ77 compression into common code
ksmbd: add per-handle permission check to FILE_LINK_INFORMATION
ksmbd: add a permission check for FSCTL_SET_ZERO_DATA
ksmbd: add a WRITE_DAC/WRITE_OWNER check to SMB2 SET_INFO SECURITY
ksmbd: fix use-after-free of a deferred file_lock on SMB2_CLOSE then SMB2_CANCEL
smb: server: remove code guarded by nonexistent config option
...
|
|
git://git.samba.org/sfrench/cifs-2.6
Pull smb client updates from Steve French:
- Three cleanup patches
- Fix error return value in smb2_aead_req_alloc
- Three compression fixes
- Update i_blocks after write (fixes various xfstests)
- Fix races in cifsd thread creation
- Fix potential out of bounds read parsing security descriptors
- Witness protocol fix
- Fix umount bug
- Mount fix
- Fix cached directory entries on unlink/rmdir/rename
* tag 'v7.2-rc-part1-smb3-client-fixes' of git://git.samba.org/sfrench/cifs-2.6:
smb: client: Use more common code in SMB2_tcon()
smb: client: Use more common error handling code in smb3_reconfigure()
smb/client: Fix error code in smb2_aead_req_alloc()
smb/client: clean up a type issue in cifs_xattr_get()
smb/client: allow FS_IOC_SETFLAGS to clear compression
smb/client: use writable handle for FS_IOC_SETFLAGS compression
smb/client: always return a value for FS_IOC_GETFLAGS
smb/client: update i_blocks after contiguous writes
smb: client: fix races in cifsd thread creation
cifs: validate full SID length in security descriptors
smb: client: resolve SWN tcon from live registrations
cifs: remove all cifs files before kill super
smb: client: fix conflicting option validation for new mount API
cifs: invalidate cfid on unlink/rename/rmdir
|
|
The SMB2 open lookup is rooted at the share with LOOKUP_BENEATH, but the
create/mkdir/hardlink sink is not: ksmbd_vfs_kern_path_create() builds an
absolute path with convert_to_unix_name() and resolves it from AT_FDCWD
via start_creating_path(), so a ".." component is walked from the real
filesystem root and escapes the export.
An authenticated client races a missing path component so the rooted open
lookup returns -ENOENT (taking the create branch) while the same component
is present (a directory) when the create walk runs; the create then
resolves ".." out of the share.
Root the create walk at the share like the lookup and rename paths already
are: resolve the parent with vfs_path_parent_lookup(..., LOOKUP_BENEATH,
&share_conf->vfs_path) and create the final component with
start_creating_noperm(). convert_to_unix_name() then has no callers and is
removed.
Fixes: 265fd1991c1d ("ksmbd: use LOOKUP_BENEATH to prevent the out of share access")
Cc: stable@vger.kernel.org
Signed-off-by: Davide Ornaghi <d.ornaghi97@gmail.com>
Acked-by: Namjae Jeon <linkinjeon@kernel.org>
Signed-off-by: Steve French <stfrench@microsoft.com>
|
|
SET_SPARSE, SET_ZERO_DATA and SET_COMPRESSION operate on an open SMB
handle but call VFS xattr, fallocate or fileattr helpers with the current
ksmbd worker credentials. Those helpers can revalidate inode permissions,
ownership and LSM policy independently of the SMB handle access mask.
Run each operation with the credentials captured in the target file when
the handle was opened. Keep credential handling local to these single-file
FSCTLs rather than applying session credentials to the complete IOCTL
handler, which also contains handle-less and multi-handle operations.
Cc: stable@vger.kernel.org
Reported-by: Musaab Khan <musaab.khan@protonmail.com>
Signed-off-by: Namjae Jeon <linkinjeon@kernel.org>
Signed-off-by: Steve French <stfrench@microsoft.com>
|
|
Alternate data streams are stored as xattrs. Unlike regular file I/O,
their read and write paths therefore call VFS xattr helpers which recheck
inode permissions and LSM policy using the current task credentials.
Run ADS I/O with the credentials captured when the SMB handle was opened.
Cc: stable@vger.kernel.org
Reported-by: Musaab Khan <musaab.khan@protonmail.com>
Signed-off-by: Namjae Jeon <linkinjeon@kernel.org>
Signed-off-by: Steve French <stfrench@microsoft.com>
|
|
FSCTL_DUPLICATE_EXTENTS_TO_FILE passes the source file directly to
vfs_clone_file_range() or vfs_copy_file_range() without checking the SMB
access mask granted to the source handle. A handle opened with attribute
access can consequently be used to copy file contents into an
attacker-readable destination.
Require FILE_READ_DATA on the source handle before either VFS operation,
matching other ksmbd data-copy paths.
Cc: stable@vger.kernel.org
Reported-by: Musaab Khan <musaab.khan@protonmail.com>
Signed-off-by: Namjae Jeon <linkinjeon@kernel.org>
Signed-off-by: Steve French <stfrench@microsoft.com>
|
|
SMB2 SET_INFO handlers call path-based VFS helpers after checking the
access mask granted to the SMB handle. Those helpers perform their owner,
inode permission and LSM checks using the current ksmbd worker credentials.
Run the complete SET_INFO dispatch with the credentials captured when the
handle was opened. This also removes the separate security information
credential setup and keeps all SET_INFO classes under one credential scope.
Direct override_creds() is used because it can nest with the request
credential overrides already used by rename and link helpers.
Cc: stable@vger.kernel.org
Reported-by: Musaab Khan <musaab.khan@protonmail.com>
Signed-off-by: Namjae Jeon <linkinjeon@kernel.org>
Signed-off-by: Steve French <stfrench@microsoft.com>
|
|
Delete-on-close can be completed by deferred or durable handle teardown,
where no request work is available. Both the base-file unlink and the ADS
xattr removal consequently run with the ksmbd worker credentials and can
bypass filesystem permission checks.
Run both operations with the credentials captured in struct file when the
handle was opened. This preserves the authenticated user's fsuid, fsgid,
supplementary groups and capability restrictions at final close.
Cc: stable@vger.kernel.org
Reported-by: Musaab Khan <musaab.khan@protonmail.com>
Signed-off-by: Namjae Jeon <linkinjeon@kernel.org>
Signed-off-by: Steve French <stfrench@microsoft.com>
|
|
smb2_query_dir() stores a pointer to its stack-allocated private data in
the ksmbd_file readdir_data. Concurrent QUERY_DIRECTORY requests using the
same file handle can overwrite this pointer while an iterate_dir() callback
is still using it, resulting in a stack use-after-free.
Add a per-file mutex and hold it while accessing the shared directory
enumeration state. The lock covers scan restart, dot entry state,
readdir_data setup and iteration, and response construction. This prevents
another request from replacing readdir_data.private before the current
request has finished using it and also serializes the shared file position.
Cc: stable@vger.kernel.org
Reported-by: zdi-disclosures@trendmicro.com # ZDI-CAN-30527
Signed-off-by: Namjae Jeon <linkinjeon@kernel.org>
Signed-off-by: Steve French <stfrench@microsoft.com>
|
|
The FSCTL_DUPLICATE_EXTENTS_TO_FILE arm of smb2_ioctl() overwrites the
destination file's data via vfs_clone_file_range() with neither the
share-level KSMBD_TREE_CONN_FLAG_WRITABLE check nor a per-handle
fp->daccess check that the other write-bearing arms carry. A client can
overwrite destination data on a read-only share, or from a handle opened
with only FILE_WRITE_ATTRIBUTES (which still yields an FMODE_WRITE filp).
FILE_WRITE_ATTRIBUTES-only destination handle overwrote the file's data via
the clone. Add both checks, matching the FSCTL_SET_SPARSE permission fix;
require FILE_WRITE_DATA since this writes data.
Cc: stable@vger.kernel.org
Signed-off-by: Gil Portnoy <dddhkts1@gmail.com>
Acked-by: Namjae Jeon <linkinjeon@kernel.org>
Signed-off-by: Steve French <stfrench@microsoft.com>
|
|
find_file_posix_info() in smb2_query_info() returns file metadata (owner
uid, group gid, mode, inode, size, allocation size, hard-link count and all
four timestamps) but performs no per-handle access check. Every sibling
query handler gates on the handle's granted access first --
get_file_basic_info(), get_file_all_info(), get_file_network_open_info()
and get_file_attribute_tag_info() all reject a handle lacking
FILE_READ_ATTRIBUTES_LE with -EACCES. The POSIX handler is gated only by
the connection-scoped tcon->posix_extensions flag, which is not a
per-handle authorization, so a handle opened with only FILE_WRITE_DATA is
correctly denied FileBasicInformation yet is allowed the strict-superset
POSIX info. Mirror the FILE_READ_ATTRIBUTES_LE gate the sibling info
handlers already use.
Fixes: e2f34481b24d ("cifsd: add server-side procedures for SMB3")
Cc: stable@vger.kernel.org
Signed-off-by: Gil Portnoy <dddhkts1@gmail.com>
Acked-by: Namjae Jeon <linkinjeon@kernel.org>
Signed-off-by: Steve French <stfrench@microsoft.com>
|
|
smb2_check_user_session() takes a shortcut for any operation that is not
the first in a COMPOUND request: it reuses work->sess (the session bound by
the first operation) and validates only the SessionId, then returns
"valid". It never re-checks work->sess->state == SMB2_SESSION_VALID, and a
SessionId of 0xFFFFFFFFFFFFFFFF (ULLONG_MAX, the MS-SMB2 related-operation
value) skips even the id comparison. The standalone path
(ksmbd_session_lookup_all() plus the SESSION_SETUP state machine) does
enforce the VALID state; the compound branch bypasses all of it.
A SESSION_SETUP carrying only an NTLM Type-1 (NtLmNegotiate) blob publishes
a fresh SMB2_SESSION_IN_PROGRESS session whose sess->user is still NULL
(->user is assigned later, by ntlm_authenticate()). Used as operation 1 of
a COMPOUND with operation 2 = TREE_CONNECT (related, SessionId=ULLONG_MAX,
\\host\IPC$), the tree-connect then runs on that IN_PROGRESS session and
reaches ksmbd_ipc_tree_connect_request(), which dereferences
user_name(sess->user) with sess->user == NULL (transport_ipc.c:687/701/704)
-> remote NULL-pointer dereference and a kernel Oops that wedges the ksmbd
worker for all clients.
Reject any non-first compound operation that lands on a session which is
not SMB2_SESSION_VALID, mirroring the validity the standalone lookup path
enforces. SESSION_SETUP itself legitimately runs on an IN_PROGRESS session,
but it is never carried as a non-first compound operation, so multi-leg
authentication is unaffected by this check.
Fixes: 5005bcb42191 ("ksmbd: validate session id and tree id in the compound request")
Cc: stable@vger.kernel.org
Signed-off-by: Gil Portnoy <dddhkts1@gmail.com>
Acked-by: Namjae Jeon <linkinjeon@kernel.org>
Signed-off-by: Steve French <stfrench@microsoft.com>
|
|
Handle SMB2_READFLAG_REQUEST_COMPRESSED for non-RDMA reads.
Flatten the response iov, emit chained or unchained LZ77 transforms when
compression is beneficial, and retain the generated buffer until the work
item is released.
Signed-off-by: Namjae Jeon <linkinjeon@kernel.org>
Signed-off-by: Steve French <stfrench@microsoft.com>
|
|
Parse the SMB 3.1.1 compression capabilities context and negotiate LZ77
with optional chained Pattern_V1 support.
Advertise compression on tree connections and decode compressed requests
before normal SMB dispatch.
Signed-off-by: Namjae Jeon <linkinjeon@kernel.org>
Signed-off-by: Steve French <stfrench@microsoft.com>
|
|
Advertise LZ77 and Pattern_V1 with chained transform support in the
SMB 3.1.1 compression negotiate context. Validate the server's returned
algorithm list and flags, then retain the negotiated capabilities for a
future compressed transform receive implementation.
This patch only negotiates capabilities. It does not request compressed
READ responses or add a compressed transform receive path.
Signed-off-by: Namjae Jeon <linkinjeon@kernel.org>
Signed-off-by: Steve French <stfrench@microsoft.com>
|
|
Implement common validation, compression and decompression for SMB2
compression transforms.
Support unchained LZ77 and chained NONE, LZ77 and Pattern_V1 payloads.
Signed-off-by: Namjae Jeon <linkinjeon@kernel.org>
Signed-off-by: Steve French <stfrench@microsoft.com>
|
|
Move the LZ77 codec in cifs.ko to smb/common/ so both the SMB
client and ksmbd can use it.
Signed-off-by: Namjae Jeon <linkinjeon@kernel.org>
Signed-off-by: Steve French <stfrench@microsoft.com>
|
|
The FILE_LINK_INFORMATION arm of smb2_set_info_file() calls
smb2_create_link() with no per-handle fp->daccess check. On the
ReplaceIfExists path smb2_create_link() unlinks an existing file at the
target name (ksmbd_vfs_remove_file) and creates a hardlink
(ksmbd_vfs_link); neither helper checks daccess. A handle opened with
FILE_READ_DATA only (no FILE_DELETE, no FILE_WRITE_DATA) can therefore
delete an arbitrary file in the share and plant a hardlink over its name.
The sibling delete/move arms in the same switch already gate:
FILE_RENAME_INFORMATION and FILE_DISPOSITION_INFORMATION both require
FILE_DELETE_LE; FILE_FULL_EA_INFORMATION requires FILE_WRITE_EA_LE. Gate
the link arm the same way as its closest analogue (rename), since it
mutates the namespace and, on replace, deletes an existing entry.
This is a sibling of commit cc57232cae23 ("ksmbd: fix FSCTL permission
bypass by adding a permission check for FSCTL_SET_SPARSE").
Cc: stable@vger.kernel.org
Signed-off-by: Gil Portnoy <dddhkts1@gmail.com>
Acked-by: Namjae Jeon <linkinjeon@kernel.org>
Signed-off-by: Steve French <stfrench@microsoft.com>
|
|
FSCTL_SET_ZERO_DATA in smb2_ioctl() destroys file data via
ksmbd_vfs_zero_data() -> vfs_fallocate(PUNCH_HOLE/ZERO_RANGE) after
checking only the share-level KSMBD_TREE_CONN_FLAG_WRITABLE, with no
per-handle access check. A handle opened with only FILE_WRITE_ATTRIBUTES
still yields an FMODE_WRITE filp (FILE_WRITE_ATTRIBUTES is part of
FILE_WRITE_DESIRE_ACCESS_LE, so smb2_create_open_flags() opens it
O_WRONLY), so the vfs_fallocate FMODE_WRITE check does not stop it; only
the missing fp->daccess gate would. Reproduced on mainline 7.1-rc7 with
KASAN by an authenticated SMB client: a FILE_WRITE_ATTRIBUTES-only handle
zeroed 4096 bytes of file data it had no FILE_WRITE_DATA right to
(6/6; a FILE_READ_DATA-only handle was correctly denied).
This is the unfixed sibling of commit cc57232cae23 ("ksmbd: fix FSCTL
permission bypass by adding a permission check for FSCTL_SET_SPARSE").
Because SET_ZERO_DATA writes data (not an attribute), require
FILE_WRITE_DATA.
Cc: stable@vger.kernel.org
Signed-off-by: Gil Portnoy <dddhkts1@gmail.com>
Acked-by: Namjae Jeon <linkinjeon@kernel.org>
Signed-off-by: Steve French <stfrench@microsoft.com>
|
|
commit cc57232cae23 ("ksmbd: fix FSCTL permission bypass by adding a
permission check for FSCTL_SET_SPARSE") added a fp->daccess gate to
fsctl_set_sparse and noted that "similar handle-level checks exist in other
functions but are missing here." The SMB2 SET_INFO SECURITY arm is one of
the missing ones, and the most security-relevant: smb2_set_info_sec() calls
set_info_sec() with no per-handle access check.
set_info_sec() (fs/smb/server/smbacl.c) re-permissions the file: it
rewrites owner/group/mode via notify_change(), rewrites the POSIX ACL via
set_posix_acl(), and on KSMBD_SHARE_FLAG_ACL_XATTR shares removes and
rewrites the Windows security descriptor via ksmbd_vfs_set_sd_xattr().
Every other persistent-mutation arm of the sibling handler
smb2_set_info_file() checks fp->daccess first (FILE_WRITE_DATA /
FILE_DELETE / FILE_WRITE_EA / FILE_WRITE_ATTRIBUTES); the SECURITY arm —
which mutates the access control itself — is the only one with no gate.
A client can therefore open a handle with FILE_WRITE_ATTRIBUTES only (no
FILE_WRITE_DAC / FILE_WRITE_OWNER) and use SMB2_SET_INFO with InfoType
SMB2_O_INFO_SECURITY to rewrite the file's DACL and owner, granting itself
access the handle's daccess never carried. Unlike the FSCTL data arms this
is a metadata/xattr operation, so there is no FMODE_WRITE VFS backstop —
the missing fp->daccess check is the entire gate.
Setting a security descriptor is the WRITE_DAC / WRITE_OWNER operation, so
require at least one of those on the handle before re-permissioning the
file. -EACCES is mapped to STATUS_ACCESS_DENIED by smb2_set_info().
Cc: stable@vger.kernel.org
Signed-off-by: Gil Portnoy <dddhkts1@gmail.com>
Acked-by: Namjae Jeon <linkinjeon@kernel.org>
Signed-off-by: Steve French <stfrench@microsoft.com>
|
|
Commit f580d27e8928 ("ksmbd: fix use-after-free of a deferred file_lock on
double SMB2_CANCEL") made smb2_cancel() skip a work whose state is
KSMBD_WORK_CANCELLED, so its cancel_fn cannot be fired a second time. But
KSMBD_WORK has three states (ACTIVE, CANCELLED, CLOSED), and the same
freeing producer path is reached for CLOSED too:
SMB2_CLOSE on the locking handle -> set_close_state_blocked_works() sets
the deferred work's state to KSMBD_WORK_CLOSED and wakes the smb2_lock()
worker. The worker takes the non-ACTIVE early-exit, locks_free_lock()s
the file_lock and, because the state is not KSMBD_WORK_CANCELLED, takes
the STATUS_RANGE_NOT_LOCKED branch with "goto out2" -- which, like the
cancelled branch, skips release_async_work(). The work stays on
conn->async_requests with a live cancel_fn = smb2_remove_blocked_lock
pointing at the freed file_lock.
A subsequent SMB2_CANCEL for the same AsyncId then passes the
KSMBD_WORK_CANCELLED-only guard (its state is KSMBD_WORK_CLOSED), so
smb2_cancel() fires cancel_fn again over the freed file_lock -- the same
use-after-free fixed, via SMB2_CLOSE instead of a first SMB2_CANCEL:
BUG: KASAN: slab-use-after-free in __locks_delete_block
__locks_delete_block
locks_delete_block
ksmbd_vfs_posix_lock_unblock
smb2_remove_blocked_lock
smb2_cancel <- 2nd SMB2_CANCEL fires cancel_fn
handle_ksmbd_work
Allocated by ...: locks_alloc_lock <- smb2_lock
Freed by ...: locks_free_lock <- smb2_lock (non-ACTIVE early-exit)
... cache file_lock_cache of size 192
Reproduced on mainline 7.1-rc7 (which already contains f580d27e8928) with
KASAN by an authenticated SMB client; the double-SMB2_CANCEL control is
silent on that kernel, so the splat is attributable to the CLOSE trigger.
Only an ACTIVE deferred work may have its cancel_fn fired: both terminal
states (CANCELLED and CLOSED) reach the smb2_lock() early-exit that frees
the file_lock and skips release_async_work(). Guard on KSMBD_WORK_ACTIVE
so any non-active work is skipped.
Fixes: f580d27e8928 ("ksmbd: fix use-after-free of a deferred file_lock on double SMB2_CANCEL")
Cc: stable@vger.kernel.org
Signed-off-by: Gil Portnoy <dddhkts1@gmail.com>
Acked-by: Namjae Jeon <linkinjeon@kernel.org>
Signed-off-by: Steve French <stfrench@microsoft.com>
|
|
A small piece of code in fs/smb/server/smb_common.c depends on
CONFIG_SMB_INSECURE_SERVER, which has never been defined in the
mainline kernel, but was present in old out-of-tree versions of ksmbd.
Remove this dead code.
Discovered while searching for CONFIG_* symbols referenced in code but
not defined in any Kconfig file.
Signed-off-by: Ethan Nelson-Moore <enelsonmoore@gmail.com>
Acked-by: Namjae Jeon <linkinjeon@kernel.org>
Signed-off-by: Steve French <stfrench@microsoft.com>
|
|
Before this patch, we got the wrong file size:
- client: touch /mnt/file
- client: smbinfo setcompression default /mnt/file
- client: dd if=/dev/zero of=/mnt/file bs=1 count=1000
- client: smbinfo filecompressioninfo /mnt/file
Compressed File Size: 4096
Compression Format: 2 (LZNT1)
After this patch, we get the correct file size:
- client: smbinfo filecompressioninfo /mnt/file
Compressed File Size: 1000
Compression Format: 2 (LZNT1)
Note that the actual compressed file size must be got by other methods.
For Btrfs, use the following command to get actual compressed file size:
- server: compsize /export/file
Processed 1 file, 0 regular extents (0 refs), 1 inline.
Type Perc Disk Usage Uncompressed Referenced
TOTAL 4% 47B 1000B 1000B
zlib 4% 47B 1000B 1000B
Signed-off-by: ChenXiaoSong <chenxiaosong@kylinos.cn>
Acked-by: Namjae Jeon <linkinjeon@kernel.org>
Signed-off-by: Steve French <stfrench@microsoft.com>
|
|
I have added `filecompressioninfo` subcommand to `smbinfo`
in cifs-utils.git.
Example:
1. client: smbinfo setcompression lznt1 /mnt/file
2. client: smbinfo filecompressioninfo /mnt/file
Compressed File Size: 104857600
Compression Format: 2 (LZNT1)
Compression Unit Shift: 0
Chunk Shift: 0
Cluster Shift: 0
Signed-off-by: ChenXiaoSong <chenxiaosong@kylinos.cn>
Acked-by: Namjae Jeon <linkinjeon@kernel.org>
Signed-off-by: Steve French <stfrench@microsoft.com>
|
|
Example:
1. client: smbinfo setcompression no /mnt/file
2. client: smbinfo getcompression /mnt/file
Compression: 0 (NONE)
3. client: smbinfo setcompression lznt1 /mnt/file
4. client: smbinfo getcompression /mnt/file
Compression: 2 (LZNT1)
5. client: smbinfo setcompression default /mnt/file
6. client: smbinfo getcompression /mnt/file
Compression: 2 (LZNT1)
Signed-off-by: ChenXiaoSong <chenxiaosong@kylinos.cn>
Acked-by: Namjae Jeon <linkinjeon@kernel.org>
Signed-off-by: Steve French <stfrench@microsoft.com>
|
|
Example:
1. server: chattr +c /export/file
2. client: smbinfo getcompression /mnt/file
Compression: 2 (LZNT1)
Signed-off-by: ChenXiaoSong <chenxiaosong@kylinos.cn>
Acked-by: Namjae Jeon <linkinjeon@kernel.org>
Signed-off-by: Steve French <stfrench@microsoft.com>
|
|
Example:
1. server: chattr +c /export/file
2. client: lsattr /mnt/file
--------c------------- /mnt/file
Signed-off-by: ChenXiaoSong <chenxiaosong@kylinos.cn>
Acked-by: Namjae Jeon <linkinjeon@kernel.org>
Signed-off-by: Steve French <stfrench@microsoft.com>
|
|
These definitions will also be used by ksmbd, move them into
common header file.
Signed-off-by: ChenXiaoSong <chenxiaosong@kylinos.cn>
Acked-by: Namjae Jeon <linkinjeon@kernel.org>
Signed-off-by: Steve French <stfrench@microsoft.com>
|
|
Rename the following places:
- FSCTL_COPYCHUNK -> FSCTL_SRV_COPYCHUNK
- FSCTL_COPYCHUNK_WRITE -> FSCTL_SRV_COPYCHUNK_WRITE
- FSCTL_REQUEST_RESUME_KEY -> FSCTL_SRV_REQUEST_RESUME_KEY
server/smbfsctl.h contains the following additional definitions compared to
common/smbfsctl.h:
- IO_REPARSE_TAG_LX_SYMLINK_LE
- IO_REPARSE_TAG_AF_UNIX_LE
- IO_REPARSE_TAG_LX_FIFO_LE
- IO_REPARSE_TAG_LX_CHR_LE
- IO_REPARSE_TAG_LX_BLK_LE
Signed-off-by: ChenXiaoSong <chenxiaosong@kylinos.cn>
Acked-by: Namjae Jeon <linkinjeon@kernel.org>
Signed-off-by: Steve French <stfrench@microsoft.com>
|
|
ksmbd_vfs_path_lookup() enforces LOOKUP_BENEATH to restrict path
resolution within the share root. When a crafted path attempts to
escape the share boundary using parent-directory components ('..'),
vfs_path_parent_lookup() detects this and immediately fails,
returning -EXDEV.
However, a bug exists in __ksmbd_vfs_kern_path() under caseless mode.
The function fails to intercept the -EXDEV error and erroneously
falls through to the caseless retry logic, which is intended only
for genuinely missing files. During this retry process, the path
is reconstructed, leading to an unintended LOOKUP_BENEATH bypass
that allows write-capable users to create zero-length files or
directories outside the exported share.
Fix this by ensuring that the execution only proceeds to the caseless
lookup retry when the error is specifically -ENOENT. Any other errors,
such as -EXDEV from a path traversal attempt, must be returned immediately.
Cc: stable@vger.kernel.org
Reported-by: Y s65 <yu4ys@outlook.com>
Signed-off-by: Namjae Jeon <linkinjeon@kernel.org>
Signed-off-by: Steve French <stfrench@microsoft.com>
|
|
When a blocking byte-range lock request is deferred in the
FILE_LOCK_DEFERRED path, ksmbd registers the asynchronous work into
the connection's async_requests list via setup_async_work(). The cancel
callback smb2_remove_blocked_lock() holds a reference to the flock.
If the lock waiter is subsequently woken up but the work state is no
longer KSMBD_WORK_ACTIVE (e.g., due to a concurrent cancellation), the
cleanup path calls locks_free_lock(flock) without dequeuing the work from
the async_requests list. Concurrently, smb2_cancel() walks the list
under conn->request_lock and invokes the cancel callback, which then
dereferences the already freed 'flock'. This leads to a slab-use-after-free
inside __wake_up_common.
Fix this by restructuring the cleanup logic after the worker returns
from ksmbd_vfs_posix_lock_wait(). Move list_del(&smb_lock->llist) and
release_async_work(work) to the top of the cleanup block. This guarantees
that the async work is completely dequeued and serialized under
conn->request_lock before locks_free_lock(flock) is called, rendering
the flock unreachable for any concurrent smb2_cancel().
Cc: stable@vger.kernel.org
Signed-off-by: Davide Ornaghi <d.ornaghi97@gmail.com>
Signed-off-by: Namjae Jeon <linkinjeon@kernel.org>
Signed-off-by: Steve French <stfrench@microsoft.com>
|
|
same_client_has_lease() returns an opinfo pointer from ci->m_op_list
after dropping ci->m_lock without taking a reference.
smb_grant_oplock() then dereferences that pointer in copy_lease() and
when checking breaking_cnt. A concurrent close can remove the old lease
from ci->m_op_list and drop the last reference before the caller uses
the returned pointer, leading to a use-after-free.
Take a reference when same_client_has_lease() selects an existing lease,
drop any previous match while scanning, and release the returned
reference in smb_grant_oplock() after copying the lease state.
Fixes: e2f34481b24d ("cifsd: add server-side procedures for SMB3")
Signed-off-by: Guangshuo Li <lgs201920130244@gmail.com>
Acked-by: Namjae Jeon <linkinjeon@kernel.org>
Signed-off-by: Steve French <stfrench@microsoft.com>
|
|
The permission-check ACE walk in smb_check_perm_dacl() validates the ACE
header size and caps sid.num_subauth at SID_MAX_SUB_AUTHORITIES, but it
never checks that ace->size is actually large enough to contain
num_subauth sub-authorities before compare_sids() dereferences them.
CIFS_SID_BASE_SIZE covers the SID header up to but excluding the
sub_auth[] array, and offsetof(struct smb_ace, sid) is the ACE header,
so the existing guards only guarantee the 8-byte SID base, i.e. zero
sub-authorities. compare_sids() then reads ace->sid.sub_auth[i] for
i < min(local_sid->num_subauth, ace->sid.num_subauth). The local
comparison SIDs (sid_everyone, sid_unix_NFS_mode, and the id_to_sid()
result) always have at least one sub-authority, and an attacker controls
the ACE revision and authority bytes (which lie within the in-bounds SID
base), so they can match one of those SIDs and force the sub_auth read.
A crafted ACE with size == 16 and num_subauth >= 1 placed at the tail of
the security descriptor therefore causes a heap out-of-bounds read of up
to SID_MAX_SUB_AUTHORITIES * sizeof(__le32) bytes past the pntsd
allocation. The security descriptor is loaded by ksmbd_vfs_get_sd_xattr()
into a buffer sized exactly to the on-disk data (kzalloc(sd_size) in
ndr_decode_v4_ntacl()), so the read lands past the allocation. The
malformed descriptor can be stored verbatim via SMB2_SET_INFO (the DACL
is not normalised before being written to the security.NTACL xattr) and
the read fires on a subsequent SMB2_CREATE access check, making this
reachable by an authenticated client on a share that uses ACL xattrs.
Add the missing num_subauth-versus-ace_size check, mirroring the
identical guards already present in the sibling parsers parse_dacl() and
smb_inherit_dacl().
Fixes: d07b26f39246 ("ksmbd: require minimum ACE size in smb_check_perm_dacl()")
Cc: stable@vger.kernel.org
Signed-off-by: Hem Parekh <hemparekh1596@gmail.com>
Acked-by: Namjae Jeon <linkinjeon@kernel.org>
Signed-off-by: Steve French <stfrench@microsoft.com>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
Pull dcache updates from Al Viro:
- d_alloc_parallel() API change (Neil's with my changes)
- NORCU fixes
- Reorganization and simplification of dentry eviction logic
- Simplifying rcu_read_lock() scopes in fs/dcache.c
- Secondary roots work - getting rid of NFS fake root dentries and
dealing with remaining shrink_dcache_for_umount() and
shrink_dentry_list() races
- making cursors NORCU (surprisingly easy)
* tag 'pull-dcache' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (22 commits)
make cursors NORCU
nfs: get rid of fake root dentries
wind ->s_roots via ->d_sib instead of ->d_hash
shrink_dentry_tree(): unify the calls of shrink_dentry_list()
shrinking rcu_read_lock() scope in d_alloc_parallel()
d_walk(): shrink rcu_read_lock() scope
document dentry_kill()
adjust calling conventions of lock_for_kill(), fold __dentry_kill() into dentry_kill()
Document rcu_read_lock() use in select_collect2()
Shift rcu_read_{,un}lock() inside fast_dput()
simplify safety for lock_for_kill() slowpath
fold lock_for_kill() and __dentry_kill() into common helper
fold lock_for_kill() into shrink_kill()
shrink_dentry_list(): start with removing from shrink list
d_prune_aliases(): make sure to skip NORCU aliases
kill d_dispose_if_unused()
make to_shrink_list() return whether it has moved dentry to list
select_collect(): ignore dentries on shrink lists if they have positive refcounts
find_acceptable_alias(): skip NORCU aliases with zero refcount
fix a race between d_find_any_alias() and final dput() of NORCU dentries
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs
Pull vfs superblock updates from Christian Brauner:
"This retires sget().
CIFS plus the two ext4 KUnit tests (extents-test, mballoc-test) were
the last in-tree callers, and all three convert cleanly to sget_fc().
That lets sget() and its prototype come out, taking ~60 lines that
only existed to be kept in lockstep with sget_fc() on every
publish-path change"
* tag 'vfs-7.2-rc1.super' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs:
fs: retire sget()
smb: client: convert cifs_smb3_do_mount() to sget_fc()
ext4: convert mballoc KUnit test to sget_fc()
ext4: convert extents KUnit test to sget_fc()
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs
Pull openat2 updates from Christian Brauner:
"Features:
- Add O_EMPTYPATH to openat(2)/openat2(2). To get an operable file
descriptor from an O_PATH file descriptor it is possible to use
openat(fd, ".", O_DIRECTORY) for directories, but other file types
require going through open("/proc/<pid>/fd/<nr>") and thus depend
on a functioning procfs.
With O_EMPTYPATH an empty path string is accepted and LOOKUP_EMPTY
is set at path resolution time, allowing to reopen the file behind
the file descriptor directly. Selftests are included.
- Add an OPENAT2_REGULAR flag for openat2(2) which refuses to open
anything but regular files with the new EFTYPE error code.
This implements the "ability to only open regular files" feature
requested by userspace via uapi-group.org and protects services
from being redirected to fifos, device nodes, and friends.
All atomic_open implementations were audited for OPENAT2_REGULAR
handling. Explicit checks were added to ceph, gfs2, nfs (v4), and
cifs/smb - these are the filesystems whose atomic_open can
encounter an existing non-regular file and would otherwise call
finish_open() on it or return a misleading error code.
The remaining implementations (9p, fuse, vboxsf, nfs v2/v3) only
call finish_open() on freshly created files and use
finish_no_open() for lookup hits, letting the VFS catch non-regular
files via the do_open() safety net.
Cleanups:
- Migrate the openat2 selftests to the kselftest harness and move
them under selftests/filesystems/. The tests were written in the
early days of selftests' TAP support and the modern kselftest
harness is much easier to follow and maintain. The contents of the
tests are unchanged and the new emptypath tests are ported on top.
- Make the LAST_XXX last-type constants private to fs/namei.c. The
only user outside of fs/namei.c was ksmbd which only needs to know
whether the last component is a regular one, so
vfs_path_parent_lookup() now performs the LAST_NORM check
internally. The ints are replaced with a dedicated enum last_type"
* tag 'vfs-7.2-rc1.openat2' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs:
vfs: replace ints with enum last_type for LAST_XXX
vfs: make LAST_XXX private to fs/namei.c
selftests: openat2: port emptypath_test to kselftest harness
kselftest/openat2: test for OPENAT2_REGULAR flag
openat2: new OPENAT2_REGULAR flag support
openat2: introduce EFTYPE error code
selftest: add tests for O_EMPTYPATH
vfs: add O_EMPTYPATH to openat(2)/openat2(2)
selftests: openat2: migrate to kselftest harness
selftests: openat2: switch from custom ARRAY_LEN to ARRAY_SIZE
selftests: openat2: move helpers to header
selftests: move openat2 tests to selftests/filesystems/
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs
Pull vfs casefolding updates from Christian Brauner:
"This exposes the case folding behavior of local filesystems so that
file servers - nfsd, ksmbd, and user space file servers - can report
the actual behavior to clients instead of guessing.
Filesystems report case-insensitive and case-nonpreserving behavior
via new file_kattr flags in their fileattr_get implementations. fat,
exfat, ntfs3, hfs, hfsplus, xfs, cifs, nfs, vboxsf, and isofs are
wired up. Local filesystems that are not explicitly handled default to
the usual POSIX behavior of case-sensitive and case-preserving.
nfsd uses this to report case folding via NFSv3 PATHCONF and to
implement the NFSv4 FATTR4_CASE_INSENSITIVE and FATTR4_CASE_PRESERVING
attributes - both have been part of the NFS protocols for decades to
support clients on non-POSIX systems - and ksmbd reports it via
FS_ATTRIBUTE_INFORMATION. Exposing the information through the
fileattr uapi covers user space file servers.
The immediate motivation is interoperability: Windows NFS clients
hard-require servers to report case-insensitivity for Win32
applications to work correctly, and a client that knows the server is
case-insensitive can avoid issuing multiple LOOKUP/READDIR requests
searching for case variants.
The Linux NFS client already grew support for case-insensitive shares
years ago in support of the Hammerspace NFS server - negative dentry
caching must be disabled (a lookup for "FILE.TXT" failing must not
cache a negative entry when "file.txt" exists) and directory change
invalidation must drop cached case-folded name variants. Such servers
often operate in multi-protocol environments where a single file
service instance caters to both NFS and SMB clients, and nfsd needs to
report case folding properly to participate as a first-class citizen
there.
A follow-up series brings fixes for the initial work: the nfsd
case-info probe now uses kernel credentials, maps -ESTALE to
NFS3ERR_STALE, and has its cost capped across READDIR entries; the nfs
client avoids transiently zeroed case capability bits during the probe
and skips the pathconf probe when neither field is consumed; the
FS_CASEFOLD_FL semantics are clarified in the UAPI header; and the
tools UAPI headers are synced"
* tag 'vfs-7.2-rc1.casefold' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs: (22 commits)
nfsd: Cap case-folding probe cost across READDIR entries
nfsd: Map -ESTALE from case probe to NFS3ERR_STALE
nfsd: Use kernel credentials for case-info probe
fs: Clarify FS_CASEFOLD_FL semantics in UAPI header
nfs: Skip pathconf probe when neither field is consumed
nfs: Avoid transient zeroed case capability bits during probe
tools headers UAPI: Sync case-sensitivity flags from linux/fs.h
ksmbd: Report filesystem case sensitivity via FS_ATTRIBUTE_INFORMATION
nfsd: Implement NFSv4 FATTR4_CASE_INSENSITIVE and FATTR4_CASE_PRESERVING
nfsd: Report export case-folding via NFSv3 PATHCONF
isofs: Implement fileattr_get for case sensitivity
vboxsf: Implement fileattr_get for case sensitivity
nfs: Implement fileattr_get for case sensitivity
cifs: Implement fileattr_get for case sensitivity
xfs: Report case sensitivity in fileattr_get
hfsplus: Report case sensitivity in fileattr_get
hfs: Implement fileattr_get for case sensitivity
ntfs3: Implement fileattr_get for case sensitivity
exfat: Implement fileattr_get for case sensitivity
fat: Implement fileattr_get for case sensitivity
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs
Pull vfs inode updates from Christian Brauner:
"This extends the lockless ->i_count handling.
iput() could already decrement any value greater than one locklessly
but acquiring a reference always required taking inode->i_lock. Now
acquiring a reference is lockless as long as the count was already at
least 1, i.e., only the 0->1 and 1->0 transitions take the lock.
This avoids the lock for the common cases of nfs calling into the
inode hash and btrfs using igrab(). Cleanup-wise icount_read_once() is
added to line up with inode_state_read_once() and the open-coded
->i_count loads across the tree are converted, and ihold() is
relocated and tidied up.
On top of that some stale lock ordering annotations are retired from
the inode hash code: iunique() no longer takes the hash lock since the
inode hash became RCU-searchable and s_inode_list_lock is no longer
taken under the hash lock either"
* tag 'vfs-7.2-rc1.inode' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs:
fs: retire stale lock ordering annotations from inode hash
fs: allow lockless ->i_count bumps as long as it does not transition 0->1
fs: relocate and tidy up ihold()
fs: add icount_read_once() and stop open-coding ->i_count loads
|
|
Use an additional label so that a bit of common code can be better reused
at the end of this function implementation.
Signed-off-by: Markus Elfring <elfring@users.sourceforge.net>
Signed-off-by: Steve French <stfrench@microsoft.com>
|
|
Use an additional label so that a bit of exception handling can be better
reused at the end of this function implementation.
This issue was detected by using the Coccinelle software.
Signed-off-by: Markus Elfring <elfring@users.sourceforge.net>
Signed-off-by: Steve French <stfrench@microsoft.com>
|
|
The "*num_sgs" variable is a u32 so "ERR_PTR(*num_sgs)" doesn't work.
We would have to do something similar to the previous line where it's
cast to int and then long. However, it's simpler to store the return in
an int ret variable.
This bug would eventually result in a crash when dereference the invalid
error pointer.
Fixes: d08089f649a0 ("cifs: Change the I/O paths to use an iterator rather than a page list")
Cc: stable@kernel.org
Signed-off-by: Dan Carpenter <error27@gmail.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
|
|
The cifs_xattr_get() function returns type int, not ssize_t so
declare "rc" as int as well. This has no effect on runtime.
Signed-off-by: Dan Carpenter <error27@gmail.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
|
|
The CIFS FS_IOC_SETFLAGS path can set FS_COMPR_FL now, but it cannot
clear it again. This can be reproduced on a share backed by a filesystem
that supports compression, for example btrfs exported by Samba:
[compress_share]
vfs objects = btrfs
$ touch test.bin
$ chattr +c test.bin
$ lsattr test.bin
$ chattr -c test.bin
The final chattr -c fails with EOPNOTSUPP, and leaves the remote object
with the compressed attribute still set, because the client always sends
FSCTL_SET_COMPRESSION with COMPRESSION_FORMAT_DEFAULT. That is correct
for setting FS_COMPR_FL, but clearing FS_COMPR_FL requires sending
COMPRESSION_FORMAT_NONE.
Fix this by passing the requested compression state through the
set_compression operation. The SMB1 and SMB2 helpers no longer hard-code
COMPRESSION_FORMAT_DEFAULT.
When FS_COMPR_FL is set, send COMPRESSION_FORMAT_DEFAULT. When it is
cleared, send COMPRESSION_FORMAT_NONE. If the server accepts the request,
update the cached FILE_ATTRIBUTE_COMPRESSED bit under i_lock so
FS_IOC_GETFLAGS reports the new state.
Signed-off-by: Huiwen He <hehuiwen@kylinos.cn>
Reviewed-by: ChenXiaoSong <chenxiaosong@kylinos.cn>
Signed-off-by: Steve French <stfrench@microsoft.com>
|
|
Setting the compressed flag on a CIFS mount can fail with -EACCES:
[compress_share]
vfs objects = btrfs
$ touch test.bin
$ chattr +c test.bin
chattr: Permission denied while setting flags on test.bin
This can be reproduced against a Samba share backed by a filesystem that
supports compression, such as btrfs.
FS_IOC_SETFLAGS is issued on the file handle opened by userspace. chattr
opens the target read-only before setting FS_COMPR_FL, so the SMB client
currently sends FSCTL_SET_COMPRESSION on a handle that may not have
FILE_WRITE_DATA access. Samba requires FILE_WRITE_DATA for
FSCTL_SET_COMPRESSION and rejects the request.
Use the current handle only if it already has FILE_WRITE_DATA. Otherwise
try an existing writable handle for the inode. If none is available, open
a temporary FILE_WRITE_DATA handle for the compression request.
After FSCTL_SET_COMPRESSION succeeds, update the cached compressed
attribute immediately, matching how smb2_set_sparse() updates
FILE_ATTRIBUTE_SPARSE_FILE after a successful FSCTL_SET_SPARSE.
Signed-off-by: Huiwen He <hehuiwen@kylinos.cn>
Reviewed-by: ChenXiaoSong <chenxiaosong@kylinos.cn>
Signed-off-by: Steve French <stfrench@microsoft.com>
|
|
Currently, repeated lsattr calls on a regular CIFS file without the
compressed attribute may show random flags:
$ touch test.bin
$ lsattr test.bin
s-S-ia-A-EjI---------m test.bin
$ lsattr test.bin
------d-cEjI---------m test.bin
The lsattr reproducer depends on the previous contents of its userspace
buffer, so it may not reproduce on every setup. A deterministic
reproducer is to initialize the ioctl argument before FS_IOC_GETFLAGS
on a file without the compressed attribute:
int flags = 0x7fffffff;
ioctl(fd, FS_IOC_GETFLAGS, &flags);
On an affected kernel, flags remains 0x7fffffff. With the fix, it is
set to 0.
This happens because when the cached inode does not have the compressed
bit set, the CIFS fallback path in FS_IOC_GETFLAGS returns success
without calling put_user() to write the zero flags value into the user
buffer. As a result, the caller observes stale contents from its own
buffer.
Fix this by always writing the visible flags value back to the user
buffer before returning success, even when the value is zero.
Fixes: 64a5cfa6db94 ("Allow setting per-file compression via SMB2/3")
Signed-off-by: Huiwen He <hehuiwen@kylinos.cn>
Reviewed-by: ChenXiaoSong <chenxiaosong@kylinos.cn>
Signed-off-by: Steve French <stfrench@microsoft.com>
|
|
When a lease allows CIFS to use cached inode attributes, getattr may
return the locally cached attributes instead of revalidating them from
the server. After local writes extend a file, the write path updates the
file size, but i_blocks can remain based on the old allocation size.
For example, while the file is still open after two contiguous writes,
the local block count can remain smaller than the written range:
after first write: st_size = 4096, st_blocks = 7
after second write: st_size = 12288, st_blocks = 21
after close: st_size = 12288, st_blocks = 24
This can make a fully written file look sparse:
i_blocks * 512 < i_size
and can cause swap activation to reject a valid write-created swapfile
as having holes. This results in xfstests skipping swap-related tests
on CIFS mounts:
generic/472 [not run] swapfiles are not supported
generic/494 [not run] swapfiles are not supported
generic/497 [not run] swapfiles are not supported
generic/569 [not run] swapfiles are not supported
generic/636 [not run] swapfiles are not supported
generic/643 [not run] swapfiles are not supported
Update the local i_blocks estimate after successful writes, but only
when the write starts at or before the currently known allocated range.
This lets sequential writes grow i_blocks while avoiding treating
write-past-EOF holes as allocated.
Skip the local estimate for files that are already marked sparse, since
their allocation needs to come from the server rather than from a
contiguous-write estimate.
Signed-off-by: Huiwen He <hehuiwen@kylinos.cn>
Reviewed-by: ChenXiaoSong <chenxiaosong@kylinos.cn>
Signed-off-by: Steve French <stfrench@microsoft.com>
|
|
The cifsd demultiplex thread can run and access tcp_ses before the parent
thread has finished populating tcp_ses, which the worker thread accesses
locklessly.
Also, the kthread_run macro may start the thread before returning the
thread pointer. Because the pointer is part of the structure that the
thread can access, if the kernel is preempted after the thread is spawned,
but before the thread pointer is populated and the thread attempts to exit,
it will sleep, waiting for a SIGKILL signal.
Fix this by moving creation of the thread to after all of tcp_ses'es
fields are populated, and spawning the thread last, using a split
kthread_create/wake_up_process logic.
Signed-off-by: Fredric Cover <fredric.cover.lkernel@gmail.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
|
|
parse_sid() only verified that the fixed SID header fit in the
returned security descriptor, but did not verify that the full SID
body described by num_subauth was present.
A malicious server can return a truncated owner or group SID whose
header lies within the descriptor buffer while sub_auth[] extends
past the end of the allocation, leading to an out-of-bounds read
when the client later parses or copies that SID.
Validate the full SID body in parse_sid(), centralize owner/group SID
lookup and bounds checking in sid_from_sd(), and use that validation
in parse_sec_desc(), build_sec_desc(), and copy_sec_desc() before
sub_auth[] is accessed.
Signed-off-by: Qihang <q.h.hack.winter@gmail.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
|
|
cifs_swn_notify() looks up a witness registration by id under
cifs_swnreg_idr_mutex, drops the mutex, and then uses the registration's
cached tcon pointer. That pointer is not a lifetime reference, and it is
not a stable representative once cifs_get_swn_reg() lets multiple tcons
for the same net/share name share one registration id.
A same-share second mount can keep the cifs_swn_reg alive after the first
tcon unregisters and is freed. The registration then still points at the
freed first tcon, so taking tc_lock or incrementing tc_count through
swnreg->tcon only moves the use-after-free earlier. Taking tc_lock while
holding cifs_swnreg_idr_mutex also violates the documented CIFS lock
order.
Fix this by making the registration store only the stable witness
identity: id, net name, share name, and notify flags. When a notify
arrives, copy that identity under cifs_swnreg_idr_mutex, drop the mutex,
then find and pin a live witness tcon that currently matches the net/share
pair under the normal cifs_tcp_ses_lock -> tc_lock order. The notification
path uses that pinned tcon directly and drops the reference when done.
Registration and unregister messages now use the live tcon passed by the
caller instead of a cached tcon in the registration. The final unregister
send is folded into cifs_swn_unregister() while the registration is still
protected by cifs_swnreg_idr_mutex. This removes the previous
find/drop/reacquire raw-pointer window. The release path only removes the
idr entry and frees the stable identity strings.
This preserves the intended one-registration/many-tcon behavior: a
registration id represents a net/share pair, and notify handling acts on a
live representative selected at use time. It also preserves CLIENT_MOVE
ordering for the representative tcon because the old-IP unregister is sent
before cifs_swn_register() sends the new-IP register.
Fixes: fed979a7e082 ("cifs: Set witness notification handler for messages from userspace daemon")
Cc: stable@vger.kernel.org
Signed-off-by: Michael Bommarito <michael.bommarito@gmail.com>
Assisted-by: Claude:claude-opus-4-7
Signed-off-by: Steve French <stfrench@microsoft.com>
|
|
Cifs files may be put into fileinfo_put_wq during umounting cifs.
After umount done, cifsFileInfo_put_final is called, which cause
following BUG:
BUG: kernel NULL pointer dereference, address: 0000000000000000
...
[ 134.222152] list_lru_add+0x64/0x1a0
[ 134.222399] ? cifs_put_tcon+0x171/0x340 [cifs]
[ 134.222772] d_lru_add+0x44/0x60
[ 134.222997] dput+0x1fc/0x210
[ 134.223213] cifsFileInfo_put_final+0x11a/0x140 [cifs]
[ 134.223576] process_one_work+0x17c/0x320
[ 134.223843] worker_thread+0x188/0x280
[ 134.224084] ? __pfx_worker_thread+0x10/0x10
[ 134.224366] kthread+0xcc/0x100
[ 134.224576] ? __pfx_kthread+0x10/0x10
[ 134.224827] ret_from_fork+0x30/0x50
[ 134.225063] ? __pfx_kthread+0x10/0x10
[ 134.225328] ret_from_fork_asm+0x1b/0x30
This can be reproduce by following:
unshare -n bash -c "
mkdir -p ${CIFS_MNT}
ip netns attach root 1
ip link add eth0 type veth peer veth0 netns root
ip link set eth0 up
ip -n root link set veth0 up
ip addr add 192.168.0.2/24 dev eth0
ip -n root addr add 192.168.0.1/24 dev veth0
ip route add default via 192.168.0.1 dev eth0
ip netns exec root sysctl net.ipv4.ip_forward=1
ip netns exec root iptables -t nat -A POSTROUTING -s 192.168.0.2 -o
${DEV} -j MASQUERADE
mount -t cifs ${CIFS_PATH} ${CIFS_MNT} -o
vers=3.0,sec=ntlmssp,credentials=${CIFS_CRED},rsize=65536,wsize=65536,cache=none,echo_interval=1
touch ${CIFS_MNT}/a.txt
ip netns exec root iptables -t nat -D POSTROUTING -s 192.168.0.2 -o
${DEV} -j MASQUERADE
"
umount ${CIFS_MNT}
Fixes: 340cea84f691 ("cifs: open files should not hold ref on superblock")
Signed-off-by: Jian Zhang <zhangjian496@huawei.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
|
|
Apply conflicting option validation consistently across all the new
mount API paths, for both mount and remount.
Some checks were only applied during initial mount validation, while
others were handled during option parsing, causing mount and
remount/reconfigure to behave differently.
Move the conflicting option checks into smb3_handle_conflicting_options()
and call it from the common validation paths, including for
multichannel/max_channels handling.
Fixes: 24e0a1eff9e2 ("cifs: switch to new mount api")
Signed-off-by: Henrique Carvalho <henrique.carvalho@suse.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
|
|
Today we do not invalidate the cached_dirent or the entire
parent cfid when a dentry in a dir has been removed/moved.
This change invalidates the parent cfid so that we don't serve
directory contents from the cache.
Cc: <stable@vger.kernel.org>
Signed-off-by: Shyam Prasad N <sprasad@microsoft.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
|
|
Parallel lookup starts with a call of d_alloc_parallel(). That primitive
either returns a matching hashed dentry or allocates a new one in the
in-lookup state and returns it to the caller. Once the caller is done
with lookup, it indicates so either by call of d_{splice_alias,add}()
or by call of d_done_lookup(); at that point dentry leaves the in-lookup
state.
If d_alloc_parallel() finds a matching in-lookup dentry, it must wait for
that dentry to leave the in-lookup state, one way or another. Currently
by supplying wait_queue_head to d_alloc_parallel(). If d_alloc_parallel()
creates a new in-lookup dentry, the address of that wait_queue_head is stored
in ->d_wait of new dentry and stays there while it's in the in-lookup;
subsequent d_alloc_parallel() will wait on the queue found in the matching
in-lookup dentry. Transition out of in-lookup state wakes waiters on that
queue (if any).
That works, but the calling conventions are inconvenient - the caller must
supply wait_queue_head and make sure that it survives at least until the new
in-lookup dentry leaves the in-lookup state. That amounts to boilerplate
in the d_alloc_parallel() callers that are followed by a call of d_lookup_done()
in the same function; in cases like nfs asynchronous unlink it gets worse than
that.
This patch changes d_alloc_parallel() to use wake_up_var_locked() to
wake up waiters, and wait_var_event_spinlock() to wait. dentry->d_lock
is used for synchronisation as it is already held and the relevant
times.
That eliminates the need of caller-supplied wait_queue_head, simplifying
the calling conventions. Better yet, we only need one bit of information
stored in dentry itself: whether there are any waiters to be woken up,
and that can be easily stored in ->d_flags; ->d_wait goes away.
The reason we need that bit (DCACHE_LOOKUP_WAITERS) is that with wait_var
machinery the queues are shared with all kinds of stuff and there's
no way tell if any of the waiters have anything to do with our dentry;
most of the time none of them will be relevant, so we need to avoid the
pointless wakeups.
Another benefit of the new scheme comes from the fact that wakeups
have to be done outside of write-side critical areas of ->i_dir_seq;
with the old scheme we need to carry the value picked from ->d_wait from
__d_lookup_unhash() to the place where we actually wake the waiters up.
Now we can just leave DCACHE_LOOKUP_WAITERS in ->d_flags until we get
to doing wakeups - that's done within the same ->d_lock scope, so we
are fine; new bit is accessed only under ->d_lock and it's seen only
on dentries with DCACHE_PAR_LOOKUP in ->d_flags.
__d_lookup_unhash() no longer needs to re-init ->d_lru. That was
previously shared (in a union) with ->d_wait but ->d_wait is now gone
so it no longer corrupts ->d_lru.
Co-developed-by: Al Viro <viro@zeniv.linux.org.uk> # saner handling of flags
Signed-off-by: NeilBrown <neil@brown.name>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
|
The CIFS mount path already runs through fs_context: smb3_get_tree()
calls smb3_get_tree_common() with a struct fs_context * in hand. But
the fc is dropped on the way to sget(). Plumb it through to sget_fc()
so the legacy sget() interface can go.
cifs_smb3_do_mount() now takes (struct fs_context *, struct
smb3_fs_context *). The old (fs_type, flags) pair is reconstructed
from fc->fs_type and fc->sb_flags. The flags argument was always
passed as 0 by the sole caller anyway. The cifs_dbg diagnostic now
prints fc->sb_flags directly.
cifs_match_super() and cifs_set_super() were the two void-data
callbacks for sget(). The match callback now takes
(struct super_block *, struct fs_context *) and reads struct
cifs_mnt_data out of fc->sget_key. The set callback is gone entirely:
sget_fc() pre-populates sb->s_fs_info from fc->s_fs_info before
invoking set() so set_anon_super_fc() (which just allocates an anon
bdev) is sufficient.
Before sget_fc() we stash cifs_sb in fc->s_fs_info, the per-mount data
in fc->sget_key and force fc->sb_flags to SB_NODIRATIME | SB_NOATIME
to reproduce the previous hard-coded behaviour (alloc_super() reads
fc->sb_flags). The original sb_flags is saved and restored around the
call so the rest of the mount path sees the same fc semantics as
before.
mnt_data.flags keeps its historical value of 0 so the CIFS_MS_MASK
comparison in compare_mount_options() returns the same (always-equal)
result.
No functional change. With this in place sget() has no remaining CIFS
caller.
Link: https://patch.msgid.link/20260529-work-sget-v2-3-57bbe08604e4@kernel.org
Signed-off-by: Christian Brauner (Amutable) <brauner@kernel.org>
|
|
A deferred byte-range lock (an SMB2_LOCK that blocks) registers an async work on
conn->async_requests via setup_async_work(), with cancel_fn =
smb2_remove_blocked_lock and cancel_argv[0] pointing at the struct file_lock.
When the request is cancelled, the worker frees the file_lock with
locks_free_lock() and takes the cancelled early-exit, which "goto out"s and never
reaches release_async_work() -- the only site that unlinks the work from
conn->async_requests and clears cancel_fn/cancel_argv. The work therefore stays
matchable on async_requests with a live cancel_fn pointing at the freed file_lock,
until connection teardown finally runs release_async_work().
smb2_cancel() fires cancel_fn unconditionally with no state guard, so a second
SMB2_CANCEL for the same AsyncId, arriving in that window, re-runs
smb2_remove_blocked_lock() on the freed file_lock -- a slab use-after-free:
BUG: KASAN: slab-use-after-free in __locks_delete_block
__locks_delete_block
locks_delete_block
ksmbd_vfs_posix_lock_unblock
smb2_remove_blocked_lock
smb2_cancel <- 2nd SMB2_CANCEL fires cancel_fn
handle_ksmbd_work
Allocated by ...: locks_alloc_lock <- smb2_lock
Freed by ...: locks_free_lock <- smb2_lock (cancelled branch)
... cache file_lock_cache of size 192
Reproduced on mainline with KASAN by an authenticated SMB client.
Skip a work whose state is already KSMBD_WORK_CANCELLED so its cancel callback
cannot be fired a second time.
Cc: stable@vger.kernel.org
Signed-off-by: Gil Portnoy <dddhkts1@gmail.com>
Acked-by: Namjae Jeon <linkinjeon@kernel.org>
Signed-off-by: Steve French <stfrench@microsoft.com>
|
|
Two concurrent same-user DHnC reconnects can both observe fp->conn == NULL
before either sets it. ksmbd_reopen_durable_fd() checks fp->conn to guard
against a handle already being reconnected, but the check and the binding
assignment are not atomic: both threads pass the guard, both call
ksmbd_conn_get() on the same fp, and both eventually reach
kfree(fp->owner.name) -- a double-free of the owner.name slab object.
The double-bound ksmbd_file also causes a write-UAF on the 344-byte
ksmbd_file_cache object when a concurrent smb2_close() spins on fp->f_lock
after the object has been freed by the losing reconnect path.
KASAN on 7.1-rc5 (48-thread concurrent reconnect, 3000 cycles):
BUG: KASAN: double-free in ksmbd_reopen_durable_fd+0x268/0x308
BUG: KASAN: slab-use-after-free in _raw_spin_lock+0xac/0x150
Write of size 4 at offset 24 into freed ksmbd_file_cache object
Five double-bind windows observed; 63 total KASAN reports triggered.
Fix: validate and claim fp->conn under write_lock(&global_ft.lock) so the
check-and-claim is atomic. ksmbd_lookup_durable_fd() already treats
fp->conn != NULL as "in use" and skips such an fp; setting fp->conn before
dropping the lock closes the race. ksmbd_conn_get() is a non-sleeping
refcount increment, safe under the rwlock. The rollback path on __open_id()
failure also clears fp->conn/tcon under the lock so concurrent readers see
a consistent state.
Fixes: b1f1e80620de ("ksmbd: centralize ksmbd_conn final release to plug transport leak")
Assisted-by: Henry (Claude):claude-opus-4
Signed-off-by: Gil Portnoy <dddhkts1@gmail.com>
Acked-by: Namjae Jeon <linkinjeon@kernel.org>
Signed-off-by: Steve French <stfrench@microsoft.com>
|
|
smb2_oplock_break_noti() and smb2_lease_break_noti() read opinfo->conn
into a local with neither READ_ONCE() nor a NULL check. Both run from
oplock_break() after opinfo_get_list() has dropped ci->m_lock, so a
concurrent SMB2 LOGOFF (session_fd_check()) can set op->conn = NULL
under ci->m_lock within that window. ksmbd_conn_r_count_inc(conn) then
writes through NULL at offset 0xc4 -- a remotely triggerable oops.
Guard both reads the way compare_guid_key() already does: read
opinfo->conn with READ_ONCE() and return early if it is NULL, before
allocating the work struct so nothing leaks. A NULL conn means the
client is gone and the break is moot, so return 0; oplock_break() treats
that as success and runs the normal teardown.
Fixes: c8efcc786146 ("ksmbd: add support for durable handles v1/v2")
Assisted-by: Henry (Claude):claude-opus-4
Signed-off-by: Gil Portnoy <dddhkts1@gmail.com>
Acked-by: Namjae Jeon <linkinjeon@kernel.org>
Signed-off-by: Steve French <stfrench@microsoft.com>
|
|
Pull smb client fixes from Steve French:
- fix uninitialized variable in smb2_writev_callback()
- detect short folioq copy in cifs_copy_folioq_to_iter()
* tag 'v7.1-rc6-smb3-client-fixes' of git://git.samba.org/sfrench/cifs-2.6:
smb: client: fix uninitialized variable in smb2_writev_callback
smb: client: detect short folioq copy in cifs_copy_folioq_to_iter()
|
|
The only user of LAST_XXX outside of fs/namei.c is fs/smb/server/vfs.c;
ksmbd_vfs_path_lookup() calls vfs_path_parent_lookup() and expects a
LAST_NORM last type (or it will be ENOENT). ksmbd_vfs_rename() also calls
vfs_path_parent_lookup() but forgets the LAST_NORM check.
It does not really make sense to have vfs_path_parent_lookup() expose
the last_type because it is only needed to ensure it is LAST_NORM. So
let's do this check in vfs_path_parent_lookup() instead and keep the
LAST_XXX internal to fs/namei.c. This changes the ENOENT errno in
ksmbd_vfs_path_lookup() to EINVAL, which matches better with how this is
handled by callers of filename_parentat().
Signed-off-by: Jori Koolstra <jkoolstra@xs4all.nl>
Link: https://patch.msgid.link/20260528175854.57626-1-jkoolstra@xs4all.nl
Reviewed-by: Amir Goldstein <amir73il@gmail.com>
Reviewed-by: NeilBrown <neil@brown.name>
Reviewed-by: Namjae Jeon <linkinjeon@kernel.org>
Signed-off-by: Christian Brauner (Amutable) <brauner@kernel.org>
|
|
compiling with W=2 pointed out that "written may be used uninitialized"
Fixes: 20d72b00ca81 ("netfs: Fix the request's work item to not require a ref")
Cc: stable@vger.kernel.org
Reviewed-by: David Howells <dhowells@redhat.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
|
|
cifs_copy_folioq_to_iter() copies a requested number of bytes from
a folio queue into the destination iterator. Since the encrypted
SMB2 READ path was changed to pass the server-declared payload
length (data_len) instead of the larger folioq buffer length, the
caller can ask for fewer bytes than the folio queue holds.
In that case the helper continues walking the remaining folios after
data_size has reached zero and calls copy_folio_to_iter() with
len = 0, which is unnecessary work.
The helper also returns 0 (success) when the folio queue is
exhausted before data_size bytes have been copied. The caller has
no way to distinguish that from a full copy and the reported
transfer count ends up larger than the amount of data placed in the
iterator.
Add an early exit when data_size reaches zero, and return an error
when the folio queue is exhausted before all requested bytes have
been copied.
Signed-off-by: Jeremy Erazo <mendozayt13@gmail.com>
Reviewed-by: David Howells <dhowells@redhat.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
|
|
FSCTL_SET_SPARSE
FSCTL_SET_SPARSE in fsctl_set_sparse() modifies the file's sparse
attribute and saves it through xattr without any permission checks.
This exposes two issues:
1) A client on a read-only share can change the sparse attribute
on files it opened, even though the share is read-only.
Other FSCTL write operations already check
test_tree_conn_flag(work->tcon, KSMBD_TREE_CONN_FLAG_WRITABLE),
but FSCTL_SET_SPARSE does not.
2) Even on writable shares, clients without FILE_WRITE_DATA or
FILE_WRITE_ATTRIBUTES access should not modify the sparse
attribute. Similar handle-level checks exist in other functions
but are missing here.
Add both share-level writable check and per-handle access check.
Use goto out on error to avoid leaking file references.
Fixes: e2f34481b24d ("cifsd: add server-side procedures for SMB3")
Cc: Namjae Jeon <linkinjeon@kernel.org>
Cc: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
Cc: Steve French <smfrench@gmail.com>
Signed-off-by: Sean Shen <grayhat@foxmail.com>
Acked-by: Namjae Jeon <linkinjeon@kernel.org>
Signed-off-by: Steve French <stfrench@microsoft.com>
|
|
ksmbd_query_inode_status() and ksmbd_lookup_fd_inode() both take a
reference on a ksmbd_inode via __ksmbd_inode_lookup() (which performs
atomic_inc_not_zero()) and later release it using a bare
atomic_dec(&ci->m_count). Unlike ksmbd_inode_put(), a bare
atomic_dec() does not check whether the reference count has reached
zero, so if the caller happens to drop the last reference, the
ksmbd_inode is leaked: it stays in the global inode hash table with
m_count == 0, future __ksmbd_inode_lookup() calls reject it via
atomic_inc_not_zero(), and ksmbd_inode_free() is never invoked.
The race is:
T1: __ksmbd_inode_lookup() -> atomic_inc_not_zero(): m_count = 2
T2: ksmbd_inode_put() -> atomic_dec_and_test(): m_count = 1
(not freed)
T1: atomic_dec(&ci->m_count) -> m_count = 0
return (LEAK)
In ksmbd_lookup_fd_inode() the matched-fp path (which now also uses
ksmbd_inode_put()) cannot currently reach m_count == 0 because the
matched ksmbd_file holds its own reference on ci, but converting it to
the proper API keeps the three call sites consistent and avoids
future regressions if the locking changes.
Because ksmbd_inode_put() may free the ksmbd_inode if this drops the
last reference, the call must happen after up_read(&ci->m_lock) on the
two affected paths in ksmbd_lookup_fd_inode(). On the no-match path
this is a pure reordering; on the matched path ksmbd_fp_get() is
moved above the unlock so that the returned ksmbd_file is pinned
before the inode reference is released.
Signed-off-by: Aleksandr Golovnya <cofedish@gmail.com>
Acked-by: Namjae Jeon <linkinjeon@kernel.org>
Signed-off-by: Steve French <stfrench@microsoft.com>
|
|
Commit d07b26f39246 ("ksmbd: require minimum ACE size in
smb_check_perm_dacl()") introduced a transposed bounds check:
if (offsetof(struct smb_ace, sid) + aces_size < CIFS_SID_BASE_SIZE)
Since offsetof(..sid) is 8 and CIFS_SID_BASE_SIZE is 8, this evaluates
to `aces_size < 0`. Because `aces_size` is always non-negative, this
check becomes dead code and never breaks the loop.
Worse, that commit removed the old 4-byte guard, meaning the loop now
reads `ace->size` (offset 2) even when `aces_size` is 0-3 bytes. This
re-opens a 2-byte heap out-of-bounds (OOB) read past the pntsd allocation
during subsequent SMB2_CREATE operations.
Fix this by properly transposing the comparison to require at least
16 bytes (8-byte offset + 8-byte SID base), matching the correct form
used in smb_inherit_dacl().
Fixes: d07b26f39246 ("ksmbd: require minimum ACE size in smb_check_perm_dacl()")
Cc: stable@vger.kernel.org
Signed-off-by: Ali Ganiyev <ali.qaniyev@gmail.com>
Acked-by: Namjae Jeon <linkinjeon@kernel.org>
Signed-off-by: Steve French <stfrench@microsoft.com>
|
|
Pull smb server fixes from Steve French:
- fix for creating tmpfiles
- fix durable reconnect error path
- validate SID in security descriptor when inheriting DACL
* tag 'v7.1-rc5-ksmbd-server-fixes' of git://git.samba.org/ksmbd:
smb/server: promote S_DEL_ON_CLS to S_DEL_PENDING when close
ksmbd: validate SID in parent security descriptor during ACL inheritance
ksmbd: fix durable reconnect error path file lifetime
|
|
Pull smb client fixes from Steve French:
- Fix missing lock
- Fix dentry in use after unmounting
- cifs.upcall security fix
- require CAP_NET_ADMIN for swn netlink
- change allocation in DUP_CTX_STR to GFP_KERNEL
- minor smbdirect debug fix
- handle_read_data() folio fix
* tag 'v7.1-rc5-smb3-client-fixes' of git://git.samba.org/sfrench/cifs-2.6:
smb: client: change allocation requirements in DUP_CTX_STR macro
smb: client: require net admin for CIFS SWN netlink
smb: smbdirect: divide, not multiply, milliseconds by 1000
cifs: Fix busy dentry used after unmounting
smb: client: use data_len for SMB2 READ encrypted folioq copy
smb: client: reject userspace cifs.spnego descriptions
smb: client: protect tc_count increment in smb2_find_smb_sess_tcon_unlocked()
|
|
Reproducer:
1. server: systemctl start ksmbd
2. client: mount -t cifs //${server_ip}/export /mnt
3. client: C program: openat(AT_FDCWD, "/mnt", O_RDWR | O_TMPFILE, 0600)
Do not treat `FILE_DELETE_ON_CLOSE_LE` as delete pending while files
remain open.
This patch fixes xfstests generic/004.
Cc: stable@vger.kernel.org
Link: https://chenxiaosong.com/en/smb-xfstests-generic-004.html
Co-developed-by: Huiwen He <hehuiwen@kylinos.cn>
Signed-off-by: Huiwen He <hehuiwen@kylinos.cn>
Signed-off-by: ChenXiaoSong <chenxiaosong@kylinos.cn>
Tested-by: Steve French <stfrench@microsoft.com>
Acked-by: Namjae Jeon <linkinjeon@kernel.org>
Signed-off-by: Steve French <stfrench@microsoft.com>
|
|
Introduce smb_validate_ntsd_sid() helper to safely validate Owner SID
and Group SID inside the NT Security Descriptor (smb_ntsd) retrieved
from the parent directory.
Cc: stable@vger.kernel.org
Signed-off-by: Junyi Liu <moss80199@gmail.com>
Signed-off-by: Namjae Jeon <linkinjeon@kernel.org>
Signed-off-by: Steve French <stfrench@microsoft.com>
|
|
After a durable reconnect succeeds, ksmbd_reopen_durable_fd() republishes
the same ksmbd_file into the session volatile-id table. If smb2_open()
then takes a later error path, cleanup first calls ksmbd_fd_put(work, fp)
and then unconditionally calls ksmbd_put_durable_fd(dh_info.fp).
In this case fp and dh_info.fp are the same object. The first put drops the
reconnect lookup reference, but the final durable put can run
__ksmbd_close_fd(NULL, fp). Because the final close is not session-aware,
it can free the file object without removing the volatile-id entry that was
just published into the session table.
Use the session-aware put for the final reconnect drop when the reconnect
had already succeeded and the error path is cleaning up the republished
file. Earlier reconnect failures, before fp is assigned to dh_info.fp, keep
using the durable-only put path.
Fixes: 1baff47b81f9 ("ksmbd: fix use-after-free in smb2_open during durable reconnect")
Signed-off-by: Junyi Liu <moss80199@gmail.com>
Acked-by: Namjae Jeon <linkinjeon@kernel.org>
Signed-off-by: Steve French <stfrench@microsoft.com>
|
|
Currently, the macro DUP_CTX_STR allocates new_ctx->field using
GFP_ATOMIC. DUP_CTX_STR is only used in smb3_fs_context_dup(), which
is never called in an atomic context. Using GFP_ATOMIC puts unnecessary
pressure on emergency memory pools.
Change GFP_ATOMIC to GFP_KERNEL.
Signed-off-by: Fredric Cover <fredric.cover.lkernel@gmail.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
|
|
CIFS_GENL_CMD_SWN_NOTIFY is the userspace witness-notify command. The
intended sender is the cifs.witness helper, but the generic-netlink
operation currently has no capability flag, so any local process can send
RESOURCE_CHANGE or CLIENT_MOVE notifications to the in-kernel witness
handler.
The same family exposes CIFS_GENL_MCGRP_SWN without multicast-group
capability flags. Register messages sent to that group include the witness
registration id and, for NTLM-authenticated mounts, the username, domain,
and password attributes copied from the CIFS session. An unprivileged
local process should not be able to join that group and receive those
messages.
Require CAP_NET_ADMIN for incoming SWN_NOTIFY commands with
GENL_ADMIN_PERM, and require CAP_NET_ADMIN over the network namespace for
joining the SWN multicast group with GENL_MCAST_CAP_NET_ADMIN. The
cifs.witness service runs with the privileges needed for both operations.
Fixes: fed979a7e082 ("cifs: Set witness notification handler for messages from userspace daemon")
Cc: stable@vger.kernel.org
Signed-off-by: Michael Bommarito <michael.bommarito@gmail.com>
Assisted-by: Claude:claude-opus-4-7
Signed-off-by: Steve French <stfrench@microsoft.com>
|
|
Unless smbdirect_connection_legacy_debug_proc_show()
wants to debug-log keep_alive_interval as microseconds,
a magnitude higher precision than available by the way,
keepalive_interval_msec should not be multiplied by 1000.
Fixes: cc55f65dd352 ("smb: client: make use of common smbdirect_socket_parameters")
Reviewed-by: Stefan Metzmacher <metze@samba.org>
Signed-off-by: Alexander A. Klimov <grandmaster@al2klimov.de>
Signed-off-by: Steve French <stfrench@microsoft.com>
|
|
Since commit 340cea84f691c ("cifs: open files should not hold ref on
superblock"), cifs file only holds the dentry ref_cnt, the cifs file
close work(cfile->deferred) could be executed after unmounting, which
will trigger a warning in generic_shutdown_super:
BUG: Dentry 00000000a14a6845{i=c,n=file} still in use (1) [unmount of
cifs cifs]
The detailed processs is:
process A process B kworker
fd = open(PATH)
vfs_open
file->__f_path = *path // dentry->d_lockref.count = 1
cifs_open
cifs_new_fileinfo
cfile->dentry = dget(dentry) // dentry->d_lockref.count = 2
close(fd)
__fput
cifs_close
queue_delayed_work(deferredclose_wq, cfile->deferred)
dput(dentry) // dentry->d_lockref.count = 1
smb2_deferred_work_close
_cifsFileInfo_put
list_del(&cifs_file->flist)
umount
cleanup_mnt
deactivate_super
cifs_kill_sb
cifs_close_all_deferred_files_sb
cifs_close_all_deferred_files
// cannot find cfile, skip _cifsFileInfo_put
kill_anon_super
generic_shutdown_super
shrink_dcache_for_umount
umount_check
WARN ! // dentry->d_lockref.count = 1
cifsFileInfo_put_final
dput(cifs_file->dentry)
// dentry->d_lockref.count = 0
Fix it by flushing 'deferredclose_wq' before calling kill_anon_super.
Fetch a reproducer in https://bugzilla.kernel.org/show_bug.cgi?id=221548.
Fixes: 340cea84f691c ("cifs: open files should not hold ref on superblock")
Cc: stable@vger.kernel.org
Reviewed-by: Shyam Prasad N <sprasad@microsoft.com>
Signed-off-by: Zhihao Cheng <chengzhihao1@huawei.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
|
|
This flag indicates the path should be opened if it's a regular file.
This is useful to write secure programs that want to avoid being
tricked into opening device nodes with special semantics while thinking
they operate on regular files. This is a requested feature from the
uapi-group[1].
The previously introduced EFTYPE error code is returned when the path
doesn't refer to a regular file. For example, if openat2 is called on
path /dev/null with OPENAT2_REGULAR in the flag param, it will return
-EFTYPE.
When used in combination with O_CREAT, either the regular file is
created, or if the path already exists, it is opened if it's a regular
file. Otherwise, -EFTYPE is returned.
When OPENAT2_REGULAR is combined with O_DIRECTORY, -EINVAL is returned
as it doesn't make sense to open a path that is both a directory and a
regular file.
The UAPI bit lives in the upper 32 bits of open_how::flags
(((__u64)1 << 32)) so that open(2) and openat(2) -- whose @flags
argument is a C int -- cannot physically express it. This is a
structural guarantee, not a runtime mask: the bit is unrepresentable in
32 bits.
Because the rest of the VFS open path narrows to 32 bits in several
places (op->open_flag, f->f_flags, the unsigned open_flag argument of
i_op->atomic_open()), build_open_flags() translates OPENAT2_REGULAR
into a kernel-internal lower-32-bit carrier __O_REGULAR (bit 4, unused
as an O_* on every architecture) before the assignment to op->open_flag.
__O_REGULAR then rides through the existing channels exactly like
__FMODE_EXEC. do_dentry_open() strips it so it cannot leak back to
userspace via fcntl(F_GETFL).
Four BUILD_BUG_ON_MSG() invariants in build_open_flags() prevent any
future bit collision or accidental low-32 redefinition:
- VALID_OPEN_FLAGS fits in 32 bits.
- OPENAT2_REGULAR lives in the upper 32 bits.
- OPENAT2_REGULAR does not alias any open()/openat() flag.
- __O_REGULAR does not alias any user-visible flag.
[1]: https://uapi-group.org/kernel-features/#ability-to-only-open-regular-files
Christian Brauner <brauner@kernel.org> says:
Move OPENAT2_REGULAR to the upper 32 bits of open_how::flags with a
kernel-internal __O_REGULAR carrier so that open(2)/openat(2) cannot
encode the flag; add BUILD_BUG_ON_MSG() invariants and register
__O_REGULAR in the fcntl_init() allocation-uniqueness BUILD_BUG_ON()
(bit count 21 -> 22).
Signed-off-by: Dorjoy Chowdhury <dorjoychy111@gmail.com>
Link: https://patch.msgid.link/20260328172314.45807-2-dorjoychy111@gmail.com
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Reviewed-by: Aleksa Sarai <aleksa@amutable.com>
Signed-off-by: Christian Brauner (Amutable) <brauner@kernel.org>
|
|
Pull smb server fixes from Steve French:
- Fix two null pointer dereferences and a memory leak
* tag 'v7.1-rc4-ksmbd-server-fixes' of git://git.samba.org/ksmbd:
ksmbd: fix null pointer dereference in compare_guid_key()
ksmbd: fix null pointer dereference in proc_show_files()
ksmbd: fix SID memory leak in set_posix_acl_entries_dacl() on overflow
|
|
In handle_read_data() the encrypted/folioq branch
(buf_len <= data_offset, reached via receive_encrypted_read for
transform PDUs > CIFSMaxBufSize + MAX_HEADER_SIZE) copies the READ
payload using buffer_len rather than data_len:
rdata->result = cifs_copy_folioq_to_iter(buffer, buffer_len,
cur_off,
&rdata->subreq.io_iter);
...
rdata->got_bytes = buffer_len;
buffer_len comes from the SMB3 transform header OriginalMessageSize
field (OriginalMessageSize - read_rsp_size); it represents the size
of the decrypted message after the SMB2 header. data_len comes from
the SMB2 READ response DataLength field; it represents the actual
READ payload size and may be smaller than buffer_len when the
decrypted message contains padding or other trailing bytes after the
READ payload. The existing check `data_len > buffer_len - pad_len`
only enforces an upper bound, so a server that emits
OriginalMessageSize larger than read_rsp_size + pad_len + data_len
passes the check and the kernel copies buffer_len bytes per response,
ignoring the server-asserted DataLength.
Two observable failures with a crafted server (DataLength=4,
buffer_len=20000):
- the kernel returns 20000 bytes per sub-request to userspace and
sets got_bytes = buffer_len, even though the response claimed
only 4 bytes of payload;
- on a partial netfs sub-request whose iterator is sized to
data_len, the over-large copy_folio_to_iter() short-reads,
cifs_copy_folioq_to_iter() returns -EIO via the n != len path,
and the entire netfs read collapses to -EIO even though the
leading sub-requests succeeded.
Use data_len for the copy length and for got_bytes so the kernel
honours the server-asserted READ payload size. For well-formed
servers (where buffer_len == pad_len + data_len) the change is
behaviour-equivalent.
Cc: stable@vger.kernel.org
Signed-off-by: Jeremy Erazo <mendozayt13@gmail.com>
Acked-by: David Howells <dhowells@redhat.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
|
|
cifs.spnego key descriptions contain authority-bearing fields such as
pid, uid, creduid, and upcall_target that cifs.upcall treats as
kernel-originating inputs. However, userspace can also create keys of
this type through request_key(2) or add_key(2), allowing those fields to
be supplied without CIFS origin.
Only accept cifs.spnego descriptions while CIFS is using its private
spnego_cred to request the key.
Fixes: f1d662a7d5e5 ("[CIFS] Add upcall files for cifs to use spnego/kerberos")
Assisted-by: avom-custom-harness:gpt-5.5-qwen3.6-mod-mix
Reviewed-by: David Howells <dhowells@redhat.com>
Signed-off-by: Asim Viladi Oglu Manizada <manizada@pm.me>
Signed-off-by: Steve French <stfrench@microsoft.com>
|
|
Commit 96c4af418586 ("cifs: Fix locking usage for tcon fields")
refactored cifs code to change cifs_tcp_ses_lock for tc_lock around
tc_count changes.
There was missing lock around tc_count increment inside
smb2_find_smb_sess_tcon_unlocked().
Cc: stable@vger.kernel.org
Fixes: 96c4af418586 ("cifs: Fix locking usage for tcon fields")
Reviewed-by: Shyam Prasad N <sprasad@microsoft.com>
Signed-off-by: Henrique Carvalho <henrique.carvalho@suse.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
|
|
Fix a couple of undefined variables introduced by the patch to fix tearing
on ->remote_i_size and ->zero_point. For some reason, make W=1 with gcc
doesn't give undefined variable warnings (but clang does).
Fixes: 2c8f4742bb76 ("netfs: Fix potential for tearing in ->remote_i_size and ->zero_point")
Reported-by: kernel test robot <lkp@intel.com>
Closes: https://lore.kernel.org/oe-kbuild-all/202605031459.eX5UbO3K-lkp@intel.com/
Closes: https://lore.kernel.org/oe-kbuild-all/202605021450.ca5QGqLH-lkp@intel.com/
cc: Steve French <sfrench@samba.org>
cc: Paulo Alcantara <pc@manguebit.org>
cc: Matthew Wilcox <willy@infradead.org>
cc: Christian Brauner <brauner@kernel.org>
cc: linux-cifs@vger.kernel.org
cc: netfs@lists.linux.dev
cc: linux-fsdevel@vger.kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs
Pull vfs fixes from Christian Brauner:
"This contains a fixes for the current development cycle. Note that AI
related review sometimes delays fixes a bit because we find more fixes
for the fixes. I might try and send smaller but more fixes PRs if this
trend keeps up.
- Fix various netfslib bugs
- Fix an out-of-bounds write when listing idmappings
- Fix the return values in jfs_mkdir() and orangefs_mkdir()
- Fix a writeback writeback array overflow in fuse
- Fix a forced iversion increment on lazytime timestamp updates
- Reject a negative timeval component in kern_select()
- Fix error return when vfs_mkdir() fails in the cachefiles code
- Fix wrong error code returned for pidns ioctls"
* tag 'vfs-7.1-rc5.fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs: (31 commits)
cachefiles: Fix error return when vfs_mkdir() fails
afs: Fix the locking used by afs_get_link()
netfs, afs: Fix write skipping in dir/link writepages
netfs: Fix netfs_read_folio() to wait on writeback
netfs: Fix folio->private handling in netfs_perform_write()
netfs: Fix partial invalidation of streaming-write folio
netfs: Fix potential UAF in netfs_unlock_abandoned_read_pages()
netfs: Fix leak of request in netfs_write_begin() error handling
netfs: Fix early put of sink folio in netfs_read_gaps()
netfs: Fix write streaming disablement if fd open O_RDWR
netfs: Fix read-gaps to remove netfs_folio from filled folio
netfs: Fix potential deadlock in write-through mode
netfs: Fix streaming write being overwritten
netfs: Defer the emission of trace_netfs_folio()
netfs: Fix netfs_invalidate_folio() to clear dirty bit if all changes gone
netfs: Fix overrun check in netfs_extract_user_iter()
netfs: fix error handling in netfs_extract_user_iter()
netfs: Fix potential uninitialised var in netfs_extract_user_iter()
netfs: fix VM_BUG_ON_FOLIO() issue in netfs_write_begin() call
netfs: Fix zeropoint update where i_size > remote_i_size
...
|
|
Fix smbdirect_map_sges_from_iter() to use pre-decrement, not post-decrement
so that it cleans up the correct slots.
Fixes: e5fbdde43017 ("cifs: Add a function to build an RDMA SGE list from an iterator")
Closes: https://sashiko.dev/#/patchset/20260326104544.509518-1-dhowells%40redhat.com
Signed-off-by: David Howells <dhowells@redhat.com>
Reviewed-by: Stefan Metzmacher <metze@samba.org>
cc: Paulo Alcantara <pc@manguebit.org>
cc: Tom Talpey <tom@talpey.com>
cc: linux-cifs@vger.kernel.org
cc: linux-fsdevel@vger.kernel.org
Signed-off-by: Steve French <stfrench@microsoft.com>
|
|
SMB2 READ response validation in cifs_readv_receive() and
handle_read_data() checks data_offset + data_len against the received
buffer length. Both values are attacker-controlled fields from the
server response and are stored as unsigned int, so the addition can
wrap before the bounds check:
fs/smb/client/transport.c:1259
if (!use_rdma_mr && (data_offset + data_len > buflen))
fs/smb/client/smb2ops.c:4839
else if (buf_len >= data_offset + data_len)
A malicious SMB server can use this to bypass validation. In the
non-encrypted receive path the client attempts an oversized socket
read and stalls for the SMB response timeout (180 seconds) before
reconnecting. In the SMB3 encrypted path, runtime testing shows the
malformed length can reach copy_to_iter() in handle_read_data() with
attacker-controlled size, where usercopy hardening stops the oversized
copy before bytes reach userspace.
Guard both call sites with check_add_overflow(), which is already
used elsewhere in this subsystem (smb2pdu.c). On overflow, treat the
response as malformed and reject with -EIO.
Signed-off-by: Jeremy Erazo <mendozayt13@gmail.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
|
|
smb3_reconfigure() moves strings out of cifs_sb->ctx before the
multichannel update, so a later failure can leave the live context
with NULL strings or options that do not match the session.
Stage the new ctx separately, commit it only on success, and restore
the snapshot on failure. Also make smb3_sync_session_ctx_passwords()
all-or-nothing.
Commit session passwords before channel updates so newly added channels
authenticate with the staged credentials.
Fixes: ef529f655a2c ("cifs: client: allow changing multichannel mount options on remount")
Reported-by: RAJASI MANDAL <rajasimandalos@gmail.com>
Closes: https://lore.kernel.org/lkml/CAEY6_V1+dzW3OD5zqXhsWyXwrDTrg5tAMGZ1AJ7_GAuRE+aevA@mail.gmail.com/
Link: https://lore.kernel.org/lkml/xkr2dlvgibq5j6gkcxd3yhhnj4atgxw2uy4eug2pxm7wy7nbms@iq6cf5taa65v/
Reviewed-by: Henrique Carvalho <henrique.carvalho@suse.com>
Signed-off-by: DaeMyung Kang <charsyam@gmail.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
|
|
On 32-bit architectures, the infinite loop is as follows:
len = p->ErrorDataLength == 0xfffffff8
u8 *next = p->ErrorContextData + len
next == p
On 32-bit architectures, the out-of-bounds read is as follows:
len = p->ErrorDataLength == 0xfffffff0
u8 *next = p->ErrorContextData + len
next == (u8 *)p - 8
Reported-by: ChenXiaoSong <chenxiaosong@kylinos.cn>
Fixes: 76894f3e2f71 ("cifs: improve symlink handling for smb2+")
Cc: stable@vger.kernel.org
Signed-off-by: Ye Bin <yebin10@huawei.com>
Reviewed-by: ChenXiaoSong <chenxiaosong@kylinos.cn>
Signed-off-by: Steve French <stfrench@microsoft.com>
|
|
session_fd_check() walks the per-inode m_op_list during durable-handle
session teardown and sets op->conn = NULL for every opinfo whose conn
matched the closing session's connection. The matching opinfo, however,
stays linked in its per-ClientGuid lease_table_list entry's lb->lease_list
because destroy_lease_table() only runs on full TCP-connection teardown,
not on SESSION_LOGOFF.
If the same TCP connection then negotiates a fresh session with the
same ClientGuid (ClientGuid is bound to NEGOTIATE, not the session, and
is unchanged across LOGOFF + SETUP) and issues a SMB2 CREATE with a
lease context on a different inode, find_same_lease_key() walks
lb->lease_list, reaches the stale opinfo, and calls compare_guid_key(),
which unconditionally dereferences opinfo->conn->ClientGUID. The conn
pointer is NULL and the kernel panics.
Reproducer requires only a successful SMB2 SESSION_SETUP and a share
configured with 'durable handles = yes'. KASAN report on mainline
70390501d194:
general protection fault, probably for non-canonical address
0xdffffc0000000069: 0000 [#1] SMP KASAN PTI
KASAN: null-ptr-deref in range [0x0000000000000348-0x000000000000034f]
Workqueue: ksmbd-io handle_ksmbd_work
RIP: 0010:bcmp+0x5b/0x230
Call Trace:
compare_guid_key+0x4b/0xd0
find_same_lease_key+0x324/0x690
smb2_open+0x6aea/0x8e60
handle_ksmbd_work+0x796/0xee0
...
Faulting address 0x348 is the offset of ClientGUID within struct
ksmbd_conn, confirming opinfo->conn was NULL.
Read opinfo->conn once and bail out if it has been cleared by a
concurrent session_fd_check(). A half-detached opinfo cannot be the
owner of an active lease, so returning 0 is the correct match result.
Fixes: c8efcc786146 ("ksmbd: add support for durable handles v1/v2")
Cc: stable@vger.kernel.org
Signed-off-by: Jeremy Laratro <research@aradex.io>
Acked-by: Namjae Jeon <linkinjeon@kernel.org>
Signed-off-by: Steve French <stfrench@microsoft.com>
|
|
When a SMB2 client opens a file with a durable v2 handle and then issues
SMB2 SESSION_LOGOFF, session_fd_check() clears fp->tcon = NULL on the
reconnectable file pointer but leaves the fp registered in global_ft.idr
until the durable scavenger fires (up to fp->durable_timeout seconds
later).
During that window any read of /proc/fs/ksmbd/files (mode 0400) panics
the kernel because proc_show_files() walks global_ft.idr and
unconditionally dereferences fp->tcon->id with no NULL guard.
Reproducer requires only a successful SMB2 SESSION_SETUP and a share
configured with 'durable handles = yes'. KASAN report on mainline
70390501d194:
general protection fault, probably for non-canonical address
0xdffffc0000000000: 0000 [#1] SMP KASAN PTI
KASAN: null-ptr-deref in range [0x0000000000000000-0x0000000000000007]
RIP: 0010:proc_show_files+0x118/0x740
Call Trace:
proc_show_files+0x118/0x740
seq_read_iter+0x4ef/0xe10
proc_reg_read_iter+0x1b7/0x280
...
Guard the dereference. A durable-disconnected fp legitimately has no
tcon; report its tree id as 0 rather than oopsing.
Fixes: b38f99c1217a ("ksmbd: add procfs interface for runtime monitoring and statistics")
Cc: stable@vger.kernel.org
Signed-off-by: Jeremy Laratro <research@aradex.io>
Acked-by: Namjae Jeon <linkinjeon@kernel.org>
Signed-off-by: Steve French <stfrench@microsoft.com>
|
|
Commit 299f962c0b02 ("ksmbd: use check_add_overflow() to prevent u16
DACL size overflow") added check_add_overflow() guards that break out
of the ACE-building loops in set_posix_acl_entries_dacl() when the
accumulated DACL size would wrap past 65535.
However, each iteration allocates a struct smb_sid via kmalloc_obj()
at the top of the loop and relies on the kfree(sid) call at the end
of the loop body (the 'pass_same_sid' label in the first loop, and
the explicit kfree at the tail of the second loop) to release it.
The newly introduced 'break' statements bypass those kfree() calls,
leaking the sid buffer every time an overflow is detected.
A malicious or malformed file with enough POSIX ACL entries to trip
the overflow check will leak one or more struct smb_sid allocations
on every request that touches the file's DACL, providing a trivial
kernel memory exhaustion vector.
Free sid before breaking out of the loops to plug the leak.
Fixes: 299f962c0b02 ("ksmbd: use check_add_overflow() to prevent u16 DACL size overflow")
Cc: stable@vger.kernel.org
Signed-off-by: Ferry Meng <mengferry@linux.alibaba.com>
Acked-by: Namjae Jeon <linkinjeon@kernel.org>
Signed-off-by: Steve French <stfrench@microsoft.com>
|
|
New Infolevels for QUERY_DIR (and QUERY_INFO) levels 78 through 81 are
now being used by Windows clients and were added to the documentation.
Add defines for them (and correct some typos in documentation). See
MS-SMB2 2.2.33 and MS-FSCC 2.4
Signed-off-by: Steve French <stfrench@microsoft.com>
|
|
Fix potential tearing in using ->remote_i_size and ->zero_point by copying
i_size_read() and i_size_write() and using the same seqcount as for i_size.
We need to make sure that netfslib and the filesystems that use it always
hold i_lock whilst updating any of the sizes to prevent i_size_seqcount
from getting corrupted.
Fixes: 4058f742105e ("netfs: Keep track of the actual remote file size")
Fixes: 100ccd18bb41 ("netfs: Optimise away reads above the point at which there can be no data")
Closes: https://sashiko.dev/#/patchset/20260414082004.3756080-1-dhowells%40redhat.com
Signed-off-by: David Howells <dhowells@redhat.com>
Link: https://patch.msgid.link/20260512123404.719402-6-dhowells@redhat.com
cc: Paulo Alcantara <pc@manguebit.org>
cc: Matthew Wilcox <willy@infradead.org>
cc: netfs@lists.linux.dev
cc: linux-fsdevel@vger.kernel.org
Signed-off-by: Christian Brauner <brauner@kernel.org>
|
|
Similarly to inode_state_read_once(), it makes the caller spell out
they acknowledge instability of the returned value.
Signed-off-by: Mateusz Guzik <mjguzik@gmail.com>
Link: https://patch.msgid.link/20260421182538.1215894-2-mjguzik@gmail.com
Reviewed-by: Jan Kara <jack@suse.cz>
Signed-off-by: Christian Brauner <brauner@kernel.org>
|
|
FS_ATTRIBUTE_INFORMATION responses have always reported
FILE_CASE_SENSITIVE_SEARCH and FILE_CASE_PRESERVED_NAMES
unconditionally. Case-insensitive filesystems like exFAT, and
casefolded directories on ext4 or f2fs, have no way to signal
their actual semantics to SMB clients.
Now that filesystems expose case behavior through ->fileattr_get,
query it via vfs_fileattr_get() and translate the FS_XFLAG_CASEFOLD
and FS_XFLAG_CASENONPRESERVING flags into the corresponding SMB
attributes. Filesystems without ->fileattr_get continue reporting
default POSIX behavior (case-sensitive, case-preserving).
SMB's FS_ATTRIBUTE_INFORMATION reports per-share attributes from
the share root, not per-file. Shares mixing casefold and
non-casefold directories report the root directory's behavior.
Acked-by: Namjae Jeon <linkinjeon@kernel.org>
Reviewed-by: Roland Mainz <roland.mainz@nrubsig.org>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Link: https://patch.msgid.link/20260507-case-sensitivity-v14-15-e62cc8200435@oracle.com
Signed-off-by: Christian Brauner <brauner@kernel.org>
|
|
Upper layers such as NFSD need a way to query whether a filesystem
handles filenames in a case-sensitive manner. Report CIFS/SMB case
handling behavior via FS_XFLAG_CASEFOLD and
FS_XFLAG_CASENONPRESERVING.
The authoritative source is the server itself: at mount time CIFS
issues QueryFSInfo(FS_ATTRIBUTE_INFORMATION) and caches the reply
on the tcon. That reply carries FILE_CASE_SENSITIVE_SEARCH and
FILE_CASE_PRESERVED_NAMES, which reflect whatever case handling
the share actually implements after SMB3.1.1 POSIX extensions
negotiation. Translating those two bits into the VFS flags lets
cifs_fileattr_get report what the server advertises rather than
what the client was asked to pretend.
QueryFSInfo is best-effort; the mount completes even if the server
does not answer. MaxPathNameComponentLength is zero in that case
and is used as the "no reply received" sentinel. When no reply is
available, fall back to the nocase mount option so that the reported
behavior agrees with the dentry comparison operations installed on
the superblock.
The callback is registered on cifs_dir_inode_ops so that NFSD,
ksmbd, and other consumers querying case handling against a
directory get a definitive answer, and on cifs_file_inode_ops to
preserve FS_COMPR_FL reporting on regular files. cifs_set_ops()
also installs cifs_namespace_inode_operations on DFS referral
directories that carry IS_AUTOMOUNT; register the same callback
there so the answer does not depend on whether the directory is
a referral point.
Registering fileattr_get routes FS_IOC_GETFLAGS through
vfs_fileattr_get() and short-circuits the syscall's fallback to
cifs_ioctl(). That fallback invoked CIFSGetExtAttr() under
CONFIG_CIFS_POSIX and CONFIG_CIFS_ALLOW_INSECURE_LEGACY on servers
advertising CIFS_UNIX_EXTATTR_CAP, surfacing the SMB1 Unix-extension
immutable, append, and nodump bits. cifs_fileattr_get carries over
only FS_COMPR_FL from cached cifsAttrs; the SMB1 extattr fetch is
not reproduced. SMB1 is deprecated, and acquiring a netfid from
within a dentry-only callback is not worth preserving a path tied
to an insecure legacy dialect.
Acked-by: Steve French <stfrench@microsoft.com>
Reviewed-by: Roland Mainz <roland.mainz@nrubsig.org>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Link: https://patch.msgid.link/20260507-case-sensitivity-v14-9-e62cc8200435@oracle.com
Signed-off-by: Christian Brauner <brauner@kernel.org>
|
|
Pull smb client fixes from Steve French:
- Fix for two ACL issues (security fix to validate dacloffset better
and chmod fix)
- Fix out of bounds reads (in check_wsl_eas and smb2_check_msg for
symlinks)
- Two Kerberos fixes including an important one when AES-256 encryption
chosen
- Fix open_cached_dir problem when directory leases disabled
* tag 'v7.1-rc3-smb3-client-fixes' of git://git.samba.org/sfrench/cifs-2.6:
smb: client: validate dacloffset before building DACL pointers
smb/client: fix out-of-bounds read in smb2_compound_op()
smb/client: fix out-of-bounds read in symlink_data()
smb: client: Zero-pad short GSS session keys per MS-SMB2
smb: client: Use FullSessionKey for AES-256 encryption key derivation
smb: client: use kzalloc to zero-initialize security descriptor buffer
cifs: abort open_cached_dir if we don't request leases
|
|
parse_sec_desc(), build_sec_desc(), and the chown path in
id_mode_to_cifs_acl() all add the server-supplied dacloffset to pntsd
before proving a DACL header fits inside the returned security
descriptor.
On 32-bit builds a malicious server can return dacloffset near
U32_MAX, wrap the derived DACL pointer below end_of_acl, and then slip
past the later pointer-based bounds checks. build_sec_desc() and
id_mode_to_cifs_acl() can then dereference DACL fields from the wrapped
pointer in the chmod/chown rewrite paths.
Validate dacloffset numerically before building any DACL pointer and
reuse the same helper at the three DACL entry points.
Fixes: bc3e9dd9d104 ("cifs: Change SIDs in ACEs while transferring file ownership.")
Cc: stable@vger.kernel.org
Assisted-by: Claude:claude-opus-4-6
Signed-off-by: Michael Bommarito <michael.bommarito@gmail.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
|
|
If a server sends a truncated response but a large OutputBufferLength, and
terminates the EA list early, check_wsl_eas() returns success without
validating that the entire OutputBufferLength fits within iov_len.
Then smb2_compound_op() does:
memcpy(idata->wsl.eas, data[0], size[0]);
Where size[0] is OutputBufferLength. If iov_len is smaller than size[0],
memcpy can read beyond the end of the rsp_iov allocation and leak adjacent
kernel heap memory.
Link: https://lore.kernel.org/linux-cifs/d998240c-aca9-420d-9dbd-f5ba24af19e0@chenxiaosong.com/
Fixes: ea41367b2a60 ("smb: client: introduce SMB2_OP_QUERY_WSL_EA")
Cc: stable@vger.kernel.org
Signed-off-by: Zisen Ye <zisenye@stu.xidian.edu.cn>
Reviewed-by: ChenXiaoSong <chenxiaosong@kylinos.cn>
Signed-off-by: Steve French <stfrench@microsoft.com>
|
|
Since smb2_check_message() returns success without length validation for
the symlink error response, in symlink_data() it is possible for
iov->iov_len to be smaller than sizeof(struct smb2_err_rsp). If the buffer
only contains the base SMB2 header (64 bytes), accessing
err->ErrorContextCount (at offset 66) or err->ByteCount later in
symlink_data() will cause an out-of-bounds read.
Link: https://lore.kernel.org/linux-cifs/297d8d9b-adf7-42fd-a1c2-5b1f230032bc@chenxiaosong.com/
Fixes: 76894f3e2f71 ("cifs: improve symlink handling for smb2+")
Cc: Stable@vger.kernel.org
Signed-off-by: Zisen Ye <zisenye@stu.xidian.edu.cn>
Reviewed-by: ChenXiaoSong <chenxiaosong@kylinos.cn>
Signed-off-by: Steve French <stfrench@microsoft.com>
|
|
Per MS-SMB2 section 3.2.5.3, Session.SessionKey is the first 16 bytes
of the GSS cryptographic key, right-padded with zero bytes if the key
is shorter than 16 bytes.
SMB2_auth_kerberos() copies the GSS session key from the cifs.upcall
response using kmemdup(msg->data, msg->sesskey_len, ...) and stores
the GSS-reported length verbatim in ses->auth_key.len. generate_key()
reads SMB2_NTLMV2_SESSKEY_SIZE bytes from this buffer when feeding the
HMAC-SHA256 KDF for signing key derivation. If a GSS mechanism returns
a session key shorter than 16 bytes (e.g. a deprecated single-DES
Kerberos enctype with an 8-byte session key), the KDF call performs an
out-of-bounds slab read and derives keys that do not match the server,
which pads per the spec.
Modern KDCs disable short-key enctypes by default, so this is latent
rather than reachable in production, but it is still a kernel heap
over-read.
Allocate auth_key.response with kzalloc() at a length of
max(msg->sesskey_len, SMB2_NTLMV2_SESSKEY_SIZE), copy the GSS key in,
and rely on kzalloc()'s zero initialization for the spec-mandated
padding. Set ses->auth_key.len to the padded length. Larger GSS keys
(e.g. the 32-byte aes256-cts-hmac-sha1-96 session key) continue to be
stored at their natural length, preserving the FullSessionKey path.
Emit a cifs_dbg(VFS, ...) message when a short key is encountered to
surface deprecated-enctype usage.
NTLMv2 and NTLMSSP code paths produce a 16-byte session key by
construction and are unaffected.
Signed-off-by: Piyush Sachdeva <psachdeva@microsoft.com>
Signed-off-by: Piyush Sachdeva <s.piyush1024@gmail.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
|
|
When Kerberos authentication is used with AES-256 encryption (AES-256-CCM
or AES-256-GCM), the SMB3 encryption and decryption keys must be derived
using the full session key (Session.FullSessionKey) rather than just the
first 16 bytes (Session.SessionKey).
Per MS-SMB2 section 3.2.5.3.1, when Connection.Dialect is "3.1.1" and
Connection.CipherId is AES-256-CCM or AES-256-GCM, Session.FullSessionKey
must be set to the full cryptographic key from the GSS authentication
context. The encryption and decryption key derivation (SMBC2SCipherKey,
SMBS2CCipherKey) must use this FullSessionKey as the KDF input. The
signing key derivation continues to use Session.SessionKey (first 16
bytes) in all cases.
Previously, generate_key() hardcoded SMB2_NTLMV2_SESSKEY_SIZE (16) as the
HMAC-SHA256 key input length for all derivations. When Kerberos with
AES-256 provides a 32-byte session key, the KDF for encryption/decryption
was using only the first 16 bytes, producing keys that did not match the
server's, causing mount failures with sec=krb5 and require_gcm_256=1.
Add a full_key_size parameter to generate_key() and pass the appropriate
size from generate_smb3signingkey():
- Signing: always SMB2_NTLMV2_SESSKEY_SIZE (16 bytes)
- Encryption/Decryption: ses->auth_key.len when AES-256, otherwise 16
Also fix cifs_dump_full_key() to report the actual session key length for
AES-256 instead of hardcoded CIFS_SESS_KEY_SIZE, so that userspace tools
like Wireshark receive the correct key for decryption.
Cc: <stable@vger.kernel.org>
Reviewed-by: Bharath SM <bharathsm@microsoft.com>
Signed-off-by: Piyush Sachdeva <psachdeva@microsoft.com>
Signed-off-by: Piyush Sachdeva <s.piyush1024@gmail.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
|
|
Commit 62e7dd0a39c2d ("smb: common: change the data type of num_aces
to le16") split struct smb_acl's __le32 num_aces field into __le16
num_aces and __le16 reserved. The reserved field corresponds to Sbz2
in the MS-DTYP ACL wire format, which must be zero [1].
When building an ACL descriptor in build_sec_desc(), we are using a
kmalloc()'ed descriptor buffer and writing the fields explicitly using
le16() writes now. This never writes to the 2 byte reserved field,
leaving it as uninitialized heap data.
When the reserved field happens to contain non-zero slab garbage,
Samba rejects the security descriptor with "ndr_pull_security_descriptor
failed: Range Error", causing chmod to fail with EINVAL.
Change kmalloc() to kzalloc() to ensure the entire buffer is
zero-initialized.
Fixes: 62e7dd0a39c2d ("smb: common: change the data type of num_aces to le16")
Cc: stable@vger.kernel.org
Signed-off-by: Bjoern Doebel <doebel@amazon.de>
Assisted-by: Kiro:claude-opus-4.6
[1] https://learn.microsoft.com/en-us/openspecs/windows_protocols/ms-dtyp/20233ed8-a6c6-4097-aafa-dd545ed24428
Signed-off-by: Steve French <stfrench@microsoft.com>
|
|
It is possible that SMB2_open_init may not set lease context based
on the requested oplock level. This can happen when leases have been
temporarily or permanently disabled. When this happens, we will have
open_cached_dir making an open without lease context and the response
will anyway be rejected by open_cached_dir (thereby forcing a close to
discard this open). That's unnecessary two round-trips to the server.
This change adds a check before making the open request to the server
to make sure that SMB2_open_init did add the expected lease context
to the open in open_cached_dir.
Cc: <stable@vger.kernel.org>
Reviewed-by: Bharath SM <bharathsm@microsoft.com>
Signed-off-by: Shyam Prasad N <sprasad@microsoft.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
|
|
smb_inherit_dacl() walks the parent directory DACL loaded from the
security descriptor xattr. It verifies that each ACE contains the fixed
SID header before using it, but does not verify that the variable-length
SID described by sid.num_subauth is fully contained in the ACE.
A malformed inheritable ACE can advertise more subauthorities than are
present in the ACE. compare_sids() may then read past the ACE.
smb_set_ace() also clamps the copied destination SID, but used the
unchecked source SID count to compute the inherited ACE size. That could
advance the temporary inherited ACE buffer pointer and nt_size accounting
past the allocated buffer.
Fix this by validating the parent ACE SID count and SID length before
using the SID during inheritance. Compute the inherited ACE size from the
copied SID so the size matches the bounded destination SID. Reject the
inherited DACL if size accumulation would overflow smb_acl.size or the
security descriptor allocation size.
Fixes: e2f34481b24d ("cifsd: add server-side procedures for SMB3")
Signed-off-by: Shota Zaizen <s@zaizen.me>
Acked-by: Namjae Jeon <linkinjeon@kernel.org>
Signed-off-by: Steve French <stfrench@microsoft.com>
|
|
The kernel test robot reported W=1 build warnings for ksmbd_conn_get()
and ksmbd_conn_put() due to missing parameter descriptions.
Add the @conn description to fix these warnings.
Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Namjae Jeon <linkinjeon@kernel.org>
Signed-off-by: Steve French <stfrench@microsoft.com>
|
|
Non-pipe shares must have a duplicated backing path before they can be
published. share_config_request() currently calls kstrndup() for that
path, but if the allocation fails it leaves ret unchanged. If veto list
parsing succeeds and share->name exists, the partially built share is
still inserted into the global share table with share->path left NULL.
A later share-root SMB2 create uses tree_conn->share_conf->path as the
lookup root. If the share was published with path == NULL, that request
passes a NULL pathname into do_getname_kernel()/strlen() and can crash
the ksmbd worker.
Set ret = -ENOMEM when path duplication fails so the incomplete share is
destroyed before publication.
Fixes: e2f34481b24d ("cifsd: add server-side procedures for SMB3")
Signed-off-by: Shuhao Fu <sfual@cse.ust.hk>
Acked-by: Namjae Jeon <linkinjeon@kernel.org>
Signed-off-by: Steve French <stfrench@microsoft.com>
|
|
ksmbd_durable_scavenger() has two related races against any walker
that iterates f_ci->m_fp_list, including ksmbd_lookup_fd_inode()
(used by ksmbd_vfs_rename) and the share-mode checks in
fs/smb/server/smb_common.c.
(1) fp->node list-head reuse. Durable-preserved handles can remain
linked on f_ci->m_fp_list after session teardown so share-mode checks
still see them while the handle is reconnectable. The scavenger
collected expired handles by adding fp->node to a local
scavenger_list after removing them from the global durable idr.
Because fp->node is the same list_head used by m_fp_list,
list_add(&fp->node, &scavenger_list) overwrites the m_fp_list links
and corrupts both lists. CONFIG_DEBUG_LIST can report this on the
share-mode walk path.
(2) Refcount race against m_fp_list walkers. The scavenger qualifies
an expired durable handle with atomic_read(&fp->refcount) > 1 and
fp->conn under global_ft.lock, removes fp from global_ft, then drops
global_ft.lock before unlinking fp from m_fp_list and freeing it.
During that gap fp is still linked on m_fp_list with f_state ==
FP_INITED. ksmbd_lookup_fd_inode() under m_lock read calls
ksmbd_fp_get() (atomic_inc_not_zero on refcount that is still 1) and
takes a live reference; the scavenger then unlinks and frees fp
while the holder owns a reference, leading to UAF on the holder's
subsequent ksmbd_fd_put() and on any field reads performed by a
concurrent share-mode walker that iterates m_fp_list without taking
ksmbd_fp_get() (smb_check_perm_dleases-like paths).
Fix both:
* Stop reusing fp->node as a scavenger-private list node. Remove
one expired handle from global_ft under global_ft.lock, take an
explicit transient reference, drop the lock, unlink fp->node
from m_fp_list under f_ci->m_lock, then drop both the durable
lifetime and transient references with atomic_sub_and_test(2,
&fp->refcount). If the scavenger is the last putter the close
runs there; otherwise an in-flight holder that already raced
through the m_fp_list lookup owns the final close via its
ksmbd_fd_put() path. The one-at-a-time disposal can rescan the
durable idr when multiple handles expire in the same pass, but
durable scavenging is a background expiration path and the final
full scan recomputes min_timeout before the next wait.
* Clear fp->persistent_id inside __ksmbd_remove_durable_fd() right
after idr_remove(), so a delayed final close from a holder that
snatched fp does not re-issue idr_remove() on a persistent id
that idr_alloc_cyclic() in ksmbd_open_durable_fd() may have
already handed out to a brand-new durable handle.
* Bypass the per-conn open_files_count decrement in
__put_fd_final() when fp is detached from any session table
(fp->conn cleared by session_fd_check() at durable preserve --
paired with the volatile_id clear at unpublish, so checking
fp->conn alone is sufficient). The walker that owns the final
close runs from an unrelated work->conn whose
stats.open_files_count never tracked this durable fp; without
this guard the holder would underflow that unrelated counter.
The two races are folded into one patch because patch (1) alone
cleans up the corrupted list but leaves a deterministic UAF window
for m_fp_list walkers that the transient-reference and
persistent_id discipline in (2) close; bisecting onto an
intermediate state would land on a UAF that pre-patch chaos merely
made less reproducible.
Validation:
* CONFIG_DEBUG_LIST coverage for the list_head reuse path.
* KASAN-enabled direct SMB2 durable-handle coverage that exercised
ksmbd_durable_scavenger() and non-NULL ksmbd_lookup_fd_inode()
returns while durable handles expired under concurrent rename
lookups, with no KASAN, UAF, list-corruption, ODEBUG, or WARNING
reports.
* checkpatch --strict
* make -j$(nproc) M=fs/smb/server
Fixes: d484d621d40f ("ksmbd: add durable scavenger timer")
Signed-off-by: DaeMyung Kang <charsyam@gmail.com>
Acked-by: Namjae Jeon <linkinjeon@kernel.org>
Signed-off-by: Steve French <stfrench@microsoft.com>
|
|
__close_file_table_ids() is the per-session teardown that closes every
fp belonging to a session (or to one tree connect on that session) by
walking the session's volatile-id idr. The current loop has three
related problems on busy or racing workloads:
* Sleeping under ft->lock. The session-teardown skip callback,
session_fd_check(), already sleeps in ksmbd_vfs_copy_durable_owner()
-> kstrdup(GFP_KERNEL) and down_write(&fp->f_ci->m_lock) (a
rw_semaphore). Running the callback inside write_lock(&ft->lock)
trips CONFIG_DEBUG_ATOMIC_SLEEP / CONFIG_PROVE_LOCKING on a
durable-fd workload.
* Refcount accounting blind to f_state. The unconditional
atomic_dec_and_test(&fp->refcount) does not distinguish
FP_INITED (idr-owned reference still intact) from FP_CLOSED (an
earlier ksmbd_close_fd() already consumed the idr-owned reference
while leaving fp in the idr because a holder kept refcount
non-zero). When the latter races with teardown the same path
over-decrements into a holder reference and ksmbd_fd_put() later
UAFs that holder.
* FP_NEW window. Between __open_id() publishing fp into the
session idr and ksmbd_update_fstate(..., FP_INITED) committing the
transition at the end of smb2_open(), an fp is in FP_NEW and an
intervening teardown that takes a transient reference and
unpublishes the volatile id leaves the original idr-owned
reference orphaned -- the opener is unaware that fp has been
unpublished, returns success to the client, and the fp leaks at
refcount = 1.
Refactor __close_file_table_ids() to take a transient reference on fp
and unpublish fp from the session idr *under ft->lock* before calling
skip() outside the lock. A transient ref protects lifetime but not
concurrent field mutation, so the idr_remove() is what keeps
__ksmbd_lookup_fd() through this session's idr from granting a new
ksmbd_fp_get() reference to an fp whose fp->conn / fp->tcon /
fp->volatile_id / op->conn / lock_list links are about to be rewritten
by session_fd_check(). Durable reconnect is unaffected because it
reaches fp through the global durable table (ksmbd_lookup_durable_fd
-> global_ft).
Decide n_to_drop together with any FP_INITED -> FP_CLOSED transition
under ft->lock so teardown and ksmbd_close_fd() never both consume the
idr-owned reference. See ksmbd_mark_fp_closed() for the per-state
accounting. For the FP_NEW path to be safe, the opener has to learn
that fp was unpublished: ksmbd_update_fstate() now returns -ENOENT
when an FP_NEW -> FP_INITED transition finds f_state already advanced
or the volatile id cleared (both committed by teardown under
ft->lock); smb2_open() propagates that as STATUS_OBJECT_NAME_INVALID
and drops the original reference via ksmbd_fd_put().
The list removal cannot be left for a deferred final putter because
fp->volatile_id has already been cleared and __ksmbd_remove_fd() will
intentionally skip both idr_remove() and list_del_init(). Move the
m_fp_list unlink in __ksmbd_remove_fd() above the volatile-id check so
that an FP_NEW fp that happened to be added to m_fp_list (smb2_open()
adds fp->node before ksmbd_update_fstate() runs) is still cleaned up
on the deferred putter path; list_del_init() on an empty node is a
no-op and remains safe for fps that were never added.
Add a defensive guard in session_fd_check() that refuses non-FP_INITED
fps so that even if a teardown reaches an FP_NEW fp it falls into the
close branch (where the n_to_drop = 1 accounting keeps the opener's
reference alive) instead of the durable-preserve branch (which mutates
fp->conn / fp->tcon).
Validation on a debug kernel additionally built with CONFIG_DEBUG_LIST
and CONFIG_DEBUG_OBJECTS_WORK used a same-session two-tcon workload
(open/write storm on one tcon, 50 tree disconnects on the other) and
reported no list-corruption, work_struct ODEBUG, sleep-in-atomic,
lockdep or kmemleak reports. Reverting only the
__close_file_table_ids() hunk while keeping a forced-is_reconnectable()
harness produced the expected sleep-in-atomic at vfs_cache.c:1095,
confirming the ft->lock-out-of-sleepable-skip discipline.
KASAN-enabled direct SMB2 coverage with durable handles enabled
exercised ksmbd_close_tree_conn_fds(), ksmbd_close_session_fds(),
the FP_NEW failure path, tree_conn_fd_check(), and a non-zero
session_fd_check() durable-preserve return. This produced no KASAN,
DEBUG_LIST, ODEBUG, or WARNING reports.
Fixes: f44158485826 ("cifsd: add file operations")
Signed-off-by: DaeMyung Kang <charsyam@gmail.com>
Acked-by: Namjae Jeon <linkinjeon@kernel.org>
Signed-off-by: Steve French <stfrench@microsoft.com>
|
|
ksmbd_conn_free() is one of four sites that can observe the last
refcount drop of a struct ksmbd_conn. The other three
fs/smb/server/connection.c ksmbd_conn_r_count_dec()
fs/smb/server/oplock.c __free_opinfo()
fs/smb/server/vfs_cache.c session_fd_check()
end the conn with a bare kfree(), skipping
ida_destroy(&conn->async_ida) and
conn->transport->ops->free_transport(conn->transport). Whenever one
of them is the last putter, the embedded async_ida and the entire
transport struct leak -- for TCP, that is also the struct socket and
the kvec iov.
__free_opinfo() being a final putter is not theoretical. opinfo_put()
queues the callback via call_rcu(&opinfo->rcu, free_opinfo_rcu), so
ksmbd_server_terminate_conn() can deposit N opinfo releases in RCU and
have ksmbd_conn_free() run in the handler thread before any of them
fire. ksmbd_conn_free() then observes refcnt > 0 and short-circuits;
the last RCU-delivered __free_opinfo() falls onto its bare kfree(conn)
branch and the transport is lost.
A/B validation in a QEMU/virtme guest, mounting //127.0.0.1/testshare:
each iteration holds 8 files open via sleep processes, force-closes
TCP with "ss -K sport = :445", kills the holders, lazy-umounts;
repeated 10 times, then ksmbd shutdown and kmemleak scan.
state conn_alloc conn_free tcp_free opi_rcu kmemleak
---------- ---------- --------- -------- ------- --------
pre-patch 20 20 10 160 7
with patch 20 20 20 160 0
Pre-patch conn_free=20 with tcp_free=10 directly demonstrates the
bare-kfree paths skipping transport cleanup; kmemleak backtraces point
into struct tcp_transport / iov. With this patch tcp_free matches
conn_free at 20/20 and kmemleak is clean.
Move the per-struct final release into __ksmbd_conn_release_work() and
route the three bare-kfree final-put sites through a new
ksmbd_conn_put(). Those sites now pair ida_destroy() and
free_transport() with kfree(conn) regardless of which holder happens
to release the last reference. stop_sessions() only triggers the
transport shutdown and does not itself drop the last conn reference,
so it is unaffected.
The centralized release reaches sock_release() -> tcp_close() ->
lock_sock_nested() (might_sleep) from every final putter, including
__free_opinfo() invoked from an RCU softirq callback, which trips
CONFIG_DEBUG_ATOMIC_SLEEP. Defer the release to a dedicated
ksmbd_conn_wq workqueue so ksmbd_conn_put() is safe from any
non-sleeping context.
Make ksmbd_file own a strong connection reference while fp->conn is
non-NULL so durable-preserve and final-close paths cannot dereference
a stale connection. ksmbd_open_fd() and ksmbd_reopen_durable_fd()
take the reference via ksmbd_conn_get() (the latter also reorders the
fp->conn / fp->tcon assignments before __open_id() so the published fp
is never observed with fp->conn == NULL); session_fd_check() and
__ksmbd_close_fd() drop it via ksmbd_conn_put(). With that invariant,
session_fd_check() can take a local conn pointer once and use it
across the m_op_list and lock_list iterations even though op->conn
puts may otherwise drop the last reference.
At module exit the workqueue is flushed and destroyed after
rcu_barrier(), so any release queued by a trailing RCU callback is
drained before the inode hash and module text go away.
Fixes: ee426bfb9d09 ("ksmbd: add refcnt to ksmbd_conn struct")
Signed-off-by: DaeMyung Kang <charsyam@gmail.com>
Acked-by: Namjae Jeon <linkinjeon@kernel.org>
Signed-off-by: Steve French <stfrench@microsoft.com>
|
|
ib_dma_map_sg() modifies the provided scatterlist and returns the
number of mapped entries, which can be fewer than the requested
mr->sgt.nents if the DMA controller coalesces contiguous memory
segments. Passing the original, uncoalesced count to ib_map_mr_sg()
causes memory registration failures if coalescing actually occurs.
Capture the actual mapped count returned by ib_dma_map_sg() and pass it
to ib_map_mr_sg() to ensure correct MR registration.
Also update the ib_dma_map_sg() error logging to drop the error
pointer formatting, since the return value is an integer count
rather than an error code.
Ensure a proper error code (-EIO) is assigned when DMA mapping or
MR registration fails.
Fixes: de5ef8ec3c46 ("smb: smbdirect: introduce smbdirect_mr.c with client mr code")
Closes: https://bugzilla.kernel.org/show_bug.cgi?id=221408
Reviewed-by: Stefan Metzmacher <metze@samba.org>
Acked-by: Namjae Jeon <linkinjeon@kernel.org>
Signed-off-by: Yi Kuo <yi@yikuo.dev>
Signed-off-by: Steve French <stfrench@microsoft.com>
|
|
This makes it easier to rebuild cifs.ko and ksmbd.ko against
a running kernel.
Suggested-by: Christoph Hellwig <hch@infradead.org>
Link: https://lore.kernel.org/linux-cifs/aehrPuY60VMcYGU8@infradead.org/
Cc: Steve French <smfrench@gmail.com>
Cc: Tom Talpey <tom@talpey.com>
Cc: Long Li <longli@microsoft.com>
Cc: Namjae Jeon <linkinjeon@kernel.org>
Cc: Christoph Hellwig <hch@infradead.org>
Cc: linux-cifs@vger.kernel.org
Cc: samba-technical@lists.samba.org
Signed-off-by: Stefan Metzmacher <metze@samba.org>
Signed-off-by: Steve French <stfrench@microsoft.com>
|
|
This is a better solution than
EXPORT_SYMBOL_FOR_MODULES(__sym, "cifs,ksmbd") as it makes
it possible to rebuild smbdirect.ko against a
running kernel and then load the existing cifs.ko and ksmbd.ko
from the running kernel.
Suggested-by: Christoph Hellwig <hch@infradead.org>
Link: https://lore.kernel.org/linux-cifs/aehrPuY60VMcYGU8@infradead.org/
Cc: Steve French <smfrench@gmail.com>
Cc: Tom Talpey <tom@talpey.com>
Cc: Long Li <longli@microsoft.com>
Cc: Namjae Jeon <linkinjeon@kernel.org>
Cc: Christoph Hellwig <hch@infradead.org>
Cc: linux-cifs@vger.kernel.org
Cc: samba-technical@lists.samba.org
Signed-off-by: Stefan Metzmacher <metze@samba.org>
Signed-off-by: Steve French <stfrench@microsoft.com>
|
|
Pull smb server fixes from Steve French:
- Fix shutdown (stop sessions)
- Fix readdir unsupported info level
* tag 'v7.1-rc2-ksmbd-server-fixes' of git://git.samba.org/ksmbd:
ksmbd: rewrite stop_sessions() with restartable iteration
smb: server: handle readdir_info_level_struct_sz() error
|
|
Today we skip calling change_conf for negotiates and session setup
requests. This can be a problem for mchan as the immediate next call
after session setup could be due to an I/O that is made on the
mount point. For single channel, this is not a problem as
there will be several calls after setting up session.
This change enforces calling change_conf when the total credits contain
enough for reservations for echoes and oplocks. We expect this to happen
during the last session setup response. This way, echoes and oplocks are
not disabled before the first request to the server. So if that first
request is an open, it does not need to disable requesting leases.
Cc: <stable@vger.kernel.org>
Reviewed-by: Bharath SM <bharathsm@microsoft.com>
Signed-off-by: Shyam Prasad N <sprasad@microsoft.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
|
|
Currently, smb2_compound_op() allocates
struct smb2_compound_vars *vars using GFP_ATOMIC, although
smb2_compound_op() can sleep when it calls compound_send_recv()
before vars is freed.
Allocate vars using GFP_KERNEL.
Signed-off-by: Fredric Cover <fredric.cover.lkernel@gmail.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
|
|
stop_sessions() walks conn_list with hash_for_each() and, for every
entry, drops conn_list_lock across the transport ->shutdown() call
before re-acquiring the read lock to continue the loop. The hash
walk relies on cross-iteration state (the current bucket and the
hlist position), which is not preserved across unlock/relock: if
another thread performs a list mutation during the unlocked window,
the ongoing iteration becomes unreliable and can re-visit
connections that have already been handled or skip connections that
have not. The outer `if (!hash_empty(conn_list)) goto again;` retry
masks the symptom in the common case but does not address the
unsafe iteration itself.
Reframe the loop so it never relies on iterator state across
unlock/relock. Under conn_list_lock held for read, pick the first
connection whose ->shutdown() has not yet been issued by this path,
pin it by taking an extra reference, record that fact on the
connection and mark it EXITING while still inside the locked walk,
then drop the lock. Then call ->shutdown() outside the lock, drop
the pin (freeing the connection if the handler already released its
reference), and restart from the top.
Use a new per-connection flag, conn->stop_called, as the "shutdown
issued from stop_sessions()" marker rather than reusing the status
state. ksmbd_conn_set_exiting() is also invoked by
ksmbd_sessions_deregister() on sibling channels of a multichannel
session without issuing a transport shutdown, so treating
KSMBD_SESS_EXITING as "already handled here" would skip connections
that still need shutdown() to wake their handler out of recv(),
leaving the outer retry waiting indefinitely for the hash to drain.
stop_sessions() is serialised by init_lock in
ksmbd_conn_transport_destroy(), so writing stop_called under the
read lock has no other writer.
Set EXITING inside the locked walk so the selection, the stop_called
marker, and the status transition all happen together, and guard
against regressing a connection that has already advanced to
KSMBD_SESS_RELEASING on its own (for example, if the handler exited
its receive loop for an unrelated reason between teardown steps).
When the pin drop is the last put, release the transport and pair
ida_destroy(&target->async_ida) with the ida_init() done in
ksmbd_conn_alloc(), so stop_sessions() retiring a connection on its
own does not leak the xarray backing of the embedded async_ida.
The outer retry with msleep() is kept to wait for handler threads to
reach ksmbd_conn_free() and drain the hash.
Observed with an instrumented build that logs one line per visit and
widens the unlocked window before ->shutdown() by 200 ms, under
five concurrent cifs mounts (nosharesock, one connection each):
* Current code: the same connection address is revisited many
times during a single stop_sessions() call and ->shutdown() is
invoked well beyond the number of live connections before the
hash finally drains.
* Rewritten code: each live connection produces exactly one
->shutdown() call; the function returns as soon as the hash is
empty.
Functional teardown via `ksmbd.control --shutdown` with the same
five mounts completes cleanly on the rewritten path.
Performance is observably unchanged. Tearing down N concurrent
nosharesock cifs connections with `ksmbd.control --shutdown` +
`rmmod ksmbd` takes essentially the same wall time before and after
the rewrite:
N before after
10 4.93s 5.34s
30 7.34s 7.03s
50 7.31s 7.01s (3-run avg: 7.04s vs 7.25s)
100 6.98s 6.78s
200 6.77s 6.89s
and the number of ->shutdown() calls equals the number of live
connections on both paths when the race is not widened. The
teardown is dominated by the msleep(100)-based outer retry waiting
for handler threads to run ksmbd_conn_free(), not by the iteration
itself; the restartable loop's worst-case O(N^2) visit cost is in
the microseconds even at N=200 and sits far below the msleep(100)
granularity.
Applied alone on top of ksmbd-for-next-next, this patch does not
introduce a new leak site. Under the same reproducer (10x
concurrent-holders + ss -K + ksmbd.control --shutdown + rmmod), the
tree still shows the pre-existing per-connection transport leak
count that arises when the last refcount drop lands in one of
ksmbd_conn_r_count_dec(), __free_opinfo() or session_fd_check() -
all of which end with a bare kfree() today. kmemleak backtraces
for the unreferenced objects point into the TCP accept path
(sk_clone -> inet_csk_clone_lock, sock_alloc_inode) and none
involve stop_sessions(). Plugging those bare-kfree sites is the
responsibility of the follow-up patch.
Fixes: e2f34481b24d ("cifsd: add server-side procedures for SMB3")
Cc: stable@vger.kernel.org
Signed-off-by: DaeMyung Kang <charsyam@gmail.com>
Acked-by: Namjae Jeon <linkinjeon@kernel.org>
Signed-off-by: Steve French <stfrench@microsoft.com>
|
|
early exit in smb2_populate_readdir_entry() if the requested info_level
is unknown.
Signed-off-by: Marios Makassikis <mmakassikis@freebox.fr>
Acked-by: Namjae Jeon <linkinjeon@kernel.org>
Signed-off-by: Steve French <stfrench@microsoft.com>
|
|
Commit abdb1742a3123 ("cifs: get rid of mount options string parsing")
removed the last caller.
Signed-off-by: David Disseldorp <ddiss@suse.de>
Signed-off-by: Steve French <stfrench@microsoft.com>
|
|
Pull more smb server updates from Steve French:
- move fs/smb/common/smbdirect to fs/smb/smbdirect
- change signature calc to use AES-CMAC library, simpler and faster
- invalid signature fix
- multichannel fix
- open create options fix
- fix durable handle leak
- cap maximum lock count to avoid potential denial of service
- four connection fixes: connection free and session destroy IDA fixes,
refcount fix, connection leak fix, max_connections off by one fix
- IPC validation fix
- fix out of bounds write in getting xattrs
- fix use after free in durable handle reconnect
- three ACL fixes: fix potential ACL overflow, harden num_aces check,
and fix minimum ACE size check
* tag 'v7.1-rc-part2-ksmbd-fixes' of git://git.samba.org/ksmbd:
smb: smbdirect: move fs/smb/common/smbdirect/ to fs/smb/smbdirect/
smb: server: stop sending fake security descriptors
ksmbd: scope conn->binding slowpath to bound sessions only
ksmbd: fix CreateOptions sanitization clobbering the whole field
ksmbd: fix durable fd leak on ClientGUID mismatch in durable v2 open
ksmbd: fix O(N^2) DoS in smb2_lock via unbounded LockCount
ksmbd: destroy async_ida in ksmbd_conn_free()
ksmbd: destroy tree_conn_ida in ksmbd_session_destroy()
ksmbd: Use AES-CMAC library for SMB3 signature calculation
ksmbd: reset rcount per connection in ksmbd_conn_wait_idle_sess_id()
ksmbd: fix out-of-bounds write in smb2_get_ea() EA alignment
ksmbd: use check_add_overflow() to prevent u16 DACL size overflow
ksmbd: fix use-after-free in smb2_open during durable reconnect
ksmbd: validate num_aces and harden ACE walk in smb_inherit_dacl()
smb: server: fix max_connections off-by-one in tcp accept path
ksmbd: require minimum ACE size in smb_check_perm_dacl()
ksmbd: validate response sizes in ipc_validate_msg()
smb: server: fix active_num_conn leak on transport allocation failure
|
|
git://git.samba.org/sfrench/cifs-2.6
Pull smb client fixes from Steve French:
- Four bug fixes: OOB read in ioctl query info, 3 ACL fixes
- SMB1 Unix extensions mount fix
- Four crypto improvements: move to AES-CMAC library, simpler and faster
- Remove drop_dir_cache to avoid potential crash, and move to /procfs
- Seven SMB3.1.1 compression fixes
* tag 'v7.1-rc1-part3-smb3-client-fixes' of git://git.samba.org/sfrench/cifs-2.6:
smb: client: Drop 'allocate_crypto' arg from smb*_calc_signature()
smb: client: Make generate_key() return void
smb: client: Remove obsolete cmac(aes) allocation
smb: client: Use AES-CMAC library for SMB3 signature calculation
smb: common: add SMB3_COMPRESS_MAX_ALGS
smb: client: compress: add code docs to lz77.c
smb: client: compress: LZ77 optimizations
smb: client: compress: increase LZ77_MATCH_MAX_DIST
smb: client: compress: fix counting in LZ77 match finding
smb: client: compress: fix buffer overrun in lz77_compress()
smb: client: scope end_of_dacl to CIFS_DEBUG2 use in parse_dacl
smb: client: fix (remove) drop_dir_cache module parameter
smb: client: require a full NFS mode SID before reading mode bits
smb: client: validate the whole DACL before rewriting it in cifsacl
smb: client: fix OOB read in smb2_ioctl_query_info QUERY_INFO path
cifs: update internal module version number
smb: client: compress: fix bad encoding on last LZ77 flag
smb: client: fix dir separator in SMB1 UNIX mounts
|
|
This also removes the smbdirect_ prefix from the files.
Suggested-by: Linus Torvalds <torvalds@linux-foundation.org>
Link: https://lore.kernel.org/linux-cifs/CAHk-=whmue3PVi88K0UZLZO0at22QhQZ-yu+qO2TOKyZpGqecw@mail.gmail.com/
Cc: Steve French <smfrench@gmail.com>
Cc: Tom Talpey <tom@talpey.com>
Cc: Long Li <longli@microsoft.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: linux-cifs@vger.kernel.org
Cc: samba-technical@lists.samba.org
Acked-by: Namjae Jeon <linkinjeon@kernel.org>
Signed-off-by: Stefan Metzmacher <metze@samba.org>
Signed-off-by: Steve French <stfrench@microsoft.com>
|
|
Since the crypto library API is now being used instead of crypto_shash,
all structs for MAC computation are now just fixed-size structs
allocated on the stack; no dynamic allocations are ever required.
Besides being much more efficient, this also means that the
'allocate_crypto' argument to smb2_calc_signature() and
smb3_calc_signature() is no longer used. Remove this unused argument.
Acked-by: Steve French <stfrench@microsoft.com>
Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
Signed-off-by: Steve French <stfrench@microsoft.com>
|
|
Since the crypto library API is now being used instead of crypto_shash,
generate_key() can no longer fail. Make it return void and simplify the
callers accordingly.
Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
Signed-off-by: Steve French <stfrench@microsoft.com>
|
|
Since the crypto library API is now being used instead of crypto_shash,
the "cmac(aes)" crypto_shash that is being allocated and stored in
'struct cifs_secmech' is no longer used. Remove it.
That makes the kconfig selection of CRYPTO_CMAC and the module softdep
on "cmac" unnecessary. So remove those too.
Finally, since this removes the last use of crypto_shash from the smb
client, also remove the remaining crypto_shash-related helper functions.
Note: cifs_unicode.c was relying on <linux/unaligned.h> being included
transitively via <crypto/internal/hash.h>. Since the latter include is
removed, make cifs_unicode.c include <linux/unaligned.h> explicitly.
Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
Signed-off-by: Steve French <stfrench@microsoft.com>
|
|
Convert smb3_calc_signature() to use the AES-CMAC library instead of a
"cmac(aes)" crypto_shash.
The result is simpler and faster code. With the library there's no need
to allocate memory, no need to handle errors except for key preparation,
and the AES-CMAC code is accessed directly without inefficient indirect
calls and other unnecessary API overhead.
For now a "cmac(aes)" crypto_shash is still being allocated in
'struct cifs_secmech'. Later commits will remove that, simplifying the
code even further.
Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
Signed-off-by: Steve French <stfrench@microsoft.com>
|
|
Set it to number of currently defined algorithms (6 as of now).
Signed-off-by: Enzo Matsumiya <ematsumiya@suse.de>
Signed-off-by: Steve French <stfrench@microsoft.com>
|
|
Document parts of the code, especially the apparently
non-sense parts.
Other:
- change pointer increment constants to sizeof() values
Signed-off-by: Enzo Matsumiya <ematsumiya@suse.de>
Signed-off-by: Steve French <stfrench@microsoft.com>
|
|
This patch implements several micro-optimizations on lz77_compress()
with the goal of reducing the number of instructions per [input]
byte (a.k.a. IPB).
Changes:
- change hashtable to be u32 (instead of u64) -- change the hash
function to reflect that (adds lz77_hash() and lz77_read32() helpers)
- batch-write literals instead of 1 by 1 -- now that we have a well
defined hot path (match finding) and a cold path (encode literals +
match), batch writing makes a significant difference
- implement adaptive skipping of input bytes -- skip input bytes more
aggressively if too few matches are being found
- name some constants for more meaningful context
Signed-off-by: Enzo Matsumiya <ematsumiya@suse.de>
Signed-off-by: Steve French <stfrench@microsoft.com>
|
|
Increase max distance (i.e. window size) from 1k to 8k.
This allows better compression and is just as fast.
Other:
- drop LZ77_MATCH_MIN_DIST as it's nused -- main loop
already checks if dist > 0
Signed-off-by: Enzo Matsumiya <ematsumiya@suse.de>
Signed-off-by: Steve French <stfrench@microsoft.com>
|
|
- lz77_match_len() increments @cur before checking for equality,
leading to off-by-one match len in some cases.
Fix by moving pointers increment to inside the loop.
Also rename @wnd arg to @match (more accurate name).
- both lz77_match_len() and lz77_compress() checked for
"buf + step < end" when the correct is "<=" for such cases.
Signed-off-by: Enzo Matsumiya <ematsumiya@suse.de>
Signed-off-by: Steve French <stfrench@microsoft.com>
|
|
@dst buffer is allocated with same size as @src, which, for good
compression cases, works fine.
However, when compression goes bad (e.g. random bytes payloads), the
compressed size can increase significantly, and even by stopping the
main loop at 7/8 of @slen, writing leftover literals could write past
the end of @dst because of LZ77 metadata.
To fix this, add lz77_compressed_alloc_size() helper to compute the
correct allocation size for @dst, accounting for metadata and worst
cast scenario (all literals).
While this is overprovisioning memory, it's not only correct, but also
allows lz77_compress() main loop to run without ever checking @dst
limits (i.e. a perf improvement).
Signed-off-by: Enzo Matsumiya <ematsumiya@suse.de>
Signed-off-by: Steve French <stfrench@microsoft.com>
|
|
After validate_dacl() was factored out in commit 149822e5541c, the
local end_of_dacl in parse_dacl() is only read by the dump_ace()
call under #ifdef CONFIG_CIFS_DEBUG2. With CIFS_DEBUG2 off the
variable is assigned but never used, which gcc -W=1 flags as
-Wunused-but-set-variable.
Remove the local and compute the end-of-dacl pointer inline at the
single call site inside the existing CIFS_DEBUG2 guard. No
functional change: when CIFS_DEBUG2 is enabled the argument value
is identical to what the removed local carried; when CIFS_DEBUG2
is disabled the code was already dead.
Reported-by: kernel test robot <lkp@intel.com>
Closes: https://lore.kernel.org/oe-kbuild-all/202604220046.tGkRxVtS-lkp@intel.com/
Fixes: 149822e5541c ("smb: client: validate the whole DACL before rewriting it in cifsacl")
Signed-off-by: Michael Bommarito <michael.bommarito@gmail.com>
Assisted-by: Claude:claude-opus-4-7
Signed-off-by: Steve French <stfrench@microsoft.com>
|
|
Being a module parameter, it's possible to do:
# modprobe cifs drop_dir_cache=1
Which will lead to a crash, because cifs_tcp_ses_list hasn't been
initialized yet:
[ 168.242624] BUG: kernel NULL pointer dereference, address: 0000000000000010
[ 168.242952] #PF: supervisor read access in kernel mode
[ 168.243175] #PF: error_code(0x0000) - not-present page
[ 168.243394] PGD 0 P4D 0
[ 168.243524] Oops: Oops: 0000 [#1] SMP NOPTI
[ 168.243703] CPU: 2 UID: 0 PID: 1105 Comm: modprobe Not tainted 7.0.0-lku #5 PREEMPT(lazy)
[ 168.244054] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.17.0-2-g4f253b9b-prebuilt.qemu.org 04/01/2014
[ 168.244557] RIP: 0010:cifs_param_set_drop_dir_cache+0x7c/0x100 [cifs]
...
[ 168.248785] Call Trace:
[ 168.248915] <TASK>
[ 168.249023] parse_args+0x285/0x3a0
[ 168.249204] ? __pfx_unknown_module_param_cb+0x10/0x10
[ 168.249448] load_module+0x192b/0x1bb0
[ 168.249637] ? __pfx_unknown_module_param_cb+0x10/0x10
[ 168.249882] ? kernel_read_file+0x27d/0x2b0
[ 168.250088] init_module_from_file+0xce/0xf0
[ 168.250291] idempotent_init_module+0xfb/0x2f0
[ 168.250496] __x64_sys_finit_module+0x5a/0xa0
[ 168.250694] do_syscall_64+0xe0/0x5a0
[ 168.250863] ? exc_page_fault+0x65/0x160
[ 168.251050] entry_SYSCALL_64_after_hwframe+0x77/0x7f
[ 168.251284] RIP: 0033:0x7fcaa12b774d
Instead of fixing this with some kind of "is module initialized"
approach, this patch instead moves that functionality to procfs,
setting a write op for the existing open_dirs entry, where
writing a 0 to it will drop the cached directory entries.
Also make it available only when CONFIG_CIFS_DEBUG=y.
A small change needed now is to not call flush_delayed_work()
on invalidate_all_cached_dirs() when called from procfs (can't sleep in
that context).
So add a @sync arg to invalidate_all_cached_dirs() to control when to
flush the delayed works.
Fixes: dde6667fa3c8 ("smb: client: add drop_dir_cache module parameter to invalidate cached dirents")
Signed-off-by: Enzo Matsumiya <ematsumiya@suse.de>
Signed-off-by: Steve French <stfrench@microsoft.com>
|
|
parse_dacl() treats an ACE SID matching sid_unix_NFS_mode as an NFS
mode SID and reads sid.sub_auth[2] to recover the mode bits.
That assumes the ACE carries three subauthorities, but compare_sids()
only compares min(a, b) subauthorities. A malicious server can return
an ACE with num_subauth = 2 and sub_auth[] = {88, 3}, which still
matches sid_unix_NFS_mode and then drives the sub_auth[2] read four
bytes past the end of the ACE.
Require num_subauth >= 3 before treating the ACE as an NFS mode SID.
This keeps the fix local to the special-SID mode path without changing
compare_sids() semantics for the rest of cifsacl.
Fixes: e2f8fbfb8d09 ("cifs: get mode bits from special sid on stat")
Cc: stable@vger.kernel.org
Assisted-by: Claude:claude-opus-4-6
Signed-off-by: Michael Bommarito <michael.bommarito@gmail.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
|
|
build_sec_desc() and id_mode_to_cifs_acl() derive a DACL pointer from a
server-supplied dacloffset and then use the incoming ACL to rebuild the
chmod/chown security descriptor.
The original fix only checked that the struct smb_acl header fits before
reading dacl_ptr->size or dacl_ptr->num_aces. That avoids the immediate
header-field OOB read, but the rewrite helpers still walk ACEs based on
pdacl->num_aces with no structural validation of the incoming DACL body.
A malicious server can return a truncated DACL that still contains a
header, claims one or more ACEs, and then drive
replace_sids_and_copy_aces() or set_chmod_dacl() past the validated
extent while they compare or copy attacker-controlled ACEs.
Factor the DACL structural checks into validate_dacl(), extend them to
validate each ACE against the DACL bounds, and use the shared validator
before the chmod/chown rebuild paths. parse_dacl() reuses the same
validator so the read-side parser and write-side rewrite paths agree on
what constitutes a well-formed incoming DACL.
Fixes: bc3e9dd9d104 ("cifs: Change SIDs in ACEs while transferring file ownership.")
Cc: stable@vger.kernel.org
Assisted-by: Claude:claude-opus-4-6
Assisted-by: Codex:gpt-5-4
Signed-off-by: Michael Bommarito <michael.bommarito@gmail.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
|