Fully ignore private IP literals as outbound connections (early return) #310

Open: Mishenevd wants to merge 1 commit into main from fix/ignore-private-ips-outbound

Conversation

Mishenevd (Collaborator) commented Jun 26, 2026

Summary

Follow-up to #308. The agent records every getAllByName() argument as an outbound connection, including raw private/internal IP literals. These come from infrastructure rather than real outbound domains, and they flooded the "new outbound connection" feature with private IPs on port 0.

Sources we confirmed:

  • Reactor Netty (WebClient / Netty-backed RestClient) resolver bootstrap on the first client call: 0.0.0.0 and :: from io.netty.channel.epoll.Native.<clinit>, and each /etc/resolv.conf nameserver via UnixResolverDnsServerAddressStreamProvider (private on ECS: VPC resolver 10.x, 169.254.169.253, 127.0.0.53).
  • Service discovery connecting by IP (10.20.x.x), and a startup library resolving RFC1918 base addresses (10.0.0.0, 172.16.0.0, ...).

The fix

DNSRecordCollector.report() returns early when the looked-up host is a private IP literal:

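// Consume the pending port first, so it is removed even on the early return below.
Set<Integer> ports = PendingHostnamesStore.getAndRemove(hostname);

// Private IP literals are infrastructure noise: skip both recording and outbound blocking.
if (IsPrivateIP.isPrivateIp(hostname)) {
    return;
}
// ... record hostname, outbound blocking, SSRF (unchanged) ...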

#308 only skipped the HostnamesStore record but still fell through to the outbound-domain blocking check. In lockdown mode (blockNewOutgoingRequests) that would block these internal resolutions and break the app. The early return skips both the record and the block, consistent with the other Zen agents.
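
The PR does not show how IsPrivateIP.isPrivateIp() decides that a string is a private IP literal. As a rough illustration only, the sketch below is a hypothetical stand-in (the class and method names are assumptions, not the agent's code). It parses the string itself instead of resolving it, so a hostname never triggers a DNS lookup, and it only covers the IPv4 ranges and the two IPv6 literals relevant to the sources listed above.

```java
/** Hypothetical stand-in for IsPrivateIP.isPrivateIp(); not the agent's real implementation. */
final class PrivateIpLiteralSketch {

    /** True only for literal addresses in private/internal ranges; hostnames always return false. */
    static boolean isPrivateIpLiteral(String host) {
        if (host == null) {
            return false;
        }
        // Assumption: only the IPv6 any-address and loopback literals are handled here.
        if (host.equals("::") || host.equals("::1")) {
            return true;
        }
        int[] o = parseIpv4(host);
        if (o == null) {
            return false; // a hostname or an unsupported literal; nothing is resolved
        }
        return o[0] == 0                                      // 0.0.0.0/8, includes the any-address
                || o[0] == 10                                 // 10.0.0.0/8 (RFC 1918)
                || o[0] == 127                                // 127.0.0.0/8 loopback
                || (o[0] == 172 && o[1] >= 16 && o[1] <= 31)  // 172.16.0.0/12 (RFC 1918)
                || (o[0] == 192 && o[1] == 168)               // 192.168.0.0/16 (RFC 1918)
                || (o[0] == 169 && o[1] == 254);              // 169.254.0.0/16 link-local
    }

    /** Strict dotted-quad parser; returns null unless the input is four decimal octets. */
    static int[] parseIpv4(String host) {
        String[] parts = host.split("\\.", -1);
        if (parts.length != 4) {
            return null;
        }
        int[] octets = new int[4];
        for (int i = 0; i < 4; i++) {
            String part = parts[i];
            if (part.isEmpty() || part.length() > 3) {
                return null;
            }
            for (int j = 0; j < part.length(); j++) {
                char c = part.charAt(j);
                if (c < '0' || c > '9') {
                    return null;
                }
            }
            int value = Integer.parseInt(part);
            if (value > 255) {
                return null;
            }
            octets[i] = value;
        }
        return octets;
    }
}
```

With this sketch, 10.20.11.143, 169.254.169.253 and 0.0.0.0 count as private literals, while keycloak.internal does not, because it is a name rather than a literal.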

Behaviour

| Scenario | Result |
| --- | --- |
| Resolve or connect to a private IP literal (getAllByName("10.20.11.143"), or Netty bootstrap resolving 0.0.0.0 / nameservers) | Fully ignored. Not recorded as an outbound connection, and not run through outbound blocking, so lockdown mode does not block it. |
| Private IP reached via a URL (http://10.0.0.1:8080) | URLCollector registers the pending port, then getAllByName("10.0.0.1") returns early. Nothing recorded, not blocked in lockdown, and the pending port is still consumed. |
| Outbound to a real domain, including internal names that resolve to a private IP (keycloak.internal...) | Unchanged. Hostname recorded, lockdown still applies, SSRF / stored-SSRF still run on the resolved IPs. |
| Public IP literal | Unchanged. Still recorded. |

SSRF is unaffected: it never fires on an IP literal (hostname == ip is treated as "no resolution, safe"), and real domains do not hit the early return.

Tests

  • testPrivateIpLiteralNotRecordedAsOutboundHostname, testPrivateIpLiteralWithPendingPortStillConsumedButNotRecorded (from #308, "Don't record private IP literals as outbound hostnames (Zen alert flood)") still pass.
  • testPrivateIpLiteralNotBlockedInLockdownMode — a private IP literal is not blocked in lockdown mode.
  • testPrivateIpLiteralViaUrlInLockdownNotBlockedNorRecorded — private IP via URL: not recorded, not blocked, pending port consumed.
  • testHostnameResolvingToPrivateIpStillRecorded, testPublicIpLiteralStillRecorded confirm domains and public IPs are unaffected.

🤖 Generated with Claude Code

Summary by Aikido

Security Issues: 0 Quality Issues: 0 Resolved Issues: 0

🐛 Bugfixes

  • Ignored private IP literals early to prevent recording and blocking

More info

Follow-up to #308. The agent records every getAllByName() argument as an
outbound connection, including raw private/internal IP literals. These come from
infrastructure rather than real outbound domains: the Reactor Netty resolver
bootstrap resolving the any-address/nameservers, service discovery connecting by
IP, a library building a private-IP matcher at startup, etc. They flooded the
"new outbound connection" feature with private IPs on port 0.

#308 skipped recording them but still fell through to the outbound-domain
blocking check, so in lockdown mode (blockNewOutgoingRequests) these internal
resolutions would be blocked and break the app. This returns early for private
IP literals, skipping both the record and the block, consistent with the other
Zen agents.

Real domains that resolve to private IPs are not literals, so they fall through
and are still tracked, blocked by lockdown, and SSRF-checked. SSRF is unaffected:
it never fires on a literal (hostname == ip is treated as safe).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Mishenevd pushed a commit that referenced this pull request Jul 1, 2026
…gression

Follow-up to the reverted #308/#310. Customer flood was InetAddress.getAllByName()
picking up Reactor Netty's own DNS-resolver bootstrap noise (0.0.0.0, ::,
/etc/resolv.conf nameservers) as "new outbound connections" on port 0, and
blocking them in lockdown mode. #310 fixed the flood with an early return in
DNSRecordCollector.report() that also skipped the SSRF check below it -
verified with a regression test that this let an attacker-supplied private-IP
literal (e.g. a webhook field pointing straight at 169.254.169.254) through
undetected.
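
The bypass above is an ordering problem: the early return sits above the SSRF check. The toy model below contrasts that ordering with the narrowed gate described in this commit (skip recording and blocking only when no port is pending, and always run SSRF). Everything in it is hypothetical: the class, the method names, the boolean flag standing in for the private-literal test, and the lists standing in for the real recording and SSRF logic.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Toy model of the two gate orderings; not the agent's DNSRecordCollector. */
final class SsrfGateSketch {
    // Ports registered by HTTP client wrappers before the lookup happens.
    final Map<String, Set<Integer>> pendingPorts = new HashMap<>();
    // Stand-ins for "recorded as outbound hostname" and "SSRF check ran".
    final List<String> recorded = new ArrayList<>();
    final List<String> ssrfChecked = new ArrayList<>();

    void registerPending(String host, int port) {
        pendingPorts.computeIfAbsent(host, h -> new HashSet<>()).add(port);
    }

    /** #310 ordering: the early return also skips the SSRF check below it. */
    void reportWithEarlyReturn(String host, boolean privateLiteral) {
        pendingPorts.remove(host);
        if (privateLiteral) {
            return;
        }
        recorded.add(host);
        ssrfChecked.add(host);
    }

    /** Narrowed gate: only port-less private literals skip recording; SSRF always runs. */
    void reportWithNarrowedGate(String host, boolean privateLiteral) {
        Set<Integer> ports = pendingPorts.remove(host);
        boolean infraNoise = privateLiteral && (ports == null || ports.isEmpty());
        if (!infraNoise) {
            recorded.add(host); // stands in for recording plus outbound blocking
        }
        ssrfChecked.add(host); // unconditional
    }
}
```

In this model a request to 169.254.169.254 with a pending port is never SSRF-checked under the first ordering, but is both recorded and checked under the second, while a port-less 0.0.0.0 lookup stays unrecorded.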

Further investigation showed that the actual root cause is bigger: Spring's
WebClient was never instrumented at all, and Reactor Netty's default HTTP
client bypasses InetAddress.getAllByName() entirely (it uses its own async
DNS resolver). So even after wrapping WebClient to register pending ports,
DNSRecordCollector was never invoked for real WebClient targets - confirmed
empirically via trace logs against a live running app, with distinct markers
proving InetAddressWrapper never fires for WebClient/Reactor Netty traffic in
this configuration. WebClient had zero outbound-domain visibility and zero
SSRF protection, independent of the original bug.
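
When the resolver hook never fires, the address a client finally connects to is the remaining common point. As a minimal, JDK-only sketch of what a connect-level hook could read from that address, consider the helper below. ConnectTargetSketch and describe() are hypothetical names for illustration and are not part of the agent.

```java
import java.net.InetSocketAddress;
import java.net.SocketAddress;

/** Hypothetical helper: extracts "host:port" from the address passed to a connect call. */
final class ConnectTargetSketch {

    /** Returns "host:port" for an InetSocketAddress, or null for other address types. */
    static String describe(SocketAddress remote) {
        if (!(remote instanceof InetSocketAddress)) {
            return null; // for example a Unix domain socket address
        }
        InetSocketAddress inet = (InetSocketAddress) remote;
        // getHostString() returns the hostname if one is known, otherwise the literal IP,
        // and never performs a reverse lookup.
        return inet.getHostString() + ":" + inet.getPort();
    }
}
```

An address built from a literal such as new InetSocketAddress("10.0.0.1", 8080) is already resolved and yields the literal itself, which is why a hook at this level can also see targets that never pass through any resolver.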

- DNSRecordCollector: narrow the private-IP-literal gate to only skip
  recording + outbound blocking when there's no pending port (genuine infra
  noise). SSRF checks are unconditional again, fixing the bypass above.
- SpringWebClientWrapper: register pending host+port for every WebClient
  request by hooking ExchangeFunction.exchange(), the interface every
  WebClient call goes through, same pattern as the existing OkHttp/Apache/JDK
  HttpClient wrappers. Uses string-based ByteBuddy matchers
  (hasSuperType(named(...))) instead of .class literals, since spring-webflux
  is compileOnly and only present on the target app's classloader - a .class
  reference in the matcher crashes the agent at premain with
  NoClassDefFoundError.
- SocketChannelWrapper: hook java.nio.channels.SocketChannel.connect(), the
  JDK-level call every NIO-based client (including Reactor Netty) makes once
  it has a resolved address, regardless of which DNS resolver produced it.
  This is what actually closes the gap for WebClient, and it also catches
  literal IP targets that never go through any resolver at all. Not
  Netty-specific instrumentation - it's a generic JDK hook with no references
  to io.netty.* types.
- DNSRecordCollector.reportConnect(): entry point for the new hook. Peeks the
  pending port instead of consuming it (report()'s getAndRemove), because a
  single request can trigger multiple connect() calls to the same hostname
  (e.g. the IPv4 then IPv6 address of a dual-stack host like localhost).
  Consuming on the first attempt let a blocked SSRF target succeed on the
  second attempt via the other address family - found live, fixed, covered by
  a regression test.
- PendingHostnamesStore: peeking instead of consuming means entries rely on
  WebRequestCollector's per-incoming-request clear() for cleanup, which never
  fires for WebClient calls made outside any incoming-request context (e.g. a
  @Scheduled background task). Capped the store at 1000 entries per thread,
  evicting the least-recently-used one once exceeded - the same bounded-LRU
  pattern (LinkedHashMap with accessOrder=true + removeEldestEntry())
  already used by Hostnames.java for the same class of problem. Deliberately
  not a time-based TTL, to avoid a timing-dependent race reopening the
  dual-stack gap under load.
- RequestController (SpringWebfluxSampleApp): new /api/request endpoint used
  to validate all of the above against a real running app end to end.
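
The bounded-LRU pattern named in the PendingHostnamesStore bullet (LinkedHashMap with accessOrder=true plus removeEldestEntry()) can be sketched as follows. This is a generic illustration of the JDK pattern, not the agent's PendingHostnamesStore or Hostnames.java; the class name and the tiny capacity used in the usage note are assumptions.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Generic bounded LRU map; evicts the least-recently-used entry once the cap is exceeded. */
final class BoundedLruSketch<K, V> extends LinkedHashMap<K, V> {
    private static final long serialVersionUID = 1L;
    private final int maxEntries;

    BoundedLruSketch(int maxEntries) {
        // Third argument true = access order: get() and put() move an entry to the "newest" end.
        super(16, 0.75f, true);
        this.maxEntries = maxEntries;
    }

    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        // Called by LinkedHashMap after each insertion; returning true drops the eldest entry.
        return size() > maxEntries;
    }
}
```

With a cap of 2, inserting a.example and b.example, reading a.example, then inserting c.example evicts b.example, since it was touched least recently. A per-thread cap like the one described could wrap such a map in a ThreadLocal, though how the agent actually scopes it is not shown here.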

Known limitation, not fixed here: Spring WebFlux has no request-body taint
tracking at all (SpringWebfluxContextObject never populates
ContextObject.body), so SSRF via JSON body can't be detected for WebFlux apps
regardless of this change - flagged separately, doesn't regress anything.
Mishenevd pushed a commit that referenced this pull request Jul 1, 2026
…gression

Follow-up to the reverted #308/#310. Customer flood was InetAddress.getAllByName()
picking up Reactor Netty's own DNS-resolver bootstrap noise (0.0.0.0, ::,
/etc/resolv.conf nameservers) as "new outbound connections" on port 0, and
blocking them in lockdown mode. #310 fixed the flood with an early return in
DNSRecordCollector.report() that also skipped the SSRF check below it -
verified with a regression test that this let an attacker-supplied private-IP
literal (e.g. a webhook field pointing straight at 169.254.169.254) through
undetected.

Further investigation showed that the actual root cause is bigger: Spring's
WebClient was never instrumented at all, and Reactor Netty's default HTTP
client bypasses InetAddress.getAllByName() entirely (it uses its own async
DNS resolver). So even after wrapping WebClient to register pending ports,
DNSRecordCollector was never invoked for real WebClient targets - confirmed
empirically via trace logs against a live running app, with distinct markers
proving InetAddressWrapper never fires for WebClient/Reactor Netty traffic in
this configuration. WebClient had zero outbound-domain visibility and zero
SSRF protection, independent of the original bug.

- DNSRecordCollector: narrow the private-IP-literal gate to only skip
  recording + outbound blocking when there's no pending port (genuine infra
  noise). SSRF checks are unconditional again, fixing the bypass above.
- SpringWebClientWrapper: register pending host+port for every WebClient
  request by hooking ExchangeFunction.exchange(), the interface every
  WebClient call goes through, same pattern as the existing OkHttp/Apache/JDK
  HttpClient wrappers. Uses string-based ByteBuddy matchers
  (hasSuperType(named(...))) instead of .class literals, since spring-webflux
  is compileOnly and only present on the target app's classloader - a .class
  reference in the matcher crashes the agent at premain with
  NoClassDefFoundError.
- SocketChannelWrapper: hook java.nio.channels.SocketChannel.connect(), the
  JDK-level call every NIO-based client (including Reactor Netty) makes once
  it has a resolved address, regardless of which DNS resolver produced it.
  This is what actually closes the gap for WebClient, and it also catches
  literal IP targets that never go through any resolver at all. Not
  Netty-specific instrumentation - it's a generic JDK hook with no references
  to io.netty.* types.
- DNSRecordCollector.reportConnect(): entry point for the new hook. Peeks the
  pending port instead of consuming it (report()'s getAndRemove), because a
  single request can trigger multiple connect() calls to the same hostname
  (e.g. the IPv4 then IPv6 address of a dual-stack host like localhost).
  Consuming on the first attempt let a blocked SSRF target succeed on the
  second attempt via the other address family - found live, fixed, covered by
  a regression test.
- PendingHostnamesStore: peeking instead of consuming means entries rely on
  WebRequestCollector's per-incoming-request clear() for cleanup, which never
  fires for WebClient calls made outside any incoming-request context (e.g. a
  @Scheduled background task). Capped the store at 1000 entries per thread,
  evicting the least-recently-used one once exceeded - the same bounded-LRU
  pattern (LinkedHashMap with accessOrder=true + removeEldestEntry())
  already used by Hostnames.java for the same class of problem. Deliberately
  not a time-based TTL, to avoid a timing-dependent race reopening the
  dual-stack gap under load.
- SpringWebClientRedirectWrapper: WebClient calls with followRedirect(true)
  never re-invoke Spring's request-adaptation layer for redirect hops (Reactor
  Netty resends bodiless requests internally), so a redirect to a private IP
  was invisible to both tracking and SSRF - same failure mode as the DNS gap
  above, just one layer up. Hooks HttpClientConnect$HttpClientHandler.redirect()
  (package-private, mirroring the same tradeoff HttpConnectionRedirectWrapper
  already makes for the JDK's equally-internal followRedirect0) and feeds the
  chain into the existing RedirectCollector/PrivateIPRedirectFinder mechanism,
  the same one already used for JDK HttpURLConnection redirects.
- RequestController (SpringWebfluxSampleApp): /api/request endpoint (plus a
  followRedirect(true) variant) used to validate all of the above against a
  real running app end to end, and now wired into end2end/spring_webflux_postgres.py
  as an automated "ssrf" e2e payload.
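
The peek-versus-consume choice described for reportConnect() can be illustrated with a small store. This is a sketch under assumptions: PendingPortsSketch and its method names are hypothetical, and only the difference between a consuming and a non-consuming read is modelled, not the agent's threading or cleanup.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

/** Hypothetical pending-port store showing a consuming read next to a non-consuming one. */
final class PendingPortsSketch {
    private final Map<String, Set<Integer>> pending = new HashMap<>();

    void add(String host, int port) {
        pending.computeIfAbsent(host, h -> new HashSet<>()).add(port);
    }

    /** Consuming read: a second call for the same host finds nothing. */
    Set<Integer> getAndRemove(String host) {
        Set<Integer> ports = pending.remove(host);
        return ports == null ? Collections.<Integer>emptySet() : ports;
    }

    /** Non-consuming read: every connect attempt for the host sees the same ports. */
    Set<Integer> peek(String host) {
        Set<Integer> ports = pending.get(host);
        return ports == null
                ? Collections.<Integer>emptySet()
                : Collections.unmodifiableSet(new HashSet<>(ports));
    }
}
```

For a dual-stack host that triggers two connect attempts, the second getAndRemove() call returns an empty set, so the second attempt no longer looks like a tracked request, whereas peek() returns the same ports both times. The cost is that peeked entries need some other cleanup, which is what the entry cap addresses.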

Known limitation, not fixed here: Spring WebFlux has no request-body taint
tracking at all (SpringWebfluxContextObject never populates
ContextObject.body), so SSRF via JSON body can't be detected for WebFlux apps
regardless of this change - flagged separately, doesn't regress anything.