TLS/PKI Testing in Practice: What Scanners Miss

By Nathan Gabriel Wang, Associate Consultant, Artais

Introduction

TLS scanners are useful. They catch expired certificates, weak ciphers, and obvious misconfigurations quickly. On the surface, they save time.

But here's the catch: they also miss important context. TLS scanners assess what the server presents, not how different clients actually behave. The scanners don't know if a missing intermediate breaks your mobile app or if your internal hostnames are covered by SANs they can't see. They flag HSTS preload as missing without knowing whether preload even makes sense for your domain.

The result is often a report full of findings that are technically accurate but practically misleading. This lack of context can lead to teams debating scanner scores instead of focusing on the one thing that is breaking clients in production. A "B" rating might be fine. A "failing" OCSP check might not matter at all. We'll dive into the gaps scanners leave and how to validate what actually matters.

The following examples use placeholder domains and are intended to demonstrate validation approaches, not to audit specific production systems.

Authorization note: Only run these commands against systems you own or have explicit permission to test.

The Misconceptions Scanners Create

Scanners encourage a few assumptions that don't hold up in practice:

  • "The chain is correct because the scanner validated it." Scanners test from one vantage point. Your users connect from different networks, devices, and TLS stacks. A chain that works for the scanner might fail elsewhere.

  • "If it works in Chrome, it works everywhere." Browsers are forgiving. They fetch missing intermediates, cache certificates, and handle edge cases gracefully. Older clients, APIs, and mobile apps don't.

  • "Every finding is a security issue." Scanners report what they detect, not what's exploitable. Missing OCSP stapling, absent HSTS preload, and certificate transparency warnings are often informational, not vulnerabilities.

  • "TLS is binary: secure or insecure." Real-world TLS is about tradeoffs. A configuration that scores poorly might be intentional. A "passing" scan doesn't mean the setup is appropriate for your threat model.

Let's dig into specific areas where this plays out.

Chain Building Quirks

Servers sometimes send an incomplete chain, omitting one or more intermediate certificates. In practice, this often shows up as a certificate that "works in the browser" but fails in mobile apps, older Java runtimes, or embedded clients.

# See what certificates the server actually sends (not what it could fetch)
openssl s_client -connect example.com:443 -servername example.com -showcerts </dev/null 2>/dev/null | \
  grep -c "BEGIN CERTIFICATE"

# Inspect Authority Information Access (AIA) to see if intermediates are referenced but not sent
openssl s_client -connect example.com:443 -servername example.com </dev/null 2>/dev/null | \
  openssl x509 -noout -text | grep -A2 "Authority Information Access"

What scanners get wrong here: A missing intermediate isn't automatically exploitable. Many clients fetch intermediates on their own via AIA. Different TLS stacks handle this differently, and scanners don't account for that.

Some stricter or older clients do not fetch intermediates automatically. For example, validating a chain against a restricted trust store (such as older Java runtimes) can fail even when modern browsers succeed. This typically surfaces as client-specific connection failures rather than a universal security issue.
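
One way to reproduce that behavior is to validate only what the server actually sends, since openssl verify performs no AIA fetching. A minimal sketch, assuming a Linux-style trust store at /etc/ssl/certs/ca-certificates.crt (adjust the path for your platform):

# Save exactly the chain the server presents
openssl s_client -connect example.com:443 -servername example.com -showcerts </dev/null 2>/dev/null | \
  awk '/BEGIN CERTIFICATE/,/END CERTIFICATE/' > served.pem

# Extract the leaf (the first certificate in the file)
openssl x509 -in served.pem -out leaf.pem

# Verify the leaf using only the served intermediates plus local roots.
# A missing intermediate fails here even though a browser would quietly recover.
openssl verify -CAfile /etc/ssl/certs/ca-certificates.crt -untrusted served.pem leaf.pem

If this fails with "unable to get local issuer certificate" while browsers load the site fine, you are looking at exactly the client-specific breakage described above.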

SAN Pitfalls

# Show all Subject Alternative Names (SANs) on the certificate
openssl s_client -connect example.com:443 -servername example.com </dev/null 2>/dev/null | \
  openssl x509 -noout -ext subjectAltName

What scanners get wrong here: Hostname mismatch findings are often noise. Browsers validate against SANs, not the CN field. If the SAN covers the hostname in use, there's no issue. Scanners flag what they see, not what clients actually validate.
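
To test whether a specific name is actually covered, you can ask OpenSSL to run the same hostname matching a client performs. A quick sketch, assuming a reasonably recent OpenSSL; internal.example.com is a placeholder for whatever hostname your clients really use:

# Check whether a given hostname matches the certificate's SANs,
# the way a validating client would
openssl s_client -connect example.com:443 -servername example.com </dev/null 2>/dev/null | \
  openssl x509 -noout -checkhost internal.example.com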

OCSP Stapling: Recommendation vs Reality

# Check whether the server staples an OCSP response
openssl s_client -connect example.com:443 -servername example.com -status </dev/null 2>&1 | \
  grep -A5 -i "OCSP"

What scanners get wrong here: No stapling doesn't mean revocation checking is broken. Most clients soft-fail, meaning they'll connect even if OCSP is unreachable. Reporting this as a vulnerability overstates the actual risk.

An important edge case is the Must-Staple extension. If a certificate includes Must-Staple and the server fails to staple an OCSP response, some clients will hard-fail the connection. This scenario is rare, but critical when present.
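
Must-Staple shows up as the TLS Feature extension on the certificate. A quick check, assuming an OpenSSL build recent enough to render the extension by name:

# Look for the Must-Staple marker (the X509v3 TLS Feature extension).
# If present and the server is not stapling, expect hard failures in some clients.
openssl s_client -connect example.com:443 -servername example.com </dev/null 2>/dev/null | \
  openssl x509 -noout -text | grep -A1 "TLS Feature"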

HSTS vs HSTS Preload

# Check whether HSTS is enabled via HTTP response headers
curl -sIL https://example.com | grep -i strict-transport-security

# Check whether the domain is included in the HSTS preload list
curl -s "https://hstspreload.org/api/v2/status?domain=example.com"

What scanners get wrong here: Not being preloaded isn't a vulnerability: it's a tradeoff. Preload is effectively permanent and applies to all subdomains. Many sites skip it intentionally. The HSTS header alone provides protection after the first visit.

Preload also has strict eligibility requirements: a max-age of at least one year, the includeSubDomains directive, and the preload flag. Domains that miss any requirement are silently excluded, which scanners often report without explaining why preload eligibility was never met.
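
A rough way to check the header itself is to confirm all three directives are present. This is only a sketch: the grep verifies the directives exist, but you still need to confirm by eye that max-age is at least 31536000.

# Confirm includeSubDomains and preload are both set on the HSTS header;
# then check manually that max-age is >= 31536000 before considering submission
curl -sI https://example.com | grep -i strict-transport-security | \
  grep -i "includeSubDomains" | grep -i "preload"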

Validating with OpenSSL (Briefly)

These commands are provided as illustrative validation techniques. They show how to inspect certificate chains, SANs, OCSP behavior, and HSTS configuration using standard tooling. Results will vary depending on the target environment and client behavior. If no TLS service is listening on the target host/port, openssl s_client will not return a certificate and downstream x509 parsing will fail. This is expected.
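
When a command returns nothing at all, confirm the connection itself before debugging certificates. A quick sanity check, assuming the coreutils timeout command is available:

# Confirm something is answering TLS on the port before piping s_client into x509.
# "CONNECTED" means the TCP connection succeeded; "connect:errno" means it did not.
timeout 5 openssl s_client -connect example.com:443 -servername example.com </dev/null 2>&1 | \
  grep -E "CONNECTED|connect:errno"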

Reporting TLS Issues Responsibly

A useful way to triage TLS findings is to sort them into three buckets.

Security: actual exploitability. Can an attacker intercept traffic, impersonate the server, or downgrade the connection? Examples: expired certificates, self-signed certs in production, weak key sizes.

Availability: things that break clients. Missing intermediates might not be exploitable, but they'll cause connection failures for certain clients. That's an availability issue, not necessarily a security one.

Hygiene: best practices that reduce future risk but aren't actively exploitable. Missing HSTS preload, OCSP stapling disabled, certificate transparency issues. Worth noting, but don't call them critical.

Good vs Bad Framing

Bad:

"The server does not implement OCSP stapling. This could allow an attacker to use a revoked certificate without detection. Risk: High."

This overstates the issue. OCSP stapling is a performance optimization and a defense-in-depth measure, not a critical control. Most clients soft-fail on revocation checks anyway.

Better:

"OCSP stapling is not enabled. While this is a recommended hardening measure, most browsers implement soft-fail revocation checking, meaning the practical impact is limited. Consider enabling stapling to reduce latency and improve revocation reliability. Risk: Informational."

Bad:

"The site is not on the HSTS preload list. An attacker could perform a man-in-the-middle attack on the first connection. Risk: Medium."

This ignores context. Preload has tradeoffs, and the HSTS header alone mitigates the risk after first visit.

Better:

"The domain is not in the HSTS preload list. The HSTS header is present with a one-year max-age, which protects returning visitors. Preload would protect first-time visitors but is effectively permanent and applies to all subdomains. Evaluate whether preload is appropriate for this domain's operational requirements. Risk: Low."

The goal is accuracy. Inflated findings damage credibility and make it harder for teams to prioritize what actually matters.

Conclusion

TLS issues are usually about mismatched assumptions: between what the server sends and what clients expect, between what scanners flag and what actually breaks, between best practices and operational reality.

Scanners are a starting point, not the answer. Use them to find obvious problems, then validate manually. Check what the server actually sends. Understand how your clients behave. Frame findings based on real impact, not scanner severity scores.

The goal isn't a perfect grade. It's a configuration that works for your users and matches your risk tolerance.

Next

Informational and Low-Risk Web Findings at Scale: Headers, Cookies, and 'Quick Wins' Done Rigorously