Zion Boggan

In-depth vulnerability research, detection engineering & applied cryptography.


How I Found Two SSRF Vulnerabilities in a Major Cloud Platform's Image Pipeline

Author: HackerOne artemispwns1 / Bugcrowd Researcher
Disclosure: under the program’s responsible disclosure policy; vendor and specific technical details intentionally omitted per policy.

Introduction

I have been doing independent security research for a few months now, working through bug bounty programs on HackerOne and Bugcrowd. What started as curiosity about how web applications handle trust boundaries has turned into a genuine discipline, and the findings keep getting more interesting as I learn to think at the architectural level rather than just scanning for low-hanging fruit.

Recently I completed an audit of a major cloud-based media processing platform through their bug bounty program. The engagement produced two distinct Server-Side Request Forgery (SSRF) vulnerabilities in separate code paths, both stemming from the same root cause: inconsistent URL validation across different parts of the application. I cannot name the vendor or disclose specific technical details per the program’s disclosure policy, but I can share the methodology, the thought process, and what I learned about how these kinds of gaps form in production systems.

This writeup is for anyone getting into bug bounty research or application security who wants to understand how to move beyond surface-level recon and start thinking about systemic vulnerabilities.

Background: What is SSRF and Why Does It Matter?

Server-Side Request Forgery is a vulnerability class where an attacker can make a server issue HTTP requests on their behalf to destinations the attacker should not be able to reach. The classic example is tricking a server into calling the AWS instance metadata endpoint at 169.254.169.254, which can return temporary IAM credentials and expose whatever those credentials are allowed to access.

SSRF earned its own category (A10) in the OWASP Top 10 2021 and sits in every serious attacker’s playbook because cloud infrastructure made internal network access far more valuable. A single SSRF in the right place can pivot from “I can make a server fetch a URL” to “I now have AWS keys to the production S3 buckets.” The impact scales with the trust the server has on the internal network.

The Methodology: Thinking in Code Paths

When I started this audit, my first instinct was to test the obvious entry points. The platform had a URL-based fetch feature where you supply an external URL and the server retrieves the resource. Standard stuff. I tested internal IPs against this endpoint and got exactly what I expected: a clean 403 Forbidden. They had SSRF protections in place.

A less experienced version of me would have stopped there. Good filter, move on.

But I had been reading about how large platforms evolve over time. Features get built by different teams, at different times, with different security assumptions. The upload API team probably thought carefully about SSRF. The question is whether every other team that added URL-accepting functionality did the same.

So I mapped every parameter and feature in the platform that could potentially cause the server to make an outbound HTTP request. Not just the obvious ones like “fetch this URL” but also webhook callbacks, notification endpoints, asynchronous processing triggers, and file format features that might resolve external references during server-side rendering.
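A rough way to start that inventory is to flag every request parameter whose name or value suggests a server-side fetch. The sketch below is illustrative only: the hint list and the function names are my own assumptions, not anything taken from the platform, and a real inventory also needs manual review of file formats and async features.

```python
from urllib.parse import urlparse

# Hypothetical name fragments that often indicate a server-side request.
URL_NAME_HINTS = ("url", "uri", "callback", "webhook", "notify", "redirect", "src", "href")


def looks_like_url(value: str) -> bool:
    """Return True when the value parses as an absolute http(s) URL."""
    parsed = urlparse(value)
    return parsed.scheme in ("http", "https") and bool(parsed.netloc)


def candidate_request_params(params: dict[str, str]) -> list[str]:
    """List parameter names worth testing for outbound server requests.

    A parameter qualifies if its name contains a hint fragment or its
    current value is an absolute http(s) URL.
    """
    hits = []
    for name, value in params.items():
        name_hint = any(hint in name.lower() for hint in URL_NAME_HINTS)
        if name_hint or looks_like_url(value):
            hits.append(name)
    return sorted(hits)
```

Running this over captured API requests gives a first-pass list; parameters that only trigger requests indirectly (for example through an uploaded file) will not show up and still have to be found by reading the documentation.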

This is where the shift from scanning to auditing happens. You stop testing individual inputs and start testing trust boundaries across the entire application surface.

Finding 1: Blind SSRF via Webhook Callbacks

The platform had a notification system where you could specify a callback URL that receives a POST request after certain operations complete. Think of it like a webhook: “when the job is done, POST the results to this URL.”

The interesting thing was that the main upload parameter had solid SSRF filtering. Internal IPs were blocked, RFC1918 ranges were rejected, the metadata endpoint was explicitly denied. But the notification URL parameter, which also causes the server to make an outbound HTTP request, had zero filtering.

Same server. Same outbound HTTP client (presumably). Completely different security posture depending on which parameter you used.

Here is a generic representation of what that comparison looked like:

# The upload/fetch parameter correctly blocks internal IPs:
curl -X POST "https://api.example.com/v1/upload" \
 -d "file=http://169.254.169.254/latest/meta-data/" \
 -d "api_key=${API_KEY}" \
 -d "timestamp=${TIMESTAMP}" \
 -d "signature=${SIGNATURE}"
# Response: 403 Forbidden - blocked by SSRF filter

# The notification/callback parameter accepts the EXACT same internal address:
curl -X POST "https://api.example.com/v1/upload" \
 -d "file=https://example.com/normal-image.png" \
 -d "callback_url=http://169.254.169.254/latest/meta-data/" \
 -d "api_key=${API_KEY}" \
 -d "timestamp=${TIMESTAMP}" \
 -d "signature=${SIGNATURE}"
# Response: 200 OK - upload succeeds, server POSTs to metadata endpoint

That contrast is the entire finding in two commands. Same API, same server, same internal target, two completely different outcomes depending on which parameter carries the URL.

I tested this across the full range of internal targets to confirm it was not limited to a single address:

| Target | Description | Result |
| --- | --- | --- |
| http://169.254.169.254/latest/meta-data/ | AWS metadata | Accepted |
| http://169.254.170.2/v2/credentials | ECS credentials | Accepted |
| http://127.0.0.1:6379/ | Localhost Redis | Accepted |
| http://127.0.0.1:9200/ | Localhost Elasticsearch | Accepted |
| http://10.0.0.1:80/ | RFC 1918 (10.0.0.0/8) | Accepted |
| http://172.16.0.1:80/ | RFC 1918 (172.16.0.0/12) | Accepted |

Every single internal address that the upload parameter correctly blocked was silently accepted through the notification parameter.
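That kind of comparison is easy to make mechanical once the status codes are recorded per parameter. The helper below is a small sketch under my own assumptions: the `results` layout, the function name, and the use of 403 as the "blocked" signal are illustrative, and a non-403 status only shows the API accepted the value, not that the server completed a fetch.

```python
def find_filter_gaps(
    results: dict[str, dict[str, int]], blocked_status: int = 403
) -> dict[str, list[str]]:
    """Find targets that one parameter blocks while another accepts.

    `results` maps parameter name -> {target URL -> observed HTTP status}.
    The return value maps each inconsistent target to the sorted list of
    parameters that did not return `blocked_status` for it.
    """
    targets = {t for per_param in results.values() for t in per_param}
    gaps = {}
    for target in sorted(targets):
        blocked = [p for p, r in results.items() if r.get(target) == blocked_status]
        accepted = [
            p for p, r in results.items()
            if target in r and r[target] != blocked_status
        ]
        # Only report a gap when the same target is treated differently.
        if blocked and accepted:
            gaps[target] = sorted(accepted)
    return gaps
```

Feeding it the observations for the upload and notification parameters yields one entry per internal target, which maps directly onto an evidence table in the report.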

The severity escalation came from discovering that this gap existed not just in authenticated API calls but also in a feature that allowed preconfigured notification URLs to be triggered without authentication. This meant that once the configuration was set, anyone who knew the right identifiers could trigger the SSRF repeatedly with no API key, no signature, and no rate limiting.

The impact profile: blind SSRF to AWS metadata endpoints, localhost services on arbitrary ports, and RFC1918 internal networks. The POST body contained JSON with several attacker-controllable fields, meaning internal services that parse incoming webhooks could receive partially crafted payloads.

I rated this as High severity. The platform’s existing SSRF protections on the upload parameter proved they understood the risk. The gap in the notification parameter was a systemic oversight, not a design choice.

Finding 2: SSRF via Image Format Processing

The second finding came from a completely different angle. The platform supports server-side image transformation: converting between formats, resizing, and applying effects. One of the supported input formats allows embedded references to external URLs as part of the file specification.

When the server processes this file format and performs a transformation (for example, converting it to PNG), the underlying image processing library resolves those external references by making its own HTTP requests. These requests happen inside the rendering engine, completely outside the URL validation layer that protects the upload and fetch endpoints.

The crafted input file looked something like this (generalized):

<?xml version="1.0" encoding="UTF-8"?>
<svg xmlns="http://www.w3.org/2000/svg"
 xmlns:xlink="http://www.w3.org/1999/xlink"
 width="500" height="500">
 <image xlink:href="https://attacker-controlled-url.com/test"
 width="500" height="500"/>
</svg>

When the server converts this to a raster format, the image processing library follows that external reference and fetches whatever URL is specified. The fetched content gets rendered into the output image as pixel data.
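From the defender's side, one way to catch files like this before they reach the renderer is to scan uploads for remote references. The following is a minimal sketch, not the platform's code: the function name is hypothetical, it only inspects `href` and `xlink:href` attributes, and it ignores other reference mechanisms such as CSS `url()` values. For untrusted input a hardened XML parser would be a better choice than the standard library one used here.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlparse

# Fully qualified attribute name for xlink:href as ElementTree reports it.
XLINK_HREF = "{http://www.w3.org/1999/xlink}href"
REMOTE_SCHEMES = ("http", "https", "ftp")


def external_references(svg_text: str) -> list[str]:
    """Return href values in an SVG document that point at remote URLs."""
    root = ET.fromstring(svg_text)
    found = []
    for element in root.iter():
        for attr in (XLINK_HREF, "href"):
            value = element.get(attr)
            # data: URIs and same-document fragments have other schemes
            # (or none), so they are not reported.
            if value and urlparse(value).scheme in REMOTE_SCHEMES:
                found.append(value)
    return found
```

A non-empty result is a reason to reject the file or strip the reference before conversion; an empty result does not prove the file is safe.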

I tested several targets to map the behavior:

| Target | Result |
| --- | --- |
| External URL returning an image | Full content rendered into output (101KB PNG) |
| External URL returning text / JSON | Broken image icon rendered (139KB PNG) |
| AWS metadata endpoint (internal) | Blank output, 12+ second response time |
| Baseline with no external ref | Blank output, ~300 bytes, sub-second |

Two illustrative outputs from the engagement (vendor identifiers cropped / omitted; the platform’s image processor faithfully renders fetched content into its output pixel data):

[Figure: output for an external URL returning text (SSRF text render proof)]

[Figure: output for an external URL returning an image (SSRF external image fetch proof)]

The timing differential was the key evidence for internal network access. Normal transformations completed in under one second. Transformations referencing the metadata endpoint consistently took 12+ seconds before returning, which is consistent with the server attempting the connection and waiting until the request timed out.

I confirmed this by uploading a specially crafted file with an external URL reference, then triggering a format conversion through the delivery API. The output image contained rendered content from the external URL, proving the server had fetched it. Repeating the test with the AWS metadata endpoint reproduced the 12+ second delay (versus less than one second for normal transforms), which I treated as timing evidence of a connection attempt.
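To keep the timing comparison honest, it helps to compare medians over several runs rather than a single request. This is a small sketch of that idea; the function name and the default factor of 5 are my own assumptions and should be tuned against the noise of the target.

```python
from statistics import median


def classify_timing(
    baseline_s: list[float], probe_s: list[float], factor: float = 5.0
) -> str:
    """Label a probe 'slow' when its median latency exceeds the baseline
    median by `factor`, otherwise 'baseline-like'.

    Both inputs are response times in seconds from repeated requests.
    """
    if not baseline_s or not probe_s:
        raise ValueError("need at least one sample on each side")
    base = median(baseline_s)
    probe = median(probe_s)
    return "slow" if probe > base * factor else "baseline-like"
```

With sub-second baselines and probes around 12 seconds the result is "slow"; a slow result is evidence of a connection attempt, not proof, so it belongs in the report alongside the raw timings.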

The key insight: this SSRF vector exists in a completely different code path from the first finding. The URL filters on the upload and fetch endpoints are irrelevant here, and adding a filter to the notification parameter would not help either, because the image processing library makes its own HTTP requests that none of those filters touch.

The impact was more constrained than the first finding. Responses are rendered as pixel data rather than returned as raw text, so text-based secrets (like AWS credentials in JSON format) are not directly readable. But image-format responses from internal services (monitoring dashboards, status pages, cached graphics) could be exfiltrated as rendered output. And the timing differential still enables internal network mapping and port scanning.

I rated this as Medium severity. Confirmed external URL resolution with content rendered into output, confirmed internal IP connection attempts via timing analysis, but limited by the pixel-rendering constraint on data extraction.

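For anyone defending against this class of vulnerability, the fix for the image processing path is well documented and usually straightforward. Many image processing libraries support a policy configuration that disables external URL resolution entirely. The snippet below uses ImageMagick-style policy.xml syntax as one example; other libraries expose equivalent settings under different names: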

<!-- Restrict the image processing library from resolving external URLs -->
<policy domain="coder" rights="none" pattern="URL" />
<policy domain="coder" rights="none" pattern="HTTPS" />
<policy domain="coder" rights="none" pattern="HTTP" />

For the webhook/callback path, the remediation is to apply the same IP/URL validation that already exists on the upload path to every other parameter that triggers outbound requests. Block RFC 1918, loopback, and link-local ranges, resolve the hostname and validate the resolved addresses before making the callback, and connect to the address that was validated so a second DNS lookup cannot be rebound to an internal IP.
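As a rough illustration of that validation, here is a minimal sketch using only the Python standard library. It is not the vendor's filter: the function names are hypothetical, the resolver is injectable so the logic can be tested offline, and a production version would also need to re-validate every redirect hop.

```python
import ipaddress
import socket
from urllib.parse import urlparse


def is_public_ip(ip_text: str) -> bool:
    """True only for globally routable addresses.

    Private (RFC 1918), loopback, and link-local addresses all fail.
    """
    ip = ipaddress.ip_address(ip_text)
    # Unwrap IPv4-mapped IPv6 (::ffff:a.b.c.d) so it is judged as IPv4.
    if ip.version == 6 and ip.ipv4_mapped is not None:
        ip = ip.ipv4_mapped
    return ip.is_global


def resolve_host(host: str) -> list[str]:
    """Resolve a hostname to its IP addresses with the system resolver."""
    infos = socket.getaddrinfo(host, None, proto=socket.IPPROTO_TCP)
    return sorted({info[4][0] for info in infos})


def validate_callback_url(url: str, resolver=resolve_host) -> list[str]:
    """Return the vetted IPs for a callback URL, or raise ValueError.

    The caller should connect to one of the returned IPs directly instead
    of resolving the hostname again; that pinning is what closes the DNS
    rebinding window.
    """
    parsed = urlparse(url)
    if parsed.scheme not in ("http", "https") or not parsed.hostname:
        raise ValueError("only absolute http(s) URLs are allowed")
    addresses = resolver(parsed.hostname)
    if not addresses:
        raise ValueError("hostname did not resolve")
    for address in addresses:
        # Reject if ANY resolved address is internal, not just the first.
        if not is_public_ip(address):
            raise ValueError(f"blocked destination: {address}")
    return addresses
```

Rejecting the URL when any resolved address is internal is a deliberately strict choice; it avoids relying on which record the HTTP client happens to pick.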

These are not novel recommendations. The fact that they need to be applied separately to each code path is exactly why these gaps exist in the first place.

The Systemic Pattern

What makes these two findings interesting together is the pattern they reveal. The platform had invested in SSRF protections, and those protections worked correctly where they were applied. The problem was coverage.

There were at least four kinds of code path that can cause the server to make outbound HTTP requests, three of which I tested:

  1. The file upload/fetch path (protected)
  2. The notification/webhook callback path (unprotected)
  3. The image processing/rendering path (unprotected)
  4. Other async processing paths (untested, possibly similar)

Each of these was probably built by a different team or at a different time. The team that built the upload path thought about SSRF and implemented filtering. The teams that built the notification system and the image processing pipeline either did not think about SSRF or assumed it was handled elsewhere.

This is the pattern I see most often in mature platforms. The vulnerability is not that they do not know about SSRF. It is that their SSRF defenses are applied per-feature rather than per-network-boundary. Every outbound HTTP request from the server should pass through the same validation layer, regardless of which feature triggered it.
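One way to picture a per-network-boundary control is a single gateway object that every feature has to call, so the policy lives in one place. The sketch below is a design illustration under my own assumptions: the class name, the `resolve` and `send` callables, and the audit log are hypothetical and say nothing about how the platform is actually built.

```python
import ipaddress
from typing import Callable
from urllib.parse import urlparse


class OutboundGateway:
    """Single choke point for outbound requests from every feature."""

    def __init__(
        self,
        resolve: Callable[[str], list[str]],
        send: Callable[[str, str], object],
    ):
        self._resolve = resolve  # hostname -> list of IP strings
        self._send = send        # (url, vetted_ip) -> response
        self.audit_log: list[tuple[str, str, str]] = []

    def request(self, feature: str, url: str):
        """Validate the destination once, then send on behalf of `feature`."""
        host = urlparse(url).hostname
        ips = self._resolve(host) if host else []
        # Every resolved address must be globally routable.
        allowed = bool(ips) and all(
            ipaddress.ip_address(ip).is_global for ip in ips
        )
        self.audit_log.append((feature, url, "allowed" if allowed else "rejected"))
        if not allowed:
            raise PermissionError(f"{feature}: destination not allowed")
        # Hand the vetted IP to the sender so it does not resolve again.
        return self._send(url, ips[0])
```

With this shape, the upload, webhook, and rendering features all share one policy, and the audit log shows which feature attempted which destination. Libraries that make their own requests internally, like the image renderer in Finding 2, still need network-level egress controls because they never call the gateway.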

What I Learned

Map the full request surface, not just the obvious inputs. The most interesting SSRF vectors are rarely in the parameter named “url.” They are in webhook callbacks, file format parsers, async job configurations, and anywhere else the server might make an outbound request as a side effect.

Timing analysis is underrated. When you cannot see the response body (blind SSRF), response timing becomes your primary signal. A 12-second timeout versus a sub-second response is strong evidence that a connection attempt was made, especially when the difference is consistent across repeated runs.

Compare security controls across features. If one parameter blocks internal IPs and another parameter in the same API does not, that is not just a bug. It is evidence of a systemic gap in how security controls are applied. Document the comparison explicitly in your report because it makes the remediation path obvious.

Think about chaining, not just individual findings. A blind SSRF by itself might be Medium. A blind SSRF with attacker-controlled POST body fields targeting internal services with no authentication is High. Context matters more than the individual primitive.

Write clean reports. I spent almost as much time on the reports as I did on the research. Clear reproduction steps, evidence tables, severity justification, and specific remediation recommendations. Triagers are busy. Make their job easy and your findings get taken seriously.

Growth Trajectory

Six months ago I was running automated scanners and hoping something interesting would fall out. Today I am reading RFC specifications, studying image processing library internals, and mapping application architecture before I test a single input.

The shift from tool-driven recon to methodology-driven auditing is where the real growth happens in bug bounty. Tools find the easy stuff. Understanding how systems are built, where trust boundaries break down, and how security controls fail to scale across features is what finds the stuff that matters.

I am still early in this journey. But findings like these, where I can identify a systemic pattern across multiple code paths in a production platform used by thousands of companies, tell me I am heading in the right direction.

If you are getting into bug bounty or offensive security research, my advice is simple: stop looking for bugs and start understanding systems. The bugs will find you.


All research was conducted through authorized bug bounty programs with responsible disclosure. No customer data was accessed or exfiltrated. Test artifacts were cleaned up after submission.


Source · github.com/zionsworking/security-research-notebook · writeups/ssrf-via-image-pipeline.md