5465 Total CVEs
26 Years
GitHub
README.md
Rendering markdown...
POC / root_cause.md MD
# Root Cause Analysis: CVE-2026-26198

## The Bug in Three Sentences

Ormar's aggregate query methods (`min()`, `max()`, `sum()`, `avg()`) all accept a string parameter
representing the column name. The `sum()` and `avg()` methods validated this string against the
model's field definitions. The `min()` and `max()` methods did not, passing it directly to
`sqlalchemy.text()` — a raw SQL expression constructor.

## Code Path (Ormar 0.21.0)

### 1. Entry Point — `QuerySet.max()`

```python
# queryset.py, line ~721
async def max(self, columns: Union[str, List[str]]) -> Any:
    return await self._query_aggr_function(func_name="max", columns=columns)
```

### 2. Dispatcher — `_query_aggr_function()`

This method creates `SelectAction` objects for each requested column. The column string is
split by `__` (Ormar's prefix notation for related fields), and the last part becomes
`self.field_name`.

**No validation happens here.** The string `"(SELECT password FROM users)"` flows through
unchanged.

### 3. The Sink — `SelectAction.get_text_clause()`

```python
# select_action.py, lines 41-43
def get_text_clause(self) -> sqlalchemy.sql.expression.TextClause:
    alias = f"{self.table_prefix}_" if self.table_prefix else ""
    return sqlalchemy.text(f"{alias}{self.field_name}")
    #                              ^^^^^^^^^^^^^^^^
    #                              User input, no escaping
```

`sqlalchemy.text()` creates a raw SQL text expression. Anything passed to it is treated
as literal SQL. This is the **injection sink**.

### 4. The SQL Construction — `SelectAction.apply_func()`

The text clause is wrapped in the aggregate function:

```python
func.max(self.get_text_clause())
# Produces: max((SELECT password FROM users LIMIT 1))
```

This becomes part of the final query sent to the database engine.

## Why sum() and avg() Were Safe

```python
# In the _query_aggr_function() path for sum/avg:
if not column_field.is_numeric:
    raise QueryDefinitionError(
        f"Column {column} is not numeric..."
    )
```

For `sum()` and `avg()`, Ormar looked up the column in the model's field registry. If the
column didn't exist as a field, the lookup failed. If it existed but wasn't numeric, the
`is_numeric` check rejected it.

For `min()` and `max()`, this check was **skipped** because min/max can operate on non-numeric
types (strings, dates). The developer's reasoning was correct — min/max shouldn't require
numeric fields — but the implementation threw out ALL validation instead of just the numeric check.

## The Fix

The official fix in version 0.23.0 ensures that ALL aggregate methods validate the column
parameter against the model's field definitions. The validation confirms that the column name
refers to an actual field that exists on the model, regardless of its type.

### Correct approach (whitelist):
```python
def _validate_column(self, column: str) -> str:
    field_name = column.split("__")[-1]

    # Must be a valid identifier (defense-in-depth)
    if not re.match(r'^[a-zA-Z_][a-zA-Z0-9_]*$', field_name):
        raise ColumnValidationError(f"Invalid column name: {column!r}")

    # Must exist in the model (primary defense)
    if field_name not in self.model_fields:
        raise ColumnValidationError(f"Unknown column: {column!r}")

    return field_name
```

## Timeline

| Date | Event |
|------|-------|
| 2021-03-12 | Vulnerable code introduced in commit `ff9d412` (version 0.9.9) |
| 2026-02-11 | CVE reserved |
| 2026-02-24 | CVE published, CVSS 9.8 |
| 2026-02-24 | Ormar 0.23.0 released with fix |

## Lessons Learned

1. **Inconsistent validation is a code smell.** When some methods validate an input and others
   don't, the unvalidated paths become high-value targets.

2. **ORMs are not automatic SQL injection shields.** Any ORM method that accepts raw strings
   and passes them to text clauses or raw SQL builders requires explicit validation.

3. **"Works for non-numeric types" ≠ "skip all checks."** The original dev correctly identified
   that min/max shouldn't be limited to numeric fields, but the fix should have been to relax
   the *type* check while keeping the *existence* check.