In this article, we’re going to show you how to construct queries using a GraphQL IDE, how to make GraphQL API calls in R, and how to convert the returned JSON data into an R data frame.
Most of the screenshots in this blog post show old versions of the Splashback API and GraphQL IDE. Please keep this in mind when referring to them; they should still be useful. All text content and example code is up to date.
We’re going to be accessing Splashback’s GraphQL API as an example, so parts of this article are Splashback specific. These parts are in blue boxes like this one.
In this article, we’ve broken down the process of using GraphQL with R into three steps:
- Construct queries with a GraphQL IDE
- Perform GraphQL requests with R (ghql)
- Format retrieved data with R (jsonlite and tidyr)
Construct queries with a GraphQL IDE
GraphQL queries are far more flexible than REST API queries. We can specify very precisely what data we want returned.
This flexibility can make constructing a GraphQL query daunting. Thankfully, a GraphQL IDE makes constructing and testing queries more straightforward. We will be using the Splashback IDE, but Insomnia is a good alternative if you aren’t using Splashback.
Most GraphQL APIs require authentication, which usually takes the form of an API key. The docs on your specific API should tell you how to authenticate. We’ll show you how to authenticate using the Authorization HTTP header in this example.
For Splashback Users:
Note: you may need to log in again to see your new API key in your list of Splashback API keys.
Open your GraphQL IDE
Once your GraphQL IDE is open, enter your endpoint URL.
For Splashback users:
In your browser, go to
https://api.splashback.io/graphql/
This will take you to our browser based GraphQL IDE.

Authorization
We will need to enter the details for our Authorization HTTP header. You will need to use your values here.
{
"Authorization": "<type> <credentials>"
}
For Splashback users:
{
"Authorization": "API-Key XXXXXXXX-XXXX-XXXX-XXXX-XXXXXXXXXXXX"
}
Explore the schema
With the IDE you can explore the schema of the API you’re accessing.
This way you can see how to access the nodes you’re interested in from the endpoint you want to interact with.
I’m going to have a look at the schema for the sampleValues endpoint.
I want to fetch the parameter name and unit, qualifier, value, site name and location, and sample dateTime. So I need to see how to access these nodes via the sampleValues endpoint.
Write the query
I want to fetch the nodes parameter name and unit, qualifier, value, site name and location, and sample dateTime. It’s often important to also filter the data you work with: I want to filter by sample dateTime, site and parameter, and also remove any samples which have multiple sampleVariants.
Further below there’s some code which you can read and edit to help you get started making queries.

Note: GraphQL syntax can vary between implementations. This query was written for the Splashback API; if you are accessing a different API, the syntax may be different. Look around for more examples of GraphQL queries, and use the code completion and suggestion features of the IDE to help you understand the correct syntax for the API you’re working with.
The Splashback GraphQL IDE has code completion, and will also provide suggestions if you hit Ctrl+Space.
Run the query!
If you are using Splashback you can copy our query here:
query {
  sampleValues(
    #### The pool ID of our demo data set, Splash Valley
    poolId: "00000000-0000-0000-0000-000000001004"
    ############# Result filtering
    where: {
      sampleVariant:
        {sample:
          ######## Filtering by sample site (name and location) and sample dateTime
          ## {gt: "1990-01-24T00:00:00"} gt (greater than): only sample values recorded after 24/1/90 will fulfill the criteria
          ## Also available are eq (equal), neq (not equal), lt (less than), gte (greater than or
          ## equal), lte (less than or equal)
          ## {and: [{cond1}, {cond2}]} must satisfy condition 1 and condition 2 to fulfill the criteria
          ## {or: [{cond1}, {cond2}, {cond3}]} must satisfy condition 1 or condition 2 or condition 3 to fulfill the criteria
          {and: [
            {dateTime: {gt: "1990-01-24T00:00:00"}},
            {site:
              {or: [
                {and: [{name: {eq: "Splash River"}}, {location: {eq: "above mine"}}]},
                {and: [{name: {eq: "Splash River"}}, {location: {eq: "below mine"}}]},
                {and: [{name: {eq: "Pit Lake"}}, {location: {eq: "Middle"}}]},
                {and: [{name: {eq: "Pit Lake"}}, {location: {eq: "Outlet"}}]}
              ]
              }
            }
          ]
          }
        },
      ######## This line filters out samples which have multiple sample variants
      ## Only samples with no sample variant comment and no sample variant value will fulfill this criterion
      and: [{sampleVariant: {comment: {eq: ""}}}, {sampleVariant: {value: {eq: null}}}],
      ######## Here we select the parameters we would like sample values returned for
      ## Using in [list]: if the parameter name is in the list, the sample will fulfill this criterion
      ## nin (not in [list]) is also available
      parameter:
        {name:
          {in: [
            "pH field - sensor TC",
            "Nitrogen (Total) as N",
            "Water Temperature",
            "Sulphate (Dionex) as SO4"
          ]
          }
        }
    }
  )
  ################ Selecting fields to return
  {
    nodes {
      sampleVariant {
        comment
        value
        sample {
          dateTime
          site {
            name
            location
          }
        }
      }
      parameter {
        name
        unit
      }
      qualifier
      value
    }
  }
}
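To make the filter logic more concrete, here is a rough base R equivalent applied to a handful of made-up rows. The data frame, its column names and its values are invented purely for illustration (they are not real Splashback data), and on the real API all of this filtering happens on the server, not in R.

```r
# Toy rows, invented for illustration only
samples <- data.frame(
  dateTime  = as.POSIXct(c("1989-06-01 00:00:00", "1995-03-10 09:30:00",
                           "2001-07-04 14:00:00", "2003-11-20 08:15:00"),
                         tz = "UTC"),
  site      = c("Splash River", "Splash River", "Pit Lake", "Pit Lake"),
  location  = c("above mine", "below mine", "Middle", "Inlet"),
  parameter = c("Water Temperature", "Water Temperature",
                "Nitrogen (Total) as N", "Water Temperature"),
  stringsAsFactors = FALSE
)

cutoff <- as.POSIXct("1990-01-24 00:00:00", tz = "UTC")

# {or: [{and: [...]}, ...]}: a row matches if it equals any allowed name/location pair
allowed_sites <- data.frame(
  site     = c("Splash River", "Splash River", "Pit Lake", "Pit Lake"),
  location = c("above mine", "below mine", "Middle", "Outlet"),
  stringsAsFactors = FALSE
)
site_ok <- paste(samples$site, samples$location) %in%
  paste(allowed_sites$site, allowed_sites$location)

# {gt: ...} behaves like >, {in: [...]} like %in%, and {and: [...]} like &
keep <- samples$dateTime > cutoff &
  site_ok &
  samples$parameter %in% c("pH field - sensor TC", "Nitrogen (Total) as N",
                           "Water Temperature", "Sulphate (Dionex) as SO4")

filtered <- samples[keep, ]
```

Here the first row is dropped by the date condition and the last row by the site condition, which mirrors how the nested and/or blocks in the query narrow the result.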
Perform GraphQL requests with R
To make GraphQL requests from R we’ll use the package ghql.
Save your API key in your .Renviron file
Read this great article if you are not yet familiar with the .Renviron file. TL;DR, it is a secure way of storing sensitive values such as API keys.
Save the Authorization HTTP header value with your API key to your .Renviron file by putting the following content in it:
RAuthHeader="<type> <credentials>"
For Splashback users:
RAuthHeader="API-Key XXXXXXXX-XXXX-XXXX-XXXX-XXXXXXXXXXXX"
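A missing or misspelled variable makes Sys.getenv() return an empty string, not an error, so it can be worth failing early. Below is a minimal sketch of such a check; the helper name get_auth_header is our own invention for this example, and the Sys.setenv() call only stands in for a value that would normally come from .Renviron.

```r
# Hypothetical helper: read the header value and stop if it is missing
get_auth_header <- function(var = "RAuthHeader") {
  value <- Sys.getenv(var, unset = "")
  if (!nzchar(value)) {
    stop("Environment variable '", var, "' is not set. ",
         "Add it to your .Renviron file and restart R.", call. = FALSE)
  }
  value
}

# For demonstration only: set a placeholder value in this session.
# Normally the value is loaded from .Renviron when R starts.
Sys.setenv(RAuthHeader = "API-Key XXXXXXXX-XXXX-XXXX-XXXX-XXXXXXXXXXXX")
authheader <- get_auth_header()
```

R reads .Renviron at startup, so after editing the file either restart your R session or reload it with readRenviron("~/.Renviron").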
Create a GraphQL client
Make sure to replace the URL parameter with the same endpoint URL you entered in the GraphQL IDE:
For Splashback users:
url = "https://api.splashback.io/graphql/"
# Import ghql library
library("ghql")
# Get API key from .Renviron file
authheader <- Sys.getenv("RAuthHeader")
# Create instance of GraphQL client and set connection settings
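con <- GraphqlClient$new(
  # Placeholder: replace with the endpoint URL of your API
  url = "<https://url.of.your/api>",
  # Send the Authorization header with every request
  headers = list(Authorization = authheader)
)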
Make a request using the client
I’m going to use the query we wrote in the GraphQL IDE earlier (but with comments removed). Find the code for the next steps below.

# Create a query holder and register the query under the name "mysampledata"
query1 <- Query$new()
query1$query('mysampledata', '
{
  sampleValues(
    poolId: "00000000-0000-0000-0000-000000001004"
    where: {
      sampleVariant: {
        sample: {
          and: [
            {dateTime: {gt: "1990-01-24T00:00:00"}},
            {site: {or: [
              {and: [{name: {eq: "Splash River"}}, {location: {eq: "above mine"}}]},
              {and: [{name: {eq: "Splash River"}}, {location: {eq: "below mine"}}]},
              {and: [{name: {eq: "Pit Lake"}}, {location: {eq: "Middle"}}]},
              {and: [{name: {eq: "Pit Lake"}}, {location: {eq: "Outlet"}}]}
            ]}}
          ]
        }
      }
      and: [{sampleVariant: {comment: {eq: ""}}}, {sampleVariant: {value: {eq: null}}}],
      parameter: {name: {in: [
        "pH field - sensor TC",
        "Nitrogen (Total) as N",
        "Water Temperature",
        "Sulphate (Dionex) as SO4"
      ]}}
    }
  ) {
    nodes {
      sampleVariant {
        sample {
          dateTime
          site { name location }
        }
      }
      parameter { name unit }
      qualifier
      value
    }
  }
}')
# Send the query to the API; the result is a JSON string
result <- con$exec(query1$queries$mysampledata)
This call will return a JSON response, which will need some formatting before we can use it.
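A GraphQL server can report a problem inside the JSON body, in a top-level errors field, so it can help to check for that before formatting the data. The sketch below assumes the response follows that common shape; the helper name parse_graphql_response and both JSON strings are hand-written stand-ins for the output of con$exec(), not real API output.

```r
library("jsonlite")

# Hypothetical helper: parse a GraphQL JSON response and stop if it reports errors
parse_graphql_response <- function(json) {
  parsed <- fromJSON(json, flatten = TRUE)
  if (!is.null(parsed$errors)) {
    stop("GraphQL request failed: ",
         paste(parsed$errors$message, collapse = "; "), call. = FALSE)
  }
  parsed$data
}

# Hand-written responses standing in for the output of con$exec()
ok_json  <- '{"data": {"sampleValues": {"nodes": [{"value": 7.1, "qualifier": null}]}}}'
bad_json <- '{"errors": [{"message": "Unknown field foo"}]}'

data <- parse_graphql_response(ok_json)

# Capture the error message so that the failure case does not halt the script
outcome <- tryCatch(parse_graphql_response(bad_json),
                    error = function(e) conditionMessage(e))
```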
Format Retrieved Data into a data.frame with R
The retrieved data is not usable… yet! We want the data in a data.frame, which can easily be converted to a tibble or data.table later if required.
Data retrieved from the sampleValues endpoint is simpler to format than sample value data retrieved from the sampleVariants or samples endpoints.
We only need to remove the outer levels of nesting which do not contain any data.
# Import required libraries
library("jsonlite")
library("tidyr")
# Parse the JSON string; flatten = TRUE turns nested objects into columns
# with dotted names such as parameter.name
results1 <- fromJSON(result, flatten = TRUE)
# Gives data in a long format
results1 <- results1[["data"]][["sampleValues"]][["nodes"]]
# Convert to wide format
results1 <- pivot_wider(
  results1,
  names_from = c(parameter.name, parameter.unit),
  values_from = c(qualifier, value)
)
# For responses with irregular nesting, the unnest() function from tidyr can help:
# unnest()
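To see what these steps do without calling the API, the same code can be run on a small hand-written response. The JSON below only imitates the shape of the query result; the values and units are invented for illustration.

```r
library("jsonlite")
library("tidyr")

# Hand-written response in the same shape as the query result (values invented)
sample_json <- '{"data": {"sampleValues": {"nodes": [
  {"sampleVariant": {"sample": {"dateTime": "2001-07-04T14:00:00",
      "site": {"name": "Pit Lake", "location": "Middle"}}},
   "parameter": {"name": "Water Temperature", "unit": "C"},
   "qualifier": "", "value": 18.2},
  {"sampleVariant": {"sample": {"dateTime": "2001-07-04T14:00:00",
      "site": {"name": "Pit Lake", "location": "Middle"}}},
   "parameter": {"name": "Nitrogen (Total) as N", "unit": "mg/L"},
   "qualifier": "<", "value": 0.05}
]}}}'

# Long format: one row per sample value, nested fields flattened to dotted names
long <- fromJSON(sample_json, flatten = TRUE)[["data"]][["sampleValues"]][["nodes"]]
names(long)

# Wide format: one row per sample, one qualifier and value column per parameter
wide <- pivot_wider(
  long,
  names_from = c(parameter.name, parameter.unit),
  values_from = c(qualifier, value)
)
names(wide)
```

Both toy rows share the same dateTime and site, so they collapse into a single wide row with one qualifier column and one value column per parameter.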
If you would like to see more advanced GraphQL in R examples or how-tos on getting the most out of your Splashback data with R (think dashboards and interactive visualisations), then please let us know!
References
Introduction to ghql. Available: https://docs.ropensci.org/ghql/articles/ghql.html