🦉 pallas: An R package to compose and query SPARQL
Introduction
The pallas package allows users to query SPARQL
endpoints from R. As composing SPARQL queries can be complicated,
pallas also provides a set of utilities in the S7 framework
to assist the user.
Installation instructions
Get the latest stable R release from CRAN. Then install
pallas from GitHub with remotes:
install.packages("remotes")
remotes::install_github("minotau-R/pallas")

We can use map_endpoint() to find out which SPARQL terms, namely
classes and predicates, are defined at a given endpoint.
map_endpoint() takes the URL of an endpoint as input and returns
an object of the OWL S7 class.
library(pallas)
endpoint_url <- "https://sparql.uniprot.org/"
x <- map_endpoint(endpoint_url)
#> adding rname '5e918bfe31a47685419ba31861ec8b48ec502dbd71a767e0b138d60c40148804'

OWL objects contain Web Ontology Language (OWL) terms. A
typical SPARQL query consists of at least three parts:
- prefix declarations
- query type (SELECT, ASK, CONSTRUCT, DESCRIBE) and arguments
- WHERE clause

This is also reflected in the OWL class:
x
#> pallas::OWL tbl S7_object.
#> PREFIX core: <http://purl.uniprot.org/core/>
#> PREFIX x01: <http://www.w3.org/2000/01/>
#> SELECT *
#> WHERE
#> {
#>
#> }
#> No 'where' found. Add where-clauses with `where_clause()`.

Printing the object shows the SPARQL code written so far, together
with remarks about what the query still needs. We can specify
where-clauses using the function where_clause():
x |>
  where_clause(C.Enzyme(?a)) |>
  where_clause(P.alternativeName(g:h ~ ?j)) |>
  where_clause(P.activity(?d ~ e:f),
               C.Cluster(b:c))
#> pallas::OWL tbl S7_object.
#> PREFIX core: <http://purl.uniprot.org/core/>
#> PREFIX x01: <http://www.w3.org/2000/01/>
#> SELECT *
#> WHERE
#> {
#> ?a a core:Enzyme .
#> g:h core:alternativeName ?j .
#> ?d core:activity e:f .
#> b:c a core:Cluster .
#> }
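
For comparison, the three parts listed earlier (prefix declarations, query type, WHERE clause) can also be assembled by hand. The following is a minimal sketch in base R only. build_query() is a hypothetical helper written for this illustration; it is not part of pallas and does not show how pallas builds queries internally.

```r
# Hypothetical helper (not part of pallas): assemble a SPARQL query string
# from its three parts using base R only.
build_query <- function(prefixes, query_type = "SELECT *", triples = character()) {
  # 1. Prefix declarations: one "PREFIX name: <iri>" line per named entry
  prefix_lines <- sprintf("PREFIX %s: <%s>", names(prefixes), prefixes)
  # 2. Query type and arguments are passed through as given
  # 3. WHERE clause: every triple pattern is indented and terminated by " ."
  where_lines <- if (length(triples) > 0) {
    paste0("  ", triples, " .")
  } else {
    character()
  }
  paste(c(prefix_lines, query_type, "WHERE", "{", where_lines, "}"),
        collapse = "\n")
}

prefixes <- c(core = "http://purl.uniprot.org/core/")
query <- build_query(
  prefixes,
  triples = c("?a a core:Enzyme",
              "?protein core:mnemonic 'A4_HUMAN'")
)
cat(query, "\n")
```

Writing the triples as plain strings makes it easy to misspell a class or predicate, which is the kind of mistake the C.* and P.* helpers shown above appear designed to prevent.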
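
The following example selects the protein with the mnemonic A4_HUMAN
and sends the query to the UniProt endpoint:
res <- x |>
  select_query("SELECT ?protein") |>
  where_clause(
    C.Protein(?protein),
    triple("?protein", "core:mnemonic", "'A4_HUMAN'")
  ) |>
  as.SPARQL() |>
  send_query(endpoint_url = "https://sparql.uniprot.org/")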
#> adding rname 'ba54eca6524fe4552b036fff5457e43adc2d8a8a9b05382d41d09ebd4574f3f0'
res
#> protein
#> 1 http://purl.uniprot.org/uniprot/P05067
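
The result holds full IRIs. A short sketch of extracting the trailing identifier in base R follows. It assumes that res is a data frame with a character column named protein, as the printed output suggests; the data frame is recreated by hand here so that the snippet runs without network access, and the accession column is a name introduced for this illustration.

```r
# Stand-in for the query result above (assumption: a data frame with a
# character column `protein` that holds IRIs).
res <- data.frame(
  protein = "http://purl.uniprot.org/uniprot/P05067",
  stringsAsFactors = FALSE
)

# Drop everything up to and including the last "/" to keep the identifier.
res$accession <- sub("^.*/", "", res$protein)
res$accession
#> [1] "P05067"
```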