Input file path and type not abstracted from rml mapping #10

Open
henrieglesorotos opened this issue Nov 9, 2023 · 8 comments

@henrieglesorotos

Currently the input file can't be parameterised via the CLI or the API; it is hardcoded in the mapping file. E.g.:

rml:logicalSource [ 
    rml:source "./examples/artists/Artist.csv" ;
    rml:referenceFormulation ql:CSV
  ]

It would be more flexible to be able to provide this as a parameter.

@henrieglesorotos henrieglesorotos changed the title Input file not abstracted from rml mapping Input file path and type not abstracted from rml mapping Nov 9, 2023
@henrieglesorotos
Author

Reckon it's something we could work on @anuzzolese? Also are there any tests?

@anuzzolese
Owner

Hi @henrieglesorotos, if I understood the problem correctly, I would say it is already implemented to some extent (maybe not the best solution, but we can discuss improvements). In fact, pyrml supports the parametrisation of RML mapping files by relying on Jinja2.

RML files processed by pyrml can accept parameters written in Jinja2 syntax, e.g.:

rml:logicalSource [ 
    rml:source "{{ source_file }}" ;
    rml:referenceFormulation ql:CSV
  ]
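
To see which parameters a templated mapping expects before running the converter, the `{{ name }}` placeholders can be listed with a regular expression. This is only a rough sketch: `find_template_params` is a hypothetical helper, not part of pyrml, and the regex approximates simple Jinja2 variables rather than using Jinja2's own parser. Single-brace references such as `{ID}` inside `rr:template` are left alone, since they are not Jinja2 syntax.

```python
import re

# Matches simple Jinja2 variables such as {{ source_file }}.
# Assumption: filters, expressions and {% ... %} blocks are not handled.
_PLACEHOLDER = re.compile(r"\{\{\s*([A-Za-z_][A-Za-z0-9_]*)\s*\}\}")


def find_template_params(mapping_text: str) -> list[str]:
    """Return the distinct placeholder names in order of first appearance."""
    seen: dict[str, None] = {}
    for name in _PLACEHOLDER.findall(mapping_text):
        seen.setdefault(name, None)
    return list(seen)


# Illustrative mapping fragment (not a complete RML document).
mapping = '''
rml:logicalSource [
    rml:source "{{ source_file }}" ;
    rml:referenceFormulation ql:CSV
  ] .
rr:subjectMap [ rr:template "http://example.org/{ID}" ] .
'''

params = find_template_params(mapping)
print(params)
```

Each name reported this way needs a matching key in the dictionary passed to the converter.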

Then, when you instantiate your mapper in the Python code, you can do something like this:

from pyrml import RMLConverter
from rdflib import Graph

rml_map_file: str = '/path_to_your_rml'

# Map each parameter defined in the RML file (here 'source_file') to its actual value.
# Named template_vars rather than vars to avoid shadowing the built-in vars().
template_vars = {'source_file': './examples/artists/Artist.csv'}

rml_mapper: RMLConverter = RMLConverter.get_instance()
g: Graph = rml_mapper.convert(rml_map_file, template_vars=template_vars)
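
Since the original request was about passing the input file on the command line, here is a minimal sketch of how `KEY=VALUE` arguments could be collected into that dictionary. The `--var` flag and the helpers `parse_template_vars`, `build_parser` and `run` are hypothetical and are not options of pyrml's `converter.py`; only `RMLConverter.get_instance()` and `convert(..., template_vars=...)` come from the snippet above.

```python
import argparse


def parse_template_vars(pairs: list[str]) -> dict[str, str]:
    """Turn ['key=value', ...] into {'key': 'value', ...}."""
    result: dict[str, str] = {}
    for pair in pairs:
        key, sep, value = pair.partition("=")
        if not sep or not key:
            raise ValueError(f"expected KEY=VALUE, got {pair!r}")
        result[key] = value
    return result


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(description="Run a templated RML mapping (sketch).")
    parser.add_argument("mapping", help="path to the RML mapping file")
    # Hypothetical flag: may be repeated, once per template parameter.
    parser.add_argument("--var", action="append", default=[], metavar="KEY=VALUE")
    return parser


def run(argv: list[str]):
    """Parse the arguments and hand the template variables to pyrml."""
    args = build_parser().parse_args(argv)
    template_vars = parse_template_vars(args.var)
    # Imported lazily so the argument handling works without pyrml installed.
    from pyrml import RMLConverter
    converter = RMLConverter.get_instance()
    return converter.convert(args.mapping, template_vars=template_vars)


# Argument handling only; run() is not called here.
args = build_parser().parse_args(
    ["mapping.ttl", "--var", "source_file=./examples/artists/Artist.csv"]
)
print(parse_template_vars(args.var))
```

Keeping the parsing separate from the conversion makes the argument handling easy to test without any mapping files.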

@henrieglesorotos
Author

This is excellent news! Can we add to the docs? Also - shall we create some simple tests if they don't exist?

@anuzzolese
Owner

Yes, contributing documentation and how-to guides would be most helpful.

@henrieglesorotos
Author

@anuzzolese

Having some issues. See example below:

We have some pre-existing RML rules in mapping.ttl:

@prefix rr: <http://www.w3.org/ns/r2rml#>.
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>.
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#>.
@prefix fnml: <http://semweb.mmlab.be/ns/fnml#>.
@prefix fno: <https://w3id.org/function/ontology#>.
@prefix d2rq: <http://www.wiwiss.fu-berlin.de/suhl/bizer/D2RQ/0.1#>.
@prefix void: <http://rdfs.org/ns/void#>.
@prefix dc: <http://purl.org/dc/terms/>.
@prefix foaf: <http://xmlns.com/foaf/0.1/>.
@prefix rml: <http://semweb.mmlab.be/ns/rml#>.
@prefix ql: <http://semweb.mmlab.be/ns/ql#>.
@prefix : <http://mapping.example.com/>.
@prefix dcterms: <http://purl.org/dc/terms/>.
@prefix skos: <http://www.w3.org/2004/02/skos/core#>.
@prefix industries: <https://data.beamery.com/naics/2022/industries/>.

:rules_000 a void:Dataset.
:source_000 a rml:LogicalSource;
    rml:source "input.json";
    rml:iterator "$";
    rml:referenceFormulation ql:JSONPath.
:rules_000 void:exampleResource :map_Concept_000.
:map_Concept_000 rml:logicalSource :source_000;
    a rr:TriplesMap;
    rdfs:label "Concept".
:s_000 a rr:SubjectMap.
:map_Concept_000 rr:subjectMap :s_000.
:s_000 rr:template "https://data.beamery.com/naics/2022/industries/{NAICS22}#this";
    rr:graphMap :gm_000.
:gm_000 a rr:GraphMap;
    rr:template "https://data.beamery.com/naics/2022/industries/{NAICS22}".
:pom_000 a rr:PredicateObjectMap.
:map_Concept_000 rr:predicateObjectMap :pom_000.
:pm_000 a rr:PredicateMap.
:pom_000 rr:predicateMap :pm_000.
:pm_000 rr:constant skos:example.
:pom_000 rr:objectMap :om_000.
:om_000 a rr:ObjectMap;
    rml:reference "Index Item Description";
    rr:termType rr:Literal;
    rml:languageMap :language_000.
:language_000 rr:constant "en".

Input file: input.json

{"NAICS22":"315990","Index Item Description":"Hats, cloth, cut and sewn from purchased fabric (except apparel contractors)"}
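
To rule out the input data as the cause, one quick check is whether every field the mapping refers to is present in the JSON record. The sketch below is an assumption-laden aid, not part of pyrml: `referenced_fields` is a hypothetical helper that pulls names out of `rr:template` and `rml:reference` strings with regular expressions, and it treats each reference as a plain top-level key rather than evaluating JSONPath. The mapping text is a shortened excerpt of the rules above.

```python
import json
import re


def referenced_fields(mapping_text: str) -> set[str]:
    """Collect field names used in rr:template and rml:reference strings."""
    fields: set[str] = set()
    for template in re.findall(r'rr:template\s+"([^"]*)"', mapping_text):
        # {NAME} inside a template refers to a field of the current record.
        fields.update(re.findall(r"\{([^{}]+)\}", template))
    fields.update(re.findall(r'rml:reference\s+"([^"]*)"', mapping_text))
    return fields


# Shortened excerpt of mapping.ttl, for illustration only.
mapping_excerpt = '''
:s_000 rr:template "https://data.beamery.com/naics/2022/industries/{NAICS22}#this";
:om_000 rml:reference "Index Item Description";
'''

record = json.loads(
    '{"NAICS22":"315990","Index Item Description":"Hats, cloth, cut and sewn '
    'from purchased fabric (except apparel contractors)"}'
)

# Fields the mapping needs but the record does not provide.
missing = referenced_fields(mapping_excerpt) - set(record)
print(sorted(missing))
```

An empty result means the record provides every referenced field, so the failure is unlikely to come from the input file.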

I am getting:

python converter.py -o test.ttl mapping.ttl
Traceback (most recent call last):
  File "/Users/henrieglesorotos/repos/pyrml/converter.py", line 65, in <module>
    PyrmlCMDTool().do_map()
  File "/Users/henrieglesorotos/repos/pyrml/converter.py", line 34, in do_map
    g = rml_converter.convert(self.__args.input, self.__args.m)
  File "/Users/henrieglesorotos/repos/pyrml/pyrml/pyrml_mapper.py", line 131, in convert
    triple_mappings = RMLParser.parse(rml_mapping)
  File "/Users/henrieglesorotos/repos/pyrml/pyrml/pyrml_mapper.py", line 46, in parse
    return TripleMappings.from_rdf(g)
  File "/Users/henrieglesorotos/repos/pyrml/pyrml/pyrml_core.py", line 1586, in from_rdf
    return set([TripleMappings.__build(g, row) for row in qres])
  File "/Users/henrieglesorotos/repos/pyrml/pyrml/pyrml_core.py", line 1586, in <listcomp>
    return set([TripleMappings.__build(g, row) for row in qres])
  File "/Users/henrieglesorotos/repos/pyrml/pyrml/pyrml_core.py", line 1594, in __build
    predicate_object_maps = PredicateObjectMap.from_rdf(g, row.tm)
  File "/Users/henrieglesorotos/repos/pyrml/pyrml/pyrml_core.py", line 752, in from_rdf
    return list(map(lmbd(g), qres))
  File "/Users/henrieglesorotos/repos/pyrml/pyrml/pyrml_core.py", line 751, in <lambda>
    lmbd = lambda graph : lambda row :  PredicateObjectMap.__build(graph, row)
  File "/Users/henrieglesorotos/repos/pyrml/pyrml/pyrml_core.py", line 758, in __build
    predicates = PredicateBuilder.build(g, row.pom)
  File "/Users/henrieglesorotos/repos/pyrml/pyrml/pyrml_core.py", line 669, in build
    predicates += PredicateMap.from_rdf(g, predicate_ref)
  File "/Users/henrieglesorotos/repos/pyrml/pyrml/pyrml_core.py", line 629, in from_rdf
    pm = PredicateMap(row.tripleMap, row.map, row.termType, row.predicateMap)
  File "/Users/henrieglesorotos/repos/pyrml/venv/lib/python3.9/site-packages/rdflib/query.py", line 124, in __getattr__
    raise AttributeError(name)
AttributeError: tripleMap

Any ideas?

@henrieglesorotos
Author

henrieglesorotos commented Nov 9, 2023

Btw, we generally work in YARRRML because it's simpler, and then convert it to RML using https://github.com/RMLio/yarrrml-parser

@henrieglesorotos
Author

FYI:

python --version == 3.9.0

pip freeze

click==8.1.7
decorator==5.1.1
Flask==2.2.2
importlib-metadata==6.8.0
isodate==0.6.1
itsdangerous==2.1.2
Jinja2==3.1.2
jsonpath-ng==1.5.3
lark-parser==0.12.0
MarkupSafe==2.1.3
numpy==1.23.4
pandas==1.5.1
ply==3.11
pyparsing==3.1.1
pyrml==0.3.0
python-dateutil==2.8.2
python-slugify==7.0.0
pytz==2023.3.post1
rdflib==6.2.0
shortuuid==1.0.9
six==1.16.0
SPARQLWrapper==2.0.0
text-unidecode==1.3
Unidecode==1.3.7
werkzeug==3.0.1
zipp==3.17.0

@henrieglesorotos
Author

Did you manage to replicate this, @anuzzolese?
