kgsteward config file: supported YAML syntax
YAML 1.1 syntax is supported.
A YAML extension is available: !include <filename>
.
This directive will insert in place the content of filename
.
The path of <filename>
is interpreted with the directory of the parent YAML file as default directory.
This inclusion mechanism is executed early, before the YAML configuration is validated.
Within the YAM config file(s), UNIX environment variables can by referred to using ${...}
syntax.
Evaluation of these is performed at the time of command execution.
Hence ${...}
syntax cannot be used in !include
directive.
kgsteward YAML syntax
KGStewardConf
Top level YAML keys
Property | Type | Required | Possible Values | Deprecated | Default | Description |
---|---|---|---|---|---|---|
version | string |
✅ | kgsteward2.0 |
No description | ||
server | object |
✅ | GraphDBConf or RDF4JConf or FusekiConf | |||
dataset | object |
✅ | GraphConf or GraphSource | Mandatory key to specify the content of the knowledge graph in the triplestore | ||
context_base_IRI | string |
string | "http://example.org/context/" |
toto | ||
queries | array |
string | A list of paths to files with SPARQL queries to be add to the repository user interface. Each query is first checked for syntactic correctness by being submitted to the SPARQL endpoint, with a short timeout. The query result is not iteself checked. Wildcard * can be used. |
|||
validations | array |
string | A list of paths to files contining SPARQL queries used to validate the repsository. Wildcard * can be used. By convention, a valid result should be empty, i.e. no row is returned. Failed results should return rows permitting to diagnose the problems. |
Definitions
FusekiConf
Type: object
Property | Type | Required | Possible Values | Deprecated | Default | Description |
---|---|---|---|---|---|---|
brand | string |
✅ | fuseki |
One of ‘graphdb’ or ‘fuseki’ ( ‘graphdb’ by default). | ||
repository | string |
✅ | ^\w{1,32}$ |
The name of the ‘repository’ (GraphDB naming) or ‘dataset’ (fuseki) in the triplestore. | ||
location | string |
string | "http://localhost:3030" |
URL of the server. The SPARL endpoint is different and server specific. | ||
file_server_port | integer |
integer | Integer, 0 by default, i.e. the file server is turned off. When set to a positive integer, say 8000 , local files will be exposed through a temporary HTTP server and loaded from it. Support for different RDF file types and their compressed version depend on the tripelstore. The benefit is the that RDF data from file are processed with the same protocol as those supplied remotely through url . Essentially for GraphDB, file-size limits are suppressed and compressed formats are supported. Beware that the used python-based server is potentially insecure (see here for details). This should however pose no real treat if used on a personal computer or on a server that is behind a firewall. |
GraphConf
Type: object
Property | Type | Required | Possible Values | Deprecated | Default | Description |
---|---|---|---|---|---|---|
name | string |
✅ | ^[a-zA-Z]\w{0,31}$ |
Mandatory name of a graphs record. | ||
context | string |
string | IRI for ‘context’ in RDF4J/GraphDB terminology, or IRI for ‘named graph’ in RDF/SPARQL terminology. If missing, contect IRI will be built by concataining context_base_IRI and name |
|||
parent | array |
string | A list of names to declare dependency between graph records. Updating the parent datset will provoke the update of its children. | |||
frozen | boolean |
boolean | Frozen record, use -d |
|||
system | array |
string | A list of system command. This is a simple convenience provided by kgsteward which is not meant to be a replacement for serious Make-like system as for example git/dvc. | |||
file | array |
string | List of files containing RDF data. Wildcard * can be used. The strategy used to load these files will depends on if a file server is used (see file_server_port option`). With GraphDB, there might be a maximum file size (200 MB by default (?)) and compressed files may not be supported. Using a file server, these limitations are overcomed, but see the security warning described above. |
|||
url | array |
string | List of url from which to load RDF data | |||
stamp | array |
string | List of paths to files which last modification dates will used. The file contents are ignored. Wildcard * can be used. |
|||
replace | string |
string | Dictionary to perform string substitution in SPARQL queries from update list. Of uttermost interest is the ${TARGET_GRAPH_CONTEXT} which permit to restrict updates to the current context. |
|||
update | array |
string | List of files containing SPARQL update commands. Wildcard are not supported here! | |||
zenodo | array |
integer | Do not use! Fetch turtle files from zenodo. This is a completely ad hoc command developped for ENPKG (), that will be suppressed sooner or later |
GraphDBConf
Type: object
Property | Type | Required | Possible Values | Deprecated | Default | Description |
---|---|---|---|---|---|---|
brand | string |
✅ | graphdb |
One of ‘graphdb’ or ‘fuseki’ ( ‘graphdb’ by default). | ||
server_config | string |
✅ | string | Filename with the triplestore configuration, possibly a turtle file. graphdb_config is a deprecated synonym. This file can be saved from the UI interface of RDF4J/GraphDB after a first repository was created interactively, thus permitting to reproduce the repository configuration elsewhere. This file is used by the -I and -F options. Beware that the repository ID could be hard-coded in the config file and should be maintained in sync with repository . |
||
repository | string |
✅ | ^\w{1,32}$ |
The name of the ‘repository’ (GraphDB naming) or ‘dataset’ (fuseki) in the triplestore. | ||
description | integer |
integer | "Top level description" |
|||
location | string |
string | "http://localhost:7200" |
URL of the server. The SPARL endpoint is different and server specific. | ||
file_server_port | integer |
integer | Integer, 0 by default, i.e. the file server is turned off. When set to a positive integer, say 8000 , local files will be exposed through a temporary HTTP server and loaded from it. Support for different RDF file types and their compressed version depend on the tripelstore. The benefit is the that RDF data from file are processed with the same protocol as those supplied remotely through url . Essentially for GraphDB, file-size limits are suppressed and compressed formats are supported. Beware that the used python-based server is potentially insecure (see here for details). This should however pose no real treat if used on a personal computer or on a server that is behind a firewall. |
|||
username | string |
string | The name of a user with write-access rights in the triplestore. | |||
password | string |
string | The password of a user with write-access rights to the triplestore. It is recommended that the value of this variable is passed trough an environment variable. By this way the password is not stored explicitely in the config file. Alternatively ? can be used and the password will be asked interactively at run time. |
|||
prefixes | array |
string | No description |
GraphSource
Type: object
Property | Type | Required | Possible Values | Deprecated | Default | Description |
---|---|---|---|---|---|---|
source | string |
✅ | string |
RDF4JConf
Type: object
Property | Type | Required | Possible Values | Deprecated | Default | Description |
---|---|---|---|---|---|---|
brand | string |
✅ | rdf4j |
One of ‘graphdb’ or ‘fuseki’ ( ‘graphdb’ by default). | ||
repository | string |
✅ | ^\w{1,32}$ |
The name of the ‘repository’ (GraphDB naming) or ‘dataset’ (fuseki) in the triplestore. | ||
location | string |
string | "http://localhost:3030" |
URL of the server. The SPARL endpoint is different and server specific. | ||
file_server_port | integer |
integer | Integer, 0 by default, i.e. the file server is turned off. When set to a positive integer, say 8000 , local files will be exposed through a temporary HTTP server and loaded from it. Support for different RDF file types and their compressed version depend on the tripelstore. The benefit is the that RDF data from file are processed with the same protocol as those supplied remotely through url . Essentially for GraphDB, file-size limits are suppressed and compressed formats are supported. Beware that the used python-based server is potentially insecure (see here for details). This should however pose no real treat if used on a personal computer or on a server that is behind a firewall. |
Markdown generated with jsonschema-markdown 0.2.1 on 2024-12-19 15:51:40.