569 lines
20 KiB
Plaintext
569 lines
20 KiB
Plaintext
|
{
|
||
|
"cells": [
|
||
|
{
|
||
|
"cell_type": "markdown",
|
||
|
"id": "secret-front",
|
||
|
"metadata": {},
|
||
|
"source": [
|
||
|
"## Mapping *Data Stewardship terminology* and *Metadata* from spreadsheets to SKOS resources\n",
|
||
|
"\n",
|
||
|
"This Notebook implements a simple parser used to transform the Data Stewardship terminology and the Metadata, created in the Task 3.1 of the SSHOC project and published as spreadsheets, into SKOS resources. The parser reads the spreadsheets and transforms the content in SKOS data following a set of mapping rules, the result is stored in two Turtle files.\n",
|
||
|
"\n"
|
||
|
]
|
||
|
},
|
||
|
{
|
||
|
"cell_type": "code",
|
||
|
"execution_count": 1,
|
||
|
"id": "frozen-workplace",
|
||
|
"metadata": {},
|
||
|
"outputs": [],
|
||
|
"source": [
|
||
|
"import pandas as pd\n",
|
||
|
"import rdflib\n",
|
||
|
"import itertools\n",
|
||
|
"import yaml"
|
||
|
]
|
||
|
},
|
||
|
{
|
||
|
"cell_type": "markdown",
|
||
|
"id": "operational-respect",
|
||
|
"metadata": {},
|
||
|
"source": [
|
||
|
"The file *config.yaml* contains the external information used in the parsing, including the position of the spreadsheets. Set the correct values before running the Notebook."
|
||
|
]
|
||
|
},
|
||
|
{
|
||
|
"cell_type": "code",
|
||
|
"execution_count": 2,
|
||
|
"id": "streaming-wrestling",
|
||
|
"metadata": {},
|
||
|
"outputs": [],
|
||
|
"source": [
|
||
|
"try:\n",
|
||
|
" with open(\"config.yaml\", 'r') as stream:\n",
|
||
|
" try:\n",
|
||
|
" conf=yaml.safe_load(stream)\n",
|
||
|
" except yaml.YAMLError as exc:\n",
|
||
|
" print(exc)\n",
|
||
|
"except FileNotFoundError:\n",
|
||
|
" print('Warning config.yaml file not present! Please store it in the same directory as the notebook')\n",
|
||
|
"#print (conf)"
|
||
|
]
|
||
|
},
|
||
|
{
|
||
|
"cell_type": "markdown",
|
||
|
"id": "polar-tenant",
|
||
|
"metadata": {},
|
||
|
"source": [
|
||
|
"The following cells defines the *Namespaces* used in the parsing"
|
||
|
]
|
||
|
},
|
||
|
{
|
||
|
"cell_type": "code",
|
||
|
"execution_count": 3,
|
||
|
"id": "miniature-frontier",
|
||
|
"metadata": {},
|
||
|
"outputs": [],
|
||
|
"source": [
|
||
|
"from rdflib.namespace import DC, DCAT, DCTERMS, OWL, \\\n",
|
||
|
" RDF, RDFS, SKOS, \\\n",
|
||
|
" XMLNS, XSD, XMLNS\n",
|
||
|
"from rdflib import Namespace\n",
|
||
|
"from rdflib import URIRef, BNode, Literal\n",
|
||
|
"\n",
|
||
|
"sshocterm = Namespace(conf['Namespaces']['SSHOCTERM'])\n",
|
||
|
"sshoccmd= Namespace(conf['Namespaces']['SSHOCCMD'])\n",
|
||
|
"dc11=Namespace(\"http://purl.org/dc/elements/1.1/\");\n",
|
||
|
"dct = Namespace(\"http://purl.org/dc/terms/\")\n",
|
||
|
"iso369=Namespace(\"http://id.loc.gov/vocabulary/iso639-3\");"
|
||
|
]
|
||
|
},
|
||
|
{
|
||
|
"cell_type": "markdown",
|
||
|
"id": "rubber-interval",
|
||
|
"metadata": {},
|
||
|
"source": [
|
||
|
"Download **Data Stewardship terminology** spreadsheet and show it to check if the operation has been executed correctly"
|
||
|
]
|
||
|
},
|
||
|
{
|
||
|
"cell_type": "code",
|
||
|
"execution_count": 4,
|
||
|
"id": "stable-olympus",
|
||
|
"metadata": {},
|
||
|
"outputs": [],
|
||
|
"source": [
|
||
|
"url=conf['Source']['VOCABULARYSOURCE']\n",
|
||
|
"df_data=pd.read_csv(url)"
|
||
|
]
|
||
|
},
|
||
|
{
|
||
|
"cell_type": "code",
|
||
|
"execution_count": 5,
|
||
|
"id": "medical-investigator",
|
||
|
"metadata": {},
|
||
|
"outputs": [],
|
||
|
"source": [
|
||
|
"df_data.rename(columns = {'Unnamed: 0': 'Concept ID', 'Unnamed: 1':'Subject', 'Unnamed: 2':'Term',\n",
|
||
|
" 'Unnamed: 3':'Source of definition', 'Translations':'Dutch', 'Unnamed: 5':'French', \n",
|
||
|
" 'Unnamed: 6':'German', 'Unnamed: 7':'Greek',\n",
|
||
|
" 'Unnamed: 8':'Italian', 'Unnamed: 9':'Slovenian',\n",
|
||
|
" 'Linking':'Loterre Open Science Thesaurus', 'Unnamed: 11':'Terms4FAIRSkills',\n",
|
||
|
" 'Unnamed: 12':'CCR metadata', 'Unnamed: 13':'Linked Open Vocabularies',\n",
|
||
|
" 'Unnamed: 14':'LOV 2', 'Unnamed: 15':'ISO',\n",
|
||
|
" 'Unnamed: 16':'Broader Concept'}, inplace = True)\n",
|
||
|
"df_data=df_data.drop(0)"
|
||
|
]
|
||
|
},
|
||
|
{
|
||
|
"cell_type": "markdown",
|
||
|
"id": "fuzzy-disney",
|
||
|
"metadata": {},
|
||
|
"source": [
|
||
|
"Create a graph for the SKOS data and binds the namespaces to it"
|
||
|
]
|
||
|
},
|
||
|
{
|
||
|
"cell_type": "code",
|
||
|
"execution_count": 6,
|
||
|
"id": "fabulous-remains",
|
||
|
"metadata": {},
|
||
|
"outputs": [],
|
||
|
"source": [
|
||
|
"c1rdf = rdflib.Graph()\n",
|
||
|
"c1rdf.bind(\"sshocterm\", sshocterm)\n",
|
||
|
"c1rdf.bind(\"dc11\", dc11)\n",
|
||
|
"c1rdf.bind(\"dct\", dct)\n",
|
||
|
"c1rdf.bind(\"iso369-3\", iso369)\n",
|
||
|
"c1rdf.bind(\"skos\", SKOS)\n",
|
||
|
"c1rdf.bind(\"dc\", DC)\n",
|
||
|
"c1rdf.bind(\"rdf\", RDF)\n",
|
||
|
"c1rdf.bind(\"owl\", OWL)\n",
|
||
|
"c1rdf.bind(\"xsd\", XSD)\n"
|
||
|
]
|
||
|
},
|
||
|
{
|
||
|
"cell_type": "markdown",
|
||
|
"id": "stainless-pennsylvania",
|
||
|
"metadata": {},
|
||
|
"source": [
|
||
|
"Insert in the graph the *SKOS.ConceptScheme*"
|
||
|
]
|
||
|
},
|
||
|
{
|
||
|
"cell_type": "code",
|
||
|
"execution_count": 7,
|
||
|
"id": "catholic-mortgage",
|
||
|
"metadata": {},
|
||
|
"outputs": [
|
||
|
{
|
||
|
"data": {
|
||
|
"text/plain": [
|
||
|
"<Graph identifier=Nb61b2c34a07d4dc69f11be18779c6cff (<class 'rdflib.graph.Graph'>)>"
|
||
|
]
|
||
|
},
|
||
|
"execution_count": 7,
|
||
|
"metadata": {},
|
||
|
"output_type": "execute_result"
|
||
|
}
|
||
|
],
|
||
|
"source": [
|
||
|
"title=Literal(conf['Texts']['VOCABULARYTITLE'], lang=conf['Texts']['LANG'])\n",
|
||
|
"description=Literal(conf['Texts']['VOCABULARYDESCRIPTION'], lang=conf['Texts']['LANG'])\n",
|
||
|
"identifier=Literal(conf['Texts']['VOCABULARYID'], lang=conf['Texts']['LANG'])\n",
|
||
|
"createddate= Literal(conf['Texts']['VOCABULARYCREATEDATE'],datatype=XSD.date)\n",
|
||
|
"moddate= Literal(conf['Texts']['VOCABULARYMODDATE'],datatype=XSD.date)\n",
|
||
|
"version= Literal(conf['Texts']['VOCABULARYVERSION'],datatype=XSD.string)\n",
|
||
|
"\n",
|
||
|
"c1rdf.add((sshocterm[''], RDF.type, SKOS.ConceptScheme))\n",
|
||
|
"c1rdf.add((sshocterm[''], DC.title, title))\n",
|
||
|
"c1rdf.add((sshocterm[''], DC.identifier, identifier))\n",
|
||
|
"c1rdf.add((sshocterm[''], DC.description, description))\n",
|
||
|
"c1rdf.add((sshocterm[''], dct.created, createddate))\n",
|
||
|
"c1rdf.add((sshocterm[''], dct.modified, createddate))\n",
|
||
|
"c1rdf.add((sshocterm[''], OWL.versionInfo, version))\n",
|
||
|
"c1rdf.add((sshocterm[''], dct.language, iso369.eng))\n",
|
||
|
"c1rdf.add((sshocterm[''], dct.language, iso369.ger))\n",
|
||
|
"c1rdf.add((sshocterm[''], dct.language, iso369.fra))\n",
|
||
|
"c1rdf.add((sshocterm[''], dct.language, iso369.ell))\n",
|
||
|
"c1rdf.add((sshocterm[''], dct.language, iso369.ita))\n",
|
||
|
"c1rdf.add((sshocterm[''], dct.language, iso369.dut))\n",
|
||
|
"c1rdf.add((sshocterm[''], dct.language, iso369.slv))"
|
||
|
]
|
||
|
},
|
||
|
{
|
||
|
"cell_type": "code",
|
||
|
"execution_count": 8,
|
||
|
"id": "encouraging-click",
|
||
|
"metadata": {},
|
||
|
"outputs": [],
|
||
|
"source": [
|
||
|
"#c1rdf.serialize(destination='data/skostest.rdf', format=\"n3\");#format=\"pretty-xml\")\n",
|
||
|
"#comrdf.serialize(destination='data/parsed_rdf/prima_cantica_forme_com.rdf', format=\"n3\");"
|
||
|
]
|
||
|
},
|
||
|
{
|
||
|
"cell_type": "markdown",
|
||
|
"id": "regulation-vehicle",
|
||
|
"metadata": {},
|
||
|
"source": [
|
||
|
"The following cell implements the mapping rules for creating SKOS resources."
|
||
|
]
|
||
|
},
|
||
|
{
|
||
|
"cell_type": "code",
|
||
|
"execution_count": 9,
|
||
|
"id": "amended-joshua",
|
||
|
"metadata": {},
|
||
|
"outputs": [
|
||
|
{
|
||
|
"name": "stdout",
|
||
|
"output_type": "stream",
|
||
|
"text": [
|
||
|
"4371\n"
|
||
|
]
|
||
|
}
|
||
|
],
|
||
|
"source": [
|
||
|
"\n",
|
||
|
"for index, row in df_data.iterrows():\n",
|
||
|
" \n",
|
||
|
" if row.Subject.lower()==\"preflabel\":\n",
|
||
|
" label=row[\"Concept ID\"]\n",
|
||
|
" enlabel=Literal(row[\"Term\"], lang='en')\n",
|
||
|
" frlabel=Literal(row[\"French\"], lang='fr')\n",
|
||
|
" nllabel=Literal(row['Dutch'], lang='nl')\n",
|
||
|
" delabel=Literal(row['German'], lang='de')\n",
|
||
|
" itlabel=Literal(row['Italian'], lang='it')\n",
|
||
|
" sllabel=Literal(row['Slovenian'], lang='sl')\n",
|
||
|
" ellabel=Literal(row['Greek'], lang='el')\n",
|
||
|
" \n",
|
||
|
" c1rdf.add((sshocterm[label], RDF.type, SKOS.Concept))\n",
|
||
|
" c1rdf.add((sshocterm[label], SKOS.inScheme, sshocterm['']))\n",
|
||
|
" c1rdf.add((sshocterm[label], SKOS.topConceptOf, sshocterm['']))\n",
|
||
|
" c1rdf.add((sshocterm[label], SKOS.prefLabel, enlabel))\n",
|
||
|
" c1rdf.add((sshocterm[label], SKOS.prefLabel, frlabel))\n",
|
||
|
" c1rdf.add((sshocterm[label], SKOS.prefLabel, nllabel))\n",
|
||
|
" c1rdf.add((sshocterm[label], SKOS.prefLabel, delabel))\n",
|
||
|
" c1rdf.add((sshocterm[label], SKOS.prefLabel, itlabel))\n",
|
||
|
" c1rdf.add((sshocterm[label], SKOS.prefLabel, sllabel))\n",
|
||
|
" c1rdf.add((sshocterm[label], SKOS.prefLabel, ellabel))\n",
|
||
|
" if row.Subject.lower()==\"altlabel\":\n",
|
||
|
" if not pd.isna(row['Term']):\n",
|
||
|
" c1rdf.add((sshocterm[label], SKOS.altLabel, Literal(row[\"Term\"], lang='en')))\n",
|
||
|
" if not pd.isna(row['French']):\n",
|
||
|
" c1rdf.add((sshocterm[label], SKOS.altLabel, Literal(row[\"French\"], lang='fr')))\n",
|
||
|
" if not pd.isna(row['Dutch']):\n",
|
||
|
" c1rdf.add((sshocterm[label], SKOS.altLabel, Literal(row[\"Dutch\"], lang='nl')))\n",
|
||
|
" if not pd.isna(row['German']):\n",
|
||
|
" c1rdf.add((sshocterm[label], SKOS.altLabel, Literal(row[\"German\"], lang='de')))\n",
|
||
|
" if not pd.isna(row['Italian']):\n",
|
||
|
" c1rdf.add((sshocterm[label], SKOS.altLabel, Literal(row[\"Italian\"], lang='it')))\n",
|
||
|
" if not pd.isna(row['Slovenian']):\n",
|
||
|
" c1rdf.add((sshocterm[label], SKOS.altLabel, Literal(row[\"Slovenian\"], lang='sl')))\n",
|
||
|
" if not pd.isna(row['Greek']):\n",
|
||
|
" c1rdf.add((sshocterm[label], SKOS.altLabel, Literal(row[\"Greek\"], lang='el')))\n",
|
||
|
" \n",
|
||
|
" if row.Subject.lower()==\"definition\":\n",
|
||
|
" endef=Literal(row[\"Term\"], lang='en')\n",
|
||
|
" frdef=Literal(row[\"French\"], lang='fr')\n",
|
||
|
" nldef=Literal(row['Dutch'], lang='nl')\n",
|
||
|
" dedef=Literal(row['German'], lang='de')\n",
|
||
|
" itdef=Literal(row['Italian'], lang='it')\n",
|
||
|
" sldef=Literal(row['Slovenian'], lang='sl')\n",
|
||
|
" eldef=Literal(row['Greek'], lang='el')\n",
|
||
|
" \n",
|
||
|
" c1rdf.add((sshocterm[label], SKOS.definition, endef))\n",
|
||
|
" c1rdf.add((sshocterm[label], SKOS.definition, frdef))\n",
|
||
|
" c1rdf.add((sshocterm[label], SKOS.definition, nldef))\n",
|
||
|
" c1rdf.add((sshocterm[label], SKOS.definition, dedef))\n",
|
||
|
" c1rdf.add((sshocterm[label], SKOS.definition, itdef))\n",
|
||
|
" c1rdf.add((sshocterm[label], SKOS.definition, sldef))\n",
|
||
|
" c1rdf.add((sshocterm[label], SKOS.definition, eldef))\n",
|
||
|
" if not pd.isna(row['Source of definition']):\n",
|
||
|
" source=Literal(row['Source of definition'])\n",
|
||
|
" #print (f'{label}, {source}')\n",
|
||
|
" c1rdf.add((sshocterm[label], dct.source, source))\n",
|
||
|
" if not pd.isna(row['Loterre Open Science Thesaurus']):\n",
|
||
|
" lote=URIRef(row['Loterre Open Science Thesaurus'])\n",
|
||
|
" c1rdf.add((sshocterm[label], SKOS.exactMatch, lote))\n",
|
||
|
" \n",
|
||
|
" if not pd.isna(row['Linked Open Vocabularies']):\n",
|
||
|
" lov=URIRef(row['Linked Open Vocabularies'])\n",
|
||
|
" c1rdf.add((sshocterm[label], SKOS.exactMatch, lov))\n",
|
||
|
" \n",
|
||
|
" if not pd.isna(row['LOV 2']):\n",
|
||
|
" lov2=URIRef(row['LOV 2'])\n",
|
||
|
" c1rdf.add((sshocterm[label], SKOS.exactMatch, lov2))\n",
|
||
|
" #Terms4FAIRSkills ISO \n",
|
||
|
" if not pd.isna(row['Terms4FAIRSkills']):\n",
|
||
|
" t4fs=Literal(row['Terms4FAIRSkills'])\n",
|
||
|
" c1rdf.add((sshocterm[label], SKOS.note, t4fs))\n",
|
||
|
" if not pd.isna(row['ISO']):\n",
|
||
|
" tiso=Literal(row['ISO'])\n",
|
||
|
" c1rdf.add((sshocterm[label], SKOS.note, tiso))\n",
|
||
|
" if not pd.isna(row['Broader Concept']):\n",
|
||
|
" broc=URIRef(row['Broader Concept'])\n",
|
||
|
" c1rdf.add((sshocterm[label], SKOS.broadMatch, broc))\n",
|
||
|
" \n",
|
||
|
"print(len(c1rdf))"
|
||
|
]
|
||
|
},
|
||
|
{
|
||
|
"cell_type": "code",
|
||
|
"execution_count": 10,
|
||
|
"id": "tutorial-zimbabwe",
|
||
|
"metadata": {},
|
||
|
"outputs": [],
|
||
|
"source": [
|
||
|
"#for s, p, o in c1rdf.triples((None, None, None)):\n",
|
||
|
"# print(\"{} {}\".format(s, o.n3))"
|
||
|
]
|
||
|
},
|
||
|
{
|
||
|
"cell_type": "markdown",
|
||
|
"id": "violent-reproduction",
|
||
|
"metadata": {},
|
||
|
"source": [
|
||
|
"Create a *Turtle* file in the **/data** directory with the SKOS resources for **Data Stewardship terminology** "
|
||
|
]
|
||
|
},
|
||
|
{
|
||
|
"cell_type": "code",
|
||
|
"execution_count": 11,
|
||
|
"id": "sweet-mixer",
|
||
|
"metadata": {},
|
||
|
"outputs": [],
|
||
|
"source": [
|
||
|
"c1rdf.serialize(destination='data/mdstskos.rdf', format=\"n3\");#format=\"pretty-xml\")\n",
|
||
|
"#comrdf.serialize(destination='data/parsed_rdf/prima_cantica_forme_com.rdf', format=\"n3\");"
|
||
|
]
|
||
|
},
|
||
|
{
|
||
|
"cell_type": "markdown",
|
||
|
"id": "quiet-rehabilitation",
|
||
|
"metadata": {},
|
||
|
"source": [
|
||
|
"Download **Metadata** spreadsheet and show it to check if the operation has been executed correctly"
|
||
|
]
|
||
|
},
|
||
|
{
|
||
|
"cell_type": "code",
|
||
|
"execution_count": 12,
|
||
|
"id": "electrical-hydrogen",
|
||
|
"metadata": {},
|
||
|
"outputs": [],
|
||
|
"source": [
|
||
|
"mdurl=conf['Source']['METADATASOURCE']\n",
|
||
|
"df_metadata=pd.read_csv(mdurl)"
|
||
|
]
|
||
|
},
|
||
|
{
|
||
|
"cell_type": "code",
|
||
|
"execution_count": 13,
|
||
|
"id": "multiple-berry",
|
||
|
"metadata": {},
|
||
|
"outputs": [],
|
||
|
"source": [
|
||
|
"df_metadata.rename(columns = {'English': 'Englishterm', 'Unnamed: 1':'Englishdefinition', 'Unnamed: 2':'source',\n",
|
||
|
" 'Unnamed: 3':'URI', 'Dutch':'Dutchterm', 'Unnamed: 5':'Dutchdefinition', \n",
|
||
|
" 'French':'Frenchterm', 'Unnamed: 7':'Frenchdefinition',\n",
|
||
|
" 'Greek':'Greekterm', 'Unnamed: 9':'Greekdefinition',\n",
|
||
|
" 'Italian':'Italianterm', 'Unnamed: 11':'Italiandefinition'}, inplace = True)\n",
|
||
|
"df_metadata=df_metadata.drop(0)\n"
|
||
|
]
|
||
|
},
|
||
|
{
|
||
|
"cell_type": "markdown",
|
||
|
"id": "promising-gender",
|
||
|
"metadata": {},
|
||
|
"source": [
|
||
|
"Create a graph for the SKOS data and binds the namespaces to it"
|
||
|
]
|
||
|
},
|
||
|
{
|
||
|
"cell_type": "code",
|
||
|
"execution_count": 14,
|
||
|
"id": "positive-library",
|
||
|
"metadata": {},
|
||
|
"outputs": [],
|
||
|
"source": [
|
||
|
"ccr = rdflib.Graph()\n",
|
||
|
"ccr.bind(\"sshoccmd\", sshoccmd)\n",
|
||
|
"ccr.bind(\"dc11\", dc11)\n",
|
||
|
"ccr.bind(\"dct\", dct)\n",
|
||
|
"ccr.bind(\"iso369-3\", iso369)\n",
|
||
|
"ccr.bind(\"skos\", SKOS)\n",
|
||
|
"ccr.bind(\"dc\", DC)\n",
|
||
|
"ccr.bind(\"rdf\", RDF)\n",
|
||
|
"ccr.bind(\"owl\", OWL)\n",
|
||
|
"ccr.bind(\"xsd\", XSD)\n"
|
||
|
]
|
||
|
},
|
||
|
{
|
||
|
"cell_type": "code",
|
||
|
"execution_count": 15,
|
||
|
"id": "outside-dressing",
|
||
|
"metadata": {},
|
||
|
"outputs": [
|
||
|
{
|
||
|
"data": {
|
||
|
"text/plain": [
|
||
|
"<Graph identifier=N18d0f9a944dc43bdb1a1f4c0131302a3 (<class 'rdflib.graph.Graph'>)>"
|
||
|
]
|
||
|
},
|
||
|
"execution_count": 15,
|
||
|
"metadata": {},
|
||
|
"output_type": "execute_result"
|
||
|
}
|
||
|
],
|
||
|
"source": [
|
||
|
"title=Literal(conf['Texts']['METADATATITLE'], lang=conf['Texts']['LANG'])\n",
|
||
|
"description=Literal(conf['Texts']['METADATADESCRIPTION'], lang=conf['Texts']['LANG'])\n",
|
||
|
"identifier=Literal(conf['Texts']['METADATAID'], lang=conf['Texts']['LANG'])\n",
|
||
|
"createddate= Literal(conf['Texts']['METADATACREATEDATE'],datatype=XSD.date)\n",
|
||
|
"moddate= Literal(conf['Texts']['METADATAMODDATE'],datatype=XSD.date)\n",
|
||
|
"version= Literal(conf['Texts']['METADATAVERSION'],datatype=XSD.string)\n",
|
||
|
"\n",
|
||
|
"ccr.add((sshoccmd[''], RDF.type, SKOS.ConceptScheme))\n",
|
||
|
"ccr.add((sshoccmd[''], DC.title, title))\n",
|
||
|
"ccr.add((sshoccmd[''], DC.description, description))\n",
|
||
|
"ccr.add((sshoccmd[''], DC.identifier, identifier))\n",
|
||
|
"ccr.add((sshoccmd[''], dct.created, createddate))\n",
|
||
|
"ccr.add((sshoccmd[''], dct.modified, createddate))\n",
|
||
|
"ccr.add((sshoccmd[''], OWL.versionInfo, version))\n",
|
||
|
"ccr.add((sshoccmd[''], dct.language, iso369.eng))\n",
|
||
|
"ccr.add((sshoccmd[''], dct.language, iso369.ger))\n",
|
||
|
"ccr.add((sshoccmd[''], dct.language, iso369.fra))\n",
|
||
|
"ccr.add((sshoccmd[''], dct.language, iso369.ell))\n",
|
||
|
"ccr.add((sshoccmd[''], dct.language, iso369.ita))\n",
|
||
|
"ccr.add((sshoccmd[''], dct.language, iso369.dut))\n",
|
||
|
"ccr.add((sshoccmd[''], dct.language, iso369.slv))"
|
||
|
]
|
||
|
},
|
||
|
{
|
||
|
"cell_type": "markdown",
|
||
|
"id": "contemporary-familiar",
|
||
|
"metadata": {},
|
||
|
"source": [
|
||
|
"The following cell implements the mapping rules for creating SKOS resources."
|
||
|
]
|
||
|
},
|
||
|
{
|
||
|
"cell_type": "code",
|
||
|
"execution_count": 16,
|
||
|
"id": "comparative-matthew",
|
||
|
"metadata": {},
|
||
|
"outputs": [
|
||
|
{
|
||
|
"name": "stdout",
|
||
|
"output_type": "stream",
|
||
|
"text": [
|
||
|
"3027\n"
|
||
|
]
|
||
|
}
|
||
|
],
|
||
|
"source": [
|
||
|
"for index, row in df_metadata.iterrows():\n",
|
||
|
" \n",
|
||
|
" label=row[\"URI\"]\n",
|
||
|
" urilabel=URIRef(label)\n",
|
||
|
" lastslash=label.rfind('/')\n",
|
||
|
" label='sshoc_'+label[lastslash+1:]\n",
|
||
|
" \n",
|
||
|
" \n",
|
||
|
" strsource=row['source']\n",
|
||
|
" \n",
|
||
|
" strsource=strsource.replace('(source: ','')\n",
|
||
|
" strsource=strsource.replace(')','')\n",
|
||
|
" source=Literal(strsource)\n",
|
||
|
" enterm=Literal(row[\"Englishterm\"], lang='en')\n",
|
||
|
" frterm=Literal(row[\"Frenchterm\"], lang='fr')\n",
|
||
|
" nlterm=Literal(row['Dutchterm'], lang='nl')\n",
|
||
|
" #determ=Literal(row['Germanterm'], lang='de')\n",
|
||
|
" itterm=Literal(row['Italianterm'], lang='it')\n",
|
||
|
" #slterm=Literal(row['Slovenianterm'], lang='sl')\n",
|
||
|
" elterm=Literal(row['Greekterm'], lang='el')\n",
|
||
|
" \n",
|
||
|
" endef=Literal(row[\"Englishdefinition\"], lang='en')\n",
|
||
|
" frdef=Literal(row[\"Frenchdefinition\"], lang='fr')\n",
|
||
|
" nldef=Literal(row['Dutchdefinition'], lang='nl')\n",
|
||
|
" #dedef=Literal(row['Germandefinition'], lang='de')\n",
|
||
|
" itdef=Literal(row['Italiandefinition'], lang='it')\n",
|
||
|
" #sldef=Literal(row['Sloveniandefinition'], lang='sl')\n",
|
||
|
" eldef=Literal(row['Greekdefinition'], lang='el')\n",
|
||
|
" \n",
|
||
|
" ccr.add((sshoccmd[label], RDF.type, SKOS.Concept))\n",
|
||
|
" ccr.add((sshoccmd[label], SKOS.prefLabel, enterm))\n",
|
||
|
" ccr.add((sshoccmd[label], SKOS.prefLabel, frterm))\n",
|
||
|
" ccr.add((sshoccmd[label], SKOS.prefLabel, nlterm))\n",
|
||
|
" #ccr.add(sshoccmd[label], SKOS.prefLabel, determ))\n",
|
||
|
" ccr.add((sshoccmd[label], SKOS.prefLabel, itterm))\n",
|
||
|
" #ccr.add((sshoccmd[label], SKOS.prefLabel, slterm))\n",
|
||
|
" ccr.add((sshoccmd[label], SKOS.prefLabel, elterm))\n",
|
||
|
" \n",
|
||
|
" ccr.add((sshoccmd[label], SKOS.definition, endef))\n",
|
||
|
" ccr.add((sshoccmd[label], SKOS.definition, frdef))\n",
|
||
|
" ccr.add((sshoccmd[label], SKOS.definition, nldef))\n",
|
||
|
" #ccr.add(sshoccmd[label], SKOS.definition, dedef))\n",
|
||
|
" ccr.add((sshoccmd[label], SKOS.definition, itdef))\n",
|
||
|
" #ccr.add((sshoccmd[label], SKOS.definition, sldef))\n",
|
||
|
" ccr.add((sshoccmd[label], SKOS.definition, eldef))\n",
|
||
|
" \n",
|
||
|
" ccr.add((sshoccmd[label], dct.source, source))\n",
|
||
|
" ccr.add((sshoccmd[label], SKOS.exactMatch, urilabel))\n",
|
||
|
" \n",
|
||
|
" \n",
|
||
|
"print(len(ccr))"
|
||
|
]
|
||
|
},
|
||
|
{
|
||
|
"cell_type": "markdown",
|
||
|
"id": "reflected-dealer",
|
||
|
"metadata": {},
|
||
|
"source": [
|
||
|
"Create a *Turtle* file in the **/data** directory with the SKOS resources for **Metadata** "
|
||
|
]
|
||
|
},
|
||
|
{
|
||
|
"cell_type": "code",
|
||
|
"execution_count": 17,
|
||
|
"id": "located-metadata",
|
||
|
"metadata": {},
|
||
|
"outputs": [],
|
||
|
"source": [
|
||
|
"ccr.serialize(destination='data/skosccr.rdf', format=\"n3\");#format=\"pretty-xml\")"
|
||
|
]
|
||
|
},
|
||
|
{
|
||
|
"cell_type": "code",
|
||
|
"execution_count": null,
|
||
|
"id": "geographic-format",
|
||
|
"metadata": {},
|
||
|
"outputs": [],
|
||
|
"source": []
|
||
|
}
|
||
|
],
|
||
|
"metadata": {
|
||
|
"kernelspec": {
|
||
|
"display_name": "Python 3",
|
||
|
"language": "python",
|
||
|
"name": "python3"
|
||
|
},
|
||
|
"language_info": {
|
||
|
"codemirror_mode": {
|
||
|
"name": "ipython",
|
||
|
"version": 3
|
||
|
},
|
||
|
"file_extension": ".py",
|
||
|
"mimetype": "text/x-python",
|
||
|
"name": "python",
|
||
|
"nbconvert_exporter": "python",
|
||
|
"pygments_lexer": "ipython3",
|
||
|
"version": "3.7.9"
|
||
|
}
|
||
|
},
|
||
|
"nbformat": 4,
|
||
|
"nbformat_minor": 5
|
||
|
}
|