Developing a Schema.org representation of the Ocean Best Practices repository metadata

adamml/ocean-best-practices-on-schema

Schema.org implementation pattern for the Ocean Best Practices Repository

  • Author: Adam Leadbetter, Marine Institute, Ireland
  • Version: 1.0
  • Date: 16th June 2020

Contents

  1. Base type
  2. Metadata field mapping
  3. @id
  4. Title
  5. Authors and Maintainers
  6. Publisher
  7. Keywords
  8. Subjects
  9. Identifiers
  10. Example
  11. Areas to develop
  12. Version history

Base type

The base type chosen from the Schema.org vocabulary to represent a Best Practices document within the Ocean Best Practices Repository is CreativeWork. According to the Schema.org documentation, this base type represents "the most generic kind of creative work, including books, movies, photographs, software programs, etc."

Figure: Base type diagram for representing entries in the Ocean Best Practices repository as Schema.org CreativeWork instances

Metadata field mapping

Digging into the DSpace implementation of the repository reveals a rich metadata model based on the Dublin Core metadata terms. The following table maps each Dublin Core field to an appropriate Schema.org attribute within the domain of the CreativeWork type. Rows with no Schema.org attribute have not yet been mapped.

| Dublin Core | Schema.org attribute | Range of Schema.org attribute |
| --- | --- | --- |
| dc.contributor.author | author | Person |
| dc.coverage.spatial | spatial | Place |
| dc.date.accessioned | | |
| dc.date.available | datePublished | Date |
| dc.date.issued | | |
| dc.identifier.citation | identifier | PropertyValue |
| dc.identifier.doi | identifier | PropertyValue |
| dc.identifier.uri | identifier | PropertyValue |
| dc.description.abstract | abstract | Text |
| dc.description.sponsorship | funder | Organization |
| dc.language.iso | inLanguage | Language |
| dc.publisher | publisher | Organization |
| dc.relation.uri | sameAs | URL |
| dc.rights | license | CreativeWork |
| dc.rights.uri | license | CreativeWork |
| dc.title | name | Text |
| dc.type | genre | Text |
| dc.description.status | | |
| dc.format.pages | numberOfPages (see note) | Integer |
| dc.description.refereed | | |
| dc.publisher.place | publisher | Organization |
| dc.subject.parameterDiscipline | about, keywords | ?, Text |
| dc.subject.instrumentType | about, keywords | ?, Text |
| dc.subject.dmProcesses | about, keywords | ?, Text |
| dc.subject.other | about, keywords | |
| dc.description.currentStatus | | |
| dc.description.contactemail | maintainer | Person |
| dc.description.contactname | maintainer | Person |
| dc.description.sdg | about, keywords | ?, Text |
| dc.description.eov | about, keywords | ?, Text |
| dc.description.bptype | about, keywords | ?, Text |
| dc.description.maturitylevel | about, keywords | ?, Text |

Note on numberOfPages: strictly, this attribute is only valid on the type Book. As it stands, its use on CreativeWork is flagged as a warning in Google's Structured Data Testing Tool. An additional domain could be requested from Schema.org to make this usage fully valid.
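
As a minimal sketch of how several rows of the table might appear together in one record, the fragment below applies the mappings for dc.title, dc.date.available, dc.language.iso, dc.type, dc.rights.uri and dc.description.abstract. All values are placeholders chosen for illustration and are not taken from a repository record; the licence URL in particular is hypothetical.

```json
{
	"@context": {"@vocab": "https://schema.org/"},
	"@type": "CreativeWork",
	"name": "Example best practice document",
	"datePublished": "2020-01-01",
	"inLanguage": "en",
	"genre": "Best Practice",
	"license": "https://example.org/licence",
	"abstract": "Placeholder abstract text."
}
```

Here license is given as a plain URL, which Schema.org accepts alongside the CreativeWork range listed in the table.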

The DSpace record also contains links to a number of files, which are mapped through the hasPart attribute.
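
A minimal sketch of the hasPart mapping follows. Typing the linked file as a MediaObject is an assumption made for illustration, since the text above fixes only the hasPart attribute itself. The file name and URL are placeholders.

```json
{
	"@context": {"@vocab": "https://schema.org/"},
	"@type": "CreativeWork",
	"hasPart": [
		{
			"@type": "MediaObject",
			"name": "example-document.pdf",
			"contentUrl": "https://example.org/files/example-document.pdf",
			"encodingFormat": "application/pdf"
		}
	]
}
```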

@id

The @id attribute of a JSON-LD encoding, such as that used for Schema.org, gives the canonical Uniform Resource Identifier (URI) of the resource being described. As the Ocean Best Practices Repository uses the Handle System, these persistent identifiers should be used in the @id attribute:

{
	"@context": {"@vocab": "https://schema.org/"},
	"@type": "CreativeWork",
	"@id": "http://hdl.handle.net/11329/424"
}

Title

The name attribute in Schema.org holds the title of the document, mapped from dc.title:

{
	"@context": {"@vocab": "https://schema.org/"},
	"@type": "CreativeWork",
	"name": "Schema.org implementation pattern for the Ocean Best Practices Repository"
}

Authors and Maintainers

The Schema.org/author attribute assigns a Person or Organization responsible for the creation of a CreativeWork to that entity.

Figure: The use of Schema.org/Person for instances of author and maintainer

For example:

{
	"@context": {"@vocab": "https://schema.org/"},
	"@type": "CreativeWork",
	"author": [{"@type": "Person", "familyName": "Leadbetter", "givenName": "Adam"}]
}

The use of ORCIDs would help to disambiguate these instances of Person, and would reduce the size of the knowledge graph if the Schema.org serialization were ingested into a triple store.
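
One possible way to attach an ORCID, sketched here as an assumption rather than an agreed pattern, is to use the ORCID iD URI as the @id of the Person node. Nodes that share an @id merge into a single resource when loaded into a triple store, which is where the reduction in graph size would come from. The iD below is an all-zero placeholder, not a real person's ORCID.

```json
{
	"@context": {"@vocab": "https://schema.org/"},
	"@type": "CreativeWork",
	"author": [
		{
			"@type": "Person",
			"@id": "https://orcid.org/0000-0000-0000-0000",
			"familyName": "Leadbetter",
			"givenName": "Adam"
		}
	]
}
```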

The Schema.org/maintainer attribute assigns a Person or Organization that manages contributions to, and/or publication of, the resource. This describes the DCAT contact well, for example:

{
	"@context": {"@vocab": "https://schema.org/"},
	"@type": "CreativeWork",
	"maintainer": [{"@type": "Person", "familyName": "Leadbetter", "givenName": "Adam", "email": "[email protected]"}]
}

Publisher
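
Following the mapping table, dc.publisher can be expressed as an Organization on the publisher attribute. The sketch below is a minimal illustration. The organization and place names are placeholders, and carrying dc.publisher.place in the Organization's location attribute is an assumption, since the table maps both fields to publisher without further detail.

```json
{
	"@context": {"@vocab": "https://schema.org/"},
	"@type": "CreativeWork",
	"publisher": {
		"@type": "Organization",
		"name": "Example Publishing Organization",
		"location": {"@type": "Place", "name": "Example City"}
	}
}
```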

Keywords

The keywords attribute provides only an unstructured list of plain-text keywords. Its use here is therefore supplemented by the about attribute, described in the section on Subjects, which allows a richer description of the keywords to be developed. For example:

{
	"@context": {"@vocab": "https://schema.org/"},
	"@type": "CreativeWork",
	"keywords": ["Parameter Discipline::Biological oceanography::Macroalgae and seagrass", "Parameter Discipline::Biological oceanography::Phytoplankton", "Phytoplankton biomass and diversity", "Sea surface temperature", "Ocean colour", "Ocean surface stress", "Sea surface height", "Subsurface temperature", "Surface currents", "Sea surface salinity", "Subsurface salinity", "Ocean surface heat flux", "Biotoxins / Phycotoxins", "Best Practice", "Manual"]
}

Subjects

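The about attribute allows more detailed terminology to be assigned to a keyword than the keywords attribute does. As an illustrative assumption, the sketch below expresses one of the keywords from the previous section as a DefinedTerm; the range to use for about has not yet been fixed in the mapping table.

{
	"@context": {"@vocab": "https://schema.org/"},
	"@type": "CreativeWork",
	"about": [{"@type": "DefinedTerm", "name": "Sea surface temperature"}]
}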

Identifiers
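
Following the mapping table, dc.identifier.uri can be expressed as a PropertyValue on the identifier attribute. The sketch below uses the handle from the @id section. The propertyID label "handle" is an assumption made for illustration; a DOI from dc.identifier.doi could follow the same shape.

```json
{
	"@context": {"@vocab": "https://schema.org/"},
	"@type": "CreativeWork",
	"identifier": [
		{
			"@type": "PropertyValue",
			"propertyID": "handle",
			"value": "11329/424",
			"url": "http://hdl.handle.net/11329/424"
		}
	]
}
```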

Example

Taking an example Best Practice from the repository, "Creating a weekly Harmful Algal Bloom bulletin. Version 1.0. [Best Practice Description Document]", a Schema.org representation that would be embedded within the landing page for the document is as follows:

{
	"@context": "https://schema.org/",
	"@type": "CreativeWork",
	"@id": "http://hdl.handle.net/11329/424",
	"name": "Creating a weekly Harmful Algal Bloom bulletin. Version 1.0. [Best Practice Description Document]",
	"numberOfPages": 59,
	"author": [
		{"@type": "Person", "familyName": "Leadbetter", "givenName": "Adam"},
		{"@type": "Person", "familyName": "Silke", "givenName": "Joe"},
		{"@type": "Person", "familyName": "Cusack", "givenName": "Caroline"}
	],
	"maintainer": {"@type": "Person", "familyName": "Leadbetter", "givenName": "Adam", "email": "[email protected]"},
	"keywords": ["Parameter Discipline::Biological oceanography::Macroalgae and seagrass", "Parameter Discipline::Biological oceanography::Phytoplankton", "Phytoplankton biomass and diversity", "Sea surface temperature", "Ocean colour", "Ocean surface stress", "Sea surface height", "Subsurface temperature", "Surface currents", "Sea surface salinity", "Subsurface salinity", "Ocean surface heat flux", "Biotoxins / Phycotoxins", "Best Practice", "Manual"]
}

Areas to develop

  1. ORCID inclusion on Person instances
  2. Other IRIs which can be included
  3. Request a change to the domain of numberOfPages

Version history

  • 1.0 - 16th June 2020 - Initial draft
