General Information

Crate-O is a browser-based editor that allows you to create and update Research Object Crates (RO-Crates), either using the web interface or with metadata from a spreadsheet. It provides researchers with a relatively simple way to describe their data using best practice in formal metadata description.

Designed as a Vue component, Crate-O can be easily integrated into any Vue.js project. Simply install the component with npm and include it in your application. As a Vue software component, it can be used, cloned and distributed by anyone.

A hosted implementation of Crate-O is available as a GitHub page (Crate-O). This implementation works only with Google Chrome and Microsoft Edge. It does not connect to other services, so any data you load into Crate-O is used only to populate your RO-Crate and is not sent anywhere else. We will be releasing versions that work directly with online resources and are compatible with other browsers (see the Roadmap).

RO-Crate is a way of packaging research data that stores the data together with its associated metadata and other component files, such as the data license. It is a flexible, developer-friendly approach to linked-data description and packaging.
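As a rough illustration of what "data together with its associated metadata" looks like in practice, the sketch below builds a minimal RO-Crate 1.1 metadata document in Python. The function name `minimal_crate`, the collection name, the date and the licence entity are placeholders chosen for this example; a crate produced by Crate-O will normally contain many more entities.

```python
import json


def minimal_crate(name: str, description: str, license_id: str) -> dict:
    """Build a minimal RO-Crate 1.1 metadata document as a plain dict.

    The values passed in are placeholders for illustration only.
    """
    return {
        "@context": "https://w3id.org/ro/crate/1.1/context",
        "@graph": [
            {
                # The metadata descriptor: points at the root dataset.
                "@id": "ro-crate-metadata.json",
                "@type": "CreativeWork",
                "conformsTo": {"@id": "https://w3id.org/ro/crate/1.1"},
                "about": {"@id": "./"},
            },
            {
                # The root dataset: describes the crate as a whole.
                "@id": "./",
                "@type": "Dataset",
                "name": name,
                "description": description,
                "datePublished": "2024-01-01",  # placeholder date
                "license": {"@id": license_id},
            },
            {
                # The licence is itself an entity in the flat graph.
                "@id": license_id,
                "@type": "CreativeWork",
                "name": "Example licence (placeholder)",
            },
        ],
    }


crate = minimal_crate(
    "Example collection",
    "A placeholder description",
    "https://creativecommons.org/licenses/by/4.0/",
)
print(json.dumps(crate, indent=2))
```

Saved as ro-crate-metadata.json in the top-level folder of a data collection, a document like this is what turns that folder into an RO-Crate.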

The current version of Crate-O is designed for editing self-contained RO-Crates, and works fine with crates containing tens of thousands of entities. Our roadmap includes adding the ability to edit fragments of larger linked-data resources and to integrate with repositories, such as the Oni repository and data API, and with archival repositories such as the Language Data Commons of Australia.

Crate-O Use Cases

Crate-O is designed to support any of the following use cases:

- Describe data collections and files on a user’s computer, and add contextual information about those files
- Describe abstract contextual entities, such as in a Cultural Collection or an encyclopaedia
- Annotate existing resources elsewhere on the web
- Submit a data collection to the LDaCA Portal
- Edit a Schema that contains a set of vocabulary terms, such as the terms used by LDaCA

RO-Crate Collection Hierarchy

The diagram below shows the hierarchical relationship between collections, objects and files in a corpus, together with the metadata categories which track these relationships.

Self-contained corpus crate with all resources
Image Source: LDaCA

The metadata is organised according to Schema.org entity types.

Entity	Definition
Class	`rdfs:Class` is used to classify resources. Classes in the Language Data Commons (LDAC) schema include CollectionEvent, CollectionProtocol, DataDepositLicense, DataLicense and DataReuseLicense (see https://w3id.org/ldac/terms).
Property	`rdf:Property` is an attribute of an instance of a Class. For example, on an entity that is an instance of Class Person, the property “name” would be the person’s name, expressed as a text string, while “affiliation” would be a property that references another entity, such as their university.
DefinedTerm	A ‘word, name, acronym, phrase, etc. with a formal definition’, ‘often used in the context of category or subject classification.’ DefinedTerms allow us to a) have accurate definitions of the values we want to give to properties, and b) group such definitions in DefinedTermSets, which can function as controlled vocabularies.

The table below shows an example of the relationship between each of these entities:

Level	Example
Class	Annotation
↓	↓
Property	annotationType
↓	↓
Defined Term Set	AnnotationTypeTerms
↓	↓
Defined Terms	Gestural, Phonemic, Phonetic, Phonological, Prosodic, Semantic, Syntactic, Transcription, Translation
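The chain in the table above can be sketched in code. The example below is a minimal illustration in Python, assuming a flat JSON-LD graph in which a DefinedTermSet lists its members with `hasDefinedTerm`. All `#...` identifiers are hypothetical and are not the real LDAC term IRIs, and the helper functions are written for this example rather than taken from any LDaCA tool.

```python
# Hypothetical flat graph: the "#..." identifiers are illustrative only.
graph = [
    {
        "@id": "#AnnotationTypeTerms",
        "@type": "DefinedTermSet",
        "name": "AnnotationTypeTerms",
        "hasDefinedTerm": [{"@id": "#Phonemic"}, {"@id": "#Translation"}],
    },
    {
        "@id": "#Phonemic",
        "@type": "DefinedTerm",
        "name": "Phonemic",
        "inDefinedTermSet": {"@id": "#AnnotationTypeTerms"},
    },
    {
        "@id": "#Translation",
        "@type": "DefinedTerm",
        "name": "Translation",
        "inDefinedTermSet": {"@id": "#AnnotationTypeTerms"},
    },
    {
        # An instance of the Annotation class using the annotationType property.
        "@id": "#annotation1",
        "@type": "Annotation",
        "annotationType": {"@id": "#Phonemic"},
    },
]


def allowed_terms(graph: list, term_set_id: str) -> set:
    """Return the @id of every DefinedTerm listed in the given DefinedTermSet."""
    by_id = {entity["@id"]: entity for entity in graph}
    term_set = by_id[term_set_id]
    return {ref["@id"] for ref in term_set.get("hasDefinedTerm", [])}


def uses_allowed_term(entity: dict, prop: str, allowed: set) -> bool:
    """Check that a property value is a reference to one of the allowed terms."""
    value = entity.get(prop)
    return isinstance(value, dict) and value.get("@id") in allowed


allowed = allowed_terms(graph, "#AnnotationTypeTerms")
annotation = graph[3]
print(uses_allowed_term(annotation, "annotationType", allowed))
```

Because the property value is a reference to a DefinedTerm rather than free text, a tool can check it against the controlled vocabulary instead of comparing strings.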

For more details on these and other metadata entities, see Metadata for Language Data.

Schemas, Profiles and Modes

This diagram shows the relationship between the three main components used by Crate-O and other tools employed by LDaCA for specifying and validating RO-Crates. This section explains what these components are and how they relate.

The three main components for RO-Crate editing with Crate-O
Image Source: LDaCA

A Schema specifies a metadata vocabulary of Classes and Properties, based on the RO-Crate specification’s use of Schema.org classes.
An RO-Crate Mode is a set of lightweight syntactic rules for combining Schema.org Style Schema (SOSS) Classes, Properties and DefinedTerms, expressed in a JSON file that can be:
- loaded into an editor such as Crate-O
- imported into another program and used for RO-Crate validation
- used to summarise the rules for an RO-Crate Profile.
An RO-Crate Profile consists of, at a minimum, a document that explains how metadata entities from the Schema are used for a particular purpose.
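To make the idea of "lightweight syntactic rules" more concrete, here is a small Python sketch of how a program might check an entity against per-class rules. The `rules` structure below is a simplification invented for this example and is not the actual Crate-O Mode file format; the `required` and `multiple` keys and the `check_entity` function are assumptions made for illustration.

```python
# Hypothetical, simplified rules: NOT the real Crate-O Mode file layout.
rules = {
    "Person": {
        "name": {"required": True, "multiple": False},
        "affiliation": {"required": False, "multiple": True},
    }
}


def check_entity(entity: dict, rules: dict) -> list:
    """Return a list of human-readable problems found for one entity."""
    problems = []
    types = entity.get("@type", [])
    if isinstance(types, str):
        types = [types]  # JSON-LD allows a single type or a list of types
    for type_name in types:
        for prop, rule in rules.get(type_name, {}).items():
            value = entity.get(prop)
            if value is None:
                if rule["required"]:
                    problems.append(
                        f"{type_name}: missing required property '{prop}'"
                    )
            elif isinstance(value, list) and len(value) > 1 and not rule["multiple"]:
                problems.append(
                    f"{type_name}: property '{prop}' allows only one value"
                )
    return problems


good = {"@id": "#alice", "@type": "Person", "name": "Alice Example"}
bad = {"@id": "#anon", "@type": "Person", "affiliation": {"@id": "#uni"}}

print(check_entity(good, rules))
print(check_entity(bad, rules))
```

The same rule set could drive an editor (which fields to show for a Person) and a validator (which crates to reject), which is the dual role described in the list above.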

These are all inter-related, and can be developed together or separately using tools.

See the links below to the LDAC schema, profile and modes:
LDAC Schema
LDAC Profile
LDAC Modes

Schema.org Style Schemas (SOSSs) and RO-Crate Profiles and Modes

Schema.org, which provides the basic vocabulary for RO-Crate, has a light-touch approach to describing what it refers to as its schema (with a lowercase s), which might also be thought of as an ontology. Schema.org is defined as a set of Classes and Properties, each of which has an online definition. The example below shows the base class Thing, its subclass Person, and birthDate, one of the properties of Person.

Class: Thing → Sub-Class: Person → Property: birthDate

Schema.org specifies which Classes can have particular Properties.

While Schema.org has terms for Class and Property, it does not use them to define its own classes and properties (possibly because this would be circular). Instead, it uses the equivalent classes from the rdf: and rdfs: vocabularies.

Here is the definition for Person:

{
  "@id": "schema:Person",
  "@type": "rdfs:Class",
  "owl:equivalentClass": {
    "@id": "foaf:Person"
  },
  "rdfs:comment": "A person (alive, dead, undead, or fictional).",
  "rdfs:label": "Person",
  "rdfs:subClassOf": {
    "@id": "schema:Thing"
  },
  "schema:source": {
    "@id": "http://www.w3.org/wiki/WebSchemas/SchemaDotOrgSources#source_rNews"
  }
}

The Class definition does not have any information about the occurrence of properties – that is found in a Property definition:

{
  "@id": "schema:sibling",
  "@type": "rdf:Property",
  "rdfs:comment": "A sibling of the person.",
  "rdfs:label": "sibling",
  "schema:domainIncludes": {
    "@id": "schema:Person"
  },
  "schema:rangeIncludes": {
    "@id": "schema:Person"
  }
}

A SOSS is a flattened JSON-LD graph, just like an RO-Crate. Some members of the RO-Crate community are beginning to define the basic RO-Crate schema and RO-Crate Profiles using the same SOSS approach.

To make an RO-Crate Mode File, we transform the flat graph of a schema into something optimised for driving an editor or a validator: a list of Classes, each with the Properties it may have.
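The core of that transformation can be sketched in a few lines of Python. This is an illustration of the general idea only, not the rocsoss script itself: the function names are made up for this example, the graph is a hand-trimmed subset of Schema.org, and the choice to fold inherited properties into each subclass is an assumption about what an editor would want.

```python
def as_list(value):
    """JSON-LD values may be absent, a single item or a list; normalise to a list."""
    if value is None:
        return []
    return value if isinstance(value, list) else [value]


def class_properties(graph: list) -> dict:
    """Invert schema:domainIncludes into a class -> sorted property list mapping."""
    classes = {}
    for entity in graph:
        if "rdfs:Class" in as_list(entity.get("@type")):
            classes[entity["@id"]] = {
                "parents": [p["@id"] for p in as_list(entity.get("rdfs:subClassOf"))],
                "properties": set(),
            }
    for entity in graph:
        if "rdf:Property" in as_list(entity.get("@type")):
            for domain in as_list(entity.get("schema:domainIncludes")):
                if domain["@id"] in classes:
                    classes[domain["@id"]]["properties"].add(entity["@id"])

    def collect(class_id, seen):
        # Walk up rdfs:subClassOf so subclasses also list inherited properties.
        if class_id in seen or class_id not in classes:
            return set()
        seen.add(class_id)
        props = set(classes[class_id]["properties"])
        for parent in classes[class_id]["parents"]:
            props |= collect(parent, seen)
        return props

    return {class_id: sorted(collect(class_id, set())) for class_id in classes}


# Trimmed-down SOSS graph based on the Person and sibling definitions above.
soss = [
    {"@id": "schema:Thing", "@type": "rdfs:Class"},
    {"@id": "schema:Person", "@type": "rdfs:Class",
     "rdfs:subClassOf": {"@id": "schema:Thing"}},
    {"@id": "schema:name", "@type": "rdf:Property",
     "schema:domainIncludes": {"@id": "schema:Thing"}},
    {"@id": "schema:sibling", "@type": "rdf:Property",
     "schema:domainIncludes": {"@id": "schema:Person"},
     "schema:rangeIncludes": {"@id": "schema:Person"}},
    {"@id": "schema:birthDate", "@type": "rdf:Property",
     "schema:domainIncludes": {"@id": "schema:Person"}},
]

by_class = class_properties(soss)
print(by_class["schema:Person"])
```

The schema stores the class on each Property (via `schema:domainIncludes`), whereas an editor needs the reverse lookup, from a Class to its Properties; the inversion is done once so that lookup is direct.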

Base Mode File creation, combining the Schema.org schema and RO-Crate additions using the rocsoss script
Image Source: LDaCA
