See the LDaCA Glossary for definitions of key terms and concepts.
This document outlines the LDaCA Access Policy, which aims to make data appropriately accessible in accordance with the legal, moral and ethical considerations of data sharing, and is tailored to the needs and requirements of different data collections.
Access conditions for data in LDaCA are determined by the Data Steward. To assist with this, LDaCA provides: information to help Data Stewards make informed decisions about appropriate access conditions for the data; tools and resources to facilitate this process as collections are onboarded and made available (including for applying standards as required); and mechanisms for managing access conditions once data is onboarded.
The Access Policy comprises three key components, as outlined in this document:
- Access types and licensing
- Onboarding data to LDaCA
- Ongoing data management
1. Access Types and Licensing
The foundation of the LDaCA Access Policy is licensing, i.e. setting out the conditions for accessing and using data in a license that is attached to all data in our systems. Access conditions are defined by the Data Steward for each collection and may restrict access as little or as much as desired, ranging from open access, to an automated click-through license, to case-by-case approval before access to the data is granted.
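The range of access conditions described above can be pictured as a small decision function. This is a minimal sketch only, not LDaCA's implementation: the names `AccessType`, `AccessRequest` and `can_access` are hypothetical, and the rule that case-by-case access depends solely on Data Steward approval is a simplifying assumption.

```python
from dataclasses import dataclass
from enum import Enum


class AccessType(Enum):
    """Hypothetical labels for the three kinds of access named in the policy."""
    OPEN = "open"
    CLICK_THROUGH = "click-through"
    CASE_BY_CASE = "case-by-case"


@dataclass
class AccessRequest:
    """Hypothetical record of what is known about a user's request."""
    accepted_license: bool = False     # user has agreed to the license terms
    approved_by_steward: bool = False  # Data Steward has approved this user


def can_access(access_type: AccessType, request: AccessRequest) -> bool:
    """Return True if the request satisfies the collection's access type."""
    if access_type is AccessType.OPEN:
        # Open data: the user can read the license and access the data directly.
        return True
    if access_type is AccessType.CLICK_THROUGH:
        # Automated license: access follows once the user accepts the terms.
        return request.accepted_license
    # Case-by-case (assumed rule): the Data Steward must approve the request.
    return request.approved_by_steward
```

For example, `can_access(AccessType.CLICK_THROUGH, AccessRequest(accepted_license=True))` evaluates to `True`, while the same request against a `CASE_BY_CASE` collection evaluates to `False` until the Data Steward approves it.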
Classification of Access Types
Type of Access
Data is openly available under a Creative Commons license or similar (including for data in the public domain).
The user can read the license and directly access the data.
Different levels of implementation:
Data is available with some restrictive conditions of use, as outlined in the license. Users must authenticate their identity to access these materials.
In some cases, access to the license is limited to specific users, who can be defined by invitation and/or application (note that these are not mutually exclusive – either or both options may be implemented). Such authorisation requires ongoing engagement from the Data Steward in order to manage access lists and approve/decline access requests.
A number of levels of authorisation can be implemented depending on the access conditions:
2. Onboarding Data to LDaCA
In accordance with the Access Policy, LDaCA has an established process for onboarding data. Throughout this process, the LDaCA team will work with the Data Steward to develop an effective strategy for managing access, in accordance with the relevant legal, moral and ethical considerations applying to that data.
LDaCA Responsibilities
- Supporting the Data Steward to standardise aspects of the language collection as required for successful onboarding, including with regard to metadata, data governance and data preparation.
- Adapting the onboarding process as relevant to the specific needs and requirements of the collection and Data Steward, and working to facilitate a successful and efficient onboarding process.
- Providing clear information to the Data Steward to ensure comprehension of the purpose of each step in the onboarding process, and responding to questions and concerns.
Data Steward Responsibilities
- Providing a persistent identifier for the data. If the collection does not have an existing persistent identifier, LDaCA recommends obtaining a DOI (Digital Object Identifier), which is becoming the default identifier for research datasets. A DOI makes a collection citable; it ensures that the collection remains findable even if it is moved to a different location; and it establishes the collection's relationship to other objects and entities in the academic research environment (e.g. researchers, funders, organisations, academic publications, software and other datasets). See Obtaining a DOI for more information.
- Providing metadata that is organised with a consistent structure, and includes descriptors, definitions and contextual information where relevant. This includes metadata at the level of the collection (e.g. collection name, a narrative description of the corpus, the subject language(s), author(s) or Collector(s), publication year, access conditions, etc.), file (e.g. length in minutes, words) and participants (e.g. age, gender).
LDaCA technologies enable programmatic detection, extraction and summarisation of existing metadata in a dataset; in order for this to work effectively, the metadata must be standardised.
Metadata are mapped to open schemas including the Open Language Archives Community (OLAC) vocabularies for describing language data and Schema.org for generic descriptions (e.g. Name, Identifier, Description). This approach allows many metadata terms to be linked to openly available and widely used definitions.
Some metadata may be specific to a corpus or difficult to map to existing vocabulary terms. LDaCA makes use of the Language Data Ontology, which has been developed in consultation with OLAC and metadata specialists, to ensure consistency across terms. See Metadata for more information.
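To illustrate why standardised metadata matters for programmatic processing, the sketch below checks a collection-level record for the three Schema.org terms named above (Name, Identifier, Description). It is an illustration under stated assumptions, not an LDaCA tool: the function `missing_fields`, the list of required fields and the sample record (including its placeholder DOI) are all hypothetical.

```python
# Hypothetical set of collection-level fields; the three names come from the
# Schema.org examples in the text (Name, Identifier, Description).
REQUIRED_COLLECTION_FIELDS = ("name", "identifier", "description")


def missing_fields(record: dict) -> list:
    """List required fields that are absent or empty in a metadata record."""
    return [
        field
        for field in REQUIRED_COLLECTION_FIELDS
        if not str(record.get(field) or "").strip()
    ]


# Illustrative record only; the values are placeholders, not a real collection.
collection = {
    "@context": "https://schema.org",
    "@type": "Dataset",
    "name": "Example Corpus",
    "identifier": "https://doi.org/10.0000/example",  # placeholder DOI
    "description": "",
}

print(missing_fields(collection))  # ['description']
```

A check like this only works when every collection uses the same field names, which is the practical reason for mapping metadata to shared vocabularies.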
3. Ongoing Data Management Strategy
A strategy for ongoing access and data management of the collection in LDaCA must be developed in collaboration with the Data Steward. Responsibilities and processes must be clearly outlined, specifying that Data Stewards will respond appropriately to access requests, system updates and user feedback. This is particularly important for collections that require access approval on a case-by-case basis and for collections that may be updated, for example through additional data, edits to existing data, transcription or annotation.
LDaCA and the Data Steward are jointly responsible for the ongoing maintenance of the collection in LDaCA, and for updating the Work Plan as needed.
4. Supporting Information
- Users must adhere to conditions set by the Data Steward.
- Users must exercise ethical standards, paying particular attention to potential issues around sensitive data and participant privacy.
- Users must uphold the moral rights of data owners.
- Publicly available metadata in the LDaCA catalogue are licensed under the Creative Commons Attribution 4.0 International (CC BY 4.0) license.
- Data is provided ‘as is’ and LDaCA provides no guarantee as to the accuracy or completeness of the material.
- LDaCA may change the site and suspend or cancel user access without notice.
- LDaCA is not responsible for any breaches of legal, moral or ethical standards by users.
Information collected by LDaCA may include home organisation, social login provider and personal data such as name, email address, affiliation, and information about how a user accesses and uses LDaCA. This information is necessary to facilitate the functioning of the site and is required to provide access to restricted language collections.
The policy document is currently under development.
The LDaCA Takedown Policy outlines the steps to be taken by users in making a request for data to be removed, or access to be adjusted in some way. LDaCA recognises that licensing and decisions surrounding access to data are made by the Data Steward. Therefore, the Data Steward is also responsible for assessing and determining the outcome of takedown requests.
The Takedown Request mechanism supports FAIR and CARE Principles by facilitating a process by which data access may be questioned and discussed by relevant stakeholders.
The Takedown Policy document is currently under development.