The Language Data Commons of Australia (LDaCA) will make nationally significant language data available for academic and non-academic use and provide a model for ensuring continued access with appropriate community control.

Australia is a massively multilingual country in one of the world’s most linguistically diverse regions. Significant collections of this intangible cultural heritage have been amassed, including collections of Australian Indigenous languages, of regional languages of the Pacific, and of Australian English, as well as collections important for cyber-security and emergency communication. LDaCA will integrate this existing work into a national research infrastructure while also securing collections that remain under-utilised or at risk. LDaCA will thus ensure long-lasting access to these invaluable data for analysis and reuse, and will manage the data in a culturally, ethically and legally appropriate manner, guided by the FAIR and CARE principles.

To accomplish these goals, LDaCA will:

  • Develop a comprehensive language data access policy framework,
  • Develop shared technical infrastructure and standards across institutions,
  • Build a sustainable long-term repository for ingesting and curating existing language data collections of national significance, and
  • Build a portal for discovery and access of language data.

The result will be an integrated national technical infrastructure for analysing language collections at scale, which will open up the social and economic possibilities of Australia’s rich linguistic heritage. The project will build connections to other HASS RDC projects and help lay the foundation for a broader HASS Research Data Commons, while positioning Australia internationally as a leading contributor of language collections and digital infrastructure.

The Language Data Commons of Australia (LDaCA) project received investment (1 and 2) from the Australian Research Data Commons (ARDC). The ARDC is funded by the National Collaborative Research Infrastructure Strategy (NCRIS).