LDaCA Newsletter Quarter 2 2024




Welcome

Welcome to the second issue for 2024 of this newsletter about the activities of the Language Data Commons of Australia (LDaCA) and the Australian Text Analytics Platform (ATAP). This quarter, we highlight the LDaCA draft project plan and co-design workshop report published by the Australian Research Data Commons (ARDC), and report back from several exciting events, including the ARDC’s Computational Skills Summer School 2024 and a tech team visit to Batchelor Institute of Indigenous Tertiary Education. If you have any questions or feedback, please email us at ldaca@uq.edu.au or message us on our LinkedIn page.

News

New content on website

We have added two new pieces of content to our website:

  • a case study about data management in language technology, based on projects at tech company Appen.

  • a blog post about why the Australian National Corpus is not a single collection within LDaCA.

LDaCA draft project plan

On 14 March, the ARDC released draft project plans for the next phase of the HASS and Indigenous Research Data Commons (HASS&I RDC), including a plan for the LDaCA project, and sought public feedback on the activities planned for the next four years. Feedback closed on 22 March. Once it has been reviewed and incorporated, final project plans will be released in the coming months. Meanwhile, the draft plans are still available to view online.

Co-designing a digital future for Indigenous language materials

Robert McLellan and his team are seeking Indigenous language champions who are working with language materials for a research project called “Co-designing a digital future for Indigenous language materials”. We plan to run a series of Zoom and face-to-face interviews to find out what works well and what doesn’t work when finding, accessing and using Indigenous language materials.


We know that there are challenges involved with accessing and using language materials stored in various locations. We want to contribute to making that data more findable and usable so that it can support and enhance the language work currently being undertaken. So, we are seeking input to help build language platforms, spaces and tools that suit Indigenous language workers and communities.


If you are interested or would like to know more, please take a minute to get in touch through our Contact Form. All interviewees will receive a gift voucher worth $70 as a thank you for their time.

Graduate Digital Research Fellowship program

The Graduate Digital Research Fellowship (GDRF) program ran very successfully in 2023. We are hoping to build on this success, with three fellows from The University of Queensland (UQ) participating in the 2024 program:

  • Lu Jin is a PhD candidate in architecture and urban design. Her multidisciplinary work will design urban green infrastructural networks in a circular food system for city resilience and sustainability. In the GDRF program, Lu will be applying machine learning methods to identify and assess green cover in street view images.

  • David Gilchrist is a PhD candidate in journalism. His research looks at how journalists connect with audiences, and in the GDRF program, he hopes to explore methods for finding and gathering relevant data from digital platforms.

  • Quy Pham is a PhD candidate in applied linguistics. He is researching the errors produced by learners of English, working with an existing corpus of recordings. In the GDRF program, he will be experimenting with speech recognition methods to automate coding of the data, especially the identification of pauses.

New team member

We have two new team members at the Australian National University (ANU). Here is an introduction from the first team member; the second introduction will follow in the next issue:


Greetings, I am Gan Qiao, and I am thrilled to be appointed as a Research Data Officer with the LDaCA team, located in Canberra on Ngunnawal Country. As a variationist linguist, my passion lies in language variation and change, learner corpus research, and language technology. Having recently achieved my PhD in Linguistics from ANU, I am eager to bring my expertise to LDaCA, where I aim to streamline language data onboarding processes, and create resources, such as scripts and notebooks, to enhance data management for linguists and beyond.

Events

Upcoming Events

Online seminar on social meaning of language variation in Australian English

When: 2 May 2024, 1:00 pm AWST/3:00 pm AEST

Where: Online seminar

Run by: The University of Western Australia (UWA) Linguistics and Language Lab


LDaCA Chief Investigator Catherine Travis (ANU) will present a seminar titled “What's in an accent? Understanding the social meaning of language variation in Australian English”. Use password 250801 to access the seminar online. Note that sessions in this seminar series are usually not recorded.

ARDC Digital Research Skills Summit 2024

When: 21–23 May 2024, Day 1: 1–5 pm AEST, Days 2 and 3: 9:30 am – 4 pm AEST

Where: Woodward Conference Centre, Law Building, The University of Melbourne, Carlton, VIC | Online

Run by: The ARDC


Find out how digital infrastructure providers and research communities are upskilling researchers in emerging research technologies through a three-day summit in Naarm/Melbourne or online. Share, learn and network with thought leaders, digital skills trainers and researchers by registering online for all three days or the days that interest you most:

  • Day 1 — ARDC Skills Leadership Forum (The Skilled Research Infrastructure workforce: Pathways and support to enable effective research): Explore digital research skills challenges and opportunities with thought leaders.

  • Day 2 — Researcher Challenges: Hear from researchers and learn how they navigate their skills needs and gaps.

  • Day 3 — Carpentry Connect: Participate in regional conversations with digital research skills trainer communities.

Australian Historical Association Conference – Digital History Stream

When: 1–4 July 2024

Where: Flinders University, South Australia | Online

Run by: The Australian Historical Association (AHA)


Through the HASS&I RDC, the ARDC is sponsoring the digital history stream of the 2024 AHA Conference. The stream will explore the possibilities and pitfalls of using digital tools and methods to explore historical data. Registration information for the conference can be found online.

Recent Events

HASS and Indigenous Research Data Commons Computational Skills Summer School 2024

The ARDC’s HASS&I RDC held a Computational Skills Summer School on the lands of the Kulin Nation in Naarm/Melbourne on 7–9 February. More than 100 participants had the opportunity to learn about research infrastructure through talks, case studies and workshops. LDaCA team members Ben Foley and Simon Musgrave delivered content in a stream shared with the Indigenous Data Network (IDN), with the invaluable assistance of Levi Murray (IDN) and Karen Manton (Batchelor Institute).


On the first day, LDaCA presented two sessions on making data FAIR into the future, discussing long-term storage of data (spoiler alert — there are not many suitable solutions in Australia) and data governance decisions, with a special emphasis on properly documenting access conditions. Throughout the discussion, Levi (IDN) ensured that we took CARE of how these issues apply when handling Indigenous data.

Corpus Spotlight

The PAC Corpus (Phonologie de l’Anglais Contemporain ‘Phonology of Contemporary English’ Corpus) is based on reading and conversational tasks completed in native and non-native varieties of English spoken worldwide. The research program that produced the corpus is led by a network of French universities that partnered with international institutions, including Griffith University in QLD.


The PAC-Australia sub-corpus is part of the PAC Corpus and is based on data collected from 2003 to 2023 in Australia from speakers reading wordlists, reading a text, and participating in semi-guided interviews. The 240 spoken recordings in the corpus come with orthographic and phonetic transcriptions, as well as the place of recording. Most of the recordings can be accessed freely online and downloaded as WAV or MP3 files; only access to the semi-guided interviews is restricted.

Team Member’s Tip

No Office Hours

The Joint Office Hour run by LDaCA and the Australian Digital Observatory (ADO) will not take place in 2024. The teams from the two projects are working towards an alternative way to provide targeted advice to researchers — watch this space!

We welcome any feedback to make future issues more useful for you. If the newsletter was forwarded to you, you can subscribe here.

LDaCA acknowledges Traditional Owners of Country throughout Australia and recognises the continuing connection to lands, waters and communities. We pay our respects to their Ancestors and their descendants, who continue cultural and spiritual connections to Country.



Republishing is encouraged — CC BY text and infographics.

If you have questions about republishing, please contact ldaca@uq.edu.au

©LDaCA — 2024
