WYRED Project: Data Sharing Satellite Event
Big data has become a ‘buzz phrase’ used by many different academic communities. Every discipline interprets the term slightly differently, but for those in sociophonetics and forensic speech science, big data generally refers to the availability of large quantities of audio or phonetic data. Both fields benefit from such data. Forensic speech scientists use it to inform investigations and improve their methodologies. Sociophoneticians use it to improve understanding of the relationships between spoken language and society. Beyond these two disciplines, speech data could lend itself to many further applications.
Unfortunately, the time and expense involved in collecting speech data mean that available databases are limited, and individuals are often reluctant to share their data (perhaps more so in the forensics community). For these reasons, the ESRC-supported WYRED project is hosting a free one-day Data Sharing Workshop (lunch and tea/coffee breaks included). We have put together a programme that includes four invited speakers and two hour-long discussion sessions.
The first half of the day includes a talk on the creation of large databases and the impact they can have on research, casework, and the real world, and a talk on the use and impact of population data in forensic casework, followed by a facilitated discussion on obstacles to sharing data. After lunch, the second half of the workshop looks towards potential solutions that will aid the sharing of data. Talks will discuss the potential for a Wikidialects and for software that would aid data sharing and research at the same time. The final facilitated group discussion will focus on future steps towards a data sharing culture.
We hope to create a platform for academics and practitioners in the forensic speech science and sociophonetic fields to begin discussing the obstacles to, and possibilities for, data sharing. We also hope that this workshop may lead to a culture where data sharing becomes the norm rather than the exception.