Detailed Bibliographic Record

Language Dataset Documentation Design: Learning from Deaf and Indigenous Communities.

Abstract

This dissertation investigates how engaging with stakeholder groups, namely natural language processing (NLP) practitioners and language communities, can contribute to the development of documentation toolkits that are more responsive to the needs of these groups. The development process follows value sensitive design, conducting a series of investigations to learn what these groups need and how iterative improvements to technology can help address those needs. Building on the data statements for NLP Version 1 schema proposed in Bender and Friedman (2018), Dr. Emily M. Bender, Dr. Batya Friedman, and I conduct an empirical investigation and a technical investigation to develop the data statements Version 2 schema by engaging with natural language processing professionals. To learn about the needs of indigenous and deaf communities with respect to collaborating with researchers, in a retrospective technical investigation I analyze ethical guidelines and licenses for the values frequently expressed in these communities’ stated expectations for research collaborations. I then conduct a technical investigation to meld the data statements Version 2 schema, aspects of datasheets for datasets (Gebru et al., 2021), and the results of the retrospective technical investigation into a single toolkit. Rather than documenting existing datasets, the Collaborative Discussions for the Documentation and Design of Linguistic Archival Resources (C3DAR) toolkit is designed to facilitate collaborative partnerships between communities and researchers working to develop language datasets. I conclude with possible future investigations, focusing on community researchers as key stakeholders, and considerations for uptake.
