Swahili questions and answers dataset for aflatoxin knowledge domain



Abstract

Aflatoxin contamination is a challenge to food security, health, and trade in Tanzania and other parts of the world. It affects maize, groundnuts, and other crops, as well as animal products. Once contaminated, these crops and animal products become toxic, causing illness or death in the humans and animals that consume them. Lack of awareness and knowledge of the contamination is seen as one of the reasons for its continued occurrence. Various awareness-creation and knowledge-sharing techniques have been used, but the situation remains unsatisfactory. For this reason, the use of a Natural Language Processing (NLP) chatbot for sharing aflatoxin knowledge is proposed, since NLP chatbots have been successful in knowledge sharing in various contexts. This data article presents a Swahili text-based dataset of questions and answers on aflatoxin knowledge. Data were collected through 7 focus group discussion (FGD) sessions conducted in the Arusha, Dodoma, Mtwara, Tabora, Morogoro, and Iringa regions of Tanzania. Respondents were farmers, traders, and consumers of maize and groundnuts. The collected data were processed and analyzed using the R qualitative data analysis tool, which allowed the identification of 6 themes, each with its own set of questions. The questions were shared with experts through 9 interview sessions, in which the experts provided answers. The questions and answers were then translated into Swahili using Google Translate followed by manual verification. Finally, a Swahili aflatoxin knowledge dataset containing 221 paired questions and answers, organized into 6 knowledge areas, was developed. With this dataset, an NLP-based chatbot that uses the Swahili language can be developed. Such a chatbot will benefit farmers, traders, consumers, researchers, and policymakers, who can use it to learn more about aflatoxin and make informed decisions. Moreover, the dataset can be adopted and modified to create NLP chatbots that share aflatoxin knowledge in languages other than Swahili. The dataset also contributes to the availability of Swahili language datasets.
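As a rough illustration of how paired questions and answers grouped by knowledge area might feed a simple retrieval-style chatbot, the sketch below matches a user query to the closest stored question by word overlap. It is a minimal sketch under stated assumptions, not the authors' method: the `QAPair` fields, the theme labels, the two Swahili questions, and the answer placeholders are all hypothetical and are not taken from the dataset, and word-overlap (Jaccard) matching stands in for whatever NLP model a real chatbot would use.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class QAPair:
    theme: str      # knowledge area label (hypothetical names, not the dataset's)
    question: str   # Swahili question text
    answer: str     # expert answer, translated into Swahili


def tokenize(text: str) -> set:
    """Lowercase the text, drop question marks, and split on whitespace."""
    return set(text.lower().replace("?", " ").split())


def similarity(query: str, question: str) -> float:
    """Jaccard word overlap between the query and a stored question."""
    q, p = tokenize(query), tokenize(question)
    union = q | p
    return len(q & p) / len(union) if union else 0.0


def best_match(query: str, pairs: Sequence[QAPair]) -> Optional[QAPair]:
    """Return the pair whose question overlaps most with the query, or None."""
    if not pairs:
        return None
    best = max(pairs, key=lambda pair: similarity(query, pair.question))
    return best if similarity(query, best.question) > 0 else None


# Illustrative placeholder rows only; NOT entries from the published dataset.
PAIRS = [
    QAPair("theme_a", "Sumukuvu ni nini?", "<jibu la mfano 1>"),
    QAPair("theme_b", "Sumukuvu huathiri mazao gani?", "<jibu la mfano 2>"),
]

if __name__ == "__main__":
    match = best_match("sumukuvu ni nini", PAIRS)
    print(match.answer if match else "Hakuna jibu lililopatikana.")
```

Keeping the theme on every pair would let a chatbot restrict the search to one knowledge area, or report which area an answer came from.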
