This study evaluates the impacts of slot tagging and training data length on joint natural language understanding (NLU) models for medication management scenarios using chatbots in Spanish. In this study, we define the intents (purposes of the sentences) for medication management scenarios and two types of slot tags. For training the model, we generated four datasets, combining long/short sentences with long/short slots, while for testing, we collect the data from real interactions of users with a chatbot. For the comparative analysis, we chose six joint NLU models (SlotRefine, stack-propagation framework, SF-ID network, capsule-NLU, slot-gated modeling, and a joint SLU-LM model) from the literature. The results show that the best performance (with a sentence-level semantic accuracy of 68.6%, an F1-score of 76.4% for slot filling, and an accuracy of 79.3% for intent detection) is achieved using short sentences and short slots. Our results suggest that joint NLU models trained with short slots yield better results than those trained with long slots for the slot filling task. The results also indicate that short slots could be a better choice for the dialog system because of their simplicity. Importantly, the work demonstrates that the performance of the joint NLU models can be improved by selecting the correct slot configuration according to the usage scenario.
Keywords: intent detection; joint natural language understanding; medication management scenario; slot filling; training data.