What Topics Do We Talk About at JENUI? Topic Modeling with Latent Dirichlet Allocation (LDA)

Abstract

The JENUI (Jornadas sobre la Enseñanza Universitaria de la Informática) have been held for three decades, from 1994 to the present day. Over this period, the scope and subject matter of the papers have evolved with the progress of computer science and its teaching at university level. Topic modeling, an unsupervised learning technique for text, helps make sense of large collections of textual data by grouping documents into topics. This paper applies the technique to the complete JENUI proceedings, comprising 1745 documents. Text is extracted from titles and abstracts, the Latent Dirichlet Allocation (LDA) algorithm is applied, and the optimal number of topics is estimated. The work builds a topic classifier from the JENUI articles. It also analyses the distribution of topics, the term probabilities within each topic, and how the topics of the papers evolve over time. The analysis gives an objective, data-driven basis for concluding that the subject matter has evolved over the 27 editions with JENUI proceedings.
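
As a rough illustration of the kind of model the abstract describes, the following is a minimal sketch of LDA fitted by collapsed Gibbs sampling, written in plain Python. It is not the paper's implementation: the function name `fit_lda`, the hyperparameter values, the fixed number of topics, and the toy corpus are all assumptions made for this example, and the toy "titles" are invented rather than taken from the JENUI proceedings.

```python
import random


def fit_lda(docs, num_topics, alpha=0.1, beta=0.01, iterations=200, seed=0):
    """Toy collapsed Gibbs sampler for LDA over tokenized documents.

    Hypothetical helper for illustration; hyperparameters are assumptions.
    """
    rng = random.Random(seed)
    vocab = sorted({w for doc in docs for w in doc})
    word_id = {w: i for i, w in enumerate(vocab)}
    vocab_size = len(vocab)

    # Count tables: document-topic, topic-word, and per-topic totals.
    doc_topic = [[0] * num_topics for _ in docs]
    topic_word = [[0] * vocab_size for _ in range(num_topics)]
    topic_total = [0] * num_topics

    # Random initial topic assignment for every token.
    assignments = []
    for d, doc in enumerate(docs):
        z_doc = []
        for w in doc:
            z = rng.randrange(num_topics)
            z_doc.append(z)
            doc_topic[d][z] += 1
            topic_word[z][word_id[w]] += 1
            topic_total[z] += 1
        assignments.append(z_doc)

    for _ in range(iterations):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                wid = word_id[w]
                z = assignments[d][i]
                # Remove the token's current assignment from the counts.
                doc_topic[d][z] -= 1
                topic_word[z][wid] -= 1
                topic_total[z] -= 1
                # Unnormalized conditional probability of each topic.
                weights = [
                    (doc_topic[d][k] + alpha)
                    * (topic_word[k][wid] + beta)
                    / (topic_total[k] + vocab_size * beta)
                    for k in range(num_topics)
                ]
                z = rng.choices(range(num_topics), weights=weights)[0]
                assignments[d][i] = z
                doc_topic[d][z] += 1
                topic_word[z][wid] += 1
                topic_total[z] += 1

    # Smoothed estimates: topic mix per document, word distribution per topic.
    theta = [
        [(c + alpha) / (len(doc) + num_topics * alpha) for c in row]
        for row, doc in zip(doc_topic, docs)
    ]
    phi = [
        [(c + beta) / (topic_total[k] + vocab_size * beta) for c in topic_word[k]]
        for k in range(num_topics)
    ]
    return theta, phi, vocab


# Invented toy "titles" (not JENUI data), already lowercased and tokenized.
toy_docs = [
    "project based learning in programming courses".split(),
    "teaching programming with automatic assessment".split(),
    "gender gap in computer science degrees".split(),
    "student motivation and gender in engineering degrees".split(),
]

theta, phi, vocab = fit_lda(toy_docs, num_topics=2)
for k, dist in enumerate(phi):
    top_terms = [w for _, w in sorted(zip(dist, vocab), reverse=True)[:3]]
    print(f"topic {k}: {top_terms}")
```

Here `theta` holds the topic distribution of each document and `phi` the term probabilities of each topic, the two quantities the abstract says are analysed. The sketch fixes the number of topics at 2 for brevity; estimating a suitable number would typically mean fitting several candidate values and comparing them with a quality measure such as coherence or perplexity.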

Publication
XXX Jornadas sobre la Enseñanza Universitaria de la Informática
José Luis Garrido-Labrador
Assistant Lecturer in Computer Languages and Systems

PhD in Machine Learning, researching semi-supervised learning and restricted set classification. Assistant Lecturer in Computer Languages and Systems at Universidad de Burgos.