¿De qué temas hablamos en JENUI? Modelado de topics con Latent Dirichlet Allocation (LDA)

Raúl Marticorena Sánchez, Carlos López Nozal, Jose Miguel Ramirez-Sanz, José Luis Garrido-Labrador

June, 2024

Abstract

The JENUI conference (Jornadas sobre la Enseñanza Universitaria de la Informática) has been held for three decades, from 1994 to the present day. Over this period its subject matter has evolved, with the scope and topics of the papers shifting alongside advances in computer science and in how it is taught at university level. Topic modelling, an unsupervised learning technique for text, makes large collections of textual data easier to understand by grouping documents into topics. This paper applies the technique to the complete JENUI proceedings, comprising 1745 documents. Text is extracted from titles and abstracts, and the Latent Dirichlet Allocation (LDA) algorithm is then applied, with the optimal number of topics estimated as part of the process. The work builds a topic classifier from the JENUI articles. It also analyses the distribution of topics, the probabilities of the terms within each topic, and how the topics of the papers evolve over time. From this more objective and scientific perspective, the paper concludes that there has been a thematic evolution across the 27 editions that have JENUI proceedings.
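
As a rough illustration of the kind of pipeline the abstract describes, the following minimal sketch fits LDA by collapsed Gibbs sampling on a toy corpus. It is not the authors' implementation: the corpus `DOCS`, the helper names `fit_lda` and `top_terms`, the hyperparameters `alpha` and `beta`, and the fixed choice of two topics are all assumptions made for the example, and the paper's own tooling and its method for estimating the number of topics are not reproduced here.

```python
import random

# Hypothetical toy corpus standing in for "title + abstract" text.
# These are invented strings, not JENUI data.
DOCS = [
    "programming course students java exercises",
    "java programming exercises automatic grading",
    "students programming assignments grading",
    "gender diversity students enrolment degree",
    "degree curriculum gender diversity women",
    "curriculum degree competences enrolment",
]


def fit_lda(docs, n_topics, alpha=0.1, beta=0.01, n_iter=200, seed=0):
    """Fit LDA with collapsed Gibbs sampling.

    Returns (theta, phi, vocab): per-document topic distributions,
    per-topic term distributions, and the sorted vocabulary.
    """
    rng = random.Random(seed)
    tokens = [d.split() for d in docs]
    vocab = sorted({w for doc in tokens for w in doc})
    word_id = {w: i for i, w in enumerate(vocab)}
    n_words = len(vocab)

    doc_topic = [[0] * n_topics for _ in tokens]          # counts per document
    topic_word = [[0] * n_words for _ in range(n_topics)]  # counts per topic
    topic_total = [0] * n_topics
    assign = []

    # Random initial topic assignment for every token.
    for d, doc in enumerate(tokens):
        row = []
        for w in doc:
            k = rng.randrange(n_topics)
            row.append(k)
            doc_topic[d][k] += 1
            topic_word[k][word_id[w]] += 1
            topic_total[k] += 1
        assign.append(row)

    # Resample each token's topic conditioned on all other assignments.
    for _ in range(n_iter):
        for d, doc in enumerate(tokens):
            for i, w in enumerate(doc):
                v = word_id[w]
                k = assign[d][i]
                doc_topic[d][k] -= 1
                topic_word[k][v] -= 1
                topic_total[k] -= 1
                weights = [
                    (doc_topic[d][t] + alpha)
                    * (topic_word[t][v] + beta)
                    / (topic_total[t] + n_words * beta)
                    for t in range(n_topics)
                ]
                k = rng.choices(range(n_topics), weights=weights)[0]
                assign[d][i] = k
                doc_topic[d][k] += 1
                topic_word[k][v] += 1
                topic_total[k] += 1

    # Smoothed point estimates of the two distributions.
    theta = [
        [(doc_topic[d][k] + alpha) / (len(doc) + n_topics * alpha)
         for k in range(n_topics)]
        for d, doc in enumerate(tokens)
    ]
    phi = [
        [(topic_word[k][v] + beta) / (topic_total[k] + n_words * beta)
         for v in range(n_words)]
        for k in range(n_topics)
    ]
    return theta, phi, vocab


def top_terms(phi, vocab, n=3):
    """Return the n most probable terms of each topic."""
    return [
        [vocab[v] for v in sorted(range(len(vocab)), key=lambda v: -row[v])[:n]]
        for row in phi
    ]


if __name__ == "__main__":
    theta, phi, vocab = fit_lda(DOCS, n_topics=2)
    for k, terms in enumerate(top_terms(phi, vocab)):
        print(f"topic {k}: {', '.join(terms)}")
    for doc, dist in zip(DOCS, theta):
        print(f"{max(range(len(dist)), key=dist.__getitem__)}  {doc}")
```

Here `theta` plays the role of the topic distribution per document, from which a dominant topic can be read off as a simple classifier, and `phi` holds the term probabilities of each topic. Averaging `theta` over the documents of each edition would give one possible view of how topics evolve over time.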

Type

Conference paper

Publication

XXX Jornadas sobre la Enseñanza Universitaria de la Informática

José Luis Garrido-Labrador

Assistant Lecturer in Computer Languages and Systems