A simple and scalable framework for longitudinal analysis of Twitter data is developed that combines latent topic models with computational geometric methods. Dimensionality reduction tools from computational geometry are applied to learn the intrinsic manifold on which the latent, temporal topics reside. Then shortest path distances on the manifold are used to link together these topics. The proposed framework permits visualization of the low-dimensional embedding, which provides clear interpretation of the complex, high-dimensional trajectories that may exist among latent topics. Practical application of the proposed framework is demonstrated through its ability to capture and effectively visualize natural progression of latent COVID-19–related topics learned from Twitter data. Interpretability of the trajectories is achieved by comparing to real-world events. In addition, the framework permits study of spatial variation in Twitter behavior for learned topics. The analysis demonstrates that the proposed framework is able to capture granular-level impact of COVID-19 on public discussions. We end by arguing that Twitter data, when analyzed within the proposed framework, can serve as a valuable supplementary data stream for COVID-related studies.

Keywords: topic models, latent Dirichlet allocation, computational geometry, social media analysis, COVID-19

As technology advances, it prompts an ever-increasing demand to acquire, analyze, and generate complex, unstructured data. These necessarily large data sets must be amenable to efficient processing, analysis, and implementation in a variety of settings such as multidimensional modeling and high-resolution visualization. Traditionally, machine learning approaches for these problems relied on user-defined heuristics to extract features encoding structural information about the data. More recently there has been increasing interest in developing geometry-based methods that automatically learn signals from these complex and large data sets.

This article introduces a flexible, scalable framework to extract patterns from time-evolving, complex, high-dimensional data associated with documents and social media archives. Our framework combines latent Dirichlet allocation (LDA) and computational geometric representations to time-align thematic information extracted from each time slice of the data. Under this framework, we achieve two goals. First, we are able to apply machine learning and data analytic tools that preserve geometric information intrinsic to the structure of the data. Second, we introduce a modular approach to a complex statistical modeling problem, separating topic modeling and temporal dynamics. The utility of this framework is demonstrated on Twitter data collected during the COVID-19 outbreak in the United States. Using geotagged tweets, the geometry-driven longitudinal model reveals temporal patterns in the evolution of topics of conversation as they are affected by the outbreak.
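The idea of linking topics across time slices by shortest-path distances can be illustrated with a small sketch. This is not the authors' implementation. It assumes each time slice has already produced topic-word distributions (for example from LDA), uses the Hellinger distance between distributions, and approximates the manifold with a k-nearest-neighbour graph. All three choices, the function names, and the toy numbers are hypothetical and chosen only for illustration.

```python
import heapq
import math


def hellinger(p, q):
    """Hellinger distance between two discrete probability distributions."""
    return math.sqrt(0.5 * sum((math.sqrt(a) - math.sqrt(b)) ** 2
                               for a, b in zip(p, q)))


def knn_graph(points, k):
    """Symmetric k-nearest-neighbour graph: node -> {neighbour: edge length}."""
    graph = {i: {} for i in range(len(points))}
    for i, p in enumerate(points):
        nearest = sorted((hellinger(p, q), j)
                         for j, q in enumerate(points) if j != i)
        for d, j in nearest[:k]:
            graph[i][j] = d
            graph[j][i] = d
    return graph


def geodesic_distances(graph, source):
    """Dijkstra shortest-path distances from `source` along graph edges."""
    dist = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, math.inf):
            continue  # stale heap entry
        for v, w in graph[u].items():
            nd = d + w
            if nd < dist.get(v, math.inf):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return dist


def link_topics(slices, k=2):
    """Link each topic in slice t to its geodesically closest topic in slice t+1."""
    points, index = [], []
    for t, topics in enumerate(slices):
        for j, distribution in enumerate(topics):
            points.append(distribution)
            index.append((t, j))
    graph = knn_graph(points, k)
    links = {}
    for i, (t, j) in enumerate(index):
        if t + 1 >= len(slices):
            continue  # last slice has no successor
        geo = geodesic_distances(graph, i)
        # Only consider next-slice topics that are reachable in the graph.
        candidates = [(geo[n], index[n][1]) for n in range(len(points))
                      if index[n][0] == t + 1 and n in geo]
        if candidates:
            links[(t, j)] = (t + 1, min(candidates)[1])
    return links


# Toy input: 3 time slices, 2 topics each, over a 4-word vocabulary (made up).
slices = [
    [[0.70, 0.20, 0.05, 0.05], [0.05, 0.05, 0.20, 0.70]],
    [[0.60, 0.30, 0.05, 0.05], [0.05, 0.05, 0.30, 0.60]],
    [[0.50, 0.40, 0.05, 0.05], [0.05, 0.05, 0.40, 0.50]],
]
links = link_topics(slices, k=2)
for src, dst in sorted(links.items()):
    print(src, "->", dst)
```

In this toy example the two topics drift slowly from slice to slice, so each topic is linked to its own successor. Following the links from slice to slice gives a trajectory for each topic through time. In practice the distance measure, the neighbourhood size `k`, and the handling of topics with no reachable successor would all need to be chosen with care.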