A brief explanation of technical terms used in the analysis
A non-linear dimensionality reduction algorithm that is particularly good at preserving the distances between points when projecting them from a high-dimensional space to 2D.
A topic is a collection of Meetup tags that appear in similar contexts or are used to describe the same thing. In our analysis, we created a hierarchy of topics: 42 sub-categories were created first and then aggregated into 9 broad topics.
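TF-IDF (term frequency-inverse document frequency) is a weight used to evaluate the importance of a word to a document in a corpus. The first part, TF, is the raw frequency of the word in the document. The second part, IDF, reflects how rare the word is across the corpus: the more documents contain the word, the lower its IDF. A word therefore scores highly when it is frequent in one document but uncommon in the rest of the corpus.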
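To make the weighting concrete, here is a minimal sketch in Python. It assumes the common variant in which TF is the raw count of a token in a document and IDF is log(N / df), where N is the number of documents and df is the number of documents containing the token. The function name `tf_idf` and the example documents are illustrative and are not taken from the analysis, which may have used a different variant or a library implementation.

```python
import math
from collections import Counter


def tf_idf(docs):
    """Return one {token: weight} dict per document.

    Assumed variant: TF = raw count, IDF = log(N / df).
    """
    n_docs = len(docs)

    # Document frequency: in how many documents does each token occur?
    df = Counter()
    for doc in docs:
        df.update(set(doc))

    weights = []
    for doc in docs:
        tf = Counter(doc)  # raw frequency of each token in this document
        weights.append(
            {token: count * math.log(n_docs / df[token])
             for token, count in tf.items()}
        )
    return weights


# Hypothetical documents, each a list of tokens.
docs = [
    ["python", "data", "python"],
    ["data", "science"],
    ["data", "hiking"],
]
weights = tf_idf(docs)
```

In this toy corpus "data" occurs in every document, so its IDF is log(3/3) = 0 and its weight is zero everywhere, while "python" occurs twice in a single document and gets the highest weight there.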
Tokens are the smallest meaningful units produced when a piece of text is split into its parts. Commonly created tokens include words, numbers, symbols, punctuation marks and, in some cases, short phrases. In Arloesiadur, tokens are the preprocessed tags used by the Meetup groups.
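As a rough illustration, the sketch below splits a sentence into word, number and punctuation tokens, and normalises a tag into a single token. The helper names `tokenize` and `preprocess_tag`, and the normalisation rules themselves (lowercasing, replacing runs of non-alphanumeric characters with a hyphen), are assumptions made for this example; the exact preprocessing applied to the Meetup tags is not described here.

```python
import re


def tokenize(text):
    """Split text into lowercase word/number tokens and single punctuation marks."""
    return re.findall(r"\w+|[^\w\s]", text.lower())


def preprocess_tag(tag):
    """Hypothetical tag normalisation: lowercase and hyphenate."""
    # Replace every run of characters other than a-z and 0-9 with one hyphen,
    # then drop hyphens left at either end.
    return re.sub(r"[^a-z0-9]+", "-", tag.lower()).strip("-")


tokens = tokenize("Meetups in Wales: 42 groups!")
tag_token = preprocess_tag(" Machine Learning ")
```

Here `tokens` is `['meetups', 'in', 'wales', ':', '42', 'groups', '!']` and `tag_token` is `'machine-learning'`, so differently formatted spellings of the same tag collapse to one token.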