Glossary.

A brief explanation of technical terms used in the analysis

t-distributed stochastic neighbor embedding (t-SNE)

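A non-linear dimensionality reduction algorithm that is particularly good at preserving which points are close to each other when projecting them from a high-dimensional space to 2D. Points that are near one another in the original space tend to stay near one another in the projection, whereas the distances between well-separated clusters are less reliable.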

Tech topic

A topic is a collection of Meetup tags that appear in similar contexts or are used to describe the same thing. In our analysis, we created a hierarchy of topics: 42 sub-categories were created first and then aggregated into 9 broad topics.

Term frequency, inverse document frequency (TF-IDF)

TF-IDF is a weight used to evaluate the importance of a word to a document in a corpus. The first part, TF, is the raw frequency of the word in the document, while the second, IDF, measures how rare the word is across the corpus. A word scores highly when it is frequent in one document but appears in few others.
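
The weighting can be illustrated with a minimal sketch. It assumes the common formulation TF × log(N / DF), where N is the number of documents and DF is the number of documents containing the word; the exact variant used in the analysis is not stated here, and the function name `tf_idf` and the example tag lists are hypothetical.

```python
import math
from collections import Counter


def tf_idf(docs):
    """Return one {term: weight} dict per document.

    Assumed formulation: raw term count multiplied by log(N / DF).
    """
    n_docs = len(docs)
    # Document frequency: in how many documents does each term appear?
    df = Counter(term for doc in docs for term in set(doc))
    weights = []
    for doc in docs:
        tf = Counter(doc)  # raw frequency of each term in this document
        weights.append({t: tf[t] * math.log(n_docs / df[t]) for t in tf})
    return weights


# Hypothetical documents, each one a list of tags.
groups = [
    ["python", "data-science", "python"],
    ["python", "web-development"],
    ["data-science", "machine-learning"],
]
scores = tf_idf(groups)
```

In this sketch a term that occurs in every document receives a weight of zero, because log(N / N) = 0, while a term unique to one document receives the largest IDF.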

Tokens

Tokens are the smallest meaningful units produced by splitting a piece of text into its parts. Commonly created tokens include words, numbers, symbols, punctuation marks and short phrases. In Arloesiadur, tokens are the preprocessed tags used by the Meetup groups.
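
As a rough illustration, the sketch below splits a sentence into word, number and punctuation tokens, and normalises a tag. Both functions and their rules are assumptions made for the example; the preprocessing actually applied to Meetup tags in Arloesiadur is not described here.

```python
import re


def tokenize(text):
    """Lowercase the text and split it into word, number and punctuation tokens."""
    # \w+ matches a run of letters, digits or underscores;
    # [^\w\s] matches a single punctuation mark or symbol.
    return re.findall(r"\w+|[^\w\s]", text.lower())


def normalise_tag(tag):
    """Hypothetical tag preprocessing: trim, lowercase, join words with hyphens."""
    return re.sub(r"[\s_]+", "-", tag.strip().lower())


tokens = tokenize("Meetup has 42 groups!")
tag = normalise_tag("  Machine Learning ")
```

Here `tokens` holds five items (three words, one number and one punctuation mark), and `tag` becomes a single hyphenated token.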
