Does t-SNE have to result in clear clusters / structures?

I have a data set which, no matter how I tune t-SNE, won't produce clearly separated clusters or even recognizable patterns and structures. Ultimately, it results in arbitrarily distributed data points all over the plot, with a few more points of one class here and a few of another class there. Is it down to t-SNE, to me, and/or to the data? I'm using Rtsne(df_tsne, perplexity = 25, max_iter = 1000000, eta = 10, check_duplicates = FALSE)
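For context: t-SNE does not have to produce clusters; on unstructured data it yields exactly this kind of diffuse cloud. One common diagnostic is to sweep perplexity and compare the resulting layouts and final KL divergences. The question uses Rtsne (R), so the following is only an illustrative scikit-learn sketch on synthetic stand-in data:

```python
import numpy as np
from sklearn.manifold import TSNE

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 10))  # stand-in for df_tsne

# Sweep perplexity; compare layouts visually and KL divergences numerically
embeddings = {}
kl = {}
for perplexity in (5, 25, 50):
    tsne = TSNE(n_components=2, perplexity=perplexity, random_state=0)
    embeddings[perplexity] = tsne.fit_transform(X)
    kl[perplexity] = tsne.kl_divergence_
```

If no perplexity in a wide range reveals structure, the likely explanation is the data rather than the tuning.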
Topic: tsne r clustering
Category: Data Science

How do I calculate a similarity matrix with a Student-t kernel?

As the title says, how do I calculate a similarity matrix with an un-normalized Student-t kernel? I'm attempting to calculate the Kullback-Leibler divergence for different t-SNE runs, but I need a Q-matrix for that. A few steps before the Q-matrix, I need the similarity matrices built with the un-normalized Student-t kernel. I'm using R; not sure if that's relevant to an answer.
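The computation is language-agnostic (the question mentions R; the sketch below uses NumPy). In t-SNE the un-normalized low-dimensional similarity is the Student-t (Cauchy) kernel $(1 + \lVert y_i - y_j \rVert^2)^{-1}$, and the Q-matrix normalizes it over all pairs:

```python
import numpy as np

def q_matrix(Y):
    """Low-dimensional affinity matrix Q from t-SNE's Student-t kernel (1 d.o.f.)."""
    # Pairwise squared Euclidean distances between embedding points
    sq = np.sum((Y[:, None, :] - Y[None, :, :]) ** 2, axis=-1)
    # Un-normalized Student-t kernel: (1 + ||y_i - y_j||^2)^-1
    num = 1.0 / (1.0 + sq)
    np.fill_diagonal(num, 0.0)  # q_ii is defined as 0
    # Normalize over all pairs to obtain the Q-matrix
    return num / num.sum()

rng = np.random.default_rng(0)
Y = rng.normal(size=(5, 2))   # hypothetical 2-D embedding
Q = q_matrix(Y)
```

`num` before normalization is the un-normalized similarity matrix the question asks for; `Q` is what enters the KL divergence.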
Category: Data Science

What Clustering Method Should I Use?

My data is a group of 10 thousand points (each with a node location (x, y)) spread across a plane. They are also colored according to their weight. I need to settle on a Bayesian nonparametric clustering method that groups points mainly by weight, but also by distance: that is, clusters by definition have some relevance to distance, but there are clear topological distinctions between the first quarter and the last quarter of the data (I say quarter as …
Category: Data Science

Is it always possible to get well-defined clusters from the data?

I have TV watching data and I have been trying to cluster it to obtain different sets of watchers. My dataset consists of 64 features (such as total watching time, percentage of ads skipped, movies vs. shows, etc.). All the variables are either numerical or binary. But no matter how I treat them (normalize them, standardize them, leave them as is, take a subset of features, etc.), I always end up with pictures similar to this: This particular picture was constructed …
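In short: no, not all data contains well-defined clusters. One hedged sanity check is to compare the best silhouette score achievable on the real data against the same statistic on a structureless reference drawn uniformly over the same range; if they are similar, apparent "clusters" may be noise. A sketch with synthetic stand-in data:

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import silhouette_score

rng = np.random.default_rng(0)
X = rng.normal(size=(300, 64))                       # stand-in for the 64-feature data
X_ref = rng.uniform(X.min(), X.max(), size=X.shape)  # structureless null reference

def best_silhouette(data, ks=range(2, 8)):
    # Best silhouette over a small range of cluster counts
    return max(
        silhouette_score(
            data,
            KMeans(n_clusters=k, n_init=10, random_state=0).fit_predict(data),
        )
        for k in ks
    )

s_data = best_silhouette(X)
s_ref = best_silhouette(X_ref)
# If s_data is close to s_ref, the data has no strong cluster structure.
```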
Category: Data Science

t-SNE on extremely high-dimensional spaces

I successfully applied t-SNE to the handwritten digits dataset: n = 3823 data points (i.e. handwritten digits) in a D = 64-dimensional space (i.e. 8x8 pixels). It worked great. Now I would like to cluster n ≈ 60 data points in a D ≈ 3000-dimensional space. Even after many iterations, t-SNE fares far worse than, say, PCA. Is there an upper bound on the number of dimensions (relative to the number of data points) above which applying t-SNE is not advised?
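A common practice in this regime (and what the original t-SNE paper itself does) is to run PCA first to strip noise dimensions, then apply t-SNE to the reduced data; with n ≈ 60 points, at most 60 principal components carry any variance anyway. A sketch on stand-in data of the stated shape:

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.manifold import TSNE

rng = np.random.default_rng(0)
X = rng.normal(size=(60, 3000))   # stand-in for the n≈60, D≈3000 data

# Reduce dimensionality first; with n=60, at most 60 PCs are meaningful
X_pca = PCA(n_components=30, random_state=0).fit_transform(X)

# Perplexity must be well below n; around n/5 is a common starting point
emb = TSNE(n_components=2, perplexity=10, random_state=0).fit_transform(X_pca)
```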
Topic: tsne pca
Category: Data Science

Grouping already clustered data (with a pre-defined x and y)

I have an already clustered data set (I want to keep my x and y) in which there's clearly a small group of elements in the middle that don't follow the expected patterns. I can select them manually, but I wonder if there's a way of automating the selection of these elements efficiently, something like using just the grouping part of a clustering algorithm. I've been trying it with a threshold, but it doesn't produce good results in cases that won't …
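Since the x and y are fixed, a density-based method such as DBSCAN can do exactly the "grouping part": it labels spatially dense groups without re-embedding the data, so a distinct middle group comes out as its own label. A sketch on synthetic stand-in data (the `eps` value is an assumption to be tuned to the data's scale):

```python
import numpy as np
from sklearn.cluster import DBSCAN

rng = np.random.default_rng(0)
# Stand-in: a ring of "expected" points plus a small dense blob in the middle
theta = rng.uniform(0, 2 * np.pi, 200)
ring = np.column_stack([10 * np.cos(theta), 10 * np.sin(theta)])
blob = rng.normal(0, 0.5, size=(20, 2))
X = np.vstack([ring, blob])

# eps is the neighborhood radius; tune it to the spacing of your points
labels = DBSCAN(eps=1.5, min_samples=5).fit_predict(X)

# Select the group containing a known middle point (here, the last one)
center_label = labels[-1]
selected = X[labels == center_label]
```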
Category: Data Science

Visualizing outliers using t-SNE

I'm trying to visualize outliers in my data using t-SNE, and it seems like the outliers appear as three different clusters. The original data has 7 columns, but I chose to plot the outliers on a two-dimensional graph. I expected the outliers to be clustered into one single group, but I have three different clusters (red dots) on my graph. Is it normal to see different groups of outliers? For example, the red cluster on the far left …
Category: Data Science

How to reduce position changes after dimensionality reduction?

Disclaimer: I'm a machine learning beginner. I'm working on visualizing high-dimensional data (text as tf-idf vectors) in 2D space. My goal is to label/modify those data points, recompute their positions after the modification, and update the 2D plot. The logic already works, but each iterative visualization is very different from the previous one, even though only 1 out of 28,000 features in 1 data point changed. Some details about the project: ~1000 text documents/data points, ~28,000 tf-idf vector features …
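One way to reduce this jumping (assuming scikit-learn's TSNE, which accepts an array for `init`) is to warm-start each re-run from the previous embedding instead of a fresh random or PCA initialization, so the optimizer only drifts from the old layout. A sketch on stand-in data:

```python
import numpy as np
from sklearn.manifold import TSNE

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 50))   # stand-in for the tf-idf matrix

# First embedding
emb1 = TSNE(n_components=2, perplexity=10, random_state=0).fit_transform(X)

# Modify one feature of one data point
X2 = X.copy()
X2[0, 0] += 0.1

# Re-embed, warm-starting from the previous layout to keep positions stable
emb2 = TSNE(n_components=2, perplexity=10, init=emb1,
            random_state=0).fit_transform(X2)
```

Note that t-SNE's loss is non-convex, so even with a warm start some movement remains; parametric methods (e.g. UMAP's `transform`, or a learned parametric embedding) give stronger stability guarantees.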
Category: Data Science

What information can we obtain from t-SNE?

I see that t-SNE can help us reduce dimensions and visualize the data. But what information are we gaining from this visualization? As we know, the new axes don't have a meaning in our context. Moreover, if we have class-labeled data, then what information can we gain from the visualization? We already know that there are some 'n' classes and that we have to classify new examples into one of these classes. Or am I wrong to …
Category: Data Science

Can t-SNE be applied to visualize time series datasets?

I have multiple time-series datasets containing 9 IMU sensor features. Suppose I use the sliding-window method to split all these data into samples with a sequence length of 100, i.e. the dimension of my dataset would be (number of samples, 100, 9). Now I want to visualize those split samples to find the patterns inside. Can I treat it as tabular data and first transform the original dimension to (number of samples, 900), then apply the t-SNE method directly on that …
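Mechanically, yes: flattening each window to a 900-dimensional vector and running t-SNE on the result works, with the caveat that it treats every (timestep, sensor) cell as an independent feature and ignores temporal ordering. A sketch on synthetic stand-in windows:

```python
import numpy as np
from sklearn.manifold import TSNE

rng = np.random.default_rng(0)
windows = rng.normal(size=(150, 100, 9))   # (samples, seq_len, sensors)

# Flatten each window to a single 900-dimensional feature vector
flat = windows.reshape(len(windows), -1)   # (150, 900)

emb = TSNE(n_components=2, perplexity=20, random_state=0).fit_transform(flat)
```

If temporal structure matters, alternatives include summary features per window (means, spectra) or a time-series-aware distance (e.g. DTW) fed to t-SNE via `metric="precomputed"`.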
Category: Data Science

PCA vs t-SNE in asset pricing

So I am trying dimensionality reduction techniques on the S&P 500 FY2020 data. I understand the CAPM model and the fact that doing a PCA gives my market variability factor (the first PCA component). What I am wondering is: what intuitions (if any) does t-SNE give on the same data? Using scikit-learn I have embeddings for the first component, but does the embedding relate to CAPM in any way? Or, for that matter, to any other asset pricing model? What I have …
Category: Data Science

t-SNE - how variance is set and how it affects dense vs sparse clusters in HD space

When learning about t-SNE, I found a resource saying that the "width of the normal curve (a Gaussian centered at $x_i$) depends on the density of data near the point of interest", which is why we normalize by $\sum_{k\neq i} e^{-||x_i - x_k||^2/2\sigma_i^2}$ in $p_{j|i}= \frac { e^{-||x_i - x_j||^2/2\sigma_i^2}} {\sum_{k\neq i} e^{-||x_i - x_k||^2/2\sigma_i^2}}$. I know that the Gaussian's width depends on the variance, $\sigma_i^2$. However, there was no mention of how to calculate the variance, and I read that the variance …
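For reference, t-SNE does not set $\sigma_i$ directly: for each point it binary-searches the $\sigma_i$ whose conditional distribution $p_{j|i}$ has a user-chosen perplexity ($2^{H(P_i)}$), so dense neighborhoods get small $\sigma_i$ and sparse ones get large $\sigma_i$. A minimal NumPy sketch of that search for a single point:

```python
import numpy as np

def sigma_for_perplexity(dists_sq, target_perplexity, tol=1e-5):
    """Binary-search sigma_i so that p_{j|i} has the requested perplexity."""
    lo, hi = 1e-10, 1e10
    for _ in range(200):
        sigma = (lo + hi) / 2.0
        p = np.exp(-dists_sq / (2.0 * sigma ** 2))
        p /= p.sum()
        entropy = -np.sum(p * np.log2(p + 1e-12))
        if abs(2.0 ** entropy - target_perplexity) < tol:
            break
        if 2.0 ** entropy > target_perplexity:
            hi = sigma   # too wide -> too uniform -> shrink sigma
        else:
            lo = sigma   # too narrow -> too peaked -> grow sigma
    return sigma

rng = np.random.default_rng(0)
x = rng.normal(size=(50, 5))
i = 0
d2 = np.sum((x - x[i]) ** 2, axis=1)
d2 = np.delete(d2, i)                       # exclude k = i from the sum
sigma_i = sigma_for_perplexity(d2, target_perplexity=10.0)
```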
Topic: gaussian tsne
Category: Data Science

Good classification, poor separation with t-SNE/UMAP

I have been working on a classification problem for which I have been able to achieve good results across various classification metrics. I have been careful to ensure that I am not leaking information at any stage in my pipeline, but it's always possible that I've missed something. Recently, I ran the data through t-SNE and UMAP and found that my classes do not separate well at all. This is surprising given the success of my models. Is this discrepancy …
Category: Data Science

t-SNE interpretation and separability

I have a binary classification problem where I train a neural network on training and validation data sets, but I am not satisfied with the performance of my trained classifier (the NN above). The loss function (binary cross-entropy) did not get lower than 0.1280 on the validation set, and on the test set it is about 0.1340. I tried to debug my data with t-SNE, to visualize how "separable" my training data is. My question …
Category: Data Science

t-SNE parameters

I am trying to tune the parameters of sklearn.manifold.TSNE(n_components=2, *, perplexity=30.0, early_exaggeration=12.0, learning_rate=200.0, n_iter=1000, n_iter_without_progress=300, min_grad_norm=1e-07, metric='euclidean', init='random', verbose=0, random_state=None, method='barnes_hut', angle=0.5, n_jobs=None, square_distances='legacy'). Even though I tried a number of combinations, the visualization does not show a clear separation between the two classes. Is there a way to tune t-SNE automatically or manually to find the best parameters?
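There is no fully automatic tuner built into scikit-learn, but a small manual grid over perplexity and learning rate, scored by the final `kl_divergence_`, is a common approach. One caveat: KL values are only loosely comparable across perplexities (the P distribution changes), so inspect the layouts too. A sketch on stand-in data:

```python
import numpy as np
from sklearn.manifold import TSNE

rng = np.random.default_rng(0)
X = rng.normal(size=(80, 20))   # stand-in for your feature matrix

results = []
for perplexity in (5, 30):
    for lr in (50.0, 200.0):
        t = TSNE(n_components=2, perplexity=perplexity,
                 learning_rate=lr, random_state=0)
        emb = t.fit_transform(X)
        results.append((t.kl_divergence_, perplexity, lr, emb))

# Pick the run with the lowest final KL divergence
best_kl, best_perp, best_lr, best_emb = min(results, key=lambda r: r[0])
```

Bear in mind that if the classes genuinely overlap in feature space, no parameter setting will separate them.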
Category: Data Science

Why is it recommended to use t-SNE to reduce to 2-3 dimensions and not higher?

According to the wiki, it is recommended to use t-SNE to map to 2-3 dimensions. I can understand this if we want to visualize the data. But if we want to reduce the number of features (e.g. from 30 features to 5 dimensions), is it recommended to do this with t-SNE, or should we use another dimensionality reduction algorithm?
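Technically it is possible: in scikit-learn, `n_components > 3` just requires `method="exact"`, since the Barnes-Hut approximation only supports up to 3 output dimensions. A sketch on stand-in data:

```python
import numpy as np
from sklearn.manifold import TSNE

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 30))   # stand-in for the 30-feature data

# Barnes-Hut supports n_components <= 3 only; use the exact (O(n^2)) method
emb5 = TSNE(n_components=5, method="exact", perplexity=10,
            random_state=0).fit_transform(X)
```

That said, t-SNE is generally a poor choice for feature reduction: it preserves local neighborhoods rather than global distances, and it has no transform for unseen data. PCA, or UMAP (which does provide a `transform`), are more common choices for producing features for downstream models.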
Category: Data Science
