Unsupervised Hierarchical Agglomerative Clustering

I've read a number of papers where the authors talk about "Unsupervised Hierarchical Agglomerative Clustering". They seem to imply that the algorithm determines the number of clusters based on a hyper-parameter: we define the heterogeneity metric within a cluster to be the average of all-pair Jaccard distances, and at each step merge two clusters if the heterogeneity of the resulting cluster is below a specified threshold. When I search for Python implementations of agglomerative clustering I keep coming up with …
Category: Data Science

How to score different clusters of features for predictiveness?

I have a set of true/false data that represents whether a given feature was active when the data snapshot was recorded. Data snapshots are recorded when the user takes an action. The goal is to find clusters of features that were true at the same time and that are predictive of the user taking said action. To provide some more context, I'm working on a program that is meant to analyze data recorded while players play …
Category: Data Science
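
One simple way to score a candidate set of boolean features for predictiveness, assuming the snapshots carry an action label, is "lift": how much more likely the action is when every feature in the set is true, compared to its base rate. The column names (`f1`, `f2`, `f3`, `action`) and the toy data below are hypothetical:

```python
# Sketch: score a feature set by lift = P(action | all features True) / P(action).
# Values above 1 mean the feature combination makes the action more likely.
# All column names and data here are made up for illustration.
import pandas as pd

def lift(df: pd.DataFrame, features: list, target: str = "action") -> float:
    """P(target | all features True) / P(target); 0.0 if the set never co-occurs."""
    mask = df[features].all(axis=1)
    if not mask.any():
        return 0.0
    return df.loc[mask, target].mean() / df[target].mean()

df = pd.DataFrame({
    "f1":     [1, 1, 0, 1, 0, 1, 0, 0],
    "f2":     [1, 1, 0, 1, 0, 0, 1, 0],
    "f3":     [0, 1, 1, 0, 0, 1, 0, 1],
    "action": [1, 1, 0, 1, 0, 0, 0, 0],
}).astype(bool)

print(round(lift(df, ["f1", "f2"]), 3))  # 2.667: the pair co-occurs only with the action
```

From here one could enumerate the feature subsets produced by a clustering step and rank them by lift, preferably with a minimum-support cutoff so rare combinations don't dominate.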

Understanding hierarchical clustering features importance

I made a hierarchical clustering with scikit-learn: selected_model = AgglomerativeClustering(n_clusters=8); hierarchical_clustering8 = selected_model.fit_predict(answers). This classification was done on the basis of 50 features and led me to 8 clusters. How can I proceed to determine the importance of each feature in this classification? My goal is to determine the most important and least important features for each cluster, and to be able to explain each cluster.
Category: Data Science
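
Hierarchical clustering has no built-in feature importances, but one common workaround is to compare each feature's mean inside a cluster to its overall mean, scaled by the overall standard deviation: features with large |z| are the ones that characterize that cluster. A sketch, with `answers` replaced by random stand-in data and 50 hypothetical feature names:

```python
# Sketch: rank features per cluster by how far the cluster mean deviates from
# the overall mean, in units of the overall standard deviation. `answers` is
# random stand-in data; in the question it is the real 50-feature dataset.
import numpy as np
import pandas as pd
from sklearn.cluster import AgglomerativeClustering

rng = np.random.default_rng(0)
answers = pd.DataFrame(rng.normal(size=(200, 50)),
                       columns=[f"q{i}" for i in range(50)])

labels = AgglomerativeClustering(n_clusters=8).fit_predict(answers)

overall_mean = answers.mean()
overall_std = answers.std()

for k in range(8):
    z = (answers[labels == k].mean() - overall_mean) / overall_std
    top = z.abs().sort_values(ascending=False).head(3)
    print(f"cluster {k}: most distinctive features -> {list(top.index)}")
```

An alternative is to train a classifier (e.g. a random forest) to predict the cluster labels and read off its feature importances, one-vs-rest per cluster.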

ggplot2 for Cluster analysis (non-readable row names)

I have made a cluster analysis and ended up with a dendrogram; however, the row names are not readable (marked with a red rectangle). Is there a way to adjust this? library("reshape2"); library("purrr"); library("dplyr"); library("dendextend"); dendro <- as.dendrogram(aggl.clust.c); dendro.col <- dendro %>% set("branches_k_color", k = 5, value = c("darkslategray", "darkslategray4", "darkslategray3", "gold", "gold2")) %>% set("branches_lwd", 0.6) %>% set("labels_colors", value = c("darkslategray")) %>% set("labels_cex", 0.5); ggd1 <- as.ggdend(dendro.col); ggplot(ggd1, theme = theme_minimal()) + labs(x = "Num. observations", y = "Height", …
Category: Data Science

Results interpretation of AgglomerativeClustering labelling

First of all, I would like to say that I'm quite new to Python and even newer to scikit-learn, and I'm also a self-learner, so please forgive my banal question, though it doesn't look banal to me. So, I have the following cosine similarity matrix as a DataFrame:

      m1     m2     m3     m4     m5
m1  1.000  0.179  0.775  0.673  0.544
m2  0.299  1.000  0.333  0.521  0.232
m3  0.656  0.440  1.000  0.444  0.722
m4  0.578  0.154  0.623  1.000  0.891
m5  …
Category: Data Science

About

Geeks Mental is a community that publishes articles and tutorials about Web, Android, Data Science, new techniques and Linux security.