graphs

Building a graph out of a large text corpus

kevin_was_here

May 28, 2022 10:19

I'm given a large number of documents on which I should perform various kinds of analysis. Since the documents are to be used as the foundation of a final product, I thought about building a graph out of this text corpus, with each document corresponding to a node. One way to build a graph would be to use models such as USE to first find text embeddings, and then form a link between two nodes (texts) whose similarity is beyond …

Topic: similar-documents graphs text-mining nlp similarity

Category: Data Science
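
A minimal sketch of the thresholded-similarity graph described in the question above. It assumes the embeddings have already been computed by some model; the toy vectors, the document IDs, and the 0.8 threshold are hypothetical placeholders, and cosine similarity is an assumed choice of similarity measure.

```python
import math
from itertools import combinations


def cosine_similarity(u, v):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def build_similarity_graph(embeddings, threshold):
    """Link every pair of documents whose similarity exceeds the threshold."""
    graph = {doc_id: set() for doc_id in embeddings}
    for a, b in combinations(embeddings, 2):
        if cosine_similarity(embeddings[a], embeddings[b]) > threshold:
            graph[a].add(b)
            graph[b].add(a)
    return graph


# Toy 3-dimensional vectors standing in for real text embeddings (hypothetical data).
toy_embeddings = {
    "doc_a": [1.0, 0.0, 0.0],
    "doc_b": [0.9, 0.1, 0.0],
    "doc_c": [0.0, 0.0, 1.0],
}
graph = build_similarity_graph(toy_embeddings, threshold=0.8)
```

The all-pairs loop is quadratic in the number of documents, so a large corpus would likely need an approximate nearest-neighbour index instead of this brute-force comparison.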

Data analytics: how to read an ECDF graph

Yavuz Bozkurt

May 25, 2022 18:08

Hi there, my question is about how to read ECDF graphs. I am still quite unsure what the jumps / zig-zags in the graph mean, what is happening when there is a horizontal line, and so on. I would be happy if someone could explain to me how I am supposed to read this graph and what information I can get from it. Thank you.

Topic: data-analysis data graphs

Category: Data Science
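
One way to make the jumps and flat stretches concrete is to compute an ECDF by hand. This is a small sketch on made-up sample values; the function name `ecdf` is only illustrative.

```python
from bisect import bisect_right


def ecdf(samples):
    """Return (value, fraction of samples <= value) for each distinct value."""
    xs = sorted(samples)
    n = len(xs)
    return [(x, bisect_right(xs, x) / n) for x in sorted(set(xs))]


# Hypothetical sample: the value 1 appears twice, all others once.
steps = ecdf([3, 1, 4, 1, 5])
for value, fraction in steps:
    print(f"{fraction:.0%} of the samples are <= {value}")
```

Each jump sits at an observed value, and its height is the share of samples equal to that value (the repeated 1 produces a jump twice as tall as the others). A horizontal stretch between two jumps means no samples fall in that interval.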

Task of regression on graphs

Tereso del Río Almajano

May 16, 2022 12:50

Which tools are available to extract features from a graph? After that, I would like to perform regression on those features. Initially, I was thinking about using the adjacency matrix of the graph, but maybe there is a smarter way of doing feature extraction on graphs.

Topic: regression graphs feature-extraction

Category: Data Science
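
As a possible alternative to feeding in the raw adjacency matrix, a graph can be summarised by a fixed-length set of structural statistics. The sketch below assumes an undirected graph without self-loops stored as a dict of neighbour sets; the choice of features is illustrative, not a recommendation from the question.

```python
def graph_features(graph):
    """Summarise an undirected graph (dict of neighbour sets) as a feature dict."""
    n = len(graph)
    degrees = [len(nbrs) for nbrs in graph.values()]
    m = sum(degrees) // 2  # each undirected edge is counted from both ends
    # Each triangle is found once per ordering of its three nodes, hence // 6.
    triangles = sum(
        1 for u in graph for v in graph[u] for w in graph[v] if w in graph[u]
    ) // 6
    return {
        "num_nodes": n,
        "num_edges": m,
        "density": 2 * m / (n * (n - 1)) if n > 1 else 0.0,
        "mean_degree": sum(degrees) / n if n else 0.0,
        "max_degree": max(degrees, default=0),
        "num_triangles": triangles,
    }


# Hypothetical toy graph: a triangle a-b-c with a pendant node d attached to c.
toy_graph = {
    "a": {"b", "c"},
    "b": {"a", "c"},
    "c": {"a", "b", "d"},
    "d": {"c"},
}
features = graph_features(toy_graph)
```

Because the output has the same keys for every graph, graphs of different sizes map to vectors of the same length, which is what a standard regression model expects.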

Are there any graph embedding algorithms like this already?

monomonedula

May 13, 2022 13:00

I wrote an algorithm for generating node embeddings based on the graph's topology. Most of the explanation is done in the readme file and the examples. The question is: Am I reinventing the wheel? Does this approach have any practical advantages over existing solutions for embedding generation? Yes, I'm aware there are many algorithms for this based on random walks, but this one is purely deterministic linear algebra and it is quite simple, from my perspective. In short, the algorithm …

Topic: numpy representation embeddings graphs python

Category: Data Science

How can I store sources, effective dates, and confidence for every property in a knowledge graph?

AJAr

May 9, 2022 16:00

What I want to do is ensure that every property in a knowledge base comes from at least one source. I would like to ensure that every edge is spawned (or at least explained) by some event, like a "claim" or "measurement" or "birth." I'd like to rate, on a scale, the confidence that some property is correct, which could also be inherited from the source's confidence rating. Finally, I want to ensure that effective date(s) are known or …

Topic: uncertainty inference graphs knowledge-base databases

Category: Data Science

How to apply K-Medoids to many CFGs?

Prateek

May 2, 2022 11:04

I have around 1000 DAGs (directed acyclic graphs) from different files showing java.io.BufferedReader usage. The following is a representation of one of the graphs: digraph G { 9 [ label="9 : ROOT:setup()#0" ]; 10 [ label="10 : START IF" ]; 12 [ label="12 : java.net.URL.openConnection()#1" ]; 11 [ label="11 : END IF" ]; 13 [ label="13 : java.net.URL.openConnection()#0" ]; 14 [ label="14 : START IF" ]; 16 [ label="16 : java.net.HttpURLConnection.setRequestProperty()#2" ]; 15 [ label="15 : END IF" ]; 17 [ label="17 …

Topic: graphs clustering

Category: Data Science

Growth Edge in Link Prediction

Raphael Bellahsen

April 28, 2022 13:56

I have 2 CSV files representing the edges of a social network in 2 consecutive generations. I am trying to predict future edges. My initial thought is to train a linear regression on the first generation with some indicators, like the Adamic-Adar index or cosine similarity, between the nodes of the edge I am trying to predict. I cannot run all the possible combinations between 2 nodes, so I was wondering how many edges I need to add between 2 generations? Is …

Topic: mathematics graphs machine-learning

Category: Data Science
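
A short sketch of the Adamic-Adar indicator mentioned above, together with one possible way to avoid scoring every node pair: only non-adjacent pairs that share a neighbour are generated, since the index is zero for any pair without common neighbours. The graph is assumed to be undirected and stored as a dict of neighbour sets; the toy data and the helper name `candidate_pairs` are hypothetical.

```python
import math


def adamic_adar(graph, u, v):
    """Sum of 1 / log(degree) over the common neighbours of u and v."""
    common = graph[u] & graph[v]
    return sum(1.0 / math.log(len(graph[w])) for w in common if len(graph[w]) > 1)


def candidate_pairs(graph):
    """Non-adjacent node pairs that share at least one neighbour."""
    pairs = set()
    for nbrs in graph.values():
        for u in nbrs:
            for v in nbrs:
                if u < v and v not in graph[u]:
                    pairs.add((u, v))
    return pairs


# Hypothetical first-generation network.
toy_graph = {
    "a": {"c", "d"},
    "b": {"c", "d"},
    "c": {"a", "b", "e"},
    "d": {"a", "b"},
    "e": {"c"},
}
pairs = candidate_pairs(toy_graph)
scores = {pair: adamic_adar(toy_graph, *pair) for pair in pairs}
```

The scores could then serve as one input column for the regression, with the label being whether the pair is connected in the second generation.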

Return the gradient and y intercept (m, b) to create two lines to best fit the data

Sultan

April 24, 2022 20:54

I have been working on this task for a few hours now and have been unsuccessful in getting the target result. I have tried multiple ways of splitting the dataset using different clustering methods and logistic regression, with no luck. I thought noncontinuous piecewise linear regression might work, but I found no good resources on how to implement it. The task is: given a 2D NumPy array of x, y data points, determine the gradient and y-intercept for …

Topic: linear-regression graphs dataset

Category: Data Science

How does one feed graph optimization problems into Python's anneal function in SciPy?

user2896468

April 9, 2022 15:50

I am interested in graph problems like 2-coloring, max-clique, stable sets, etc., but the documentation for scipy.optimize.anneal seems to cover only ordinary functions. How would one apply this library to graph formulations?

Topic: scipy graphs optimization python

Category: Data Science
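
One way to see how a graph problem maps onto annealing is to write the loop by hand: the state is a colouring, the cost is the number of conflicting edges, and a move flips one node's colour. The sketch below deliberately avoids SciPy and uses plain Python; the linear cooling schedule, step count, and function names are assumptions chosen for illustration, not a tuned setup.

```python
import math
import random


def count_conflicts(edges, coloring):
    """Number of edges whose endpoints share a colour."""
    return sum(1 for u, v in edges if coloring[u] == coloring[v])


def anneal_two_coloring(nodes, edges, steps=2000, start_temp=1.0, seed=0):
    """Simulated annealing over 2-colourings; returns the best state seen."""
    rng = random.Random(seed)
    coloring = {n: rng.randint(0, 1) for n in nodes}
    cost = count_conflicts(edges, coloring)
    best, best_cost = dict(coloring), cost
    for step in range(steps):
        temp = start_temp * (1 - step / steps) + 1e-9  # linear cooling
        node = rng.choice(nodes)
        coloring[node] ^= 1  # propose a move: flip one node's colour
        new_cost = count_conflicts(edges, coloring)
        delta = new_cost - cost
        # Always accept improvements; accept worse moves with Boltzmann probability.
        if delta <= 0 or rng.random() < math.exp(-delta / temp):
            cost = new_cost
            if cost < best_cost:
                best, best_cost = dict(coloring), cost
        else:
            coloring[node] ^= 1  # reject: undo the flip
    return best, best_cost


# Hypothetical toy instance: a 4-cycle, which is 2-colourable.
nodes = ["a", "b", "c", "d"]
edges = [("a", "b"), ("b", "c"), ("c", "d"), ("d", "a")]
best_coloring, best_cost = anneal_two_coloring(nodes, edges)
```

Other problems such as max-clique or stable sets would follow the same pattern with a different state encoding, cost function, and move; annealing is a heuristic, so the returned state is not guaranteed to be optimal.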

How to tell how much information I lose when I simplify the graph data structure relative to the unsimplified graph?

Daniel Wiczew

April 9, 2022 11:01

I have the following problem: I have some data (which I can't publish here, but it is in the form of points with XYZ coordinates) that I can represent as a collection of graphs, i.e. $Q = \{G_1, G_2, \dots, G_t\}$, where every node has an associated set of features, e.g. node $u_i$ has feature vector $\mathcal{F}_i$, and the features change between graphs (but the graph structure does not). The resulting graphs are big in …

Topic: pca graphs

Category: Data Science

How to perform node classification using Graph Neural Networks

Andrew

April 8, 2022 17:00

I am trying to perform node classification using graph neural network methods. My initial plan was to convert my graphs to adjacency matrices and train my network on them, with the node features being my target. However, my graphs all have different numbers of nodes, so I believe adjacency matrices will not work. I then found information about node embeddings and applications in biology (see here). It implies that once the nodes are embedded, graph size no longer matters. …

Topic: graphs deep-learning neural-network

Category: Data Science

Using iGraph to build a Distribution Model

James

March 28, 2022 19:06

I would like to analyze the distribution of the Customers from a Shop, if the Shop is closed or terminated. Consider the following sample data-set;

| ShopID | MonthlyCVisitCount | Lat      | Lng       |
|--------|--------------------|----------|-----------|
| A1     | 15000              | 39.84349 | 116.33986 |
| A2     | 24560              | 39.84441 | 116.33995 |
| A3     | 14789              | 39.84615 | 116.34012 |
| A4     | 35479              | 39.84891 | 116.34039 |

I would like to build a distribution model using …

Topic: graphs data-mining machine-learning

Category: Data Science

Why is sliding window evaluation important in time series analysis?

A-nak Wannapaschaiyong

March 14, 2022 06:43

I have been working on a survey of dynamic graph neural networks, and what I realized is that none of the well-known papers (from prestigious universities) use sliding window evaluation on dynamic graph models. They only use simple train-test splits. I find this very confusing. So I started asking why sliding window evaluation is important in time series analysis in the first place. From my own experience, I know for a fact that dynamic graph models are VERY VERY sensitive to …

Topic: graphs time-series

Category: Data Science

How to perform inductive train/test split for GraphSAGE classification

Tomaž Bratanič

March 12, 2022 22:14

Let's say I have a network that consists of a single weakly connected component. From various papers I've seen that if you want to use inductive GNNs like GraphSAGE, it is advisable to split your train/test data into two separate graphs or components. Since I've seen that there are different approaches for node classification and link prediction tasks, I am specifically interested in node classification tasks, possibly multiclass classification. So the train/test split graphs would need to ensure some sort …

Topic: graph-neural-network graphs

Category: Data Science

Semi-supervised learning on graphs

Bruno Mello

March 9, 2022 22:31

I have the following semi-supervised problem: I have a graph of persons and their relations. Some of those persons have a predefined risk classification. The goal is to classify the risk of the other nodes. I know risk is kind of arbitrary, which is why I'm open to any ideas. For example, suppose I have a person with classification critical (10) and I want to find the risk classification of their neighborhood. I thought of doing something like: for every node, for every fixed …

Topic: semi-supervised-learning graphs

Category: Data Science
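
One simple baseline for this kind of problem is label propagation: keep the known scores fixed and repeatedly set every unlabeled node to the mean score of its neighbours. This is a swapped-in illustration, not the approach the question goes on to describe; the graph layout, the risk values, and the name `propagate_risk` are hypothetical.

```python
def propagate_risk(graph, known_risk, iterations=50):
    """Spread known risk scores to unlabeled nodes by neighbour averaging."""
    scores = {node: known_risk.get(node, 0.0) for node in graph}
    for _ in range(iterations):
        updated = {}
        for node, nbrs in graph.items():
            if node in known_risk or not nbrs:
                updated[node] = scores[node]  # labeled nodes stay clamped
            else:
                updated[node] = sum(scores[n] for n in nbrs) / len(nbrs)
        scores = updated
    return scores


# Hypothetical chain of persons: a - b - c - d, with a critical (10) and d safe (0).
toy_graph = {
    "a": {"b"},
    "b": {"a", "c"},
    "c": {"b", "d"},
    "d": {"c"},
}
risk = propagate_risk(toy_graph, known_risk={"a": 10.0, "d": 0.0})
```

On this chain the unlabeled nodes settle between the two known values, with the node closer to the critical person ending up with the higher score.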

Algorithms for Vertex or Node Correspondence

Matthew Crawford

February 28, 2022 22:37

Given a graph G, and another graph G’ with the same number of vertices, one can define a vertex correspondence function f from the vertex set of G to the vertex set of G’. The correspondence function f needs to be bijective, and its purpose is to give information about the relationship between the two graphs. For example, given two isomorphic graphs G and G’, the isomorphism itself would serve as the vertex correspondence function. I …

Topic: graphs dataset machine-learning

Category: Data Science

How to approach mapping families of vectors on a lattice and forecast resulting value

user305883

February 27, 2022 04:03

I describe here a model of how neighbours influence a node. I wish to implement it to attempt forecasting the values associated with nodes. I am posting here to ask for suggestions on mathematical models and machine learning techniques that may already take a similar approach but that I am not aware of, and for hints on their implementation (Python). Suppose you have a square 2D lattice (a grid of 9 squares for simplicity), and: for each time t from each cell …

Topic: regression graphs deep-learning neural-network time-series

Category: Data Science

What are graph embeddings?

Volka

February 26, 2022 07:47

I recently came across graph embeddings such as DeepWalk and LINE. However, I still do not have a clear idea of what is meant by graph embeddings and when to use them (applications). Any suggestions are welcome!

Topic: graphs

Category: Data Science

Fraud risk propagation in large scale network

Naveed

February 25, 2022 00:05

What's the best approach to graph analytics and risk propagation in a network using Python, where multiple accounts are connected through relationships and a few of the accounts in the network are marked as bad accounts while the rest are unknown? I tried using networkx, but it seems to run forever. I have about 8MM edges and 40K nodes.

Topic: networkx graphs python

Category: Data Science

Probability distributions for Directed Cyclic Graphs

Jonas Hjulstad

February 23, 2022 20:06

Given a directed cyclic graph where vertex A is 'infected', and there are different infection probabilities between each pair of nodes, what is the best approach to computing the conditional probability $p(F|A)$? Do I have to transform it into an acyclic graph and use Bayesian network methods? How would I proceed in order to design an algorithm for computing probabilities like this one, and are there approaches that are computationally feasible for very large networks?

Topic: bayesian bayesian-networks graphs

Category: Data Science
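
One option that sidesteps the cycles entirely is Monte Carlo simulation: repeatedly simulate the spread from A and count how often F ends up infected. The sketch assumes an independent-cascade style model in which each infected node gets exactly one chance to infect each out-neighbour; that model, the toy edge probabilities, and the function name are assumptions, since the question does not pin down the infection dynamics.

```python
import random


def estimate_infection_probability(edges, source, target, trials=10000, seed=0):
    """Estimate P(target infected | source infected) by simulation.

    edges maps each node to a list of (neighbour, transmission probability).
    """
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        infected = {source}
        frontier = [source]
        while frontier:
            node = frontier.pop()
            for nbr, prob in edges.get(node, []):
                # The infected set stops cycles from being traversed forever.
                if nbr not in infected and rng.random() < prob:
                    infected.add(nbr)
                    frontier.append(nbr)
        hits += target in infected
    return hits / trials


# Hypothetical cyclic graph A -> B -> C -> A, with C -> F.
toy_edges = {
    "A": [("B", 0.5)],
    "B": [("C", 0.5)],
    "C": [("A", 0.5), ("F", 0.5)],
}
p_f_given_a = estimate_infection_probability(toy_edges, "A", "F")

# Sanity checks on degenerate inputs: certain and impossible transmission.
certain = {"A": [("B", 1.0)], "B": [("A", 1.0), ("F", 1.0)]}
p_certain = estimate_infection_probability(certain, "A", "F", trials=100)
p_unreachable = estimate_infection_probability({"A": [("B", 1.0)]}, "A", "F", trials=100)
```

The cost per trial is linear in the number of edges reached, so the approach scales to large networks at the price of sampling error that shrinks as the number of trials grows.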
