I need an input file of 5-letter English words to train my Bayesian model to infer the stochastic dependency between letter positions. For instance, is the probability of a letter at position 5 dependent on the probability of a letter at position 1, etc.? Ultimately, I want to train this Bayesian network to be able to solve the Wordle game. What is Wordle? It’s a game where you guess 5 …
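To make the question concrete, here is a minimal sketch of the kind of positional dependence I want to measure: the empirical mutual information between two letter positions, computed over a word list (the word list below is a tiny hypothetical stand-in for the real input file).

```python
from collections import Counter
from math import log2

# Toy stand-in for the real 5-letter word list (hypothetical data).
words = ["crane", "slate", "trace", "crate", "stale", "least", "cares", "acres"]

def mutual_information(words, i, j):
    """Estimate mutual information (bits) between letters at positions i and j."""
    n = len(words)
    pi = Counter(w[i] for w in words)
    pj = Counter(w[j] for w in words)
    pij = Counter((w[i], w[j]) for w in words)
    mi = 0.0
    for (a, b), c in pij.items():
        p_ab = c / n
        mi += p_ab * log2(p_ab / ((pi[a] / n) * (pj[b] / n)))
    return mi

print(mutual_information(words, 0, 4))
```

A value near zero would suggest positions 0 and 4 are roughly independent; larger values indicate a dependency worth encoding as an edge in the Bayesian network.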
I'm trying to determine whether it's best to use linear or quadratic discriminant analysis for an analysis that I'm working on. It's my understanding that one of the motivations for using QDA over LDA is that it deals better with circumstances in which the variance of the predictors is not constant across the classes being predicted. This is true for my data, however I intend to carry out principal components analysis beforehand. Because this PCA will involve scaling/normalising the variables, …
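One way to see why the PCA scaling step doesn't settle the LDA-vs-QDA question: global standardization rescales each feature over the whole dataset, but it cannot make the *class-conditional* covariances equal, which is the assumption that separates LDA from QDA. A small simulated sketch (hypothetical data):

```python
import numpy as np

rng = np.random.default_rng(0)
# Two classes with very different within-class spread (hypothetical data).
a = rng.normal(0.0, 1.0, size=(1000, 2))
b = rng.normal(3.0, 5.0, size=(1000, 2))

x = np.vstack([a, b])
# Global standardization, as PCA preprocessing would do.
z = (x - x.mean(axis=0)) / x.std(axis=0)

za, zb = z[:1000], z[1000:]
# Per-class spreads are still unequal after global scaling.
print(za.std(axis=0), zb.std(axis=0))
```

So even after PCA with scaling, the classes can have different covariance structure, and QDA may still be the better fit.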
What I want to do is ensure that every property in a knowledge base comes from at least one source. I would like to ensure that every edge is spawned (or at least explained) by some event, like a "claim" or "measurement" or "birth." I'd like to rate on a scale the confidence that some property is correct, which could also be inherited from the source's confidence rating. Finally, I want to ensure that effective date(s) are known or …
I have some voting or polling data that is listed by voting district. I also have detailed demographics for each voting district. How can I combine these to estimate how the different demographic groups voted? I want to be able to make a chart of "percent yes" over age or income bracket. (In the end, I want to use these relations to try to predict the outcome in a place with different demographics.) One approach I've seen is …
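The simplest version of what I'm after may be ecological regression: regress the district-level "percent yes" on the district-level demographic shares, with no intercept, so the coefficients can be read as group-level voting rates. A sketch on simulated data (all numbers hypothetical):

```python
import numpy as np

rng = np.random.default_rng(1)
n_districts = 200
# Fraction of each district in two age brackets (hypothetical demographics).
young = rng.uniform(0.2, 0.8, n_districts)
old = 1.0 - young
# Simulate district-level "percent yes": young vote yes 30%, old 70%.
pct_yes = 0.3 * young + 0.7 * old + rng.normal(0, 0.01, n_districts)

# Ecological regression: no intercept, coefficients are group-level rates.
X = np.column_stack([young, old])
rates, *_ = np.linalg.lstsq(X, pct_yes, rcond=None)
print(rates)
```

This recovers the group rates here only because the simulation satisfies the constancy assumption; in real data the ecological-inference caveats apply.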
So my decoder is a transformer decoder, and in training I don't have any issue: I have all the input from the beginning, correctly masked. However, in inference I have to generate one token at a time, keep appending it to the target, and only stop when the most recently output token is <eos>. With a batch I find this difficult, because each sequence will end at a different point, and so I'd have to keep going with …
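The usual pattern, as I understand it, is to keep a per-row `finished` mask and run the loop until every row has emitted `<eos>` (or a length cap is hit), padding the finished rows. A minimal sketch with a hypothetical stand-in for the decoder step:

```python
import numpy as np

EOS = 0

def decode_step(tokens):
    """Hypothetical stand-in for the trained decoder: returns next-token
    ids for a batch given the tokens generated so far."""
    # Here each row simply emits EOS once it reaches a fixed target length.
    lengths = np.array([3, 5, 2])
    return np.where(tokens.shape[1] >= lengths, EOS, 1)

def greedy_decode(batch_size, max_len=10):
    tokens = np.ones((batch_size, 1), dtype=int)  # start token
    finished = np.zeros(batch_size, dtype=bool)
    while not finished.all() and tokens.shape[1] < max_len:
        nxt = decode_step(tokens)
        # Rows that already emitted <eos> keep receiving padding (EOS here),
        # so one loop serves the whole batch despite different end points.
        nxt = np.where(finished, EOS, nxt)
        finished |= nxt == EOS
        tokens = np.hstack([tokens, nxt[:, None]])
    return tokens

out = greedy_decode(3)
print(out)
```

The wasted computation on finished rows is the price of batching; some implementations also compact the batch by dropping finished rows.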
I am looking for an R package that does continuum regression. More concretely, I need a function that performs continuum regression such that I can evaluate the results afterwards. At least extracting the MSE or RMSE of the predictions should be possible.
Before doing principal component regression it is important to scale the data. But which data exactly? Is it enough if I just scale X, or do I have to scale the whole data set, containing X and Y (regressor and regressand)? The advantage of scaling just X is that I do not have to back-transform Y. But is this valid? What's the difference between scaling just X and scaling the whole data set?
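To make the "scale X only" option concrete, here is a sketch of PCR on simulated data (hypothetical): standardize X, take principal components via SVD, regress centered Y on the scores, and add the mean of Y back, so Y never needs any back-transform beyond re-adding its mean.

```python
import numpy as np

rng = np.random.default_rng(2)
X = rng.normal(size=(100, 3)) * [1.0, 10.0, 100.0]  # wildly different scales
y = X @ [2.0, 0.3, 0.05] + rng.normal(0, 0.1, 100)

# Scale X only; y stays in its original units.
Xs = (X - X.mean(0)) / X.std(0)
# PCA via SVD; all components kept in this sketch.
U, S, Vt = np.linalg.svd(Xs, full_matrices=False)
Z = Xs @ Vt.T  # principal-component scores

# Regress centered y on the scores; predictions need no back-transform.
yc = y - y.mean()
beta = np.linalg.lstsq(Z, yc, rcond=None)[0]
pred = Z @ beta + y.mean()
print(np.abs(pred - y).mean())  # small residual, in y's original units
```

Scaling Y as well only changes the units of the coefficients and predictions; it does not change which components are selected, since the PCA is computed on X alone.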
I need to calculate the statistical significance of the difference between two time series, each with 4500 terms. My null hypothesis is $H_0: \mu=\mu_0$. How can I calculate the p-value? Is the Z-statistic useful for p-value calculation? How do I get the p-value after computing the Z-statistic? I have $\alpha = 0.05$.
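Assuming the terms can be treated as independent (a strong assumption for a time series, where autocorrelation would invalidate the standard error), the mechanics I have in mind are: compute $z = (\bar{x}-\mu_0)/(s/\sqrt{n})$ and convert to a two-sided p-value via the standard normal. A stdlib-only sketch on simulated data:

```python
import math
import random

random.seed(0)
# Hypothetical series of 4500 values; H0: mu = mu0 = 0.
mu0 = 0.0
x = [random.gauss(0.1, 1.0) for _ in range(4500)]

n = len(x)
xbar = sum(x) / n
s = math.sqrt(sum((v - xbar) ** 2 for v in x) / (n - 1))
z = (xbar - mu0) / (s / math.sqrt(n))
# Two-sided p-value from the standard normal survival function:
# p = 2 * (1 - Phi(|z|)) = erfc(|z| / sqrt(2)).
p = math.erfc(abs(z) / math.sqrt(2))
print(z, p)
```

Reject $H_0$ when $p < \alpha = 0.05$. With 4500 terms the normal approximation to the t distribution is excellent, so the Z-statistic is fine, provided the independence assumption holds.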
This is a situation I have been running into frequently lately: customers give me a database with many tables that they don't quite understand either, then ask me to build a model to predict future revenue, classify which users may be valuable, or something else. To be honest, extracting useful data from an unknown database exhausts me. For example, I need to figure out which table is the user table, the product table, or the transaction table ... which columns can be used for joins (there …
Is it correct to say that the lower the p-value, the larger the difference between the two group means in the t-test? For example, if I apply the t-test to two groups of measurements A and B, and then to two groups of measurements B and C, and I find that the p-value in the first case is lower than in the second, could one possible interpretation be that the difference between the …
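One reason I suspect the answer is "no" in general: the p-value mixes effect size with sample size and variance. A sketch using a normal-approximation two-sample test (a z-test stand-in for the t-test, valid at these sample sizes) shows the same mean difference giving very different p-values:

```python
import math

def two_sample_p(mean_diff, sd, n):
    """Approximate two-sided p-value for a two-sample z-test with
    equal group sizes n and equal known standard deviation sd."""
    se = sd * math.sqrt(2.0 / n)
    z = mean_diff / se
    return math.erfc(abs(z) / math.sqrt(2))

# Same mean difference, different sample sizes:
print(two_sample_p(0.5, 1.0, 20))    # larger p
print(two_sample_p(0.5, 1.0, 2000))  # much smaller p, same difference
```

So a lower p-value for A-vs-B than for B-vs-C could reflect a larger mean difference, but equally a larger sample or smaller variance.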
I've been reading this blog (https://deepmind.com/blog/article/Causal_Bayesian_Networks) and am just getting into causal inference. I have a question regarding causal network graphs. Given data, how exactly are causal network graphs generated? How does the algorithm know which features are causal for other features?
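My current understanding is that constraint-based structure-learning algorithms (e.g. the PC algorithm) build the graph from conditional-independence tests rather than "knowing" causality directly. A toy illustration of the key signal, on a simulated chain X → Y → Z (hypothetical data): X and Z are correlated, but become independent once Y is conditioned on, which rules out a direct X–Z edge.

```python
import numpy as np

rng = np.random.default_rng(3)
n = 5000
# Simulated chain X -> Y -> Z.
x = rng.normal(size=n)
y = 2 * x + rng.normal(size=n)
z = 3 * y + rng.normal(size=n)

def partial_corr(a, b, c):
    """Correlation of a and b after linearly regressing out c."""
    ra = a - np.polyval(np.polyfit(c, a, 1), c)
    rb = b - np.polyval(np.polyfit(c, b, 1), c)
    return np.corrcoef(ra, rb)[0, 1]

print(np.corrcoef(x, z)[0, 1])   # strongly correlated
print(partial_corr(x, z, y))     # ~0: X independent of Z given Y
```

Note these tests alone only identify the graph up to an equivalence class; edge directions generally need extra assumptions or interventional data.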
I'm trying to make a mobile app for image recognition (a computer vision application). Does anyone know whether modern smartphones have enough processing power/memory to recognize, say, about 1 million classes from their real-time camera feed (30 fps)? (On-device inference or YOLO neural net concepts needed.) What is the maximum number of classes a mobile device can reliably classify? Some insight into figuring out computational loads/times in general would also be helpful.
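For the "computational loads" part, a back-of-envelope budget is easy to sketch; every number below is a rough assumption, not a measured figure:

```python
# Back-of-envelope compute budget (all numbers are rough assumptions).
flops_per_inference = 5e9    # e.g. a mid-sized CNN backbone, ~5 GFLOPs
fps = 30
required = flops_per_inference * fps   # FLOPs per second needed
phone_budget = 1e12                    # ~1 TFLOPS assumed for a modern mobile NPU
print(required / phone_budget)         # fraction of the budget consumed

# The classifier head is often the real problem at 1M classes:
classes, feat = 1_000_000, 1280        # feature dim assumed (MobileNet-like)
head_params = classes * feat           # weights in the final layer alone
print(head_params)                     # ~1.3 billion parameters
```

Under these assumptions the backbone compute fits, but a 1M-way softmax head alone would be on the order of a billion parameters, which is why large-vocabulary systems usually use retrieval/embedding approaches instead of a flat classifier.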
I am a complete beginner in deep learning and am playing with a voice cloning project. I trained on my dataset and used the trained model to synthesize some sentences, and was surprised to get a very different output each time I ran the synthesis (outputs ranging from very good quality to very poor, with unintelligible content). I understood that this was due to the initial state of the model being set up randomly via a random seed, but in …
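If the variation really does come from the random seed, fixing the seed at the start of each synthesis run should make the output reproducible. A minimal sketch with stdlib and NumPy (for a PyTorch/TensorFlow model you would additionally seed that framework, e.g. `torch.manual_seed(seed)`):

```python
import random
import numpy as np

def set_seed(seed: int):
    """Fix the RNG state so repeated runs produce identical outputs."""
    random.seed(seed)
    np.random.seed(seed)

set_seed(42)
a = np.random.randn(3)   # stand-in for one "synthesis" run
set_seed(42)
b = np.random.randn(3)   # re-seeded: identical result
print(np.allclose(a, b))  # True
```

Note this makes runs repeatable but does not by itself explain the quality spread; it only lets you pin down and reuse a seed that gives a good output.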
Can a framework use both the CPU and GPU in parallel for model inference? It seems possible, but I'm wondering whether any of the frameworks like TensorFlow or PyTorch have done this. To explain further: can we use the CPU to execute part of the model graph while another parallel subgraph uses the GPU for the same inference?
I have recently started studying GNNs. I have covered GCN and GraphSage so far, but I am confused about what happens at test time. Suppose that in the graph above I split the nodes into train and test sets as shown in the figure, and that I am using the GraphSage model for a supervised node-classification task. During training I provide the sub-graph with the blue nodes, and the weights (parameters) get calculated using the neighbourhood information of the …
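My tentative understanding of why test nodes pose no special problem for GraphSAGE: the learned parameters are a shared weight matrix applied to aggregated neighbour features, not per-node embeddings, so at test time the same trained weights are simply applied to the test nodes' neighbourhoods. A toy mean-aggregator sketch (graph, features, and weights all hypothetical):

```python
import numpy as np

# Toy graph: 5 nodes; imagine nodes 0-2 are train, 3-4 are test.
A = np.array([[0, 1, 1, 0, 0],
              [1, 0, 1, 1, 0],
              [1, 1, 0, 0, 1],
              [0, 1, 0, 0, 1],
              [0, 0, 1, 1, 0]], dtype=float)
X = np.eye(5)                 # one-hot node features
W = np.full((5, 4), 0.1)      # "trained" weight matrix (fixed here)

def sage_layer(A, X, W):
    """One GraphSAGE-style layer: mean of neighbour features,
    then a shared linear transform and ReLU. The same W applies
    to nodes never seen during training."""
    deg = A.sum(1, keepdims=True)
    H = (A @ X) / np.maximum(deg, 1)
    return np.maximum(H @ W, 0)

H = sage_layer(A, X, W)       # embeddings for ALL nodes, train and test
print(H.shape)
```

Because only `W` is learned, the model is inductive: test (or entirely new) nodes get embeddings from their own neighbourhoods without retraining.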
I have been looking into outputting a model explainer artefact at training time for my Keras+TensorFlow neural network. LIME seems like a great choice; however, my data is very big and I read it from disk one batch at a time, as it is impractical and inefficient to store in memory. LIME appears to require the whole training dataset as input in order to create a surrogate model. Is it appropriate to use only a sample …
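If a sample turns out to be acceptable, one way to build it without ever holding the full dataset in memory is reservoir sampling over the batch stream. A sketch, with a hypothetical batch generator standing in for my disk reader:

```python
import numpy as np

rng = np.random.default_rng(4)

def batches():
    """Hypothetical stand-in for reading the training set from disk
    one batch at a time."""
    for _ in range(100):
        yield rng.normal(size=(64, 10))

def reservoir_sample(batch_iter, k=500):
    """Uniform random sample of k rows from a stream too large for memory."""
    sample, seen = [], 0
    for batch in batch_iter:
        for row in batch:
            seen += 1
            if len(sample) < k:
                sample.append(row)
            else:
                j = rng.integers(0, seen)
                if j < k:
                    sample[j] = row
    return np.array(sample)

background = reservoir_sample(batches())
print(background.shape)  # small enough to hand to the explainer
```

The resulting array could then serve as the background/training data argument for the explainer, on the assumption that a uniform sample preserves the feature distributions LIME perturbs around.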
I use the TensorFlow C++ API and I have a TensorFlow model to which I give some inputs. There is a parameter called max_position_embeddings that determines the maximum acceptable input dimensions. When I give a very long input for inference I get the exception: {{function_node __inference__inference_10663}} {{function_node __inference__inference_10663}} indices[0,2048] = 2049 is not in [0, 2049) [[{{node decoder/position_embeddings/Gather}}]] [[StatefulPartitionedCall/StatefulPartitionedCall]] So far, everything is normal. However, after catching this exception, I try a second inference with a small input and again …
https://en.wikipedia.org/wiki/Causal_model#Definition Wikipedia defines causal models as: an ordered triple $\langle U, V, E\rangle$, where $U$ is a set of exogenous variables whose values are determined by factors outside the model; $V$ is a set of endogenous variables whose values are determined by factors within the model; and $E$ is a set of structural equations that express the value of each endogenous variable as a function of the values of the other variables in $U$ and $V$. I'm confused what the …
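To anchor the definition, here is one minimal concrete instance of the triple (my own toy example, the classic sprinkler setup, not from the Wikipedia article):

```latex
% A minimal concrete causal model $\langle U, V, E \rangle$:
% one exogenous variable (rain) and two endogenous ones.
\begin{align*}
U &= \{\,\text{rain}\,\} \\
V &= \{\,\text{sprinkler},\ \text{wet}\,\} \\
E &:\quad \text{sprinkler} = 1 - \text{rain}, \qquad
         \text{wet} = \max(\text{rain},\ \text{sprinkler})
\end{align*}
```

Here rain is set by factors outside the model, while each endogenous variable has exactly one structural equation giving its value in terms of the others.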
I am looking into the seq2seq model in Keras, for example, this blog post from Keras or this. All the examples I have seen build a separate inference model that mirrors the original model, and that inference model is then used to make the predictions. My question is: why can't we just use model.predict()? I mean, we can, because I have used it and it works, but what is the difference between these two approaches? Is it wrong to use model.predict() and …
I have trained a VAE to generate style-transferred sentences, from a negative sentence to a positive one. The underlying concept of a VAE tells us that the latent code is sampled randomly, from a distribution whose mean and variance are computed from the original input. However, with my trained VAE, I am observing that at test time it generates the same output (style-transferred sentence) for a given input sentence, no matter how many times I test. My question is: …
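One candidate explanation I'm considering: many VAE inference pipelines skip the sampling step at test time and decode directly from the mean of the posterior, which makes the output fully deterministic. A toy numeric sketch of the two modes (mean and log-variance values are hypothetical):

```python
import numpy as np

rng = np.random.default_rng(5)
mu = np.array([0.5, -1.0])       # encoder mean for some input sentence
logvar = np.array([-2.0, -2.0])  # encoder log-variance

def encode_sample(mu, logvar):
    """Reparameterized sample z = mu + sigma * eps, as used in training."""
    eps = rng.normal(size=mu.shape)
    return mu + np.exp(0.5 * logvar) * eps

z_test = mu                          # deterministic: decode from the mean
z_train = encode_sample(mu, logvar)  # stochastic: different every call
print(z_test, z_train)
```

If my synthesis code feeds `mu` straight to the decoder (or if the learned variance has collapsed to near zero), identical outputs on every run would be the expected behaviour rather than a bug.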