Confusion regarding which distribution Monte Carlo considers for sampling

Considering Bayesian posterior inference, which distribution does Monte Carlo sampling draw samples from: the posterior or the prior? The posterior is intractable because the denominator (the evidence) is an integral over all possible theta values. So, if Monte Carlo samples from the posterior distribution, I am confused as to how the posterior can be sampled when it is intractable. Could someone please explain to me what I am missing? If Monte Carlo samples from the prior distribution, how do the samples approximate the posterior distribution?
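For context, MCMC methods resolve exactly this tension: they produce samples distributed according to the posterior while only ever evaluating the unnormalized posterior (prior times likelihood), because the intractable evidence cancels in the Metropolis acceptance ratio. A minimal pure-Python sketch, using an illustrative Beta-Bernoulli model (the model and tuning constants are assumptions for demonstration):

```python
import math
import random

random.seed(0)

# Toy model: 7 successes in 10 Bernoulli trials, uniform Beta(1,1) prior,
# so the exact posterior is Beta(8,4). Metropolis only ever uses *ratios*
# of prior x likelihood, so the evidence term cancels and is never computed.
def log_unnorm_posterior(theta):
    if not 0.0 < theta < 1.0:
        return -math.inf
    return 7 * math.log(theta) + 3 * math.log(1.0 - theta)

theta, samples = 0.5, []
for _ in range(20000):
    proposal = theta + random.gauss(0.0, 0.1)   # random-walk proposal
    if math.log(random.random()) < (log_unnorm_posterior(proposal)
                                    - log_unnorm_posterior(theta)):
        theta = proposal                         # accept the move
    samples.append(theta)

burned = samples[5000:]                          # discard burn-in
posterior_mean = sum(burned) / len(burned)       # exact answer: 8/12 ~ 0.667
```

The chain's stationary distribution is the posterior, so the retained samples approximate posterior expectations even though the posterior density was never normalized.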
Category: Data Science

Why not use max(returns) instead of average(returns) in off-policy Monte Carlo control?

As I understand it, in reinforcement learning, off-policy Monte Carlo control estimates the state-action value function $Q(s,a)$ as a weighted average of the observed returns. However, in Q-learning, $Q(s, a)$ is estimated from the maximum expected return. Why is this not used in Monte Carlo control? Suppose I have a simple 2-dimensional bridge game, where the objective is to get from a to b. I can move left, right, up or down. Let's say …
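For reference, the standard off-policy Monte Carlo control update (as in Sutton and Barto, Ch. 5) maintains Q as an incrementally updated weighted average of returns. A minimal sketch, where `sa` is a (state, action) key and `W` the importance-sampling weight:

```python
# Incremental weighted-importance-sampling update from off-policy Monte
# Carlo control. Q converges to a weighted *average* of observed returns;
# replacing the average with max(returns) would keep only the single
# luckiest rollout and bias the estimate upward in noisy environments.
def mc_update(Q, C, sa, G, W):
    """One update for state-action pair sa with return G and weight W.
    Q holds value estimates, C holds cumulative weights (both dicts)."""
    C[sa] = C.get(sa, 0.0) + W
    Q[sa] = Q.get(sa, 0.0) + (W / C[sa]) * (G - Q.get(sa, 0.0))
    return Q[sa]
```

For example, two equally weighted returns of 1 and 0 average to 0.5, whereas taking the max would report 1 even if that first return was pure luck.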
Category: Data Science

Action-value estimation of deterministic policies with Monte Carlo method

In the Monte Carlo-based action-value estimation problem for a deterministic policy (estimation of $q_{\pi}(s,a)$), the estimation problem seems not to be well-defined, because $q_{\pi}(s,a)$ by definition means the value of an arbitrary action $a$ at a given state $s$ when the initial action $a$ is applied at that state and the policy $\pi$ is followed at subsequent states. But, in a real application under a given deterministic policy $\pi$, how can you choose the initial action $a$ arbitrarily at state …
Category: Data Science

How to perform a Monte Carlo simulation with continuous sampling using discrete quantiles?

Assume I have recorded the duration of 10 tasks and built the table below from this data:

Duration   Number of tasks
4 days     5 tasks
6 days     2 tasks
8 days     2 tasks
10 days    1 task

Looking at this table, one can easily conclude that there's a 50% chance that a task lasts 4 days. Therefore, my Monte Carlo simulation will yield "4 days" as the task duration 50% of the time. However, there's …
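One standard way to get continuous draws from such a table is to treat it as an empirical CDF and linearly interpolate between its quantile points (an assumption about how durations between the observed values behave). A pure-Python sketch using the table's numbers:

```python
import bisect
import random

random.seed(1)

durations = [4, 6, 8, 10]    # observed task durations (days)
counts    = [5, 2, 2, 1]     # how many tasks had each duration
total = sum(counts)

# Cumulative probabilities at each observed duration: 0.5, 0.7, 0.9, 1.0
cum, acc = [], 0
for c in counts:
    acc += c
    cum.append(acc / total)

def sample_duration():
    """Draw a continuous duration by linearly interpolating the empirical
    CDF instead of returning only the discrete observed values."""
    u = random.random()
    i = bisect.bisect_left(cum, u)
    lo_p, lo_d = (0.0, durations[0]) if i == 0 else (cum[i - 1], durations[i - 1])
    hi_p, hi_d = cum[i], durations[i]
    return lo_d + (u - lo_p) / (hi_p - lo_p) * (hi_d - lo_d)

draws = [sample_duration() for _ in range(10000)]
```

Note that with this scheme, draws below the 0.5 quantile still collapse to exactly 4 days; extending the interpolation below the smallest observation would be a further modeling choice.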
Category: Data Science

Monte Carlo Markov Chain

I was trying to figure out what a Monte Carlo Markov Chain is. From what I understand, it is a way of computing an approximation of a probability distribution that cannot be computed exactly. So we keep sampling from a probability distribution in order to be more accurate, reducing the variance of the samples by increasing the number of samples, and these samples are given by Gibbs sampling. This step-by-step process is a Markov chain, but I don't really get the details …
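Gibbs sampling, one concrete MCMC method, builds the Markov chain by repeatedly drawing each variable from its full conditional given the others. A minimal sketch on an illustrative bivariate normal with correlation rho, where both conditionals are themselves one-dimensional normals:

```python
import math
import random

random.seed(2)

# Target: bivariate standard normal with correlation rho. Its full
# conditionals are x|y ~ N(rho*y, 1-rho^2) and symmetrically for y|x,
# so Gibbs sampling alternates two one-dimensional Gaussian draws.
rho = 0.8
x, y = 0.0, 0.0
xs = []
for _ in range(20000):
    x = random.gauss(rho * y, math.sqrt(1 - rho ** 2))
    y = random.gauss(rho * x, math.sqrt(1 - rho ** 2))
    xs.append(x)

# The chain's samples approximate the marginal of x, which is N(0, 1).
mean_x = sum(xs) / len(xs)
var_x = sum((v - mean_x) ** 2 for v in xs) / len(xs)
```

Each step depends only on the current (x, y), which is exactly why the sequence is a Markov chain; averaging over many steps reduces the variance of the estimates.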
Category: Data Science

How do I find the optimal dropout rate for Monte Carlo Dropout?

I have a text classifier with 3 dropout layers. I tried to use the Monte Carlo Dropout (MCD) technique to improve its performance; however, its performance hasn't improved. MCD improved performance when classifying hand-written digits on the MNIST dataset. Now I wonder whether there is simply no room for improving my text classifier, or whether I have selected an incorrect dropout rate. How do I find the optimal dropout rate for Monte Carlo Dropout? In particular: Should I use the same dropout rate during both …
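There is no closed-form optimal rate; in the usual MC Dropout setup (Gal and Ghahramani), the dropout rate is treated as a hyperparameter and grid-searched on a validation set, with the same rate used during training and during the stochastic test-time passes. A toy pure-Python sketch of the test-time averaging itself, where the single-layer "model", its weights, and the rate are illustrative assumptions:

```python
import random

random.seed(3)

def predict_stochastic(x, weights, p_drop):
    """One forward pass with dropout left ON at inference time -- the
    core of MC Dropout. Toy single-layer linear model for illustration."""
    s = 0.0
    for w, xi in zip(weights, x):
        if random.random() >= p_drop:        # keep this unit
            s += w * xi / (1.0 - p_drop)     # inverted-dropout scaling
    return s

weights = [0.5, -0.2, 0.8]   # illustrative trained weights
x = [1.0, 2.0, 3.0]          # one input example

T = 5000                     # number of stochastic forward passes
preds = [predict_stochastic(x, weights, 0.3) for _ in range(T)]
mc_mean = sum(preds) / T                               # MC Dropout prediction
mc_var = sum((p - mc_mean) ** 2 for p in preds) / T    # uncertainty estimate
```

Sweeping `p_drop` over a grid and picking the value that maximizes validation accuracy (or calibration) is the common way to choose the rate.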
Topic: monte-carlo
Category: Data Science

How to resolve IndexError while doing Monte Carlo for 1000 runs?

Below code runs without any problem; however, when I run the same code in a Monte Carlo analysis for 1000 runs, it gives an IndexError. Can someone explain why this happens? Thanks

X = df1.drop("Gender", axis = 1)
y = df1.Gender
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2)
nb = CategoricalNB()
nb.fit(X_train, y_train)
nb_pred = nb.predict(X_test)
nb_accuracy = accuracy_score(y_test, nb_pred)
nb_accuracy

output: 0.6279486413854882

X = df1.drop("Gender", axis = 1)
y = df1.Gender
for i in range(1000):
    X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state = i)
    # CategoricalNB Naive Bayes Model
    nb = …
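A common cause (an assumption here, since the full traceback isn't shown): across 1000 random splits, sooner or later `train_test_split` leaves a category value in the test set that never appeared in the training set, and `CategoricalNB` then indexes past the end of its per-category count table. A stripped-down reproduction of that failure mode in pure Python:

```python
# Minimal reproduction of the suspected failure mode: a lookup table
# sized by the categories seen in *training* breaks when the test
# split contains an unseen category value.
def fit_counts(train):
    table = [0] * (max(train) + 1)   # sized only for categories seen in training
    for v in train:
        table[v] += 1
    return table

table = fit_counts([0, 0, 1, 1])     # category 2 never seen in this split
try:
    _ = table[2]                     # "predict" on a test row with category 2
    failed = False
except IndexError:
    failed = True
```

If this is indeed the cause, stratifying the split on rare feature values, or passing `min_categories` to `CategoricalNB` (available in newer scikit-learn versions), usually resolves it.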
Category: Data Science

Evaluating a trained Reinforcement Learning Agent?

I am new to reinforcement learning agent training. I have read about the PPO algorithm and used the stable-baselines library to train an agent with PPO. My question is: how do I evaluate a trained RL agent? For a regression or classification problem I have metrics like r2_score or accuracy; are there any such metrics here, and how do I test the agent and conclude whether it is trained well or badly? Thanks
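The usual recipe is to roll the trained policy out for N evaluation episodes (with exploration off) and report the mean and spread of the episodic returns, which play the role that accuracy or r2_score play in supervised learning. A sketch with a stand-in environment (the `run_episode` callable and its reward scale are illustrative assumptions, not the stable-baselines API):

```python
import random

random.seed(8)

def evaluate(policy, run_episode, n_episodes=100):
    """Run n_episodes rollouts of `policy` and summarize the returns."""
    returns = [run_episode(policy) for _ in range(n_episodes)]
    mean = sum(returns) / n_episodes
    std = (sum((r - mean) ** 2 for r in returns) / n_episodes) ** 0.5
    return mean, std

# Toy stand-in: this "environment" yields ~10 reward per episode.
mean_return, return_std = evaluate(None, lambda p: 10.0 + random.gauss(0.0, 1.0))
```

Stable-baselines ships a similar helper (`evaluate_policy` in `common.evaluation`; check your version). Comparing mean return against a random policy and against any known upper bound for the task is a practical way to judge "trained well or badly".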
Category: Data Science

Is it a good idea to use the mean and standard deviation of coefficients from other models as my prior in Bayesian Regression?

I have a dataset that I've been playing around with for school. I have gotten very good results with a bunch of methods (Ridge, Lasso, ElasticNet, SVM, Bagging, Stacking and even NN). Now I have a range of different coefficients for my predictors; is it a good idea to use them as my priors (I did so, and I think the result has been OK), or should I use noninformative priors instead? If it is a bad idea, could you explain …
Category: Data Science

MCMC algorithm -- understanding some parameters

I am trying to understand an MCMC program. I manage to run it, but I am trying to understand the meaning of some parameters in the analysis. The code is something like this:

# Nsamples
nsamp = 50000
# Burn-in
skip = 300
# temperature at which to sample
temp = 2
# Gelman-Rubin for convergence
GRstop = 0.01
# every number of steps check the GR criteria
checkGR = 500
# 1 if single cpu, otherwise it is given by nproc -> mpi -np # …
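Of these, `GRstop` is a Gelman-Rubin convergence threshold: sampling stops once the R-hat statistic across chains is within 0.01 of 1, checked every `checkGR` steps. A simplified sketch of how R-hat is computed, assuming m chains of equal length (the demo chains are synthetic):

```python
import random

def gelman_rubin(chains):
    """Simplified Gelman-Rubin R-hat for m same-length chains.
    Values near 1 indicate the chains have mixed (converged)."""
    m, n = len(chains), len(chains[0])
    means = [sum(c) / n for c in chains]
    grand = sum(means) / m
    B = n / (m - 1) * sum((mu - grand) ** 2 for mu in means)   # between-chain variance
    W = sum(sum((x - mu) ** 2 for x in c) / (n - 1)
            for c, mu in zip(chains, means)) / m               # within-chain variance
    var_plus = (n - 1) / n * W + B / n                         # pooled variance estimate
    return (var_plus / W) ** 0.5

random.seed(10)
# Four chains targeting the same distribution -> R-hat close to 1.
mixed = [[random.gauss(0.0, 1.0) for _ in range(1000)] for _ in range(4)]
# Four chains stuck in different regions -> R-hat far above 1.
split = [[random.gauss(5.0 * i, 1.0) for _ in range(1000)] for i in range(4)]
r_mixed = gelman_rubin(mixed)
r_split = gelman_rubin(split)
```

The program presumably keeps drawing batches of `checkGR` samples until R-hat falls below 1 + GRstop (how exactly the threshold is applied is an assumption; check the program's source).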
Category: Data Science

Pull Random Numbers from my Data (Python)

Let's imagine I have a series of numbers that represents cash flows into some account over the past 30 days in some time window. This data is non-normal, but it does represent some distribution. I would like to pull "new" numbers from this distribution in an effort to create a Monte Carlo simulation based on the numerical data I have. How can I accomplish this? I've seen methods where you assume the data is normal and pull numbers based on some …
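Two common non-parametric options that avoid the normality assumption are a plain bootstrap (resampling the observed values) and a smoothed bootstrap, which is equivalent to drawing from a Gaussian kernel density estimate fitted to the data. A sketch with made-up cash-flow numbers; the bandwidth is an assumption you would tune (e.g. via Silverman's rule):

```python
import random

random.seed(5)

# Stand-in for the 30 observed daily cash flows.
cash_flows = [120.0, 95.5, 210.0, 87.0, 150.5, 99.0, 175.0, 110.0,
              130.0, 90.0]

def draw(data, bandwidth=5.0):
    """Smoothed bootstrap: resample an observed value, then jitter it
    with Gaussian noise -- one draw from a Gaussian KDE of the data."""
    return random.choice(data) + random.gauss(0.0, bandwidth)

simulated = [draw(cash_flows) for _ in range(10000)]
```

With `bandwidth=0` this reduces to the plain bootstrap; larger bandwidths smooth the empirical distribution more, at the cost of inflating its variance.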
Category: Data Science

In first-visit Monte Carlo, are we assuming the environment is the same over episodes?

Watching this video (11:30) that presents the simplest algorithm for reinforcement learning, Monte Carlo policy evaluation, which says in general:

The first time a state is visited:
- increment N(s): N(s) = N(s) + 1
- increment the state's total return by the current episode's return: S(s) = S(s) + G_t

The state's value is estimated by the mean return over many episodes, V(s) = S(s) / N(s); by the law of large numbers, V(s) --> V_true(s) as N(s) --> inf.

My question is - should the environment …
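The law-of-large-numbers argument does rest on stationarity: the environment's dynamics may be stochastic, but the returns for a state must be drawn from the same fixed distribution in every episode for S(s)/N(s) to converge to the true value. A toy sketch of the bookkeeping, with an assumed two-state episode and noisy returns:

```python
import random

random.seed(6)

N, S = {}, {}   # visit counts and return sums per state

def first_visit_update(episode_states, returns):
    """episode_states[i] is the state at step i, returns[i] the return
    G_t from that step; first-visit MC counts each state once per episode."""
    seen = set()
    for s, G in zip(episode_states, returns):
        if s in seen:
            continue
        seen.add(s)
        N[s] = N.get(s, 0) + 1
        S[s] = S.get(s, 0.0) + G

for _ in range(5000):
    # Toy episodes: s0 -> s1 with a true value of 1.0 plus stationary noise.
    noise = random.gauss(0.0, 0.3)
    first_visit_update(["s0", "s1"], [1.0 + noise, 1.0 + noise])

V = {s: S[s] / N[s] for s in N}   # converges to the true value per state
```

If the return distribution drifted across episodes, V would converge to a mixture over environments rather than any single environment's value.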
Category: Data Science

How can I build a simulation environment that assess different risk policies?

I work in fin-tech and would like to build some sort of simulation program to assess how different inputs will impact net revenue. For example, if we create new policies based on ML scores, how would those have impacted our loss and revenue metrics? While we can and do run online experiments, it would be desirable to simulate these impacts ahead of time. Aside from something like reinforcement learning, I was thinking that Monte Carlo simulations might be the best …
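A Monte Carlo simulation of a score-threshold policy can be as simple as replaying (or simulating) applicants, applying the candidate policy, and accumulating revenue. A back-of-envelope sketch where all distributions and dollar figures are illustrative assumptions, including the (strong) assumption that the ML score is a calibrated default probability:

```python
import random

random.seed(9)

def simulate(threshold, n_applicants=100000):
    """Average net revenue per applicant under a 'decline if score >=
    threshold' policy. Scores, loss, and income figures are made up."""
    revenue = 0.0
    for _ in range(n_applicants):
        score = random.random()                      # ML default-risk score in [0, 1]
        if score >= threshold:
            continue                                 # declined: no revenue, no loss
        defaulted = random.random() < score          # assumes the score is calibrated
        revenue += -1000.0 if defaulted else 150.0   # charge-off vs. interest income
    return revenue / n_applicants

loose = simulate(0.9)   # permissive policy: approves most applicants
tight = simulate(0.2)   # strict policy: approves only low-risk applicants
```

Replaying historical applications through the candidate policy (instead of simulating scores) gives the counterfactual "how would this policy have performed" estimate, subject to the caveat that you only observe outcomes for historically approved applicants.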
Category: Data Science

Test data for statistical t-test in Python

First of all, sorry if this is not the proper place to ask, but I have been trying to create some dummy variables in order to run a Student's t-test as well as a Welch t-test, and then run a Monte Carlo simulation. The problem is, I am only given the sample size and standard deviation of the 2 populations. How can I go about creating some sort of representation of this data in order for me to run these tests? I wish …
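Note that the t statistics themselves need only summary statistics, and dummy samples matching those summaries can be generated for the Monte Carlo part. A sketch in pure Python; the means below are placeholders (the question gives only n and sd), so substitute your actual values:

```python
import math
import random

random.seed(7)

# Assumed summary statistics -- the means here are illustrative
# placeholders you would replace with the real ones:
n1, mean1, sd1 = 30, 10.0, 2.0
n2, mean2, sd2 = 35, 11.0, 3.0

# Welch's t statistic is computable directly from the summaries:
t = (mean1 - mean2) / math.sqrt(sd1 ** 2 / n1 + sd2 ** 2 / n2)

# For the Monte Carlo simulation, generate dummy normal samples that
# match the given sizes and standard deviations:
sample1 = [random.gauss(mean1, sd1) for _ in range(n1)]
sample2 = [random.gauss(mean2, sd2) for _ in range(n2)]
```

For p-values, `scipy.stats.ttest_ind` (with `equal_var=False` for Welch) can be run on the generated samples, or `ttest_ind_from_stats` directly on the summaries.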
Category: Data Science

Having a reward structure which gives high positive rewards compared to the negative rewards

I am training an RL agent using the PPO algorithm for a control problem. The objective of the agent is to maintain the temperature in a room. It is an episodic task with an episode length of 9 hrs and a step size (action being taken) of 15 mins. During the training of the agent, from a given state, the agent takes an action. Then I check the temperature of the room after 15 mins (the step size), and if this temperature is within limits, I give the action …
Category: Data Science

How to handle differences between training and deploying of an RL agent

Hi, I am training an RL agent for a control problem. The objective of the agent is to maintain the temperature in a zone. It is an episodic task with an episode length of 10 hrs and actions being taken every 15 mins. Ambient weather is one of the state variables during training. For the training process, a profile of ambient temperature has been generated for each hour of the day and used for training. I have trained the agent using PPO …
Category: Data Science

Different results every time I train a reinforcement learning agent

I am training an RL agent for a control problem using the PPO algorithm. I am using the stable-baselines library for it. The objective of the agent is to maintain a temperature of 24 deg in a zone, and it takes actions every 15 mins. The length of an episode is 9 hrs. I have trained the model for 1 million steps and the rewards have converged, so I assume that the agent is trained enough. I have done some experiments and have a few questions …
Category: Data Science

Best Method for Data Analysis on 100 numerical IVs and 200 numerical DVs

I think I might need the help of this valuable community for a task. I have been given a dataset with 100 numerical independent variables (IVs) that predict outputs for 200 numerical dependent variables (from Monte Carlo simulation results). Which statistical technique should I start exploring and trying on my dataset? The number of observations can be increased for more points to enhance an algorithm's learning. From this, I would like to learn a few insights, such as multiple collinearity …
Category: Data Science

About

Geeks Mental is a community that publishes articles and tutorials about Web, Android, Data Science, new techniques and Linux security.