simulation

Similarity Measure of Simulated Time Series vs Observed Time Series

Wiesel

February 23, 2022, 05:02

In my work I have an observed time series and simulated ones. I want to compare the light curves and check for similarity to find out which simulated curve fits best, that is, which parameters simulate the light curve best. At the moment I do this with the cross-correlation function from NumPy. But I am not sure that is the best option, because the light curve with the highest cross-correlation coefficient does not always look like the …

Topic: simulation correlation time-series similarity statistics

Category: Data Science
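
One possible reason for the mismatch described in this question, offered as an assumption rather than something the excerpt states, is that a correlation coefficient ignores amplitude and offset. The sketch below uses made-up curves (`observed`, `sim_scaled`, `sim_close` are hypothetical, not the asker's data) and compares a Pearson coefficient with a root-mean-square error.

```python
import math


def pearson(a, b):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)


def rmse(a, b):
    """Root-mean-square difference between two equal-length sequences."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)) / len(a))


# Hypothetical "observed" light curve.
observed = [math.sin(0.2 * t) for t in range(100)]
# Same shape but wrong amplitude and offset: correlation cannot see this.
sim_scaled = [5.0 * v + 3.0 for v in observed]
# Nearly right everywhere, with a small extra ripple.
sim_close = [v + 0.05 * math.cos(1.3 * t) for t, v in enumerate(observed)]

for name, sim in (("sim_scaled", sim_scaled), ("sim_close", sim_close)):
    print(name, "pearson:", pearson(observed, sim), "rmse:", rmse(observed, sim))
```

In this toy case the rescaled curve has the higher correlation even though the other curve lies much closer to the observation, so a distance-based measure may be worth computing alongside the correlation.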

Stack as many industrial components as possible in a crate

Xiiryo

January 25, 2022, 10:42

The exact problem is a crate of industrial parts, made by injection molding in very high quantities. The objective is to put as many parts as possible in one crate. This is done by a small robotic arm that takes the part from the injection molding machine, cools it a little, and puts it in the crate. The shape of the parts can be a bit complex. It can be basically anything that can be molded into a 2 parts …

Topic: 3d-object-detection openai-gym genetic-algorithms simulation reinforcement-learning

Category: Data Science

How to interpret data projected on the sum of the first few principal components weighted by eigenvalues?

Phil

December 21, 2021, 09:01

I have simulation time series data of a molecule from molecular dynamics, and I want to visualize the very high-dimensional trajectory in two dimensions and also identify some clusters. The problem is that when I do PCA, the first 20 eigenvectors are needed to explain 80% of the variance. Is it possible to add the first 10 eigenvectors to get a single vector V1, and add up the 11th to 20th vectors as V2, with all components weighted by their eigen …

Topic: pca simulation dimensionality-reduction

Category: Data Science

Building a simulator for continuous state, discrete action reinforcement learning

user126806

October 22, 2021, 01:12

I am trying to build a simulator that optimizes the performance and temperature of a device. I want the device to perform well, but without making the device too hot. If the device becomes too hot, I want the internal circuitry to push down the device performance to reduce the temperature. It is hard to perform repeated ground truth experiments on the device so I need to build a simulator in which to train the agent. I am new to …

Topic: simulation reinforcement-learning

Category: Data Science

Any books or resources about how to approach "purely synthetic expressions" of physical phenomena?

mavavilj

September 2, 2021, 06:16

Over and over again I come to think that "it's cumbersome to collect empirical data". Yet it's often viewed as a necessity for explaining empirical phenomena. But then I idealize: it would be so nice if I could describe a phenomenon simply by describing a simple model with variable parameters and then generate instances from it to describe the empirical phenomenon. But I've been puzzled particularly by the validation phase of this, since often "validation" means "to compare to …

Topic: validation simulation

Category: Data Science

How to find out if two datasets are close to each other?

Kartikeya Sharma

July 10, 2021, 20:26

I have the following three datasets.
data_a=[0.21,0.24,0.36,0.56,0.67,0.72,0.74,0.83,0.84,0.87,0.91,0.94,0.97]
data_b=[0.13,0.21,0.27,0.34,0.36,0.45,0.49,0.65,0.66,0.90]
data_c=[0.14,0.18,0.19,0.33,0.45,0.47,0.55,0.75,0.78,0.82]
data_a is real data and the other two are simulated. I am trying to check which one (data_b or data_c) most closely resembles data_a. Currently I am doing it visually and with the ks_2samp test (Python). Visually, I graphed the CDF of the real data against the CDF of the simulated data and tried to see which one is closest. Above is the cdf of data_a …

Topic: simulation visualization python statistics

Category: Data Science
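
The comparison described in this question can be sketched without any plotting. The helper below, `ks_statistic`, is a hypothetical pure-Python stand-in for the statistic that a two-sample Kolmogorov-Smirnov test reports (presumably `scipy.stats.ks_2samp` in the asker's case, which also returns a p-value): the largest vertical gap between the two empirical CDFs. A smaller gap suggests a closer match.

```python
def ks_statistic(sample_1, sample_2):
    """Largest absolute gap between the empirical CDFs of two samples."""

    def ecdf(sample, x):
        # Fraction of the sample that is <= x.
        return sum(v <= x for v in sample) / len(sample)

    # Both ECDFs are step functions, so the largest gap occurs at a data point.
    points = sorted(set(sample_1) | set(sample_2))
    return max(abs(ecdf(sample_1, x) - ecdf(sample_2, x)) for x in points)


# Data copied from the question.
data_a = [0.21, 0.24, 0.36, 0.56, 0.67, 0.72, 0.74, 0.83, 0.84, 0.87, 0.91, 0.94, 0.97]
data_b = [0.13, 0.21, 0.27, 0.34, 0.36, 0.45, 0.49, 0.65, 0.66, 0.90]
data_c = [0.14, 0.18, 0.19, 0.33, 0.45, 0.47, 0.55, 0.75, 0.78, 0.82]

d_b = ks_statistic(data_a, data_b)
d_c = ks_statistic(data_a, data_c)
print("KS distance a vs b:", d_b)
print("KS distance a vs c:", d_c)
```

With samples this small, the difference between the two distances may well be within noise, so the ranking should be read with caution.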

How to build a simulator for a physical machine given a set of datapoints of its behaviour?

user221200

January 30, 2021, 07:54

I have a database with millions of datapoints describing the behaviour of a heat pump. For every second, I know various temperature, pressure, mass flow and power measurements as a response to the signals sent by a controller. In other words, I have records of what the machine is being told to do and what it actually does. I would like to build a simulator that, given a set of artificial inputs (e.g. coming from a web page), attempts to simulate …

Topic: machine-learning-model simulation

Category: Data Science

Pull Random Numbers from my Data (Python)

Oliver Foster

October 6, 2020, 22:16

Let's imagine I have a series of numbers that represents cash flows into some account over the past 30 days in some time window. This data is non-normal but it does represent some distribution. I would like to pull "new" numbers from this distribution in an effort to create a Monte Carlo simulation based on the numerical data I have. How can I accomplish this? I've seen methods where you assume the data is normal and pull numbers based on some …

Topic: monte-carlo simulation python statistics

Category: Data Science
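
One common way to draw "new" numbers without assuming normality is to resample the observed values with replacement (a bootstrap). This is a minimal sketch of that idea, not necessarily what the answers recommend; the `cash_flows` values and the helper name `simulate_totals` are hypothetical.

```python
import random

# Hypothetical daily cash flows; replace with the real observations.
cash_flows = [120.0, -35.5, 410.2, 0.0, 88.9, -12.3, 230.0, 15.75, 64.1, -90.0]


def simulate_totals(data, horizon, n_runs, rng):
    """Simulate n_runs totals, each the sum of `horizon` draws from the data.

    Drawing with replacement treats the observed values as the distribution,
    so no parametric shape (normal or otherwise) is assumed.
    """
    return [sum(rng.choices(data, k=horizon)) for _ in range(n_runs)]


rng = random.Random(42)  # fixed seed so the run is repeatable
totals = sorted(simulate_totals(cash_flows, horizon=30, n_runs=1000, rng=rng))

print("median simulated 30-day total:", totals[len(totals) // 2])
print("5th to 95th percentile:", totals[50], "to", totals[949])
```

Resampling can only reproduce values that were actually observed and ignores any day-to-day dependence; fitting a kernel density estimate or resampling in blocks are possible refinements.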

How can I build a simulation environment that assesses different risk policies?

Kevin

July 19, 2020, 22:41

I work in fin-tech and would like to build some sort of simulation program to assess how different inputs will impact net revenue. For example, if we create new policies based on ML scores, how would those have impacted our loss and revenue metrics? While we can and do run online experiments, it would be desirable to simulate these impacts ahead of time. Aside from something like reinforcement learning, I was thinking that Monte Carlo simulations might be the best …

Topic: monte-carlo simulation reinforcement-learning time-series python

Category: Data Science

What is the difference between domain randomization and data augmentation?

diyImma

June 14, 2020, 08:38

Domain randomization (https://arxiv.org/abs/1703.06907) is used to create a synthetic dataset with enough variance that it will encompass unseen real data, as just one variation. I am trying to understand how this is different from applying data augmentation techniques to a synthetically generated dataset.

Topic: variance data-augmentation simulation neural-network dataset

Category: Data Science

Understanding how to simulate a statistic

maindola

May 25, 2020, 15:40

This solution describes how to simulate statistics to find a confidence interval. A journalist called 1000 people in town to ask whom they would vote for out of candidates A and B. The observed result was 511 votes for A and 489 votes for B. This makes us think that candidate A will win. But we need to know if this sample is truly representative of the underlying population distribution. To find this, we simulate this …

Topic: confidence distribution simulation python

Category: Data Science
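
The excerpt is cut off before the simulation itself, so the following is only one plausible reading: repeatedly simulate a poll of 1000 voters in which each voter picks A with the observed probability 0.511, then take the middle 95% of the simulated shares as an approximate confidence interval. The function name and the number of repetitions are assumptions.

```python
import random


def simulate_share(p, n, rng):
    """Share of votes for A in one simulated poll of n voters."""
    return sum(rng.random() < p for _ in range(n)) / n


rng = random.Random(0)  # fixed seed so the run is repeatable
observed_share = 511 / 1000
n_polls = 2000  # assumed number of simulated polls

shares = sorted(simulate_share(observed_share, 1000, rng) for _ in range(n_polls))

# Percentile interval: drop the lowest and highest 2.5% of simulated shares.
lower = shares[int(0.025 * n_polls)]
upper = shares[int(0.975 * n_polls) - 1]
print("approximate 95% interval for A's share:", lower, "to", upper)
```

Because the standard error of a share near 0.5 with 1000 respondents is about 0.016, an interval built this way would typically extend below 0.5, which is why a 511 to 489 split alone does not settle the outcome.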

Visualization of multiple Markov models

Andrew Brown

March 5, 2020, 07:22

I am working on a project where we compare over 10 different Markov models, each representing a different treatment plan. Most often single models are visualized with a decision tree or transition state diagram. However, with multiple different models what are potential visualizations that could communicate the transition states that differentiate each model? I have seen other people use a table to depict different models and the transition states. For clarity, I am not referring to a transition probabilities chart …

Topic: simulation markov-process visualization python r

Category: Data Science

Ways to simulate weather data over several periods (Python or R)?

Shiv

January 24, 2020, 08:54

I have a time series dataset that has several variables for a state/province for fixed periods of time. That is, for state A, there are samples from April 2017 to July 2019. Of course, I thought adding precipitation and temperature variables would be a great idea. I tried finding some relevant external data but most of it is abstract and spread out. How would one simulate dynamic data in Python with varying means, highs and lows for say six months …

Topic: numpy data simulation python r

Category: Data Science

Finding similarity between two datasets

Kartikeya Sharma

October 25, 2019, 00:38

I have two datasets. One is the actual percentage of white population in the counties of an American state and the other is the simulated percentage of white population in the counties of an American state. Bits about my simulation: it is a random simulation done on a California map with two different agents, white and minority. Their total population is based on the real white to minority ratio in California. For example, if there is 70% white and 30% minority in California then …

Topic: simulation python statistics machine-learning

Category: Data Science

Similarity Measure Time Series

Wiesel

August 20, 2019, 22:17

In my work I have an observed time series and simulated ones. I want to compare the light curves and check for similarity to find out which simulated curve fits best, that is, which parameters simulate the light curve best. At the moment I do this with the cross-correlation function from NumPy. But I am not sure that is the best option, because the light curve with the highest cross-correlation coefficient does not always look like the …

Topic: simulation correlation time-series similarity statistics

Category: Data Science

Estimating the value of $\pi$ with a Monte Carlo dartboard: $<$ or $\leq$?

I_Don't_Code

May 10, 2019, 05:09

I'm trying to figure out which is the proper way to estimate $\pi$ using the Monte Carlo method of randomly distributing points in a square that also contains an inscribed circle. Some sources say to use the comparison $\sqrt{x^2+y^2}\le 1$, while others use $\sqrt{x^2+y^2}<1$. Here's some example code from a Wikipedia article:
def monte_carlo_pi(nsamples):
    acc = 0
    for i in range(nsamples):
        x = random.random()
        y = random.random()
        if (x**2 + y**2) < 1.0:
            acc += 1
    return 4.0 * …

Topic: mathematics monte-carlo simulation

Category: Data Science
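
The two comparisons in this question can be checked side by side. This is a sketch written for illustration, not the Wikipedia code quoted in the excerpt; `estimate_pi` and its `inclusive` flag are hypothetical names. It draws points in the unit square and counts those inside the quarter circle of radius 1, using either `<` or `<=`.

```python
import random


def estimate_pi(n_samples, inclusive, seed):
    """Monte Carlo estimate of pi from points in the unit square.

    inclusive=False counts points with x^2 + y^2 < 1,
    inclusive=True also counts points exactly on the boundary.
    """
    rng = random.Random(seed)  # same seed gives the same points for both variants
    inside = 0
    for _ in range(n_samples):
        x = rng.random()
        y = rng.random()
        r2 = x * x + y * y  # comparing r^2 with 1 avoids the square root
        if r2 < 1.0 or (inclusive and r2 == 1.0):
            inside += 1
    # The quarter circle covers pi/4 of the unit square.
    return 4.0 * inside / n_samples


strict = estimate_pi(100_000, inclusive=False, seed=1)
loose = estimate_pi(100_000, inclusive=True, seed=1)
print("strict (<): ", strict)
print("loose (<=):", loose)
```

For points drawn from a continuous distribution the boundary has probability zero, and with floating-point samples an exact hit is so unlikely that the two variants are expected to agree on any practical run.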

ML/Statistical Model to Analyse the Distribution

James

February 25, 2019, 12:19

Consider the sample dataset provided below:
ShopID | Transactions | dist_to_shop
S1 | 15478 | 0
S2 | 12345 | 0.41
S3 | 17865 | 0.11
S4 | 35479 | 0.57
S5 | 74589 | 0.35
The dataset consists of the ShopID, Transactions and dist_to_shop (in meters) fields. Assuming all the shops belong to one retailer, I would like to find out the distribution of transactions/people visits to the other shops by assigning weights/business rules on the basis of the distance. For example, the weights can be given as: 0-200 meters = 40% …

Topic: machine-learning-model simulation predictive-modeling r

Category: Data Science

Modeling uncertainty from Logistic Regression

Jan van der Vegt

February 17, 2018, 14:38

Logistic regression is one part of a simulation pipeline that I use for some scenario analysis. The dataset that this is based on is not small but relatively noisy, and has only one explanatory variable/feature. Of course I can say something about this uncertainty using frequentist or Bayesian methods, but I would like to use this in the sequential simulation step as well, to get a fairer final estimate. What I'm planning on doing should work but is somewhat computationally expensive …

Topic: bayesian simulation logistic-regression

Category: Data Science

Testing Multi-Arm Bandits on Historical Data

Pavan Sangha

January 5, 2018, 08:07

Suppose I want to test a multi-arm bandit algorithm in the contextual setting on a set of historical data. For simplicity, let's assume there are only two arms A and B and suppose the rewards are binary. Furthermore, suppose I have a data set where users were shown one of the two arms and I have a record of the rewards. What would be the best approach to simulating the scenario of running the algorithm online? I was thinking of …

Topic: randomized-algorithms simulation online-learning reinforcement-learning machine-learning

Category: Data Science

What visualization should I choose for Monte Carlo simulations of timeline events?

Tasos

November 18, 2017, 01:52

I wasn't sure if I should open this question in Cross Validated or here. But since the question belongs to a bigger project related to data science, I chose this one. I will present a simplified version of my working project, since the original is too complicated and domain specific. Let's say that we have a timeline of 1 hour (60 minutes). During this period a job starts running and creates user notifications at random points. I have written a …

Topic: monte-carlo simulation visualization

Category: Data Science
