Model Parallelism not working in Inception v3 with Keras and TensorFlow

I have been stuck on this problem for a while now. I have an AWS setup with 500 GB of RAM and about 7 GPUs. Each time I try to run my Keras code with TensorFlow as the back end, it runs out of memory. I have found the reason as well: each GPU has just 12 GB of memory, whereas my model needs more than that. So, how …
Category: Data Science

How to vectorize this loop process

Hi guys, I want to ask if anyone knows how to vectorize this code to make it faster.
    loss = 0
    total_steps = 0
    for i in range(len(distances)):
        for j in range(len(distances)):
            for k in range(len(distances)):
                if not ((i == j) | (i == k) | (j == k)):
                    if similarities[i][j] >= similarities[i][k]:
                        loss += (distances[i][j] - distances[i][k]).clip(min=0)
                    else:
                        loss += (distances[i][k] - distances[i][j]).clip(min=0)
                    total_steps += 1
    return (loss/total_steps)
Category: Data Science
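
One way to vectorize the triple loop above is NumPy broadcasting over a three-dimensional (i, j, k) grid. The following is a minimal sketch, not a definitive answer: it assumes `distances` and `similarities` are square arrays of the same size with at least three rows, and that `total_steps` counts one step per triple of distinct indices (the flattened excerpt does not show the indentation). The function names are hypothetical.

```python
import numpy as np


def triplet_loss_loop(distances, similarities):
    """Reference version: the triple loop, one step per distinct (i, j, k)."""
    n = len(distances)
    loss, total_steps = 0.0, 0
    for i in range(n):
        for j in range(n):
            for k in range(n):
                if i != j and i != k and j != k:
                    diff = distances[i][j] - distances[i][k]
                    if similarities[i][j] >= similarities[i][k]:
                        loss += max(diff, 0.0)
                    else:
                        loss += max(-diff, 0.0)
                    total_steps += 1
    return loss / total_steps


def triplet_loss_vectorized(distances, similarities):
    """Same quantity computed with broadcasting instead of Python loops."""
    d = np.asarray(distances, dtype=float)
    s = np.asarray(similarities, dtype=float)
    n = d.shape[0]

    # diff[i, j, k] = d[i, j] - d[i, k]
    diff = d[:, :, None] - d[:, None, :]
    # closer[i, j, k] is True where similarities[i, j] >= similarities[i, k]
    closer = s[:, :, None] >= s[:, None, :]
    # Pick the clipped difference that the loop would have added.
    penalty = np.where(closer, np.clip(diff, 0, None), np.clip(-diff, 0, None))

    # Keep only triples whose three indices are all different.
    idx = np.arange(n)
    i, j, k = idx[:, None, None], idx[None, :, None], idx[None, None, :]
    valid = (i != j) & (i != k) & (j != k)

    return penalty[valid].sum() / valid.sum()
```

The broadcast arrays hold n³ entries, so this trades memory for speed; for large inputs it may be necessary to process the first index in batches.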

Specifying number of threads using XGBoost.train

When using the xgboost.train() function, all the threads are used. I would like to use a specific number. Unfortunately, this function accepts neither the nthread nor the n_jobs parameter. How can I control the number of threads being used? Thanks. // Edit It seems that I found a solution. In contrast with how one provides the nthread (or n_jobs) parameter to XGBClassifier or XGBRegressor, by adding this parameter directly to the brackets as xgb.XGBRegressor(nthread=n), then as indicated on …
Category: Data Science

How to loop through multiple lists/dict?

I have the following code which finds the best value of k parameter in the KNNImputer. Basically it is looping through the list of k_value and for each element, it is fitting the KNNImputer to the model and in the end appending the result to an empty dataframe.
    lire_model = LinearRegression()
    k_value = [1, 3, 5, 7, 9, 11, 13, 15, 17, 19, 21]
    k_value_results = pd.DataFrame(columns = ['k', 'mse', 'rmse', 'mae', 'r2'])
    scoring_list = ['neg_mean_squared_error', 'neg_root_mean_squared_error', 'neg_mean_absolute_error', 'r2']
    for s in k_value:
        imputer = …
Category: Data Science

Methodology for parallelising linked data?

If I have some form of data that can have inherent links to all other data in the set, but I wish to parallelise this data in order to reduce computation time or to reduce the size of any particular piece of data currently being worked on, is there a methodology to split it into chunks without reducing the validity of the data? For example, assume I have a grid of crime across the whole of a country. …
Category: Data Science

How to load and run feature selection on a dataset with 5,000 samples and 500,000 features?

I have a dataset with 5000 samples and 500,000 features (all categorical with a cardinality of 3). Two problems I'm trying to solve: Loading the dataset - I can't load it in memory despite using a computing cluster, so I'm assuming I should use a parallelization library like Dask, Spark, or Vaex. Is this the best idea? Feature selection - how to run feature selection within a parallelization library? Can this be done with Dask, Spark, Vaex?
Category: Data Science

Parallel hyperparameter optimization techniques?

Most hyperparameter optimization techniques want to evaluate points one by one. I have an expensive optimization problem, but I can run hundreds of evaluations in parallel. The dimension of the problem is around 20-30. My variables are mostly continuous. Is there any technique with an open-source, documented implementation available for this kind of problem?
Category: Data Science
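
As a baseline for the situation described above, batch random search evaluates a whole set of candidate points at once. The sketch below uses only the Python standard library; `expensive_objective` is a hypothetical stand-in for the real function, the unit-cube search space is an assumption, and threads are used on the assumption that each evaluation mostly waits on an external job (CPU-bound Python code would need processes instead). It illustrates the parallel evaluation pattern only and is not a substitute for a model-based optimizer.

```python
import random
from concurrent.futures import ThreadPoolExecutor


def expensive_objective(x):
    """Hypothetical stand-in for the expensive function (lower is better)."""
    return sum((xi - 0.5) ** 2 for xi in x)


def parallel_random_search(objective, dim, n_points, n_workers, seed=0):
    """Sample n_points in the unit cube and evaluate them concurrently."""
    rng = random.Random(seed)
    candidates = [[rng.random() for _ in range(dim)] for _ in range(n_points)]

    # pool.map preserves input order, so scores[i] belongs to candidates[i].
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        scores = list(pool.map(objective, candidates))

    best = min(range(n_points), key=scores.__getitem__)
    return candidates[best], scores[best]


best_x, best_score = parallel_random_search(
    expensive_objective, dim=20, n_points=200, n_workers=8
)
```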

Efficiently Sending Two Series to a Function For Strings with an application to String Matching (Dice Coefficient)

I am using a Dice Coefficient based function to calculate the similarity of two strings:
    def dice_coefficient(a,b):
        try:
            if not len(a) or not len(b): return 0.0
        except: return 0.0
        if a == b: return 1.0
        if len(a) == 1 or len(b) == 1: return 0.0
        a_bigram_list = [a[i:i+2] for i in range(len(a)-1)]
        b_bigram_list = [b[i:i+2] for i in range(len(b)-1)]
        a_bigram_list.sort()
        b_bigram_list.sort()
        lena = len(a_bigram_list)
        lenb = len(b_bigram_list)
        matches = i = j = 0
        while (i < lena and j …
Category: Data Science

Is there a straightforward way to run pandas.DataFrame.isin in parallel?

I have a modeling and scoring program that makes heavy use of the DataFrame.isin function of pandas, searching through lists of Facebook "like" records of individual users for each of a few thousand specific pages. This is the most time-consuming part of the program, more so than the modeling or scoring pieces, simply because it only runs on one core while the rest runs on a few dozen simultaneously. Though I know I could manually break up the dataframe into …
Category: Data Science
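
The manual approach mentioned above, splitting the DataFrame into row chunks and calling `isin` on each, can be sketched as follows. This is an illustration under assumptions rather than a tuned solution: `chunked_isin` and its parameters are hypothetical names, and whether a thread pool gives a real speed-up depends on how much of the work runs outside the Python interpreter lock. A process pool is the usual alternative, at the cost of copying each chunk to a worker.

```python
from concurrent.futures import ThreadPoolExecutor

import pandas as pd


def chunked_isin(df, values, n_chunks=4, max_workers=4):
    """Apply DataFrame.isin to row chunks concurrently and reassemble."""
    values = list(values)
    # Ceiling division so that at most n_chunks chunks are produced.
    step = max(1, -(-len(df) // n_chunks))
    chunks = [df.iloc[start:start + step] for start in range(0, len(df), step)]

    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        parts = list(pool.map(lambda chunk: chunk.isin(values), chunks))

    # pool.map keeps chunk order, so concat restores the original row order.
    return pd.concat(parts) if parts else df.isin(values)
```

Usage: `mask = chunked_isin(likes, page_ids, n_chunks=16)` gives the same boolean frame as `likes.isin(page_ids)`, where `likes` and `page_ids` are placeholder names.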

What needs to be done to make n_jobs work properly on sklearn? in particular on ElasticNetCV?

The constructor of sklearn.linear_model.ElasticNetCV takes n_jobs as an argument. Quoting the documentation here: n_jobs: int, default=None. Number of CPUs to use during the cross validation. None means 1 unless in a joblib.parallel_backend context. -1 means using all processors. See Glossary for more details. However, running the simple program below on my 4-core machine (spec details below) shows performance is best when n_jobs = None, progressively deteriorating as you increase n_jobs all the way to n_jobs = -1 (supposedly requesting all …
Category: Data Science

Pytorch Distributed Computing - Recommendations/Resources/Courses?

I would like to get into some distributed computing for processing Pytorch CNN models. I am completely fresh in this field and want to get some recommendations as to where I should start researching and learning techniques in distributed computing specifically for Deep Learning. My motivation is that I have access to a lot of personal Windows 10 Desktops with great hardware, a few Ubuntu Linux machines of my own and then my personal desktop that is rigged with great …
Category: Data Science

Parallelization of a MIMO linear filter

I would like to implement a Multi Input Multi Output filtering operation, acting as fast as possible on batches of data. Here is my current implementation:
    import numpy as np
    import scipy.signal

    def lfilter_mimo(b, a, u_in):
        batch_size, seq_len, in_ch = u_in.shape  # [B, T, I]
        out_ch, _, _ = a.shape
        y_out = np.zeros_like(u_in, shape=(batch_size, seq_len, out_ch))
        for out_idx in range(out_ch):
            for in_idx in range(in_ch):
                y_out[:, :, out_idx] += scipy.signal.lfilter(b[out_idx, in_idx, :], a[out_idx, in_idx, :], u_in[:, :, in_idx], axis=-1)
        return y_out  # [B, T, O]
For another …
Category: Data Science

Multiple keras models parallel - time efficient

I am trying to load two different keras models in parallel. I tried to use the functional API model:
    input1 = Input(inputShapeOfModel1)
    input2 = Input(inputShapeOfModel2)
    output1 = model1(input1)
    output2 = model2(input2)
    parallelModel = Model([input1,input2], [output1,output2])
This works but it does not run in parallel actually. Inference time is just the sum of each model's individual inference time. My question is should this run concurrently? I also tried to load them in different py files with gpu memory options. Still I …
Category: Data Science

Parallel active optimization

I'm trying to optimize an expensive function for which I can choose sample points. The difficulty is that many function evaluations may be computed in parallel, taking varying amounts of time. I don't know which keywords to search for to find existing literature (or implementations). So at a given time, I might have already computed function values at 18 points, with 15 still being computed, and I want to start evaluating the function at another point. Without the running jobs, I could make a …
Category: Data Science

Would writing a decision tree algorithm in Pytorch or Tensorflow be faster than with Numpy?

Since these libraries can turn CPU arrays into GPU tensors, could you parallelize (and therefore accelerate) the calculations for a decision tree? I am considering making a decision tree class written in Tensorflow/Pytorch for a school project, but I want to be certain that it makes sense.
Category: Data Science

GPU Accelerated Data Processing for R in Windows

I'm currently taking a paper on Big Data which has us utilising R heavily for data analysis. I happen to have a GTX1070 in my pc for gaming reasons. Thus, I thought it would be really cool if I could use that to speed up some of the processing for some of the stuff my lecturers have me doing, but it really doesn't seem easy to do this at all. I've installed gpuR, CUDA, Rtools, and a few other bits …
Topic: gpu parallel r
Category: Data Science

CUDA 8.0 is compatible with my GeForce GTX 670M, Wikipedia says, but TensorFlow raises an error: GTX 670M's Compute Capability is < 3.0

According to Wikipedia, the GeForce GTX 670M has a Compute Capability of 2.1 (and a Fermi micro-architecture), which is confirmed by TensorFlow (I can read "2.1" in the error it raises). Wikipedia says that CUDA 8.0 supports compute capabilities from 2.0 to 5.x (Fermi micro-architecture included). It even says that it's the "last version with support for compute capability 2.x (Fermi)". However, the error raised by TensorFlow says that the CUDA version I am using supports a compute capability of at least... 3.0... …
Category: Data Science

Updating Weight Using Updates on Related Data

Suppose $$ x=Ay $$ The $x$ is $M\times 1$, $y$ is $N \times 1$ and $A$ is $M\times N$ We have the data $x$ and would like to know what $y$ is. However, the matrix $A$ is too large for pseudo-inverse. And thus we would like to approximate $A^{-1}$ using machine learning as it is possible to parallelize it. Here for parallelization, we divide the given problem into: $$ x^l = A^l y $$ where $x = [x^1 , x^2,\dots,x^L]^T$ …
Category: Data Science
