I want to implement a Keras model in FPGA (hardware) code. As a first step, I want to find the number of mathematical operations required to evaluate a predicted output given a model. The model below is a two-class classifier, and an input sample is a vector of size 232×1. The model is: model.add(keras.layers.Dense(5, input_dim=232, activation='relu')) model.add(keras.layers.Dense(1, activation='sigmoid')) The question is: given the model above, how many mathematical operations (plus, minus, multiplication, division) are required …
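A rough way to do this count is sketched below; it is my own helper (not part of Keras) and assumes each Dense unit with n_in inputs costs n_in multiplications and n_in additions (the dot-product sum plus the bias add), with the activation evaluations counted separately.

```python
# Minimal sketch: count multiply/add operations for a stack of Dense layers.
# A Dense layer with n_in inputs and n_out units does n_in * n_out
# multiplications and n_in * n_out additions (sum of products + bias),
# plus n_out activation evaluations.
layer_shapes = [(232, 5), (5, 1)]  # (inputs, units) per Dense layer

total_mults = total_adds = 0
for n_in, n_out in layer_shapes:
    total_mults += n_in * n_out
    # (n_in - 1) additions to sum the products, plus 1 bias addition, per unit
    total_adds += n_in * n_out
print(f"multiplications: {total_mults}, additions: {total_adds}")
# -> multiplications: 1165, additions: 1165 (activations counted separately)
```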
I need to detect objects from multiple video streams in real time (or close to it, say 10 FPS). How many GPUs do I need to run object detection with YOLOv3 or MobileNet on, say, 10 video streams? Is it possible to use a CPU or something else? I don't need an exact number; I just need to understand the scalability perspective and the cost per single stream.
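One way to get a per-stream number for a given device is to time single-frame inference and divide the achievable FPS by the target FPS. The sketch below assumes a callable detector (`my_detector` is a placeholder for whatever YOLOv3 / MobileNet implementation is actually used).

```python
# Minimal sketch: estimate how many 10 FPS streams one device can sustain by
# timing single-frame inference on that device.
import time
import numpy as np

def streams_supported(detector, frame, target_fps=10, n_runs=100):
    for _ in range(10):                 # warm-up so lazy init doesn't skew timing
        detector(frame)
    start = time.perf_counter()
    for _ in range(n_runs):
        detector(frame)
    latency = (time.perf_counter() - start) / n_runs   # seconds per frame
    return (1.0 / latency) / target_fps                # streams at target_fps

frame = np.zeros((416, 416, 3), dtype=np.uint8)        # dummy frame
# print(streams_supported(my_detector, frame))         # plug in your detector
```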
I have 2 questions: how would you approach storing datasets (with millions of small files) on a local network, and how would you take that into account in PyTorch code? I need to store large datasets (possibly terabytes) on the local network. However, training with the dataset on the 3 different NAS servers I tested was consistently 4 times slower, while GPU usage averaged 25%; I guess that's because the GPU isn't fed …
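One common pattern for this situation, sketched below, is to pack the millions of small files into a few large shards (plain .tar files here) so the NAS serves large sequential reads, and to stream them with an IterableDataset and several DataLoader workers. The shard path and the decode step are placeholders.

```python
# Minimal sketch: stream samples sequentially from large .tar shards stored on
# the NAS, with multiple DataLoader workers prefetching in parallel.
import tarfile
import torch
from torch.utils.data import IterableDataset, DataLoader

class TarShardDataset(IterableDataset):
    def __init__(self, shard_paths):
        self.shard_paths = shard_paths

    def __iter__(self):
        info = torch.utils.data.get_worker_info()
        # Give each worker its own subset of shards
        shards = self.shard_paths if info is None else \
            self.shard_paths[info.id::info.num_workers]
        for path in shards:
            with tarfile.open(path, "r") as tar:
                for member in tar:
                    if not member.isfile():
                        continue
                    raw = tar.extractfile(member).read()
                    yield self.decode(raw)

    def decode(self, raw_bytes):
        # Placeholder decoding: real code would parse images/labels here
        return torch.frombuffer(bytearray(raw_bytes), dtype=torch.uint8)

loader = DataLoader(TarShardDataset(["/mnt/nas/shard-0000.tar"]),
                    batch_size=None, num_workers=4, prefetch_factor=4)
```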
I know this question is very vendor specific and may change over time, but I am wondering how currently available NVIDIA GPU cards (in 2022) are restricted, license-wise or hardware-wise, from being used for training and inference. Is it prohibited to use these cards in production systems? For example, there are several RTX 3060 gaming cards available in shops. Is it allowed to use these for AI? Side question: is there any CUDA restriction …
A paper, Survey and Benchmarking of Machine Learning Accelerators, mentions: "Conversely, pooling, dropout, softmax, and recurrent/skip connection layers are not computationally intensive since these types of layers stipulate datapaths for weight and data operands." What exactly does "stipulate datapaths for weight and data operands" mean? What are these specific datapaths, and how are they stipulated? These operations are compared to fully connected and conv layers, which might benefit more from dedicated AI accelerators. "Overall, the most emphasis of computational capability …
Oftentimes even moderate-size models, such as DeepMind's AlphaFold2 (which requires 20 GB of RAM), can't fit into video RAM (such as a TPUv3 with 16 GB of RAM) and have to re-calculate activations during the backward pass, essentially sacrificing FLOPS for RAM. Which leads to my question on video RAM. Nowadays regular RAM is the cheapest part of your server: I have worked with machines with terabytes of DRAM and basically ignored the cost of RAM in the bottom line compared to the costs …
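The FLOPS-for-RAM trade mentioned above is usually called gradient (activation) checkpointing; a minimal PyTorch sketch of it, on a toy stack of linear layers, looks like this:

```python
# Minimal sketch: checkpoint_sequential drops intermediate activations in the
# forward pass and recomputes them during backward, trading extra compute for
# a roughly proportional reduction in activation memory.
import torch
from torch import nn
from torch.utils.checkpoint import checkpoint_sequential

model = nn.Sequential(*[nn.Sequential(nn.Linear(1024, 1024), nn.ReLU())
                        for _ in range(16)])
x = torch.randn(64, 1024, requires_grad=True)

# Split the 16 blocks into 4 segments; only segment boundaries keep activations.
out = checkpoint_sequential(model, 4, x)
out.sum().backward()
```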
I am currently dealing with a relatively big data set, and I have some memory usage concerns. I am dealing with most of the different data types: floats, integers, Booleans, and character strings, some of which were converted to factors. From a memory usage (RAM) standpoint it is pretty clear what happens when I switch a column type from float64 to float32: the memory usage is divided by two. For string conversion to factor, it is a bit less …
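For a concrete before/after comparison, the sketch below uses pandas (its category dtype is the analogue of R factors, which the question's "factor" wording suggests) and synthetic data to show the per-column effect of downcasting floats and converting a repetitive string column to categorical.

```python
# Minimal sketch: compare per-column memory before and after downcasting a
# float column and converting a low-cardinality string column to category.
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "x": np.random.rand(1_000_000),                                 # float64
    "label": np.random.choice(["low", "mid", "high"], 1_000_000),   # strings
})

before = df.memory_usage(deep=True)
df["x"] = df["x"].astype("float32")           # halves the numeric column
df["label"] = df["label"].astype("category")  # stores small codes + a dictionary
after = df.memory_usage(deep=True)
print(pd.DataFrame({"before": before, "after": after}))
```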
I trained a model and froze it into a PB (protocol buffer) file plus a directory of some variables; the total size is about 31 MB. We deployed it on a GPU card, followed this answer, and set per_process_gpu_memory_fraction to a very small number so that the allocated memory would be about 40 MB. The program performs very well, but when we check the GPU usage with nvidia-smi, it shows that the memory usage is about 500 MB. Then my question …
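For reference, a minimal sketch of the setting in question (TensorFlow 1.x style, matching the frozen-PB workflow described) is below; note that nvidia-smi also counts the CUDA context and cuDNN/cuBLAS overhead, which lives outside this cap.

```python
# Minimal sketch: cap the amount of GPU memory this TensorFlow process may
# allocate for tensors; driver/library overhead is not covered by the cap.
import tensorflow as tf

config = tf.ConfigProto()
config.gpu_options.per_process_gpu_memory_fraction = 0.005  # fraction of total VRAM
# config.gpu_options.allow_growth = True  # alternative: grow allocation on demand

with tf.Session(config=config) as sess:
    ...  # load the frozen graph and run inference here
```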
I've always wondered with flashlight apps, especially ones that use the LED light(s): Is there any risk of shortening the life of the LEDs? I noticed one app ("ASettings") gives a warning that doing so can "harm your phone"... which to me sounds even more ominous than burning out the bulb. I thought LEDs had a crazy long lifespan to begin with, so do I need to worry?
I just started learning deep learning in my free time. I am hoping to buy a laptop on which I want to implement some small (AlexNet) to medium (GoogLeNet) networks, and maybe something bigger. I searched for the GPU, and everyone is suggesting at least an RTX 2060. However, advice about the CPU is vague. I have heard people say that it's better to have more cores in the CPU. As far as I understand, the CPU only does the data pre-processing. What I want to know is it …
There is a project which contains models in DLC format (Snapdragon Neural Processing Engine, SNPE) that I guess are optimized for the Qualcomm Snapdragon 820 chipset (see the second link). The project has now also introduced models in Keras format. My question is: will the Keras model run on a Google Coral with Edge TPU? If so, how fast? Is it necessary to optimize the model for the TPU?
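The usual Coral path, sketched below, is to convert the Keras model to a fully integer-quantized TFLite model and then compile it with the separate edgetpu_compiler tool; `keras_model` and `rep_dataset` are placeholders for the project's actual model and a small sample of representative inputs.

```python
# Minimal sketch: full-integer TFLite conversion, which the Edge TPU compiler
# expects as input.
import tensorflow as tf

def representative_data_gen():
    for sample in rep_dataset.take(100):          # a few hundred typical inputs
        yield [tf.cast(sample, tf.float32)]

converter = tf.lite.TFLiteConverter.from_keras_model(keras_model)
converter.optimizations = [tf.lite.Optimize.DEFAULT]
converter.representative_dataset = representative_data_gen
converter.target_spec.supported_ops = [tf.lite.OpsSet.TFLITE_BUILTINS_INT8]
converter.inference_input_type = tf.uint8
converter.inference_output_type = tf.uint8

with open("model_int8.tflite", "wb") as f:
    f.write(converter.convert())
# Afterwards, on the command line: edgetpu_compiler model_int8.tflite
```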
[I fully accept that this is a very opinion-based question, so moderators should feel free to vote to close it if they see fit, but since I find endless pros and cons on the Internet, I've decided to ask the community here.] Surface Pro 6 or MacBook Pro for a Data Scientist job? About 8 years ago I was a Windows user. The most annoying part was that it was quite unstable. Note that I was not a developer …
I have noticed that when training large deep learning models on consumer GPUs (like a GTX 1050 Ti), the network often doesn't work: the GPU just doesn't have enough memory to train it. This problem has solutions such as using CPU (host) memory to store tensors that are not actively being used by the GPU. So my question is: is there any way to train models on CUDA with memory drawn from …
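A related and much simpler workaround, sketched below, is gradient accumulation: it does not literally borrow host RAM, but it keeps the per-step micro-batch (and hence activation memory) small while preserving a larger effective batch size. `loader` is a placeholder for an existing DataLoader.

```python
# Minimal sketch: gradient accumulation in PyTorch to fit training into a
# small GPU by splitting each logical batch into several micro-batches.
import torch
from torch import nn

model = nn.Linear(512, 10).cuda()
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)
loss_fn = nn.CrossEntropyLoss()
accum_steps = 8  # effective batch = 8 * micro-batch size

for step, (x, y) in enumerate(loader):           # `loader` is your DataLoader
    out = model(x.cuda())
    loss = loss_fn(out, y.cuda()) / accum_steps  # scale so gradients average
    loss.backward()                              # gradients accumulate in .grad
    if (step + 1) % accum_steps == 0:
        optimizer.step()
        optimizer.zero_grad()
```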
I need to put together a proposal for buying a better computer for machine learning. Is there any good way to estimate the general training speed of computer hardware? Basically, I want to be able to say that if we purchase a certain computer, it trains x times faster than our current computers. Currently I have an i5-6300U 2.4 GHz 2-core CPU, with no GPU. I know switching to a computer with a GPU will speed things up …
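One crude but easy relative benchmark, sketched below, is to time a large matrix multiply on each candidate machine. It only measures dense linear-algebra throughput, so the resulting ratio is a rough indicator rather than a training-time prediction.

```python
# Minimal sketch: a matmul micro-benchmark whose GFLOP/s number can be
# compared between the current and the proposed machine.
import time
import numpy as np

def matmul_gflops(n=2048, repeats=10):
    a = np.random.rand(n, n).astype(np.float32)
    b = np.random.rand(n, n).astype(np.float32)
    a @ b                                    # warm-up
    start = time.perf_counter()
    for _ in range(repeats):
        a @ b
    elapsed = (time.perf_counter() - start) / repeats
    return 2 * n**3 / elapsed / 1e9          # ~2*n^3 FLOPs per matmul

print(f"{matmul_gflops():.1f} GFLOP/s")
```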
Currently, I train deep neural networks on my CPU (i7-6700K) using TensorFlow built without AVX2 support. The networks take about 3 weeks to train. Therefore, I am searching for a (cheap) way to speed up this process. Is it better to compile TensorFlow with AVX2 enabled or to buy a cheap[1] GPU like the GeForce GTX 1650 Super (about 180€ and 1408 CUDA cores)? What is the estimated performance gain of using a cheap[1] GPU? [1] Cheap compared to current top …
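The most reliable way to compare the two options is empirical: run the same small training job under each installation (AVX2 build, GPU build) and compare seconds per epoch, as in the sketch below on synthetic data.

```python
# Minimal sketch: time a few epochs of a small Keras model under whichever
# TensorFlow build/device is active, and compare the numbers between setups.
import time
import numpy as np
import tensorflow as tf

print("visible GPUs:", tf.config.list_physical_devices("GPU"))

model = tf.keras.Sequential([
    tf.keras.layers.Dense(512, activation="relu", input_shape=(1024,)),
    tf.keras.layers.Dense(10),
])
model.compile(optimizer="adam",
              loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True))

x = np.random.rand(8192, 1024).astype("float32")
y = np.random.randint(0, 10, size=8192)

model.fit(x, y, epochs=1, batch_size=128, verbose=0)   # warm-up
start = time.perf_counter()
model.fit(x, y, epochs=3, batch_size=128, verbose=0)
print(f"{(time.perf_counter() - start) / 3:.2f} s per epoch")
```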
I mean all possible work with tagging data: GIS, tagging data pre-processing, visualisation, different types of modelling, simulations, and modern analysis. I think it will be about 40 tagged animals and at most 30-40 thousand total locations to work with. The tools that I will probably use are described here: https://www.movebank.org/panel_software
I'm writing an application for a project where we intend to train a model to predict one aspect of an environment (traffic safety) using a database with 10 images (about 300x300 px and, say, 256 colours) for each of either 100,000 or 15 million locations. I need to work out whether both, one, or neither of these projects is feasible with our hardware constraints. What can I expect? Is there some formula or benchmark that one can refer to? …
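At least the storage side can be estimated with back-of-the-envelope arithmetic (the compute cost depends entirely on the chosen model, so it isn't estimated here); 300x300 pixels at 256 colours is about 1 byte per pixel before compression:

```python
# Minimal sketch: uncompressed storage estimate for both project sizes.
bytes_per_image = 300 * 300 * 1        # ~90 KB uncompressed, 8-bit palette
images_per_location = 10

for locations in (100_000, 15_000_000):
    total = bytes_per_image * images_per_location * locations
    print(f"{locations:>10,} locations: ~{total / 1e9:,.0f} GB uncompressed")
# ->    100,000 locations: ~90 GB uncompressed
# -> 15,000,000 locations: ~13,500 GB uncompressed
```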
I see many Data Science (DS) tutorials done on Macs, and many DS blogs recommend Macs as the best development platform, so the quote "Data Science is statistics on a Mac" has come to mind more than once. I'm quite fascinated by Macs (be it an iMac or a MacBook Pro), but I have never got a valid reason why Data Scientists in particular use them (if that is true, of course). Everyone I asked said: "because it has Unix", but …
I would like to know what type of computer configuration to use to perform data science on a medium to low quantity of data (< 100,000 matrix rows x 50 parameters). I have read some blogs about people building their own systems, but I am essentially performing regressions. Sklearn is quite well suited to my needs, and one GPU with CUDA would be enough for the rest. I am surprised to read no feedback about "standard" computers designed for gaming. They are …
I've found that, because of the recession in the crypto-mining business, a lot of mining rigs are on sale at pretty reasonable prices. For around $1000 we can buy a used machine. Will a rig like this work fine for machine learning: 6x NVIDIA GTX 1070 8GB, Intel Celeron G3900, 120GB SSD, 8GB DDR4, Asus Z170 Pro Gaming? We are a small development company looking for a well-priced rig efficient enough to train some models on. Such hardware seems …
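For context, a minimal sketch of how a multi-GPU rig like this would typically be used in PyTorch is below: replicate the model across all visible GPUs and split each batch between them (DistributedDataParallel is the faster option in practice, but DataParallel is the one-line illustration).

```python
# Minimal sketch: data-parallel inference/training step across all GPUs the
# rig exposes, e.g. the 6x GTX 1070 above.
import torch
from torch import nn

model = nn.Sequential(nn.Linear(1024, 512), nn.ReLU(), nn.Linear(512, 10))
if torch.cuda.device_count() > 1:
    model = nn.DataParallel(model)     # batch is split across the GPUs
model = model.cuda()

x = torch.randn(384, 1024).cuda()
out = model(x)
print(out.shape, "on", torch.cuda.device_count(), "GPU(s)")
```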