Coding a Content Addressable Memory on a GPU

I'm trying to implement a CAM, or more simply a dictionary that stores a pointer to data accessible by a key. I'm trying to do it on a GPU, but all my attempts have been inefficient compared to using System.Collections.Generic.Dictionary. Does anybody know how to implement this with CUDA to get better performance than on a CPU?

Topic dynamic-programming gpu clustering

Category Data Science


I similarity-matched a billion 8-character strings at about 1.5 fps on a GTX 980 - is that fast enough for you?

Similarity matching -> do it very block-list-like: charge through all the patterns in the whole corpus/memory, match in lots of 32 bits, and get the similarity score from the popcount sum. That's the fastest way to do it, I think (GPU or CPU).
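A minimal CPU sketch of that idea, under my own assumptions about the layout (8-character strings packed into two 32-bit words); a CUDA kernel would run the same scoring per thread over the corpus, with `__popc()` in place of `__builtin_popcount()`:

```cpp
#include <cstdint>
#include <cstring>

// Pack an 8-character string into two 32-bit words.
void pack8(const char* s, uint32_t out[2]) {
    std::memcpy(out, s, 8);
}

// Bit-level similarity score between a query and one candidate.
// XOR marks the differing bits; popcount of the complement counts
// the matching bits, summed over both 32-bit lots.
int similarity(const uint32_t q[2], const uint32_t c[2]) {
    return __builtin_popcount(~(q[0] ^ c[0]))
         + __builtin_popcount(~(q[1] ^ c[1]));
}
```

Identical strings score 64 (all bits match); each differing bit subtracts one from the score.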

Exact matching -> you get a lot more out of exact matching, because it needn't access the whole memory per access; it's roughly a square root more efficient. Put it in a tree. Reading is simple: just go node to node. But writing is a problem: you'll get thread conflicts if two threads try to append to the tree at the same place at the same time.
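The read path might look like the sketch below: a bitwise trie stored in a flat node array (the kind of layout you'd copy into GPU memory), where lookup just walks node to node along the key's bits. The struct and field names are my own illustration, not from the answer:

```cpp
#include <cstdint>
#include <vector>

// Illustrative trie node: two children indexed by one key bit.
// -1 means "no child" / "no value"; indices point into a flat
// node array, a layout that maps directly onto a GPU buffer.
struct Node {
    int32_t child[2] = {-1, -1};
    int32_t value = -1;
};

// Read path: walk node to node along the key's low 'bits' bits.
// Each GPU thread can run this independently - reads need no
// synchronization, which is why the parallel read is the easy part.
int32_t lookup(const std::vector<Node>& nodes, uint32_t key, int bits) {
    int32_t n = 0; // root is node 0
    for (int i = 0; i < bits; ++i) {
        n = nodes[n].child[(key >> i) & 1];
        if (n < 0) return -1; // key absent
    }
    return nodes[n].value;
}
```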

Interlocked (atomic) operations make it a lot simpler to do, but it's easier to read in parallel and just not bother parallelizing the write, unless that's a necessity.
