Are there decisive leaders in programming with tabular data?

Question


Monolithguy

January 16, 2021 04:08

What are the most effective bread-and-butter, in-memory, open source frameworks for tabular data today? I have been working with tabular data for years using an in-house solution that integrates well with Excel but falls short of many other expectations. I would like to demonstrate, if it is in fact true, that our solution has fallen behind the times.

In other words, assuming an SQL-like platform is responsible for persisting a data set, but cycle-intensive calculations need to be performed on that data set (e.g. stochastic simulation processes), what is an efficient framework to program in? Some advantageous features, to give an idea:

Efficient use of memory for common operations such as random access, sorting, and map/reduce/filter
Performant when serializing and deserializing
Good expressiveness or extensibility
Not bound by oppressive commercial licensing agreements
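
To make these criteria concrete, here is a minimal sketch of the kind of micro-benchmark I have in mind. It uses only the Python standard library and a plain dict of lists as the table, so it is a baseline rather than a candidate framework. The column names (`id`, `value`), the row count, and the helpers `make_table`, `timed`, and `run_benchmark` are all hypothetical, made up for illustration.

```python
import pickle
import random
import time
from functools import reduce


def make_table(n_rows, seed=0):
    """Build a column-oriented table as a dict of equal-length lists (hypothetical layout)."""
    rng = random.Random(seed)
    return {
        "id": list(range(n_rows)),
        "value": [rng.random() for _ in range(n_rows)],
    }


def timed(fn):
    """Run a zero-argument callable and return (result, elapsed seconds)."""
    start = time.perf_counter()
    result = fn()
    return result, time.perf_counter() - start


def run_benchmark(n_rows=100_000):
    """Time the operations from the feature list on a plain-Python table."""
    table = make_table(n_rows)
    timings = {}

    # Random access: read a single cell by row position.
    _, timings["random_access"] = timed(lambda: table["value"][n_rows // 2])

    # Sorting: compute the row order that sorts the table by "value".
    order, timings["sort"] = timed(
        lambda: sorted(range(n_rows), key=table["value"].__getitem__)
    )

    # Filter: ids of the rows whose value exceeds 0.5.
    kept, timings["filter"] = timed(
        lambda: [i for i, v in zip(table["id"], table["value"]) if v > 0.5]
    )

    # Map/reduce: sum of squared values.
    total, timings["map_reduce"] = timed(
        lambda: reduce(lambda acc, v: acc + v * v, table["value"], 0.0)
    )

    # Serialization: pickle round trip of the whole table.
    restored, timings["serialize"] = timed(
        lambda: pickle.loads(pickle.dumps(table))
    )

    return {
        "table": table,
        "order": order,
        "kept": kept,
        "total": total,
        "restored": restored,
        "timings": timings,
    }


if __name__ == "__main__":
    for name, seconds in run_benchmark()["timings"].items():
        print(f"{name}: {seconds:.6f} s")
```

A candidate framework would be compared by swapping its own equivalents into each lambda while keeping the same data and row count. Memory use is not measured here; that would need a separate tool.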

The pandas DataFrame is the best product I can find in the community, but it is a mixed bag as far as performance is concerned. MATLAB comes to mind, but it is so heavily commercialized that using it for most distributed applications on a homegrown cloud application becomes a nightmare.

Topic data-table sql pandas nosql

Category Data Science
