redshift

To datawarehouse or not to data warehouse?

Yonathan Mizrachi

May 25, 2022, 06:05

I was wondering if you would be so kind as to assist me with a quick question (I will be happy to explain more if you are willing to...). I am researching and setting up a system to run a machine learning training job that finds correlations between a user's social media information (or other digital trails, from wearables etc.) and their scores on personality tests. The scores are in my PostgreSQL database (on AWS) and I need to decide …

Topic: redshift

Category: Data Science

Out of Memory Error when Selecting Data from Redshift Table

TigSh

June 19, 2021, 22:04

I am selecting data from an Amazon Redshift table with 500 million rows. I have 64-bit Python installed.

Code:

    import psycopg2
    from sqlalchemy import create_engine
    import pandas as pd

    engine = create_engine('postgresql://username:pwd@host/dbname')
    data_frame = pd.read_sql_query('SELECT * FROM table_name;', engine)

Every time I run the code I get an "Out of Memory" error. I have 16 GB of RAM. I am not sure how to resolve this issue. I would really appreciate any help on this! Thanks

Topic: redshift python

Category: Data Science
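One common way around this kind of error is to avoid materializing the whole result at once and to process it in fixed-size batches instead. The following is a minimal sketch of that idea, not the asker's code: an in-memory `sqlite3` database stands in for the Redshift connection so the example is runnable, and the helper `iter_batches`, the two-column table layout, and the batch size are all assumptions made for illustration. With psycopg2, a named (server-side) cursor would play the same role, and `pd.read_sql_query` accepts a `chunksize` argument that returns an iterator of DataFrames.

```python
import sqlite3


def iter_batches(conn, query, batch_size=10_000):
    """Yield lists of rows, at most batch_size rows at a time.

    Hypothetical helper: it relies only on the DB-API fetchmany() call.
    """
    cur = conn.cursor()
    cur.execute(query)
    while True:
        rows = cur.fetchmany(batch_size)
        if not rows:
            break
        yield rows


# Stand-in database (assumption): 25,000 rows instead of 500 million.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table_name (id INTEGER, value REAL)")
conn.executemany(
    "INSERT INTO table_name VALUES (?, ?)",
    ((i, i * 0.5) for i in range(25_000)),
)

# Aggregate batch by batch so only one batch is held in memory at a time.
total_rows = 0
running_sum = 0.0
for batch in iter_batches(conn, "SELECT id, value FROM table_name"):
    total_rows += len(batch)
    running_sum += sum(value for _, value in batch)
```

Selecting only the needed columns, or pushing the aggregation into the SQL query itself, reduces the amount of data transferred even further.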

Finding maximum change in the value using Redshift

Pavan Kumar

December 2, 2018, 12:27

Following is the problem I want to solve, but I don't know how to implement it. I am using Redshift to store data. Following is the format of the data stored in Redshift. It is the sales history of every product, for all years, by month.

ProductId  Year  Month  Sales
A          2018  1      ...
A          2018  2      ...
A          2018  3      ...
A          2018  4      ...
A          2018  5      ...
B          2018  1      ...
B          2018  2      ...
B          2018  3      …

Topic: data-analysis redshift

Category: Data Science
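The excerpt is truncated, so the exact definition of "maximum change" is not known. As a sketch under the assumption that it means the largest absolute month-over-month difference in Sales per product, the logic can be written in plain Python as below. The sales figures are made up for illustration (the source elides them), and `max_monthly_change` is a hypothetical helper name. In Redshift itself, the same comparison with the previous row is typically expressed with the `LAG` window function partitioned by product and ordered by year and month.

```python
from collections import defaultdict

# Made-up sales values (assumption); the layout follows the table above.
rows = [
    ("A", 2018, 1, 100), ("A", 2018, 2, 120), ("A", 2018, 3, 90),
    ("A", 2018, 4, 95), ("A", 2018, 5, 130),
    ("B", 2018, 1, 50), ("B", 2018, 2, 80), ("B", 2018, 3, 70),
]


def max_monthly_change(rows):
    """Return {product: ((year, month), change)} for the largest absolute
    month-over-month change in sales of each product."""
    by_product = defaultdict(list)
    for product_id, year, month, sales in rows:
        by_product[product_id].append(((year, month), sales))

    result = {}
    for product_id, series in by_product.items():
        series.sort()  # chronological order by (year, month)
        best = None
        # Pair each month with the one before it, like LAG(Sales) in SQL.
        for (_, prev_sales), (period, sales) in zip(series, series[1:]):
            change = sales - prev_sales
            if best is None or abs(change) > abs(best[1]):
                best = (period, change)
        result[product_id] = best
    return result


changes = max_monthly_change(rows)
```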

Big Data - Data Warehouse Solutions?

user1157751

January 11, 2018, 09:41

I have a dozen databases that store different data, and each of them is 100 TB in size. All of the data is stored in AWS services such as RDS, Aurora and Dynamo. Many times I find myself needing to perform "joins" across databases, for example on a student ID that appears in multiple databases with data that I want to gather. The joins are usually done after data is streamed out of the database, since the data is not located …

Topic: redshift bigdata databases

Category: Data Science

Using regex in redshift to find dollar values

ScottieB

March 15, 2017, 07:49

I have a field in a Redshift table that has user-generated text. The field is where users can say how much they think something costs. Ideally it'd just be a decimal, but it's varchar. So users can type "I think this is worth $25", or "I'd pay 55" or "$117". So I'm trying to use regexp_substr to pull this out, specifically regexp_substr(f.comment_text, '\\$?[0-9]*'). But this doesn't work on a subset of entries for some reason (e.g. "Could do for $115"). …

Topic: redshift regex

Category: Data Science
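One likely cause is that `\$?[0-9]*` can match the empty string: both parts are optional, so the first match is found at the very start of any text that does not begin with a digit or a dollar sign. The sketch below illustrates this with Python's `re` module, which is an assumption made for runnability and is not Redshift's regex engine; the empty-match behaviour shown follows from the pattern itself. Requiring at least one digit with `+` is one possible fix, and `extract_amount` is a hypothetical helper name.

```python
import re

ORIGINAL = re.compile(r"\$?[0-9]*")  # can match nothing at all
STRICTER = re.compile(r"\$?[0-9]+")  # needs at least one digit


def extract_amount(text, pattern=STRICTER):
    """Return the first match of pattern in text, or '' if there is none."""
    match = pattern.search(text)
    return match.group() if match else ""


comments = [
    "I think this is worth $25",
    "I'd pay 55",
    "$117",
    "Could do for $115",
]

# With the original pattern the search succeeds immediately with an empty
# match whenever the comment does not start with '$' or a digit.
original_results = [extract_amount(c, ORIGINAL) for c in comments]
stricter_results = [extract_amount(c) for c in comments]
```

With the stricter pattern every sample comment yields its amount, while the original pattern only does so for the comment that starts with the dollar sign.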
