Monday, July 24, 2017

Redshift Spectrum



http://docs.aws.amazon.com/redshift/latest/dg/c-getting-started-using-spectrum-query-s3-data.html



Dabbling with Redshift Spectrum






As a best practice, keep your larger fact tables in Amazon S3 and your smaller dimension tables in Amazon Redshift. If you loaded the sample data in Getting Started with Amazon Redshift, you already have a table named EVENT in your database. If not, create the EVENT table first.
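
For reference, here is a rough sketch of that setup in SQL. The EVENT definition follows the TICKIT sample schema from the AWS tutorial and is worth checking against the linked guide before use. The join assumes an external table named spectrum.sales already exists over the sales files in S3; that name is an assumption here, not something defined in this post.

```sql
-- Small dimension table stored locally in Amazon Redshift.
-- Column list follows the TICKIT sample schema; verify against the AWS guide.
create table event(
  eventid   integer  not null distkey,
  venueid   smallint not null,
  catid     smallint not null,
  dateid    smallint not null sortkey,
  eventname varchar(200),
  starttime timestamp
);

-- Join the large fact table in S3 (assumed external table spectrum.sales)
-- to the small local dimension table.
select top 10 s.eventid, e.eventname, sum(s.pricepaid) as total_paid
from spectrum.sales s
join event e on s.eventid = e.eventid
where s.pricepaid > 30
group by s.eventid, e.eventname
order by total_paid desc;
```

The scan and filter on the S3 side run in the Spectrum layer, while the join against EVENT happens in the Redshift cluster. That split is why the small table is the one kept local.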


Amazon Redshift uses massively parallel processing (MPP) to achieve fast execution of complex queries operating on large amounts of data. Redshift Spectrum extends the same principle to query external data, using multiple Redshift Spectrum instances as needed to scan files. Place the data files for each table in a separate Amazon S3 folder.
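
A minimal sketch of the folder-per-table layout, assuming tab-delimited text files. The bucket name, IAM role ARN, schema name (spectrum), and catalog database name (spectrumdb) are placeholders chosen for illustration, not values from this post.

```sql
-- External schema backed by the data catalog.
-- The role ARN and database name below are hypothetical placeholders.
create external schema spectrum
from data catalog
database 'spectrumdb'
iam_role 'arn:aws:iam::123456789012:role/mySpectrumRole'
create external database if not exists;

-- One external table per S3 folder: LOCATION points at the folder,
-- and every file under it is scanned as part of this table.
-- The bucket and path are hypothetical.
create external table spectrum.sales(
  salesid    integer,
  listid     integer,
  sellerid   integer,
  buyerid    integer,
  eventid    integer,
  dateid     smallint,
  qtysold    smallint,
  pricepaid  decimal(8,2),
  commission decimal(8,2),
  saletime   timestamp
)
row format delimited
fields terminated by '\t'
stored as textfile
location 's3://example-bucket/tickit/spectrum/sales/';
```

Because LOCATION names a folder rather than a single file, mixing files for two tables under one folder would make both sets of rows appear in the same external table. Keeping one folder per table avoids that.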