You are not logged in.

Dear visitor, welcome to Palo Community Forum. If this is your first visit here, please read the Help. It explains in detail how this page works. To use all features of this page, you should consider registering. Please use the registration form, to register here or read more information about the registration process. If you are already registered, please login here.

tobia

Intermediate

  • "tobia" started this thread

Posts: 5

Date of registration: May 5th 2011

Location: Italy

  • Send private message

1

Thursday, May 5th 2011, 6:35pm

Best practices for importing a star schema

Hi

What are the best practices for loading a cube from a pre-existing star schema RDBMS?

In the ETL Server manual I can see various points where data can be normalized / de-normalized and in many cases it's not obvious what the best choice would be. For example one can de-normalize data using JOINs in the Relational extract, using the Lookup FieldTransform, the TableJoin Transform, and so on.

Does anybody have a sample, or best-practice ETL project that loads a star schema from a DB into Palo?

I'm using Palo CE 3.1

Posts: 163

Date of registration: Feb 4th 2009

Location: berlin

  • Send private message

2

Friday, May 6th 2011, 8:42am

Palo has also a star schema!
So for each Dim in your Relational DB, you schould create a Dim in Palo.
With your fact table you can fill your cube.

With Transformation-Functions you can setup your dimension how you like them.
sivgin

tobia

Intermediate

  • "tobia" started this thread

Posts: 5

Date of registration: May 5th 2011

Location: Italy

  • Send private message

3

Friday, May 6th 2011, 10:21am

Ok, let me restate my question.

I need to load a pre-made Mysql star schema into a Palo cube, exactly as it is.

Is it better to have:
  • only one Relational Extract, with left joins on all the tables in the schema; or
  • many Relational Extracts, one per table, with some kind of Transform (eg. Lookup or TableJoin) to join them?

axi

Sage

Posts: 661

Date of registration: Mar 5th 2009

Location: Germany / Bargteheide

Occupation: BI

  • Send private message

4

Friday, May 6th 2011, 10:43am

the ETL-Server is very good for E and L

I do not like the T that much

having said that I always craft one relational extract:

- less clutter in ETL-Server
- should be much faster
- easier to debug (IMO)

Posts: 173

Date of registration: Jan 4th 2008

Location: Freiburg / Germany

  • Send private message

5

Friday, July 29th 2011, 11:43am

If you have several Relational Extracts it's faster to use a Lookup-Function for each Dimension in a FieldTransform instead of the transform TableJoin.
Note: With the next release 3.3 the TableJoin will be fastened up with an in-memory variant.

Rate this thread