Dear visitor, welcome to Palo Community Forum.
If this is your first visit here, please read the Help. It explains in detail how this page works.
To use all features of this page, you should consider registering.
Please use the registration form, to register here or read more information about the registration process.
If you are already registered, please login here.
Best practices for importing a star schema
Hi
What are the best practices for loading a cube from a pre-existing star schema RDBMS?
In the ETL Server manual I can see various points where data can be normalized / de-normalized and in many cases it's not obvious what the best choice would be. For example one can de-normalize data using JOINs in the Relational extract, using the Lookup FieldTransform, the TableJoin Transform, and so on.
Does anybody have a sample, or best-practice ETL project that loads a star schema from a DB into Palo?
I'm using Palo CE 3.1
Palo has also a star schema!
So for each Dim in your Relational DB, you schould create a Dim in Palo.
With your fact table you can fill your cube.
With Transformation-Functions you can setup your dimension how you like them.
sivgin
Ok, let me restate my question.
I need to load a pre-made Mysql star schema into a Palo cube, exactly as it is.
Is it better to have:
- only one Relational Extract, with left joins on all the tables in the schema; or
- many Relational Extracts, one per table, with some kind of Transform (eg. Lookup or TableJoin) to join them?
the ETL-Server is very good for E and L
I do not like the T that much
having said that I always craft one relational extract:
- less clutter in ETL-Server
- should be much faster
- easier to debug (IMO)
If you have several Relational Extracts it's faster to use a Lookup-Function for each Dimension in a FieldTransform instead of the transform TableJoin.
Note: With the next release 3.3 the TableJoin will be fastened up with an in-memory variant.