You are not logged in.

Dear visitor, welcome to Palo Community Forum. If this is your first visit here, please read the Help. It explains in detail how this page works. To use all features of this page, you should consider registering. Please use the registration form, to register here or read more information about the registration process. If you are already registered, please login here.

  • "Nick_Knatterton" started this thread

Posts: 2

Date of registration: Jun 6th 2012

  • Send private message

1

Wednesday, June 6th 2012, 10:14am

How to lookup a "range value"

Hello,

i Have a source table with thousands of products. Every product has an unique ID. In another table the products or grouped. Every group is represented by a range of product IDs. Now I want to join these tables like the SQL statement

MySQL queries

1
2
3
SELECT source.ProductID,productgroup.GroupID,... 
FROM source, productgroup 
WHERE source.ProductID >= productgroup.StartProductID AND source.ProductID <= productgroup.EndProductID
. Unfortunately the Join transform and the LookUp-function can only look for equal values -not for ranges - in the used tables and I see no way to use more than one table in a SELECT-clause at the extract-step in the ETL process. So I've tried to create a groovy function to solve my problem. It works basically but the peformance is so horrible that nobody can really use it - it runs more than one hour! (the source table has thousands of products and every product needs a loop in the group table).

For interested people: here is the code which is responsible for the horrible runtime:

Source code

1
2
3
4
5
6
7
8
9
10
11
12
13
PF=ProductID_Format; 
xGroupID="UNDEF"; 
processor = API.getProcessor("TR_Product_Basis_Field"); 
def row = processor.next(); 
while (row != null) { 
if (PF >= row.getColumn("StartProductID").getValueAsString() && PF <= row.getColumn("EndProductID").getValueAsString()) { 
xGroupID= row.getColumn("GroupID").getValueAsString(); 
break; 
}; 
row = processor.next(); 
}; 
processor.setFirstRow(0); 
return xGroupID;


Has anyone an idea how to solve my problem by a performant way? (By the way: I have no possibility to change the tables before starting the ETL process.)

Best regards

Posts: 217

Date of registration: Dec 2nd 2009

Location: Berlin

Hobbies: Born to work :-)

  • Send private message

2

Thursday, June 7th 2012, 9:37am

Hi,

I had the same problem (with start dates and end dates) and unfortunately there are no other possibilities than looping on the range of values (as you did). This is rather a classical issue in Business Intelligence.

For that I used Talend (sorry for the ad) that allows you to do that rather easily (using a tMap component)
laloune

"To understand recursion, one must understand recursion"

  • "Nick_Knatterton" started this thread

Posts: 2

Date of registration: Jun 6th 2012

  • Send private message

3

Thursday, June 7th 2012, 1:16pm

Hello laloune,

thank you for the answer very much. But I don't want to abondon my hope to solve the problem with Palo. Because of security circumstances in my company I'm not able to use third party software.

Maybe another expert has an idea. It would be a big step, if there is existing a trick to use more than one table in a SELECT-clause at the extract-step in the ETL process.

Best regards

Posts: 217

Date of registration: Dec 2nd 2009

Location: Berlin

Hobbies: Born to work :-)

  • Send private message

4

Thursday, June 7th 2012, 7:54pm

Hi Nick,

actually I may not have understood your problem very well.

do you have an example of the input data and what should go out ?
laloune

"To understand recursion, one must understand recursion"

Posts: 173

Date of registration: Jan 4th 2008

Location: Freiburg / Germany

  • Send private message

5

Friday, June 8th 2012, 2:10pm

There will be a simpler solution in a forthcoming ETL release - either with specific Function for Interval Lookup or enhancement to TableJoin transform. Thanks for this input.

ethssn

Master

  • "ethssn" is male

Posts: 54

Date of registration: Feb 16th 2012

Location: Hamburg

Occupation: BI-Consultant

  • Send private message

6

Friday, June 8th 2012, 4:50pm

Range by rounding ('better than nothing' - solution)

Hi,

I had the same problem and solved it roughly, maybe it helps you, too:

http://www.jedox.com/community/palo-foru…ight=#post14421

I wanted to lookup a target value out of a storage-cube for the complete amount of these values, based on a calculated value out of another cube. But the palo.data-formula just accepts a hardcoded reference or a !'xyz'-dimension reference, which i don't have due to the calculation source of my reference value. So I used the round function to streamline the calculated value (but then just with 1 digit after comma, less than desired) and made my reference data in the storage cube more detailed (more work than desired). The palo.data looks now like this:

Source code

1
2
PALO.DATA("DB_all","ReferenceStore",LEFT(!'Client',3),
     STR(ROUND(PALO.DATA("DB_all","Sales",!'Country',!'Fullfiller',!'Product',!'Date',!'Customer',!'Store',"M-Share"),1),3,1))


The M-Share is the calculated value. The STR() is needed, because the palo.data don't seem to accept it otherwise due to the number-type resulting from round. Like this it works...
If M-Share is calculated as 47.5678%, the reference becomes 47.6 and the ReferenceStore contains then a dimension element named (!) 47.6 and a cell value found by this (e.g. 97) to align the calculated value to a certain scale which users use to judge about the value.

Best regards,

Stefan

Rate this thread