Issues with star schema

Discussion forum for PivotCube VCL users. Common questions. Bug and problem reports

Moderator: Alex Zencovich

Issues with star schema

Postby confusu » Thu Nov 23, 2006 5:36 am

My apologies if this issue has been covered before. I have a problem setting up a cube for use with a datawarehouse using a star schema approach. Whether this problem is conceptual, programmer error, or a bug with the component I'm not sure. That is what I hope someone here can answer.

For purposes of this post I will limit this example to one fact table and one dimension table:

Code: Select all
  Table "A"
  ----------------
  B_ID
  SOME_VALUE

  Table "B"
  ----------------
  ID
  SOME_TEXT


The cube is configured such that the fact table is table "A", the measurement is A.SOME_VALUE, and there is a single dimension linked by A.B_ID to B.ID and the displayed value for the dimension is B.SOME_TEXT. The data in table "B" has many more fields and SOME_TEXT on its own is not unique. IOW, this is not a fully normalized database.

When activated the cube displays all records from the fact table where B_ID is unique. Records with the same SOME_TEXT value are not grouped together and are shown separately on their own lines.

However, if I configure the cube to use a single query instead of the star schema with a query such as this:

Code: Select all
  select SOME_TEXT, SOME_VALUE
    from A inner join B on A.B_ID = B.ID


... and I configure the measurement to be SOME_VALUE (same as before) and a single dimension to be SOME_TEXT then all records where SOME_TEXT is the same are grouped together. This is the behavior I hope to achieve via the star schema approach.

The database design is not in the control of our company and our software must be able to use the star schema approach to fulfill the requirements.

Is there some property that I must set in order for the cube to group values together in this manner? If not, is there a reason that records are not grouped (aside from performance reasons)?

Thank you for your time,
- Clayton
confusu
Guest
 
Posts: 3
Joined: Wed Nov 22, 2006 11:30 pm

Postby AlexZencovich » Thu Nov 23, 2006 7:23 am

1 Just workround

You may set TPivotCube.Dimension[].EmptyItems = deiRemoveFromCube so all unused dimension items will removed after cube build, so you will not see(mostly) duplicate lines in dimension.

2.
But if B.SOME_TEXT values duplicated, you may expect if some day you (or end-user) will found in grid or report lines like

Mr.A 100000
Mr.A 200000

even if there different ID for Mr.A and Mr.A nobody will know difference.
Best regards,

Alex Zencovich
-------------------------------------------------------------------------------------------------------------------------------------------
www.pivotcube.com - OLAP solution for Delphi, C++ Builder and ActiveX environment
AlexZencovich
Site Admin
 
Posts: 580
Joined: Sun Jun 18, 2006 10:09 am

Postby confusu » Thu Nov 23, 2006 9:58 am

Thank you for your reply Alex.

I tried your suggestion however there was no visible change. I am also a little confused how EmptyItems should relate to the problem that I am having. To attempt to clear up any confusion for what I'm trying to see in the cube here is some example data with some changes to table "B". Instead of SOME_TEXT I am going to now call it "GENDER" and I've added a new field to make the example more real:

Code: Select all
Table "A"
B_ID  SOME_VALUE
----- ----------
1     24
2     18
3     100
4     13

Table "B"
ID    GENDER  INCOME
----- ------- --------------------
1     Female  Less than 20,000
2     Female  20,000 to 50,000
3     Female  50,000 to 100,000
4     Male    Less than 20,000


With the measurement set to calculate via "summa" and the dimension added to the dimension rows then the cube should show the following results:

Code: Select all
GENDER  SOME_VALUE
Female  142
Male    13


Instead, this is what I'm seeing:

Code: Select all
GENDER  SOME_VALUE
Female  24
Female  18
Female  100
Male    13


The puzzling thing to me is that this all works using the table schema approach and I expected it to work the same way using star schema. Is that assumption wrong?

Thanks again,
- Clayton
confusu
Guest
 
Posts: 3
Joined: Wed Nov 22, 2006 11:30 pm

Postby AlexZencovich » Thu Nov 23, 2006 11:12 am

Yes, your assumption wrong. There no affect on string captions because you set links (foreighn key or FiledName->KeyField link) over ID. And ID are diffrent for 3 "Female" - 1,2,3 so engine could not group them anyway.

Mostly you may said you work with 'prebuild' snowflake schema, not star. If you will try to nomalize that database you should get something like

A
B_ID Some_Value

B
B_ID, GENDER_ID,INCOME_ID

C
GENDER_ID,GENDER_CAPTION

D
INCOME_ID, INCOME_LABEL


So, there are no way to get correct result except male link not over ID's bu over text captions - like you did when you work with Table scheme. Internally Table schema may look like Star but with links over Text fields, without ID's
Best regards,

Alex Zencovich
-------------------------------------------------------------------------------------------------------------------------------------------
www.pivotcube.com - OLAP solution for Delphi, C++ Builder and ActiveX environment
AlexZencovich
Site Admin
 
Posts: 580
Joined: Sun Jun 18, 2006 10:09 am


Return to PivotCube VCL

Who is online

Users browsing this forum: No registered users and 5 guests

cron