Hi,
I have what I imagine is a pretty common data warehouse scenario. There is a table containing items by business date and source system id. There are a number of source systems and the data is loaded and stored for each business date.
Items from the same source system with the same name are allocated the same surrogate key so that an item can be tracked across business dates. So ItemID on its own is not a unique key.
The table looks roughly like this: Item (Name varchar, ItemID int, SourceID int, BusinessDate datetime)
The table is not currently partitioned, but it would make a lot of sense to partition it on business date.
Most of the select queries will restrict or group on BusinessDate and SourceID.
I am assuming that the clustered index should reflect the logical tree structure of the data and so the column order should be:
BusinessDate, SourceID, ItemID
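To make the question concrete, here is a rough T-SQL sketch of the table and the clustered index I have in mind. The varchar length, the NOT NULL constraints and the index name are placeholders I made up for illustration, not the real schema.

```sql
-- Sketch only: column length, nullability and index name are assumptions.
CREATE TABLE Item (
    Name         varchar(100) NOT NULL,  -- length assumed
    ItemID       int          NOT NULL,  -- surrogate key, repeats across business dates
    SourceID     int          NOT NULL,
    BusinessDate datetime     NOT NULL
);

-- Candidate clustered index following the logical hierarchy of the data.
-- Declared non-unique here because I have not confirmed the combination is unique.
CREATE CLUSTERED INDEX CIX_Item_BusinessDate_SourceID_ItemID
    ON Item (BusinessDate, SourceID, ItemID);
```

The alternative I am weighing would use the same columns in the order (ItemID, SourceID, BusinessDate).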
However, I have also read that wide index keys should have the most selective column first, which would be ItemID.
Does anyone have any recommendations on the index column order, and why?