sql server - MS-Sql Selecting from Horizontally Partitioned Tables -
i have horizontally partitioned table system, using check-constraints on date_key reference date yyyymmdd integer version of date (so check-constraints between yyyy0101 , yyyy1231).
i have view uses union all tables.
if execute
select * mydatedtable dt inner join mydates m on dt.date_key = md.date_key , md.date_key = 20120115 the optimizer "knows" scan , read correct 2012 table (or index), , ignores other tables unioned.
however
if use lookup-value in mydates table (for example year) not use check-constraint on key-relationship, ie:
select * mydatedtable dt inner join mydates md on dt.date_key = md.date_key , md.year = 2012 , md.month = 1 , md.day = 15 (the optimizer "knows" 0 rows come tables outside range, show needs check index...)
is there way ms-sql (2012) optimize correctly?
assuming existence of following objects (two tables , 1 view):
create table dbo.mydates2013 ( date_key int primary key, check (date_key between 20130101 , 20131231), [year] smallint not null, [month] tinyint not null, [day] tinyint not null ); insert dbo.mydates2013 (date_key, [year], [month], [day]) values (20130101, 2013, 1, 1); create table dbo.mydates2014 ( date_key int primary key, check (date_key between 20140101 , 20141231), [year] smallint not null, [month] tinyint not null, [day] tinyint not null ); insert dbo.mydates2014 (date_key, [year], [month], [day]) values (20140101, 2014, 1, 1); go create view dbo.my dates select * dbo.mydates2013 union select * dbo.mydates2014; go the following query
select * dbo.mydates md md.date_key = 20140115; is (indeed) optimized sql server, execution plan

including 1 index seek (on primary key of dbo.mydates2014) because @ compile time sql server knows date_key = 20140115 can within 1 base table: dbo.mydates2014. possible because of check constraint defined on dbo.mydates2014: check (date_key between 20140101 , 20141231).
the next query
select * dbo.mydates md md.[year] = 2014 , md.[month] = 1 , md.[day] = 15; go is different

because:
{1} there no indexes on year, month , day columns within every table (and causes clustered index scans) because
{2} sql server read both tables used dbo.mydates view. happens because doesn't know correlation between date_key values , [year], [month], [day] values , (i suppose) can't infer constraint check (date_key between 20140101 , 20141231) , new rule/constraint [year] = 2014.
solution #1:
so, 1 solution add these constraints within every table:
alter table dbo.mydates2013 add constraint ck_mydates2013_year check ( [year] = 2013 ); go alter table dbo.mydates2014 add constraint ck_mydates2014_year check ( [year] = 2014 ); go now, execution plan includes 1 scan: clustered index scan on dbo.mydates2014:

this way solved problem #2. #1 need indexes.
solution #2:
another solution translate md.[year] = 2014 , md.[month] = 1 , md.[day] = 15 predicates md.date_key = 20140115. following example use recompile query hint force sql server generate execution plan optimized every execution (for current values of parameters):
declare @year smallint, @month tinyint, @day tinyint; select @year = 2014, @month = 1, @day = 15; select * dbo.mydates md md.date_key = (@year * 100 + @month) * 100 + @day option(recompile) go 
thus sql server remove unnecessary index seeks / index scans operators (for example index seek on dbo.mytable2013 when @year = 2014).
even without option(recompile)
declare @year smallint, @month tinyint, @day tinyint; select @year = 2014, @month = 1, @day = 15; select * dbo.mydates md md.date_key = (@year * 100 + @month) * 100 + @day you performance because execution plan includes filter operators prevent unnecessary reads (index seek/scan):

note #1: should test these solutions before choosing 1 of them (if).
note #2: have used sql server 2012.
Comments
Post a Comment