SQL Server - MS-SQL: Selecting from Horizontally Partitioned Tables
I have a horizontally partitioned table system, using check constraints on date_key, which references a date stored as a yyyymmdd integer (so the check constraints are between yyyy0101 and yyyy1231).
I have a view that uses union all over the tables.
If I execute
select * from mydatedtable dt inner join mydates md on dt.date_key = md.date_key and md.date_key = 20120115
the optimizer "knows" to scan and read only the correct 2012 table (or index), and ignores the other tables that are unioned.
However, if I use a lookup value in the mydates table (for example, year), it does not use the check constraint on the key relationship, i.e.:
select * from mydatedtable dt inner join mydates md on dt.date_key = md.date_key and md.year = 2012 and md.month = 1 and md.day = 15
(The optimizer "knows" that 0 rows will come from the tables outside the range, but the plan shows that it still needs to check their indexes...)
Is there a way to make MS-SQL (2012) optimize this correctly?
Assuming the existence of the following objects (two tables and one view):
create table dbo.mydates2013 (
    date_key int primary key,
    check (date_key between 20130101 and 20131231),
    [year] smallint not null,
    [month] tinyint not null,
    [day] tinyint not null
);
insert dbo.mydates2013 (date_key, [year], [month], [day]) values (20130101, 2013, 1, 1);
create table dbo.mydates2014 (
    date_key int primary key,
    check (date_key between 20140101 and 20141231),
    [year] smallint not null,
    [month] tinyint not null,
    [day] tinyint not null
);
insert dbo.mydates2014 (date_key, [year], [month], [day]) values (20140101, 2014, 1, 1);
go
create view dbo.mydates as
select * from dbo.mydates2013
union all
select * from dbo.mydates2014;
go
The following query
select * from dbo.mydates md where md.date_key = 20140115;
is (indeed) optimized by SQL Server: the execution plan includes only one Index Seek (on the primary key of dbo.mydates2014), because at compile time SQL Server knows that date_key = 20140115 can exist only within one base table, dbo.mydates2014. This is possible because of the check constraint defined on dbo.mydates2014: check (date_key between 20140101 and 20141231).
The next query
select * from dbo.mydates md where md.[year] = 2014 and md.[month] = 1 and md.[day] = 15;
go
is different because:
{1} There are no indexes on the year, month, day columns in any of the tables (which causes Clustered Index Scans), and because
{2} SQL Server has to read both tables used by the dbo.mydates view. This happens because it doesn't know the correlation between the date_key values and the [year], [month], [day] values, and (I suppose) it can't infer from the constraint check (date_key between 20140101 and 20141231) a new rule/constraint [year] = 2014.
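One way to check which base tables each query actually touches, besides reading the execution plan, is to compare the I/O statistics of the two queries. This is only a sketch, assuming the dbo.mydates view and the tables defined above. A table that was eliminated should either be absent from the output or report zero logical reads.

```sql
-- Report per-table reads for every statement in this session.
set statistics io on;
go
-- Key predicate: expected to touch only dbo.mydates2014.
select * from dbo.mydates md where md.date_key = 20140115;
-- Lookup-column predicates: expected to touch both base tables.
select * from dbo.mydates md where md.[year] = 2014 and md.[month] = 1 and md.[day] = 15;
go
set statistics io off;
go
```

The per-table read counts appear in the Messages output, next to the result sets.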
Solution #1:
So, one solution is to add these constraints to every table:
alter table dbo.mydates2013 add constraint ck_mydates2013_year check ( [year] = 2013 );
go
alter table dbo.mydates2014 add constraint ck_mydates2014_year check ( [year] = 2014 );
go
Now the execution plan includes only one scan: a Clustered Index Scan on dbo.mydates2014.
This way I solved problem #2. For problem #1 you need indexes.
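For problem #1, a minimal sketch of such indexes could look like the following. The index names are hypothetical and the column order is an assumption. Whether the optimizer prefers these indexes over a clustered index scan depends on the data volume, so this should be tested against real data.

```sql
-- Hypothetical index names; one nonclustered index per base table.
create nonclustered index ix_mydates2013_year_month_day
    on dbo.mydates2013 ([year], [month], [day]);
go
create nonclustered index ix_mydates2014_year_month_day
    on dbo.mydates2014 ([year], [month], [day]);
go
```

Because date_key is the clustered primary key, it is carried in every nonclustered index row, so these indexes contain all four columns that select * returns.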
Solution #2:
Another solution is to translate the md.[year] = 2014 and md.[month] = 1 and md.[day] = 15 predicates into md.date_key = 20140115. The following example uses the recompile query hint to force SQL Server to generate an execution plan optimized for every execution (for the current values of the parameters):
declare @year smallint, @month tinyint, @day tinyint;
select @year = 2014, @month = 1, @day = 15;
select * from dbo.mydates md
where md.date_key = (@year * 100 + @month) * 100 + @day
option(recompile)
go
Thus SQL Server can remove the unnecessary Index Seek / Index Scan operators (for example, the Index Seek on dbo.mydates2013 when @year = 2014).
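The key arithmetic used in this solution can be checked on its own. The snippet below is only a sanity check with the same sample values. The integer literal 100 is of type int, so the smallint and tinyint variables are promoted to int before the multiplication and the result does not overflow.

```sql
declare @year smallint, @month tinyint, @day tinyint;
select @year = 2014, @month = 1, @day = 15;
-- (2014 * 100 + 1) * 100 + 15 = 20140115
select (@year * 100 + @month) * 100 + @day as date_key;
go
```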
Even without option(recompile)
declare @year smallint, @month tinyint, @day tinyint;
select @year = 2014, @month = 1, @day = 15;
select * from dbo.mydates md
where md.date_key = (@year * 100 + @month) * 100 + @day
you should still get good performance, because the execution plan includes Filter operators that prevent the unnecessary reads (Index Seek/Scan).
Note #1: you should test these solutions before choosing one of them (if any).
Note #2: I have used SQL Server 2012.