nested excel functions with conditional logic -
just getting started in excel , working database extract need count values if items in column unique.
so- below starting point:
=sumproduct(countif(c3:c94735,{"sharable content object reference model 1.2","authored scorm/aicc content","authored external web content"}))
what i'd figure out syntax this-
=sumproduct (countif range1 criteria..., range2 criteria="is unique value")
am getting right? syntax bit confusing, , i'm not sure i've chosen right functions task.
i had solve same problem week ago.
this method works when can't sort on grouping column (j in case). if can keep data sorted, @miked 's solution scale better.
firstly, know frequency trick counting unique numbers? frequency designed create histograms. takes 2 arrays, 'data' , 'bins'. sorts 'bins', creates output array that's 1 longer 'bins'. takes each value in 'data' , determines bin belongs in, incrementing output array accordingly. returns array. here's important part: if value appears in 'bins' more once, 'data' value meant bin goes in first occurrence. trick use same array both 'data' , 'bins'. think through, , you'll see there's 1 non-zero value in output each unique number in input. note counts numbers.
in short, use this:
=sum(sign(frequency(<array>,<array>)))
to count unique numeric values in <array>
from this, need construct arrays containing numbers appropriate , text elsewhere.
in example below, i'm counting unique days when color red , fruit citrus:
this conditional array, returning 1 or true rows i'm interested in:
($a$2:$a$10="red")*isnumber(match($b$2:$b$10,{"orange","grapefruit","lemon","lime"},0))
note requires ctrl-shift-enter
used array formula.
since value i'm grouping uniqueness text (as yours), need convert numeric. use:
match($c$2:$c$10,$c$2:$c$10,0)
note requires ctrl-shift-enter
so, array of numeric values within i'm looking uniqueness:
if(($a$2:$a$10="red")*isnumber(match($b$2:$b$10,{"orange","grapefruit","lemon","lime"},0)),match($c$2:$c$10,$c$2:$c$10,0),"")
now plug uniqueness counter:
=sum(sign(frequency(<array>,<array>)))
to get:
=sum(sign(frequency( if(($a$2:$a$10="red")*isnumber(match($b$2:$b$10,{"orange","grapefruit","lemon","lime"},0)),match($c$2:$c$10,$c$2:$c$10,0),""), if(($a$2:$a$10="red")*isnumber(match($b$2:$b$10,{"orange","grapefruit","lemon","lime"},0)),match($c$2:$c$10,$c$2:$c$10,0),"") )))
again, must entered array formula using ctrl-shift-enter
. replacing sum sumproduct not cut it.
in example, you'd use like:
=sum(sign(frequency( if(isnumber(match($c$3:$c$94735,{"sharable content object reference model 1.2","authored scorm/aicc content","authored external web content"},0)),match($j$3:$j$94735,$j$3:$j$94735,0),""), if(isnumber(match($c$3:$c$94735,{"sharable content object reference model 1.2","authored scorm/aicc content","authored external web content"},0)),match($j$3:$j$94735,$j$3:$j$94735,0),"") )))
i'll note, though, scaling might problem on data sets large yours. tested on larger data sets, , fast on order of 10k rows, slow on order of 100k rows, such yours. internal arrays plenty fast, frequency function slows down. i'm not sure, i'd guess it's between o(n log n) , o(n^2) depending on how sort implemented.
maybe doesn't matter - none of volatile, it'll need calculate once upon refreshing data. if column data changing, though, painful.
Comments
Post a Comment