statistics - Intuition about the kernel trick in machine learning


I have implemented a kernel perceptron classifier that uses an RBF kernel. I understand that the kernel trick maps the features to a higher dimension so that a linear hyperplane can be constructed to separate the points. For example, if you have the features (x1, x2) and map them to a 3-dimensional feature space, you might get: phi(x1, x2) = (x1^2, sqrt(2)*x1*x2, x2^2).

If you plug this into the perceptron decision function w'x + b = 0, you end up with: w1*x1^2 + w2*sqrt(2)*x1*x2 + w3*x2^2 + b = 0, which gives you a circular decision boundary.
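To see the "trick" concretely, here is a quick numeric check (a minimal sketch in Python; the sample points are made up): the dot product of the explicit 3-D features equals (x . z)^2, so the degree-2 polynomial kernel computes the higher-dimensional inner product while only ever touching the original 2-D vectors.

```python
import numpy as np

def phi(x):
    """Explicit quadratic feature map: (x1^2, sqrt(2)*x1*x2, x2^2)."""
    x1, x2 = x
    return np.array([x1**2, np.sqrt(2) * x1 * x2, x2**2])

def k(x, z):
    """Degree-2 polynomial kernel: k(x, z) = (x . z)^2."""
    return np.dot(x, z) ** 2

x = np.array([1.0, 2.0])   # hypothetical sample points
z = np.array([3.0, -1.0])

print(np.dot(phi(x), phi(z)))  # dot product in the explicit 3-D space
print(k(x, z))                 # same number, computed entirely in 2-D
```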

While the kernel trick is intuitive, I am not able to understand the linear algebra aspect of it. Can someone help me understand how we are able to map to all of these additional features without explicitly specifying them, using only the inner product?

Thanks!

Simple.

Give me the numeric result of (x+y)^10 for some values of x and y.

What would you rather do: "cheat" by summing x + y and then taking the value to the 10th power, or expand out the exact result by writing out

x^10 + 10x^9*y + 45x^8*y^2 + 120x^7*y^3 + 210x^6*y^4 + 252x^5*y^5 + 210x^4*y^6 + 120x^3*y^7 + 45x^2*y^8 + 10x*y^9 + y^10

and then compute each term and add them together? In the same way, a kernel lets you evaluate the dot product between degree-10 polynomial feature vectors without ever forming them explicitly.
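A sketch of that arithmetic in Python (the values of x and y are arbitrary):

```python
import math

x, y = 1.3, 0.7

# "Cheat": sum first, then take the 10th power -- one addition, one power.
cheap = (x + y) ** 10

# Expand all 11 binomial terms explicitly and add them up.
expanded = sum(math.comb(10, i) * x**(10 - i) * y**i for i in range(11))

print(cheap, expanded)  # identical up to floating-point rounding
```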

Valid kernels are dot products for which we can "cheat" and compute the numeric result between two points without having to form their explicit feature values. There are many such possible kernels, though only a few have come to be used heavily in papers and in practice.
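Tying this back to the kernel perceptron from the question: in the dual form the weight vector w = sum_i alpha_i * y_i * phi(x_i) is never constructed; training and prediction only ever call the kernel. A minimal sketch, with hypothetical toy data and a hypothetical gamma value:

```python
import numpy as np

def rbf(x, z, gamma=1.0):
    """RBF kernel: k(x, z) = exp(-gamma * ||x - z||^2). Its implicit
    feature space is infinite-dimensional, so phi could never be
    written out explicitly -- only the kernel value is computable."""
    return np.exp(-gamma * np.sum((x - z) ** 2))

def train_kernel_perceptron(X, y, epochs=10):
    """Dual-form perceptron: alpha[i] counts how often example i was
    misclassified; the implicit w is sum_i alpha[i]*y[i]*phi(X[i]),
    but phi itself never appears anywhere in the code."""
    n = len(X)
    alpha = np.zeros(n)
    for _ in range(epochs):
        for j in range(n):
            # f(x_j) = sum_i alpha_i y_i k(x_i, x_j): kernel calls only
            f = sum(alpha[i] * y[i] * rbf(X[i], X[j]) for i in range(n))
            if y[j] * f <= 0:
                alpha[j] += 1
    return alpha

# Hypothetical toy data: one class clustered at the origin,
# the other on a ring around it (not linearly separable in 2-D).
rng = np.random.default_rng(0)
inner = rng.normal(0, 0.3, (20, 2))
angles = rng.uniform(0, 2 * np.pi, 20)
outer = np.c_[2 * np.cos(angles), 2 * np.sin(angles)]
X = np.vstack([inner, outer])
y = np.array([1] * 20 + [-1] * 20)

alpha = train_kernel_perceptron(X, y)
f = lambda x: sum(alpha[i] * y[i] * rbf(X[i], x) for i in range(len(X)))
print(np.sign(f(np.array([0.0, 0.0]))))  # expected: +1 (inner class)
print(np.sign(f(np.array([2.0, 0.0]))))  # expected: -1 (ring class)
```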

