Search

Ranking search query results

Queries $q \in Q$: bags of words. Documents $d \in D$: sequences of words. For query $q_i$, retrieved relevant documents $D_i \subseteq D$; got full or partial orderings $L_i$. Let $Q = \{q_i\}$. Maybe want to complete the $Q$ vs $D$ relevance-level matrix. Or, given documents $D'$ relevant to an unseen query $q$, want to rank $D'$.

Feature extraction

$\phi: Q \times D \to X \subseteq \mathbb{R}^D$; the feature dimension $D$ is usually around 10. $\phi_i(q, d)$ could be TF-IDF, common-word count, etc. Let $f: X \to \mathbb{R}$ be a scoring function; it induces a relevance order $z_i$ on $D$ for every query $q_i$; let $F = \{f\}$.
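A minimal sketch of such a feature map and linear scoring function. The feature names and weights here are hypothetical stand-ins, not from the notes: common-word count and total query-term frequency as crude TF-style features.

```python
from collections import Counter

def phi(query_words, doc_words):
    # Sketch of phi(q, d) -> R^3: [common-word count, query-term
    # frequency in doc, doc length]. Real systems would use TF-IDF etc.
    q, d = set(query_words), Counter(doc_words)
    common = sum(1 for w in q if w in d)   # distinct query words present
    tf = sum(d[w] for w in q)              # total query-term occurrences
    return [common, tf, len(doc_words)]

def f(x, w=(1.0, 0.5, 0.0)):
    # Linear scoring function f(x) = w^T x; sorting docs by f(phi(q, d))
    # induces the relevance order z_i. Weights w are arbitrary here.
    return sum(wi * xi for wi, xi in zip(w, x))
```

Ranking a query's documents then just means sorting them by their `f` score, descending.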

Objectives to optimize

Let $y_i$ be the ground-truth relevance ordering of $D$ corresponding to query $q_i$; let $y$ be induced by $g: X \to \mathbb{R}$.

Loss function: $L(f, q_i)$ measures the deviation of $z_i$ from $y_i$. Risk: $R = \int_Q L(f, q)\, f_Q(q)\, dq$. Want to find the $f$ which minimizes $R$. But $\Pr(q)$ is usually unknown; so minimize the empirical risk instead: $\hat{R} = n^{-1} \sum_i L(f, q_i)$. Assume that $|R - \hat{R}| \to 0$.
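The empirical risk is just the average per-query loss over the training queries; a one-liner, assuming the per-query losses $L(f, q_i)$ have already been computed by whichever loss function is in use:

```python
def empirical_risk(per_query_losses):
    # R_hat = (1/n) * sum_i L(f, q_i); per_query_losses[i] = L(f, q_i)
    return sum(per_query_losses) / len(per_query_losses)
```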

Various Loss functions L

Pointwise

\tbc

Pairwise

Checks whether a pair's ordering is violated: want $y_i(d) > y_i(d') \implies f(d) > f(d')$; so a pair is violated when $(y_i(d) - y_i(d'))(f(d) - f(d')) \leq 0$. Widely used. E.g.: SVM ordinal regression; RankNet (best pairwise method as of 2009); LambdaRank: $f(d) = w^T d$.
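A sketch of the pairwise idea for one query, under the assumption that `y` holds ground-truth relevance grades and `s` holds model scores $f(d)$ for the same documents: the violation count applies the sign condition above directly, and a RankNet-style loss replaces the 0/1 violation with a smooth logistic surrogate.

```python
import math

def pairwise_violations(y, s):
    # Count pairs where (y_i - y_j) * (s_i - s_j) <= 0, i.e. the model
    # ordering contradicts (or ties on) a ground-truth preference.
    return sum(1 for i in range(len(y)) for j in range(i + 1, len(y))
               if y[i] != y[j] and (y[i] - y[j]) * (s[i] - s[j]) <= 0)

def ranknet_loss(y, s):
    # RankNet-style logistic surrogate: for each pair with y_i > y_j,
    # add log(1 + exp(-(s_i - s_j))); small when s_i >> s_j.
    return sum(math.log1p(math.exp(-(s[i] - s[j])))
               for i in range(len(y)) for j in range(len(y))
               if y[i] > y[j])
```

Minimizing the logistic surrogate pushes down the violation count while staying differentiable in the scores.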

Listwise

Measures the badness of a larger part of the ranking, not just the correctness of pairs of documents. $q_i$ is uniquely identified by $(y_i, D)$; so can consider $L(f, y_i, D)$ or $L(z_i, y_i)$.

Using a probability distribution on permutations

Define a probability distribution over permutations $\Pr_f(\pi)$ with good properties: it favors permutations rated highly by $f$ over permutations rated lower, and the best permutation according to $f$ has maximum probability. Then find the $f$ whose $\Pr_f$ is closest to $\Pr_g$. Can minimize cross entropy: $L(f, g) = -\sum_\pi \Pr_g(\pi) \log \Pr_f(\pi)$ (ListNet); or $L(f, g) = -\log \Pr(y_i \mid D, f)$, thereby maximizing likelihood (ListMLE). Exact listwise loss calculation is hard: it may need to consider all $n!$ permutations; so instead consider only the ranking of the top $k$ documents induced by $f$. ListMLE was known to be the best listwise-loss-based ranking method (2009), also better than all pairwise-loss-based methods.
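A sketch of the ListMLE-style likelihood using the Plackett-Luce factorization (an assumption about the permutation model, though it is the standard choice): the probability of an ordering is a product of per-position softmax terms, so $-\log \Pr(y_i \mid D, f)$ needs only $n$ terms, not $n!$; truncating the loop after $k$ positions gives the top-$k$ variant mentioned above.

```python
import math

def listmle_loss(scores_in_true_order, k=None):
    # -log Pr(y | D, f) under Plackett-Luce, where the input scores are
    # f(d) for the documents listed in ground-truth order y:
    #   Pr = prod_j exp(s_j) / sum_{m >= j} exp(s_m)
    # Passing k truncates to the top-k positions of the true ranking.
    s = list(scores_in_true_order)
    positions = range(len(s) if k is None else min(k, len(s)))
    loss = 0.0
    for j in positions:
        denom = sum(math.exp(v) for v in s[j:])
        loss -= s[j] - math.log(denom)
    return loss
```

When the scores agree with the ground-truth order the loss is small; reversing them inflates it, which is what gradient descent on this loss exploits.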